Hello, I'm getting the following error when I try to bulk load some data into an HBase table at the end of a MapReduce job:
org.apache.hadoop.mapred.YarnChild: Exception running child : com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.

This process was working fine until recently, so presumably the dataset has grown enough to hit protobuf's default 64MB message size limit. How can I increase this limit? I'm doing the bulk load programmatically (a sketch of how I'm invoking it is at the end of this message), and I haven't found a way to call CodedInputStream.setSizeLimit() as the error suggests.

Only one reducer out of 500 is failing. Is there any way to figure out which keys ended up in that reducer? When this happened once in the past I was able to trace the problem to one particular key corresponding to a very wide row; since I knew that key wasn't important, I simply removed it from the dataset. This time around I'm having no luck. I've also included my current attempt at narrowing it down below.

One last question: can someone explain what exactly is exceeding this size limit? Is it the size of a particular row, or something else?

I'm running HBase 0.98.2.
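For reference, here's roughly how I'm invoking the bulk load. This is a minimal sketch of what we do, not the exact code: the table name and HFile output directory are placeholders, and I've stripped our error handling.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadRunner {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Placeholder table name; the MapReduce job writes HFiles via
        // HFileOutputFormat.configureIncrementalLoad() and we load them here.
        HTable table = new HTable(conf, "my_table");
        try {
            LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
            // Moves the HFiles under the output directory into the table's regions.
            loader.doBulkLoad(new Path("/tmp/bulkload-output"), table);
        } finally {
            table.close();
        }
    }
}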
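And here's my attempt at figuring out which keys are in the failing reducer. My understanding (please correct me if this is wrong) is that HFileOutputFormat.configureIncrementalLoad() sets up one reduce partition per region, so the keys in the failing reduce task should fall within the corresponding region's key range. The table name and task index below are placeholders, and I realize regions may have split since the job ran:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Pair;

public class FailingReducerRange {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "my_table"); // placeholder
        try {
            // Start/end row keys for every region, in the same order the
            // partitions would have been assigned to reduce tasks.
            Pair<byte[][], byte[][]> keys = table.getStartEndKeys();
            int failingTask = 123; // placeholder: index of the failing reduce task
            System.out.println("start key: " + Bytes.toStringBinary(keys.getFirst()[failingTask]));
            System.out.println("end key:   " + Bytes.toStringBinary(keys.getSecond()[failingTask]));
        } finally {
            table.close();
        }
    }
}

Does mapping the task number to a region key range like this make sense, or is there a better way?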
Thanks,
Jerrell