To figure out the row key which caused the exception, you can apply a change similar to the following: http://pastebin.com/M2Jus3PK
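Since only one of the 500 reducers is failing, another way to narrow the search is to check which partition a candidate row key hashes to. This is a minimal sketch, assuming the job uses the default HashPartitioner over ImmutableBytesWritable keys (whose hashCode() is the standard 31-based byte hash, the same as WritableComparator.hashBytes); if the job sets a custom partitioner, this won't apply. The class name and the sample key are made up for illustration:

```java
// Sketch: compute which reducer a row key would land in, assuming the
// default HashPartitioner and ImmutableBytesWritable keys. PartitionCheck
// and the sample row key below are hypothetical names, not HBase APIs.
public class PartitionCheck {

    // Mirrors org.apache.hadoop.io.WritableComparator.hashBytes():
    // a 31-based polynomial hash over the raw bytes, seeded with 1.
    static int hashBytes(byte[] bytes, int offset, int length) {
        int hash = 1;
        for (int i = offset; i < offset + length; i++) {
            hash = (31 * hash) + (int) bytes[i];
        }
        return hash;
    }

    // Mirrors HashPartitioner: (hashCode & Integer.MAX_VALUE) % numReduceTasks
    static int partitionFor(byte[] rowKey, int numReduceTasks) {
        return (hashBytes(rowKey, 0, rowKey.length) & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Replace with row keys suspected of being very wide.
        byte[] rowKey = "some-row-key".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println("reducer = " + partitionFor(rowKey, 500));
    }
}
```

Feeding known row keys through partitionFor() and keeping only those that map to the failing task's partition number shrinks the set of suspects considerably.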
Cheers

On Sat, Jul 25, 2015 at 6:59 PM, F. Jerrell Schivers <[email protected]> wrote:
> Hi Ted. Thanks for the response.
>
> Here's the full stack trace.
>
> 2015-07-25 21:28:36,503 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
>         at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
>         at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
>         at com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
>         at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue.<init>(ClientProtos.java:8618)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue.<init>(ClientProtos.java:8563)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue$1.parsePartialFrom(ClientProtos.java:8672)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue$1.parsePartialFrom(ClientProtos.java:8667)
>         at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue.<init>(ClientProtos.java:8462)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue.<init>(ClientProtos.java:8404)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$1.parsePartialFrom(ClientProtos.java:8498)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$1.parsePartialFrom(ClientProtos.java:8493)
>         at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto.<init>(ClientProtos.java:7959)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto.<init>(ClientProtos.java:7890)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$1.parsePartialFrom(ClientProtos.java:8045)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$1.parsePartialFrom(ClientProtos.java:8040)
>         at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
>         at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>         at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>         at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>         at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto.parseDelimitedFrom(ClientProtos.java:10468)
>         at org.apache.hadoop.hbase.mapreduce.MutationSerialization$MutationDeserializer.deserialize(MutationSerialization.java:60)
>         at org.apache.hadoop.hbase.mapreduce.MutationSerialization$MutationDeserializer.deserialize(MutationSerialization.java:50)
>         at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:146)
>         at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
>         at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
>         at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
>         at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>
> --Jerrell
>
> On Sat, Jul 25, 2015 at 9:51 PM, Ted Yu <[email protected]> wrote:
>> bq. Is it the size of a particular row
>>
>> That was likely the cause.
>>
>> Can you post the full stack trace ?
>>
>> Thanks
>>
>> On Sat, Jul 25, 2015 at 6:28 PM, F. Jerrell Schivers <[email protected]> wrote:
>>> Hello,
>>>
>>> I'm getting the following error when I try to bulk load some data into an HBase table at the end of a MapReduce job:
>>>
>>> org.apache.hadoop.mapred.YarnChild: Exception running child : com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
>>>
>>> This process was working fine until recently, so presumably as the dataset has grown I've hit the default 64MB protobuf message size limit.
>>>
>>> How can I increase this limit? I'm doing the bulk load programmatically, and I haven't found a way to call CodedInputStream.setSizeLimit() as suggested.
>>>
>>> Only one reducer is failing, out of 500. Is there any way to figure out which keys are in that reducer? When this happened once in the past I was able to trace the problem to one particular key corresponding to a very wide row. Since I knew that key wasn't important I simply removed it from the dataset. However I'm having no luck this time around.
>>>
>>> One last question. Can someone explain what exactly is exceeding this size limit? Is it the size of a particular row, or something else?
>>>
>>> I'm running HBase 0.98.2.
>>>
>>> Thanks,
>>> Jerrell
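On the last question in the thread: judging from the stack trace, what hits the limit is a single serialized MutationProto read back by MutationSerialization during the reduce, i.e. one Put carrying every cell of one row, which is consistent with Ted's "size of a particular row" answer. Each such message is written with protobuf's writeDelimitedTo(), so it is preceded on the wire by a base-128 varint length prefix, and in principle the serialized size can be inspected before protobuf parses the body. Below is a minimal, self-contained decoder for that prefix; it only mirrors the protobuf wire format and is not an HBase or protobuf API call, and the class name is made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: decode the base-128 varint length prefix that protobuf's
// writeDelimitedTo()/parseDelimitedFrom() place in front of each message.
public class VarintLength {

    static int readRawVarint32(InputStream in) throws IOException {
        int result = 0;
        // A varint32 uses at most 5 bytes of 7 payload bits each.
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.read();
            if (b == -1) {
                throw new IOException("truncated varint");
            }
            result |= (b & 0x7f) << shift;
            if ((b & 0x80) == 0) { // high bit clear: last byte of the varint
                return result;
            }
        }
        throw new IOException("malformed varint32");
    }

    public static void main(String[] args) throws IOException {
        // 0xAC 0x02 encodes 300: 0x2C | (0x02 << 7)
        InputStream in = new ByteArrayInputStream(new byte[]{(byte) 0xAC, 0x02});
        System.out.println(readRawVarint32(in)); // prints 300
    }
}
```

Walking a copy of the serialized data this way (read the prefix, skip that many bytes, repeat) would reveal which single message exceeds 64 MB. As far as the stack trace shows, MutationSerialization constructs its parser internally via parseDelimitedFrom(), so in 0.98 there is no configuration hook to raise the limit without modifying the deserializer, which is why the patch Ted linked instead instruments the code to identify the offending row.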
