To figure out the row key which caused the exception, you can apply a change similar to the following: http://pastebin.com/M2Jus3PK
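Since only one of the 500 reducers is failing, another way to narrow the search is to check which partition a candidate row key hashes to. This is a minimal sketch, assuming the job uses the default HashPartitioner over ImmutableBytesWritable keys (whose hashCode() is the standard 31-based byte hash, the same as WritableComparator.hashBytes); if the job sets a custom partitioner, this won't apply. The class name and the sample key are made up for illustration:

```java
// Sketch: compute which reducer a row key would land in, assuming the
// default HashPartitioner and ImmutableBytesWritable keys. PartitionCheck
// and the sample row key below are hypothetical names, not HBase APIs.
public class PartitionCheck {

    // Mirrors org.apache.hadoop.io.WritableComparator.hashBytes():
    // a 31-based polynomial hash over the raw bytes, seeded with 1.
    static int hashBytes(byte[] bytes, int offset, int length) {
        int hash = 1;
        for (int i = offset; i < offset + length; i++) {
            hash = (31 * hash) + (int) bytes[i];
        }
        return hash;
    }

    // Mirrors HashPartitioner: (hashCode & Integer.MAX_VALUE) % numReduceTasks
    static int partitionFor(byte[] rowKey, int numReduceTasks) {
        return (hashBytes(rowKey, 0, rowKey.length) & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Replace with row keys suspected of being very wide.
        byte[] rowKey = "some-row-key".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println("reducer = " + partitionFor(rowKey, 500));
    }
}
```

Feeding known row keys through partitionFor() and keeping only those that map to the failing task's partition number shrinks the set of suspects considerably.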
Cheers

On Sat, Jul 25, 2015 at 6:59 PM, F. Jerrell Schivers <[email protected]> wrote:
> Hi Ted. Thanks for the response.
>
> Here's the full stack trace.
>
> 2015-07-25 21:28:36,503 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
>         at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
>         at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
>         at com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
>         at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue.<init>(ClientProtos.java:8618)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue.<init>(ClientProtos.java:8563)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue$1.parsePartialFrom(ClientProtos.java:8672)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue$1.parsePartialFrom(ClientProtos.java:8667)
>         at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue.<init>(ClientProtos.java:8462)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue.<init>(ClientProtos.java:8404)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$1.parsePartialFrom(ClientProtos.java:8498)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$1.parsePartialFrom(ClientProtos.java:8493)
>         at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto.<init>(ClientProtos.java:7959)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto.<init>(ClientProtos.java:7890)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$1.parsePartialFrom(ClientProtos.java:8045)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$1.parsePartialFrom(ClientProtos.java:8040)
>         at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
>         at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>         at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>         at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>         at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto.parseDelimitedFrom(ClientProtos.java:10468)
>         at org.apache.hadoop.hbase.mapreduce.MutationSerialization$MutationDeserializer.deserialize(MutationSerialization.java:60)
>         at org.apache.hadoop.hbase.mapreduce.MutationSerialization$MutationDeserializer.deserialize(MutationSerialization.java:50)
>         at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:146)
>         at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
>         at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
>         at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
>         at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>
> --Jerrell
>
> On Sat, Jul 25, 2015 at 9:51 PM, Ted Yu <[email protected]> wrote:
>> bq. Is it the size of a particular row
>>
>> That was likely the cause.
>>
>> Can you post the full stack trace ?
>>
>> Thanks
>>
>> On Sat, Jul 25, 2015 at 6:28 PM, F. Jerrell Schivers <[email protected]> wrote:
>>> Hello,
>>>
>>> I'm getting the following error when I try to bulk load some data into an HBase table at the end of a MapReduce job:
>>>
>>> org.apache.hadoop.mapred.YarnChild: Exception running child : com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
>>>
>>> This process was working fine until recently, so presumably as the dataset has grown I've hit the default 64MB protobuf message size limit.
>>>
>>> How can I increase this limit? I'm doing the bulk load programmatically, and I haven't found a way to call CodedInputStream.setSizeLimit() as suggested.
>>>
>>> Only one reducer is failing, out of 500. Is there any way to figure out which keys are in that reducer? When this happened once in the past I was able to trace the problem to one particular key corresponding to a very wide row. Since I knew that key wasn't important I simply removed it from the dataset. However I'm having no luck this time around.
>>>
>>> One last question. Can someone explain what exactly is exceeding this size limit? Is it the size of a particular row, or something else?
>>>
>>> I'm running HBase 0.98.2.
>>>
>>> Thanks,
>>> Jerrell
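On the last question in the thread: judging from the stack trace, what hits the limit is a single serialized MutationProto read back by MutationSerialization during the reduce, i.e. one Put carrying every cell of one row, which is consistent with Ted's "size of a particular row" answer. Each such message is written with protobuf's writeDelimitedTo(), so it is preceded on the wire by a base-128 varint length prefix, and in principle the serialized size can be inspected before protobuf parses the body. Below is a minimal, self-contained decoder for that prefix; it only mirrors the protobuf wire format and is not an HBase or protobuf API call, and the class name is made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: decode the base-128 varint length prefix that protobuf's
// writeDelimitedTo()/parseDelimitedFrom() place in front of each message.
public class VarintLength {

    static int readRawVarint32(InputStream in) throws IOException {
        int result = 0;
        // A varint32 uses at most 5 bytes of 7 payload bits each.
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.read();
            if (b == -1) {
                throw new IOException("truncated varint");
            }
            result |= (b & 0x7f) << shift;
            if ((b & 0x80) == 0) { // high bit clear: last byte of the varint
                return result;
            }
        }
        throw new IOException("malformed varint32");
    }

    public static void main(String[] args) throws IOException {
        // 0xAC 0x02 encodes 300: 0x2C | (0x02 << 7)
        InputStream in = new ByteArrayInputStream(new byte[]{(byte) 0xAC, 0x02});
        System.out.println(readRawVarint32(in)); // prints 300
    }
}
```

Walking a copy of the serialized data this way (read the prefix, skip that many bytes, repeat) would reveal which single message exceeds 64 MB. As far as the stack trace shows, MutationSerialization constructs its parser internally via parseDelimitedFrom(), so in 0.98 there is no configuration hook to raise the limit without modifying the deserializer, which is why the patch Ted linked instead instruments the code to identify the offending row.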
