[ https://issues.apache.org/jira/browse/HDFS-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248577#comment-15248577 ]

Chris Nauroth commented on HDFS-10312:
--------------------------------------

I saw this happen with a block report from a DataNode containing ~6 million 
blocks.  All of the blocks were on a single data directory, so unfortunately, 
splitting the block report by storage didn't help.  Here is a sample stack trace:

{code}
org.apache.hadoop.ipc.RemoteException: java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.runBlockOp(BlockManager.java:4404)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:1436)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReport(DatanodeProtocolServerSideTranslatorPB.java:173)
        at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:30059)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2417)
Caused by: java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
        at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369)
        at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiffSorted(BlockManager.java:2478)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:2313)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:2121)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$1.call(NameNodeRpcServer.java:1439)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$1.call(NameNodeRpcServer.java:1436)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:4463)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:4442)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
        at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
        at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
        at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769)
        at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462)
        at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:365)
        ... 9 more

        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1443)
        at org.apache.hadoop.ipc.Client.call(Client.java:1402)
        at org.apache.hadoop.ipc.Client.call(Client.java:1352)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
        at com.sun.proxy.$Proxy21.blockReport(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:204)
        at org.apache.hadoop.hdfs.server.datanode.TestLargeBlockReport.testBlockReportSucceedsWithLargerLengthLimit(TestLargeBlockReport.java:86)
{code}

This is an unusual situation, but we should provide a way for it to succeed.
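
The exception message itself points at the escape hatch: {{CodedInputStream.setSizeLimit()}}.  Here is a minimal sketch of the idea, assuming a stream-backed decoder; the class and parameter names are hypothetical, not the actual patch:

{code}
import java.io.IOException;
import java.io.InputStream;

import com.google.protobuf.CodedInputStream;

/**
 * Illustrative sketch only: decode a sequence of varint64 values while
 * honoring a configurable cap instead of protobuf's built-in 64 MB default.
 */
public final class LargeVarintDecoder {

  /**
   * @param in            stream positioned at the encoded varints
   * @param numLongs      number of varint64 values to read
   * @param maxDataLength cap to apply, e.g. the configured value of
   *                      ipc.maximum.data.length
   */
  public static long[] decode(InputStream in, int numLongs, int maxDataLength)
      throws IOException {
    CodedInputStream cis = CodedInputStream.newInstance(in);
    // The default internal limit is 64 MB; without this call, refillBuffer()
    // throws InvalidProtocolBufferException once the stream exceeds it,
    // exactly as in the trace above.
    cis.setSizeLimit(maxDataLength);
    long[] values = new long[numLongs];
    for (int i = 0; i < numLongs; i++) {
      values[i] = cis.readRawVarint64();
    }
    return values;
  }
}
{code}

Note that the limit only bites for stream-backed {{CodedInputStream}} instances; the check fires in {{refillBuffer()}}, which matches the bottom frames of the trace.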


> Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10312
>                 URL: https://issues.apache.org/jira/browse/HDFS-10312
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>
> Our RPC server caps the maximum size of incoming messages at 64 MB by 
> default.  For exceptional circumstances, this cap can be raised via 
> {{ipc.maximum.data.length}}.  However, for block reports, protobuf still 
> enforces its own internal 64 MB maximum length restriction during 
> decoding.  (Sample stack trace to follow in comments.)  This issue 
> proposes to apply the same override to our block list decoding, so that 
> large block reports can proceed.
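
A natural way to honor that proposal is for the decoder to read the same key the RPC layer already uses.  A sketch follows; the configuration constants exist in Hadoop Common, but the helper class and its wiring into the decoder are illustrative:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CommonConfigurationKeys;

/**
 * Illustrative only: pick up the same cap the RPC server already honors,
 * rather than introducing a new setting for block list decoding.
 */
public final class DecoderLimit {
  public static int maxDataLength(Configuration conf) {
    // ipc.maximum.data.length already governs the RPC layer; reuse it
    // (default 64 MB) as the protobuf size limit when decoding block lists.
    return conf.getInt(
        CommonConfigurationKeys.IPC_MAXIMUM_DATA_LENGTH,
        CommonConfigurationKeys.IPC_MAXIMUM_DATA_LENGTH_DEFAULT);
  }
}
{code}

An operator hitting this would still need to raise {{ipc.maximum.data.length}} in the NameNode's configuration so the RPC layer accepts the oversized report in the first place.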


