[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074969#comment-14074969 ]

Arpit Agarwal edited comment on HDFS-6756 at 7/25/14 9:37 PM:
--------------------------------------------------------------

Did you figure out which specific RPC call? Was it a block report? Also, what 
version of Hadoop are you running?

We used to see this error message when the block count per DataNode exceeded 
roughly 6 million. We fixed it in v2.4 by splitting block reports per storage. A 
large protocol message takes seconds to process and can 'freeze' the callee if a 
lock is held while processing it.

As a last resort this limit can be increased on a cluster-specific basis. I 
don't think it is a good idea to just change the default.
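
For reference, the limit in question is the ipc.maximum.data.length property from 
this issue's title. A minimal sketch of a cluster-specific override, assuming it is 
set in core-site.xml on the RPC server side (the 134217728 value is just the 128 MB 
figure proposed in this issue, not a recommendation):

  <property>
    <name>ipc.maximum.data.length</name>
    <!-- 128 MB; the shipped default corresponds to 64 MB (67108864 bytes) -->
    <value>134217728</value>
  </property>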


was (Author: arpitagarwal):
Did you figure out which specific RPC call? Was it a block report? Also, what 
version of Hadoop are you running?

We used to see this error message when the block count per DataNode exceeded 
roughly 6 million. We fixed it in Apache Hadoop 2.4 by splitting block reports 
per storage. This error is likely a symptom of an underlying problem that needs 
to be fixed. A large protocol message takes seconds to process and can 'freeze' 
the callee if a lock is held while processing it.

As a last resort this limit can be increased on a cluster-specific basis. I 
don't think it is a good idea to just change the default.

> Default ipc.maximum.data.length should be increased to 128MB from 64MB
> ----------------------------------------------------------------------
>
>                 Key: HDFS-6756
>                 URL: https://issues.apache.org/jira/browse/HDFS-6756
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Juan Yu
>            Assignee: Juan Yu
>            Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
