[ https://issues.apache.org/jira/browse/HDFS-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074969#comment-14074969 ]
Arpit Agarwal edited comment on HDFS-6756 at 7/25/14 9:37 PM:
--------------------------------------------------------------

Did you figure out which specific RPC call? Was it a block report? Also, what version of Hadoop are you running?

We used to see this error message when the block count per DataNode would exceed roughly 6 Million. We fixed it in v2.4 by splitting block reports per storage.

A large protocol message takes seconds to process and can 'freeze' the callee if a lock is held while processing it. As a last resort this limit can be increased on a cluster-specific basis. I don't think it is a good idea to just change the default.


was (Author: arpitagarwal):
Did you figure out which specific RPC call? Was it a block report? Also, what version of Hadoop are you running?

We used to see this error message when the block count per DataNode would exceed roughly 6 Million. We fixed it in Apache Hadoop 2.4 by splitting block reports per storage.

This error is likely a symptom of an underlying problem that needs to be fixed. A large protocol message takes seconds to process and can 'freeze' the callee if a lock is held while processing it. As a last resort this limit can be increased on a cluster-specific basis. I don't think it is a good idea to just change the default.


> Default ipc.maximum.data.length should be increased to 128MB from 64MB
> ----------------------------------------------------------------------
>
>                 Key: HDFS-6756
>                 URL: https://issues.apache.org/jira/browse/HDFS-6756
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Juan Yu
>            Assignee: Juan Yu
>            Priority: Minor
>
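For reference, a cluster-specific override of this limit would be a single property in core-site.xml on the affected daemon (the NameNode, in the block-report case). This is a minimal sketch, not a recommendation; the 128MB value simply mirrors the figure proposed in this issue, and raising the limit does not make oversized RPCs any cheaper to process.

{code:xml}
<!-- core-site.xml: cluster-specific override of the maximum IPC message size.
     134217728 bytes = 128 MB (the default is 64 MB). Raising this only allows
     larger RPCs such as huge block reports through; it does not reduce the
     time spent processing them. -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
{code}

The daemon that enforces the limit would typically need to be restarted to pick up the new value.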