[
https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731799#comment-14731799
]
Tsz Wo Nicholas Sze commented on HDFS-9011:
-------------------------------------------
There seems to be a bug: for each partial-report RPC, the NN calls
reportDiff(..), but reportDiff(..) assumes a full block report. I think the
diff is incorrect for a partial report. In particular, the toRemove set may
contain blocks that are reported by other RPCs.
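
The concern above can be illustrated with a minimal sketch. It is not the
actual BlockManager code: the class PartialReportDiff, the block IDs, and the
helper toRemoveAfterAllRpcs are hypothetical, and toRemove(..) here is only a
simplified stand-in for the toRemove part of reportDiff(..), assuming that
part amounts to "stored blocks that are absent from the report".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class PartialReportDiff {

    /**
     * Simplified stand-in (an assumption, not the real reportDiff):
     * returns the stored blocks that do not appear in the given report.
     */
    static Set<Long> toRemove(Set<Long> stored, Set<Long> reported) {
        Set<Long> missing = new HashSet<>(stored);
        missing.removeAll(reported);
        return missing;
    }

    /**
     * Hypothetical alternative: accumulate every partial report of the
     * storage first, then compute the diff once against the union.
     */
    @SafeVarargs
    static Set<Long> toRemoveAfterAllRpcs(Set<Long> stored, Set<Long>... partialReports) {
        Set<Long> reported = new HashSet<>();
        for (Set<Long> part : partialReports) {
            reported.addAll(part);
        }
        return toRemove(stored, reported);
    }

    public static void main(String[] args) {
        // Blocks the NN believes are on the storage (made-up IDs).
        Set<Long> stored = new HashSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        // The same storage's report, split across two RPCs.
        Set<Long> rpc1 = new HashSet<>(Arrays.asList(1L, 2L));
        Set<Long> rpc2 = new HashSet<>(Arrays.asList(3L, 4L));

        // Diffing rpc1 as if it were a full report flags the blocks
        // that only arrive later in rpc2.
        System.out.println("diff on rpc1 alone: " + toRemove(stored, rpc1));
        // Diffing against the union of both RPCs flags nothing.
        System.out.println("diff after both:    " + toRemoveAfterAllRpcs(stored, rpc1, rpc2));
    }
}
```

Under these assumptions, the per-RPC diff marks blocks 3 and 4 for removal
even though the second RPC reports them, while the diff over the accumulated
reports leaves toRemove empty. Whether the eventual patch accumulates reports
or handles this differently is not stated in this thread.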
> Support splitting BlockReport of a storage into multiple RPC
> ------------------------------------------------------------
>
> Key: HDFS-9011
> URL: https://issues.apache.org/jira/browse/HDFS-9011
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-9011.000.patch, HDFS-9011.001.patch,
> HDFS-9011.002.patch
>
>
> Currently, if a DataNode has too many blocks (more than 1m by default), it
> sends multiple RPCs to the NameNode for the block report, with each RPC
> containing the report for a single storage. However, in practice we have seen
> that even a single storage can sometimes contain so many blocks that its
> report exceeds the max RPC data length. It may be helpful to support sending
> multiple RPCs for the block report of a single storage.