[
https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733188#comment-14733188
]
Yi Liu commented on HDFS-9011:
------------------------------
Thanks [~jingzhao] for working on this. A few comments in addition to Nicholas':
*1.* In BlockPoolSlice
{code}
+ private void saveReplicas(List<BlockListAsLongs> persistList) {
+ if (persistList == null || persistList.isEmpty()) {
return;
}
File tmpFile = new File(currentDir, REPLICA_CACHE_FILE + ".tmp");
@@ -787,7 +787,9 @@ private void saveReplicas(BlockListAsLongs blocksListToPersist) {
FileOutputStream out = null;
try {
out = new FileOutputStream(tmpFile);
- blocksListToPersist.writeTo(out);
+ for (BlockListAsLongs blockLists : persistList) {
+ blockLists.writeTo(out);
+ }
{code}
Now we write a *list* of {{BlockListAsLongs}} to {{REPLICA_CACHE_FILE}}, so we
should also change the logic of {{readReplicasFromCache}}:
{code}
BlockListAsLongs blocksList = BlockListAsLongs.readFrom(inputStream);
{code}
It currently reads only the first {{BlockListAsLongs}}, so any lists written
after it are ignored.
Also, in {{saveReplicas}}, if a {{BlockListAsLongs}} contains 0 blocks, it is
better not to persist it; otherwise we hit a NullPointerException when reading
replicas from the cache file.
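To illustrate the write/read symmetry I mean, here is a minimal standalone sketch. It does not use the real {{BlockListAsLongs}} API or its on-disk format: the class {{ReplicaCacheSketch}}, the methods {{saveChunks}} and {{readChunks}}, and the simple count-plus-IDs record layout are all hypothetical stand-ins. The point is only that the writer skips empty lists and the reader loops until the stream is exhausted instead of stopping after the first record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the replica cache logic; not Hadoop code.
final class ReplicaCacheSketch {

  // Writes every non-empty chunk as: block count (int), then the block IDs.
  static byte[] saveChunks(List<long[]> chunks) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      for (long[] chunk : chunks) {
        if (chunk == null || chunk.length == 0) {
          continue; // do not persist empty lists
        }
        out.writeInt(chunk.length);
        for (long id : chunk) {
          out.writeLong(id);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Reads records until no bytes remain, not just the first one.
  static List<long[]> readChunks(byte[] data) {
    List<long[]> chunks = new ArrayList<>();
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(data))) {
      // available() is exact for an in-memory stream; a file-backed
      // reader would need a different end-of-input check.
      while (in.available() > 0) {
        int count = in.readInt();
        long[] chunk = new long[count];
        for (int i = 0; i < count; i++) {
          chunk[i] = in.readLong();
        }
        chunks.add(chunk);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return chunks;
  }
}
```

In the patch itself the same shape would apply to {{readReplicasFromCache}}: keep reading from the input stream until it is exhausted and add the replicas from every list that was written.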
*2.* We should also update the description of
{{dfs.blockreport.split.threshold}} in hdfs-default.xml.
Nit: some lines in the patch are longer than 80 characters.
> Support splitting BlockReport of a storage into multiple RPC
> ------------------------------------------------------------
>
> Key: HDFS-9011
> URL: https://issues.apache.org/jira/browse/HDFS-9011
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-9011.000.patch, HDFS-9011.001.patch,
> HDFS-9011.002.patch
>
>
> Currently, if a DataNode has too many blocks (more than 1m by default), it
> sends multiple RPCs to the NameNode for the block report, and each RPC
> contains the report for a single storage. However, in practice we've seen
> that sometimes even a single storage can contain a large number of blocks,
> so that its report alone exceeds the max RPC data length. It may be helpful
> to support sending multiple RPCs for the block report of a single storage.