[ https://issues.apache.org/jira/browse/HDFS-8652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jing Zhao updated HDFS-8652:
----------------------------
Attachment: HDFS-8652.002.patch
Did some debugging and confirmed that the failure of
TestFileTruncate#testTruncateWithDataNodesRestartImmediately is unrelated to
this patch. The failure is caused by a race in the block recovery process: the
second DataNode sends its block report after the block truncation has finished,
so its replica is marked as corrupt. However, the replication monitor cannot
schedule an extra replica because there are only 3 DataNodes in the test. I
will file a separate jira to fix this.
The new patch addresses Zhe's comment by adding more Java comments to
{{BlockManager#invalidateBlock}}. Since this change is trivial, I will not wait
for Jenkins again and will commit the patch based on the +1 from Nicholas.
> Track BlockInfo instead of Block in CorruptReplicasMap
> ------------------------------------------------------
>
> Key: HDFS-8652
> URL: https://issues.apache.org/jira/browse/HDFS-8652
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-8652.000.patch, HDFS-8652.001.patch,
> HDFS-8652.002.patch
>
>
> Currently {{CorruptReplicasMap}} uses {{Block}} as its key and records the
> list of DataNodes with corrupted replicas. For Erasure Coding, since a striped
> block group contains multiple internal blocks with different block IDs, we
> should use {{BlockInfo}} as the key.
> HDFS-8619 is the jira to fix this for EC. To ease merging, we will use this
> jira to first make changes in trunk/branch-2.
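
To illustrate the motivation in the description above, here is a minimal sketch of why the key type matters. It is not the HDFS code: {{CorruptReplicasSketch}}, {{StoredBlock}}, {{INDEX_MASK}} and all method names are hypothetical, and the ID layout (low 4 bits of an internal block ID hold its index within the group) is an assumption made only for this example. It shows that a map keyed by the reported block ID splits corrupt replicas of one striped group across several entries, while a map keyed by the stored block object collects them in one place.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model. None of these types are the real HDFS classes.
final class CorruptReplicasSketch {
    // Assumption for illustration only: the low 4 bits of an internal block ID
    // hold its index within the striped block group.
    static final long INDEX_MASK = 0xFL;

    /** Stand-in for the stored block (or striped block group) the NameNode tracks. */
    static final class StoredBlock {
        final long groupId;
        StoredBlock(long groupId) { this.groupId = groupId; }
    }

    // Stored blocks, indexed by group ID.
    private final Map<Long, StoredBlock> storedBlocks = new HashMap<>();
    // Old-style bookkeeping: keyed by whatever block ID was reported.
    private final Map<Long, Set<String>> corruptByReportedId = new HashMap<>();
    // New-style bookkeeping: keyed by the stored block object (identity-based,
    // because StoredBlock does not override equals/hashCode).
    private final Map<StoredBlock, Set<String>> corruptByStoredBlock = new HashMap<>();

    StoredBlock addGroup(long groupId) {
        StoredBlock b = new StoredBlock(groupId);
        storedBlocks.put(groupId, b);
        return b;
    }

    /** Map a reported (possibly internal) block ID back to its stored block. */
    StoredBlock resolve(long reportedId) {
        return storedBlocks.get(reportedId & ~INDEX_MASK);
    }

    /** Record a corrupt replica under both keying schemes for comparison. */
    void markCorrupt(long reportedId, String datanode) {
        corruptByReportedId
            .computeIfAbsent(reportedId, k -> new HashSet<>())
            .add(datanode);
        StoredBlock stored = resolve(reportedId);
        if (stored != null) {
            corruptByStoredBlock
                .computeIfAbsent(stored, k -> new HashSet<>())
                .add(datanode);
        }
    }

    int corruptCountById(long blockId) {
        Set<String> nodes = corruptByReportedId.get(blockId);
        return nodes == null ? 0 : nodes.size();
    }

    int corruptCountByStored(StoredBlock block) {
        Set<String> nodes = corruptByStoredBlock.get(block);
        return nodes == null ? 0 : nodes.size();
    }

    public static void main(String[] args) {
        CorruptReplicasSketch sketch = new CorruptReplicasSketch();
        StoredBlock group = sketch.addGroup(0x100L);
        // Two different internal blocks of the same group are reported corrupt.
        sketch.markCorrupt(0x101L, "dn1");
        sketch.markCorrupt(0x103L, "dn2");
        System.out.println("by group ID:     " + sketch.corruptCountById(0x100L));
        System.out.println("by stored block: " + sketch.corruptCountByStored(group));
    }
}
```

In this sketch, a lookup by the group's own ID finds nothing in the ID-keyed map, while the lookup by the stored block sees both corrupt replicas. The real patch may resolve reported blocks differently; the point is only that the key should identify the whole group rather than one internal block.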
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)