[
https://issues.apache.org/jira/browse/HDFS-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315110#comment-14315110
]
Jitendra Nath Pandey commented on HDFS-7604:
--------------------------------------------
The patch looks pretty good. I have only a couple of comments:
{code}
+ boolean checkFailedStorages = (volFailures > volumeFailures) ||
{code}
If volumeFailureSummary is not null, might it be more accurate to compare the
last failure timestamp instead of the failure count?
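To illustrate the idea, here is a rough sketch of a timestamp-based check. It is
not taken from the patch: the classes FailureSummarySketch and
FailedStorageCheckSketch, the field lastVolumeFailureDate, and the method
shouldCheck are all hypothetical names, and the sketch assumes the summary
carries the time of the most recent volume failure in milliseconds.

```java
/** Hypothetical stand-in for the volume failure summary sent in a heartbeat. */
final class FailureSummarySketch {
  final String[] failedStorageLocations;
  // Assumed field: time of the most recent volume failure, in ms since epoch.
  final long lastVolumeFailureDate;

  FailureSummarySketch(String[] failedStorageLocations,
      long lastVolumeFailureDate) {
    this.failedStorageLocations = failedStorageLocations;
    this.lastVolumeFailureDate = lastVolumeFailureDate;
  }
}

final class FailedStorageCheckSketch {
  /**
   * Decides whether failed storages should be re-checked.
   *
   * @param prevCount failure count recorded from the previous heartbeat
   * @param newCount failure count reported in the current heartbeat
   * @param prev summary from the previous heartbeat, or null if none
   * @param cur summary from the current heartbeat, or null if none
   */
  static boolean shouldCheck(int prevCount, int newCount,
      FailureSummarySketch prev, FailureSummarySketch cur) {
    if (cur != null) {
      // A summary is available: a newer failure timestamp signals a change
      // even when the total count is unchanged.
      long prevDate = prev != null ? prev.lastVolumeFailureDate : 0L;
      return cur.lastVolumeFailureDate > prevDate;
    }
    // No summary available: fall back to comparing the counts.
    return newCount > prevCount;
  }
}
```

One case where this might matter (my assumption, not something verified against
the patch): if one volume fails while another is brought back between two
heartbeats, the count stays the same but the timestamp advances.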
{code}
public int getVolumeFailures() {
- return volumeFailures;
+ return volumeFailureSummary != null ?
+ volumeFailureSummary.getFailedStorageLocations().length : 0;
+ }
{code}
During a rolling upgrade, DataNodes running the older version will not send
volumeFailureSummary, so the newer NameNode might erroneously conclude that
there are 0 volume failures.
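One possible way to handle this is to fall back to the plain failure count from
the heartbeat whenever no summary was sent. The sketch below is only an
illustration: DatanodeFailureStateSketch and updateHeartbeat are hypothetical
names, and a String array stands in for the summary's failed storage locations.

```java
/** Hypothetical holder for per-DataNode failure state on the NameNode side. */
final class DatanodeFailureStateSketch {
  // Plain failure count, which DataNodes of either version report.
  private int volumeFailures;
  // Failed locations from the summary; null if the DataNode sent no summary.
  private String[] failedStorageLocations;

  /** Records the values carried by the latest heartbeat. */
  void updateHeartbeat(int volFailures, String[] failedLocations) {
    this.volumeFailures = volFailures;
    this.failedStorageLocations = failedLocations;
  }

  /** Prefers the summary when present, else uses the heartbeat count. */
  int getVolumeFailures() {
    return failedStorageLocations != null
        ? failedStorageLocations.length
        : volumeFailures;
  }
}
```

With this shape, an older DataNode that reports 2 failures and no summary would
still show 2 failures rather than 0.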
> Track and display failed DataNode storage locations in NameNode.
> ----------------------------------------------------------------
>
> Key: HDFS-7604
> URL: https://issues.apache.org/jira/browse/HDFS-7604
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, namenode
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Attachments: HDFS-7604-screenshot-1.png, HDFS-7604-screenshot-2.png,
> HDFS-7604-screenshot-3.png, HDFS-7604-screenshot-4.png,
> HDFS-7604-screenshot-5.png, HDFS-7604-screenshot-6.png,
> HDFS-7604-screenshot-7.png, HDFS-7604.001.patch, HDFS-7604.002.patch,
> HDFS-7604.004.patch, HDFS-7604.prototype.patch
>
>
> During heartbeats, the DataNode can report a list of its storage locations
> that have been taken out of service because of a failure (such as a bad disk
> or a permissions problem). The NameNode can track these failed storage
> locations and then report them in JMX and the NameNode web UI.