[ https://issues.apache.org/jira/browse/HDFS-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chen Zhang updated HDFS-13709: ------------------------------ Attachment: HDFS-13709.004.patch > Report bad block to NN when transfer block encounter EIO exception > ------------------------------------------------------------------ > > Key: HDFS-13709 > URL: https://issues.apache.org/jira/browse/HDFS-13709 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode > Reporter: Chen Zhang > Assignee: Chen Zhang > Priority: Major > Attachments: HDFS-13709.002.patch, HDFS-13709.003.patch, > HDFS-13709.004.patch, HDFS-13709.patch > > > In our online cluster, the BlockPoolSliceScanner is turned off, and sometimes > disk bad track may cause data loss. > For example, there are 3 replicas on 3 machines A/B/C, if a bad track occurs > on A's replica data, and someday B and C crushed at the same time, NN will > try to replicate data from A but failed, this block is corrupt now but no one > knows, because NN think there is at least 1 healthy replica and it keep > trying to replicate it. > When reading a replica which have data on bad track, OS will return an EIO > error, if DN reports the bad block as soon as it got an EIO, we can find > this case ASAP and try to avoid data loss -- This message was sent by Atlassian JIRA (v7.6.14#76016) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org