[jira] [Commented] (HDFS-15357) Do not trust bad block reports from clients
[ https://issues.apache.org/jira/browse/HDFS-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17802416#comment-17802416 ]

Shilun Fan commented on HDFS-15357:
-----------------------------------

Updated the target version in preparation for the 3.4.0 release.

> Do not trust bad block reports from clients
> -------------------------------------------
>
>                 Key: HDFS-15357
>                 URL: https://issues.apache.org/jira/browse/HDFS-15357
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Priority: Major
>
> {{reportBadBlocks()}} is implemented by both ClientNamenodeProtocol and
> DatanodeProtocol. When DFSClient calls it, a faulty client can cause
> data availability issues in a cluster.
>
> In the past we had such an incident, where a node with a faulty NIC was
> randomly corrupting data. All clients running on that machine reported all
> accessed blocks and all associated replicas as corrupt. More recently, a
> single faulty client process caused a small number of missing blocks. In
> all cases, the actual data was fine.
>
> Bad block reports from clients shouldn't be trusted blindly. Instead, the
> namenode should send a datanode command to verify the claim. A bonus would be
> to keep the record for a while and ignore repeated reports from the same
> nodes.
>
> At minimum, there should be an option to ignore bad block reports from
> clients, perhaps after logging them. A very crude way would be to short-circuit
> the call in {{ClientNamenodeProtocolServerSideTranslatorPB#reportBadBlocks()}}.
> A more sophisticated way would be to check for the datanode user name in
> {{FSNamesystem#reportBadBlocks()}} so that the report can be easily logged or,
> optionally, processed further.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
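The handling proposed in the issue description (accept datanode reports, verify client reports instead of trusting them, and ignore repeated reports from the same node for a while) might be sketched roughly as below. This is only an illustration under stated assumptions: `BadBlockReportFilter`, `Decision`, `trustClients`, and `suppressMillis` are hypothetical names and do not exist in Hadoop, and the sketch does not show how the namenode would identify the caller or issue a verification command to a datanode.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not Hadoop code: decides how a namenode-side handler
 * could treat an incoming bad block report.
 */
class BadBlockReportFilter {
    /** What to do with a report. VERIFY stands for "ask a datanode to check". */
    enum Decision { ACCEPT, VERIFY, IGNORE }

    // Hypothetical option: when true, client reports are trusted as they are today.
    private final boolean trustClients;
    // How long a reporter is remembered after one of its reports is sent for verification.
    private final long suppressMillis;
    // Time of the last report that was sent for verification, per reporting host.
    private final Map<String, Long> lastVerifiedByHost = new HashMap<>();

    BadBlockReportFilter(boolean trustClients, long suppressMillis) {
        this.trustClients = trustClients;
        this.suppressMillis = suppressMillis;
    }

    /**
     * @param reporterHost host that sent the report
     * @param fromDatanode true if the caller was identified as a datanode
     * @param nowMillis    current time, passed in so the logic is easy to test
     */
    synchronized Decision onReport(String reporterHost, boolean fromDatanode, long nowMillis) {
        if (fromDatanode || trustClients) {
            return Decision.ACCEPT;
        }
        Long last = lastVerifiedByHost.get(reporterHost);
        if (last != null && nowMillis - last < suppressMillis) {
            // Repeated report from the same node inside the window: drop it.
            return Decision.IGNORE;
        }
        lastVerifiedByHost.put(reporterHost, nowMillis);
        return Decision.VERIFY;
    }
}
```

In this sketch the first report from a client host yields `VERIFY`, and further reports from that host within `suppressMillis` yield `IGNORE`. A real implementation would also need to bound the size of the per-host record.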
[jira] [Commented] (HDFS-15357) Do not trust bad block reports from clients
[ https://issues.apache.org/jira/browse/HDFS-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541336#comment-17541336 ]

Masatake Iwasaki commented on HDFS-15357:
-----------------------------------------

Updated the target version in preparation for the 2.10.2 release.
[jira] [Commented] (HDFS-15357) Do not trust bad block reports from clients
[ https://issues.apache.org/jira/browse/HDFS-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190559#comment-17190559 ]

Masatake Iwasaki commented on HDFS-15357:
-----------------------------------------

Updated the target version in preparation for the 2.10.1 release.