[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137088#comment-14137088 ]

Hudson commented on HDFS-6965:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #683 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/683/])
HDFS-6965. NN continues to issue block locations for DNs with full (kihwal: rev 0c26412be4b3ec40130b7200506c957f0402ecbc)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java

NN continues to issue block locations for DNs with full disks
-------------------------------------------------------------

Key: HDFS-6965
URL: https://issues.apache.org/jira/browse/HDFS-6965
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Rushabh S Shah
Fix For: 2.6.0
Attachments: HDFS-6965.patch

Encountered issues where DNs have less space than a full block and reject incoming transfers. The NN continues giving out locations for these nodes for some period of time. It does not appear to be related to the DN's cached disk usage.

One impact is that required replications are delayed when a full DN is chosen for the pipeline. A DN cannot report a broken pipeline, so the replication must time out (5m) before new targets are chosen.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137239#comment-14137239 ]

Hudson commented on HDFS-6965:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1899 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1899/])
HDFS-6965. NN continues to issue block locations for DNs with full (kihwal: rev 0c26412be4b3ec40130b7200506c957f0402ecbc)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137266#comment-14137266 ]

Hudson commented on HDFS-6965:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #1874 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1874/])
HDFS-6965. NN continues to issue block locations for DNs with full (kihwal: rev 0c26412be4b3ec40130b7200506c957f0402ecbc)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135469#comment-14135469 ]

Kihwal Lee commented on HDFS-6965:
----------------------------------

Filed HDFS-7069 for TestMissingBlocksAlert.
+1 for the patch
[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135496#comment-14135496 ]

Rushabh S Shah commented on HDFS-6965:
--------------------------------------

Thanks Kihwal for reviewing and committing.
[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130712#comment-14130712 ]

Hadoop QA commented on HDFS-6965:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12668100/HDFS-6965.patch
against trunk revision 4be9517.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

    org.apache.hadoop.hdfs.TestEncryptionZones
    org.apache.hadoop.hdfs.server.datanode.TestBPOfferService
    org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
    org.apache.hadoop.hdfs.TestMissingBlocksAlert

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7998//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7998//console

This message is automatically generated.
[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130763#comment-14130763 ]

Rushabh S Shah commented on HDFS-6965:
--------------------------------------

I ran all of the failed tests listed above on my local setup multiple times, and all of them pass.
I assume these failures are due to test timeouts.
[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127160#comment-14127160 ]

Daryn Sharp commented on HDFS-6965:
-----------------------------------

Simply changing {{node.getRemaining()}} to {{storage.getRemaining()}} is probably sufficient. Checking both should be redundant, shouldn't it? If the storage has space, then the node certainly should have space unless there's an accounting bug.
[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127333#comment-14127333 ]

Kihwal Lee commented on HDFS-6965:
----------------------------------

Yes, checking at the storage level should be sufficient. Whenever a node is chosen, {{chooseLocalStorage()}} and {{chooseRandom()}} iterate over storage objects in random order and pick the first one that is acceptable. So {{storage.getRemaining()}} is the correct check.
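The selection loop described in this comment can be sketched roughly as follows. This is a simplified illustration, not the actual HDFS code: {{VolumeSketch}}, {{StoragePickerSketch}}, {{pickStorage()}} and the 128 MB block size are hypothetical names and values chosen for the example, and the real {{isGoodTarget()}} may apply further conditions that are not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;

/** Hypothetical stand-in for one storage on a DataNode. */
final class VolumeSketch {
    final String id;
    final long remaining; // free bytes on this storage

    VolumeSketch(String id, long remaining) {
        this.id = id;
        this.remaining = remaining;
    }
}

final class StoragePickerSketch {
    /** Storage-level space check: the block must fit on this one storage. */
    static boolean isGoodTarget(VolumeSketch storage, long blockSize) {
        return storage.remaining >= blockSize;
    }

    /**
     * Visits the node's storages in random order and returns the first
     * acceptable one, or empty if no storage can hold the block.
     */
    static Optional<VolumeSketch> pickStorage(
            List<VolumeSketch> storages, long blockSize, Random rng) {
        // Shuffle a copy so the caller's list is left untouched.
        List<VolumeSketch> shuffled = new ArrayList<>(storages);
        Collections.shuffle(shuffled, rng);
        for (VolumeSketch s : shuffled) {
            if (isGoodTarget(s, blockSize)) {
                return Optional.of(s);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb; // assumed block size for the example
        List<VolumeSketch> storages = Arrays.asList(
            new VolumeSketch("full-1", 10 * mb),
            new VolumeSketch("roomy", 500 * mb),
            new VolumeSketch("full-2", 0));
        Optional<VolumeSketch> picked =
            pickStorage(storages, blockSize, new Random());
        System.out.println(picked.map(s -> s.id).orElse("no acceptable storage"));
    }
}
```

Because the check runs against each candidate storage, a node whose storages are all too full yields no target, whatever the visiting order.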
[jira] [Commented] (HDFS-6965) NN continues to issue block locations for DNs with full disks
[ https://issues.apache.org/jira/browse/HDFS-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114065#comment-14114065 ]

Kihwal Lee commented on HDFS-6965:
----------------------------------

I think there is a flaw in isGoodTarget(). When checking the available space, it checks against {{node.getRemaining()}}. It has to check at the individual storage level.
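To illustrate why a node-level check can accept a DN that cannot actually hold the block, here is a minimal sketch. It assumes the node-level figure is the sum of free space across the node's storages. The classes {{StorageSketch}}, {{DatanodeSketch}} and {{SpaceCheckSketch}}, the four-storage layout, and the 128 MB block size are all hypothetical and only model the two {{getRemaining()}} calls mentioned above; they are not the HDFS implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for one storage on a DataNode. */
final class StorageSketch {
    final String id;
    final long remaining; // free bytes on this storage

    StorageSketch(String id, long remaining) {
        this.id = id;
        this.remaining = remaining;
    }

    long getRemaining() {
        return remaining;
    }
}

/** Hypothetical stand-in for a DataNode that aggregates its storages. */
final class DatanodeSketch {
    final List<StorageSketch> storages;

    DatanodeSketch(List<StorageSketch> storages) {
        this.storages = storages;
    }

    /** Node-level figure: total free space summed over all storages. */
    long getRemaining() {
        long total = 0;
        for (StorageSketch s : storages) {
            total += s.getRemaining();
        }
        return total;
    }
}

final class SpaceCheckSketch {
    /** Node-level check: compares the block size with the node total. */
    static boolean nodeLevelCheck(DatanodeSketch node, long blockSize) {
        return node.getRemaining() >= blockSize;
    }

    /** Storage-level check: the block must fit on the storage that receives it. */
    static boolean storageLevelCheck(StorageSketch storage, long blockSize) {
        return storage.getRemaining() >= blockSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb; // assumed block size for the example
        // Four storages with 40 MB free each: 160 MB in total,
        // but no single storage can hold a 128 MB block.
        DatanodeSketch node = new DatanodeSketch(Arrays.asList(
            new StorageSketch("s1", 40 * mb),
            new StorageSketch("s2", 40 * mb),
            new StorageSketch("s3", 40 * mb),
            new StorageSketch("s4", 40 * mb)));
        System.out.println("node-level check passes: "
            + nodeLevelCheck(node, blockSize));
        for (StorageSketch s : node.storages) {
            System.out.println(s.id + " storage-level check passes: "
                + storageLevelCheck(s, blockSize));
        }
    }
}
```

In this example the node total exceeds the block size while every individual storage falls short, which matches the symptom in the description: the NN keeps offering the node and the DN rejects the transfer.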