[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352834#comment-14352834
 ] 

Hudson commented on HDFS-7411:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #127 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/127/])
HDFS-7411. Change decommission logic to throttle by blocks rather (cdouglas: 
rev 6ee0d32b98bc3aa5ed42859f1325d5a14fd1722a)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java


 Refactor and improve decommissioning logic into DecommissionManager
 ---

 Key: HDFS-7411
 URL: https://issues.apache.org/jira/browse/HDFS-7411
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.5.1
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 2.7.0

 Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, 
 hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, 
 hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, 
 hdfs-7411.009.patch, hdfs-7411.010.patch, hdfs-7411.011.patch


 Would be nice to split out decommission logic from DatanodeManager to 
 DecommissionManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352856#comment-14352856
 ] 

Hudson commented on HDFS-7411:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #861 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/861/])
HDFS-7411. Change decommission logic to throttle by blocks rather (cdouglas: 
rev 6ee0d32b98bc3aa5ed42859f1325d5a14fd1722a)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java



[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352862#comment-14352862
 ] 

Hudson commented on HDFS-7411:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2059 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2059/])
HDFS-7411. Change decommission logic to throttle by blocks rather (cdouglas: 
rev 6ee0d32b98bc3aa5ed42859f1325d5a14fd1722a)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java



[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353126#comment-14353126
 ] 

Hudson commented on HDFS-7411:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2077 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2077/])
HDFS-7411. Change decommission logic to throttle by blocks rather (cdouglas: 
rev 6ee0d32b98bc3aa5ed42859f1325d5a14fd1722a)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java



[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353070#comment-14353070
 ] 

Hudson commented on HDFS-7411:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #118 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/118/])
HDFS-7411. Change decommission logic to throttle by blocks rather (cdouglas: 
rev 6ee0d32b98bc3aa5ed42859f1325d5a14fd1722a)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt



[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353094#comment-14353094
 ] 

Hudson commented on HDFS-7411:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #127 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/127/])
HDFS-7411. Change decommission logic to throttle by blocks rather (cdouglas: 
rev 6ee0d32b98bc3aa5ed42859f1325d5a14fd1722a)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java



[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-03-09 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353270#comment-14353270
 ] 

Andrew Wang commented on HDFS-7411:
---

Glad to see this committed, thanks Chris for the final push, and everyone else 
for the reviews and comments.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-03-08 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352424#comment-14352424
 ] 

Hudson commented on HDFS-7411:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7281 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7281/])
HDFS-7411. Change decommission logic to throttle by blocks rather (cdouglas: 
rev 6ee0d32b98bc3aa5ed42859f1325d5a14fd1722a)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java



[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-03-08 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352115#comment-14352115
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

If you believe that the new patch could keep the existing behavior, I am happy 
to remove my -1.  Unfortunately, I won't be able to review the patch.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-03-06 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351449#comment-14351449
 ] 

Chris Douglas commented on HDFS-7411:
-

Looked through the patch; it addresses the feedback. [~szetszwo], do you want 
to review the patch before commit?


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-03-03 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345509#comment-14345509
 ] 

Andrew Wang commented on HDFS-7411:
---

It's been about a week. [~szetszwo], anything? Your veto is still blocking this 
from being fixed.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-26 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339691#comment-14339691
 ] 

Andrew Wang commented on HDFS-7411:
---

[~szetszwo] any comments? I'd like to move forward on this.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335738#comment-14335738
 ] 

Hadoop QA commented on HDFS-7411:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12700583/hdfs-7411.011.patch
  against trunk revision 9a37247.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9658//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9658//console

This message is automatically generated.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-11 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316793#comment-14316793
 ] 

Chris Douglas commented on HDFS-7411:
-

bq. The -1 is not for the refactoring. It is for keeping the existing behavior.

Andrew, even though you prefer estimates or averages that approximate the 
existing behavior, halting when either of the limits are hit would move this 
forward.

Nicholas, would you be OK changing the default so this uses the new algorithm 
in clusters where the node limit is not explicitly configured (default value 
for nodes is {{Integer.MAX_VALUE}})? You're also OK enforcing the existing 
semantics in the new code?
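
As a rough illustration of the "halt when either limit is hit" suggestion, here is a minimal Java sketch. It is not taken from the patch: the class and method names are hypothetical, and the choice to finish scanning a node once it has been started is an assumption made only to keep the example short.

```java
import java.util.List;

// Hypothetical sketch; none of these names come from the HDFS-7411 patch.
class EitherLimitThrottleSketch {

    /**
     * Walks the block counts of the decommissioning nodes in order and
     * returns how many nodes one tick would visit before a limit is hit.
     * Assumption: a node that has been started is scanned to completion.
     */
    static int nodesVisitedInOneTick(List<Integer> blocksPerNode,
                                     int blockLimit, int nodeLimit) {
        int nodesVisited = 0;
        long blocksScanned = 0;
        for (int blocks : blocksPerNode) {
            if (nodesVisited >= nodeLimit || blocksScanned >= blockLimit) {
                break; // halt as soon as either limit is hit
            }
            blocksScanned += blocks;
            nodesVisited++;
        }
        return nodesVisited;
    }

    public static void main(String[] args) {
        List<Integer> nodes = List.of(40_000, 40_000, 40_000, 40_000);
        // Node limit left at Integer.MAX_VALUE: only the block limit binds.
        System.out.println(nodesVisitedInOneTick(nodes, 100_000, Integer.MAX_VALUE)); // 3
        // Node limit explicitly set to 2: it binds before the block limit.
        System.out.println(nodesVisitedInOneTick(nodes, 1_000_000, 2)); // 2
    }
}
```

With the node limit left at {{Integer.MAX_VALUE}}, only the block limit can stop the tick, which matches the "new algorithm by default" case described above.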


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-11 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316965#comment-14316965
 ] 

Andrew Wang commented on HDFS-7411:
---

bq. Andrew, even though you prefer estimates or averages that approximate the 
existing behavior, halting when either of the limits are hit would move this 
forward.

Saying to use the node limit is underspecified, since the new code only 
iterates over decomming nodes, whereas the old code iterates over all nodes. 
This constitutes a major behavior change, but Nicholas said that iterating over 
non-decomming nodes is a bug that should be fixed.

This is why I've been trying to elevate the discussion to what constitutes good 
or bad user experience. I have a hard time understanding why iterating over 
just decomming nodes is an allowable change (even though it'll have a huge 
effect on pause times and decom rate), but the rest of my proposals are not 
okay because they constitute a behavior change.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-11 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317210#comment-14317210
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. ... would you be OK changing the default so this uses the new algorithm in 
clusters where the node limit is not explicitly configured (default value for 
nodes is Integer.MAX_VALUE)?

Agree.  This is the same as [my 
proposal|https://issues.apache.org/jira/browse/HDFS-7411?focusedCommentId=14302030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14302030]
 mentioned multiple times.

The default value could be removed from hdfs-default.xml, with -1 passed as the 
default in the code.  A returned value of -1 then means the conf is not set.
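
A minimal Java sketch of that unset-detection idea, using java.util.Properties as a stand-in for the Hadoop configuration object so the example runs on its own. The key name and class are hypothetical; in Hadoop itself the same effect would presumably come from calling Configuration.getInt(key, -1) once the entry is gone from hdfs-default.xml.

```java
import java.util.Properties;

// Hypothetical sketch; the key name and class are illustrative, not from the patch.
class NodeLimitConfSketch {
    static final String NODES_KEY = "example.decommission.nodes.per.interval";
    static final int UNSET = -1;

    /** Returns the configured node limit, or UNSET when the key is absent. */
    static int readNodeLimit(Properties conf) {
        String raw = conf.getProperty(NODES_KEY);
        return raw == null ? UNSET : Integer.parseInt(raw.trim());
    }

    /** True only when an admin set the node limit explicitly. */
    static boolean useNodeLimit(Properties conf) {
        return readNodeLimit(conf) != UNSET;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(useNodeLimit(conf)); // false: no default shipped, key not set
        conf.setProperty(NODES_KEY, "5");
        System.out.println(useNodeLimit(conf)); // true: explicit value found
    }
}
```

The sentinel only works if no default for the key is shipped in the default resource file; otherwise every cluster would look explicitly configured.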


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-11 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317206#comment-14317206
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. Again I would appreciate examples where user experience is negatively 
impacted compared to before.

When a node has many small blocks, say 10m, would setting 100k blocks per 
node be slower than the original node-based algorithm?


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-11 Thread Andrew Wang (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317443#comment-14317443
]

Andrew Wang commented on HDFS-7411:
---

bq. When a node has many small blocks, say 10m, would setting 100k blocks
per node be slower than the original node-based algorithm?

Could you explain how you chose this value? Taking some sample numbers, a
really big Hadoop cluster might be 4000 nodes and 300 million blocks. This is
an average of 75k blocks per node. My experience with smaller, denser clusters
is that they top out at 500k blocks per node, because there are issues with
block report processing above that. 10m is 20x that already dense number.

Even so, it's not clear that it would be slower. The new algo does incremental
scans, so it'll be a lot faster toward the end of decom. Also with the old
code, doing full scans of 5 10m block DNs is going to be many seconds (maybe
even 1min) of pause each time, which is also certainly not what an admin wants.

bq. Agree. This is the same as my proposal mentioned multiple times.

I think it's a bit different. Chris's proposal is to also enforce a # of nodes
limit in the new code, not to use the old code via a config toggle. The current
patch already does detection of the old config, so it could be tweaked a bit to
do this.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315506#comment-14315506
]

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. Isn't it better that we can quickly mark all these nodes as decommissioned,
rather than waiting minutes or hours? ...

Then why don't you set it to some larger value, say 1m or 10m?

bq. ... Really these statements can be made about any patch.

That's correct. My comments are not specific to you or your patches. For
any patch, if it wants to remove an existing public feature, it should first
deprecate and then remove the feature in some later release. It is not
well thought out for anyone to say that one could fix any bug in the future so
that we have nothing to worry about.

bq. ... could you please address my proposals about how to translate the old
config into the new config? If you're okay removing the old code in a later
2.x release, ...

I cannot think of a good way to do the translation. Later on, when we have
more experience with the new approach, we may be able to make a better
decision.

bq. ... The example you gave exhibits surprising behavior, but it's a pleasant
surprise. It's like finding presents under the tree on Christmas day.

It seems that most cluster admins do not like surprises. A surprising event
means they cannot understand the system well enough to predict the behavior.

I do agree that children like the surprise of finding presents under the
Christmas tree.

bq. As this has already been reviewed by two people, I still feel like a patch
split to ease review is a strange request. -1 on an already +1'd patch
because it doesn't split a refactor would be quite novel.

The -1 is not for the refactoring. It is for keeping the existing behavior. I
actually don't understand why the patch can get +1'ed without bringing up the
incompatibility issue.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315571#comment-14315571
]

Andrew Wang commented on HDFS-7411:
---

bq. Then why don't you set it to some larger value, say 1m or 10m?

Could you clarify what you mean by this? Faster decommission times seem like a
good thing from a user's perspective. Performance improvements aren't an
incompatible change.

bq. For any patch, if it wants to remove an existing public feature, it should
first deprecate and then remove the feature in some later release.

We're not removing a feature, we're improving an existing one. As I've said
above, limiting by # of nodes is flawed in very many ways. It's difficult for
admins to use because the behavior is so variable that it's always surprising.
As you indicate, this is not pleasing to cluster admins. The new patch actually
makes decom much more predictable.

Again I would appreciate examples where user experience is negatively impacted
compared to before.

bq. I cannot think of a good way to do the translation. Later on, when we have
more experience with the new approach, we may be able to make a better
decision.

Can you comment on my proposals above specifically? I would appreciate examples
where user experience is negatively impacted. This will help improve the patch.
Else we're delaying for unspecified, unknown possible future concerns.

Also, as I said above, if you're okay with removing the old decom code in a
later 2.x release, we still need to figure out compatibility with the old
config option now.

bq. The -1 is not for the refactoring. It is for keeping the existing behavior.
I actually don't understand why the patch can get +1'ed without bringing up the
incompatibility issue.

Compatibility was brought up quite a bit during review, even before your first
comment. The patch and comment history demonstrate that. I think what we have
right now satisfies compatibility concerns. The existing behavior is very hard
for admins to use, so there is very little value to keeping it exactly as is.
This patch improves upon that, even for users of the old configuration key, and
I've proposed some other ways we could avoid admin surprises.

I'll also ask again, do you have any specific tests you'd like to see run to
improve your confidence in the code quality?


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315082#comment-14315082
]

Andrew Wang commented on HDFS-7411:
---

bq. Is it the intended behavior?

Isn't it better that we can quickly mark all these nodes as decommissioned,
rather than waiting minutes or hours? I don't see why an admin would prefer
these nodes to stay decom-in-progress for hours. If anything, I hear admins
complaining about decom being too slow, never the opposite.

bq. Thanks for signing up. What if you are unavailable later on? What if
there is a bug you don't know how to fix?

Well, Colin and Ming also reviewed this, so there are two more people who also
can help maintain. I can't comment on a hypothetical bug you don't know how to
fix. Really these statements can be made about any patch.

Do you have specific requirements around additional testing? Or comments about
my proposals?


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315141#comment-14315141
]

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. It does not seem the case to me since no one ever commented on the
incompatible change earlier.

I am actually quite surprised that this was an incompatible change, for
several reasons:
- The JIRA summary starts with "Refactor and improve", which does not sound
incompatible.
- From earlier comments such as "... but this scheme preserves backwards
compatibility." and "... The idea here was to be compatible with the old config
option", it does seem that the patch is compatible.
- Before I said that [dfs.namenode.decommission.nodes.per.interval should be
deprecated
first|https://issues.apache.org/jira/browse/HDFS-7411?focusedCommentId=14294224&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14294224],
no other contributors/reviewers mentioned that the patch removed a public
conf and so was incompatible.
- The JIRA was not marked as an Incompatible change.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315086#comment-14315086
 ] 

Andrew Wang commented on HDFS-7411:
---

I'll also note that some of my earlier proposals address a low block count
per node, if we agree this is a concern.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315112#comment-14315112
]

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. ... Heh; I don't think you intended to float this as a precondition for
commit.

No. I just want to point out that keeping the existing code is important.

bq. ... the patch has received a fair amount of review. ...

It does not seem the case to me since no one ever commented on the incompatible
change earlier.

The block-based approach does sound like a good idea to me. I never objected to
adding it. However, as we generally cannot guarantee that any new code is
perfect and that the new approach solves the old problems, we should keep the
existing code for the existing conf. The existing code can be removed after one
or two releases.

Also, I insist on having a simple refactoring patch before adding the new code,
because the refactoring patch will be very easy to review. The patch
with the new code will then be smaller and easier to understand.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315051#comment-14315051
]

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. ... with pause time being the most important one. ...

It is not necessarily the most important in all cases. Consider a large
cluster with many datanodes, say 4000, but little data, so that each datanode
has only a small number of blocks, say 100. Then the admin decides to reduce
the size of the cluster dramatically, say to 2000 nodes. With a setting of 100k
blocks per node, it could decommission 1000 nodes per iteration. Is it the
intended behavior?
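
The arithmetic behind that figure can be sketched as follows, assuming the
block budget for one iteration is 100k and that nodes are only counted when
all of their blocks fit in the budget. The class and method names are
hypothetical and are not taken from the patch.

```java
final class SparseClusterExample {
    // Whole nodes whose blocks fit within one iteration's block budget.
    static long nodesPerIteration(long blockBudget, long blocksPerNode) {
        if (blocksPerNode <= 0) {
            throw new IllegalArgumentException("blocksPerNode must be positive");
        }
        return blockBudget / blocksPerNode;
    }

    public static void main(String[] args) {
        // Sparse cluster from the example: 100 blocks on each datanode.
        System.out.println(nodesPerIteration(100_000L, 100L));     // prints "1000"
        // Dense cluster: 500k blocks per node, so one node spans several iterations.
        System.out.println(nodesPerIteration(100_000L, 500_000L)); // prints "0"
    }
}
```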

bq. As usual, I'll sign myself up to fix any issues that surface, so there's no
worry about ongoing maintenance.

Thanks for signing up. What if you are unavailable later on? What if there
is a bug you don't know how to fix?


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-10 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315092#comment-14315092
 ] 

Chris Douglas commented on HDFS-7411:
-

bq. What if there is a bug you don't know how to fix?
Heh; I don't think you intended to float this as a precondition for commit.

[~szetszwo], the patch has received a fair amount of review. Are there cases 
that should be tested that would increase your confidence in its correctness 
and performance? Per your example, are there additional parameters that should 
be added to the algorithm?


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315197#comment-14315197
]

Andrew Wang commented on HDFS-7411:
---

Nicholas, could you please address my proposals about how to translate the old
config into the new config? If you're okay removing the old code in a later 2.x
release, then we still need to agree on compatibility now. As I said above, I
struggle to see downsides from a user point of view. The example you gave
exhibits surprising behavior, but it's a pleasant surprise. It's like finding
presents under the tree on Christmas day.

Same for my question about testing. If the concern is code quality, let's think
up some more testing.

As this has already been reviewed by two people, I still feel like a patch
split to ease review is a strange request. -1 on an already +1'd patch because
it doesn't split a refactor would be quite novel.

We're all buds, so I can do a split as a show of good faith, but I'd like
agreement on the above two questions first.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-10 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315201#comment-14315201
 ] 

Chris Douglas commented on HDFS-7411:
-

If the new code were to implement a switch for the current semantics (but not
the existing code), would that satisfy your reservations? A workaround for
unforeseen flaws that falls back to the node-based algorithm is prudent, but
the multi-stage refactor adds more generality to the code than is probably
required. Per your comment, would you want the algorithm to stop when _either_
the max #blocks or max #nodes have been checked? {{numNodesChecked}} is kept
around for metrics anyway; it looks like it could be straightforward to add
this (Andrew, please correct this if it's mistaken).
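
A rough sketch of the "stop on either limit" idea, assuming each node is
checked as a whole and represented only by its block count. The class, the
Result holder and the parameter names are hypothetical; the real code scans
incrementally and is not reproduced here.

```java
import java.util.Arrays;
import java.util.List;

final class DualLimitScan {
    /** Counters for one pass: nodes and blocks checked before a limit was hit. */
    static final class Result {
        final int nodesChecked;
        final long blocksChecked;

        Result(int nodesChecked, long blocksChecked) {
            this.nodesChecked = nodesChecked;
            this.blocksChecked = blocksChecked;
        }
    }

    // Stops as soon as either the node limit or the block limit is reached.
    // A node is checked whole, so the block count may overshoot maxBlocks.
    static Result scan(List<Integer> blocksPerNode, int maxNodes, long maxBlocks) {
        int nodesChecked = 0;
        long blocksChecked = 0;
        for (int blocks : blocksPerNode) {
            if (nodesChecked >= maxNodes || blocksChecked >= maxBlocks) {
                break;
            }
            nodesChecked++;
            blocksChecked += blocks;
        }
        return new Result(nodesChecked, blocksChecked);
    }

    public static void main(String[] args) {
        // Sparse nodes: the node limit stops the pass first.
        Result sparse = scan(Arrays.asList(100, 100, 100, 100), 2, 100_000L);
        System.out.println(sparse.nodesChecked + " nodes, " + sparse.blocksChecked + " blocks");
        // prints "2 nodes, 200 blocks"

        // Dense nodes: the block limit stops the pass first.
        Result dense = scan(Arrays.asList(60_000, 60_000, 60_000), 5, 100_000L);
        System.out.println(dense.nodesChecked + " nodes, " + dense.blocksChecked + " blocks");
        // prints "2 nodes, 120000 blocks"
    }
}
```

With both limits in place, a sparse cluster is bounded by the node count and a
dense cluster by the block count, which covers both cases raised in this
thread.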


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315220#comment-14315220
 ] 

Andrew Wang commented on HDFS-7411:
---

Yea, it'd be easy to do a node-based limiter, though I'd really prefer some of 
the other schemes I offered up. They offer very similar properties in terms of 
pause times, and make more sense when considering the new incremental scan 
scheme.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-09 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313358#comment-14313358
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. ..., HDFS-7712 and HDFS-7734 also do not relate to this JIRA. HDFS-7706 is
what I split out from this JIRA. ...

If HDFS-7706 had not been split out (which was your original plan), the work of
HDFS-7712 might have been added here. So it seems related.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-09 Thread Tsz Wo Nicholas Sze (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313355#comment-14313355
]

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. Since the decom manager iterates over the whole datanode list, both live and
decommissioning nodes count towards the limit. Thus, the actual number of
decomming nodes processed varies between 0 and the limit.

I agree that live nodes should not be counted. It is a bug in the node-based
decommission logic and we probably should fix the bug.

A problem with translating node-based to block-based decommission is that there
is no good choice for the number of blocks per node. It may result in surprising
behavior for extremely large or extremely small clusters. This is one reason
that we want to keep the old code.

A second reason is that the old code is proven to work. If we remove it
now and then find a serious bug in this patch after a release, then we don't
have anything to fall back on. That's why the standard procedure for removing an
existing feature is to deprecate it first and then remove it later.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-09 Thread Andrew Wang (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313386#comment-14313386
]

Andrew Wang commented on HDFS-7411:
---

Re: node-to-block translation, do any of my proposed schemes meet your
criteria? I think I laid out the concerns from the admin's perspective
correctly, with pause time being the most important one. I'll also note that, by
that metric, fixing the bug of counting live nodes will quite severely affect
pause time.

Re: proven to be working, is there additional testing that you will find
satisfactory? This change passes the existing decommission unit tests, as well
as adding a fair bit more. I think it's quite rare to preserve old code that's
functionally the same behind a config. I actually can't think of an example
from the last two years.

As usual, I'll sign myself up to fix any issues that surface, so there's no
worry about ongoing maintenance.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-09 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312877#comment-14312877
 ] 

Andrew Wang commented on HDFS-7411:
---

[~szetszwo] have you gotten a chance to read the above? Any other comments from 
other reviewers? I'd like to move on this if possible.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-05 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308115#comment-14308115
 ] 

Andrew Wang commented on HDFS-7411:
---

Nicholas, HDFS-7712 and HDFS-7734 also do not relate to this JIRA. HDFS-7706 is 
what I split out from this JIRA. The other two are a separate effort to convert 
more of HDFS to slf4j.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-05 Thread Andrew Wang (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308105#comment-14308105
]

Andrew Wang commented on HDFS-7411:
---

I had an offline request to summarize some of the above.

Nicholas's compatibility concern regards the rate limiting of decommissioning.
Currently, this is expressed as a number of nodes to process per decom manager
wakeup. There are a number of flaws with this scheme:

* Since the decom manager iterates over the whole datanode list, both live and
decommissioning nodes count towards the limit. Thus, the actual number of
decomming nodes processed varies between 0 and the limit.
* Since datanodes have different numbers of blocks, the amount of actual work
can vary based on this as well.
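
To illustrate the first flaw, here is a simplified model (not the actual
DatanodeManager code) in which each wakeup visits a fixed number of entries of
the datanode list regardless of their state. The wrap-around iteration and all
names are assumptions made for the sketch.

```java
import java.util.Arrays;
import java.util.List;

final class NodeLimitedScan {
    // Visits at most `limit` list entries starting at `start` (wrapping around)
    // and returns how many of them were actually decommissioning.
    static int decommissioningNodesVisited(List<Boolean> isDecommissioning,
                                           int start, int limit) {
        int size = isDecommissioning.size();
        int visited = 0;
        int found = 0;
        while (visited < limit && visited < size) {
            if (isDecommissioning.get((start + visited) % size)) {
                found++;
            }
            visited++; // live nodes consume the budget too
        }
        return found;
    }

    public static void main(String[] args) {
        // Three live nodes followed by three decommissioning nodes.
        List<Boolean> nodes = Arrays.asList(false, false, false, true, true, true);
        int limit = 3;
        System.out.println(decommissioningNodesVisited(nodes, 0, limit)); // prints "0"
        System.out.println(decommissioningNodesVisited(nodes, 1, limit)); // prints "1"
        System.out.println(decommissioningNodesVisited(nodes, 3, limit)); // prints "3"
    }
}
```

With the same limit, the useful work per wakeup ranges from nothing to the full
limit, depending only on where the iteration happens to start.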

This means:
* This config parameter only very loosely corresponds to decom rate and decom
pause times, which are the two things that admins care about.
* Trying to tune decom behavior with this parameter is thus somewhat futile.
* In the grand scope of HDFS, this is also not a common parameter to be tweaked.

Because of this, we felt it was okay to change the interpretation of this config
option. I view the old behavior more as a bug than something that is being
depended upon by a user.

Translating this number of nodes limit instead into a number of blocks limit
(as done in the current patch) makes the config far more predictable and thus
usable. Since the new code also supports incremental scans (which is what makes
it faster), specifying the limit as a number of nodes doesn't make much
sense.

The only potential surprise I see for cluster operators is if the translation
of the limit from {{# nodes}} to {{# blocks}} is too liberal. This would result
in longer maximum pause times than before. We thought 100k blocks per node was
a conservative estimate, but this could be further reduced.
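
The translation can be sketched as a single multiplication, assuming the old
node limit is scaled by a fixed blocks-per-node factor. The class name, method
name and the sample node limit of 5 are illustrative assumptions, not taken
from the patch.

```java
final class LimitTranslation {
    // The "100k blocks per node" estimate discussed in this comment.
    static final long ASSUMED_BLOCKS_PER_NODE = 100_000L;

    // Converts an old-style node limit into a block limit; fails on overflow.
    static long blocksPerInterval(int nodesPerInterval, long blocksPerNode) {
        return Math.multiplyExact((long) nodesPerInterval, blocksPerNode);
    }

    public static void main(String[] args) {
        int nodesPerInterval = 5; // illustrative old-config value
        System.out.println(blocksPerInterval(nodesPerInterval, ASSUMED_BLOCKS_PER_NODE));
        // prints "500000"
        // A smaller factor gives a lower block limit and thus shorter pauses.
        System.out.println(blocksPerInterval(nodesPerInterval, 50_000L));
        // prints "250000"
    }
}
```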

One avenue I do not want to pursue is keeping the old code around, as Nicholas
has proposed. This increases our maintenance burden, and means many people will
keep running into the same issues surrounding decom.

If Nicholas still does not agree with the above rationale, I see the following
potential options for improvement:

* Be even more conservative with the translation factor, e.g. assume only 50k
blocks per node
* Factor in the number of nodes and/or avg blocks per node to the translation.
This will better approximate the old average pause times.
* Make the new decom manager also support a {{# nodes}} limit. This isn't great
since scans are incremental now, but it means we'll be doing strictly less work
per pause than before.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-04 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306261#comment-14306261
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

I do not intend to block the decommission improvement. However, it is
really a bad idea to arbitrarily change the behavior of a public conf when
keeping the existing behavior is easy, and to mix code refactoring with
improvements in one big patch.

[~andrew.wang], I am glad that you split the slf4j change from here to
HDFS-7706 and filed another minor JIRA, HDFS-7712, for the further work. If that
work had been included here, the blocker JIRA HDFS-7734 would have been broken
by the unnecessarily complicated patch here.

In order to move faster, how about I volunteer to do the refactoring?


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-04 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306167#comment-14306167
 ] 

Kihwal Lee commented on HDFS-7411:
--

HDFS-7735 is also being worked on. If we can convince ourselves that the
performance benefit of this change is great, it will be worthwhile to spend
time finding a way to introduce this without incompatibility. The performance
here can be viewed from two angles: 1) impact on the NameNode and 2)
decommissioning speed. I will also try to review the patch. I think everyone
agrees that we need to improve decommissioning.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-04 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306193#comment-14306193
 ] 

Andrew Wang commented on HDFS-7411:
---

I would definitely welcome some other opinions on the perceived incompatibility
here. I think Colin and I have already made our thoughts on the matter
pretty clear above. Thanks in advance, Kihwal.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-02 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301870#comment-14301870
 ] 

Colin Patrick McCabe commented on HDFS-7411:


Thanks for splitting off the logging changes, Andrew.

It's reasonable to keep backwards compatibility by supporting 
{{dfs.namenode.decommission.nodes.per.interval}}.  I can see that you have done 
this in the latest patch.  Maybe eventually in Hadoop 3.0 we can drop support 
for this parameter.

+1.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302015#comment-14302015
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

{code}
+// Assume 100k blocks per node.
+blocksPerInterval = 100 * 1000 * numNodes;
{code}
How did you come up with this assumption?  It seems invalid for some clusters.
Also, nodes in a cluster may have different numbers of blocks.  Simply assuming
that all datanodes have the same number of blocks does not seem correct.

Why not keep the existing code?  It is a simple, easy way to support backward
compatibility.
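
For readers following the thread, the conversion being questioned can be
sketched roughly as below. This is a minimal illustration, not the code from
the patch: the class {{LegacyDecomLimit}}, the method {{blocksPerInterval}},
the {{fallback}} parameter, and the use of a plain {{Map}} in place of Hadoop's
{{Configuration}} are all hypothetical. Only the key name and the 100k blocks
per node constant come from the discussion.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not the HDFS-7411 patch code. */
final class LegacyDecomLimit {
  /** Old configuration key named in this thread. */
  static final String NODES_PER_INTERVAL_KEY =
      "dfs.namenode.decommission.nodes.per.interval";

  /** The assumption quoted from the patch under review: 100k blocks per node. */
  static final long ASSUMED_BLOCKS_PER_NODE = 100L * 1000L;

  /**
   * Derives a blocks-per-interval limit from the legacy node-based key.
   * Returns the caller-supplied fallback (an arbitrary value in this sketch)
   * when the legacy key is not set.
   */
  static long blocksPerInterval(Map<String, String> conf, long fallback) {
    String nodes = conf.get(NODES_PER_INTERVAL_KEY);
    if (nodes == null) {
      return fallback;
    }
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    return Math.multiplyExact(ASSUMED_BLOCKS_PER_NODE,
        Long.parseLong(nodes.trim()));
  }

  public static void main(String[] args) {
    Map<String, String> conf = Map.of(NODES_PER_INTERVAL_KEY, "5");
    System.out.println(blocksPerInterval(conf, 1L)); // prints 500000
  }
}
```

With the old key set to 5, the sketch yields a limit of 500,000 blocks per
interval regardless of how many blocks each node actually holds, which is the
objection raised above.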


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-02 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301881#comment-14301881
 ] 

Andrew Wang commented on HDFS-7411:
---

Thanks for reviewing, Colin. I'll wait another day before committing, in case
there are further review comments. I think we can take most things to a
follow-on; it would be great to get this in to unblock further decom
improvements.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302030#comment-14302030
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

I would also like to get this work committed sooner.  So, I suggest
# Create another JIRA for a pure code refactoring.  Move all the existing code
to DecommissionManager.  No logic change.
# Change the patch here to add the new decommission code but NOT remove the
existing code, so that it uses the old code if the old conf is set and the new
code if the new conf is set.

Sound good?
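
The second suggestion amounts to choosing a code path from whichever key is
present. A minimal sketch of that selection follows, under stated assumptions:
the class and enum names are hypothetical, a plain {{Map}} stands in for
Hadoop's {{Configuration}}, the name of the new key is assumed rather than
taken from this thread, and the proposal does not say what should happen when
both keys or neither key is set, so those two branches are guesses.

```java
import java.util.Map;

/** Hypothetical selector for illustration; not part of any HDFS patch. */
final class DecomSchemeSelector {
  enum Scheme { NODE_BASED, BLOCK_BASED }

  /** Old key named in this thread. */
  static final String OLD_KEY =
      "dfs.namenode.decommission.nodes.per.interval";

  /** Assumed name for the new key; the thread does not spell it out. */
  static final String NEW_KEY =
      "dfs.namenode.decommission.blocks.per.interval";

  static Scheme select(Map<String, String> conf) {
    if (conf.containsKey(NEW_KEY)) {
      // Assumption: the new key wins when both are set.
      return Scheme.BLOCK_BASED;
    }
    if (conf.containsKey(OLD_KEY)) {
      // Only the old key is set: keep the existing node-based behavior.
      return Scheme.NODE_BASED;
    }
    // Assumption: with neither key set, use the new scheme.
    return Scheme.BLOCK_BASED;
  }

  public static void main(String[] args) {
    System.out.println(select(Map.of(OLD_KEY, "5")));      // prints NODE_BASED
    System.out.println(select(Map.of(NEW_KEY, "500000"))); // prints BLOCK_BASED
  }
}
```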


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-02 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302322#comment-14302322
 ] 

Arpit Agarwal commented on HDFS-7411:
-

bq.  Ming Ma, Arpit Agarwal, and myself have reviewed this.
I have not reviewed the patch. I just commented on a change to a single 
function.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-02 Thread Colin Patrick McCabe (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302319#comment-14302319
]

Colin Patrick McCabe commented on HDFS-7411:

I don't see any benefit to splitting the patch further. The old logic is
seriously flawed... it's not effective at rate limiting and often goes way too
fast or too slow, because it is based on number of nodes rather than number of
blocks.

I think it's great that [~andrew.wang] took on this task, which has been a
maintenance problem for us for a while. This is a great example of someone who
really cares about the project making things better by working on something
which is boring (it will never make it into a list of new features or
exciting research talks) but essential to our users.

There is no benefit to keeping around multiple broken implementations of things
to do the same job. Hadoop has enough dead and obsolete code as is... more
than enough. Things like the {{RemoteBlockReader}} / {{RemoteBlockReader2}}
split increase our maintenance burden and confuse users and potential
contributors. I only allowed the {{BlockReaderLocalLegacy}} /
{{BlockReaderLocal}} split because we didn't have platform support for file
descriptor passing on Windows. But since there are no platform support issues
here, there is no reason to increase our maintenance burden.

If we are concerned about stability, we can let this soak in trunk for a while.

It has been through three months of review. [~mingma], [~arpitagarwal], and
myself have reviewed this. It's ready to go in, and I think it should. +1.
Let's commit this today if there are no other comments about the patch.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302510#comment-14302510
]

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. ... the old limiting scheme is seriously flawed. ...
bq. ... The old logic is seriously flawed...

Sure, you see it this way but some users may feel that the old code works just
fine. They may not have time to deal with new behavior. We cannot force them
to do so.

bq. There is no benefit to keeping around multiple broken implementations of
things to do the same job. ...

We are not keeping multiple implementations. The old implementation will be
removed in the future.

bq. ... It's ready to go in, and I think it should. +1. Let's commit this today
if there are no other comments about the patch.

Let me clarify my -1. The patch changes an existing conf property to a
different behavior. Instead, we should keep the existing behavior, deprecate
the conf property first and then remove it later.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

2015-02-02 Thread Andrew Wang (JIRA)

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302092#comment-14302092
]

Andrew Wang commented on HDFS-7411:
---

As discussed above, the old limiting scheme is seriously flawed. The amount of
time spent is highly variable, since the limit is # nodes rather than # blocks,
and the size of each node is variable. It also counts both decommissioning and
non-decommissioning nodes towards the limit.

That nodes can vary in # of blocks is really an argument for *not* using #
nodes as a limit. # of blocks is superior. The 100k was chosen as a
conservative number that will not lead to overly long wake-up times, which is
the point of this limit. In fact, with this patch we should see far more
predictable pause times for decommission work even with the old config. In
addition, it'll also result in an improvement in overall decommission speed
because of the incremental scan logic.

Because of this, I do not see any advantage to keeping this old code around.
The old code is worse in terms of predictable pause times and overall
decommissioning speed. It also has other flaws that are corrected by this
patch. The new code is compatible with the old configuration. It also requires
a lot of work to split the refactoring.
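
The variability argument can be illustrated with a toy calculation. The sketch
below is not the DecommissionManager code; the class and method names are
hypothetical, the block counts are made up, and it assumes a block-limited
scan can stop partway through a node and resume on the next wake-up. It only
compares how much work each wake-up does under a node limit versus a block
limit.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy comparison for illustration; all names and numbers are hypothetical. */
final class ThrottleComparison {

  /** Blocks scanned per wake-up when the limit is a number of nodes. */
  static List<Long> perTickByNodes(long[] blocksPerNode, int nodesPerTick) {
    if (nodesPerTick <= 0) {
      throw new IllegalArgumentException("nodesPerTick must be positive");
    }
    List<Long> ticks = new ArrayList<>();
    for (int i = 0; i < blocksPerNode.length; i += nodesPerTick) {
      long sum = 0;
      int end = Math.min(i + nodesPerTick, blocksPerNode.length);
      for (int j = i; j < end; j++) {
        sum += blocksPerNode[j];
      }
      ticks.add(sum);
    }
    return ticks;
  }

  /** Blocks scanned per wake-up when the limit is a number of blocks. */
  static List<Long> perTickByBlocks(long[] blocksPerNode, long blocksPerTick) {
    if (blocksPerTick <= 0) {
      throw new IllegalArgumentException("blocksPerTick must be positive");
    }
    long total = 0;
    for (long b : blocksPerNode) {
      total += b;
    }
    List<Long> ticks = new ArrayList<>();
    // Each wake-up scans at most blocksPerTick blocks, then yields.
    for (long remaining = total; remaining > 0; remaining -= blocksPerTick) {
      ticks.add(Math.min(remaining, blocksPerTick));
    }
    return ticks;
  }

  public static void main(String[] args) {
    long[] nodes = {10_000L, 900_000L, 50_000L, 40_000L};
    System.out.println(perTickByNodes(nodes, 2));         // [910000, 90000]
    System.out.println(perTickByBlocks(nodes, 250_000L)); // [250000, 250000, 250000, 250000]
  }
}
```

In this made-up cluster the node-limited wake-ups differ by a factor of about
ten, while the block-limited wake-ups are all the same size.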

I still plan to commit tomorrow.


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager

[
https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302585#comment-14302585
]

Tsz Wo Nicholas Sze commented on HDFS-7411:
---

bq. I think it's pretty common for us to change the behavior of the system when
the behavior change is a strict improvement. Keeping around inferior behavior
just for the purpose of consistency seems rather pointless.

First of all, we have never shown that the new behavior is strictly better.
It is a hypothesis. No?

bq. For example, it used to be the case that fsimage transfers ...

The fsimage transfer change is an internal protocol change. The http interface
is not a public API. However, the conf property discussed here is public.

bq. Similarly, when we find ways that CPU performance or memory usage in the NN
can be improved, ...

The change here is not like that. It changes the scheme from node based to
block based. It is not making the node based decommission faster.

bq. As Andrew Wang has already described, the new behavior should be both more
performant and more predictable. ...

As mentioned previously, it is a hypothesis. That is why Andrew described it as
"should be".


[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager