[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Release Note: This change introduces a new configuration key used to throttle decommissioning work, dfs.namenode.decommission.blocks.per.interval. This new key overrides and deprecates the previous related configuration key dfs.namenode.decommission.nodes.per.interval. The new key is intended to result in more predictable pause times while scanning decommissioning nodes. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.7.0 Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch, hdfs-7411.010.patch, hdfs-7411.011.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HDFS-7411: Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks, Nicholas. I looked at a diff from v10 of the patch, and checked the new unit test and additional logic for exiting the decom loop at the node limit. I committed this. Thanks, Andrew Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.7.0 Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch, hdfs-7411.010.patch, hdfs-7411.011.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.011.patch Sorry for the delay on this everyone, I was on vacation last week. I've implemented Chris D's suggestion (with unit test) that does a per node limit. If the deprecated config key is set, it is used preferentially over the default for the new config key. Nicholas, does this satisfy your criteria? Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch, hdfs-7411.010.patch, hdfs-7411.011.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.010.patch Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch, hdfs-7411.010.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.009.patch Added the @Ignore in this latest patch, no other changes. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.008.patch Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: (was: hdfs-7411.008.patch) Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.008.patch Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.007.patch Here's a new patch addressing Colin's review feedback: * Got rid of the nodes.per.interval config key and instead use blocks.per.interval. I didn't bother with a DeprecationDelta because the keys are semantically different and can't just be aliased. Not sure if there's more to do here. * Added support for 0 means unlimited for the max.concurrent.tracked.nodes property * Fixed typos, spacing, added additional debug logging requested. The every 30 minute print since we print out the first insufficiently block encountered while checking each node. * Fixed a small OBO in the rate limiting, and added a test for the limiting Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.006.patch Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.005.patch Thanks again Arpit for commenting. The TestOpenFilesWithSnapshot failure was caused by the block report change, some digging shows the NN won't come out of startup safemode. I'm going to defer this fix until later, it's unrelated to the decom manager refactor itself. New patch attached without the block report change, but plus test improvements. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.004.patch Thanks Ming for reviewing. I attached a new patch which hopefully addresses your feedback. I added a new config param which limits the number of nodes that can be decom'd at once. Above the limit, nodes sit in a queue. I also found and fixed what I think is a bug during block report processing of UC files. In the new unit test I added, I had blocks that would show up in the blocksMap with replicas, but the blocks wouldn't show up on each DN's block list. [~arpitagarwal] could you take a quick look at the one-line addition in BlockManager#addStoredBlockUnderConstruction? I also wonder if the same issue exists for hsync(SyncFlag.UPDATE_LENGTH), but I didn't investigate deeper. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Summary: Refactor and improve decommissioning logic into DecommissionManager (was: Refactor decommissioning logic into DecommissionManager) Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)