[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Resolution: Fixed
    Fix Version/s: 3.0.0-alpha2
                   2.9.0
    Status: Resolved (was: Patch Available)

Thanks [~eddyxu] again. I have committed it to trunk and branch-2.

> Block management for maintenance states
> ---------------------------------------
>
> Key: HDFS-9390
> URL: https://issues.apache.org/jira/browse/HDFS-9390
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ming Ma
> Assignee: Ming Ma
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-9390-2.patch, HDFS-9390-3.patch, HDFS-9390-4.patch,
> HDFS-9390-5.patch, HDFS-9390-branch-2.002.patch, HDFS-9390-branch-2.patch,
> HDFS-9390.patch
>
>
> When a node is transitioned to, stays in, or is transitioned out of maintenance state,
> we need to make sure blocks on that node are properly handled.
> * When nodes are put into maintenance, they first go to
> ENTERING_MAINTENANCE, and blocks must be minimally replicated before
> the nodes are transitioned to IN_MAINTENANCE.
> * Do not replicate blocks when nodes are in maintenance states. Maintenance
> replicas remain in BlockMaps and thus are still considered valid from the
> block replication point of view. In other words, putting a node into
> “maintenance” mode won’t trigger BlockManager to replicate its blocks.
> * Do not invalidate replicas on nodes under maintenance. After any file's
> replication factor is reduced, NN needs to invalidate some replicas. It
> should exclude nodes under maintenance in the handling.
> * Do not put IN_MAINTENANCE replicas in LocatedBlock for read operations.
> * Do not allocate any new block on nodes under maintenance.
> * Have Balancer exclude nodes under maintenance.
> * Exclude nodes under maintenance for DN cache.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
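The block-handling rules in the HDFS-9390 description can be illustrated with a minimal Java sketch. Nothing here is taken from the attached patches: `AdminState`, `Node`, and `MaintenanceSketch` are hypothetical stand-ins rather than Hadoop classes, and the definition of "minimally replicated" (at least `minReplicas` replicas on nodes that are not under maintenance, not counting the node that is entering maintenance) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical admin states, named after the states in the issue description.
enum AdminState { NORMAL, ENTERING_MAINTENANCE, IN_MAINTENANCE }

// Hypothetical stand-in for a datanode; this is not Hadoop's DatanodeDescriptor.
final class Node {
    final String name;
    AdminState state = AdminState.NORMAL;

    Node(String name) { this.name = name; }

    boolean underMaintenance() { return state != AdminState.NORMAL; }
}

final class MaintenanceSketch {
    // Assumption: a block is "minimally replicated" when at least minReplicas
    // replicas sit on NORMAL nodes other than the node entering maintenance.
    static boolean minimallyReplicated(List<Node> replicas, Node entering, int minReplicas) {
        int count = 0;
        for (Node n : replicas) {
            if (n != entering && n.state == AdminState.NORMAL) {
                count++;
            }
        }
        return count >= minReplicas;
    }

    // ENTERING_MAINTENANCE -> IN_MAINTENANCE once every block on the node passes the check.
    static void tryFinishEntering(Node node, List<List<Node>> blocksOnNode, int minReplicas) {
        if (node.state != AdminState.ENTERING_MAINTENANCE) {
            return;
        }
        for (List<Node> replicas : blocksOnNode) {
            if (!minimallyReplicated(replicas, node, minReplicas)) {
                return;
            }
        }
        node.state = AdminState.IN_MAINTENANCE;
    }

    // Maintenance replicas still count toward the replication factor, so putting
    // a node into maintenance does not by itself trigger re-replication.
    static boolean needsReplication(List<Node> replicas, int replicationFactor) {
        return replicas.size() < replicationFactor;
    }

    // Read locations omit replicas on IN_MAINTENANCE nodes.
    static List<Node> readLocations(List<Node> replicas) {
        List<Node> out = new ArrayList<>();
        for (Node n : replicas) {
            if (n.state != AdminState.IN_MAINTENANCE) {
                out.add(n);
            }
        }
        return out;
    }

    // New blocks are allocated only on nodes outside any maintenance state.
    static List<Node> writeTargets(List<Node> candidates) {
        List<Node> out = new ArrayList<>();
        for (Node n : candidates) {
            if (!n.underMaintenance()) {
                out.add(n);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        List<Node> block = new ArrayList<>();
        Collections.addAll(block, a, b, c);

        a.state = AdminState.ENTERING_MAINTENANCE;
        tryFinishEntering(a, Collections.singletonList(block), 1);

        System.out.println("state of a: " + a.state);
        System.out.println("needs replication: " + needsReplication(block, 3));
        System.out.println("read locations: " + readLocations(block).size());
        System.out.println("write targets: " + writeTargets(block).size());
    }
}
```

The sketch keeps each rule in its own method so the read path, write path, and replication check can be compared side by side; the real BlockManager logic is considerably more involved.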
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: HDFS-9390-branch-2.002.patch

Reload with the proper patch name for Jenkins to run.
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: (was: HDFS-9390-2-branch-2.patch)
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: HDFS-9390-2-branch-2.patch

Updated branch-2 patches to address some of the checkstyle issues.
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: HDFS-9390-branch-2.patch

Here is the patch for branch-2. Note that the backport isn't trivial due to the differences between branch-2 and trunk.
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: HDFS-9390-5.patch

Reload the correct patch.
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: (was: HDFS-9390-5.patch)
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: HDFS-9390-5.patch

Thanks [~eddyxu]! Here is the patch after rebase.
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: HDFS-9390-4.patch

Updated patch to address checkstyle issues.
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: HDFS-9390-3.patch

Thanks [~eddyxu] for the review! Here is the new patch.

bq. If we change it to the following code, we can undo most of the DatanodeManager.java changes, of which the motivation of these changes are not clear to me in the first sight.
The main reason is that {{DatanodeManager#removeDatanode}} performs other operations, such as {{heartbeatManager.removeDatanode(nodeInfo);}} and {{blockManager.getBlockReportLeaseManager().unregister(nodeInfo);}}, which should also be called when a maintenance node becomes dead.

bq. Why it does not re-calculate stats when minReplicationToBeInMaintanence == 0?
Good catch. In addition to fixing it, the new patch also updates TestNamenodeCapacityReport to cover the maintenance scenario.

bq. Is the comment correct in the context?
Fixed.

bq. One related question is that, why startMaintenance() and stopMaintenance() are in DecommissionManager.
This is similar to startDecommission() and stopDecommission() in DecommissionManager. I plan to rename DecommissionManager to AdminServiceManager as part of HDFS-9388.

bq. In NumberReplicas.java, you might want consider rename int maintenance() to int maintenanceReplicas, so is liveEnteringMaintence().
Fixed.
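The reply about {{DatanodeManager#removeDatanode}} can be pictured with a small Java sketch. It is only an illustration under stated assumptions, not the code in the patch: `DeadNodeSketch` and all of its fields and methods are hypothetical, and the rule that a dead maintenance node keeps its replicas in the block map is inferred from the issue description's statement that maintenance replicas remain in BlockMaps.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical bookkeeping that only imitates the operations named in the reply;
// none of these fields or methods are Hadoop APIs.
final class DeadNodeSketch {
    final Set<String> heartbeats = new HashSet<>();              // nodes tracked for heartbeats
    final Set<String> blockReportLeases = new HashSet<>();       // nodes registered for block report leases
    final Map<String, Set<Long>> blocksByNode = new HashMap<>(); // replicas recorded per node

    void register(String node, Set<Long> blocks) {
        heartbeats.add(node);
        blockReportLeases.add(node);
        blocksByNode.put(node, new HashSet<>(blocks));
    }

    // Assumption: a dead node always loses its heartbeat and lease registrations,
    // but its replicas are dropped from the map only if it was not in maintenance.
    void removeDeadNode(String node, boolean inMaintenance) {
        heartbeats.remove(node);
        blockReportLeases.remove(node);
        if (!inMaintenance) {
            blocksByNode.remove(node);
        }
    }

    public static void main(String[] args) {
        DeadNodeSketch registry = new DeadNodeSketch();
        Set<Long> blocks = new HashSet<>();
        blocks.add(1L);
        registry.register("dn1", blocks);
        registry.register("dn2", blocks);

        registry.removeDeadNode("dn1", true);   // dead while in maintenance
        registry.removeDeadNode("dn2", false);  // ordinary dead node

        System.out.println("dn1 replicas kept: " + registry.blocksByNode.containsKey("dn1"));
        System.out.println("dn2 replicas kept: " + registry.blocksByNode.containsKey("dn2"));
    }
}
```

Routing both cases through one removal method is what lets the shared cleanup steps run for maintenance nodes as well, which appears to be the motivation given in the reply.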
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: HDFS-9390-2.patch

Updated patch to address the checkstyle and test code issues.
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Attachment: HDFS-9390.patch

[~eddyxu] sorry for the delay. Because of the large differences between trunk and 2.6, on which the initial patch was based, it required quite a lot of work. Here is the draft patch. A couple of notes:

* Erasure coding might need more work; at least new unit tests are required. We can use another jira for that.
* It seems the safety properties maintained by BlockManager are only implied in the code. I have started to document more of them as part of this patch.
* There are other issues the patch tries to fix along the way; for example, {{BlockManager#getRedundancy}} can be removed.
[jira] [Updated] (HDFS-9390) Block management for maintenance states
[ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9390:
--------------------------
    Assignee: Ming Ma
    Status: Patch Available (was: Open)