[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yanbin.zhang updated HDFS-14920:
--------------------------------
    Labels:   (was: 无)

> Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
> ---
>
> Key: HDFS-14920
> URL: https://issues.apache.org/jira/browse/HDFS-14920
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: ec
> Affects Versions: 3.0.3, 3.2.1, 3.1.3
> Reporter: Hui Fei
> Assignee: Hui Fei
> Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-14920.001.patch, HDFS-14920.002.patch, HDFS-14920.003.patch, HDFS-14920.004.patch, HDFS-14920.005.patch
>
>
> Decommission test hangs in our clusters.
> We have seen the following messages:
> {quote}
> 2019-10-22 15:58:51,514 TRACE org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Block blk_-9223372035600425840_372987973 numExpected=9, numLive=5
> 2019-10-22 15:58:51,514 INFO BlockStateChange: Block: blk_-9223372035600425840_372987973, Expected Replicas: 9, live replicas: 5, corrupt replicas: 0, decommissioned replicas: 0, decommissioning replicas: 4, maintenance replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is Open File: false, Datanodes having this block: 10.255.43.57:50010 10.255.53.12:50010 10.255.63.12:50010 10.255.62.39:50010 10.255.37.36:50010 10.255.33.15:50010 10.255.69.29:50010 10.255.51.13:50010 10.255.64.15:50010 , Current Datanode: 10.255.69.29:50010, Is current datanode decommissioning: true, Is current datanode entering maintenance: false
> 2019-10-22 15:58:51,514 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Node 10.255.69.29:50010 still has 1 blocks to replicate before it is a candidate to finish Decommission In Progress
> {quote}
> After digging through the source code and the cluster logs, our guess is that it happens in the following steps.
> # The erasure coding policy is RS-6-3-1024k.
> # EC block b consists of internal blocks b0, b1, b2, b3, b4, b5, b6, b7, b8. b0 is on datanode dn0, b1 is on datanode dn1, and so on.
> # At the beginning dn0 is decommissioning. b0 is replicated successfully, but dn0 is still in Decommission In Progress.
> # Later the datanodes holding b1, b2 and b3 start decommissioning, and dn4, which holds b4, goes out of service. b4 therefore needs to be reconstructed, and an ErasureCodingWork is created to do it. In that ErasureCodingWork, additionalReplRequired is 4.
> # Because hasAllInternalBlocks is false, ErasureCodingWork#addTaskToDatanode -> DatanodeDescriptor#addBlockToBeErasureCoded is called, and a BlockECReconstructionInfo task is sent to the DataNode.
> # The DataNode cannot reconstruct the block because the task has 4 targets, which is greater than 3 (the number of parity blocks).
> The problem is in the following code from BlockManager.java#scheduleReconstruction:
> {code}
>       // should reconstruct all the internal blocks before scheduling
>       // replication task for decommissioning node(s).
>       if (additionalReplRequired - numReplicas.decommissioning() -
>           numReplicas.liveEnteringMaintenanceReplicas() > 0) {
>         additionalReplRequired = additionalReplRequired -
>             numReplicas.decommissioning() -
>             numReplicas.liveEnteringMaintenanceReplicas();
>       }
> {code}
> Reconstruction should happen first, followed by replication for the decommissioning nodes. Here numReplicas.decommissioning() is 4 and additionalReplRequired is 4, so the condition evaluates to 0 > 0 and additionalReplRequired stays at 4. That is wrong:
> numReplicas.decommissioning() should be 3, because it should exclude the replica that is already live (b0 has been replicated).
> If so, additionalReplRequired will be 1 and reconstruction will be scheduled as expected. After that, decommission goes on.
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
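
The arithmetic in the description can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not HDFS code: the class EcDecommissionSketch and the helpers adjustForDecommissioning and canReconstruct are hypothetical names. The first helper mirrors the if-statement quoted from BlockManager.java#scheduleReconstruction; the second encodes the reporter's observation that a task with more targets than parity blocks cannot be reconstructed. The input values (additionalReplRequired 4, decommissioning 4 versus 3, parity 3) are the ones given in the report.

```java
// Hypothetical sketch; none of these names exist in HDFS.
final class EcDecommissionSketch {

    // Mirrors the quoted if-statement: subtract the decommissioning and
    // live-entering-maintenance counts only when the result stays positive,
    // otherwise leave additionalReplRequired unchanged.
    static int adjustForDecommissioning(int additionalReplRequired,
                                        int decommissioning,
                                        int liveEnteringMaintenance) {
        int remaining = additionalReplRequired - decommissioning
            - liveEnteringMaintenance;
        return remaining > 0 ? remaining : additionalReplRequired;
    }

    // Assumption taken from the report: the DataNode cannot reconstruct
    // when the number of targets exceeds the number of parity blocks.
    static boolean canReconstruct(int targets, int parityBlocks) {
        return targets <= parityBlocks;
    }

    public static void main(String[] args) {
        final int parityBlocks = 3;           // RS-6-3
        final int additionalReplRequired = 4; // value stated in the report

        // Counts as logged: decommissioning replicas = 4.
        int observed = adjustForDecommissioning(additionalReplRequired, 4, 0);
        // Count the reporter argues for: decommissioning replicas = 3.
        int proposed = adjustForDecommissioning(additionalReplRequired, 3, 0);

        System.out.println("observed targets=" + observed
            + " reconstructable=" + canReconstruct(observed, parityBlocks));
        System.out.println("proposed targets=" + proposed
            + " reconstructable=" + canReconstruct(proposed, parityBlocks));
    }
}
```

With the logged counts the adjustment is skipped (4 - 4 - 0 is not greater than 0), so 4 targets remain and exceed the 3 parity blocks. With a decommissioning count of 3 the result is 1 target, which fits.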
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yanbin.zhang updated HDFS-14920:
--------------------------------
    Labels: 无  (was: pull)
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yanbin.zhang updated HDFS-14920:
--------------------------------
    Labels: pull  (was: )
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siyao Meng updated HDFS-14920:
------------------------------
    Description:
Decommission test hangs in our clusters.
Have seen the messages as follow
{quote}
2019-10-22 15:58:51,514 TRACE org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Block blk_-9223372035600425840_372987973 numExpected=9, numLive=5
2019-10-22 15:58:51,514 INFO BlockStateChange: Block: blk_-9223372035600425840_372987973, Expected Replicas: 9, live replicas: 5, corrupt replicas: 0, decommissioned replicas: 0, decommissioning replicas: 4, maintenance replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is Open File: false, Datanodes having this block: 10.255.43.57:50010 10.255.53.12:50010 10.255.63.12:50010 10.255.62.39:50010 10.255.37.36:50010 10.255.33.15:50010 10.255.69.29:50010 10.255.51.13:50010 10.255.64.15:50010 , Current Datanode: 10.255.69.29:50010, Is current datanode decommissioning: true, Is current datanode entering maintenance: false
2019-10-22 15:58:51,514 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Node 10.255.69.29:50010 still has 1 blocks to replicate before it is a candidate to finish Decommission In Progress
{quote}
After digging the source code and cluster log, guess it happens as follow steps.
# Storage strategy is RS-6-3-1024k.
# EC block b consists of b0, b1, b2, b3, b4, b5, b6, b7, b8. b0 is from datanode dn0, b1 is from datanode dn1, ...etc
# At the beginning dn0 is in decommission progress, b0 is replicated successfully, and dn0 is still in decommission progress.
# Later b1, b2, b3 in decommission progress, and dn4 containing b4 is out of service, so need to reconstruct, and create ErasureCodingWork to do it, in the ErasureCodingWork, additionalReplRequired is 4
# Because hasAllInternalBlocks is false, Will call ErasureCodingWork#addTaskToDatanode -> DatanodeDescriptor#addBlockToBeErasureCoded, and send BlockECReconstructionInfo task to Datanode
# DataNode can not reconstruction the block because targets is 4, greater than 3( parity number).

There is a problem as follow, from BlockManager.java#scheduleReconstruction
{code}
      // should reconstruct all the internal blocks before scheduling
      // replication task for decommissioning node(s).
      if (additionalReplRequired - numReplicas.decommissioning() -
          numReplicas.liveEnteringMaintenanceReplicas() > 0) {
        additionalReplRequired = additionalReplRequired -
            numReplicas.decommissioning() -
            numReplicas.liveEnteringMaintenanceReplicas();
      }
{code}
Should reconstruction firstly and then replicate for decommissioning. Because numReplicas.decommissioning() is 4, and additionalReplRequired is 4, that's wrong,
numReplicas.decommissioning() should be 3, it should exclude live replica.
If so, additionalReplRequired will be 1, reconstruction will schedule as expected. After that, decommission goes on.

  was:
Decommission test hangs in our clusters.
Have seen the messages as follow
{quote}
2019-10-22 15:58:51,514 TRACE org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Block blk_-9223372035600425840_372987973 numExpected=9, numLive=5
2019-10-22 15:58:51,514 INFO BlockStateChange: Block: blk_-9223372035600425840_372987973, Expected Replicas: 9, live replicas: 5, corrupt replicas: 0, decommissioned replicas: 0, decommissioning replicas: 4, maintenance replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is Open File: false, Datanodes having this block: 10.255.43.57:50010 10.255.53.12:50010 10.255.63.12:50010 10.255.62.39:50010 10.255.37.36:50010 10.255.33.15:50010 10.255.69.29:50010 10.255.51.13:50010 10.255.64.15:50010 , Current Datanode: 10.255.69.29:50010, Is current datanode decommissioning: true, Is current datanode entering maintenance: false
2019-10-22 15:58:51,514 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Node 10.255.69.29:50010 still has 1 blocks to replicate before it is a candidate to finish Decommission In Progress
{quote}
After digging the source code and cluster log, guess it happens as follow steps.
# Storage strategy is RS-6-3-1024k.
# EC block b consists of b0, b1, b2, b3, b4, b5, b6, b7, b8, b0 is from datanode dn0, b1 is from datanode dn1, ...etc
# At the beginning dn0 is in decommission progress, b0 is replicated successfully, and dn0 is staill in decommission progress.
# Later b1, b2, b3 in decommission progress, and dn4 containing b4 is out of service, so need to reconstruct, and create ErasureCodingWork to do it, in the ErasureCodingWork, additionalReplRequired is 4
# Because hasAllInternalBlocks is false, Will call ErasureCodingWork#addTaskToDatanode -> DatanodeDescriptor#addBlockToBeErasureCoded, and send BlockECReconstructionInfo task to Datanode
# DataNode can not reconstruction
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ayush Saxena updated HDFS-14920:
--------------------------------
    Fix Version/s: 3.2.2
                   3.1.4
                   3.3.0
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

Committed to trunk. Thanx [~ferhui] for the contribution!!!
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fei Hui updated HDFS-14920:
---------------------------
    Attachment: HDFS-14920.005.patch
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fei Hui updated HDFS-14920:
---------------------------
    Attachment: HDFS-14920.004.patch
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fei Hui updated HDFS-14920:
---------------------------
    Attachment: HDFS-14920.003.patch
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-14920: --- Attachment: HDFS-14920.002.patch > Erasure Coding: Decommission may hang If one or more datanodes are out of > service during decommission > --- > > Key: HDFS-14920 > URL: https://issues.apache.org/jira/browse/HDFS-14920 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Affects Versions: 3.0.3, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-14920.001.patch, HDFS-14920.002.patch
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fei Hui updated HDFS-14920:
---
Description:
Decommission test hangs in our clusters.
We have seen the following messages:
{quote}
2019-10-22 15:58:51,514 TRACE org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Block blk_-9223372035600425840_372987973 numExpected=9, numLive=5
2019-10-22 15:58:51,514 INFO BlockStateChange: Block: blk_-9223372035600425840_372987973, Expected Replicas: 9, live replicas: 5, corrupt replicas: 0, decommissioned replicas: 0, decommissioning replicas: 4, maintenance replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is Open File: false, Datanodes having this block: 10.255.43.57:50010 10.255.53.12:50010 10.255.63.12:50010 10.255.62.39:50010 10.255.37.36:50010 10.255.33.15:50010 10.255.69.29:50010 10.255.51.13:50010 10.255.64.15:50010 , Current Datanode: 10.255.69.29:50010, Is current datanode decommissioning: true, Is current datanode entering maintenance: false
2019-10-22 15:58:51,514 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Node 10.255.69.29:50010 still has 1 blocks to replicate before it is a candidate to finish Decommission In Progress
{quote}
After digging through the source code and the cluster log, we guess it happens in the following steps.
# The storage strategy is RS-6-3-1024k.
# EC block b consists of b0, b1, b2, b3, b4, b5, b6, b7, b8; b0 is on datanode dn0, b1 is on datanode dn1, and so on.
# At the beginning dn0 is in decommission progress; b0 is replicated successfully, and dn0 is still in decommission progress.
# Later the datanodes holding b1, b2, b3 enter decommission progress, and dn4, which holds b4, goes out of service. The block therefore needs reconstruction, and an ErasureCodingWork is created to do it. In the ErasureCodingWork, additionalReplRequired is 4.
# Because hasAllInternalBlocks is false, ErasureCodingWork#addTaskToDatanode -> DatanodeDescriptor#addBlockToBeErasureCoded is called, and a BlockECReconstructionInfo task is sent to the DataNode.
# The DataNode cannot reconstruct the block because the number of targets is 4, greater than 3 (the number of parity blocks).

The problem is in the following code, from BlockManager.java#scheduleReconstruction:
{code}
// should reconstruct all the internal blocks before scheduling
// replication task for decommissioning node(s).
if (additionalReplRequired - numReplicas.decommissioning() -
    numReplicas.liveEnteringMaintenanceReplicas() > 0) {
  additionalReplRequired = additionalReplRequired -
      numReplicas.decommissioning() -
      numReplicas.liveEnteringMaintenanceReplicas();
}
{code}
Reconstruction should be done first, and replication for decommissioning afterwards. Here numReplicas.decommissioning() is 4 and additionalReplRequired is 4, so the difference is 0, the condition is false, and additionalReplRequired stays at 4. That is wrong.
numReplicas.decommissioning() should be 3; it should exclude the replica that is already live. If so, additionalReplRequired will be 1 and reconstruction will be scheduled as expected. After that, decommission goes on.
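The arithmetic in the description can be checked with a small standalone sketch. It is not Hadoop code: the class `ReconstructionCountSketch`, the method `adjust`, the helper `canReconstruct`, and the constant `PARITY_BLOCKS` are hypothetical names introduced here, and the method only mirrors the quoted condition from scheduleReconstruction, assuming the counts given in the report.

```java
public class ReconstructionCountSketch {
    // RS-6-3 has 3 parity blocks; per the report, the DataNode cannot
    // reconstruct when the number of targets exceeds this value.
    public static final int PARITY_BLOCKS = 3;

    // Mirrors the quoted condition: subtract the decommissioning and
    // live-entering-maintenance counts only if the result stays positive.
    public static int adjust(int additionalReplRequired,
                             int decommissioning,
                             int liveEnteringMaintenance) {
        int remaining = additionalReplRequired - decommissioning - liveEnteringMaintenance;
        return remaining > 0 ? remaining : additionalReplRequired;
    }

    // Hypothetical helper: true if a reconstruction task with this many
    // targets fits within the parity count.
    public static boolean canReconstruct(int targets) {
        return targets <= PARITY_BLOCKS;
    }

    public static void main(String[] args) {
        // Counts from the report: additionalReplRequired = 4, decommissioning = 4.
        int reported = adjust(4, 4, 0);
        // Counts the report argues for: decommissioning = 3.
        int proposed = adjust(4, 3, 0);
        System.out.println("reported targets=" + reported
            + " reconstructable=" + canReconstruct(reported));
        System.out.println("proposed targets=" + proposed
            + " reconstructable=" + canReconstruct(proposed));
    }
}
```

With the reported counts the value stays at 4, which exceeds the parity count; with a decommissioning count of 3 it drops to 1.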
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-14920: --- Status: Patch Available (was: Open) > Erasure Coding: Decommission may hang If one or more datanodes are out of > service during decommission > --- > > Key: HDFS-14920 > URL: https://issues.apache.org/jira/browse/HDFS-14920 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Affects Versions: 3.1.3, 3.2.1, 3.0.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-14920.001.patch
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-14920: --- Attachment: HDFS-14920.001.patch > Erasure Coding: Decommission may hang If one or more datanodes are out of > service during decommission > --- > > Key: HDFS-14920 > URL: https://issues.apache.org/jira/browse/HDFS-14920 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Affects Versions: 3.0.3, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-14920.001.patch
[jira] [Updated] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission
[ https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fei Hui updated HDFS-14920:
---
Description:
Decommission test hangs in our clusters.
We have seen the following messages:
{quote}
2019-10-22 15:58:51,514 TRACE org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Block blk_-9223372035600425840_372987973 numExpected=9, numLive=5
2019-10-22 15:58:51,514 INFO BlockStateChange: Block: blk_-9223372035600425840_372987973, Expected Replicas: 9, live replicas: 5, corrupt replicas: 0, decommissioned replicas: 0, decommissioning replicas: 4, maintenance replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is Open File: false, Datanodes having this block: 10.255.43.57:50010 10.255.53.12:50010 10.255.63.12:50010 10.255.62.39:50010 10.255.37.36:50010 10.255.33.15:50010 10.255.69.29:50010 10.255.51.13:50010 10.255.64.15:50010 , Current Datanode: 10.255.69.29:50010, Is current datanode decommissioning: true, Is current datanode entering maintenance: false
2019-10-22 15:58:51,514 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Node 10.255.69.29:50010 still has 1 blocks to replicate before it is a candidate to finish Decommission In Progress
{quote}
After digging through the source code and the cluster log, we guess it happens in the following steps.
# The storage strategy is RS-6-3-1024k.
# EC block b consists of b0, b1, b2, b3, b4, b5, b6, b7, b8; b0 is on datanode dn0, b1 is on datanode dn1, and so on.
# At the beginning dn0 is in decommission progress; b0 is replicated successfully, and dn0 is still in decommission progress.
# Later the datanodes holding b1, b2, b3 enter decommission progress, and dn4, which holds b4, goes out of service. The block therefore needs reconstruction, and an ErasureCodingWork is created to do it. In the ErasureCodingWork, additionalReplRequired is 4.
# Because hasAllInternalBlocks is false, ErasureCodingWork#addTaskToDatanode -> DatanodeDescriptor#addBlockToBeErasureCoded is called, and a BlockECReconstructionInfo task is sent to the DataNode.
# The DataNode cannot reconstruct the block because the number of targets is 4, greater than 3 (the number of parity blocks).

The problem is in the following code, from BlockManager.java#scheduleReconstruction:
{code}
// should reconstruct all the internal blocks before scheduling
// replication task for decommissioning node(s).
if (additionalReplRequired - numReplicas.decommissioning() -
    numReplicas.liveEnteringMaintenanceReplicas() > 0) {
  additionalReplRequired = additionalReplRequired -
      numReplicas.decommissioning() -
      numReplicas.liveEnteringMaintenanceReplicas();
}
{code}
Reconstruction should be done first, and replication for decommissioning afterwards. Here numReplicas.decommissioning() is 4 and additionalReplRequired is 4, so the difference is 0, the condition is false, and additionalReplRequired stays at 4. That is wrong.
numReplicas.decommissioning() should be 3; it should exclude the replica that is already live. If so, additionalReplRequired will be 1 and reconstruction will be scheduled as expected. After that, decommission goes on.
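The counts in the scenario (5 live, 4 decommissioning, 3 once the already-replicated b0 is excluded) can be reproduced with a toy model of the block group. This is only an illustration of the counting argument, not the attached patch: the class `DecommissionCountSketch`, its `Replica` type, and all method names are hypothetical, and the replica layout is an assumption read off the steps in the description.

```java
import java.util.BitSet;

public class DecommissionCountSketch {
    public enum State { LIVE, DECOMMISSIONING }

    // One stored copy of an internal block: its index in the group
    // and the state of the datanode that holds it.
    public static final class Replica {
        final int blockIndex;
        final State state;
        public Replica(int blockIndex, State state) {
            this.blockIndex = blockIndex;
            this.state = state;
        }
    }

    public static int count(Replica[] replicas, State state) {
        int n = 0;
        for (Replica r : replicas) {
            if (r.state == state) n++;
        }
        return n;
    }

    // Counts decommissioning replicas whose block index has no live copy yet.
    public static int decommissioningWithoutLiveCopy(Replica[] replicas) {
        BitSet liveIndices = new BitSet();
        for (Replica r : replicas) {
            if (r.state == State.LIVE) liveIndices.set(r.blockIndex);
        }
        int n = 0;
        for (Replica r : replicas) {
            if (r.state == State.DECOMMISSIONING && !liveIndices.get(r.blockIndex)) n++;
        }
        return n;
    }

    // Assumed layout: b0 on decommissioning dn0 plus a live copy elsewhere,
    // b1..b3 on decommissioning nodes, b4 missing, b5..b8 live.
    public static Replica[] scenario() {
        return new Replica[] {
            new Replica(0, State.DECOMMISSIONING), new Replica(0, State.LIVE),
            new Replica(1, State.DECOMMISSIONING), new Replica(2, State.DECOMMISSIONING),
            new Replica(3, State.DECOMMISSIONING),
            new Replica(5, State.LIVE), new Replica(6, State.LIVE),
            new Replica(7, State.LIVE), new Replica(8, State.LIVE)
        };
    }

    public static void main(String[] args) {
        Replica[] replicas = scenario();
        int expected = 9; // 6 data + 3 parity internal blocks
        int live = count(replicas, State.LIVE);
        System.out.println("live=" + live
            + " decommissioning=" + count(replicas, State.DECOMMISSIONING)
            + " decommissioningWithoutLiveCopy=" + decommissioningWithoutLiveCopy(replicas)
            + " additionalReplRequired=" + (expected - live));
    }
}
```

Under this model the live and decommissioning counts match the logged values, and only b4 remains to be reconstructed once the three decommissioning replicas without a live copy are subtracted.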