ayushtkn commented on code in PR #4138:
URL: https://github.com/apache/hadoop/pull/4138#discussion_r846742614
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:
##########
@@ -2163,6 +2163,17 @@ BlockReconstructionWork scheduleReconstruction(BlockInfo block,
       return null;
     }
 
+    // skip if source datanodes for reconstructing ec block are not enough
+    if (block.isStriped()) {
+      BlockInfoStriped stripedBlock = (BlockInfoStriped) block;
+      int cellsNum = (int) ((stripedBlock.getNumBytes() - 1) / stripedBlock.getCellSize() + 1);
+      int minRequiredSources = Math.min(cellsNum, stripedBlock.getDataBlockNum());
+      if (minRequiredSources > srcNodes.length) {
+        LOG.debug("Block {} cannot be reconstructed due to shortage of source datanodes ", block);
+        return null;

Review Comment:
   Should we increment the metrics before returning `null`?
   ```
   NameNode.getNameNodeMetrics().incNumTimesReReplicationNotScheduled();
   ```

##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:
##########
@@ -2163,6 +2163,17 @@ BlockReconstructionWork scheduleReconstruction(BlockInfo block,
       return null;
     }
 
+    // skip if source datanodes for reconstructing ec block are not enough
+    if (block.isStriped()) {
+      BlockInfoStriped stripedBlock = (BlockInfoStriped) block;
+      int cellsNum = (int) ((stripedBlock.getNumBytes() - 1) / stripedBlock.getCellSize() + 1);
+      int minRequiredSources = Math.min(cellsNum, stripedBlock.getDataBlockNum());

Review Comment:
   Is this logic the same as `BlockInfoStriped.getRealDataBlockNum()`? Can we use or extract the logic from there, or do some refactoring so the logic lives in one place? That way, if there is ever an issue in the logic, fixing it in one place fixes all the places.
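   For illustration, the two suggestions above taken together would make the guard look roughly like the sketch below. This is only a sketch: it assumes `getRealDataBlockNum()` returns the same `Math.min(cellsNum, getDataBlockNum())` value as the hand-rolled arithmetic, which should be verified before reusing it here.
   ```java
   // Hypothetical shape of the guard with both review suggestions applied.
   // Assumption: getRealDataBlockNum() == min(cellsNum, dataBlockNum) for this block.
   if (block.isStriped()) {
     BlockInfoStriped stripedBlock = (BlockInfoStriped) block;
     // reuse the existing helper instead of re-deriving the cell arithmetic
     int minRequiredSources = stripedBlock.getRealDataBlockNum();
     if (minRequiredSources > srcNodes.length) {
       LOG.debug("Block {} cannot be reconstructed due to shortage of source datanodes", block);
       // record the skipped scheduling so it shows up in NameNode metrics
       NameNode.getNameNodeMetrics().incNumTimesReReplicationNotScheduled();
       return null;
     }
   }
   ```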
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java:
##########
@@ -852,6 +852,101 @@ public void testChooseSrcDNWithDupECInDecommissioningNode() throws Exception {
         0, numReplicas.redundantInternalBlocks());
   }
 
+  @Test
+  public void testSkipReconstructionWithManyBusyNodes() {
+    long blockId = -9223372036854775776L; // real ec block id
+    // RS-3-2 EC policy
+    ErasureCodingPolicy ecPolicy =
+        SystemErasureCodingPolicies.getPolicies().get(1);
+
+    // striped blockInfo: 3 data blocks + 2 parity blocks
+    Block aBlock = new Block(blockId, ecPolicy.getCellSize() * ecPolicy.getNumDataUnits(), 0);
+    BlockInfoStriped aBlockInfoStriped = new BlockInfoStriped(aBlock, ecPolicy);
+
+    // create 4 storageInfo, which means 1 block is missing
+    DatanodeStorageInfo ds1 = DFSTestUtil.createDatanodeStorageInfo(
+        "storage1", "1.1.1.1", "rack1", "host1");
+    DatanodeStorageInfo ds2 = DFSTestUtil.createDatanodeStorageInfo(
+        "storage2", "2.2.2.2", "rack2", "host2");
+    DatanodeStorageInfo ds3 = DFSTestUtil.createDatanodeStorageInfo(
+        "storage3", "3.3.3.3", "rack3", "host3");
+    DatanodeStorageInfo ds4 = DFSTestUtil.createDatanodeStorageInfo(
+        "storage4", "4.4.4.4", "rack4", "host4");
+
+    // link block with storage
+    aBlockInfoStriped.addStorage(ds1, aBlock);
+    aBlockInfoStriped.addStorage(ds2, new Block(blockId + 1, 0, 0));
+    aBlockInfoStriped.addStorage(ds3, new Block(blockId + 2, 0, 0));
+    aBlockInfoStriped.addStorage(ds4, new Block(blockId + 3, 0, 0));
+
+    addEcBlockToBM(blockId, ecPolicy);
+    aBlockInfoStriped.setBlockCollectionId(mockINodeId);
+
+    // reconstruction should be scheduled
+    BlockReconstructionWork work = bm.scheduleReconstruction(aBlockInfoStriped, 3);
+    assertNotNull(work);
+
+    // simulate the 2 nodes reach maxReplicationStreams
+    for(int i = 0; i < bm.maxReplicationStreams; i++){
+      ds3.getDatanodeDescriptor().incrementPendingReplicationWithoutTargets();
+      ds4.getDatanodeDescriptor().incrementPendingReplicationWithoutTargets();
+    }
+
+    // reconstruction should be skipped since the number of non-busy nodes are not enough
+    work = bm.scheduleReconstruction(aBlockInfoStriped, 3);
+    assertNull(work);
+  }
+
+  @Test
+  public void testSkipReconstructionWithManyBusyNodes2() {
+    long blockId = -9223372036854775776L; // real ec block id
+    // RS-3-2 EC policy
+    ErasureCodingPolicy ecPolicy =
+        SystemErasureCodingPolicies.getPolicies().get(1);
+
+    // striped blockInfo: 2 data blocks + 2 paritys

Review Comment:
   typo `paritys`

##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java:
##########
@@ -852,6 +852,101 @@ public void testChooseSrcDNWithDupECInDecommissioningNode() throws Exception {
+  @Test
+  public void testSkipReconstructionWithManyBusyNodes2() {
+    long blockId = -9223372036854775776L; // real ec block id
+    // RS-3-2 EC policy
+    ErasureCodingPolicy ecPolicy =
+        SystemErasureCodingPolicies.getPolicies().get(1);
+
+    // striped blockInfo: 2 data blocks + 2 paritys
+    Block aBlock = new Block(blockId, ecPolicy.getCellSize() * (ecPolicy.getNumDataUnits() - 1), 0);
+    BlockInfoStriped aBlockInfoStriped = new BlockInfoStriped(aBlock, ecPolicy);

Review Comment:
   nit: Can you use a better variable name? I couldn't decode what `a` stands for; alternatively, drop a comment above explaining it.
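   For instance, names along these lines might read better (purely illustrative; any descriptive names would do):
   ```java
   // Illustrative renames only; the block is one cell short of a full data stripe.
   Block shortStripedBlock = new Block(blockId,
       ecPolicy.getCellSize() * (ecPolicy.getNumDataUnits() - 1), 0);
   BlockInfoStriped shortStripedBlockInfo = new BlockInfoStriped(shortStripedBlock, ecPolicy);
   ```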