[ https://issues.apache.org/jira/browse/HDFS-17060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736981#comment-17736981 ]

ASF GitHub Bot commented on HDFS-17060:
---------------------------------------

xinglin commented on code in PR #5778:
URL: https://github.com/apache/hadoop/pull/5778#discussion_r1241492929


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml:
##########
@@ -342,6 +342,14 @@
     </description>
   </property>
 
+  <property>
+    <name>dfs.namenode.choosereplicatodelete.considerLoad</name>
+    <value>false</value>
+    <description>Decide if chooseReplicaToDelete considers the target's load or

Review Comment:
   Please add a note on how the load is determined, i.e., which metric is 
used as the load.
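
For context, the existing considerLoad logic in BlockPlacementPolicyDefault uses 
the datanode's xceiver count (the active transfer threads reported in 
heartbeats) as its load metric, and the test below drives this patch through 
setXceiverCount as well. A minimal sketch of the metric presumably in question 
(the helper name here is hypothetical, for illustration only):

    // Hypothetical helper: the "load" here is presumably the number of
    // active transfer threads (xceivers) the datanode reported in its
    // last heartbeat, exposed via DatanodeDescriptor#getXceiverCount().
    private static int getNodeLoad(DatanodeDescriptor node) {
      return node.getXceiverCount();
    }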



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java:
##########
@@ -1018,6 +1018,75 @@ public void testChooseReplicaToDelete() throws Exception {
     assertEquals(chosen, storages[1]);
   }
 
+  /**
+   * Test that chooseReplicaToDelete selects replicas based on
+   * block locality, load, and free space.
+   */
+  @Test
+  public void testChooseReplicaToDeleteConsiderLoad() throws Exception {
+    ((BlockPlacementPolicyDefault) replicator)
+        .setConsiderLoad2chooseReplicaDeleting(true);
+    List<DatanodeStorageInfo> replicaList = new ArrayList<>();
+    final Map<String, List<DatanodeStorageInfo>> rackMap
+        = new HashMap<String, List<DatanodeStorageInfo>>();
+
+    storages[0].setRemainingForTests(4*1024*1024);
+    dataNodes[0].setRemaining(calculateRemaining(dataNodes[0]));
+    dataNodes[0].setXceiverCount(20);
+    replicaList.add(storages[0]);
+
+    storages[1].setRemainingForTests(3*1024*1024);
+    dataNodes[1].setRemaining(calculateRemaining(dataNodes[1]));
+    dataNodes[1].setXceiverCount(10);
+    replicaList.add(storages[1]);
+
+    storages[2].setRemainingForTests(2*1024*1024);
+    dataNodes[2].setRemaining(calculateRemaining(dataNodes[2]));
+    dataNodes[2].setXceiverCount(15);
+    replicaList.add(storages[2]);
+
+    // storages[5] has the lowest remaining space and dataNodes[5] the
+    // lowest load (xceiver count), so it should be chosen for deletion.
+    storages[5].setRemainingForTests(512 * 1024);
+    dataNodes[5].setRemaining(calculateRemaining(dataNodes[5]));
+    dataNodes[5].setXceiverCount(5);
+    replicaList.add(storages[5]);
+
+    // Refresh the last update time for all the datanodes
+    for (int i = 0; i < dataNodes.length; i++) {
+      DFSTestUtil.resetLastUpdatesWithOffset(dataNodes[i], 0);
+    }
+
+    List<DatanodeStorageInfo> first = new ArrayList<>();
+    List<DatanodeStorageInfo> second = new ArrayList<>();
+    replicator.splitNodesWithRack(replicaList, replicaList, rackMap, first,
+        second);
+    // storages[0] and storages[1] are in the first set as their rack has
+    // two replica nodes, while storages[2] and storages[5] are in the
+    // second set.
+    assertEquals(2, first.size());
+    assertEquals(2, second.size());
+    List<StorageType> excessTypes = new ArrayList<>();
+    {
+      // test returning null
+      excessTypes.add(StorageType.SSD);
+      assertNull(((BlockPlacementPolicyDefault) replicator)
+          .chooseReplicaToDelete(first, second, excessTypes, rackMap));
+    }
+    excessTypes.add(StorageType.DEFAULT);
+    DatanodeStorageInfo chosen = ((BlockPlacementPolicyDefault) replicator)
+        .chooseReplicaToDelete(first, second, excessTypes, rackMap);
+    // Among all storages, storages[5] has the least load.

Review Comment:
   this test case does not seem to properly isolate the load criterion: 
storages[5] also has the least remaining space, so it would be chosen even 
without considering load. Can we change the space for storages[5] to 
5*1024*1024 and see if storages[5] is still picked?
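
Concretely, the suggested tweak could look like this, so that storages[5] has 
the most remaining space among the replicas yet still the lowest load, and only 
a load-based policy would pick it (values follow the comment above):

    // Suggested adjustment: storages[5] gets the MOST remaining space but
    // keeps the lowest xceiver count, so a space-based policy would not
    // pick it while the load-based policy still should.
    storages[5].setRemainingForTests(5 * 1024 * 1024);
    dataNodes[5].setRemaining(calculateRemaining(dataNodes[5]));
    dataNodes[5].setXceiverCount(5);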



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java:
##########
@@ -276,6 +276,10 @@ public class DFSConfigKeys extends CommonConfigurationKeys {
   public static final boolean
       DFS_NAMENODE_REDUNDANCY_CONSIDERLOADBYVOLUME_DEFAULT
       = false;
+  public static final String
+      DFS_NAMENODE_CHOOSEREPLICATODELETE_CONSIDERLOAD_KEY =
+      "dfs.namenode.choosereplicatodelete.considerLoad";

Review Comment:
   nit: could we change to "dfs.namenode.load-based.replica.deletion"?
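
If the rename is adopted, the constant pair would presumably end up looking 
something like this (the names here just mirror the suggested key and are not 
final):

    // Hypothetical constants reflecting the suggested key name.
    public static final String DFS_NAMENODE_LOAD_BASED_REPLICA_DELETION_KEY =
        "dfs.namenode.load-based.replica.deletion";
    public static final boolean
        DFS_NAMENODE_LOAD_BASED_REPLICA_DELETION_DEFAULT = false;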



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java:
##########
@@ -1236,11 +1245,17 @@ public DatanodeStorageInfo chooseReplicaToDelete(
         minSpace = free;
         minSpaceStorage = storage;
       }
+      if (considerLoad2chooseReplicaDeleting && nodeLoad < minLoad) {
+        minLoad = nodeLoad;
+        minLoadStorage = storage;
+      }
     }
 
     final DatanodeStorageInfo storage;
     if (oldestHeartbeatStorage != null) {

Review Comment:
   this `if else` code section seems problematic: after exiting the 
previous for loop, minSpaceStorage will not be null, because minSpace is 
initialized to Long.MAX_VALUE. So `storage = minSpaceStorage;` will always 
be executed, and a later branch that checks minLoadStorage can never be 
reached.
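
One way the selection could be ordered so that the load-based candidate is 
actually reachable is sketched below (a sketch only; the surrounding method is 
just partially quoted above, so the exact branch structure is an assumption):

    // Sketch: prefer the stale-heartbeat candidate, then the load-based
    // candidate when the feature is enabled, and fall back to the
    // least-free-space candidate, which is always non-null after the loop.
    final DatanodeStorageInfo storage;
    if (oldestHeartbeatStorage != null) {
      storage = oldestHeartbeatStorage;
    } else if (considerLoad2chooseReplicaDeleting && minLoadStorage != null) {
      storage = minLoadStorage;
    } else {
      storage = minSpaceStorage;
    }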



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java:
##########
@@ -1018,6 +1018,75 @@ public void testChooseReplicaToDelete() throws Exception {
     assertEquals(chosen, storages[1]);
   }
 
+  /**
+   * Test that chooseReplicaToDelete selects replicas based on
+   * block locality, load, and free space.
+   */
+  @Test
+  public void testChooseReplicaToDeleteConsiderLoad() throws Exception {
+    ((BlockPlacementPolicyDefault) replicator)
+        .setConsiderLoad2chooseReplicaDeleting(true);
+    List<DatanodeStorageInfo> replicaList = new ArrayList<>();
+    final Map<String, List<DatanodeStorageInfo>> rackMap
+        = new HashMap<String, List<DatanodeStorageInfo>>();
+
+    storages[0].setRemainingForTests(4*1024*1024);
+    dataNodes[0].setRemaining(calculateRemaining(dataNodes[0]));
+    dataNodes[0].setXceiverCount(20);
+    replicaList.add(storages[0]);
+
+    storages[1].setRemainingForTests(3*1024*1024);
+    dataNodes[1].setRemaining(calculateRemaining(dataNodes[1]));
+    dataNodes[1].setXceiverCount(10);
+    replicaList.add(storages[1]);
+
+    storages[2].setRemainingForTests(2*1024*1024);
+    dataNodes[2].setRemaining(calculateRemaining(dataNodes[2]));
+    dataNodes[2].setXceiverCount(15);
+    replicaList.add(storages[2]);
+
+    // storages[5] has the lowest remaining space and dataNodes[5] the
+    // lowest load (xceiver count), so it should be chosen for deletion.
+    storages[5].setRemainingForTests(512 * 1024);
+    dataNodes[5].setRemaining(calculateRemaining(dataNodes[5]));
+    dataNodes[5].setXceiverCount(5);
+    replicaList.add(storages[5]);
+
+    // Refresh the last update time for all the datanodes
+    for (int i = 0; i < dataNodes.length; i++) {
+      DFSTestUtil.resetLastUpdatesWithOffset(dataNodes[i], 0);
+    }
+
+    List<DatanodeStorageInfo> first = new ArrayList<>();
+    List<DatanodeStorageInfo> second = new ArrayList<>();
+    replicator.splitNodesWithRack(replicaList, replicaList, rackMap, first,
+        second);
+    // storages[0] and storages[1] are in the first set as their rack has
+    // two replica nodes, while storages[2] and storages[5] are in the
+    // second set.
+    assertEquals(2, first.size());
+    assertEquals(2, second.size());
+    List<StorageType> excessTypes = new ArrayList<>();
+    {
+      // test returning null
+      excessTypes.add(StorageType.SSD);
+      assertNull(((BlockPlacementPolicyDefault) replicator)
+          .chooseReplicaToDelete(first, second, excessTypes, rackMap));
+    }
+    excessTypes.add(StorageType.DEFAULT);
+    DatanodeStorageInfo chosen = ((BlockPlacementPolicyDefault) replicator)
+        .chooseReplicaToDelete(first, second, excessTypes, rackMap);
+    // Among all storages, storages[5] has the least load.
+    assertEquals(chosen, storages[5]);
+
+    replicator.adjustSetsWithChosenReplica(rackMap, first, second, chosen);
+    assertEquals(2, first.size());
+    assertEquals(1, second.size());
+    // Within the first set, storages[1] has the lower load.

Review Comment:
   same here: storage[1] also has less space left than storage [0]. It would 
have been chose, even when we pick replica to delete based on remaining space.
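
As with the earlier case, the values could be flipped so that only a load-based 
choice explains the assertion (illustrative values, not from the patch):

    // Illustrative change: give storages[1] MORE remaining space than
    // storages[0] but a lower xceiver count, so a space-based policy would
    // pick storages[0] while the load-based policy should pick storages[1].
    storages[0].setRemainingForTests(3 * 1024 * 1024);
    dataNodes[0].setRemaining(calculateRemaining(dataNodes[0]));
    dataNodes[0].setXceiverCount(20);
    storages[1].setRemainingForTests(4 * 1024 * 1024);
    dataNodes[1].setRemaining(calculateRemaining(dataNodes[1]));
    dataNodes[1].setXceiverCount(10);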





> BlockPlacementPolicyDefault#chooseReplicaToDelete should consider datanode 
> load
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-17060
>                 URL: https://issues.apache.org/jira/browse/HDFS-17060
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: farmmamba
>            Assignee: farmmamba
>            Priority: Minor
>              Labels: pull-request-available
>
> When choosing extra replicas to delete, we should consider datanode load
> as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
