LeonGao91 commented on a change in pull request #2288:
URL: https://github.com/apache/hadoop/pull/2288#discussion_r494599382
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##########
@@ -190,6 +193,26 @@
}
this.conf = conf;
this.fileIoProvider = fileIoProvider;
+ this.enableSameDiskArchival =
+ conf.getBoolean(DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING,
+ DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING_DEFAULT);
+ if (enableSameDiskArchival) {
+ this.mount = usage.getMount();
+ reservedForArchive = conf.getDouble(
Review comment:
Yeah, it's a good point. The reason I put it this way is to keep the
configuration less verbose for the normal use case where a datanode only has one
type of disk. Otherwise, users would need to tag every disk individually, which
is less readable and easier to get wrong.
I think we can later introduce an additional config for the use case you
mentioned, listing each volume and its target ratio.
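For illustration, a minimal sketch of the non-verbose common-case setup,
assuming a single default ratio per mount. The boolean key is the one read in
the hunk above; the ratio property name below is only a placeholder, since the
real key string is cut off in this hunk:
```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;

public class SameDiskTieringConfigSketch {
  // Sketch only: enable DISK/ARCHIVE tiering on the same mount and give every
  // mount the same default split, instead of tagging each disk individually.
  static Configuration build() {
    Configuration conf = new Configuration();
    conf.setBoolean(DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING, true);
    // Placeholder property name: the actual ratio key read by conf.getDouble(...)
    // in the constructor above is not visible in this hunk.
    conf.setDouble("dfs.datanode.reserved-for-archive.ratio", 0.75);
    return conf;
  }
}
```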
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##########
@@ -412,16 +435,28 @@ long getBlockPoolUsed(String bpid) throws IOException {
*/
@VisibleForTesting
public long getCapacity() {
+ long capacity;
if (configuredCapacity < 0L) {
long remaining;
if (cachedCapacity > 0L) {
remaining = cachedCapacity - getReserved();
} else {
remaining = usage.getCapacity() - getReserved();
}
- return Math.max(remaining, 0L);
+ capacity = Math.max(remaining, 0L);
+ } else {
+ capacity = configuredCapacity;
+ }
+
+ if (enableSameDiskArchival) {
Review comment:
This is actually the important part of enabling this feature: it allows
users to configure the capacity of an fsVolume.
Since we are configuring two fsVolumes on the same underlying filesystem, if we
do nothing the capacity will be counted twice and all the reported stats will be
incorrect.
Here is an example:
Let's say we want to configure `[DISK]/data01/dfs` and
`[ARCHIVE]/data01/dfs_archive` on a 4 TB disk mounted at `/data01`, assigning
1 TB to `[DISK]/data01/dfs` and 3 TB to `[ARCHIVE]/data01/dfs_archive`. We can
set `reservedForArchive` to 0.75 and put both directories in the volume list.
In this case, `/data01/dfs` will be reported to HDFS as a 1 TB volume and
`/data01/dfs_archive` as a 3 TB volume. Logically, HDFS will just treat them as
two separate volumes.
Without the change here, HDFS would see two volumes of 4 TB each, so the 4 TB
disk would be counted as 4 * 2 = 8 TB of capacity on the namenode and all the
related stats would be wrong.
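As a rough sketch of the math in this example (illustrative only, not the
patch code; the actual change applies the ratio inside `getCapacity()` in the
hunk above):
```java
/** Illustrative sketch of the capacity split in the example above; not the patch code. */
public class CapacitySplitSketch {
  // The ARCHIVE volume reports reservedForArchive of the mount capacity,
  // and the DISK volume reports the remainder.
  static long reportedCapacity(long mountCapacity, double reservedForArchive,
                               boolean isArchiveVolume) {
    double ratio = isArchiveVolume ? reservedForArchive : 1.0 - reservedForArchive;
    return (long) (mountCapacity * ratio);
  }

  public static void main(String[] args) {
    long fourTb = 4L * 1024 * 1024 * 1024 * 1024;
    // With reservedForArchive = 0.75 on a 4 TB mount:
    System.out.println(reportedCapacity(fourTb, 0.75, false)); // [DISK]    ~1 TB
    System.out.println(reportedCapacity(fourTb, 0.75, true));  // [ARCHIVE] ~3 TB
  }
}
```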
Another change we need to make is to `getActualNonDfsUsed()`, shown below.
Say that in the 4 TB setup above we reserve 0.1 TB, and
`[ARCHIVE]/data01/dfs_archive` already has 2 TB used. When we then calculate
`getActualNonDfsUsed()` for `[DISK]/data01/dfs`, it will always return 0, which
is not correct and will cause other weird issues. Since the two fsVolumes are on
the same filesystem, the reserved space should be shared.
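To make the idea concrete, here is a rough sketch of the shared accounting
(illustrative only; the parameter names are placeholders, not the patch's
fields):
```java
/** Rough sketch of shared-mount non-DFS accounting; not the patch code. */
public class SharedMountNonDfsUsedSketch {
  // mountUsed:      bytes used on the whole mount (e.g. DF output for /data01)
  // diskDfsUsed:    DFS bytes stored by the [DISK] volume on this mount
  // archiveDfsUsed: DFS bytes stored by the [ARCHIVE] volume on this mount
  static long actualNonDfsUsed(long mountUsed, long diskDfsUsed, long archiveDfsUsed) {
    // Exclude the DFS usage of *both* logical volumes, so the 2 TB already written
    // to [ARCHIVE]/data01/dfs_archive is not treated as non-DFS data when the
    // [DISK]/data01/dfs volume runs this calculation.
    return Math.max(0L, mountUsed - diskDfsUsed - archiveDfsUsed);
  }
}
```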
According to our analysis and cluster testing results, updating these two
functions, `getCapacity()` and `getActualNonDfsUsed()`, is enough to keep the
stats correct for the two "logical" fsVolumes on the same disk.
I can update the Javadoc to describe this behavior when the feature is turned on.
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##########
@@ -452,7 +487,33 @@ public long getAvailable() throws IOException {
}
long getActualNonDfsUsed() throws IOException {
- return usage.getUsed() - getDfsUsed();
+ // DISK and ARCHIVAL on same disk
Review comment:
I added a comment with an example use case above; hopefully it explains this
well :)
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
##########
@@ -62,9 +64,14 @@
private final VolumeChoosingPolicy<FsVolumeImpl> blockChooser;
private final BlockScanner blockScanner;
+ private boolean enableSameDiskTiering;
Review comment:
Good catch! I will update them to use the same name.