LeonGao91 commented on a change in pull request #2288:
URL: https://github.com/apache/hadoop/pull/2288#discussion_r494599382
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##########
@@ -190,6 +193,26 @@
}
this.conf = conf;
this.fileIoProvider = fileIoProvider;
+ this.enableSameDiskArchival =
+ conf.getBoolean(DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING,
+ DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING_DEFAULT);
+ if (enableSameDiskArchival) {
+ this.mount = usage.getMount();
+ reservedForArchive = conf.getDouble(
Review comment:
Yeah, it's a good point. The reason I put it this way is to keep the
configuration less verbose for the normal use case where a datanode only has one
type of disk. Otherwise, users would need to tag every disk individually, which
is less readable and easier to get wrong.
I think we can later introduce an additional config for the use case you
mentioned, listing each volume and its target ratio.
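For illustration, a minimal sketch of the non-verbose common-case setup,
assuming a single default ratio per mount. The boolean key is the one read in
the hunk above; the ratio property name below is only a placeholder, since the
real key string is cut off in this hunk:
```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;

public class SameDiskTieringConfigSketch {
  // Sketch only: enable DISK/ARCHIVE tiering on the same mount and give every
  // mount the same default split, instead of tagging each disk individually.
  static Configuration build() {
    Configuration conf = new Configuration();
    conf.setBoolean(DFSConfigKeys.DFS_DATANODE_ALLOW_SAME_DISK_TIERING, true);
    // Placeholder property name: the actual ratio key read by conf.getDouble(...)
    // in the constructor above is not visible in this hunk.
    conf.setDouble("dfs.datanode.reserved-for-archive.ratio", 0.75);
    return conf;
  }
}
```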
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##########
@@ -412,16 +435,28 @@ long getBlockPoolUsed(String bpid) throws IOException {
*/
@VisibleForTesting
public long getCapacity() {
+ long capacity;
if (configuredCapacity < 0L) {
long remaining;
if (cachedCapacity > 0L) {
remaining = cachedCapacity - getReserved();
} else {
remaining = usage.getCapacity() - getReserved();
}
- return Math.max(remaining, 0L);
+ capacity = Math.max(remaining, 0L);
+ } else {
+ capacity = configuredCapacity;
+ }
+
+ if (enableSameDiskArchival) {
Review comment:
This is actually the important part of enabling this feature: it allows
users to configure the capacity of an fsVolume.
Since we are configuring two fsVolumes on the same underlying filesystem, if we
do nothing the capacity will be counted twice and all the reported stats will be
incorrect.
Here is an example:
Let's say we want to configure `[DISK]/data01/dfs` and
`[ARCHIVE]/data01/dfs_archive` on a 4 TB disk mounted at `/data01`, assigning
1 TB to `[DISK]/data01/dfs` and 3 TB to `[ARCHIVE]/data01/dfs_archive`. We can
set `reservedForArchive` to 0.75 and put both directories in the volume list.
In this case, `/data01/dfs` will be reported to HDFS as a 1 TB volume and
`/data01/dfs_archive` as a 3 TB volume. Logically, HDFS will just treat them as
two separate volumes.
Without the change here, HDFS would see two volumes of 4 TB each, so the 4 TB
disk would be counted as 4 * 2 = 8 TB of capacity on the namenode and all the
related stats would be wrong.
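As a rough sketch of the math in this example (illustrative only, not the
patch code; the actual change applies the ratio inside `getCapacity()` in the
hunk above):
```java
/** Illustrative sketch of the capacity split in the example above; not the patch code. */
public class CapacitySplitSketch {
  // The ARCHIVE volume reports reservedForArchive of the mount capacity,
  // and the DISK volume reports the remainder.
  static long reportedCapacity(long mountCapacity, double reservedForArchive,
                               boolean isArchiveVolume) {
    double ratio = isArchiveVolume ? reservedForArchive : 1.0 - reservedForArchive;
    return (long) (mountCapacity * ratio);
  }

  public static void main(String[] args) {
    long fourTb = 4L * 1024 * 1024 * 1024 * 1024;
    // With reservedForArchive = 0.75 on a 4 TB mount:
    System.out.println(reportedCapacity(fourTb, 0.75, false)); // [DISK]    ~1 TB
    System.out.println(reportedCapacity(fourTb, 0.75, true));  // [ARCHIVE] ~3 TB
  }
}
```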
Another change we need to make is to `getActualNonDfsUsed()`, shown below.
Say that in the 4 TB setup above we reserve 0.1 TB, and
`[ARCHIVE]/data01/dfs_archive` already has 2 TB used. When we then calculate
`getActualNonDfsUsed()` for `[DISK]/data01/dfs`, it will always return 0, which
is not correct and will cause other weird issues. Since the two fsVolumes are on
the same filesystem, the reserved space should be shared.
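To make the idea concrete, here is a rough sketch of the shared accounting
(illustrative only; the parameter names are placeholders, not the patch's
fields):
```java
/** Rough sketch of shared-mount non-DFS accounting; not the patch code. */
public class SharedMountNonDfsUsedSketch {
  // mountUsed:      bytes used on the whole mount (e.g. DF output for /data01)
  // diskDfsUsed:    DFS bytes stored by the [DISK] volume on this mount
  // archiveDfsUsed: DFS bytes stored by the [ARCHIVE] volume on this mount
  static long actualNonDfsUsed(long mountUsed, long diskDfsUsed, long archiveDfsUsed) {
    // Exclude the DFS usage of *both* logical volumes, so the 2 TB already written
    // to [ARCHIVE]/data01/dfs_archive is not treated as non-DFS data when the
    // [DISK]/data01/dfs volume runs this calculation.
    return Math.max(0L, mountUsed - diskDfsUsed - archiveDfsUsed);
  }
}
```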
According to our analysis and cluster testing results, updating these two
functions, `getCapacity()` and `getActualNonDfsUsed()`, is enough to keep the
stats correct for the two "logical" fsVolumes on the same disk.
I can update the Javadoc to describe this behavior when the feature is turned on.
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
##########
@@ -452,7 +487,33 @@ public long getAvailable() throws IOException {
}
long getActualNonDfsUsed() throws IOException {
- return usage.getUsed() - getDfsUsed();
+ // DISK and ARCHIVAL on same disk
Review comment:
I added a comment with an example use case above; hopefully it explains this
well :)
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
##########
@@ -62,9 +64,14 @@
private final VolumeChoosingPolicy<FsVolumeImpl> blockChooser;
private final BlockScanner blockScanner;
+ private boolean enableSameDiskTiering;
Review comment:
Good catch! I will update them to use the same name.