[
https://issues.apache.org/jira/browse/HDFS-15937?focusedWorklogId=574582&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-574582
]
ASF GitHub Bot logged work on HDFS-15937:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 31/Mar/21 04:52
Start Date: 31/Mar/21 04:52
Worklog Time Spent: 10m
Work Description: Hexiaoqiao commented on a change in pull request #2838:
URL: https://github.com/apache/hadoop/pull/2838#discussion_r604588150
##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
##########
@@ -1071,12 +1071,26 @@ private static void linkAllBlocks(File fromDir, File fromBbwDir, File toDir,
}
private static class LinkArgs {
- File src;
- File dst;
+ private File srcDir;
+ private File dstDir;
+ private String blockFile;
+
+ LinkArgs(File srcDir, File dstDir, String blockFile) {
+ this.srcDir = srcDir;
+ this.dstDir = dstDir;
+ this.blockFile = blockFile;
+ }
+
+ public File src() {
Review comment:
Would `getSrc` be more graceful? The same applies to the methods `dst(...)`
and `blockFile()`.
##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
##########
@@ -1161,10 +1176,12 @@ public Void call() throws IOException {
*/
@Override
public int compare(LinkArgs a, LinkArgs b) {
+ File asrc = a.src();
+ File bsrc = b.src();
return ComparisonChain.start().
- compare(a.src.getName(), b.src.getName()).
- compare(a.src, b.src).
- compare(a.dst, b.dst).
+ compare(asrc.getName(), bsrc.getName()).
+ compare(asrc, bsrc).
Review comment:
Is the comparison duplicated between `compare(asrc.getName(), bsrc.getName())`
and `compare(asrc, bsrc)`?
For UnixFileSystem, the implementation is as follows. Not sure if it is the
same for other FileSystem instances.
> public int compare(File f1, File f2) {
> return f1.getPath().compareTo(f2.getPath());
> }
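As a quick check of this point, here is a minimal standalone sketch (not Hadoop code; the paths are made up) showing that `java.io.File` ordering reduces to comparing path strings, so the full-path compare only matters as a tie-breaker when the names are equal but the parent directories differ:

```java
import java.io.File;

public class FileCompareDemo {
    public static void main(String[] args) {
        // Two blocks with the same file name under different parent dirs.
        File a = new File("/data01/subdir0/blk_1073741825");
        File b = new File("/data02/subdir0/blk_1073741825");

        // Comparing names alone ties...
        System.out.println(a.getName().compareTo(b.getName())); // 0

        // ...while File.compareTo falls back to the full path strings,
        // so it still distinguishes the two entries.
        System.out.println(a.compareTo(b) < 0); // true
    }
}
```

So the name comparison is indeed redundant whenever the full-path comparison would give the same order, but it is not a pure duplicate: it changes the primary sort key from the whole path to the block file name.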
##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
##########
@@ -1345,8 +1363,18 @@ public boolean accept(File dir, String name) {
throw new IOException("Failed to mkdirs " + blockLocation);
}
}
-    idBasedLayoutSingleLinks.add(new LinkArgs(new File(from, blockName),
-        new File(blockLocation, blockName)));
+    /**
+     * The destination path is 32x32, so 1024 distinct paths. Therefore
+     * we cache the destination path and reuse the same File object on
+     * potentially thousands of blocks located on this volume.
+     * This method is called recursively so the cache is passed through
+     * each recursive call. There is one cache per volume, and it is only
+     * accessed by a single thread so no locking is needed.
+     */
+    File cachedDest = pathCache
+        .computeIfAbsent(blockLocation, k -> blockLocation);
+ idBasedLayoutSingleLinks.add(new LinkArgs(from,
Review comment:
Here it reuses the prefix path of the blocks so that no new `File` instances
are created, reducing the memory used, right?
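As a standalone sketch of the caching pattern in question (simplified paths; not the actual Hadoop code, which keys the map on `blockLocation`), `computeIfAbsent` hands back the already-cached `File` on every later lookup, so at most 1024 destination `File` objects exist per volume:

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class PathCacheDemo {
    public static void main(String[] args) {
        // One cache per volume, accessed by a single thread.
        Map<File, File> pathCache = new HashMap<>();

        // Two lookups with equal (but distinct) File keys, as happens
        // when many blocks map to the same destination subdir.
        File first = new File("/data01/current/finalized/subdir0/subdir1");
        File second = new File("/data01/current/finalized/subdir0/subdir1");

        File c1 = pathCache.computeIfAbsent(first, k -> first);
        File c2 = pathCache.computeIfAbsent(second, k -> second);

        // The second lookup returns the cached instance, not `second`.
        System.out.println(c1 == c2);          // true
        System.out.println(pathCache.size());  // 1
    }
}
```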
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 574582)
Time Spent: 40m (was: 0.5h)
> Reduce memory used during datanode layout upgrade
> -------------------------------------------------
>
> Key: HDFS-15937
> URL: https://issues.apache.org/jira/browse/HDFS-15937
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.3.0, 3.1.4, 3.2.2, 3.4.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Labels: pull-request-available
> Attachments: heap-dump-after.png, heap-dump-before.png
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> When the datanode block layout is upgraded from -56 (256x256) to -57 (32x32),
> we have found the datanode uses a lot more memory than usual.
> For each volume, the blocks are scanned and a list is created holding a
> series of LinkArgs objects. Each LinkArgs contains a File object for the block
> source and destination. The File object stores the path as a string, eg:
> /data01/dfs/dn/current/BP-586623041-127.0.0.1-1617017575175/current/finalized/subdir0/subdir0/blk_1073741825_1001.meta
> /data01/dfs/dn/current/BP-586623041-127.0.0.1-1617017575175/current/finalized/subdir0/subdir0/blk_1073741825
> These strings are repeated for every block and meta file on the DN, and much
> of each string is the same every time, leading to a large amount of memory being used.
> If we change the LinkArgs to store:
> * Src Path without the block, eg
> /data01/dfs/dn/previous.tmp/BP-586623041-127.0.0.1-1617017575175/current/finalized/subdir0/subdir0
> * Dest Path without the block eg
> /data01/dfs/dn/current/BP-586623041-127.0.0.1-1617017575175/current/finalized/subdir0/subdir10
> * Block / Meta file name, eg blk_12345678_1001 or blk_12345678_1001.meta
> Then, by ensuring we reuse the same File object for repeated src and dest paths,
> we can save most of the memory without reworking the logic of the code.
> The current logic works along the source paths recursively, so you can easily
> reuse the src path object.
> For the destination path, there are only 32x32 (1024) distinct paths, so we
> can simply cache them in a HashMap and look up the reusable object each time.
> I tested locally by generating 100k block files and attempting the layout
> upgrade. A heap dump showed the 100k blocks using about 140MB of heap. That
> is close to 1.5GB per 1M blocks.
> After the change outlined above, the same 100k blocks used about 20MB of heap,
> or about 200MB per million blocks.
> A general DN sizing recommendation is 1GB of heap per 1M blocks, so the
> upgrade should be able to happen within the pre-upgrade heap.
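The restructuring described in the issue can be sketched as follows. This is a simplified standalone reconstruction from the description and the diff hunks, not the exact patch; field and method names follow the hunks where visible and are otherwise illustrative:

```java
import java.io.File;

public class LinkArgsSketch {

    // New shape: shared directory File objects plus only the short block
    // file name, so each long directory prefix is stored once and reused
    // across all blocks in that directory.
    static class LinkArgs {
        private final File srcDir;
        private final File dstDir;
        private final String blockFile;

        LinkArgs(File srcDir, File dstDir, String blockFile) {
            this.srcDir = srcDir;
            this.dstDir = dstDir;
            this.blockFile = blockFile;
        }

        // Full paths are materialized only when actually needed.
        File src() { return new File(srcDir, blockFile); }
        File dst() { return new File(dstDir, blockFile); }
    }

    public static void main(String[] args) {
        // Shortened example paths, not real DN layout paths.
        File srcDir = new File(
            "/data01/dfs/dn/previous.tmp/current/finalized/subdir0/subdir0");
        File dstDir = new File(
            "/data01/dfs/dn/current/finalized/subdir0/subdir10");

        // Many blocks in the same directories share the two File objects
        // above; only the short name string differs per block.
        LinkArgs la = new LinkArgs(srcDir, dstDir, "blk_12345678_1001");
        System.out.println(la.src().getName()); // blk_12345678_1001
    }
}
```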
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]