[
https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168635#comment-17168635
]
Chengwei Wang commented on HDFS-15493:
--------------------------------------
After carefully reviewing the code that updates the blocks map and name cache,
I found that it is feasible to start these updates when loading of the
INodeSection begins, and to shut down the executors once loading of the
INodeDirectorySection completes. As a result, almost no time is spent waiting
for the executors to terminate.
I have submitted a patch, [^HDFS-15493.004.patch], based on this approach. It
uses two single-thread executors and performs the updates without locking.
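The idea can be illustrated with a minimal, standalone sketch. This is not the
code from the patch: the class, method, and field names below are hypothetical,
and plain lists stand in for the real blocks map and name cache. It only shows
why two single-thread executors can update without a lock: each structure is
touched by exactly one worker thread, and the loader reads the results only
after both executors have terminated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from the patch.
class ParallelCacheUpdateSketch {
    // Stand-ins for the blocks map and name cache. Each list is modified
    // by exactly one worker thread, so no lock is taken around the updates.
    private final List<Long> blocksMap = new ArrayList<>();
    private final List<String> nameCache = new ArrayList<>();

    private ExecutorService blocksMapUpdater;
    private ExecutorService nameCacheUpdater;

    // Called when loading of the inode section starts.
    void startUpdaters() {
        blocksMapUpdater = Executors.newSingleThreadExecutor();
        nameCacheUpdater = Executors.newSingleThreadExecutor();
    }

    // Called by the loader for each inode file; the loader thread only
    // enqueues work and continues, it never touches the two lists itself.
    void onInodeFileLoaded(String name, long blockId) {
        blocksMapUpdater.execute(() -> blocksMap.add(blockId));
        nameCacheUpdater.execute(() -> nameCache.add(name));
    }

    // Called when loading of the directory section completes. Returns the
    // time in milliseconds spent waiting for the queued updates to drain.
    long shutdownAndWait() {
        long start = System.nanoTime();
        blocksMapUpdater.shutdown();
        nameCacheUpdater.shutdown();
        try {
            boolean done = blocksMapUpdater.awaitTermination(1, TimeUnit.MINUTES)
                && nameCacheUpdater.awaitTermination(1, TimeUnit.MINUTES);
            if (!done) {
                throw new IllegalStateException("updaters did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for updaters", e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Safe to call only after shutdownAndWait() has returned.
    int blocksMapSize() { return blocksMap.size(); }
    int nameCacheSize() { return nameCache.size(); }

    public static void main(String[] args) {
        ParallelCacheUpdateSketch sketch = new ParallelCacheUpdateSketch();
        sketch.startUpdaters();
        for (int i = 0; i < 3; i++) {
            sketch.onInodeFileLoaded("file-" + i, i);
        }
        long waitedMs = sketch.shutdownAndWait();
        System.out.println("blocks=" + sketch.blocksMapSize()
            + " names=" + sketch.nameCacheSize() + " waitedMs=" + waitedMs);
    }
}
```

Because the updates are queued while the sections are still being loaded, most
of the work should already be finished by the time the executors are shut down,
which is consistent with the short waiting duration in the logs.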
Tested this patch twice.
{code:java}
Test1.
20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1
20/07/31 18:27:51 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 367 seconds.
Test2.
20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1
20/07/31 18:48:04 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 363 seconds.{code}
Based on my tests, this gives a speedup of about 20%, reducing the load time
from 460s+ to 360s+.
I think this patch may be the best choice. [~sodonnell], could you help me test
it on trunk?
> Update block map and name cache in parallel while loading fsimage.
> ------------------------------------------------------------------
>
> Key: HDFS-15493
> URL: https://issues.apache.org/jira/browse/HDFS-15493
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Chengwei Wang
> Priority: Major
> Attachments: HDFS-15493.001.patch, HDFS-15493.002.patch,
> HDFS-15493.003.patch, HDFS-15493.004.patch, fsimage-loading.log
>
>
> While loading the INodeDirectorySection of the fsimage, the name cache and
> block map are updated after each inode file is added to its inode directory.
> Running these steps in parallel would reduce the time cost of fsimage loading.
> In our test case, with patches HDFS-13694 and HDFS-14617, the time cost to
> load the fsimage (220M files & 240M blocks) is 470s; with this patch, the time
> cost is reduced to 410s.