[
https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166818#comment-17166818
]
Chengwei Wang commented on HDFS-15493:
--------------------------------------
Hi [~sodonnell], thanks for your detailed review and testing.
{quote}When you tested, are you sure the parallel loading in HDFS-14617 was
enabled correctly, by first saving the image to create the sub-sections in the
image index? If it is working correctly, you should see log messages like:
{quote}
I'm sure that parallel loading was enabled correctly. I tested again yesterday
following your suggestions and have attached a summary log
here: [^fsimage-loading.log]
In my tests (240M inodes + 220M blocks), enabling asynchronous block updates
reduced the fsimage loading time from 467s to 420s. So I wonder whether the
scale of the fsimage is what makes the loading improvement less obvious.
{quote}It would be very interesting to check the performance of my earlier
suggestion with two single threaded executors and see how it performs.
{quote}
I tested loading the caches and blocks with two single-threaded executors. As
in your test, there was a long wait for the executors to terminate, so the
time cost was no better than with one executor with four threads.
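As a rough illustration of the two-executor setup discussed above, here is a minimal sketch. It is not code from the patch: the class name, the counters standing in for the name cache and block map, and the one-minute timeout are all assumptions. It only shows where the wait for termination occurs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; none of these names come from the HDFS patch.
class TwoExecutorSketch {

  /** Returns {nameCacheUpdates, blockMapUpdates} once both executors have drained. */
  static long[] load(int inodeCount) {
    ExecutorService nameCacheExecutor = Executors.newSingleThreadExecutor();
    ExecutorService blockMapExecutor = Executors.newSingleThreadExecutor();
    AtomicLong nameCacheUpdates = new AtomicLong();
    AtomicLong blockMapUpdates = new AtomicLong();

    for (int i = 0; i < inodeCount; i++) {
      // Stand-ins for the real name cache and block map updates.
      nameCacheExecutor.execute(nameCacheUpdates::incrementAndGet);
      blockMapExecutor.execute(blockMapUpdates::incrementAndGet);
    }

    // Stop accepting work, then block until the queued updates finish.
    // This is the wait that dominated the measured time in the tests above.
    nameCacheExecutor.shutdown();
    blockMapExecutor.shutdown();
    try {
      if (!nameCacheExecutor.awaitTermination(1, TimeUnit.MINUTES)
          || !blockMapExecutor.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("updates did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return new long[] {nameCacheUpdates.get(), blockMapUpdates.get()};
  }

  public static void main(String[] args) {
    long[] counts = load(100_000);
    System.out.println("name cache updates: " + counts[0]
        + ", block map updates: " + counts[1]);
  }
}
```

Because each executor has a single thread, updates of one kind are applied in submission order, but the loader thread still has to wait for both queues to empty before it can continue.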
{quote}If we could move the executor shutdown to the end of image loading,
rather than wait on it, we would see a good improvement in the parallel case
too. However, I am not sure if that is a safe thing to do - other sections may
depend on the block map / cache being loaded fully when the inode directory
section has completed.
{quote}
I agree this idea is a better approach. I will check whether it is safe and
report a test result.
By the way, I will refactor some code following your suggestions and submit a
patch soon.
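To make the deferred-shutdown idea concrete, the following is a minimal sketch under stated assumptions, not the actual loader code. The class, the `loadRemainingSections` placeholder, and the `ConcurrentHashMap` standing in for the block map are all hypothetical. It assumes the remaining sections do not read the block map before the final wait, which is exactly the safety question that still needs checking.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names come from the HDFS patch.
class DeferredShutdownSketch {

  static Map<Long, String> loadImage(int blockCount) {
    Map<Long, String> blockMap = new ConcurrentHashMap<>();
    ExecutorService updater = Executors.newFixedThreadPool(4);
    try {
      // Directory section: queue the updates and move on without waiting.
      for (long id = 0; id < blockCount; id++) {
        final long blockId = id;
        updater.execute(() -> blockMap.put(blockId, "file-" + blockId));
      }
      // Later sections overlap with the queued updates. This is only safe
      // if they do not depend on blockMap being fully populated (assumption).
      loadRemainingSections();
    } finally {
      updater.shutdown();
    }
    // Single wait at the very end of image loading.
    try {
      if (!updater.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("block map updates did not finish");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return blockMap;
  }

  // Placeholder for the sections loaded after the directory section.
  static void loadRemainingSections() {
  }

  public static void main(String[] args) {
    System.out.println("blocks loaded: " + loadImage(100_000).size());
  }
}
```

The only structural difference from waiting immediately is where `awaitTermination` is called, so any section that does read the block map would need its own synchronization point before it runs.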
> Update block map and name cache in parallel while loading fsimage.
> ------------------------------------------------------------------
>
> Key: HDFS-15493
> URL: https://issues.apache.org/jira/browse/HDFS-15493
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Chengwei Wang
> Priority: Major
> Attachments: HDFS-15493.001.patch, fsimage-loading.log
>
>
> While loading the INodeDirectorySection of the fsimage, the loader updates
> the name cache and block map after adding each inode file to its inode
> directory. Running these steps in parallel would reduce the time cost of
> fsimage loading.
> In our test case, with patches HDFS-13694 and HDFS-14617, the time cost to
> load the fsimage (220M files & 240M blocks) is 470s; with this patch, the
> time cost is reduced to 410s.