[ 
https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166393#comment-17166393
 ] 

Stephen O'Donnell commented on HDFS-15493:
------------------------------------------

I did a bit more testing:

1. Changed the code to use two single-threaded executors: one for the cache 
map and one for the block map.

2. Added a debug message that reports how long the executor service takes to 
shut down.
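
For readers who want to reproduce this kind of measurement, here is a minimal 
sketch of the setup described above. It is not the code from the patch: the 
class name, the field names and the toy map updates are all hypothetical 
stand-ins, and only the java.util.concurrent calls are real APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the HDFS-15493 patch.
class TwoExecutorSketch {
    // Toy stand-ins for the name cache and the block map.
    final Map<String, Integer> nameCache = new ConcurrentHashMap<>();
    final Map<Long, String> blockMap = new ConcurrentHashMap<>();

    /**
     * Queues one update per inode on each executor, then returns the number
     * of milliseconds spent waiting for both executors to drain.
     */
    long loadAndDrain(int inodeCount) {
        // One single-threaded executor per structure.
        ExecutorService nameCacheUpdater = Executors.newSingleThreadExecutor();
        ExecutorService blockMapUpdater = Executors.newSingleThreadExecutor();

        for (int i = 0; i < inodeCount; i++) {
            final int id = i;
            nameCacheUpdater.execute(
                () -> nameCache.merge("name-" + (id % 100), 1, Integer::sum));
            blockMapUpdater.execute(
                () -> blockMap.put((long) id, "inode-" + id));
        }

        // Time only the shutdown: any wait here is backlog still queued.
        long start = System.nanoTime();
        nameCacheUpdater.shutdown();
        blockMapUpdater.shutdown();
        try {
            boolean drained =
                nameCacheUpdater.awaitTermination(1, TimeUnit.MINUTES)
                    && blockMapUpdater.awaitTermination(1, TimeUnit.MINUTES);
            if (!drained) {
                throw new IllegalStateException("executors did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("Executors took " + waitedMs + " ms to shut down");
        return waitedMs;
    }

    public static void main(String[] args) {
        new TwoExecutorSketch().loadAndDrain(100_000);
    }
}
```

Because each map is only ever updated by its own thread, the two updaters do 
not contend with each other in this sketch.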

With parallel image loading disabled, the runtime is about the same or 
marginally better with the two single-threaded executors than with one 
executor with 4 threads.

With parallel on and:

 * Two single-threaded executors: 229 / 226 seconds (about 26 seconds of that 
spent waiting for the executors to shut down)
 * One executor with 4 threads: 243 / 238 seconds (a small performance 
degradation)
 * Feature disabled: 235 / 230 seconds

There are two times for each option because I ran each one twice.

From this, I believe two single-threaded executors are the best choice.

An interesting point from the parallel case with the single-threaded 
executors: the thread pools take about 25 - 30 seconds to shut down. This 
means that a single thread cannot keep up with the number of tasks being 
submitted. Adding more threads will not help due to locking. In the serial 
case the executors shut down almost immediately, indicating they can keep up.

If we could move the executor shutdown to the end of image loading, rather than 
wait on it, we would see a good improvement in the parallel case too. However, 
I am not sure if that is a safe thing to do - other sections may depend on the 
block map / cache being loaded fully when the inode directory section has 
completed.
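
To make the idea concrete, here is a minimal sketch of deferring the wait. It 
is only an illustration under the assumption that no later section reads the 
block map or cache, which is exactly the open question above. The class name, 
the loadImage method and the laterSections callback are hypothetical and do 
not correspond to anything in the HDFS code. As a rough upper bound, if all of 
the ~26 seconds of waiting could be overlapped, the 229 second run would drop 
to about 203 seconds.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the HDFS-15493 patch.
class DeferredShutdownSketch {
    /**
     * Queues taskCount toy updates, runs laterSections while the backlog
     * drains in the background, and only then waits for the executor.
     * Returns the number of updates that were applied.
     */
    static int loadImage(int taskCount, Runnable laterSections) {
        ExecutorService updater = Executors.newSingleThreadExecutor();
        AtomicInteger applied = new AtomicInteger();

        // Stand-in for the updates queued by the inode directory section.
        for (int i = 0; i < taskCount; i++) {
            updater.execute(applied::incrementAndGet);
        }

        // shutdown() does not block: it stops new submissions and lets the
        // already queued tasks keep running.
        updater.shutdown();

        // ASSUMPTION: the remaining sections do not depend on the updates
        // that are still pending. If they do, this ordering is unsafe.
        laterSections.run();

        // The blocking wait happens at the end of image loading.
        try {
            if (!updater.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("updater did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return applied.get();
    }

    public static void main(String[] args) {
        int applied = loadImage(100_000,
            () -> System.out.println("loading remaining sections"));
        System.out.println("applied " + applied + " updates");
    }
}
```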


> Update block map and name cache in parallel while loading fsimage.
> ------------------------------------------------------------------
>
>                 Key: HDFS-15493
>                 URL: https://issues.apache.org/jira/browse/HDFS-15493
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Chengwei Wang
>            Priority: Major
>         Attachments: HDFS-15493.001.patch
>
>
> While loading the INodeDirectorySection of the fsimage, the name cache and 
> block map are updated after each inode file is added to its inode directory. 
> Running these steps in parallel would reduce the time cost of fsimage 
> loading.
> In our test case, with patches HDFS-13694 and HDFS-14617, the time cost to 
> load the fsimage (220M files & 240M blocks) is 470s; with this patch, the 
> time cost is reduced to 410s.


