[jira] [Updated] (HDFS-11535) Performance analysis of new DFSNetworkTopology#chooseRandom

Yiqun Lin (JIRA) Mon, 20 Mar 2017 03:36:54 -0700

     [ 
https://issues.apache.org/jira/browse/HDFS-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Yiqun Lin updated HDFS-11535:
-----------------------------
    Attachment: HDFS-11535.002.patch

Thanks [~szetszwo] for sharing your thoughts!
I agree that the threshold-based approach may seem a little complicated for 
users. So I also agree with the two-trial approach. The total number of 
trials in the two-trial approach will never exceed 2, which I think is 
acceptable for us. 

I have taken some time to add a unit test for the two-trial approach. The 
following are my local test results:
{code}
Percentage: 0.9 avg time: 0.005580342 avg trials: 1.1042
Percentage: 0.8 avg time: 0.00815461 avg trials: 1.1996
Percentage: 0.7 avg time: 0.008995014 avg trials: 1.30315
Percentage: 0.6 avg time: 0.010933414 avg trials: 1.3927
Percentage: 0.5 avg time: 0.009327865 avg trials: 1.50345
Percentage: 0.4 avg time: 0.015638033 avg trials: 1.59705
Percentage: 0.3 avg time: 0.014731338 avg trials: 1.7
Percentage: 0.2 avg time: 0.013827324 avg trials: 1.8023
Percentage: 0.1 avg time: 0.017193155 avg trials: 1.89965
{code}
I think we can add my test to your patch if we settle on the two-trial 
approach. I have attached a new patch with the two-trial test added.
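
The average trial counts above are close to 2 - p, where p is the listed percentage. That matches what one would expect if the two-trial approach first picks a random node ignoring storage type and falls back to the storage-type-aware search only when the first pick misses. The following is a minimal sketch under that assumption, not the code from HDFS-11535.002.patch; the class TwoTrialSketch, its fields, and the uniform pick that stands in for the new search are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical simulation of a two-trial pick; not the code in the patch.
class TwoTrialSketch {
    /** Total trials used so far, across all calls to choose(). */
    long trials = 0;
    /** hasType[i] is true if node i has the wanted storage type. */
    final boolean[] hasType;
    /** Indices of nodes that have the wanted storage type. */
    final List<Integer> matching = new ArrayList<>();
    final Random rnd;

    /** Assumes nodes > 0; fraction is the share of nodes with the wanted type. */
    TwoTrialSketch(int nodes, double fraction, long seed) {
        hasType = new boolean[nodes];
        int withType = (int) Math.round(nodes * fraction);
        for (int i = 0; i < nodes; i++) {
            hasType[i] = i < withType;
            if (hasType[i]) {
                matching.add(i);
            }
        }
        rnd = new Random(seed);
    }

    /** Returns the index of a node with the wanted type, or -1 if none exists. */
    int choose() {
        // Trial 1: pick any node at random, ignoring storage type.
        trials++;
        int first = rnd.nextInt(hasType.length);
        if (hasType[first]) {
            return first;
        }
        // Trial 2: fall back to a type-aware pick (stand-in for the new method).
        trials++;
        if (matching.isEmpty()) {
            return -1;
        }
        return matching.get(rnd.nextInt(matching.size()));
    }

    /** Expected trials per pick: p * 1 + (1 - p) * 2 = 2 - p. */
    static double expectedTrials(double p) {
        return 2.0 - p;
    }

    public static void main(String[] args) {
        int picks = 20000;
        for (int k = 9; k >= 1; k--) {
            double p = k / 10.0;
            TwoTrialSketch t = new TwoTrialSketch(1000, p, 42L);
            for (int i = 0; i < picks; i++) {
                t.choose();
            }
            System.out.printf("Percentage: %.1f avg trials: %.4f expected: %.1f%n",
                p, (double) t.trials / picks, expectedTrials(p));
        }
    }
}
```

Under this model the trial count per pick is either 1 or 2, so the average can never exceed 2, and it approaches 2 only when very few nodes have the wanted storage type.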

> Performance analysis of new DFSNetworkTopology#chooseRandom
> -----------------------------------------------------------
>
>                 Key: HDFS-11535
>                 URL: https://issues.apache.org/jira/browse/HDFS-11535
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-11535.001.patch, HDFS-11535.002.patch, PerfTest.pdf
>
>
> This JIRA is created to post the results of some performance experiments we 
> did. For those who are interested, please see the attached .pdf file for 
> more detail. The attached patch file includes the experiment code we ran. 
> The key insight we got from these tests is that although *the new method 
> outperforms the current one in most cases*, there is still *one case where 
> the current one is better*: when there is only one storage type in 
> the cluster and we always look for this storage type. In this case, it 
> is simply a waste of time to perform storage-type-based pruning; blindly 
> picking a random node (the current method) would suffice.
> Therefore, based on the analysis, we propose to use a *combination of both 
> the old and the new methods*:
> say we search for a node of type X. Since inner nodes now all keep storage 
> type info, we can *just check the root node to see if X is the only type it 
> has*. If yes, blindly picking a random leaf will work, so we simply call the 
> old method; otherwise we call the new method.
> There is still at least one missing piece in this performance test, which is 
> garbage collection. The new method creates a few more objects when doing 
> the search, which adds overhead to GC. I'm still thinking about potential 
> optimizations, but this seems tricky, and I'm not sure whether this 
> optimization is worth doing at all. Please feel free to leave any 
> comments/suggestions.
> Thanks [~arpitagarwal] and [~szetszwo] for the offline discussion.
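
The combined approach proposed in the description could be sketched roughly as follows. This is only an illustration of the root check, assuming the root keeps a per-storage-type count for its subtree; CombinedChooseSketch, RootInfo, Method, and the local StorageType enum are hypothetical names and do not come from the Hadoop code base or the attached patches.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of the "check the root, then dispatch" idea.
class CombinedChooseSketch {
    // Local stand-in for storage types; not Hadoop's own enum.
    enum StorageType { DISK, SSD, ARCHIVE }

    // Which search to run.
    enum Method { OLD, NEW }

    /** Stand-in for the root inner node, which tracks storage-type counts. */
    static final class RootInfo {
        final Map<StorageType, Integer> counts = new EnumMap<>(StorageType.class);

        void add(StorageType type, int n) {
            counts.merge(type, n, Integer::sum);
        }

        /** True if the subtree has storages of the given type and of no other type. */
        boolean isOnlyType(StorageType type) {
            for (Map.Entry<StorageType, Integer> e : counts.entrySet()) {
                if (e.getKey() != type && e.getValue() > 0) {
                    return false;
                }
            }
            return counts.getOrDefault(type, 0) > 0;
        }
    }

    /**
     * If the wanted type is the only type in the cluster, any random leaf
     * qualifies, so the old (type-blind) method is enough. Otherwise use
     * the new (type-aware) method.
     */
    static Method pickMethod(RootInfo root, StorageType wanted) {
        return root.isOnlyType(wanted) ? Method.OLD : Method.NEW;
    }

    public static void main(String[] args) {
        RootInfo root = new RootInfo();
        root.add(StorageType.DISK, 100);
        System.out.println(pickMethod(root, StorageType.DISK));
        root.add(StorageType.SSD, 5);
        System.out.println(pickMethod(root, StorageType.DISK));
    }
}
```

The check touches only the root, so its cost does not depend on cluster size; the dispatch adds a constant overhead in front of whichever search is chosen.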



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

