[ 
https://issues.apache.org/jira/browse/HADOOP-17408?focusedWorklogId=532169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-532169
 ]

ASF GitHub Bot logged work on HADOOP-17408:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Jan/21 22:30
            Start Date: 06/Jan/21 22:30
    Worklog Time Spent: 10m 
      Work Description: hadoop-yetus commented on pull request #2514:
URL: https://github.com/apache/hadoop/pull/2514#issuecomment-755756961


   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m  0s |  |  Docker mode activated.  |
   | -1 :x: |  patch  |   0m  9s |  |  https://github.com/apache/hadoop/pull/2514 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/2514 |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2514/3/console |
   | versions | git=2.17.1 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 532169)
    Time Spent: 40m  (was: 0.5h)

> Optimize NetworkTopology while sorting of block locations
> ---------------------------------------------------------
>
>                 Key: HADOOP-17408
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17408
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, net
>            Reporter: Ahmed Hussein
>            Assignee: Ahmed Hussein
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> In {{NetworkTopology}}, I noticed some low-hanging fruit for improving 
> performance.
> Inside {{sortByDistance}}, {{Collections.shuffle}} is performed on the list 
> before calling {{secondarySort}}.
> {code:java}
> Collections.shuffle(list, r);
> if (secondarySort != null) {
>   secondarySort.accept(list);
> }
> {code}
> However, at several call sites, {{Collections.shuffle}} is passed as the 
> {{secondarySort}} to {{sortByDistance}}. This means that the shuffle is executed 
> twice on each list.
> Also, logically, it is pointless to shuffle before applying a tie breaker, 
> because the tie breaker may reorder the list and discard the shuffle's work.
> In addition, [~daryn] reported that:
> * the topology unnecessarily locks and unlocks to calculate the distance for 
> every node
> * shuffling uses a seeded Random, which is heavily synchronized, instead of 
> ThreadLocalRandom



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
