[ 
https://issues.apache.org/jira/browse/HDFS-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005546#comment-14005546
 ] 

Yongjun Zhang commented on HDFS-6268:
-------------------------------------

Hi Andrew,

I went through your latest version (4). It's a nice rewrite and much easier to 
read, and sharing code with NetworkTopologyWithNodeGroup is a very good thing. 
Thanks for the effort.

It all looks good to me. I have one minor comment: I'm still thinking about 
reordering the two sorts, from
{code}
      networktopology.sortByDistance(client, b.getLocations(), b
          .getBlock().getBlockId());
      // Move decommissioned/stale datanodes to the bottom
      Arrays.sort(b.getLocations(), comparator);
{code}
to
{code}
      // Move decommissioned/stale datanodes to the bottom
      Arrays.sort(b.getLocations(), comparator);
      int activeLen = (index of the first live DN found by traversing the
          array backward, plus 1);
      networktopology.sortByDistance(client, b.getLocations(), activeLen, b
          .getBlock().getBlockId());
{code}
We would also modify the sortByDistance method to take an additional length 
parameter, so that the decommissioned/stale nodes are excluded from the 
distance sort. Does this make sense to you? If you agree with the change, we 
can file a separate JIRA for it.
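
To make the idea concrete, here is a rough, self-contained sketch of the 
proposed ordering. It is not the real HDFS code: the Node class, its fields, 
and the SortSketch helper are hypothetical stand-ins, and a plain sort by a 
precomputed distance stands in for sortByDistance. It only illustrates moving 
inactive nodes to the bottom first, finding activeLen by a backward scan, and 
then sorting just the first activeLen entries.

```java
import java.util.Arrays;
import java.util.Comparator;

final class SortSketch {

  /** Hypothetical stand-in for a datanode location (not DatanodeInfo). */
  static final class Node {
    final String name;
    final int distance;      // assumed: precomputed distance from the client
    final boolean inactive;  // assumed: true if decommissioned or stale

    Node(String name, int distance, boolean inactive) {
      this.name = name;
      this.distance = distance;
      this.inactive = inactive;
    }
  }

  /** Orders live nodes before inactive ones (false sorts before true). */
  static final Comparator<Node> INACTIVE_LAST =
      Comparator.comparing((Node n) -> n.inactive);

  /** Scans backward past the inactive tail; returns the count of live nodes. */
  static int activeLength(Node[] nodes) {
    int i = nodes.length - 1;
    while (i >= 0 && nodes[i].inactive) {
      i--;
    }
    return i + 1;
  }

  /** Applies the reordered two-step sort in place and returns activeLen. */
  static int sortLocations(Node[] nodes) {
    // Step 1: move decommissioned/stale nodes to the bottom.
    Arrays.sort(nodes, INACTIVE_LAST);
    // Step 2: find how many leading entries are live.
    int activeLen = activeLength(nodes);
    // Step 3: sort only the live prefix [0, activeLen) by distance.
    Arrays.sort(nodes, 0, activeLen,
        Comparator.comparingInt((Node n) -> n.distance));
    return activeLen;
  }
}
```

Because the distance sort only touches the range [0, activeLen), the inactive 
nodes keep their position at the bottom and never take part in the distance 
ordering.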

Thanks.




> Better sorting in NetworkTopology#pseudoSortByDistance when no local node is 
> found
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-6268
>                 URL: https://issues.apache.org/jira/browse/HDFS-6268
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.4.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>            Priority: Minor
>         Attachments: hdfs-6268-1.patch, hdfs-6268-2.patch, hdfs-6268-3.patch, 
> hdfs-6268-4.patch
>
>
> In NetworkTopology#pseudoSortByDistance, if no local node is found, it will 
> always place the first rack local node in the list in front.
> This became an issue when a dataset was loaded from a single datanode. This 
> datanode ended up being the first replica for all the blocks in the dataset. 
> When running an Impala query, the non-local reads when reading past a block 
> boundary were all hitting this node, meaning massive load skew.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
