[ https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093613#comment-14093613 ]
Jason Lowe commented on HDFS-6840:
----------------------------------
I think the previous behavior was not deterministic because of this code, which
was removed in the HDFS-6268 patch:
{code}
// put a random node at position 0 if it is not a local/local-rack node
if(tempIndex == 0 && localRackNode == -1 && nodes.length != 0) {
  swap(nodes, 0, r.nextInt(nodes.length));
}
{code}
The list used to be mostly deterministic, but the first node in the list (i.e.,
the one that is likely to be the only node clients actually use) was random.
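To illustrate the effect, here is a minimal stand-alone sketch, not the actual NetworkTopology code. The class and method names (FirstReplicaShuffle, randomizeFirst, firstChoiceCounts) and the "dnN" node names are made up for the example, and the tempIndex/localRackNode checks are collapsed into a single boolean. It counts which replica ends up first in the list for many off-rack clients, with and without the random swap.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; this is not the Hadoop NetworkTopology class.
final class FirstReplicaShuffle {

    // Swap two entries of the array in place.
    static void swap(String[] nodes, int i, int j) {
        String tmp = nodes[i];
        nodes[i] = nodes[j];
        nodes[j] = tmp;
    }

    // Simplified form of the removed snippet: if the reader has neither a
    // node-local nor a rack-local replica, move a random replica to position 0.
    static void randomizeFirst(String[] nodes, boolean readerHasLocalOrRackLocal, Random r) {
        if (!readerHasLocalOrRackLocal && nodes.length != 0) {
            swap(nodes, 0, r.nextInt(nodes.length));
        }
    }

    // For `clients` off-rack readers of the same block, count how often each
    // replica is first in the returned list.
    static int[] firstChoiceCounts(int replicas, int clients, boolean randomize, Random r) {
        int[] counts = new int[replicas];
        for (int c = 0; c < clients; c++) {
            // Every client starts from the same deterministic ordering.
            String[] nodes = new String[replicas];
            for (int i = 0; i < replicas; i++) {
                nodes[i] = "dn" + i;
            }
            if (randomize) {
                randomizeFirst(nodes, false, r);
            }
            // Recover the replica index from the made-up "dnN" name.
            counts[Integer.parseInt(nodes[0].substring(2))]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        Random r = new Random();
        // Without the swap, every off-rack client picks dn0 first.
        System.out.println(Arrays.toString(firstChoiceCounts(3, 3000, false, r)));
        // With the swap, first choices are spread across the replicas.
        System.out.println(Arrays.toString(firstChoiceCounts(3, 3000, true, r)));
    }
}
```

Without the swap all 3000 simulated clients land on dn0; with it the first choice should be spread roughly evenly across the three replicas, which matches the old behavior described above.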
I have not done a bisect to prove beyond doubt that it was HDFS-6268, but we've
run builds based on 2.4.1+ and on 2.5, and this behavior is brand new with 2.5.
There weren't many changes to the topology sorting code between 2.4.1 and 2.5.0
besides this one, and both the code and the JIRA for HDFS-6268 state that it
intentionally does not randomize the datanode list between clients. Besides
bisecting, I can probably try replacing the network topology class with the one
from before HDFS-6268 and see whether the behavior reverts to what it used to
be.
> Clients are always sent to the same datanode when read is off rack
> ------------------------------------------------------------------
>
> Key: HDFS-6840
> URL: https://issues.apache.org/jira/browse/HDFS-6840
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.5.0
> Reporter: Jason Lowe
> Priority: Critical
>
> After HDFS-6268 the sorting order of block locations is deterministic for a
> given block and locality level (e.g., local, rack, off-rack), so off-rack
> clients all see the same datanode for the same block. This leads to very
> poor behavior in distributed cache localization and other scenarios where
> many clients all want the same block data at approximately the same time.
> The one datanode is crushed by the load while the other replicas only handle
> local and rack-local requests.