[ https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093584#comment-14093584 ]
Jason Lowe commented on HDFS-6840:
----------------------------------
HDFS-6701 adds an option to randomize the returned datanodes, but it is off by
default. I'm not sure defaulting to off is a good thing, given the
significantly different load behavior and the heavy skew toward a single
datanode. If that skew is desired, I think it should be opt-in rather than
something users have to opt out of to avoid the skew.
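
To illustrate the difference in load behavior, here is a minimal sketch of
ordering replicas by locality tier, with and without shuffling inside each
tier. It is not the actual NetworkTopology code: Replica, ReplicaOrdering, the
host names, and the distance weights (2 for rack-local, 4 for off-rack) are
all hypothetical and only stand in for the real classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical stand-in for a replica location; not an HDFS class.
final class Replica {
    final String host;
    final int distance; // assumed weights: 0 = local, 2 = same rack, 4 = off rack

    Replica(String host, int distance) {
        this.host = host;
        this.distance = distance;
    }
}

final class ReplicaOrdering {
    // Returns replicas ordered nearest-first. When randomize is true, replicas
    // at the same distance are shuffled, so clients at the same locality level
    // do not all get the same first datanode.
    static List<Replica> order(List<Replica> replicas, boolean randomize, Random rng) {
        // TreeMap iterates its keys in ascending order, i.e. nearest tier first.
        Map<Integer, List<Replica>> tiers = new TreeMap<>();
        for (Replica r : replicas) {
            tiers.computeIfAbsent(r.distance, d -> new ArrayList<>()).add(r);
        }
        List<Replica> ordered = new ArrayList<>(replicas.size());
        for (List<Replica> tier : tiers.values()) {
            if (randomize) {
                Collections.shuffle(tier, rng);
            }
            ordered.addAll(tier);
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Three replicas that are all off-rack from the client's point of view.
        List<Replica> replicas = Arrays.asList(
            new Replica("dn1", 4), new Replica("dn2", 4), new Replica("dn3", 4));
        Random rng = new Random();
        System.out.println("deterministic first choice: "
            + order(replicas, false, rng).get(0).host);
        System.out.println("randomized first choice:    "
            + order(replicas, true, rng).get(0).host);
    }
}
```

With randomize set to false, every off-rack client in this sketch is handed
the same first replica for the block. With it set to true, the first choice
varies per call while nearer tiers still sort ahead of farther ones.
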
> Clients are always sent to the same datanode when read is off rack
> ------------------------------------------------------------------
>
> Key: HDFS-6840
> URL: https://issues.apache.org/jira/browse/HDFS-6840
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.5.0
> Reporter: Jason Lowe
> Priority: Critical
>
> After HDFS-6268 the sorting order of block locations is deterministic for a
> given block and locality level (e.g., local, rack, off-rack), so off-rack
> clients all see the same datanode for the same block. This leads to very
> poor behavior in distributed cache localization and other scenarios where
> many clients all want the same block data at approximately the same time.
> The one datanode is crushed by the load while the other replicas only handle
> local and rack-local requests.