[ https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095763#comment-14095763 ]
Daryn Sharp commented on HDFS-6840:
-----------------------------------
We believe, but have not proven, that this deterministic behavior is causing
further problems. Block replication and invalidation appear to be affected:
a change to the replication factor sometimes takes up to an hour to start, and
there is a slow but steady increase in blocks pending deletion on clusters
running 2.5. We believe the NN is repeatedly picking the same faulty DN as the
target of its copy-block and invalidate-block commands.
> Clients are always sent to the same datanode when read is off rack
> ------------------------------------------------------------------
>
> Key: HDFS-6840
> URL: https://issues.apache.org/jira/browse/HDFS-6840
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.5.0
> Reporter: Jason Lowe
> Priority: Critical
>
> After HDFS-6268 the sorting order of block locations is deterministic for a
> given block and locality level (e.g., local, rack, off-rack), so off-rack
> clients all see the same datanode for the same block. This leads to very
> poor behavior in distributed cache localization and other scenarios where
> many clients all want the same block data at approximately the same time.
> The one datanode is crushed by the load while the other replicas only handle
> local and rack-local requests.
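
The effect described above can be reproduced with a toy model. The sketch
below is not the HDFS sorting code; the (rack, host) tuples, the tier
constants, and the order_replicas helper are all hypothetical names chosen for
illustration. It groups replicas by locality tier and, when given a random
source, shuffles within each tier, which is one plausible way to spread
off-rack readers across replicas while still preferring closer ones.

```python
import random
from collections import Counter

# Hypothetical locality tiers; a smaller value means closer to the reader.
LOCAL, RACK_LOCAL, OFF_RACK = 0, 1, 2


def distance(client, replica):
    """Return the locality tier of a replica relative to a client.

    Both arguments are (rack, host) tuples. This is a toy model, not the
    real HDFS network topology.
    """
    if client == replica:
        return LOCAL
    if client[0] == replica[0]:
        return RACK_LOCAL
    return OFF_RACK


def order_replicas(client, replicas, rng=None):
    """Order replicas nearest-first; shuffle within a tier if rng is given."""
    tiers = {}
    for replica in replicas:
        tiers.setdefault(distance(client, replica), []).append(replica)
    ordered = []
    for tier in sorted(tiers):
        group = tiers[tier]
        if rng is not None:
            rng.shuffle(group)  # break ties randomly inside one tier
        ordered.extend(group)
    return ordered


replicas = [("rack1", "dn1"), ("rack2", "dn2"), ("rack3", "dn3")]
# 300 readers that are off-rack with respect to every replica.
clients = [("rack9", "client%d" % i) for i in range(300)]

# Deterministic order: every off-rack reader is sent to the same datanode.
fixed = Counter(order_replicas(c, replicas)[0] for c in clients)

# Shuffled within the tier: the first choice is spread over all replicas.
rng = random.Random(0)
spread = Counter(order_replicas(c, replicas, rng)[0] for c in clients)
```

In this model, fixed contains a single datanode that receives all 300 reads,
while spread divides them among the three replicas. A rack-local reader still
gets its rack-local replica first in both cases, because shuffling happens
only inside a tier.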