Wei-Chiu Chuang created HDFS-14283:
--------------------------------------
Summary: DFSInputStream to prefer cached replica
Key: HDFS-14283
URL: https://issues.apache.org/jira/browse/HDFS-14283
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 2.6.0
Environment: HDFS Caching
Reporter: Wei-Chiu Chuang
HDFS Caching offers performance benefits. However, the NameNode currently does
not give cached replicas higher priority, so HDFS caching is only reliably
useful when cache replication = 3, that is to say, when all replicas are cached
in memory and a client cannot randomly pick an uncached replica.
HDFS-6846 proposed letting the NameNode give higher priority to cached replicas.
Changing NameNode logic is always tricky, so that proposal didn't get much
traction.
Here I propose a different approach: let the client (DFSInputStream) prefer
cached replicas.
A {{LocatedBlock}} object already contains the cached replica locations, so the
client has the information it needs. I think we can change
{{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
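As a rough illustration of the idea (not the actual DFSInputStream code), the
selection could look something like the sketch below. It is self-contained and
uses plain strings in place of the real DataNode types; the class name
CachedReplicaPreference, the method pickReplica, and its parameters are all
hypothetical. The assumption is that the client walks the replica locations in
the order the NameNode returned them, skips nodes it has already given up on,
and returns a cached replica if one is usable, falling back to the first usable
uncached replica otherwise.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the replica choice; not real HDFS client code. */
final class CachedReplicaPreference {

    private CachedReplicaPreference() {
    }

    /**
     * Picks a DataNode to read from, preferring nodes that hold a cached replica.
     *
     * @param locations       all replica locations, in NameNode-provided order
     * @param cachedLocations the subset of locations holding a cached replica
     * @param ignoredNodes    nodes the client should skip (e.g. known bad nodes)
     * @return the chosen node, or null if every location is ignored
     */
    static String pickReplica(List<String> locations,
                              Set<String> cachedLocations,
                              Collection<String> ignoredNodes) {
        String fallback = null;
        for (String node : locations) {
            if (ignoredNodes.contains(node)) {
                continue; // skip nodes the client has already ruled out
            }
            if (cachedLocations.contains(node)) {
                return node; // first usable cached replica wins
            }
            if (fallback == null) {
                fallback = node; // remember the first usable uncached replica
            }
        }
        return fallback;
    }
}
```

With this shape, the existing ordering is preserved whenever no cached replica
is usable, so behavior would only change for blocks that have at least one
cached replica.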