[
https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957551#comment-16957551
]
Ayush Saxena commented on HDFS-14283:
-------------------------------------
[~smeng] Whether to give priority to cached replicas is the client's choice,
and it would be a client configuration, specific to each client. If we make the
change on the server, clients that don't want this feature will also be
affected by the sorting at the NameNode, which they won't even use. This would
hurt read performance for normal clients.
Secondly, we can't check this config at the NameNode. It is a client config
that won't be specified at the NameNode, and it shouldn't be universal for all
clients. If you enable it at the NameNode, the NameNode will use its own value
for the server-side computation, irrespective of what the client has
configured. The client's choice or configured value isn't sent to the
NameNode.
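
To illustrate the point, here is a minimal sketch of per-client reordering. It
is not code from any of the attached patches. The configuration key name, the
class, and the method are hypothetical, and plain strings stand in for
datanode locations. The NameNode's answer stays the same for everyone, and
only a client that opts in reorders it locally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.Set;

final class ClientSideReplicaOrdering {
    // Hypothetical key name, used here for illustration only.
    static final String PREFER_CACHED_KEY = "example.client.read.prefer.cached";

    /**
     * Reorders the locations returned by the NameNode according to this
     * client's own configuration. With the flag off (the default), the
     * NameNode's order is returned untouched.
     */
    static List<String> orderForRead(Properties clientConf,
                                     List<String> fromNameNode,
                                     Set<String> cached) {
        boolean preferCached = Boolean.parseBoolean(
            clientConf.getProperty(PREFER_CACHED_KEY, "false"));
        if (!preferCached) {
            return fromNameNode;
        }
        List<String> ordered = new ArrayList<>();
        // Cached replicas first, keeping the NameNode's relative order.
        for (String node : fromNameNode) {
            if (cached.contains(node)) {
                ordered.add(node);
            }
        }
        // Then the uncached replicas, also in the NameNode's order.
        for (String node : fromNameNode) {
            if (!cached.contains(node)) {
                ordered.add(node);
            }
        }
        return ordered;
    }
}
```

A client that never sets the flag pays nothing beyond one property lookup,
and no work is added on the server side.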
> DFSInputStream to prefer cached replica
> ---------------------------------------
>
> Key: HDFS-14283
> URL: https://issues.apache.org/jira/browse/HDFS-14283
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.6.0
> Environment: HDFS Caching
> Reporter: Wei-Chiu Chuang
> Assignee: Lisheng Sun
> Priority: Major
> Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch,
> HDFS-14283.003.patch, HDFS-14283.004.patch
>
>
> HDFS Caching offers performance benefits. However, the NameNode currently
> does not give cached replicas higher priority, so HDFS caching is only
> useful when cache replication = 3, that is to say, when all replicas are
> cached in memory, so that a client can't randomly pick an uncached replica.
> HDFS-6846 proposed letting the NameNode give higher priority to cached
> replicas. Changing logic in the NameNode is always tricky, so that didn't
> get much traction. Here I propose a different approach: let the client
> (DFSInputStream) prefer cached replicas.
> A {{LocatedBlock}} object already contains cached replica location so a
> client has the needed information. I think we can change
> {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
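
The selection described above could look roughly like the sketch below. It is
a simplified illustration, not the actual patch. Strings stand in for the
datanode entries a client would obtain from {{LocatedBlock#getLocations()}}
and {{LocatedBlock#getCachedLocations()}}, and the class name, method name,
and ignored-node set are assumptions made for the example.

```java
import java.util.List;
import java.util.Set;

final class CachedReplicaChooser {
    /**
     * Picks a node to read from: the first cached node that is not ignored,
     * otherwise the first non-ignored node in the given order, otherwise
     * null when every node is ignored.
     */
    static String chooseNode(List<String> locations,
                             Set<String> cachedLocations,
                             Set<String> ignoredNodes) {
        // First pass: prefer a replica that is cached in memory.
        for (String node : locations) {
            if (cachedLocations.contains(node) && !ignoredNodes.contains(node)) {
                return node;
            }
        }
        // Second pass: fall back to the existing order.
        for (String node : locations) {
            if (!ignoredNodes.contains(node)) {
                return node;
            }
        }
        return null;
    }
}
```

Because the fallback keeps the original order, a block with no cached replica
is read exactly as before.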