[ 
https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955428#comment-16955428
 ] 

Ayush Saxena commented on HDFS-14283:
-------------------------------------

Thanks [~leosun08] for the patch. I had a quick look at the idea but couldn't 
check the whole code. Some concerns:
 * I think preferring the cached replica should be optional and governed by a 
client-side config, so each client can choose whether it wants it.
 * Secondly, the sorting changes have moved to the server side too. I think 
this would have a performance impact even for those who don't want to prefer 
the cached locations. The original intent of this Jira was to keep the logic 
on the client side, so I think we should refrain from server-side changes.
 * Also make sure that those not interested in using cached replicas are not 
affected by any means. All the processing for this should happen only if the 
feature is turned on, and it should be off by default.
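
To illustrate the kind of client-side gating I have in mind, here is a rough 
sketch. It is not taken from the patch: the class name, the method name and 
the use of plain strings for datanodes are all made up for illustration, and 
the boolean stands in for whatever client config key we end up with. The 
point is only that with the flag off the location list is returned untouched 
and no extra work is done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, self-contained sketch. Strings stand in for datanode
// descriptors; nothing here is the real DFSInputStream code.
public class CachedReplicaPreference {

    /**
     * Returns the replica locations in the order the client would try them.
     *
     * @param locations       all replica locations, in the order received
     * @param cachedLocations the subset of locations holding a cached replica
     * @param preferCached    assumed to come from a client-side config that
     *                        defaults to false
     */
    public static List<String> orderForRead(List<String> locations,
                                            Set<String> cachedLocations,
                                            boolean preferCached) {
        // Feature off (the default) or nothing cached: hand back the same
        // list, so clients that do not opt in pay no cost.
        if (!preferCached || cachedLocations.isEmpty()) {
            return locations;
        }
        // Feature on: cached replicas first, then the rest, keeping the
        // original relative order inside each group.
        List<String> ordered = new ArrayList<>(locations.size());
        for (String node : locations) {
            if (cachedLocations.contains(node)) {
                ordered.add(node);
            }
        }
        for (String node : locations) {
            if (!cachedLocations.contains(node)) {
                ordered.add(node);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> locations = List.of("dn1", "dn2", "dn3");
        Set<String> cached = Set.of("dn3");
        System.out.println(orderForRead(locations, cached, false)); // [dn1, dn2, dn3]
        System.out.println(orderForRead(locations, cached, true));  // [dn3, dn1, dn2]
    }
}
```

Since everything happens on the list the client already holds, nothing on the 
server side would need to change for this.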

 

> DFSInputStream to prefer cached replica
> ---------------------------------------
>
>                 Key: HDFS-14283
>                 URL: https://issues.apache.org/jira/browse/HDFS-14283
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>         Environment: HDFS Caching
>            Reporter: Wei-Chiu Chuang
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch, 
> HDFS-14283.003.patch
>
>
> HDFS Caching offers performance benefits. However, the NameNode currently 
> does not give cached replicas higher priority, so HDFS caching is only 
> useful when cache replication = 3, that is, when all replicas are cached in 
> memory and a client cannot randomly pick an uncached replica.
> HDFS-6846 proposed letting the NameNode give higher priority to cached 
> replicas. Changing logic in the NameNode is always tricky, so that didn't 
> get much traction. Here I propose a different approach: let the client 
> (DFSInputStream) prefer cached replicas.
> A {{LocatedBlock}} object already contains the cached replica locations, so 
> a client has the needed information. I think we can change 
> {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.



