[ 
https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972895#comment-16972895
 ] 

Siyao Meng edited comment on HDFS-14283 at 11/12/19 11:20 PM:
--------------------------------------------------------------

[~leosun08] I discussed this with [~weichiu]. I'm fine with dropping the 
server-side sorting logic so that this patch needs no server-side changes. 
One reason is that in most cases a block will have only one cached replica.

We will then simply let the client prefer the cached replica, controlled by a 
configuration option.


> DFSInputStream to prefer cached replica
> ---------------------------------------
>
>                 Key: HDFS-14283
>                 URL: https://issues.apache.org/jira/browse/HDFS-14283
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>         Environment: HDFS Caching
>            Reporter: Wei-Chiu Chuang
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch, 
> HDFS-14283.003.patch, HDFS-14283.004.patch
>
>
> HDFS Caching offers performance benefits. However, the NameNode currently does 
> not give cached replicas higher priority, so HDFS Caching is only useful 
> when cache replication = 3, that is, when all replicas are cached in 
> memory and a client cannot randomly pick an uncached replica.
> HDFS-6846 proposed letting the NameNode give higher priority to cached 
> replicas. Changing NameNode logic is always tricky, so that proposal didn't 
> get much traction. Here I propose a different approach: let the client 
> (DFSInputStream) prefer cached replicas.
> A {{LocatedBlock}} object already contains the cached replica locations, so a 
> client has the information it needs. I think we can change 
> {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
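
The client-side approach described above could be sketched roughly as follows. This is an illustration only, not the attached patch and not Hadoop code: the class CachedReplicaPreference, the method order(), the preferCached flag (standing in for the proposed configuration option) and the use of plain strings for DataNode locations are all hypothetical. The only idea taken from the issue is that the client already knows which locations are cached and can try those first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from Hadoop or the patch.
public class CachedReplicaPreference {

    /**
     * Returns the replica locations in the order a client would try them.
     * When preferCached is true, locations that also appear in
     * cachedLocations move to the front; relative order is otherwise kept.
     * When preferCached is false, the original order is returned unchanged.
     */
    static List<String> order(List<String> locations,
                              Set<String> cachedLocations,
                              boolean preferCached) {
        if (!preferCached || cachedLocations.isEmpty()) {
            return new ArrayList<>(locations);
        }
        List<String> cached = new ArrayList<>();
        List<String> uncached = new ArrayList<>();
        for (String dn : locations) {
            // Partition into cached and uncached, preserving input order.
            (cachedLocations.contains(dn) ? cached : uncached).add(dn);
        }
        cached.addAll(uncached);
        return cached;
    }

    public static void main(String[] args) {
        List<String> locations = List.of("dn1", "dn2", "dn3");
        Set<String> cachedLocations = Set.of("dn3"); // typically one cached replica

        System.out.println(order(locations, cachedLocations, true));  // [dn3, dn1, dn2]
        System.out.println(order(locations, cachedLocations, false)); // [dn1, dn2, dn3]
    }
}
```

Because the reordering happens entirely on the client, nothing on the NameNode has to change, and turning the flag off restores the existing behaviour.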



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
