[
https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099652#comment-17099652
]
Lisheng Sun commented on HDFS-14283:
------------------------------------
Thanks [~ayushtkn] for your suggestion.
{quote}
+ if (!deadNodes.containsKey(cachedLocs[i])
For this, can we use dfsClient.getDeadNodes(this).containsKey(nodes[i])? It was
added as part of the DeadDatanodeDetection feature. If so, maybe we can refactor
the if checks into a single method and use it in both places.
{quote}
The v008 patch fixes this.
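For illustration only, the shared check could look roughly like the sketch below. It is not the code from the v008 patch: {{ReplicaCheckSketch}}, {{isUsable}} and {{firstUsable}} are hypothetical names, DataNodes are modelled as plain strings, and the {{ignoredNodes}} condition is an assumption about what the existing if checks cover.

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical sketch, not the v008 patch: DataNodes are modelled as plain
// strings, and the dead-node map is passed in rather than fetched through
// dfsClient.getDeadNodes(this).
final class ReplicaCheckSketch {

  /** One predicate shared by the cached-replica path and the regular path. */
  static boolean isUsable(String node,
                          Map<String, String> deadNodes,
                          Collection<String> ignoredNodes) {
    // Assumed rule: a node is usable if it is neither known dead nor ignored.
    return !deadNodes.containsKey(node)
        && (ignoredNodes == null || !ignoredNodes.contains(node));
  }

  /** Returns the first usable node in the array, or null if there is none. */
  static String firstUsable(String[] candidates,
                            Map<String, String> deadNodes,
                            Collection<String> ignoredNodes) {
    if (candidates == null) {
      return null;
    }
    for (String node : candidates) {
      if (isUsable(node, deadNodes, ignoredNodes)) {
        return node;
      }
    }
    return null;
  }
}
```

With a helper like this, the loop over cachedLocs and the loop over nodes would call the same predicate, so the two checks cannot drift apart.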
{quote}
return new DNAddrPair(chosenNode, targetAddr, storageType, block);
storageType will be null when using a cached replica. Is that OK?
{quote}
storageType is only used when the NameNode chooses a DataNode.
It is not used when the client establishes a connection with a DataNode.
See the parameters of Sender#readBlock and DataXceiver#readBlock:
{code:java}
public void readBlock(final ExtendedBlock block,
    final Token<BlockTokenIdentifier> blockToken,
    final String clientName,
    final long blockOffset,
    final long length,
    final boolean sendChecksum,
    final CachingStrategy cachingStrategy) throws IOException {
  // Body omitted: the signature has no StorageType parameter.
}
{code}
> DFSInputStream to prefer cached replica
> ---------------------------------------
>
> Key: HDFS-14283
> URL: https://issues.apache.org/jira/browse/HDFS-14283
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.6.0
> Environment: HDFS Caching
> Reporter: Wei-Chiu Chuang
> Assignee: Lisheng Sun
> Priority: Major
> Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch,
> HDFS-14283.003.patch, HDFS-14283.004.patch, HDFS-14283.005.patch,
> HDFS-14283.006.patch, HDFS-14283.007.patch, HDFS-14283.008.patch
>
>
> HDFS Caching offers performance benefits. However, the NameNode currently does
> not give cached replicas higher priority, so HDFS caching is only useful
> when cache replication = 3, that is, when all replicas are cached in
> memory and a client cannot randomly pick an uncached replica.
> HDFS-6846 proposed letting the NameNode give higher priority to cached
> replicas. Changing logic in the NameNode is always tricky, so that proposal
> didn't get much traction. Here I propose a different approach: let the client
> (DFSInputStream) prefer cached replicas.
> A {{LocatedBlock}} object already contains the cached replica locations, so the
> client has the information it needs. I think we can change
> {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
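
A rough sketch of the client-side preference described above: try the cached replica locations first, then fall back to the regular locations. This is not the actual change to {{DFSInputStream#getBestNodeDNAddrPair()}}. {{CachedReplicaChooserSketch}} and {{chooseNode}} are hypothetical names, nodes are plain strings, and the {{usable}} predicate stands in for whatever dead-node or ignored-node check the client applies.

```java
import java.util.function.Predicate;

// Hypothetical sketch of "prefer cached replica" on the client side.
// cachedLocs and nodes stand in for the cached and regular replica
// locations carried by a located block; they are plain strings here.
final class CachedReplicaChooserSketch {

  /**
   * Picks the first usable cached location; if there is none, picks the
   * first usable regular location. Returns null when nothing is usable.
   */
  static String chooseNode(String[] cachedLocs,
                           String[] nodes,
                           Predicate<String> usable) {
    String chosen = firstMatch(cachedLocs, usable);
    return chosen != null ? chosen : firstMatch(nodes, usable);
  }

  private static String firstMatch(String[] candidates,
                                   Predicate<String> usable) {
    if (candidates == null) {
      return null;
    }
    for (String node : candidates) {
      if (usable.test(node)) {
        return node;
      }
    }
    return null;
  }
}
```

Because the fallback only runs when no cached location passes the check, a client without any cached replica behaves as before in this sketch.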