[ 
https://issues.apache.org/jira/browse/HDFS-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170134#comment-14170134
 ] 

Colin Patrick McCabe commented on HDFS-7142:
--------------------------------------------

bq. The call to ceiling need not return the exact match. If you get a non-null 
result you need a check for blockId and bpid match. Perhaps I misunderstood the 
intention.

You're right... I need to check to make sure that the bpid and block id are the 
same after getting back a result from {{ceiling}}.  Fixed.
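
A minimal sketch of that check, for illustration only. {{ReplicaKey}}, {{ReplicaLookup}} and the comparator are hypothetical stand-ins, not the classes in the patch; only {{TreeSet.ceiling}} is the real JDK call. {{ceiling}} returns the least element greater than or equal to the probe, so a non-null result may be a different block, or even a block in a different pool.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical key type; the real patch tracks richer replica objects.
final class ReplicaKey {
    final String bpid;
    final long blockId;

    ReplicaKey(String bpid, long blockId) {
        this.bpid = bpid;
        this.blockId = blockId;
    }
}

// Hypothetical lookup wrapper around a set sorted by (bpid, blockId).
final class ReplicaLookup {
    static final Comparator<ReplicaKey> BY_POOL_AND_ID =
        Comparator.comparing((ReplicaKey k) -> k.bpid)
                  .thenComparingLong(k -> k.blockId);

    private final TreeSet<ReplicaKey> replicasSortedByBlockPoolAndId =
        new TreeSet<>(BY_POOL_AND_ID);

    void add(String bpid, long blockId) {
        replicasSortedByBlockPoolAndId.add(new ReplicaKey(bpid, blockId));
    }

    // Returns the tracked key for exactly (bpid, blockId), or null if absent.
    ReplicaKey find(String bpid, long blockId) {
        ReplicaKey candidate =
            replicasSortedByBlockPoolAndId.ceiling(new ReplicaKey(bpid, blockId));
        if (candidate == null) {
            return null; // nothing at or above the probe
        }
        // ceiling() may return a later element, so verify the exact match.
        if (!candidate.bpid.equals(bpid) || candidate.blockId != blockId) {
            return null;
        }
        return candidate;
    }
}
```

Without the second check, a lookup for a missing block would silently return its successor in sort order.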

bq. dequeueNextReplicaToPersist appears to have a starvation issue. If replicas 
in a higher blockPoolId keep getting added constantly a replica in a lower bpid 
may wait indefinitely to get persisted. It would be good to persist replicas in 
the same order in which they were originally added, you can do that with an 
additional set.

My mistake here was looking at the lowest value in 
{{replicasSortedByBlockPoolAndId}}.  It should be looking at the lowest value 
in {{replicasSortedByTierAndLastUsed}}.  There's no starvation issue if it 
looks at the set which is sorted by lastUsed, because oldest replicas will get 
picked first.  Fixed.
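
A rough sketch of why ordering by lastUsed avoids the starvation, under stated assumptions: {{TrackedReplica}}, {{PersistQueue}}, the integer {{tier}} field and the tie-breaking fields are all hypothetical and are not taken from the patch. The point is only that the head of a set sorted by (tier, lastUsed) is the oldest replica in the lowest tier, regardless of its block pool ID.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical replica record used only for this illustration.
final class TrackedReplica {
    final String bpid;
    final long blockId;
    final int tier;       // assumed: lower tier sorts first
    final long lastUsed;  // assumed: timestamp, smaller means older

    TrackedReplica(String bpid, long blockId, int tier, long lastUsed) {
        this.bpid = bpid;
        this.blockId = blockId;
        this.tier = tier;
        this.lastUsed = lastUsed;
    }
}

// Hypothetical queue of replicas that still need to be persisted.
final class PersistQueue {
    // Sort by tier, then age; bpid and blockId only break ties so that
    // distinct replicas with equal timestamps are not collapsed by the set.
    static final Comparator<TrackedReplica> BY_TIER_AND_LAST_USED =
        Comparator.comparingInt((TrackedReplica r) -> r.tier)
                  .thenComparingLong(r -> r.lastUsed)
                  .thenComparing(r -> r.bpid)
                  .thenComparingLong(r -> r.blockId);

    private final TreeSet<TrackedReplica> replicasSortedByTierAndLastUsed =
        new TreeSet<>(BY_TIER_AND_LAST_USED);

    void add(TrackedReplica replica) {
        replicasSortedByTierAndLastUsed.add(replica);
    }

    // Removes and returns the oldest replica in the lowest tier,
    // or null when nothing is waiting.
    TrackedReplica dequeueNextReplicaToPersist() {
        return replicasSortedByTierAndLastUsed.pollFirst();
    }
}
```

In this sketch a replica in a high-numbered block pool that was added early is still dequeued before newer replicas in lower-numbered pools.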

bq. Same issue with numReplicasNotPersisted, it should not count all replicas 
in RAM. Let me clarify the documentation on 
RamDiskReplicaTracker.dequeueNextReplicaToPersist.

It seems to me that the number of replicas not persisted *is* all the replicas 
in RAM.  So perhaps the function needs to be renamed.  Can you clarify what 
this should count?

bq. Colin Patrick McCabe, any comments and updates to the patch?

Let me repost a patch fixing the first two issues pointed out.  I will wait 
for clarification on any other API issues.  I'm going to mark this as 
targeting 2.7.

> Implement a 2Q eviction strategy for HDFS-6581
> ----------------------------------------------
>
>                 Key: HDFS-7142
>                 URL: https://issues.apache.org/jira/browse/HDFS-7142
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: 0002-Add-RamDiskReplica2QTracker.patch
>
>
> We should implement a 2Q or approximate 2Q eviction strategy for HDFS-6581.  
> It is well known that LRU is a poor fit for scanning workloads, which HDFS 
> may often encounter. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
