[
https://issues.apache.org/jira/browse/HDFS-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170134#comment-14170134
]
Colin Patrick McCabe commented on HDFS-7142:
--------------------------------------------
bq. The call to ceiling need not return the exact match. If you get a non-null
result you need a check for blockId and bpid match. Perhaps I misunderstood the
intention.
You're right... I need to check to make sure that the bpid and block id are the
same after getting back a result from {{ceiling}}. Fixed.
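A minimal sketch of the exact-match check, assuming a {{TreeSet}} ordered by block pool id and then block id. The {{Key}} class, the comparator and the {{find}} helper are hypothetical stand-ins for illustration, not the code in the patch:

```java
import java.util.Comparator;
import java.util.TreeSet;

public class CeilingLookup {
    // Hypothetical stand-in for a replica key; not the real HDFS class.
    public static final class Key {
        final String bpid;
        final long blockId;

        public Key(String bpid, long blockId) {
            this.bpid = bpid;
            this.blockId = blockId;
        }
    }

    // Assumed ordering: block pool id first, then block id.
    public static final Comparator<Key> BY_BPID_AND_ID =
        Comparator.comparing((Key k) -> k.bpid)
                  .thenComparingLong(k -> k.blockId);

    // Returns the entry matching (bpid, blockId) exactly, or null if absent.
    public static Key find(TreeSet<Key> set, String bpid, long blockId) {
        // ceiling() returns the least element >= the probe, which may be
        // a different block or even belong to a different block pool.
        Key candidate = set.ceiling(new Key(bpid, blockId));
        if (candidate == null) {
            return null;
        }
        // Reject anything that is not an exact match.
        if (!candidate.bpid.equals(bpid) || candidate.blockId != blockId) {
            return null;
        }
        return candidate;
    }
}
```

Without the second check, a lookup for a missing block would silently return its successor in the set.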
bq. dequeueNextReplicaToPersist appears to have a starvation issue. If replicas
in a higher blockPoolId keep getting added constantly a replica in a lower bpid
may wait indefinitely to get persisted. It would be good to persist replicas in
the same order in which they were originally added, you can do that with an
additional set.
My mistake here was looking at the lowest value in
{{replicasSortedByBlockPoolAndId}}. It should be looking at the lowest value
in {{replicasSortedByTierAndLastUsed}}. There's no starvation issue if it
looks at the set which is sorted by lastUsed, because oldest replicas will get
picked first. Fixed.
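A rough sketch of why dequeuing from the lastUsed-ordered set avoids starvation. It assumes two sorted sets over the same replicas and leaves out the tier component for brevity. The {{Replica}} class, field names and {{dequeueOldest}} are hypothetical, not the code in the patch:

```java
import java.util.Comparator;
import java.util.TreeSet;

public class OldestFirstQueue {
    // Hypothetical stand-in for a RAM disk replica record.
    public static final class Replica {
        final String bpid;
        final long blockId;
        final long lastUsed;

        public Replica(String bpid, long blockId, long lastUsed) {
            this.bpid = bpid;
            this.blockId = blockId;
            this.lastUsed = lastUsed;
        }
    }

    // Index for lookups by (bpid, blockId).
    private final TreeSet<Replica> byBlockPoolAndId = new TreeSet<>(
        Comparator.comparing((Replica r) -> r.bpid)
                  .thenComparingLong(r -> r.blockId));

    // Index for picking the next replica; ties broken by (bpid, blockId).
    private final TreeSet<Replica> byLastUsed = new TreeSet<>(
        Comparator.comparingLong((Replica r) -> r.lastUsed)
                  .thenComparing(r -> r.bpid)
                  .thenComparingLong(r -> r.blockId));

    public void add(Replica r) {
        byBlockPoolAndId.add(r);
        byLastUsed.add(r);
    }

    // Removes and returns the replica with the smallest lastUsed value,
    // regardless of which block pool it belongs to; null if empty.
    public Replica dequeueOldest() {
        Replica oldest = byLastUsed.pollFirst();
        if (oldest != null) {
            byBlockPoolAndId.remove(oldest);
        }
        return oldest;
    }
}
```

Because the choice depends only on lastUsed, replicas that keep arriving in one block pool cannot indefinitely delay an older replica in another.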
bq. Same issue with numReplicasNotPersisted, it should not count all replicas
in RAM. Let me clarify the documentation on
RamDiskReplicaTracker.dequeueNextReplicaToPersist.
It seems to me that the number of replicas not persisted *is* all the replicas
in RAM. So perhaps the function needs to be renamed. Can you clarify what
this should count?
bq. Colin Patrick McCabe, any comments and updates to the patch?
Let me repost a patch fixing the first two issues pointed out. I will wait
for clarification on any other API issues. I'm going to mark this as
targeting 2.7.
> Implement a 2Q eviction strategy for HDFS-6581
> ----------------------------------------------
>
> Key: HDFS-7142
> URL: https://issues.apache.org/jira/browse/HDFS-7142
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: 0002-Add-RamDiskReplica2QTracker.patch
>
>
> We should implement a 2Q or approximate 2Q eviction strategy for HDFS-6581.
> It is well known that LRU is a poor fit for scanning workloads, which HDFS
> may often encounter.