[
https://issues.apache.org/jira/browse/HDFS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143698#comment-14143698
]
Colin Patrick McCabe commented on HDFS-6581:
--------------------------------------------
bq. Rebase is no silver bullet. Conflicts still need to be resolved manually.
Colin, explaining how to use git is a little condescending.
I apologize if it sounded condescending. I was just trying to point out that
the cost of maintaining a branch has gone down due to the switch to git.
bq. It would be very little code to convert the current FIFO approach to
something like LFU but writing code is easy and demonstrating it actually helps
HDFS clients is harder. For your caching feature the measurement was fairly
straightforward. The logic for deciding which replicas need to be in memory was
outside HDFS. For this feature we'd first need to define "better scheme" and
we'd need help from other stack components for evaluation. Think of this
feature as providing the overall framework including API, protocol changes and
DN support. If there is no argument with the framework design then there should
be no objection to doing the eviction fine tuning (which is a very small
proportion of the patch, perhaps less than 5% content wise) post-merge. And to
restate, we cannot get clients to start evaluating it until the changes are in
mainline.
I agree that testing is needed, and it will be time-consuming. But I don't
understand why LRU was implemented first. It's very well-known that LRU is a
poor fit for scan workloads, which most HDFS workloads are.
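To make the scan problem concrete, here is a minimal sketch of an LRU simulation. It is not HDFS code: `lru_hits` is a hypothetical helper, block IDs are assumed to be plain words without spaces, and the capacity is assumed to be at least 1.

```shell
# lru_hits CAPACITY BLOCK...
# Replays an access sequence against a simulated LRU cache and prints the
# number of hits. Hypothetical helper for illustration only.
lru_hits() (
  capacity=$1; shift
  cache=""   # space-separated block IDs, least recently used first
  size=0
  hits=0
  for block in "$@"; do
    case " $cache " in
      *" $block "*)
        # Hit: remove the block from its current position.
        hits=$((hits + 1))
        rest=""
        for b in $cache; do
          [ "$b" = "$block" ] || rest="$rest $b"
        done
        cache=${rest# }
        ;;
      *)
        if [ "$size" -ge "$capacity" ]; then
          # Miss with a full cache: evict the least recently used block.
          case "$cache" in
            *" "*) cache=${cache#* } ;;
            *) cache="" ;;
          esac
        else
          size=$((size + 1))
        fi
        ;;
    esac
    # The accessed block becomes the most recently used entry.
    cache="${cache:+$cache }$block"
  done
  echo "$hits"
)

# Two passes of a scan over 4 blocks with room for only 3:
lru_hits 3 1 2 3 4 1 2 3 4
```

In this sketch, each block is evicted just before the scan comes back to it, so the second pass gets no hits even though three of the four blocks would fit.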
My fear here is that we will try to implement a better eviction strategy, but
find that the pluggable API introduced in HDFS-7100 is too inflexible to do so.
I'm hoping that this fear is not justified, but until there is an actual LFU
or cold/warm/hot scheme implemented, we won't know for sure. As you said, this
isn't much code, so maybe I'll do it if it remains to be done later.
bq. Colin Patrick McCabe, I keep hearing usable eviction strategy and better
eviction strategy. What is it? How do you decide it is better or usable? We
should make sure the policy we go with is decent enough. I agree Fifo is not
it. As regards to other approaches and improvements, one can certainly make it
available using the plugin approach.
That's a good point. I think system-level testing will be needed. I think
it's fine to merge without this system-level testing being done, but I want
there to be at least one non-LRU implementation of eviction so that we know
that it's possible within this framework. Basically validating the plugin
architecture.
bq. Micro-benchmark to verify that SCR performance does not suffer with this
feature.
Thank you, Arpit. You might also consider using:
{code}
sudo sh -c "/usr/bin/echo 3 > /proc/sys/vm/drop_caches"
time hadoop fs -cat /my/non-lazy-persist-file
sudo sh -c "/usr/bin/echo 3 > /proc/sys/vm/drop_caches"
time hadoop fs -cat /my/lazy-persist-file
{code}
to get a benchmark that makes you look better :) Clearly the lazy-persist file
will still be in RAM after caches are dropped, whereas the non-lazy one will
not. I always repeat experiments 3 times and average the results; I left that
out for brevity.
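For completeness, the repeat-and-average step could be sketched roughly as follows. `avg_seconds` and the optional `before_run` hook are hypothetical names, and the sketch assumes GNU `date` (for `%N`) and `awk` are available.

```shell
# avg_seconds RUNS COMMAND [ARGS...]
# Runs COMMAND RUNS times, discards its stdout, and prints the mean
# wall-clock time in seconds. If a shell function named before_run is
# defined, it is called before each run and is not included in the timing.
# Hypothetical helper; assumes GNU date and awk.
avg_seconds() (
  runs=$1; shift
  total=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    if command -v before_run > /dev/null 2>&1; then
      before_run
    fi
    start=$(date +%s.%N)
    "$@" > /dev/null
    end=$(date +%s.%N)
    total=$(awk -v t="$total" -v s="$start" -v e="$end" \
      'BEGIN { printf "%.3f", t + e - s }')
    i=$((i + 1))
  done
  awk -v t="$total" -v n="$runs" 'BEGIN { printf "%.3f\n", t / n }'
)

# Intended use with the commands above (not executed here):
#   before_run() { sudo sh -c "/usr/bin/echo 3 > /proc/sys/vm/drop_caches"; }
#   avg_seconds 3 hadoop fs -cat /my/non-lazy-persist-file
#   avg_seconds 3 hadoop fs -cat /my/lazy-persist-file
```

Dropping the caches in the hook rather than in the timed command keeps the cost of the drop itself out of the measurement.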
> Write to single replica in memory
> ---------------------------------
>
> Key: HDFS-6581
> URL: https://issues.apache.org/jira/browse/HDFS-6581
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Arpit Agarwal
> Assignee: Arpit Agarwal
> Attachments: HDFS-6581.merge.01.patch, HDFS-6581.merge.02.patch,
> HDFS-6581.merge.03.patch, HDFS-6581.merge.04.patch, HDFS-6581.merge.05.patch,
> HDFS-6581.merge.06.patch, HDFS-6581.merge.07.patch, HDFS-6581.merge.08.patch,
> HDFS-6581.merge.09.patch, HDFSWriteableReplicasInMemory.pdf,
> Test-Plan-for-HDFS-6581-Memory-Storage.pdf
>
>
> Per discussion with the community on HDFS-5851, we will implement writing to
> a single replica in DN memory via DataTransferProtocol.
> This avoids some of the issues with short-circuit writes, which we can
> revisit at a later time.