[ 
https://issues.apache.org/jira/browse/HDFS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138252#comment-14138252
 ] 

Colin Patrick McCabe commented on HDFS-6581:
--------------------------------------------

bq. Benchmark the impact of CRC computation, evaluate moving it off the hot 
path.

We did some benchmarks for HDFS-4949 that put checksumming at 15% overhead when 
using native CRC32 with the Intel CRC instructions.  This was for the read 
path, not the write path, though.  For writes there will also be a small write 
I/O amplification factor due to checksumming.
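
As a rough illustration of both effects (a sketch only, not the HDFS-4949 
benchmark): the snippet below computes one CRC32 per chunk and estimates the 
extra bytes written for checksums.  It assumes 512-byte chunks and 4-byte 
checksums, and uses Python's zlib.crc32 as a stand-in for the native CRC32 
path, so its timings say nothing about the 15% figure above.  The function 
names are hypothetical.

```python
import time
import zlib

# Assumptions for illustration: 512-byte checksum chunks, 4-byte CRC32 values.
BYTES_PER_CHECKSUM = 512
CHECKSUM_SIZE = 4


def checksum_chunks(data, bytes_per_checksum=BYTES_PER_CHECKSUM):
    """Return one CRC32 value per chunk of `data`."""
    view = memoryview(data)
    return [zlib.crc32(view[i:i + bytes_per_checksum])
            for i in range(0, len(data), bytes_per_checksum)]


def write_amplification(data_len, bytes_per_checksum=BYTES_PER_CHECKSUM):
    """Checksum bytes written, as a fraction of the data bytes written."""
    chunks = -(-data_len // bytes_per_checksum)  # ceiling division
    return chunks * CHECKSUM_SIZE / data_len


def measure_overhead(data):
    """Time a plain copy, then a copy plus per-chunk CRC32.

    Returns (copy_seconds, copy_plus_crc_seconds).
    """
    start = time.perf_counter()
    bytearray(data)  # forces a real copy of the buffer
    copy_s = time.perf_counter() - start

    start = time.perf_counter()
    bytearray(data)
    checksum_chunks(data)
    copy_crc_s = time.perf_counter() - start
    return copy_s, copy_crc_s


if __name__ == "__main__":
    payload = bytes(8 * 1024 * 1024)  # 8 MiB of zeros
    copy_s, copy_crc_s = measure_overhead(payload)
    print("copy only:      %.4f s" % copy_s)
    print("copy + CRC32:   %.4f s" % copy_crc_s)
    print("write amplification: %.4f%%"
          % (100 * write_amplification(len(payload))))
```

Under these assumptions the checksum data adds 4/512, or about 0.78%, to the 
bytes written, which is the "small" amplification factor in question.  The CPU 
cost is the part that needs real measurement on the actual write path.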

bq. Improve eviction, there's multiple ideas floating around, including 
integration with CCM.

Sorry, what's CCM?  A good eviction strategy is a core part of this; I have a 
hard time imagining merging without addressing that.

bq. The comparison will be interesting but I can tell you without measurement 
it is not going to be a substantial fraction of memory bandwidth. We are still 
going through DataTransferProtocol with all the copies and overhead that 
involves.

Before we merged HDFS-4949 we did substantial benchmarking to show performance 
improvements.  I'd like to see some similar benchmarks here before we can 
consider merging this branch.  If we can't achieve high write speeds, we should 
at least be able to demonstrate high read speeds of the data which is in the 
"HDFS-6581 write cache" (is that the right terminology still?).  We should also 
figure out how much CPU is used while doing these writes and reads.

In general, if we can't hit our performance numbers, it suggests a design issue 
that we should address before merging.  There is nothing to be gained from 
rushing the merge process.  This is especially true since our "competition" 
here is caching systems that sit outside HDFS, which have been extensively 
benchmarked and evaluated.

> Write to single replica in memory
> ---------------------------------
>
>                 Key: HDFS-6581
>                 URL: https://issues.apache.org/jira/browse/HDFS-6581
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-6581.merge.01.patch, HDFS-6581.merge.02.patch, 
> HDFS-6581.merge.03.patch, HDFSWriteableReplicasInMemory.pdf, 
> Test-Plan-for-HDFS-6581-Memory-Storage.pdf
>
>
> Per discussion with the community on HDFS-5851, we will implement writing to 
> a single replica in DN memory via DataTransferProtocol.
> This avoids some of the issues with short-circuit writes, which we can 
> revisit at a later time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
