[ 
https://issues.apache.org/jira/browse/HDFS-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491686#comment-13491686
 ] 

Gopal V commented on HDFS-3529:
-------------------------------

I ran some benchmarks with this patch applied (test-patch reported no test 
failures).

On a spinning disk it made barely any difference in execution time, but it did 
speed up DFS write throughput by a couple of percentage points when I backed 
HDFS with an SSD using the deadline I/O scheduler (elevator).

For my benchmarks, though, the bottleneck is currently on the client: the 
client DataStreamer could not write enough data from a single thread to 
saturate the SSD.

Adding a native chunked checksum generator to the client code, similar to 
HDFS-3528 (generate is just verify without the check), would be worthwhile if 
this patch is meant to improve end-to-end execution time. Otherwise it might 
only slightly benefit the replication scenarios.
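
To illustrate the "generate is just verify without a check" remark, here is a 
minimal sketch in plain Java, not the native code from HDFS-3528. The class and 
method names are hypothetical, java.util.zip.CRC32 stands in for whichever 
checksum the cluster is configured to use, and the chunk size is simply passed 
in by the caller. It computes one checksum per chunk straight from a (possibly 
direct) ByteBuffer, so no intermediate byte[] copy is needed.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration only; not part of the HDFS-3529 or HDFS-3528 patches.
final class ChunkedChecksumSketch {

    // Computes one CRC32 per bytesPerChunk-sized slice of the buffer's
    // remaining bytes. The caller's position and limit are left untouched.
    static int[] generate(ByteBuffer data, int bytesPerChunk) {
        ByteBuffer view = data.duplicate();  // independent position/limit
        int chunks = (view.remaining() + bytesPerChunk - 1) / bytesPerChunk;
        int[] sums = new int[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int len = Math.min(bytesPerChunk, view.remaining());
            ByteBuffer chunk = view.slice();
            chunk.limit(len);
            crc.reset();
            crc.update(chunk);               // reads directly from the buffer
            sums[i] = (int) crc.getValue();
            view.position(view.position() + len);
        }
        return sums;
    }

    // Verification is generation plus a comparison against the expected sums.
    static boolean verify(ByteBuffer data, int bytesPerChunk, int[] expected) {
        return Arrays.equals(generate(data, bytesPerChunk), expected);
    }
}
```

A client would call generate() on each outgoing packet's data buffer, while a 
receiver would call verify() with the checksums that arrived alongside the data.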
                
> Use direct buffers for data in write path
> -----------------------------------------
>
>                 Key: HDFS-3529
>                 URL: https://issues.apache.org/jira/browse/HDFS-3529
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, performance
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Trevor Robinson
>         Attachments: dfsio-x86-trunk-vs-3529.png, HDFS-3529.patch
>
>
> The write path currently makes several unnecessary data copies in order to go 
> to and from byte arrays. We can improve performance by using direct byte 
> buffers to avoid the copy. This is also a prerequisite for native checksum 
> calculation (HDFS-3528).
