[ 
https://issues.apache.org/jira/browse/HDFS-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Robinson updated HDFS-3529:
----------------------------------

    Attachment: HDFS-3529.patch

The attached patch is based on [Todd Lipcon's 
patch|https://github.com/toddlipcon/hadoop-common/tree/trunk-write-pipeline-fast],
 but was modified significantly to pass all unit tests and to merge with the 
datanode encryption (HDFS-3637) changes.

Switching to direct buffers automatically enables native CRC in 
{{DataChecksum.verifyChunkedSums}} (HDFS-3528).
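
For readers unfamiliar with the distinction, the following is a minimal sketch 
of per-chunk checksum verification over a direct buffer. It is not code from 
the patch: it uses the JDK's {{java.util.zip.CRC32}} instead of Hadoop's 
{{DataChecksum}}, and the class and method names ({{DirectBufferChecksumSketch}}, 
{{chunkedCrc32}}, {{verifyChunkedCrc32}}) as well as the 512-byte chunk size in 
{{main}} are assumptions made up for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration only; not part of HDFS-3529.patch.
final class DirectBufferChecksumSketch {

    // Computes one CRC32 value per chunk of bytesPerChunk bytes over the
    // remaining bytes of data. Works for both heap and direct buffers.
    static int[] chunkedCrc32(ByteBuffer data, int bytesPerChunk) {
        // duplicate() shares the content but has its own position and limit,
        // so the caller's buffer state is left untouched.
        ByteBuffer view = data.duplicate();
        int end = view.limit();
        int chunks = (view.remaining() + bytesPerChunk - 1) / bytesPerChunk;
        int[] sums = new int[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            // Narrow the view to the current chunk (the last may be shorter).
            view.limit(Math.min(view.position() + bytesPerChunk, end));
            crc.reset();
            crc.update(view); // reads up to the limit and advances the position
            sums[i] = (int) crc.getValue();
            view.limit(end);
        }
        return sums;
    }

    // Returns true if every chunk of data matches its expected checksum.
    static boolean verifyChunkedCrc32(ByteBuffer data, int[] expected,
                                      int bytesPerChunk) {
        return Arrays.equals(chunkedCrc32(data, bytesPerChunk), expected);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1300];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) i;
        }

        // Heap buffer: backed by a byte[] on the Java heap.
        ByteBuffer heap = ByteBuffer.wrap(payload);

        // Direct buffer: memory outside the Java heap, which native code can
        // access without first copying the data into a byte[].
        ByteBuffer direct = ByteBuffer.allocateDirect(payload.length);
        direct.put(payload);
        direct.flip();

        int[] sums = chunkedCrc32(heap, 512);
        System.out.println("direct buffer: " + direct.isDirect());
        System.out.println("checksums match: "
                + verifyChunkedCrc32(direct, sums, 512));
    }
}
```

The sketch only shows the buffer handling; whether a checksum call actually 
takes a native path depends on the implementation it is handed to.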

On ARM systems, I consistently see about a 10% improvement in TestDFSIO write 
throughput; on x86, the gain varies more, but the average so far is 4%. Read 
throughput seemed slightly higher, but the difference was within the 
run-to-run deviation. Hardware and tuning configurations vary, so I hope 
others will try the patch and share their results. The patch is based on trunk 
revision {{ce25c352c5a4e69baf39fe00abdd48c832402cf3}}.
                
> Use direct buffers for data in write path
> -----------------------------------------
>
>                 Key: HDFS-3529
>                 URL: https://issues.apache.org/jira/browse/HDFS-3529
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, performance
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: HDFS-3529.patch
>
>
> The write path currently makes several unnecessary data copies in order to go 
> to and from byte arrays. We can improve performance by using direct byte 
> buffers to avoid these copies. This is also a prerequisite for native checksum 
> calculation (HDFS-3528).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
