[ 
https://issues.apache.org/jira/browse/HADOOP-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HADOOP-9601:
----------------------------

    Attachment: HADOOP-9601-bench.patch

The bottleneck for -put does not seem to be verifying checksums, but 
calculateChunkedSums on the client side, which doesn't have a native 
equivalent in NativeCrc32.c.
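
For anyone not familiar with the chunked layout, here is a rough sketch of what a chunked checksum pass looks like: one CRC per fixed-size chunk of the input. This is not the Hadoop code. The class name ChunkedSumsSketch, the method chunkedSums and the 512-byte chunk size are made up for illustration, and it uses the JDK's java.util.zip.CRC32 rather than anything in NativeCrc32.c.

```java
import java.util.zip.CRC32;

// Hypothetical illustration only; not Hadoop's DataChecksum implementation.
public class ChunkedSumsSketch {
    /** Returns one CRC32 value per chunk of at most bytesPerChecksum bytes. */
    public static int[] chunkedSums(byte[] data, int bytesPerChecksum) {
        // Round up so a trailing partial chunk still gets its own checksum.
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        int[] sums = new int[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * bytesPerChecksum;
            int len = Math.min(bytesPerChecksum, data.length - off);
            crc.reset();                 // each chunk is checksummed independently
            crc.update(data, off, len);
            sums[i] = (int) crc.getValue();
        }
        return sums;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1300];    // 512 + 512 + 276 bytes
        int[] sums = chunkedSums(data, 512);
        System.out.println(sums.length + " checksums");
    }
}
```

The per-chunk loop is the part that runs on the client for every write, which is why it matters whether it can go through native code for byte[] input.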

I wrote a micro-benchmark, which shows that with the patch the array buffers 
are now the same speed as the direct buffers.

Before

{code}
Checksumming CRC32+array: 32768 MB took 35944 ms (911.64 MB/s)
Checksumming CRC32C+array: 32768 MB took 35517 ms (922.60 MB/s)
Checksumming CRC32+direct: 32768 MB took 24318 ms (1347.48 MB/s)
Checksumming CRC32C+direct: 32768 MB took 13229 ms (2476.98 MB/s)
{code}

After

{code}
Checksumming CRC32+array: 32768 MB took 24399 ms (1343.01 MB/s)
Checksumming CRC32C+array: 32768 MB took 13238 ms (2475.30 MB/s)
Checksumming CRC32+direct: 32768 MB took 25190 ms (1300.83 MB/s)
Checksumming CRC32C+direct: 32768 MB took 13075 ms (2506.16 MB/s)
{code}
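
For reference, a comparison of this general shape can be sketched with plain JDK classes. This is not the attached HADOOP-9601-bench.patch. The class CrcBufferBench, its methods and the buffer sizes are assumptions for illustration, and it measures java.util.zip.CRC32 rather than the native Hadoop CRC, so the numbers it prints are not comparable to the ones above.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch; measures the JDK CRC32, not Hadoop's NativeCrc32.
public class CrcBufferBench {
    /** Checksums the whole buffer once and returns the CRC32 value. */
    public static long checksum(ByteBuffer buf) {
        CRC32 crc = new CRC32();
        buf.clear();                     // position = 0, limit = capacity
        crc.update(buf);                 // accepts heap and direct buffers
        return crc.getValue();
    }

    /** Checksums the buffer `rounds` times and returns throughput in MB/s. */
    public static double run(ByteBuffer buf, int rounds) {
        CRC32 crc = new CRC32();
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            buf.clear();
            crc.reset();
            crc.update(buf);
        }
        long elapsedNs = Math.max(1, System.nanoTime() - start);
        double mb = (double) buf.capacity() * rounds / (1024.0 * 1024.0);
        return mb / (elapsedNs / 1e9);
    }

    public static void main(String[] args) {
        int size = 64 * 1024 * 1024;     // assumed buffer size
        int rounds = 16;                 // assumed repetition count
        ByteBuffer array = ByteBuffer.allocate(size);
        ByteBuffer direct = ByteBuffer.allocateDirect(size);
        run(array, 2);                   // warm up before timing
        run(direct, 2);
        System.out.printf("CRC32+array: %.2f MB/s%n", run(array, rounds));
        System.out.printf("CRC32+direct: %.2f MB/s%n", run(direct, rounds));
    }
}
```

Both buffer kinds go through the same update(ByteBuffer) call, so any gap between the two lines of output comes from how the bytes are reached, not from a different checksum routine.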
                
> Support native CRC on byte arrays
> ---------------------------------
>
>                 Key: HADOOP-9601
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9601
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: performance, util
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Gopal V
>              Labels: perfomance
>         Attachments: HADOOP-9601-bench.patch, 
> HADOOP-9601-trunk-rebase-2.patch, HADOOP-9601-trunk-rebase.patch, 
> HADOOP-9601-WIP-01.patch, HADOOP-9601-WIP-02.patch
>
>
> When we first implemented the Native CRC code, we only did so for direct byte 
> buffers, because these correspond directly to native heap memory and thus 
> make it easy to access via JNI. We'd generally assumed that accessing byte[] 
> arrays from JNI was not efficient enough, but now that I know more about JNI 
> I don't think that's true -- we just need to make sure that the critical 
> sections where we lock the buffers are short.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
