[jira] [Commented] (HADOOP-9601) Support native CRC on byte arrays

Colin Patrick McCabe (JIRA) Mon, 18 Aug 2014 14:28:21 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101335#comment-14101335
 ]


Colin Patrick McCabe commented on HADOOP-9601:
----------------------------------------------

bq. btw, I found out bad interaction between GC & getArrayCritical when 
the memory is fragmented.  This is faster until it gets slow all of a sudden.  
Please pass in the &isCopy and run with G1GC to make sure it is doing zero-copy 
ops for getArrayRegion.

Interesting.

The documentation says this about {{GetPrimitiveArrayCritical}}:

bq. After calling GetPrimitiveArrayCritical, the native code should not run for 
an extended period of time before it calls ReleasePrimitiveArrayCritical. We 
must treat the code inside this pair of functions as running in a "critical 
region." Inside a critical region, native code must not call other JNI 
functions, or any system call that may cause the current thread to block and 
wait for another Java thread. (For example, the current thread must not call 
read on a stream being written by another Java thread.)

This is exactly what we're doing in the HADOOP-10838 patch.  We call 
{{GetPrimitiveArrayCritical}}, do the checksums, and then immediately call 
{{ReleasePrimitiveArrayCritical}}.  If the JVM chooses not to take the 
zero-copy route, we can't override its decision.  And we can't access that 
array without calling one of the accessor functions.  So I don't know how this 
could be improved; do you have any ideas?
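
For reference, here is a rough sketch of the pattern described above, with the {{isCopy}} flag passed in so the copy-versus-pin decision is at least observable. It is not the code from the HADOOP-10838 patch. The class and method name, the {{g_copy_count}} counter, and the bitwise CRC-32 helper are all made up for illustration, and the JNI part is only compiled where {{jni.h}} can be found.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). Slow but short; a
 * stand-in for the real checksum routine, which is not shown here. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *buf, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

#if defined(__has_include)
#  if __has_include(<jni.h>)
#    include <jni.h>

/* Hypothetical diagnostic: how often the JVM handed back a copy.
 * Not thread-safe; only meant for a benchmark run. */
static long g_copy_count;

/* Hypothetical native method; the class and method names are assumptions.
 * The Java caller is assumed to have validated offset and length. */
JNIEXPORT jint JNICALL
Java_example_NativeCrcSketch_crc32(JNIEnv *env, jclass clazz,
                                   jbyteArray data, jint offset, jint length)
{
    jboolean is_copy = JNI_FALSE;
    uint32_t crc;
    uint8_t *base;

    (void)clazz;
    base = (*env)->GetPrimitiveArrayCritical(env, data, &is_copy);
    if (base == NULL)
        return 0; /* an OutOfMemoryError is pending in the JVM */

    /* Critical region: no JNI calls and nothing that can block. */
    crc = crc32_update(0, base + offset, (size_t)length);

    /* The array was only read, so JNI_ABORT skips any copy-back. */
    (*env)->ReleasePrimitiveArrayCritical(env, data, base, JNI_ABORT);

    if (is_copy == JNI_TRUE)
        g_copy_count++; /* the JVM did not take the zero-copy route */
    return (jint)crc;
}
#  endif
#endif
```

The flag only reports what the JVM decided. As noted above, native code has no way to force the zero-copy route, so a counter like this is useful for the G1GC benchmark run but does not change the behavior.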

> Support native CRC on byte arrays
> ---------------------------------
>
>                 Key: HADOOP-9601
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9601
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: performance, util
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Gopal V
>              Labels: perfomance
>         Attachments: HADOOP-9601-WIP-01.patch, HADOOP-9601-WIP-02.patch, 
> HADOOP-9601-bench.patch, HADOOP-9601-rebase+benchmark.patch, 
> HADOOP-9601-trunk-rebase-2.patch, HADOOP-9601-trunk-rebase.patch
>
>
> When we first implemented the Native CRC code, we only did so for direct byte 
> buffers, because these correspond directly to native heap memory and thus 
> make it easy to access via JNI. We'd generally assumed that accessing byte[] 
> arrays from JNI was not efficient enough, but now that I know more about JNI 
> I don't think that's true -- we just need to make sure that the critical 
> sections where we lock the buffers are short.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
