[
https://issues.apache.org/jira/browse/HADOOP-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667360#comment-13667360
]
Todd Lipcon commented on HADOOP-9601:
-------------------------------------
If someone wants to work on this, I think it will help the CPU efficiency of
our write path in particular without too much complication in the actual HDFS
code. HBase would also be a consumer of this API.
The specific thing to keep in mind is that, to avoid a memcpy in the JNI code,
you need to use GetPrimitiveArrayCritical [1]. But while you're in the
"critical section", GCs are blocked, so if you hold it for a long time you'll
end up stalling all of the threads waiting on the heap-wide lock you're
holding. So, in the context of CRC, it's probably reasonable to compute only
256KB or so per critical region -- at 1GB+/sec CRC speed this is at most a
~250us delay on the GCs, which is noise next to typical GC pause lengths
(~50ms for a young-gen collection).
[1]
http://docs.oracle.com/javase/1.4.2/docs/guide/jni/jni-12.html#GetPrimitiveArrayCritical
> Support native CRC on byte arrays
> ---------------------------------
>
> Key: HADOOP-9601
> URL: https://issues.apache.org/jira/browse/HADOOP-9601
> Project: Hadoop Common
> Issue Type: Improvement
> Components: performance, util
> Affects Versions: 3.0.0
> Reporter: Todd Lipcon
>
> When we first implemented the Native CRC code, we only did so for direct byte
> buffers, because these correspond directly to native heap memory and thus
> make it easy to access via JNI. We'd generally assumed that accessing byte[]
> arrays from JNI was not efficient enough, but now that I know more about JNI
> I don't think that's true -- we just need to make sure that the critical
> sections where we lock the buffers are short.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira