[ https://issues.apache.org/jira/browse/HADOOP-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722132#action_12722132 ]

Todd Lipcon commented on HADOOP-5598:
-------------------------------------

Thanks, Owen. I definitely see your point about the JNI code possibly locking 
or copying the array. The API we're using says it tries to avoid a copy when 
possible, but for large blocks heap fragmentation is likely enough that we'd 
probably hit that case anyway.
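
For anyone following along, the table-driven approach a pure-Java CRC32 takes 
is roughly the following. This is a minimal sketch using the standard 
reflected polynomial 0xEDB88320 (the same CRC that java.util.zip.CRC32 
computes); the attached PureJavaCrc32 may differ in its optimizations:

import java.util.zip.Checksum;

public class SimplePureJavaCrc32 implements Checksum {
  // Precomputed table: CRC of each possible byte value, standard
  // reflected polynomial 0xEDB88320.
  private static final int[] TABLE = new int[256];
  static {
    for (int i = 0; i < 256; i++) {
      int c = i;
      for (int k = 0; k < 8; k++) {
        c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
      }
      TABLE[i] = c;
    }
  }

  private int crc = 0xFFFFFFFF;  // working value is kept inverted

  public void update(int b) {
    crc = (crc >>> 8) ^ TABLE[(crc ^ b) & 0xFF];
  }

  public void update(byte[] b, int off, int len) {
    for (int i = off; i < off + len; i++) {
      crc = (crc >>> 8) ^ TABLE[(crc ^ b[i]) & 0xFF];
    }
  }

  public long getValue() {
    return (~crc) & 0xFFFFFFFFL;  // final XOR, widened to unsigned long
  }

  public void reset() {
    crc = 0xFFFFFFFF;
  }
}

The inner loop is a table lookup, a shift, and an XOR per byte, so there is 
no JNI crossing at all.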

I'm going on vacation next week, so I'm unlikely to work on this, but if you 
have a moment to upload your implementation for comparison, I'll pick it back 
up the week after next.

> Implement a pure Java CRC32 calculator
> --------------------------------------
>
>                 Key: HADOOP-5598
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5598
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Owen O'Malley
>            Assignee: Todd Lipcon
>         Attachments: crc32-results.txt, hadoop-5598-evil.txt, 
> hadoop-5598-hybrid.txt, hadoop-5598.txt, hadoop-5598.txt, PureJavaCrc32.java, 
> PureJavaCrc32.java, PureJavaCrc32.java, TestCrc32Performance.java, 
> TestCrc32Performance.java, TestCrc32Performance.java, TestPureJavaCrc32.java
>
>
> We've seen a reducer writing 200MB to HDFS with replication = 1 spend a 
> long time in CRC calculation. In particular, it spent 5 seconds in CRC 
> calculation out of a total of 6 seconds for the write. I suspect that it 
> is the Java-JNI boundary that is causing us grief.
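
For scale, HDFS checksums data in small chunks (512 bytes by default), so 
per-call overhead at the JNI boundary can dominate. A rough harness in the 
spirit of the attached TestCrc32Performance (a hypothetical sketch, not the 
actual attachment) would time both implementations on such a chunk:

import java.util.Random;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class Crc32Timing {
  // Checksum the same buffer repeatedly; report wall-clock time in ms.
  static long timeMillis(Checksum sum, byte[] data, int reps) {
    long start = System.nanoTime();
    for (int i = 0; i < reps; i++) {
      sum.reset();
      sum.update(data, 0, data.length);
    }
    return (System.nanoTime() - start) / 1000000L;
  }

  public static void main(String[] args) {
    byte[] data = new byte[512];  // one HDFS checksum chunk
    new Random(0).nextBytes(data);
    int reps = 1000000;
    // java.util.zip.CRC32 calls into native code; the sketch above
    // stays entirely in Java.
    System.out.println("built-in CRC32: "
        + timeMillis(new CRC32(), data, reps) + " ms");
    System.out.println("pure Java:      "
        + timeMillis(new SimplePureJavaCrc32(), data, reps) + " ms");
  }
}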

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
