[ https://issues.apache.org/jira/browse/HADOOP-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Scott Carey updated HADOOP-5598:
--------------------------------

    Attachment: TestCrc32Performance.java
                TestPureJavaCrc32.java
                PureJavaCrc32.java

This version of PureJavaCrc32 is faster than Sun's native implementation at every chunk size tested: about 10x faster for 1-byte chunks, falling to about 1.8x for large chunks.

Results on my laptop (Mac OS X, 64-bit Java 1.6, -Xmx512m, 2.5 GHz processor) are below. Run to run, results vary by about 5%.

||CRC32Class||num bytes||throughput||
| PureJava |1 |99.533 MB/sec|
| SunNative |1 |9.772 MB/sec|
| PureJava |2 |163.265 MB/sec|
| SunNative |2 |18.846 MB/sec|
| PureJava |4 |234.004 MB/sec|
| SunNative |4 |37.307 MB/sec|
| PureJava |8 |307.692 MB/sec|
| SunNative |8 |66.876 MB/sec|
| PureJava |16 |432.432 MB/sec|
| SunNative |16 |110.919 MB/sec|
| PureJava |32 |522.449 MB/sec|
| SunNative |32 |161.616 MB/sec|
| PureJava |64 |547.009 MB/sec|
| SunNative |64 |217.687 MB/sec|
| PureJava |128 |432.432 MB/sec|
| SunNative |128 |270.042 MB/sec|
| PureJava |256 |551.724 MB/sec|
| SunNative |256 |299.065 MB/sec|
| PureJava |512 |615.385 MB/sec|
| SunNative |512 |321.608 MB/sec|
| PureJava |1024 |551.724 MB/sec|
| SunNative |1024 |212.625 MB/sec|
| PureJava |2048 |561.404 MB/sec|
| SunNative |2048 |309.179 MB/sec|
| PureJava |4096 |551.724 MB/sec|
| SunNative |4096 |307.692 MB/sec|
| PureJava |8192 |589.862 MB/sec|
| SunNative |8192 |316.049 MB/sec|
| PureJava |16384 |640.000 MB/sec|
| SunNative |16384 |343.164 MB/sec|
| PureJava |32768 |643.216 MB/sec|
| SunNative |32768 |343.164 MB/sec|
| PureJava |65536 |621.359 MB/sec|
| SunNative |65536 |345.013 MB/sec|
| PureJava |131072 |636.816 MB/sec|
| SunNative |131072 |345.946 MB/sec|
| PureJava |262144 |636.816 MB/sec|
| SunNative |262144 |343.164 MB/sec|
| PureJava |524288 |646.465 MB/sec|
| SunNative |524288 |345.946 MB/sec|
| PureJava |1048576 |640.000 MB/sec|
| SunNative |1048576 |343.164 MB/sec|
| PureJava |2097152 |633.663 MB/sec|
| SunNative |2097152 |347.826 MB/sec|
| PureJava |4194304 |636.816 MB/sec|
| SunNative |4194304 |291.572 MB/sec|
| PureJava |8388608 |618.357 MB/sec|
| SunNative |8388608 |342.246 MB/sec|
| PureJava |16777216 |624.390 MB/sec|
| SunNative |16777216 |307.692 MB/sec|

> Implement a pure Java CRC32 calculator
> --------------------------------------
>
>                 Key: HADOOP-5598
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5598
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Owen O'Malley
>            Assignee: Todd Lipcon
>         Attachments: crc32-results.txt, hadoop-5598-evil.txt, hadoop-5598-hybrid.txt, hadoop-5598.txt, hadoop-5598.txt, PureJavaCrc32.java, PureJavaCrc32.java, PureJavaCrc32.java, TestCrc32Performance.java, TestCrc32Performance.java, TestCrc32Performance.java, TestPureJavaCrc32.java
>
>
> We've seen a reducer writing 200MB to HDFS with replication = 1 spending a long time in CRC calculation. In particular, it was spending 5 seconds in CRC calculation out of a total of 6 for the write. I suspect that it is the Java-JNI border that is causing us grief.
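
For anyone who wants to probe the chunk-size dependence reported in this comment without the attachments, here is a minimal sketch. It is not the attached PureJavaCrc32.java or TestCrc32Performance.java. SimpleTableCrc32 and Crc32ChunkBench are hypothetical names, and the checksum is a plain byte-at-a-time table lookup, which is likely simpler and slower than the attached version. The harness checksums a fixed amount of data in chunks of a given size and reports MB/sec. It is a rough measurement with only a token warm-up, and java.util.zip.CRC32 may be intrinsified on recent JVMs, so numbers may look very different from the table in this comment.

```java
import java.util.Random;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical byte-at-a-time table-driven CRC32; NOT the attached PureJavaCrc32. */
final class SimpleTableCrc32 implements Checksum {
    private static final int[] TABLE = new int[256];
    static {
        // Build the lookup table for the reflected CRC-32 polynomial 0xEDB88320.
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    private int crc = 0xFFFFFFFF; // running value, kept in inverted form

    @Override public void update(int b) {
        crc = (crc >>> 8) ^ TABLE[(crc ^ b) & 0xFF];
    }

    @Override public void update(byte[] buf, int off, int len) {
        int c = crc;
        for (int i = off; i < off + len; i++) {
            c = (c >>> 8) ^ TABLE[(c ^ buf[i]) & 0xFF];
        }
        crc = c;
    }

    @Override public long getValue() {
        return (~crc) & 0xFFFFFFFFL; // final inversion, as an unsigned 32-bit value
    }

    @Override public void reset() {
        crc = 0xFFFFFFFF;
    }
}

/** Hypothetical harness: throughput of a Checksum as a function of chunk size. */
final class Crc32ChunkBench {
    static volatile long sink; // consumes results so the JIT cannot drop the work

    /** Returns MB/sec for checksumming about totalBytes in chunks of chunkSize. */
    static double throughput(Checksum sum, int chunkSize, long totalBytes) {
        byte[] chunk = new byte[chunkSize];
        new Random(0).nextBytes(chunk);
        long calls = Math.max(1L, totalBytes / chunkSize);
        sum.reset();
        long start = System.nanoTime();
        for (long i = 0; i < calls; i++) {
            sum.update(chunk, 0, chunkSize);
        }
        long nanos = Math.max(1L, System.nanoTime() - start);
        sink = sum.getValue();
        double megabytes = calls * (double) chunkSize / (1024.0 * 1024.0);
        return megabytes / (nanos / 1e9);
    }

    public static void main(String[] args) {
        long total = 64L * 1024 * 1024; // assumed workload per measurement
        throughput(new SimpleTableCrc32(), 4096, total); // token warm-up
        throughput(new CRC32(), 4096, total);
        System.out.println("||CRC32Class||num bytes||throughput||");
        for (int size = 1; size <= (1 << 24); size <<= 1) {
            System.out.printf("| SimpleTable |%d |%.3f MB/sec|%n",
                    size, throughput(new SimpleTableCrc32(), size, total));
            System.out.printf("| SunNative |%d |%.3f MB/sec|%n",
                    size, throughput(new CRC32(), size, total));
        }
    }
}
```

Running Crc32ChunkBench prints rows in the same JIRA table layout, covering chunk sizes from 1 byte to 16777216 bytes. Both implementations go through the same Checksum interface, so any difference at small chunk sizes reflects per-call overhead rather than the harness.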