Performance improvement in PureJavaCrc32
----------------------------------------

                 Key: HADOOP-7333
                 URL: https://issues.apache.org/jira/browse/HADOOP-7333
             Project: Hadoop Common
          Issue Type: Improvement
          Components: util
    Affects Versions: 0.21.0
         Environment: Linux x64
            Reporter: Eric Caspole
            Priority: Minor

I would like to propose a small patch to
org.apache.hadoop.util.PureJavaCrc32.update(byte[] b, int off, int len).

Currently the method stores the intermediate result back into the data member "crc". I noticed that this method gets inlined into DataChecksum.update(), and that method appears as one of the hotter methods in a simple hprof profile collected while running terasort and gridmix.

If the code is modified to keep the temporary result in a local variable and to store the final result back into the data member just once, HotSpot generates slightly more efficient code.

I tested this change using org.apache.hadoop.util.TestPureJavaCrc32$PerformanceTest, which is embedded in the existing unit test for this class, TestPureJavaCrc32, on a variety of Linux x64 AMD and Intel multi-socket and multi-core systems I have available. The patch removes several stores of the intermediate result to memory, yielding a 0%-10% speedup in that test.

If you use a debug HotSpot JVM with -XX:+PrintOptoAssembly, you can see the intermediate stores, such as:

    414   movq    R9, [rsp + #24]           # spill
    419   movl    [R9 + #12 (8-bit)], RDX   # int ! Field PureJavaCrc32.crc
    41d   xorl    R10, RDX                  # int

The patch results in just one final store of the fully computed value.
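
The idea behind the proposed change can be sketched roughly as follows. This is only an illustration, not the actual patch: the class name LocalCrcSketch and both method names are hypothetical, and a simple byte-at-a-time table-driven CRC-32 stands in for the real PureJavaCrc32 loop. The first method writes the running value back to the field on every iteration; the second accumulates in a local and stores to the field once at the end.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical stand-in for PureJavaCrc32; not the Hadoop implementation.
class LocalCrcSketch {
    // Standard reflected CRC-32 lookup table (polynomial 0xEDB88320).
    private static final int[] TABLE = new int[256];
    static {
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    // Running CRC state, kept in inverted form as is usual for CRC-32.
    private int crc = 0xFFFFFFFF;

    // "Before" shape: the intermediate result is written to the field
    // on every loop iteration.
    void updateFieldStores(byte[] b, int off, int len) {
        for (int i = off; i < off + len; i++) {
            crc = (crc >>> 8) ^ TABLE[(crc ^ b[i]) & 0xff];
        }
    }

    // "After" shape: the intermediate result lives in a local variable
    // and the field is written exactly once, after the loop.
    void updateLocal(byte[] b, int off, int len) {
        int localCrc = crc;
        for (int i = off; i < off + len; i++) {
            localCrc = (localCrc >>> 8) ^ TABLE[(localCrc ^ b[i]) & 0xff];
        }
        crc = localCrc;
    }

    // Final XOR and conversion to an unsigned 32-bit value.
    long getValue() {
        return (~crc) & 0xffffffffL;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        LocalCrcSketch fieldVersion = new LocalCrcSketch();
        fieldVersion.updateFieldStores(data, 0, data.length);

        LocalCrcSketch localVersion = new LocalCrcSketch();
        localVersion.updateLocal(data, 0, data.length);

        // Compare both variants against the JDK's CRC32 as a reference.
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);

        System.out.println(fieldVersion.getValue() == reference.getValue());
        System.out.println(localVersion.getValue() == reference.getValue());
    }
}
```

Both variants compute the same checksum; the only difference is how often the field is written. Whether the second form is actually faster depends on the JVM and hardware, so any gain should be measured, for example with the PerformanceTest mentioned above.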