[ https://issues.apache.org/jira/browse/CASSANDRA-8684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292316#comment-14292316 ]
Ariel Weisberg edited comment on CASSANDRA-8684 at 1/26/15 8:02 PM:
--------------------------------------------------------------------

!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1480884345&format=image!
!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1206341035&format=image!

The ever inscrutable results for OS X. I am very skeptical it is doing 13 gigabytes/second on a single core. I don't get why there is a gradual increase in speed as the size of the data being checksummed increases, and that speed up doesn't exist on Linux. Looking at a CPU monitor doesn't make it look like the application is using multiple cores.

!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1911364989&format=image!

I think the real speed is 3 gigabytes/second which is what I have seen in the past and seen on Linux. There are faster hashes like xxhash or MurmurHash3 to consider that operate in the 5-6 gigabyte/second range. However a Java implementation of xxhash might not hit those numbers. The murmur3 implementation certainly doesn't. A native implementation incurs JNI overhead and there is nothing packaged at the moment.

was (Author: aweisberg):
!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1480884345&format=image!
!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1206341035&format=image!

The ever inscrutable results for OS X. I don't buy for a second that it is doing 13 gigabytes/second.

!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1911364989&format=image!

I think the real speed is 3 gigabytes/second which is what I have seen in the past and seen on Linux. There are faster hashes like xxhash or MurmurHash3 to consider that operate in the 5-6 gigabyte/second range.
However a Java implementation of xxhash might not hit those numbers. The murmur3 implementation certainly doesn't. A native implementation incurs JNI overhead and there is nothing packaged at the moment.

> Replace usage of Adler32 with CRC32
> -----------------------------------
>
>                 Key: CASSANDRA-8684
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8684
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>         Attachments: CRCBenchmark.java, PureJavaCrc32.java, Sample.java
>
>
> I could not find a situation in which Adler32 outperformed PureJavaCrc32 much
> less the intrinsic from Java 8. For small allocations PureJavaCrc32 was much
> faster probably due to the JNI overhead of invoking the native Adler32
> implementation where the array has to be allocated and copied.
> I tested on a 65W Sandy Bridge i5 running Ubuntu 14.04 with JDK 1.7.0_71 as
> well as a c3.8xlarge running Ubuntu 14.04.
> I think it makes sense to stop using Adler32 when generating new checksums.
> c3.8xlarge, results are time in milliseconds, lower is better
> ||Allocation size||Adler32||CRC32||PureJavaCrc32||
> |64|47636|46075|25782|
> |128|36755|36712|23782|
> |256|31194|32211|22731|
> |1024|27194|28792|22010|
> |1048576|25941|27807|21808|
> |536870912|25957|27840|21836|
> i5
> ||Allocation size||Adler32||CRC32||PureJavaCrc32||
> |64|50539|50466|26826|
> |128|37092|38533|24553|
> |256|30630|32938|23459|
> |1024|26064|29079|22592|
> |1048576|24357|27911|22481|
> |536870912|24838|28360|22853|
> Another fun fact. Performance of the CRC32 intrinsic appears to double from
> Sandy Bridge -> Haswell. Unless I am measuring something different when going
> from Linux/Sandy to Haswell/OS X.
> The intrinsic/JDK 8 implementation also operates against DirectByteBuffers
> better and coding against the wrapper will get that boost when run with Java
> 8.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
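
For readers who want to try a comparison like the one in the issue description, here is a minimal sketch. It is not the attached CRCBenchmark.java, and the class and method names (ChecksumSketch, checksum, crc32Direct, timeMillis) are hypothetical. It assumes Java 8 or later, uses only java.util.zip.Adler32 and java.util.zip.CRC32 through the java.util.zip.Checksum interface, and does not cover PureJavaCrc32. The timing loop is crude (no warm-up, no JMH), so its numbers are only indicative and will not match the tables in the description.

```java
import java.nio.ByteBuffer;
import java.util.Random;
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

class ChecksumSketch {
    // Checksum one array through the common Checksum interface, so the
    // caller can pass either Adler32 or CRC32 without other changes.
    static long checksum(Checksum c, byte[] data) {
        c.reset();
        c.update(data, 0, data.length);
        return c.getValue();
    }

    // CRC32 over a direct buffer. CRC32.update(ByteBuffer) was added in Java 8.
    static long crc32Direct(byte[] data) {
        ByteBuffer buf = ByteBuffer.allocateDirect(data.length);
        buf.put(data);
        buf.flip();
        CRC32 crc = new CRC32();
        crc.update(buf);
        return crc.getValue();
    }

    // Crude timing loop: checksums the same array `iterations` times and
    // returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Checksum c, byte[] data, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink ^= checksum(c, data);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Reference the accumulated value so the loop is not dead code.
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink);
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Assumption: 16 MiB processed per size, chosen to keep the run short.
        final int totalBytes = 16 * 1024 * 1024;
        int[] sizes = {64, 128, 256, 1024, 1048576};
        Random random = new Random(42);
        for (int size : sizes) {
            byte[] data = new byte[size];
            random.nextBytes(data);
            int iterations = totalBytes / size;
            long adler = timeMillis(new Adler32(), data, iterations);
            long crc = timeMillis(new CRC32(), data, iterations);
            System.out.printf("size=%d Adler32=%dms CRC32=%dms%n", size, adler, crc);
        }
    }
}
```

Because both classes implement Checksum, switching algorithms is a one-line change at the call site. The crc32Direct helper shows the ByteBuffer path that the description says benefits from the JDK 8 implementation.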