[ 
https://issues.apache.org/jira/browse/CASSANDRA-8684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292316#comment-14292316
 ] 

Ariel Weisberg edited comment on CASSANDRA-8684 at 1/26/15 8:02 PM:
--------------------------------------------------------------------

!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1480884345&format=image!
!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1206341035&format=image!

The OS X results are, as ever, inscrutable. I am very skeptical that it is 
doing 13 gigabytes/second on a single core. I don't understand why speed 
increases gradually as the size of the data being checksummed increases, when 
that speedup doesn't appear on Linux. The CPU monitor doesn't suggest that the 
application is using multiple cores.

!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1911364989&format=image!

I think the real speed is 3 gigabytes/second, which is what I have seen in the 
past and on Linux. There are faster hashes to consider, such as xxhash or 
MurmurHash3, that operate in the 5-6 gigabytes/second range.

However, a Java implementation of xxhash might not hit those numbers. The 
murmur3 implementation certainly doesn't. A native implementation incurs JNI 
overhead, and there is nothing packaged at the moment.
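
For anyone who wants to sanity-check throughput figures like the ones above, here is a minimal sketch of a timing loop over the JDK's java.util.zip checksums. It is not the attached CRCBenchmark.java; the class name, method name, chunk sizes, and total byte count are all assumptions made for illustration. Hand-rolled loops like this are easily skewed by JIT warm-up and dead-code elimination, so treat the output as a rough indication only.

```java
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical helper, not part of the attached benchmark.
class ChecksumThroughput {

    /**
     * Checksums roughly totalBytes of data in chunkSize pieces and returns
     * the observed rate in gigabytes/second (10^9 bytes per second).
     */
    static double gigabytesPerSecond(Checksum checksum, int chunkSize, long totalBytes) {
        byte[] chunk = new byte[chunkSize];
        for (int i = 0; i < chunk.length; i++) {
            chunk[i] = (byte) i; // non-zero, deterministic contents
        }
        long iterations = Math.max(1, totalBytes / chunkSize);
        long sink = 0;
        long start = System.nanoTime();
        for (long i = 0; i < iterations; i++) {
            checksum.reset();
            checksum.update(chunk, 0, chunk.length);
            sink += checksum.getValue();
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        // Use the accumulated value so the JIT cannot discard the loop.
        if (sink == 42) {
            System.out.println("unlikely");
        }
        // Bytes per nanosecond is numerically equal to gigabytes per second.
        return (iterations * (double) chunkSize) / elapsedNanos;
    }

    public static void main(String[] args) {
        int[] chunkSizes = {64, 128, 256, 1024, 1048576}; // assumed sizes
        long totalBytes = 1L << 28;                        // assumed 256 MiB per run
        for (int size : chunkSizes) {
            // One untimed pass per checksum as a crude warm-up.
            gigabytesPerSecond(new CRC32(), size, totalBytes);
            gigabytesPerSecond(new Adler32(), size, totalBytes);
            System.out.printf("%8d bytes: CRC32 %.2f GB/s, Adler32 %.2f GB/s%n",
                    size,
                    gigabytesPerSecond(new CRC32(), size, totalBytes),
                    gigabytesPerSecond(new Adler32(), size, totalBytes));
        }
    }
}
```

A harness such as JMH would be a more trustworthy way to get numbers worth comparing across platforms.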


was (Author: aweisberg):
!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1480884345&format=image!
!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1206341035&format=image!

The ever inscrutable results for OS X. I don't buy for a second that it is 
doing 13 gigabytes/second.

!https://docs.google.com/spreadsheets/d/1cxf-V4b8dXdz1vLb5ySUNxK09bukDHHpq79a09xHw20/pubchart?oid=1911364989&format=image!

I think the real speed is 3 gigabytes/second which is what I have seen in the 
past and seen on Linux. There are faster hashes like xxhash or MurmurHash3 to 
consider that operate in the 5-6 gigabyte/second range.

However a Java implementation of xxhash might not hit those numbers. The 
murmur3 implementation certainly doesn't. A native implementation incurs JNI 
overhead and there is nothing packaged at the moment.

> Replace usage of Adler32 with CRC32
> -----------------------------------
>
>                 Key: CASSANDRA-8684
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8684
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>         Attachments: CRCBenchmark.java, PureJavaCrc32.java, Sample.java
>
>
> I could not find a situation in which Adler32 outperformed PureJavaCrc32, much 
> less the intrinsic from Java 8. For small allocations PureJavaCrc32 was much 
> faster, probably due to the JNI overhead of invoking the native Adler32 
> implementation, where the array has to be allocated and copied.
> I tested on a 65 W Sandy Bridge i5 running Ubuntu 14.04 with JDK 1.7.0_71, as 
> well as on a c3.8xlarge running Ubuntu 14.04.
> I think it makes sense to stop using Adler32 when generating new checksums.
> c3.8xlarge (results are time in milliseconds; lower is better):
> ||Allocation size||Adler32||CRC32||PureJavaCrc32||
> |64|47636|46075|25782|
> |128|36755|36712|23782|
> |256|31194|32211|22731|
> |1024|27194|28792|22010|
> |1048576|25941|27807|21808|
> |536870912|25957|27840|21836|
> i5
> ||Allocation size||Adler32||CRC32||PureJavaCrc32||
> |64|50539|50466|26826|
> |128|37092|38533|24553|
> |256|30630|32938|23459|
> |1024|26064|29079|22592|
> |1048576|24357|27911|22481|
> |536870912|24838|28360|22853|
> Another fun fact: performance of the CRC32 intrinsic appears to double from 
> Sandy Bridge to Haswell, unless I am measuring something different when going 
> from Linux/Sandy Bridge to Haswell/OS X.
> The intrinsic/JDK 8 implementation also operates better against 
> DirectByteBuffers, and coding against the wrapper will get that boost when run 
> with Java 8.
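
As an illustration of the DirectByteBuffer point in the description above, the following is a minimal sketch, assuming Java 8 or later, of checksumming a direct buffer through CRC32.update(ByteBuffer). It is not the wrapper referred to in the issue; the class and method names are hypothetical. It only shows that the buffer path yields the same value as the byte[] path and says nothing about relative speed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration, not the wrapper from the patch.
class DirectBufferCrc {

    /** CRC32 over a heap byte array. */
    static long crcOfArray(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** CRC32 over the same bytes held in a direct (off-heap) buffer. */
    static long crcOfDirectBuffer(byte[] data) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(data.length);
        buffer.put(data);
        buffer.flip(); // switch from writing to reading
        CRC32 crc = new CRC32();
        crc.update(buffer); // available since Java 8; consumes the remaining bytes
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("array:  %08x%n", crcOfArray(data));
        System.out.printf("direct: %08x%n", crcOfDirectBuffer(data));
    }
}
```

Both paths should report the same checksum; the direct-buffer path avoids first copying the data into a heap array on the caller's side.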



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
