[
https://issues.apache.org/jira/browse/CASSANDRA-16360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314690#comment-17314690
]
Alexey Zotov edited comment on CASSANDRA-16360 at 4/5/21, 8:43 AM:
-------------------------------------------------------------------
Based on the benchmarking, I can see that the native _CRC32C_ implementation
is very fast even without intrinsics. Here is one more test to highlight
that (the results are consistent with other runs; this run was made using Java
11.0.9):
{code:java}
[java] Benchmark                             (bufferSize)  Mode  Cnt     Score     Error  Units
[java] ChecksumBench.benchCrc32                        31  avgt    5   107.191 ±   5.251  ns/op
[java] ChecksumBench.benchCrc32                       131  avgt    5    83.716 ±   1.578  ns/op
[java] ChecksumBench.benchCrc32                       517  avgt    5   123.176 ±  17.512  ns/op
[java] ChecksumBench.benchCrc32                      2041  avgt    5   273.591 ±   9.123  ns/op
[java] ChecksumBench.benchCrc32cNoIntrinsic            31  avgt    5    52.850 ±   3.461  ns/op
[java] ChecksumBench.benchCrc32cNoIntrinsic           131  avgt    5    73.552 ±   1.624  ns/op
[java] ChecksumBench.benchCrc32cNoIntrinsic           517  avgt    5   196.009 ±   9.141  ns/op
[java] ChecksumBench.benchCrc32cNoIntrinsic          2041  avgt    5   278.980 ±   7.515  ns/op
[java] ChecksumBench.benchPureJavaCrc32c               31  avgt    5    98.419 ±   5.221  ns/op
[java] ChecksumBench.benchPureJavaCrc32c              131  avgt    5   239.515 ±   5.118  ns/op
[java] ChecksumBench.benchPureJavaCrc32c              517  avgt    5   828.281 ± 107.874  ns/op
[java] ChecksumBench.benchPureJavaCrc32c             2041  avgt    5  2941.934 ±  55.716  ns/op
{code}
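For anyone who wants to sanity-check the two JDK classes locally, both implement _java.util.zip.Checksum_, so they can be driven through the same helper. The sketch below is not the _ChecksumBench_ code and measures nothing; the class name _ChecksumCompare_ and its helper are hypothetical, and _CRC32C_ requires Java 9 or later. It only shows that the two algorithms are invoked the same way but produce different values for the same input.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

// Hypothetical helper class, for illustration only.
public final class ChecksumCompare {
    private ChecksumCompare() {}

    // Runs any Checksum implementation over the whole array and returns its value.
    public static long checksum(Checksum algorithm, byte[] data) {
        algorithm.reset();
        algorithm.update(data, 0, data.length);
        return algorithm.getValue();
    }

    public static void main(String[] args) {
        // "123456789" is the conventional input for CRC check values.
        byte[] input = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("CRC32  = %08x%n", checksum(new CRC32(), input));
        System.out.printf("CRC32C = %08x%n", checksum(new CRC32C(), input));
    }
}
```

Since the two polynomials differ, the values are not interchangeable, so switching the protocol from CRC32 to CRC32C would affect both client and server.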
I've checked the implementation, and it looks like the reason for the strong
performance of the native _CRC32C_ implementation is that it relies heavily on
_Unsafe_ operations. Initially I thought we could easily implement a custom
_CRC32C_ similar to the native one; however, I no longer think it is that
easy, and I have two concerns:
# we would need to use a library that wraps access to _Unsafe_
# I'm not sure that, from a licensing perspective, we are permitted to "re-work"
(copy-paste and adapt) the code from CRC32C
So I'm waiting for some input before moving forward in any direction.
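To make the trade-off concrete, here is a minimal sketch of a custom implementation that avoids _Unsafe_ entirely: a plain byte-at-a-time, table-driven CRC32C over the Castagnoli polynomial (reflected form 0x82F63B78). The class name _SimpleCrc32c_ is hypothetical, and this is neither the JDK's algorithm nor the benchmarked pure-Java variant. It is written from the textbook definition and cross-checked against _java.util.zip.CRC32C_ (Java 9+), and it is not expected to come close to the native implementation's speed.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32C;

// Hypothetical class, for illustration only.
public final class SimpleCrc32c {
    // Reflected form of the Castagnoli polynomial used by CRC32C.
    private static final int POLY = 0x82F63B78;
    // TABLE[b] is the CRC register after feeding the single byte b into a zero register.
    private static final int[] TABLE = new int[256];

    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ POLY : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    private SimpleCrc32c() {}

    // Computes CRC32C over data[offset, offset + length) using plain array access.
    public static int compute(byte[] data, int offset, int length) {
        int crc = 0xFFFFFFFF; // standard initial value
        for (int i = offset; i < offset + length; i++) {
            crc = (crc >>> 8) ^ TABLE[(crc ^ data[i]) & 0xFF];
        }
        return ~crc; // standard final inversion
    }

    public static void main(String[] args) {
        byte[] input = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32C jdk = new CRC32C();
        jdk.update(input, 0, input.length);
        // Both values should be e3069283, the standard CRC32C check value.
        System.out.printf("custom=%08x jdk=%08x%n",
                compute(input, 0, input.length), (int) jdk.getValue());
    }
}
```

A faster variant would process several bytes per step, which is where wide unaligned reads, and therefore _Unsafe_ or an equivalent, tend to come in.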
was (Author: azotcsit):
I've checked the implementation, and it looks like the reason for the strong
performance of the native _CRC32C_ implementation is that it relies heavily on
_Unsafe_ operations. Initially I thought we could implement a custom _CRC32C_
similar to the native one; however, I now do not think it is possible because
we cannot use _Unsafe_ in the code.
> CRC32 is inefficient on x86
> ---------------------------
>
> Key: CASSANDRA-16360
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16360
> Project: Cassandra
> Issue Type: Improvement
> Components: Messaging/Client
> Reporter: Avi Kivity
> Priority: Normal
> Labels: protocolv5
> Fix For: 4.0.x
>
>
> The client/server protocol specifies CRC24 and CRC32 as the checksum
> algorithms (cql_protocol_V5_framing.asc). Those, however, are expensive to
> compute; this affects both the client and the server.
>
> A better checksum algorithm is CRC32C, which has hardware support on x86 (as
> well as other modern architectures).