[
https://issues.apache.org/jira/browse/CASSANDRA-16360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335042#comment-17335042
]
Alexey Zotov commented on CASSANDRA-16360:
------------------------------------------
Thanks for the feedback!
[~benedict]
# I'm glad to hear that it is something useful. Though I'm not sure how to get
the existing PR merged. Please let me know if I need to take any further action
(e.g. update {{CHANGES.txt}}); otherwise I'll just wait.
# Ok, sounds like a good suggestion. I'll add it to the list of further action
items.
# That's what I thought too. Then I'll just update the fix version to 4.x and
put the work on hold for now.
{quote}That said, it looks like there might be some measurement error in your
results, and the crc32 intrinsic version may be benefitting from tight loop
benchmarking (so high cache occupancy). It certainly doesn't look like a slam
dunk for urgent migration anyway.{quote}
That's an interesting point. My feeling was that all the implementations
benefit equally from the cache occupancy. Therefore, it should not affect the
final results significantly (obviously it affects the absolute numbers, but
not the relative ones). At the end of the day, the optimization is about the
number of CPU cycles required to calculate the checksum, which does not seem
to be directly related to the cache occupancy.
I guess you expected to see a bigger improvement for _CRC32C_ (compared to
_CRC32_) and that's what causes the suspicion. I'm just curious, why do you
think there should be a bigger improvement and what kind of difference did you
expect?
In the meantime, I'll try generating a large byte array and calculating
checksums for its slices (sub-arrays). Since every calculation then runs on a
new slice, the input data should not already be sitting in the cache. I'll
raise one more experimental PR, so we can see the results and revisit the
conclusion if needed.
[~samt]
I was going to start asking about client compatibility a bit later, but since
you brought this up, let me raise a couple of questions:
# Obviously there are multiple clients (Java, Python, etc.). How should this be
approached? Do I just need to submit tickets to the corresponding projects?
Where can I find the whole list of clients? I just want to understand the
high-level steps (no need for detailed step-by-step instructions; it is
simply too early for that).
# Am I right that {{InboundConnectionInitiator / OutboundConnectionInitiator /
HandshakeProtocol}} are about internode communication, whereas
{{PipelineConfigurator / InitialConnectionHandler}} are about client
communication? Am I missing something here?
{quote}If you implement step 7 so that the encoders/decoders support both CRC32
and CRC32C then this becomes two separate (possibly parallelisable) pieces of
work.{quote}
Your suggestion makes perfect sense to me! Basically, I can proceed with this
change only (it would be just a small refactoring/re-design without changing
the existing behavior). As a result of the change, {{ChecksumType.CRC32}} will
be used in decoders/encoders (as stated in step 7). By doing that, we will need
far fewer changes in the future when we actually start adding _CRC32C_ support.
Do you think it is worth doing this part now, or should we wait and make the
full change later on?
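To illustrate the idea, here is a rough sketch of encoder/decoder helpers that take the checksum algorithm as a parameter. All names here ({{FrameChecksumSketch}}, {{Algorithm}}, {{compute}}, {{verify}}) are hypothetical and are not the actual Cassandra classes; the CRC32C constant assumes Java 9+.

```java
import java.util.function.Supplier;
import java.util.zip.Checksum;

// Hypothetical sketch; it does not mirror Cassandra's real ChecksumType or frame codecs.
public class FrameChecksumSketch {

    // The algorithm is a parameter of the codec, so adding CRC32C later only
    // means passing a different constant.
    enum Algorithm {
        CRC32(() -> new java.util.zip.CRC32()),
        CRC32C(() -> new java.util.zip.CRC32C());

        private final Supplier<Checksum> factory;

        Algorithm(Supplier<Checksum> factory) {
            this.factory = factory;
        }

        Checksum newInstance() {
            return factory.get();
        }
    }

    // Encoder side: compute the checksum of a payload with the chosen algorithm.
    static long compute(Algorithm algorithm, byte[] payload) {
        Checksum checksum = algorithm.newInstance();
        checksum.update(payload, 0, payload.length);
        return checksum.getValue();
    }

    // Decoder side: recompute the checksum and compare it with the received value.
    static boolean verify(Algorithm algorithm, byte[] payload, long expected) {
        return compute(algorithm, payload) == expected;
    }
}
```

With this shape, keeping the existing behavior amounts to always passing {{Algorithm.CRC32}}; how the algorithm would be negotiated per connection is left out of the sketch.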
PS:
Even though we're not going to complete this ticket now and have to wait until
Java 8 support in C* is dropped, I'm still keen to get it done. If for some
reason I miss the moment when the full-blown work can be started, please ping
me :)
> CRC32 is inefficient on x86
> ---------------------------
>
> Key: CASSANDRA-16360
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16360
> Project: Cassandra
> Issue Type: Improvement
> Components: Messaging/Client
> Reporter: Avi Kivity
> Assignee: Alexey Zotov
> Priority: Normal
> Labels: protocolv6
> Fix For: 4.0.x
>
>
> The client/server protocol specifies CRC24 and CRC32 as the checksum
> algorithm (cql_protocol_V5_framing.asc). Those however are expensive to
> compute; this affects both the client and the server.
>
> A better checksum algorithm is CRC32C, which has hardware support on x86 (as
> well as other modern architectures).