[ 
https://issues.apache.org/jira/browse/CASSANDRA-15299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106257#comment-17106257
 ] 

Aleksey Yeschenko commented on CASSANDRA-15299:
-----------------------------------------------

bq. Firstly, we think that not all users will be interested in enabling 
checksumming. Making it mandatory may keep them away from v5, potentially 
decreasing its adoption rate.

The change is less about checksumming per se, and more about improving protocol 
performance by making framing sane, in particular when compression is enabled. 
In addition, I strongly believe that no new protocol should be released without 
application-level checksumming built in, and v5 is a new protocol.

bq. For these drivers, it's a race against time that starts now: they must 
implement protocol checksumming before Cassandra 4.0 GA, or they will lose 
access to all the v5 features implemented so far, especially keyspace-per-query 
– which is a nice usability improvement that many users are waiting for.

All of the v5-beta features implemented so far are nice-to-haves and 
non-essential, IMO. The new framing, to me, *is* the core of v5. And if those 
drivers don't immediately support the new protocol upon C* release, that is 
fine, too. C* and drivers are on separate release cadences, for one, and the v4 
protocol is still supported, for two.

bq. Making checksumming opt-in would allow many drivers to be ready for 
Cassandra 4.0 GA, and others to catch up quickly. We feel this would greatly 
contribute to promoting Cassandra 4.0's adoption.

Again, it would be nice if all the drivers supported protocol v5 by the time C* 
4.0 was out, but I don't see it as an issue if they don't. Protocol v5 is not 
the selling feature of C* 4.0, and other changes should provide plenty of 
motivation for people to upgrade. The drivers will still be ready for 4.0 GA, 
so long as they can speak v4, and can implement v5 at some point later.

> CASSANDRA-13304 follow-up: improve checksumming and compression in protocol 
> v5-beta
> -----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Client
>            Reporter: Aleksey Yeschenko
>            Assignee: Sam Tunnicliffe
>            Priority: Normal
>              Labels: protocolv5
>             Fix For: 4.0-alpha
>
>
> CASSANDRA-13304 made an important improvement to our native protocol: it 
> introduced checksumming/CRC32 to request and response bodies. It’s an 
> important step forward, but it doesn’t cover the entire stream. In 
> particular, the message header is not covered by a checksum or a CRC, which 
> poses a correctness issue if, for example, {{streamId}} gets corrupted.
> Additionally, we aren’t quite using CRC32 correctly, in two ways:
> 1. We are calculating the CRC32 of the *decompressed* value instead of 
> computing the CRC32 on the bytes written on the wire - losing the properties 
> of the CRC32. In some cases, due to this sequencing, attempting to decompress 
> a corrupt stream can cause a segfault by LZ4.
> 2. When using CRC32, the CRC32 value is written in the incorrect byte order, 
> also losing some of the protections.
> See https://users.ece.cmu.edu/~koopman/pubs/KoopmanCRCWebinar9May2012.pdf for 
> explanation for the two points above.
> Separately, there are some long-standing issues with the protocol - since 
> *way* before CASSANDRA-13304. Importantly, both checksumming and compression 
> operate on individual message bodies rather than frames of multiple complete 
> messages. In reality, this has several important additional downsides. To 
> name a couple:
> # For compression, we are getting poor compression ratios for smaller 
> messages - when operating on tiny sequences of bytes. In reality, for most 
> small requests and responses we are discarding the compressed value as it’d 
> be smaller than the uncompressed one - incurring both redundant allocations 
> and compressions.
> # For checksumming and CRC32 we pay a high overhead price for small messages. 
> 4 bytes extra is *a lot* for an empty write response, for example.
> To address the correctness issue of {{streamId}} not being covered by the 
> checksum/CRC32 and the inefficiency in compression and checksumming/CRC32, we 
> should switch to a framing protocol with multiple messages in a single frame.
> I suggest we reuse the framing protocol recently implemented for internode 
> messaging in CASSANDRA-15066 to the extent that its logic can be borrowed, 
> and that we do it before native protocol v5 graduates from beta. See 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderCrc.java
>  and 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderLZ4.java.
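
The framing idea in the issue description above can be sketched roughly as 
follows. This is a minimal illustration in Python, not the actual v5 or 
CASSANDRA-15066 wire format: the header layout, field widths, byte order and 
function names are all hypothetical, and zlib stands in for LZ4 so that the 
example needs only the standard library. It shows several messages packed into 
one frame, with the CRC32 computed over the bytes written on the wire (header 
and compressed payload) and verified before any decompression is attempted.

```python
import struct
import zlib

# Hypothetical header: compressed payload length, uncompressed body length.
HEADER = struct.Struct("<II")
# 4-byte CRC32 field; the byte order is an arbitrary choice for this sketch.
CRC = struct.Struct("<I")


def encode_frame(messages):
    """Pack several already-serialized messages into one checksummed frame."""
    # Length-prefix each message so the receiver can split them apart again.
    body = b"".join(struct.pack("<I", len(m)) + m for m in messages)
    # Compress the whole batch once instead of each small message separately.
    payload = zlib.compress(body)
    header = HEADER.pack(len(payload), len(body))
    # Both CRCs cover exactly the bytes that go on the wire.
    return (header + CRC.pack(zlib.crc32(header))
            + payload + CRC.pack(zlib.crc32(payload)))


def decode_frame(frame):
    """Verify both CRCs first, then decompress and split the messages."""
    header = frame[:HEADER.size]
    (header_crc,) = CRC.unpack_from(frame, HEADER.size)
    if zlib.crc32(header) != header_crc:
        raise ValueError("corrupt frame header")
    payload_len, body_len = HEADER.unpack(header)

    start = HEADER.size + CRC.size
    payload = frame[start:start + payload_len]
    (payload_crc,) = CRC.unpack_from(frame, start + payload_len)
    if zlib.crc32(payload) != payload_crc:
        # Rejected before the decompressor ever sees the corrupt bytes.
        raise ValueError("corrupt frame payload")

    body = zlib.decompress(payload)
    if len(body) != body_len:
        raise ValueError("unexpected uncompressed length")

    messages, pos = [], 0
    while pos < len(body):
        (n,) = struct.unpack_from("<I", body, pos)
        pos += 4
        messages.append(body[pos:pos + n])
        pos += n
    return messages
```

In this sketch the 4-byte payload CRC is paid once per frame rather than once 
per message, and a corrupted header or payload is reported as an error instead 
of being handed to the decompressor.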



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
