[ 
https://issues.apache.org/jira/browse/CASSANDRA-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16582108#comment-16582108
 ] 

Sylvain Lebresne commented on CASSANDRA-14568:
----------------------------------------------

It's absolutely possible I'm missing some problem, as god knows those backward 
compatibility issues often comes with nasty surprises, but I don't think the 
problem is too hard to solve.

That is, {{LegacyBound}} is only here for backward compatibility and never used 
by 3.0 before being converting first, so I'm sure to understand what you mean 
by "(setting it) to something that breaks the 3.0 definition of a static 
clustering".

I believe we should simply make sure that when encoding to 2.x a 
{{STATIC_CLUSTERING}}, we append "clustering size" empty values before adding 
the rest (the column name), and when decoding, we assume those empty values are 
here (but ignore them). This is what we do for "cell names" 
([here|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/LegacyLayout.java#L287-L291]
 and 
[here|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/LegacyLayout.java#L329-L330])
 and to the best of my knowledge is working properly.

In other words, I "think" we need to do the 2 following changes:
# revert the change made by this patch in 
[{{LegacyLayout.decodeBound}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/LegacyLayout.java#L210-L215].
# make sure that in 
[{{LegacyLayout.serializeCompound}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/LegacyLayout.java#L2394-L2400]
 (and {{serializeSizeCompound}}), the static case is specialized to 
automatically add the empty values, which might possibly be better handler 
directly within {{CompositeType.Builder}}.


> Static collection deletions are corrupted in 3.0 -> 2.{1,2} messages
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-14568
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14568
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Critical
>             Fix For: 3.0.17, 3.11.3
>
>
> In 2.1 and 2.2, row and complex deletions were represented as range 
> tombstones.  LegacyLayout is our compatibility layer, that translates the 
> relevant RT patterns in 2.1/2.2 to row/complex deletions in 3.0, and vice 
> versa.  Unfortunately, it does not handle the special case of static row 
> deletions, they are treated as regular row deletions. Since static rows are 
> themselves never directly deleted, the only issue is with collection 
> deletions.
> Collection deletions in 2.1/2.2 were encoded as a range tombstone, consisting 
> of a sequence of the clustering keys’ data for the affected row, followed by 
> the bytes representing the name of the collection column.  STATIC_CLUSTERING 
> contains zero clusterings, so by treating the deletion as for a regular row, 
> zero clusterings are written to precede the column name of the erased 
> collection, so the column name is written at position zero.
> This can exhibit itself in at least two ways:
>  # If the type of your first clustering key is a variable width type, new 
> deletes will begin appearing covering the clustering key represented by the 
> column name.
>  ** If you have multiple clustering keys, you will receive a RT covering all 
> those rows with a matching first clustering key.
>  ** This RT will be valid as far as the system is concerned, and go 
> undetected unless there are outside data quality checks in place.
>  # Otherwise, an invalid size of data will be written to the clustering and 
> sent over the network to the 2.1 node.
>  ** The 2.1/2.2 node will handle this just fine, even though the record is 
> junk.  Since it is a deletion covering impossible data, there will be no 
> user-API visible effect.  But if received as a write from a 3.0 node, it will 
> dutifully persist the junk record.
>  ** The 3.0 node that originally sent this junk, may later coordinate a read 
> of the partition, and will notice a digest mismatch, read-repair and 
> serialize the junk to disk
>  ** The sstable containing this record is now corrupt; the deserialization 
> expects fixed-width data, but it encounters too many (or too few) bytes, and 
> is now at an incorrect position to read its structural information
>  ** (Alternatively when the 2.1 node is upgraded this will occur on eventual 
> compaction)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to