Benedict created CASSANDRA-14843:
------------------------------------

             Summary: Drop/ddd column name with different Kind can result in 
corruption
                 Key: CASSANDRA-14843
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14843
             Project: Cassandra
          Issue Type: Bug
            Reporter: Benedict
            Assignee: Benedict
             Fix For: 3.0.18


While we have always imposed that the type of any column name remains 
consistent, we have not done the same for its kind.  If a column’s kind is 
changed via Drop/Add, the SerializationHeader’s column sets will be interpreted 
incorrectly, and may either lead to CorruptSSTableException or silent data 
corruption.

The problem occurs in SerializationHeader.Component.toHeader().  In this 
method, we lookup columns from the metadata only by name, ignoring the static 
status.  If a column of the same name but different kind to our dropped column 
exists in the schema, we will only add this column to the list of fields we 
expect.  So we will be missing a column from our set of expected static 
columns.  We will also add the regular variant to the set of regular columns 
present, which it may not be.

There are a number of ways this can play out:

1) There are no other static columns in the affected sstables.  In this case we 
will incorrectly treat the table as containing no static data, so we will not 
attempt to read any static row.  This leaves a static row to be read as the 
first regular row, and will throw the exception in the description.
1a) If for some reason the static row is absent - say, it has expired - then 
depending on lexicographical ordering, and if the sstable did not contain any 
instance of the regular variant, we may misattribute column data.  This will 
probably result in corruption exceptions, but if the columns were of the same 
type it could be undetectable.
2) There are other static columns on the table, and in the affected sstables.  
In this case, the row will be correctly treated as a static row, but depending 
on the dropped column’s lexicographical position relative to these, it may 
result in column data being misattributed to other columns.  This *could* be 
undetectable corruption, if the column types were the same.
3) Either of these above scenario could be replayed with the static/regular 
relations replaced.

CASSANDRA-14591 would protect against 1a and 2.

Probably the safest and correct fix is to prevent adding a column back with a 
different kind to the original, however this would require storing more 
metadata about dropped columns.  This is probably viable for 4.0, but for 3.0 
perhaps too invasive. 

It is quite easy to make this particular method safe against this particular 
corruption.  However it is hard to be certain there aren’t other places where 
assumptions were made about a name being of only one kind.  There are only a 
handful of places that *should* need to be of concern, specifically those 
places that invoke CFMetaData.getDroppedColumn, so we should be fairly 
confident that - if these call sites are safe - we are now insulated.  This 
might suffice for 3.0.  Thoughts?




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to