[ 
https://issues.apache.org/jira/browse/CASSANDRA-15035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-15035:
---------------------------------------
    Fix Version/s:     (was: 3.11.5)
                   3.11.6

> C* 3.0 sstables w/ UDTs are corrupted in 3.11 + 4.0
> ---------------------------------------------------
>
>                 Key: CASSANDRA-15035
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15035
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/UDT, Local/SSTable
>            Reporter: Robert Stupp
>            Assignee: Robert Stupp
>            Priority: Urgent
>             Fix For: 4.0, 3.11.6
>
>
> OSS C* 3.0 writes incorrect type information for UDTs into the 
> serialization-header of each sstable.
> In C* 3.0, both UDTs and tuples are always frozen. A frozen type must be 
> enclosed in a {{frozen<...>}} “bracket” via the {{CQL3Type}} hierarchy (or a 
> {{org.apache.cassandra.db.marshal.FrozenType(...)}} “bracket” via the 
> {{AbstractType}} hierarchy) in the schema and serialization-header.
> Since CASSANDRA-7423 (committed to C* 3.6) UDTs can also be non-frozen (= 
> multi-cell).
> Unfortunately, C* 3.0 does not write the 
> {{org.apache.cassandra.db.marshal.FrozenType(...)}} “bracket” for UDTs into 
> the {{SerializationHeader.Component}} in the {{-Statistics.db}} sstable component.
> The order in which columns of a row are serialized depends on the concrete 
> {{AbstractType}}. Columns with variable length types (frozen types belong to 
> this category) are serialized before columns with multi-cell types 
> (non-frozen types belong to that category).
> If C* 3.6 (or any newer version) reads an sstable written by C* 3.0 (up to 
> 3.5), it will read the type information “non-frozen UDT” from the 
> serialization header. That is a technically correct reading of the header, 
> but the data was written as frozen, so the reader expects a different column 
> order than the one actually on disk.
> This means that upgrades from C* 3.0 to C* 3.11 and 4.0, using a schema that 
> uses UDTs, result in inaccessible data in those sstables. Reads against 3.0 
> sstables as well as attempts to scrub these sstables result in a wide variety 
> of errors/exceptions ({{CorruptSSTableException}}, {{EOFException}}, 
> {{OutOfMemoryError}}, etc.), as usual in such cases.
> Mitigation strategy in the proposed patch:
> * Fix the broken serialization-headers automatically when an upgrade from C* 
> 3.0 is detected.
> * Enhance {{sstablescrub}} to verify the serialization-header against the 
> schema and allow {{sstablescrub}} to fix the UDT types according to the 
> information in the schema. This does not apply to "online scrub" (e.g. 
> nodetool scrub). The behavior of {{sstablescrub}} has been changed to first 
> inspect the serialization-header and verify the type information against the 
> schema. 
> Differences between the schema and the sstable serialization-headers cause 
> {{sstablescrub}} to error out and stop, i.e. safety first (there’s a way to 
> opt out though).
> A new class {{SSTableHeaderFix}} can inspect the serialization-header 
> ({{SerializationHeader.Component}}) in the {{-Statistics.db}} component 
> and fix the type information in those sstables for UDTs according to the 
> schema information.
> This new class could be used during verify and before sstables are imported. 
> But changes to “verify” and “import” are out of the scope of this ticket, as 
> the patch is already bigger than I originally expected.
> Another issue not tackled by this ticket is that the wrong ‘kind’ is written 
> to the type information in {{system_schema.dropped_columns}} when a 
> non-frozen UDT column is dropped. When a UDT column is dropped, the type of 
> the dropped column is converted from the UDT definition to its 
> “corresponding” tuple type definition. All versions currently write 
> {{frozen<tuple<...>>}}, but for non-frozen UDTs it should actually just be 
> {{tuple<...>}}. Unfortunately, there is nothing that could be done in this 
> ticket to fix (or even consider) the type information of a dropped column. 
> But for correctness, the tuple type should be a multi-cell one (only 
> accessible for dropped UDTs though - not as something that a user can create 
> as a type).
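
The following toy model is an illustration only and is not Cassandra code. It sketches why the missing frozen "bracket" in a 3.0 serialization-header changes the column order a newer reader expects, and how checking the header against the schema could repair it. The names Column, is_multi_cell, serialization_order and fix_header, the simplified type strings such as "udt:address", and the by-name ordering inside each group are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    type_str: str  # simplified notation, e.g. "int", "udt:address", "frozen<udt:address>"


def is_multi_cell(type_str):
    # Toy rule (assumption): a UDT without a frozen<...> wrapper is multi-cell.
    return type_str.startswith("udt:")


def serialization_order(columns):
    # Single-cell columns (including frozen types) come before multi-cell ones.
    # Sorting by name inside each group is an assumption of this sketch.
    simple = sorted(c.name for c in columns if not is_multi_cell(c.type_str))
    multi = sorted(c.name for c in columns if is_multi_cell(c.type_str))
    return simple + multi


def fix_header(header, schema):
    # Repair UDT columns whose header entry lacks the frozen wrapper that the
    # schema declares; refuse to continue on any other difference.
    schema_types = {c.name: c.type_str for c in schema}
    fixed = []
    for col in header:
        expected = schema_types.get(col.name)
        if expected == col.type_str:
            fixed.append(col)
        elif col.type_str.startswith("udt:") and expected == f"frozen<{col.type_str}>":
            fixed.append(Column(col.name, expected))
        else:
            raise ValueError(f"header/schema mismatch for column {col.name!r}")
    return fixed


schema = [Column("addr", "frozen<udt:address>"), Column("zip", "int")]
header_30 = [Column("addr", "udt:address"), Column("zip", "int")]  # bracket missing

print(serialization_order(schema))     # order the 3.0 writer used: ['addr', 'zip']
print(serialization_order(header_30))  # order derived from the header: ['zip', 'addr']
print(serialization_order(fix_header(header_30, schema)))  # ['addr', 'zip']
```

In this model the writer and the reader disagree on the column order until the header entry is rewritten from the schema, which is the kind of repair the description attributes to {{SSTableHeaderFix}}. Raising on every other difference mirrors the "safety first" behavior described for {{sstablescrub}}.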



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
