Robert Stupp created CASSANDRA-15035:
----------------------------------------
Summary: C* 3.0 sstables w/ UDTs are corrupted in 3.11 + 4.0
Key: CASSANDRA-15035
URL: https://issues.apache.org/jira/browse/CASSANDRA-15035
Project: Cassandra
Issue Type: Bug
Components: Feature/UDT, Local/SSTable
Reporter: Robert Stupp
Assignee: Robert Stupp
Fix For: 3.11.5, 4.0
OSS C* 3.0 writes incorrect type information for UDTs into the
serialization-header of each sstable.
In C* 3.0, both UDTs and tuples are always frozen. A frozen type must be
enclosed in a “bracket” in both the schema and the serialization-header:
{{frozen<...>}} in the {{CQL3Type}} hierarchy, or
{{org.apache.cassandra.db.marshal.FrozenType(...)}} in the {{AbstractType}}
hierarchy.
Since CASSANDRA-7423 (committed to C* 3.6) UDTs can also be non-frozen (=
multi-cell).
Unfortunately, C* 3.0 does not write the
{{org.apache.cassandra.db.marshal.FrozenType(...)}} “bracket” for UDTs into the
{{SerializationHeader.Component}} in the {{-Statistics.db}} sstable component.
The order in which columns of a row are serialized depends on the concrete
{{AbstractType}}. Columns with variable length types (frozen types belong to
this category) are serialized before columns with multi-cell types (non-frozen
types belong to that category).
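If C* 3.6 (or any newer version) reads an sstable written by C* 3.0 (up to
3.5), it will read the type information “non-frozen UDT” from the serialization
header. That is a faithful reading of what the header says, but it does not
match the data, which was written as a frozen UDT, so the reader expects the
columns in a different order than the one in which they were serialized.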
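A minimal sketch of this ordering effect, written in Python rather than Cassandra's Java and not taken from the patch. The {{Column}} class, the {{serialization_order}} function, the simplified type strings, and the by-name ordering inside each group are assumptions made for illustration. It only models the rule stated above, that frozen (single-cell) columns are serialized before multi-cell ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    type_string: str  # simplified, hypothetical notation for the recorded type
    multi_cell: bool  # True for non-frozen (multi-cell) types


def serialization_order(columns):
    """Single-cell (including frozen) columns first, then multi-cell columns.

    Ordering by name inside each group is an assumption of this sketch.
    """
    return [c.name for c in sorted(columns, key=lambda c: (c.multi_cell, c.name))]


# What the 3.0 writer actually did: the UDT column "b" is frozen.
written = [
    Column("a", "int", False),
    Column("b", "frozen<udt>", False),
    Column("c", "text", False),
]

# What a 3.6+ reader derives from a header that lacks the frozen bracket.
as_read = [
    Column("a", "int", False),
    Column("b", "udt", True),
    Column("c", "text", False),
]

print(serialization_order(written))  # ['a', 'b', 'c']
print(serialization_order(as_read))  # ['a', 'c', 'b']
```

In this model the reader would look for column "c" where the bytes of "b" were written, which is consistent with the kind of deserialization failures described in this ticket.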
This means that upgrades from C* 3.0 to C* 3.11 or 4.0, with a schema that
uses UDTs, result in inaccessible data in those sstables. Reads against 3.0
sstables, as well as attempts to scrub these sstables, result in a wide variety
of errors/exceptions ({{CorruptSSTableException}}, {{EOFException}},
{{OutOfMemoryError}}, etc.), as usual in such cases.
Mitigation strategy in the proposed patch:
* Fix the broken serialization-headers automatically when an upgrade from C*
3.0 is detected.
* Enhance {{sstablescrub}} to verify the serialization-header against the
schema and allow {{sstablescrub}} to fix the UDT types according to the
information in the schema. This does not apply to "online scrub" (e.g. nodetool
scrub). The behavior of {{sstablescrub}} has been changed to first inspect the
serialization-header and verify the type information against the schema.
Differences between the schema and the sstable serialization-headers cause
{{sstablescrub}} to error out and stop, i.e. safety first (there is a way to
opt out, though).
A new class {{SSTableHeaderFix}} can inspect the serialization-header
({{SerializationHeader.Component}}) in the {{-Statistics.db}} component and
fix the type information in those sstables for UDTs according to the schema
information.
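The verify-then-fix idea can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the {{SSTableHeaderFix}} implementation: {{check_header}}, its parameters, and the CQL-like type strings are hypothetical, and nested types (for example a UDT inside a collection) are not handled.

```python
def check_header(header_types, schema_types, udt_names):
    """Compare header column types against the schema (hypothetical helper).

    Returns (fixed_header, errors). A UDT recorded without its frozen
    bracket is rewritten to the schema's type; any other difference is
    reported and left untouched.
    """
    fixed = dict(header_types)
    errors = []
    for column, header_type in header_types.items():
        schema_type = schema_types.get(column)
        if schema_type == header_type:
            continue  # header and schema agree
        if header_type in udt_names and schema_type == f"frozen<{header_type}>":
            fixed[column] = schema_type  # restore the missing frozen bracket
        else:
            errors.append(
                f"{column}: header has {header_type}, schema has {schema_type}"
            )
    return fixed, errors


header = {"id": "int", "addr": "address", "tags": "list<text>"}
schema = {"id": "int", "addr": "frozen<address>", "tags": "set<text>"}

fixed, errors = check_header(header, schema, udt_names={"address"})
print(fixed["addr"])  # frozen<address>
print(errors)         # ['tags: header has list<text>, schema has set<text>']
```

A caller modelled on the {{sstablescrub}} behavior described above would stop when {{errors}} is non-empty, unless the operator has opted out.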
This new class could be used during verify and before sstables are imported.
But changes to “verify” and “import” are out of the scope of this ticket, as
the patch is already bigger than I originally expected.
Another issue not tackled by this ticket is that the wrong ‘kind’ is written to
the type information in {{system_schema.dropped_columns}} when a non-frozen UDT
column is dropped. When a UDT column is dropped, the type of the dropped column
is converted from the UDT definition to its “corresponding” tuple type
definition. All versions currently write {{frozen<tuple<...>>}}, whereas for
non-frozen UDTs it should actually just be {{tuple<...>}}. Unfortunately, there
is nothing that could be done in this ticket to fix (or even consider) the type
information of a dropped column. But for correctness, the tuple type should be
a multi-cell one (only accessible for dropped UDTs though - not as something
that a user can create as a type).
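To illustrate the difference, here is a small Python sketch of the UDT-to-tuple conversion. The function {{dropped_column_type}} and its parameters are hypothetical and only build the type string; they do not reflect how Cassandra actually writes {{system_schema.dropped_columns}}.

```python
def dropped_column_type(field_types, udt_is_frozen):
    """Build the tuple type string for a dropped UDT column (hypothetical)."""
    tuple_type = "tuple<" + ", ".join(field_types) + ">"
    return f"frozen<{tuple_type}>" if udt_is_frozen else tuple_type


fields = ["text", "int"]

# Behavior described above: the frozen form is always written.
print(dropped_column_type(fields, udt_is_frozen=True))   # frozen<tuple<text, int>>

# What the ticket argues would be correct for a non-frozen UDT.
print(dropped_column_type(fields, udt_is_frozen=False))  # tuple<text, int>
```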