org.apache.cassandra.io.sstable.SSTableHeaderFix was added because of bugs in
3.6 that caused invalid or incompatible types (due to toString changes) in the
SSTableHeader. This logic runs on startup and rewrites every Stats file that
has a mismatch with the local schema. With 5.0 requiring upgrades from 4.x
only, this logic should already have run, as it's a 3.x to 4.0 migration step
(though users are able to opt out [1]), and so should already have fixed the
SSTables to have the correct schema.

Why is this a problem now? CASSANDRA-18504 is adding a lot of property/fuzz
tests to the type system and the read/write path, which have found several
bugs. Fixing some of those bugs actually impacts SSTableHeader, because the
header fix logic needs to generate and work with types that are not valid so
that it can fix them. By removing this logic, we can push this type validation
into the type system to avoid generating incorrect types.
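
As a rough illustration of what "validation in the type system" could look
like, here is a minimal Java sketch. It is a toy model, not Cassandra's real
type hierarchy: ToyType, ToyInt, ToyList and the specific rule being enforced
(no non-frozen element inside a list) are all hypothetical and chosen only for
the example. The point is that the constructor refuses to build an invalid
type, so no later code has to cope with one.

```java
final class TypeValidationSketch {
    /** Toy stand-in for a CQL type; NOT Cassandra's real AbstractType. */
    interface ToyType {
        boolean isMultiCell();
    }

    /** Toy primitive type; never multi-cell. */
    static final class ToyInt implements ToyType {
        @Override public boolean isMultiCell() { return false; }
        @Override public String toString() { return "int"; }
    }

    /** Toy list type; "multi-cell" means "non-frozen" in this sketch. */
    static final class ToyList implements ToyType {
        final ToyType element;
        final boolean multiCell;

        ToyList(ToyType element, boolean multiCell) {
            // Assumed rule for the sketch: validation happens at construction,
            // so a list with a non-frozen element can never be created.
            if (element.isMultiCell())
                throw new IllegalArgumentException(
                    "non-frozen element type is not allowed: " + element);
            this.element = element;
            this.multiCell = multiCell;
        }

        @Override public boolean isMultiCell() { return multiCell; }

        @Override public String toString() {
            String s = "list<" + element + ">";
            return multiCell ? s : "frozen<" + s + ">";
        }
    }

    public static void main(String[] args) {
        // Valid: the inner list is frozen.
        System.out.println(new ToyList(new ToyList(new ToyInt(), false), true));
        try {
            // Invalid: the inner list is non-frozen, so construction fails.
            new ToyList(new ToyList(new ToyInt(), true), true);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a rule like this in place, code that needs to build an invalid type in
order to repair it (which is what the header fix has to do) no longer has a way
to do so, which is the tension described above.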

If we wish to keep this class, we need to keep allowing invalid types to be
created, which may cause bugs down the road.


[1] If a user opts out, there are 2 real cases that are impacted: UDTs, and
collections of collections.
* For UDTs, the frozen and non-frozen types are not the same, so mixing them
causes us to fail to read the data, failing the read. I believe
writes/compactions will not corrupt the data, but anything that touches these
SSTables will fail due to the schema mismatch; the only way to resolve this is
to fix the SSTables. If you disabled this in 4.x, you were living with broken /
unreadable SSTables, so by removing this class 5.0 would lose the ability to
repair them (but 4.x would still be able to).
* For collections of collections, this is less of an issue. The logic would
detect that the collection has a non-frozen collection as its element, and so
would migrate it to frozen. This behavior has been moved to the type system, so
a read from an SSTable of "list<list<int>>" automatically becomes
"ListType(FrozenType(ListType(Int32Type)))". The SSTables are not "fixed", but
compaction is able to read the data correctly, and the new SSTables will have
the correct header.
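
The collections-of-collections case can be sketched in Java as follows. This
is only a toy model under stated assumptions: Node, Int32, ListOf, Frozen and
normalize are hypothetical names, not Cassandra classes or methods, and the
toString output is shaped to mirror the strings quoted above. It shows the
idea of wrapping a non-frozen list element in a frozen type when the type is
read.

```java
final class NestedFreezeSketch {
    /** Toy stand-in for a type node; NOT Cassandra's real AbstractType. */
    interface Node {}

    static final class Int32 implements Node {
        @Override public String toString() { return "Int32Type"; }
    }

    static final class ListOf implements Node {
        final Node element;
        ListOf(Node element) { this.element = element; }
        @Override public String toString() { return "ListType(" + element + ")"; }
    }

    static final class Frozen implements Node {
        final Node inner;
        Frozen(Node inner) { this.inner = inner; }
        @Override public String toString() { return "FrozenType(" + inner + ")"; }
    }

    /** Normalizes a top-level type: a non-frozen list element becomes frozen. */
    static Node normalize(Node type) {
        if (type instanceof ListOf)
            return new ListOf(freezeElement(((ListOf) type).element));
        return type; // primitives and already-frozen types are left alone
    }

    /** Wraps a bare list in Frozen; anything else is returned unchanged. */
    static Node freezeElement(Node element) {
        if (element instanceof ListOf)
            return new Frozen(element); // everything below is treated as frozen
        return element;
    }

    public static void main(String[] args) {
        // Models list<list<int>> as it might appear in an old SSTable header.
        Node fromHeader = new ListOf(new ListOf(new Int32()));
        System.out.println(normalize(fromHeader));
    }
}
```

Because normalization happens on the in-memory type in this sketch, the
on-disk header is left untouched, which matches the point that the SSTables
are not "fixed" until compaction writes new ones.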
