arvindKandpal-ksolves opened a new pull request, #4825:
URL: https://github.com/apache/cassandra/pull/4825

   Make bin/sstableupgrade functionally on par with nodetool upgradesstables
   
   ### What this PR does / why we need it:
   This PR brings `bin/sstableupgrade` to functional parity with 
`nodetool upgradesstables` by adding the `-a` / `--include-all-sstables` flag 
to the offline tool. 
   
   Previously, `bin/sstableupgrade` would automatically skip SSTables that were 
already on the latest version. This patch allows users to force a rewrite of 
all SSTables even if they are on the current format.
   
   ### Technical Details & Safety:
   Adding this flag to an offline tool introduced a known edge case: if the 
offline tool is run while the Cassandra node is online, it consumes a new 
SSTable ID on disk, but the live node's internal `sstableIdGenerator` remains 
unaware of it. On the next flush, the live node attempts to use the same ID, 
which previously triggered a `java.lang.AssertionError` in 
`ColumnFamilyStore.newSSTableDescriptor()`.
   
   To safely resolve this without introducing regressions:
   1. Replaced the strict 
`assert !newDescriptor.fileFor(Components.DATA).exists();` check in 
`newSSTableDescriptor` with a `while(true)` loop that checks whether the data 
file for the generated ID already exists on disk. If a collision is detected, 
it logs a warning and advances the generator until it reaches a free ID.
   2. Updated `StandaloneUpgrader` to parse and apply the `-a` option 
consistently.
   3. Added a test (`testNewSSTableDescriptorCollision`) in 
`ColumnFamilyStoreTest` that probes the current ID `N`, uses `SSTableIdFactory` 
to pre-create files for the next two IDs (`N+1`, `N+2`), and verifies that the 
loop skips them and yields `N+3`.
   4. Updated flag assertions in `StandaloneUpgraderTest`.
   5. Updated documentation in `sstableupgrade.adoc`.
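
   The collision-skipping loop from item 1 and the `N+1` / `N+2` / `N+3` 
scenario from item 3 can be sketched in isolation as follows. This is a 
minimal illustration, not the code in the patch: the class 
`DescriptorCollisionSketch`, the methods `dataFileFor` and `nextFreeId`, the 
`sketch-<id>-Data.db` file naming, and the use of an `AtomicInteger` as the ID 
generator are all assumptions made for the example, and they do not reflect 
Cassandra's actual descriptor or ID-generator types.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicInteger;

public class DescriptorCollisionSketch {

    // Hypothetical file naming for illustration only; real SSTable names differ.
    public static Path dataFileFor(Path dir, int id) {
        return dir.resolve("sketch-" + id + "-Data.db");
    }

    // Keep drawing IDs from the generator until one has no data file on disk.
    // Each skipped ID is reported, standing in for the warning log in the patch.
    public static int nextFreeId(Path dir, AtomicInteger generator) {
        while (true) {
            int candidate = generator.incrementAndGet();
            if (!Files.exists(dataFileFor(dir, candidate))) {
                return candidate;
            }
            System.err.println("WARN: data file for id " + candidate
                    + " already exists; advancing generator");
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sstable-sketch");
        AtomicInteger generator = new AtomicInteger(0);

        // Probe the current ID N, then simulate an offline tool that has
        // already written files for N+1 and N+2 behind the generator's back.
        int n = nextFreeId(dir, generator);
        Files.createFile(dataFileFor(dir, n + 1));
        Files.createFile(dataFileFor(dir, n + 2));

        int next = nextFreeId(dir, generator);
        System.out.println("N = " + n + ", next free id = " + next);
    }
}
```

   With the strict `assert` the second call would fail on `N+1`; with the 
loop it skips both pre-created files and returns `N+3`, leaving the generator 
positioned past the colliding IDs.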
   
   patch by Arvind Kandpal; reviewed by TBD for CASSANDRA-21133


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

