arvindKandpal-ksolves opened a new pull request, #4825: URL: https://github.com/apache/cassandra/pull/4825
Make bin/sstableupgrade functionally on par with nodetool upgradesstables

### What this PR does / why we need it:

This PR brings functional parity between `bin/sstableupgrade` and `nodetool upgradesstables` by introducing the `-a` / `--include-all-sstables` flag to the offline tool. Previously, `bin/sstableupgrade` would automatically skip SSTables that were already on the latest version. This patch allows users to force a rewrite of all SSTables even if they are on the current format.

### Technical Details & Safety:

Adding this flag to an offline tool introduced a known edge case: if the offline tool is run while the Cassandra node is online, it consumes a new SSTable ID on disk. However, the live node's internal `sstableIdGenerator` remains unaware of this. Upon the next flush, the live node attempts to use the same ID, which previously triggered a `java.lang.AssertionError` in `ColumnFamilyStore.newSSTableDescriptor()`.

To safely resolve this without introducing regressions:

1. Replaced the strict `assert !newDescriptor.fileFor(Components.DATA).exists();` check in `newSSTableDescriptor` with a safe `while(true)` loop. It checks if the generated ID already exists on disk. If a collision is detected, it logs a warning and advances the generator, self-healing the state.
2. Updated `StandaloneUpgrader` to parse and apply the `-a` option consistently.
3. Added a robust test (`testNewSSTableDescriptorCollision`) in `ColumnFamilyStoreTest` that probes the current ID, uses `SSTableIdFactory` to pre-create files for future IDs (`N+1`, `N+2`), and verifies that the loop correctly skips them and yields `N+3`.
4. Updated flag assertions in `StandaloneUpgraderTest`.
5. Updated documentation in `sstableupgrade.adoc`.

patch by Arvind Kandpal; reviewed by TBD for CASSANDRA-21133

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
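The collision-skipping loop from item 1 of the PR description, and the `N+3` check from item 3, can be illustrated with a minimal standalone sketch. This is not the patch itself: the class `SSTableIdCollisionSketch`, the method `nextFreeId`, the `AtomicInteger` generator, and the `nb-<id>-big-Data.db` file naming are all hypothetical stand-ins chosen for illustration, and the real `ColumnFamilyStore.newSSTableDescriptor()` may differ in its details.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicInteger;

public class SSTableIdCollisionSketch {
    // Hypothetical stand-in for the live node's in-memory ID generator.
    private final AtomicInteger generator;
    private final Path dir;

    public SSTableIdCollisionSketch(Path dir, int lastUsedId) {
        this.dir = dir;
        this.generator = new AtomicInteger(lastUsedId);
    }

    // Illustrative file naming only; assumed, not taken from the patch.
    public static Path dataFileFor(Path dir, int id) {
        return dir.resolve("nb-" + id + "-big-Data.db");
    }

    // Keep advancing the generator until the candidate ID has no data file on disk.
    public int nextFreeId() {
        while (true) {
            int candidate = generator.incrementAndGet();
            if (!Files.exists(dataFileFor(dir, candidate))) {
                return candidate;
            }
            // Collision: something outside this process already used the ID.
            System.err.println("WARN: SSTable id " + candidate + " already exists on disk, skipping");
        }
    }

    // Mirrors the test idea: take N, pre-create N+1 and N+2, expect N+3 next.
    public static int[] demo() {
        try {
            Path dir = Files.createTempDirectory("sstable-sketch");
            SSTableIdCollisionSketch store = new SSTableIdCollisionSketch(dir, 0);
            int n = store.nextFreeId();
            Files.createFile(dataFileFor(dir, n + 1));
            Files.createFile(dataFileFor(dir, n + 2));
            int next = store.nextFreeId();
            return new int[] { n, next };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("N=" + result[0] + ", next=" + result[1]);
    }
}
```

In this sketch the loop trades the hard failure of an assertion for a warning plus a retry, which is the self-healing behaviour the description refers to. It does not address two writers racing for the same free ID at the same instant.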
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]

