We now know what caused the problem, and we have a better workaround for it, especially if the compaction strategy in use is LCS or UCS.
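For reference, the workaround discussed in this thread is a one-line change in `cassandra.yaml`; a minimal fragment (both values are taken from the thread, everything else left at defaults):

```yaml
# A very large interval effectively disables early opening on the size
# interval while keeping early opening of completed sstables enabled
# (the suggested workaround).
sstable_preemptive_open_interval: 1TiB
# Alternatively, setting this option to null disables early open entirely.
```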
Set `sstable_preemptive_open_interval: 1TiB` to disable early opening on the size interval while still keeping early opening of completed sstables enabled. This avoids the affected code path but retains most of the benefit of early open.

Regards,
Branimir

________________________________
From: Abe Ratnofsky <[email protected]>
Sent: Wednesday, October 15, 2025 8:22:18 PM
To: [email protected] <[email protected]>
Subject: [EXTERNAL] Re: AssertionError when reading BTI sstable in 5.0.5 during token range query

Do you have any logs you can share for operations on these SSTables? Thinking of things like flush, compaction, streaming, etc.

On Oct 15, 2025, at 11:37 AM, Ivan Zalozhnykh <[email protected]> wrote:

We use LeveledCompactionStrategy on both affected tables. The issue is not deterministic: the same query may fail once and then succeed on retry. Each time the error occurs, it's on a different token range, node and file.

We'll try to reproduce the issue with sstable_preemptive_open_interval: null this week and I'll follow up with the results.

Best regards,
Ivan Zalozhnykh

On Wed, Oct 15, 2025 at 16:58, Branimir Lambov <[email protected]> wrote:

Hello,

To help pinpoint the problem, could you check if it still happens if early open is disabled (i.e. `sstable_preemptive_open_interval` is set to null in `cassandra.yaml`)?

Regards,
Branimir

On 2025/10/15 12:07:06 Ivan Zalozhnykh wrote:

Hello,

We're experiencing a recurring issue in Cassandra 5.0.5 when performing token-range read queries on a table using the BTI sstable format. The issue results in a fatal `AssertionError` during read, which causes read failures in the client with `ReadFailureException`.
The exception from Cassandra logs looks like this:

```
ERROR [ReadStage-6] 2025-08-27 19:22:49,331 JVMStabilityInspector.java:70 - Exception in thread Thread[ReadStage-6,5,SharedPool]
java.lang.RuntimeException: java.lang.AssertionError: Caught an error while trying to process the command: SELECT * FROM keyspace.table WHERE token(profile_id) > -8987898918496238646 AND token(profile_id) <= -8718169305572166374 LIMIT 1000 ALLOW FILTERING
	at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:108)
	at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
	at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
	at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.AssertionError: Caught an error while trying to process the command: SELECT * FROM keyspace.table WHERE token(profile_id) > -8987898918496238646 AND token(profile_id) <= -8718169305572166374 LIMIT 1000 ALLOW FILTERING
	at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:112)
	at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
	at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
	... 6 common frames omitted
Caused by: java.lang.AssertionError: 406005 > 393216
	at org.apache.cassandra.io.util.MmappedRegions$State.floor(MmappedRegions.java:362)
	at org.apache.cassandra.io.util.MmappedRegions.floor(MmappedRegions.java:241)
	at org.apache.cassandra.io.util.MmapRebufferer.rebuffer(MmapRebufferer.java:40)
	at org.apache.cassandra.io.tries.Walker.<init>(Walker.java:75)
	at org.apache.cassandra.io.tries.ValueIterator.<init>(ValueIterator.java:96)
	at org.apache.cassandra.io.tries.ValueIterator.<init>(ValueIterator.java:80)
	at org.apache.cassandra.io.sstable.format.bti.PartitionIndex$IndexPosIterator.<init>(PartitionIndex.java:407)
	at org.apache.cassandra.io.sstable.format.bti.PartitionIterator.<init>(PartitionIterator.java:113)
	at org.apache.cassandra.io.sstable.format.bti.PartitionIterator.create(PartitionIterator.java:75)
	at org.apache.cassandra.io.sstable.format.bti.BtiTableReader.coveredKeysIterator(BtiTableReader.java:293)
	at org.apache.cassandra.io.sstable.format.bti.BtiTableScanner$BtiScanningIterator.prepareToIterateRow(BtiTableScanner.java:93)
	at org.apache.cassandra.io.sstable.format.SSTableScanner$BaseKeyScanningIterator.computeNext(SSTableScanner.java:248)
	at org.apache.cassandra.io.sstable.format.SSTableScanner$BaseKeyScanningIterator.computeNext(SSTableScanner.java:228)
	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
	at org.apache.cassandra.io.sstable.format.SSTableScanner.hasNext(SSTableScanner.java:190)
	at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:90)
	at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:375)
	at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:187)
	at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
	at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$4.hasNext(UnfilteredPartitionIterators.java:264)
	at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:90)
	at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:334)
	at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:201)
	at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:186)
	at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48)
	at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:374)
	at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:93)
	... 8 common frames omitted
```

The query itself is a standard paginated scan over token ranges with a LIMIT. Here's a simplified version of the code performing it:

```kotlin
@Query("select profile_id, token(profile_id) as page_token from keyspace.table where token(profile_id) > :pageToken limit :limitCount")
fun selectWithToken(
    pageToken: Long,
    limitCount: Int,
): MappedReactiveResultSet<MigratorEntityToken>
```

This results in read failures like this on the client side:

```
Caused by: com.datastax.oss.driver.api.core.servererrors.ReadFailureException: Cassandra failure during read query at consistency LOCAL_QUORUM (2 responses were required but only 0 replica responded, 1 failed)
```

The issue occurs with multiple SSTables, across multiple token ranges, on different nodes.
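The decisive frame above is the `AssertionError: 406005 > 393216` raised in `MmappedRegions$State.floor`: a read was attempted at a file position beyond the end of the memory-mapped regions. Schematically (a simplified illustration of what such a floor lookup checks, not Cassandra's actual implementation), the failure amounts to:

```java
import java.util.Arrays;

// Simplified sketch of a floor() lookup over mmapped region boundaries.
// `regionEnds` holds the sorted end offsets of the mapped regions; asking
// for a position past the last boundary trips the assertion, mirroring
// the reported "406005 > 393216".
public class MmapFloorSketch {
    private final long[] regionEnds;

    public MmapFloorSketch(long[] regionEnds) {
        this.regionEnds = regionEnds;
    }

    /** Returns the start offset of the region containing `position`. */
    public long floor(long position) {
        long limit = regionEnds[regionEnds.length - 1];
        if (position > limit)
            throw new AssertionError(position + " > " + limit);
        int idx = Arrays.binarySearch(regionEnds, position);
        if (idx < 0)
            idx = -idx - 1;                  // first region end >= position
        return idx == 0 ? 0 : regionEnds[idx - 1];
    }

    public static void main(String[] args) {
        MmapFloorSketch regions = new MmapFloorSketch(new long[] { 131072, 262144, 393216 });
        System.out.println(regions.floor(140000)); // inside the second region -> 131072
        try {
            regions.floor(406005);                 // past the mapped length
        } catch (AssertionError e) {
            System.out.println("AssertionError: " + e.getMessage());
        }
    }
}
```

This matches the early-open hypothesis in the thread: a position in a still-growing sstable can exceed what was mapped when the reader was opened.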
All the affected SSTables are Trie-indexed, and the error disappears when switching the sstable format back to Big.

Has anyone else seen similar behavior with BTI sstables in Cassandra 5.0.x? Is this a known issue or something we should file a Jira ticket for?

Thanks in advance,
Ivan Zalozhnykh
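The paginated token-range scan described in the original post follows a common pattern: fetch up to `limitCount` rows with `token(pk) > pageToken`, advance `pageToken` to the last token seen, and stop when a page comes back short. A driver-free sketch of that loop (the in-memory `fetchPage` stands in for the actual CQL query; all names here are illustrative, not from the reporter's code):

```java
import java.util.ArrayList;
import java.util.List;

public class TokenScanSketch {
    // Stand-in for the CQL query: returns up to `limit` tokens strictly
    // greater than `pageToken`, drawn from a pre-sorted token universe.
    static List<Long> fetchPage(long[] sortedTokens, long pageToken, int limit) {
        List<Long> page = new ArrayList<>();
        for (long t : sortedTokens) {
            if (t > pageToken) {
                page.add(t);
                if (page.size() == limit)
                    break;
            }
        }
        return page;
    }

    // Full scan: repeat until a short (or empty) page signals the end.
    static List<Long> scanAll(long[] sortedTokens, int limit) {
        List<Long> all = new ArrayList<>();
        long pageToken = Long.MIN_VALUE;
        while (true) {
            List<Long> page = fetchPage(sortedTokens, pageToken, limit);
            all.addAll(page);
            if (page.size() < limit)
                break;                               // last page
            pageToken = page.get(page.size() - 1);   // advance the cursor
        }
        return all;
    }
}
```

Because each page issues an independent range query, a transient server-side failure on one page can succeed on retry, which is consistent with the non-deterministic behavior reported in the thread.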
