[
https://issues.apache.org/jira/browse/CASSANDRA-21111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18052165#comment-18052165
]
Stefan Miklosovic edited comment on CASSANDRA-21111 at 1/15/26 4:12 PM:
------------------------------------------------------------------------
2,147,483,647 is Integer.MAX_VALUE; adding 1 to it wraps around to
-2,147,483,648. -2,146,021,117 is _bigger_ than -2,147,483,648, which means
that we just kept adding +1 after the wrap. The overflow logic checks out
here. I am not completely sure how that might happen, though: the difference
between these two numbers is still 1,462,531, so somebody had to generate
pretty aggressively to hit this.
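To make the arithmetic above easy to verify, here is a minimal Java sketch. The class and method names are made up for illustration and are not taken from the Cassandra code base; it only shows how a plain {{int}} increment wraps and how far the observed value sits past the wrap point.

```java
// Hypothetical demo class, not part of Cassandra.
final class GenerationOverflowDemo {

    // A plain int increment wraps silently from Integer.MAX_VALUE to Integer.MIN_VALUE.
    static int nextWrapping(int generation) {
        return generation + 1;
    }

    // Number of increments needed to get from Integer.MIN_VALUE to the observed value.
    // The subtraction is done in long arithmetic so it cannot overflow itself.
    static long incrementsSinceWrap(int observed) {
        return (long) observed - Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(nextWrapping(Integer.MAX_VALUE));   // -2147483648
        System.out.println(incrementsSinceWrap(-2146021117));  // 1462531

        // Math.addExact is the standard way to fail loudly instead of wrapping.
        try {
            Math.addExact(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```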
The main generation logic is here (1). What is suspicious is that it runs in a
{{while}} loop. Could it be that it could not "allocate" a new ID for some
time and just kept incrementing it? That looks like some kind of race
condition, for example two SSTables being created at the very same time so
that one loop would "override" the other. That seems more probable to me than
somebody generating 2 billion SSTables.
(1)
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/AbstractSSTableSimpleWriter.java#L158-L173
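The following is only a sketch of the hypothesis about the {{while}} loop, not the code behind the link. It assumes a retry loop that bumps a shared counter on every attempt and returns once a candidate ID can be claimed; {{RetryLoopSketch}}, {{allocate}} and {{tryClaim}} are hypothetical names. It shows that if claiming keeps failing, the counter can run past Integer.MAX_VALUE without that many SSTables ever being created.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

// Hypothetical illustration of a retry loop; not the actual Cassandra logic.
final class RetryLoopSketch {

    // Keeps taking the next counter value until tryClaim accepts one.
    // incrementAndGet wraps around on overflow, exactly like a plain int increment.
    static int allocate(AtomicInteger counter, IntPredicate tryClaim) {
        while (true) {
            int candidate = counter.incrementAndGet();
            if (tryClaim.test(candidate)) {
                return candidate;
            }
            // Claim failed (assumed: another writer took the ID), so loop and try again.
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(Integer.MAX_VALUE - 2);
        int[] attempts = {0};
        // Assumption for the demo: the first five claims fail, the sixth succeeds.
        int id = allocate(counter, candidate -> ++attempts[0] > 5);
        System.out.println(id); // -2147483645, i.e. Integer.MIN_VALUE + 3
    }
}
```

Each failed claim costs one ID here, so the counter advances by the number of attempts rather than by the number of SSTables written.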
> Cassandra sstable generation ID's are a signed int which might overflow
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-21111
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21111
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Local/SSTable
> Reporter: Stefan Miklosovic
> Assignee: Stefan Miklosovic
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We need to triage this but this seems to be legit:
> We have hit an issue with a cluster where sstable generation IDs for a
> peer have incremented so high that they have overflowed, which has
> introduced a 2nd dash into the file name. For example:
> nb--2146021117-big-data.db
> This caused an issue with our solution, which aims to consolidate
> partition statistics in the data set. The current version of the solution
> does not recognise the possibility of sstable generation IDs being
> negative, which causes it to fail.
> Workaround:
> Rename all sstables in the table to “reset” the generation ID.
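For tooling that parses such file names, a sketch of a parser that tolerates a negative generation ID could look like the following. The {{version-generation-format-component}} layout is inferred only from the example name in the description, and {{SSTableNameSketch}} and {{parseGeneration}} are hypothetical names, not Cassandra APIs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for external tooling; not part of Cassandra.
final class SSTableNameSketch {

    // Assumed layout: <version>-<generation>-<format>-<component>, where the
    // generation may carry a leading minus sign after an int overflow.
    private static final Pattern NAME =
        Pattern.compile("^([a-z]+)-(-?\\d+)-([a-z]+)-(.+)$");

    // Returns the generation ID, or throws if the name does not match the assumed layout.
    static int parseGeneration(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised sstable file name: " + fileName);
        }
        return Integer.parseInt(m.group(2));
    }

    public static void main(String[] args) {
        System.out.println(parseGeneration("nb-42-big-data.db"));          // 42
        System.out.println(parseGeneration("nb--2146021117-big-data.db")); // -2146021117
    }
}
```

Splitting the name on every dash would yield an empty second field for the overflowed name, which is why the sketch matches an optional sign as part of the generation instead.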