[
https://issues.apache.org/jira/browse/CASSANDRA-18644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ke Han updated CASSANDRA-18644:
-------------------------------
Description:
h2. Symptoms
I was migrating from Cassandra-3.11.15 to Cassandra-4.1.2(Single node), and I
encountered the following error message when I drained and shutdown the 3.11.15
node.
{code:java}
INFO [MigrationStage:1] 2023-06-30 02:41:08,822 ColumnFamilyStore.java:432 -
Initializing uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi.dbqxbgocor
INFO [MutationStage-4] 2023-06-30 02:41:11,041 ColumnFamilyStore.java:2250 -
Truncating uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi
INFO [MutationStage-4] 2023-06-30 02:41:13,547 ColumnFamilyStore.java:2304 -
Truncate of uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi is complete
INFO [Native-Transport-Requests-5] 2023-06-30 02:41:13,552
MigrationManager.java:362 - Drop Keyspace 'uuidff0deb8dbf894fb1b3f59c0eafb94467'
INFO [IndexSummaryManager:1] 2023-06-30 02:41:36,223
IndexSummaryRedistribution.java:78 - Redistributing index summaries
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,637
StorageService.java:1660 - DRAINING: starting drain process
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,641
HintsService.java:210 - Paused hints dispatch
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 Server.java:179
- Stop listening for CQL clients
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645
Gossiper.java:1720 - Announcing shutdown
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,648
StorageService.java:2585 - Node /192.168.215.2 state jump to shutdown
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:48,652
MessagingService.java:985 - Waiting for messaging service to quiesce
INFO [ACCEPT-/192.168.215.2] 2023-06-30 02:41:48,655
MessagingService.java:1346 - MessagingService has terminated the accept() thread
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:52,618
HintsService.java:210 - Paused hints dispatch
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:56,835
StorageService.java:1660 - DRAINED
ERROR [MigrationStage:1] 2023-06-30 02:41:56,838 CassandraDaemon.java:244 -
Exception in thread Thread[MigrationStage:1,5,main]
java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.elementData(ArrayList.java:424)
at java.util.ArrayList.get(ArrayList.java:437)
at
org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289)
at
org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227)
at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390)
at
org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
at java.lang.Thread.run(Thread.java:750)
ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,839
QueryMessage.java:129 - Unexpected error during query
java.lang.RuntimeException: java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: -1
at
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442)
at
org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429)
at
org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363)
at
org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63)
at
org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123)
at
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243)
at
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274)
at
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259)
at
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116)
at
org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688)
at
org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438)
... 14 common frames omitted
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.elementData(ArrayList.java:424)
at java.util.ArrayList.get(ArrayList.java:437)
at
org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289)
at
org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227)
at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390)
at
org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
... 1 common frames omitted
ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,841
ErrorMessage.java:384 - Unexpected exception during request
java.lang.RuntimeException: java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: -1
at
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442)
at
org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429)
at
org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363)
at
org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63)
at
org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123)
at
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243)
at
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274)
at
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259)
at
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116)
at
org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688)
at
org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438)
... 14 common frames omitted
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.elementData(ArrayList.java:424)
at java.util.ArrayList.get(ArrayList.java:437)
at
org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289)
at
org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227)
at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390)
at
org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
... 1 common frames omitted
INFO [RMI TCP Connection(4)-127.0.0.1] 2023-06-30 02:41:59,326
CassandraDaemon.java:576 - Cassandra shutting down... {code}
The _java.lang.ArrayIndexOutOfBoundsException_ is thrown in MigrationStage
thread.
h2. Root Cause
>From my preliminary analysis, this bug is related to thread interleaving
>between
# Thread trying to shutdown the Cassandra instance.
# MigrationStage thread trying to recycle segments.
if user invokes {_}bin/nodetool stopdaemon{_}, the activeSegments in
AbstractCommitLogSegmentManager will be empty.
{code:java}
// AbstractCommitLogSegmentManager.java
/** Active segments, containing unflushed data. The tail of this queue is the
one we allocate writes to */private final
ConcurrentLinkedQueue<CommitLogSegment> activeSegments = new
ConcurrentLinkedQueue<>(); {code}
At this time, if the migration thread tries to invoke the forceRecycleAll
function, it throws _ArrayIndexOutOfBoundsException_ if there's no active
segment.
{code:java}
// AbstractCommitLogSegmentManager.java
void forceRecycleAll(Iterable<UUID> droppedCfs)
{
List<CommitLogSegment> segmentsToRecycle = new ArrayList<>(activeSegments);
CommitLogSegment last = segmentsToRecycle.get(segmentsToRecycle.size() - 1);
advanceAllocatingFrom(last); // wait for the commit log modifications
last.waitForModifications(); {code}
It seems that this ERROR won't impact the data integrity. But I am not sure
whether it will cause some problems in the segment recycling process.
I think this should be handled considering its ERROR level in system.log. It
might confuse users whether their data is still in an integrity state and
whether they can perform the migration from 3.x to 4.x.
I have attached my full system.log and cassandra.yaml file.
was:
h2. Symptoms
I was migrating from Cassandra-3.11.15 to Cassandra-4.1.2(Single node), and I
encountered the following error message when I drained and shutdown the 3.11.15
node.
{code:java}
INFO [MigrationStage:1] 2023-06-30 02:41:08,822 ColumnFamilyStore.java:432 -
Initializing uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi.dbqxbgocor
INFO [MutationStage-4] 2023-06-30 02:41:11,041 ColumnFamilyStore.java:2250 -
Truncating uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi
INFO [MutationStage-4] 2023-06-30 02:41:13,547 ColumnFamilyStore.java:2304 -
Truncate of uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi is complete
INFO [Native-Transport-Requests-5] 2023-06-30 02:41:13,552
MigrationManager.java:362 - Drop Keyspace 'uuidff0deb8dbf894fb1b3f59c0eafb94467'
INFO [IndexSummaryManager:1] 2023-06-30 02:41:36,223
IndexSummaryRedistribution.java:78 - Redistributing index summaries
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,637
StorageService.java:1660 - DRAINING: starting drain process
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,641
HintsService.java:210 - Paused hints dispatch
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645 Server.java:179
- Stop listening for CQL clients
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645
Gossiper.java:1720 - Announcing shutdown
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,648
StorageService.java:2585 - Node /192.168.215.2 state jump to shutdown
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:48,652
MessagingService.java:985 - Waiting for messaging service to quiesce
INFO [ACCEPT-/192.168.215.2] 2023-06-30 02:41:48,655
MessagingService.java:1346 - MessagingService has terminated the accept() thread
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:52,618
HintsService.java:210 - Paused hints dispatch
INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:56,835
StorageService.java:1660 - DRAINED
ERROR [MigrationStage:1] 2023-06-30 02:41:56,838 CassandraDaemon.java:244 -
Exception in thread Thread[MigrationStage:1,5,main]
java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.elementData(ArrayList.java:424)
at java.util.ArrayList.get(ArrayList.java:437)
at
org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289)
at
org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227)
at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390)
at
org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
at java.lang.Thread.run(Thread.java:750)
ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,839
QueryMessage.java:129 - Unexpected error during query
java.lang.RuntimeException: java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: -1
at
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442)
at
org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429)
at
org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363)
at
org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63)
at
org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123)
at
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243)
at
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274)
at
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259)
at
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116)
at
org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688)
at
org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438)
... 14 common frames omitted
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.elementData(ArrayList.java:424)
at java.util.ArrayList.get(ArrayList.java:437)
at
org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289)
at
org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227)
at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390)
at
org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
... 1 common frames omitted
ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,841
ErrorMessage.java:384 - Unexpected exception during request
java.lang.RuntimeException: java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: -1
at
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442)
at
org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429)
at
org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363)
at
org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63)
at
org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123)
at
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243)
at
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274)
at
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259)
at
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116)
at
org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688)
at
org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438)
... 14 common frames omitted
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.elementData(ArrayList.java:424)
at java.util.ArrayList.get(ArrayList.java:437)
at
org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289)
at
org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227)
at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413)
at
org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390)
at
org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
... 1 common frames omitted
INFO [RMI TCP Connection(4)-127.0.0.1] 2023-06-30 02:41:59,326
CassandraDaemon.java:576 - Cassandra shutting down... {code}
The _java.lang.ArrayIndexOutOfBoundsException_ is thrown in MigrationStage
thread.
h2. Root Cause
This bug is related to thread interleaving between
# Thread trying to shutdown the Cassandra instance.
# MigrationStage thread trying to recycle segments.
if user invokes {_}bin/nodetool stopdaemon{_}, the activeSegments in
AbstractCommitLogSegmentManager will be empty.
{code:java}
// AbstractCommitLogSegmentManager.java
/** Active segments, containing unflushed data. The tail of this queue is the
one we allocate writes to */private final
ConcurrentLinkedQueue<CommitLogSegment> activeSegments = new
ConcurrentLinkedQueue<>(); {code}
At this time, if the migration thread tries to invoke the forceRecycleAll
function, it throws ArrayIndexOutOfBoundsException if there's no active segment.
{code:java}
// AbstractCommitLogSegmentManager.java
void forceRecycleAll(Iterable<UUID> droppedCfs)
{
List<CommitLogSegment> segmentsToRecycle = new ArrayList<>(activeSegments);
CommitLogSegment last = segmentsToRecycle.get(segmentsToRecycle.size() - 1);
advanceAllocatingFrom(last); // wait for the commit log modifications
last.waitForModifications(); {code}
It seems that this ERROR won't impact the data integrity. But I am not sure
whether it will cause some problems in the segment recycling process.
I think this should be handled considering its ERROR level. It might confuse
users whether their data is still in an integrity state => whether they can
perform the migration from 3.x to 4.x.
I have attached my full system.log and cassandra.yaml file.
> ERROR: ArrayIndexOutOfBoundsException: -1 in MigrationStage thread when
> shutdown Cassandra-3.11.15 node
> -------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-18644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18644
> Project: Cassandra
> Issue Type: Bug
> Reporter: Ke Han
> Priority: Normal
> Attachments: cassandra.yaml, system.log
>
>
> h2. Symptoms
> I was migrating from Cassandra-3.11.15 to Cassandra-4.1.2(Single node), and I
> encountered the following error message when I drained and shutdown the
> 3.11.15 node.
> {code:java}
> INFO [MigrationStage:1] 2023-06-30 02:41:08,822 ColumnFamilyStore.java:432 -
> Initializing uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi.dbqxbgocor
> INFO [MutationStage-4] 2023-06-30 02:41:11,041 ColumnFamilyStore.java:2250 -
> Truncating uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi
> INFO [MutationStage-4] 2023-06-30 02:41:13,547 ColumnFamilyStore.java:2304 -
> Truncate of uuidff0deb8dbf894fb1b3f59c0eafb94467.xlxzwsuwi is complete
> INFO [Native-Transport-Requests-5] 2023-06-30 02:41:13,552
> MigrationManager.java:362 - Drop Keyspace
> 'uuidff0deb8dbf894fb1b3f59c0eafb94467'
> INFO [IndexSummaryManager:1] 2023-06-30 02:41:36,223
> IndexSummaryRedistribution.java:78 - Redistributing index summaries
> INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,637
> StorageService.java:1660 - DRAINING: starting drain process
> INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,641
> HintsService.java:210 - Paused hints dispatch
> INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645
> Server.java:179 - Stop listening for CQL clients
> INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,645
> Gossiper.java:1720 - Announcing shutdown
> INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:46,648
> StorageService.java:2585 - Node /192.168.215.2 state jump to shutdown
> INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:48,652
> MessagingService.java:985 - Waiting for messaging service to quiesce
> INFO [ACCEPT-/192.168.215.2] 2023-06-30 02:41:48,655
> MessagingService.java:1346 - MessagingService has terminated the accept()
> thread
> INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:52,618
> HintsService.java:210 - Paused hints dispatch
> INFO [RMI TCP Connection(2)-127.0.0.1] 2023-06-30 02:41:56,835
> StorageService.java:1660 - DRAINED
> ERROR [MigrationStage:1] 2023-06-30 02:41:56,838 CassandraDaemon.java:244 -
> Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:424)
> at java.util.ArrayList.get(ArrayList.java:437)
> at
> org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289)
> at
> org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227)
> at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390)
> at
> org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464)
> at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
> at java.lang.Thread.run(Thread.java:750)
> ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,839
> QueryMessage.java:129 - Unexpected error during query
> java.lang.RuntimeException: java.util.concurrent.ExecutionException:
> java.lang.ArrayIndexOutOfBoundsException: -1
> at
> org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442)
> at
> org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429)
> at
> org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363)
> at
> org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63)
> at
> org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123)
> at
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243)
> at
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274)
> at
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259)
> at
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116)
> at
> org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688)
> at
> org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113)
> at java.lang.Thread.run(Thread.java:750)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438)
> ... 14 common frames omitted
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:424)
> at java.util.ArrayList.get(ArrayList.java:437)
> at
> org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289)
> at
> org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227)
> at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390)
> at
> org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464)
> at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
> ... 1 common frames omitted
> ERROR [Native-Transport-Requests-5] 2023-06-30 02:41:56,841
> ErrorMessage.java:384 - Unexpected exception during request
> java.lang.RuntimeException: java.util.concurrent.ExecutionException:
> java.lang.ArrayIndexOutOfBoundsException: -1
> at
> org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:442)
> at
> org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:429)
> at
> org.apache.cassandra.service.MigrationManager.announceKeyspaceDrop(MigrationManager.java:363)
> at
> org.apache.cassandra.cql3.statements.DropKeyspaceStatement.announceMigration(DropKeyspaceStatement.java:63)
> at
> org.apache.cassandra.cql3.statements.SchemaAlteringStatement.execute(SchemaAlteringStatement.java:123)
> at
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:243)
> at
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:274)
> at
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:259)
> at
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116)
> at
> org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:688)
> at
> org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:594)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:113)
> at java.lang.Thread.run(Thread.java:750)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:438)
> ... 14 common frames omitted
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:424)
> at java.util.ArrayList.get(ArrayList.java:437)
> at
> org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:289)
> at
> org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:227)
> at org.apache.cassandra.config.Schema.dropKeyspace(Schema.java:663)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1428)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1413)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1390)
> at
> org.apache.cassandra.service.MigrationManager$1.runMayThrow(MigrationManager.java:464)
> at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
> ... 1 common frames omitted
> INFO [RMI TCP Connection(4)-127.0.0.1] 2023-06-30 02:41:59,326
> CassandraDaemon.java:576 - Cassandra shutting down... {code}
> The _java.lang.ArrayIndexOutOfBoundsException_ is thrown in MigrationStage
> thread.
> h2. Root Cause
> From my preliminary analysis, this bug is related to thread interleaving
> between
> # Thread trying to shutdown the Cassandra instance.
> # MigrationStage thread trying to recycle segments.
> if user invokes {_}bin/nodetool stopdaemon{_}, the activeSegments in
> AbstractCommitLogSegmentManager will be empty.
> {code:java}
> // AbstractCommitLogSegmentManager.java
> /** Active segments, containing unflushed data. The tail of this queue is the
> one we allocate writes to */private final
> ConcurrentLinkedQueue<CommitLogSegment> activeSegments = new
> ConcurrentLinkedQueue<>(); {code}
> At this time, if the migration thread tries to invoke the forceRecycleAll
> function, it throws _ArrayIndexOutOfBoundsException_ if there's no active
> segment.
> {code:java}
> // AbstractCommitLogSegmentManager.java
> void forceRecycleAll(Iterable<UUID> droppedCfs)
> {
> List<CommitLogSegment> segmentsToRecycle = new
> ArrayList<>(activeSegments);
> CommitLogSegment last = segmentsToRecycle.get(segmentsToRecycle.size() -
> 1);
> advanceAllocatingFrom(last); // wait for the commit log modifications
> last.waitForModifications(); {code}
> It seems that this ERROR won't impact the data integrity. But I am not sure
> whether it will cause some problems in the segment recycling process.
> I think this should be handled considering its ERROR level in system.log. It
> might confuse users whether their data is still in an integrity state and
> whether they can perform the migration from 3.x to 4.x.
> I have attached my full system.log and cassandra.yaml file.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]