[ 
https://issues.apache.org/jira/browse/CASSANDRA-21365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Kumar updated CASSANDRA-21365:
-----------------------------------
    Summary: Premature UJ→UN transition when restarting gossip on joining node 
during bootstrap  (was: Premature UJ→UN Transition During Bootstrap Scale-Up)

> Premature UJ→UN transition when restarting gossip on joining node during 
> bootstrap
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21365
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21365
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Amit Kumar
>            Priority: Normal
>
> We had a node joining the cluster that was started with 
> -Dcassandra.join_ring=false. The node was streaming data from another node 
> (state: UJ, bootstrap = true) when its gossip was restarted (stop and start) 
> during a maintenance event. The restart moved the node from UJ -> UN before 
> it had streamed all of its data.
> Code Reference:
> [https://github.com/apache/cassandra/blob/cassandra-4.0/src/java/org/apache/cassandra/service/StorageService.java#L508]
> Version : 4.0.13
> Logs (the node had not finished streaming but moved from UJ -> UN after the 
> gossip restart): 
> INFO  [RMI TCP Connection(4)-10.134.160.29] 2026-04-09 20:45:48,005 
> StorageService.java:1219 - Joining ring by operator request
> INFO  [RMI TCP Connection(4)-10.134.160.29] 2026-04-09 20:45:48,060 
> StorageService.java:1732 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(4)-10.134.160.29] 2026-04-09 20:46:42,162 
> StorageService.java:1732 - JOINING: Starting to bootstrap...
> WARN  [RMI TCP Connection(2030)-10.134.163.199] 2026-04-13 03:02:37,604 
> StorageService.java:447 - Stopping gossip by operator request
> WARN  [RMI TCP Connection(2038)-10.134.163.199] 2026-04-13 03:15:38,965 
> StorageService.java:466 - Starting gossip by operator request
> INFO  [RMI TCP Connection(2038)-10.134.163.199] 2026-04-13 03:15:38,982 
> StorageService.java:2913 - Node /10.xxx.xxx.xxx:7000 state jump to NORMAL
> Error logs: 
> ERROR [ReadStage-1] 2026-04-13 15:56:21,967 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[ReadStage-1,5,main]
> java.lang.RuntimeException: Cannot service reads while bootstrapping!
>         at 
> org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:46)
>         at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
>         at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
>         at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
>         at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:432)
>         at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>         at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:165)
>         at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:137)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>         at java.base/java.lang.Thread.run(Thread.java:840)
> Possible solution: when gossip is enabled on a node that is still in 
> bootstrap mode, advertise the JOINING (bootstrapping) status instead of 
> NORMAL.
> public void setGossipTokens(Collection<Token> tokens)
>     {
>         List<Pair<ApplicationState, VersionedValue>> states = new 
> ArrayList<Pair<ApplicationState, VersionedValue>>();
>         states.add(Pair.create(ApplicationState.TOKENS, 
> valueFactory.tokens(tokens)));
>         states.add(Pair.create(ApplicationState.STATUS_WITH_PORT, 
> valueFactory.normal(tokens)));
>         states.add(Pair.create(ApplicationState.STATUS, isBootstrapMode() ? 
> valueFactory.bootstrapping(tokens) : valueFactory.normal(tokens)));
>         Gossiper.instance.addLocalApplicationStates(states);
>     }
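
The decision proposed above can be illustrated in isolation with a minimal,
self-contained sketch. It is not Cassandra code: the class
GossipStatusSketch, the Status enum, and both method names are hypothetical
stand-ins, and the "current" method only models the behaviour described in
this report (NORMAL is advertised whenever gossip is started).

```java
public class GossipStatusSketch {
    // Hypothetical stand-in for the two gossip statuses relevant to this report.
    enum Status { BOOTSTRAPPING, NORMAL }

    // Models the reported behaviour: NORMAL is advertised on gossip start,
    // regardless of whether the node is still bootstrapping.
    static Status statusAdvertisedCurrently(boolean bootstrapMode) {
        return Status.NORMAL;
    }

    // Models the proposed behaviour: keep advertising BOOTSTRAPPING while the
    // node is in bootstrap mode, and NORMAL only afterwards.
    static Status statusToAdvertise(boolean bootstrapMode) {
        return bootstrapMode ? Status.BOOTSTRAPPING : Status.NORMAL;
    }

    public static void main(String[] args) {
        for (boolean bootstrapMode : new boolean[] { true, false }) {
            System.out.println("bootstrapMode=" + bootstrapMode
                    + " current=" + statusAdvertisedCurrently(bootstrapMode)
                    + " proposed=" + statusToAdvertise(bootstrapMode));
        }
    }
}
```

The only case in which the two methods differ is bootstrapMode == true, which
is the UJ node described in this report.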



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
