[jira] [Updated] (CASSANDRA-19729) Cassandra node failed to startup in dtests when all nodes in the cluster are shutdown and ongoing a restart

ConfX (Jira) Thu, 27 Jun 2024 09:32:16 -0700


     [ https://issues.apache.org/jira/browse/CASSANDRA-19729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


ConfX updated CASSANDRA-19729:
------------------------------
    Description: 
h3. *What happened*

During an upgrade test, when all nodes in the cluster are shut down and then 
restarted, a node instance fails to restart with an `IllegalStateException` in 
the Cassandra distributed tests.
h3. *How to reproduce*

Put the following code under 
`cassandra/test/distributed/org/apache/cassandra/distributed/upgrade`.

```java
package org.apache.cassandra.distributed.upgrade;

import java.util.concurrent.TimeUnit;

import org.junit.Test;

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

public class JVMDTestUpgradeTest extends UpgradeTestBase
{
    @Test
    public void shutdownAllAndRestart() throws Throwable
    {
        new TestCase()
                .nodes(2)
                .nodesToUpgrade(1)
                .upgradesToCurrentFrom(v3X)
                .setup((cluster) -> {
                    cluster.schemaChangeIgnoringStoppedInstances("CREATE TABLE " + KEYSPACE + ".tbl1 (id int primary key, i int)");
                })
                .runAfterNodeUpgrade((cluster, node) -> {
                    // Shut down every node in the cluster.
                    cluster.get(2).shutdown(true).get(1, TimeUnit.MINUTES);
                    cluster.get(1).shutdown(true).get(1, TimeUnit.MINUTES);
                    assertTrue(cluster.get(1).isShutdown());
                    assertTrue(cluster.get(2).isShutdown());

                    // Restart both nodes; the exception is thrown during startup.
                    cluster.get(1).startup();
                    cluster.get(2).startup();
                    assertFalse(cluster.get(1).isShutdown());
                    assertFalse(cluster.get(2).isShutdown());
                }).run();
    }
}
```
Build and run the test above with any dtest version jars. In my case I used 
`dtest-4.0.1.jar` and `dtest-4.0.2.jar`.
Run it with the following command:
```bash
$ ant test-jvm-dtest-some -Duse.jdk11=true \
    -Dtest.name=org.apache.cassandra.distributed.upgrade.JVMDTestUpgradeTest
```
You will see the following error message:
```bash
[junit-timeout] Caused by: java.lang.IllegalStateException: Can't use shutdown instances, delegate is null
[junit-timeout]         at org.apache.cassandra.distributed.impl.AbstractCluster$Wrapper.delegate(AbstractCluster.java:283)
[junit-timeout]         at org.apache.cassandra.distributed.impl.DelegatingInvokableInstance.getMessagingVersion(DelegatingInvokableInstance.java:90)
[junit-timeout]         at org.apache.cassandra.distributed.action.GossipHelper.unsafeStatusToNormal(GossipHelper.java:89)
[junit-timeout]         at org.apache.cassandra.distributed.impl.Instance.lambda$startup$9(Instance.java:555)
[junit-timeout]         at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
[junit-timeout]         at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
[junit-timeout]         at org.apache.cassandra.distributed.impl.Instance.lambda$startup$10(Instance.java:551)
[junit-timeout]         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
[junit-timeout]         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[junit-timeout]         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[junit-timeout]         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[junit-timeout]         at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[junit-timeout]         at java.base/java.lang.Thread.run(Thread.java:829)
```

However, this bug only occurs when all nodes in the cluster are shut down and 
then restarted. If only some of the nodes are shut down, the restart succeeds: 
for example, the code below shuts down one of the two nodes and restarts it 
without any problem.
```java
  @Test
  public void shutdownOneAndRestart() throws Throwable
  {
      new TestCase()
              .nodes(2)
              .nodesToUpgrade(1)
              .upgradesToCurrentFrom(v3X)
              .setup((cluster) -> {
                  cluster.schemaChangeIgnoringStoppedInstances("CREATE TABLE " + KEYSPACE + ".tbl1 (id int primary key, i int)");
              })
              .runAfterNodeUpgrade((cluster, node) -> {
                  cluster.get(2).shutdown(true).get(1, TimeUnit.MINUTES);
                  assertTrue(cluster.get(2).isShutdown());

                  cluster.get(2).startup();
                  assertFalse(cluster.get(1).isShutdown());
                  assertFalse(cluster.get(2).isShutdown());
              }).run();
  }
```
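
One possible reading of the stack trace is that a starting instance asks each 
other instance in the cluster for its messaging version, and that call throws 
when the other instance is still shut down. The sketch below is a simplified 
model of that reading only. It is an assumption drawn from the trace, not taken 
from the Cassandra source: `ShutdownRestartSketch`, `Node`, `restartSucceeds` 
and `PLACEHOLDER_VERSION` are hypothetical names, and none of this is the dtest 
API.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the failure; none of these classes are Cassandra's. */
final class ShutdownRestartSketch
{
    // Arbitrary value; the real messaging version is irrelevant to the model.
    static final int PLACEHOLDER_VERSION = 1;

    /** Hypothetical stand-in for a cluster wrapper whose delegate is null while the node is down. */
    static final class Node
    {
        private Integer messagingVersion = PLACEHOLDER_VERSION; // null models "delegate is null"

        boolean isShutdown() { return messagingVersion == null; }

        void shutdown() { messagingVersion = null; }

        int getMessagingVersion()
        {
            if (messagingVersion == null)
                throw new IllegalStateException("Can't use shutdown instances, delegate is null");
            return messagingVersion;
        }

        /** Assumed startup behaviour: come up, then query every peer's messaging version. */
        void startup(List<Node> cluster)
        {
            messagingVersion = PLACEHOLDER_VERSION;
            for (Node peer : cluster)
                if (peer != this)
                    peer.getMessagingVersion(); // throws if the peer is still down
        }
    }

    /** Returns true if the restart completes without an IllegalStateException. */
    static boolean restartSucceeds(boolean shutdownBoth)
    {
        List<Node> cluster = new ArrayList<>();
        cluster.add(new Node());
        cluster.add(new Node());

        cluster.get(1).shutdown();
        if (shutdownBoth)
            cluster.get(0).shutdown();

        try
        {
            if (shutdownBoth)
                cluster.get(0).startup(cluster); // its only peer is still down
            cluster.get(1).startup(cluster);
            return true;
        }
        catch (IllegalStateException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        System.out.println("shut down one node, restart:   " + restartSucceeds(false));
        System.out.println("shut down both nodes, restart: " + restartSucceeds(true));
    }
}
```

In this model, restarting a single node succeeds because its peer is still 
running, while restarting after a full shutdown fails on the first `startup` 
call because the only peer is still down. That matches the two test cases 
above, but it does not confirm the actual root cause.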

 

 

  was:
### What happened

During an upgrade test, when all nodes in the cluster are shut down and then 
restarted, a node instance fails to restart with an `IllegalStateException` in 
the Cassandra distributed tests.

### How to reproduce

Put the following code under 
`cassandra/test/distributed/org/apache/cassandra/distributed/upgrade`.

```java
package org.apache.cassandra.distributed.upgrade;
public class JVMDTestUpgradeTest extends UpgradeTestBase
{
    @Test
    public void shutdownAllAndRestart() throws Throwable
    {
        new TestCase()
                .nodes(2)
                .nodesToUpgrade(1)
                .upgradesToCurrentFrom(v3X)
                .setup((cluster) -> {
                    cluster.schemaChangeIgnoringStoppedInstances("CREATE TABLE " + KEYSPACE + ".tbl1 (id int primary key, i int)");
                })
                .runAfterNodeUpgrade((cluster, node) -> {
                    cluster.get(2).shutdown(true).get(1, TimeUnit.MINUTES);
                    cluster.get(1).shutdown(true).get(1, TimeUnit.MINUTES);
                    assertTrue(cluster.get(1).isShutdown());
                    assertTrue(cluster.get(2).isShutdown());

                    cluster.get(1).startup();
                    cluster.get(2).startup();
                    assertFalse(cluster.get(1).isShutdown());
                    assertFalse(cluster.get(2).isShutdown());
        }).run();
    }
}
```
Build and run the above tests with any dtest version jars. In my case I'm using 
`dtest-4.0.1.jar` and `dtest-4.0.2.jar`.
Run it with the following command:
```bash
$ ant test-jvm-dtest-some -Duse.jdk11=true \
    -Dtest.name=org.apache.cassandra.distributed.upgrade.JVMDTestUpgradeTest
```
You will see the following error message:
```bash
[junit-timeout] Caused by: java.lang.IllegalStateException: Can't use shutdown instances, delegate is null
[junit-timeout]         at org.apache.cassandra.distributed.impl.AbstractCluster$Wrapper.delegate(AbstractCluster.java:283)
[junit-timeout]         at org.apache.cassandra.distributed.impl.DelegatingInvokableInstance.getMessagingVersion(DelegatingInvokableInstance.java:90)
[junit-timeout]         at org.apache.cassandra.distributed.action.GossipHelper.unsafeStatusToNormal(GossipHelper.java:89)
[junit-timeout]         at org.apache.cassandra.distributed.impl.Instance.lambda$startup$9(Instance.java:555)
[junit-timeout]         at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
[junit-timeout]         at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
[junit-timeout]         at org.apache.cassandra.distributed.impl.Instance.lambda$startup$10(Instance.java:551)
[junit-timeout]         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
[junit-timeout]         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[junit-timeout]         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[junit-timeout]         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[junit-timeout]         at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[junit-timeout]         at java.base/java.lang.Thread.run(Thread.java:829)
```

However, this bug only occurs when all nodes in the cluster are shut down and 
then restarted. If only some of the nodes are shut down, the restart succeeds: 
for example, the code below shuts down one of the two nodes and restarts it 
without any problem.
```java
  @Test
  public void shutdownOneAndRestart() throws Throwable
  {
      new TestCase()
              .nodes(2)
              .nodesToUpgrade(1)
              .upgradesToCurrentFrom(v3X)
              .setup((cluster) -> {
                  cluster.schemaChangeIgnoringStoppedInstances("CREATE TABLE " + KEYSPACE + ".tbl1 (id int primary key, i int)");
              })
              .runAfterNodeUpgrade((cluster, node) -> {
                  cluster.get(2).shutdown(true).get(1, TimeUnit.MINUTES);
                  assertTrue(cluster.get(2).isShutdown());

                  cluster.get(2).startup();
                  assertFalse(cluster.get(1).isShutdown());
                  assertFalse(cluster.get(2).isShutdown());
              }).run();
  }
```

 

 


> Cassandra node failed to startup in dtests when all nodes in the cluster are 
> shutdown and ongoing a restart
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19729
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19729
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/java
>            Reporter: ConfX
>            Priority: Normal
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
