[
https://issues.apache.org/jira/browse/CASSANDRA-16588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326000#comment-17326000
]
Brandon Williams commented on CASSANDRA-16588:
----------------------------------------------
||Patch|CI||
|[3.11|https://github.com/driftx/cassandra/tree/CASSANDRA-16588]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/691/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/691/pipeline]|
|[trunk|https://github.com/driftx/cassandra/tree/CASSANDRA-16588]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/692/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/692/pipeline]|
We have to be careful in the test not to let StorageService get too far in
joining the ring; otherwise all kinds of things start up that are not easily
restartable. Instead, we can test checkForEndpointCollision directly.
> NPE getting host_id in Gossiper.isSafeForStartup
> ------------------------------------------------
>
> Key: CASSANDRA-16588
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16588
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Brandon Williams
> Assignee: Brandon Williams
> Priority: Normal
> Fix For: 3.11.x, 4.0-rc
>
>
> As seen here:
> https://ci-cassandra.apache.org/job/Cassandra-devbranch/604/testReport/junit/org.apache.cassandra.distributed.upgrade/MixedModeGossipTest/testStatusFieldShouldExistInOldVersionNodesEdgeCase/
> {noformat}
> java.lang.NullPointerException
> at org.apache.cassandra.gms.Gossiper.isSafeForStartup(Gossiper.java:952)
> at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:657)
> at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:933)
> at org.apache.cassandra.service.StorageService.initServer(StorageService.java:784)
> at org.apache.cassandra.service.StorageService.initServer(StorageService.java:729)
> at org.apache.cassandra.distributed.impl.Instance.lambda$startup$10(Instance.java:541)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> I believe a GossipDigestAck is queued on the seed to ack the node's shutdown
> state, but it isn't actually sent until the node has restarted and entered the
> shadow round. Since the ack contains the node's IP, isSafeForStartup assumes a
> host_id will be present; because this is not an actual shadow round response,
> it is not, hence the NPE.
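
To illustrate the failure mode described above, here is a minimal, self-contained
sketch. It is not the code from Gossiper or from the patch: the class name, the
method names, and the plain Map standing in for an endpoint's gossip state are all
assumptions made for illustration. It only shows why a state entry without a
host_id trips an unguarded lookup, and what a null check looks like.

```java
import java.util.Map;
import java.util.UUID;

// Hypothetical model: a Map<String, String> stands in for one endpoint's gossip state.
class HostIdGuardSketch {
    // Hypothetical key standing in for the host_id application state.
    static final String HOST_ID = "HOST_ID";

    // Unguarded: assumes the entry is always present. If the state came from a
    // stale ack rather than a real shadow round response, get() returns null
    // and UUID.fromString(null) throws NullPointerException.
    static boolean matchesUnguarded(Map<String, String> state, UUID localHostId) {
        return UUID.fromString(state.get(HOST_ID)).equals(localHostId);
    }

    // Guarded: checks for the missing entry first. Returning false here is an
    // assumption of this sketch; what to do in that case is a policy decision.
    static boolean matchesGuarded(Map<String, String> state, UUID localHostId) {
        String previous = state.get(HOST_ID);
        if (previous == null) {
            return false;
        }
        return UUID.fromString(previous).equals(localHostId);
    }
}
```

Calling matchesUnguarded with an empty map reproduces the same kind of
NullPointerException as in the trace, while matchesGuarded returns without throwing.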