[
https://issues.apache.org/jira/browse/CASSANDRA-15592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041795#comment-17041795
]
Marcus Olsson commented on CASSANDRA-15592:
-------------------------------------------
Sure thing.
To keep the test case small and avoid starting up the full Gossiper, I made
"doStatusCheck()" package-private (annotated with @VisibleForTesting).
The test case injects an application state for a "remote" node, marks it down,
removes it, and then sets an expiry time earlier than "now" before running
"doStatusCheck()", so that the node should be evicted.
I believe this represents the state transition that occurred, but gossip
state transitions are not my strong suit, so please correct me if I'm wrong.
I'll update the branches shortly.
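The scenario and the proposed fix can be sketched in a self-contained form.
This is a simplified model under stated assumptions, not the real Gossiper or
the actual test case: the class MiniGossiper, its two maps, the string
endpoints and the "GossipStage:1" thread name are hypothetical stand-ins. Only
the method names mirror those in the stack trace and the issue description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical, simplified model; NOT org.apache.cassandra.gms.Gossiper.
class MiniGossiper
{
    // Assumed name for the single thread that is allowed to mutate state.
    private static final String STAGE_THREAD = "GossipStage:1";

    // Single daemon thread standing in for the gossip stage.
    private final ExecutorService gossipStage = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, STAGE_THREAD);
        t.setDaemon(true);
        return t;
    });

    // endpoint -> state, and endpoint -> expiry time in milliseconds
    final Map<String, String> endpointStates = new ConcurrentHashMap<>();
    final Map<String, Long> expireTimes = new ConcurrentHashMap<>();

    private static boolean onStageThread()
    {
        return STAGE_THREAD.equals(Thread.currentThread().getName());
    }

    // Rejects state mutations attempted from any thread other than the stage thread.
    private void checkProperThreadForStateMutation()
    {
        if (!onStageThread())
            throw new IllegalStateException("Attempting gossip state mutation from illegal thread: "
                                            + Thread.currentThread().getName());
    }

    void evictFromMembership(String endpoint)
    {
        checkProperThreadForStateMutation();
        endpointStates.remove(endpoint);
        expireTimes.remove(endpoint);
    }

    // Runs the task on the stage thread and waits for it to finish.
    void runInGossipStageBlocking(Runnable task)
    {
        if (onStageThread())
        {
            task.run(); // already on the stage thread; submitting would deadlock
            return;
        }
        try
        {
            gossipStage.submit(task).get();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        catch (ExecutionException e)
        {
            throw new RuntimeException(e.getCause());
        }
    }

    // Periodic check: evicts every endpoint whose expiry time has passed.
    void doStatusCheck(long nowMillis)
    {
        for (Map.Entry<String, Long> entry : expireTimes.entrySet())
        {
            if (entry.getValue() < nowMillis)
            {
                String endpoint = entry.getKey();
                // The fix: hand the mutation over to the stage thread.
                runInGossipStageBlocking(() -> evictFromMembership(endpoint));
            }
        }
    }
}
```

In this model, calling doStatusCheck() from any thread evicts expired
endpoints on the stage thread, while calling evictFromMembership() directly
from another thread raises the IllegalStateException.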
> IllegalStateException in gossip after removing node
> ---------------------------------------------------
>
> Key: CASSANDRA-15592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15592
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Marcus Olsson
> Assignee: Marcus Olsson
> Priority: Normal
>
> In one of our test environments we encountered the following exception:
> {noformat}
> 2020-02-02T10:50:13.276+0100 [GossipTasks:1] ERROR o.a.c.u.NoSpamLogger$NoSpamLogStatement:97 log
> java.lang.IllegalStateException: Attempting gossip state mutation from illegal thread: GossipTasks:1
> at org.apache.cassandra.gms.Gossiper.checkProperThreadForStateMutation(Gossiper.java:178)
> at org.apache.cassandra.gms.Gossiper.evictFromMembership(Gossiper.java:465)
> at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:895)
> at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:78)
> at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:240)
> at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.lang.Thread.run(Thread.java:748)
> java.lang.IllegalStateException: Attempting gossip state mutation from illegal thread: GossipTasks:1
> at org.apache.cassandra.gms.Gossiper.checkProperThreadForStateMutation(Gossiper.java:178) [apache-cassandra-3.11.5.jar:3.11.5]
> at org.apache.cassandra.gms.Gossiper.evictFromMembership(Gossiper.java:465) [apache-cassandra-3.11.5.jar:3.11.5]
> at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:895) [apache-cassandra-3.11.5.jar:3.11.5]
> at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:78) [apache-cassandra-3.11.5.jar:3.11.5]
> at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:240) [apache-cassandra-3.11.5.jar:3.11.5]
> at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) [apache-cassandra-3.11.5.jar:3.11.5]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_231]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_231]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_231]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_231]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_231]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_231]
> at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.5.jar:3.11.5]
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-all-4.1.42.Final.jar:4.1.42.Final]
> at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_231]
> {noformat}
> Since CASSANDRA-15059 we check that all state changes are performed in the
> GossipStage, but it seems the eviction is still performed in the "current"
> thread
> [here|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/gms/Gossiper.java#L895].
> It should be as simple as adding a
> {code:java}
> runInGossipStageBlocking(() ->)
> {code}
> for it.
> I'll upload patches for 3.0, 3.11 and 4.0.
>