GossipTimerTask stops running if an Exception occurs
----------------------------------------------------
Key: CASSANDRA-1289
URL: https://issues.apache.org/jira/browse/CASSANDRA-1289
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.6.3, 0.6.2, 0.6.1, 0.6, 0.7
Reporter: Wade Simmons

The GossipTimerTask run() method has a try/catch around its body, but it
re-throws every Exception as a RuntimeException. An unchecked exception that
escapes a TimerTask terminates the underlying java.util.Timer thread, so the
GossipTimerTask is never run again and the periodic gossip status checks stop.
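
To illustrate the Timer behavior described above, here is a minimal standalone
sketch. It does not use any Cassandra code; the class name TimerDeathDemo and
the simulated failure are assumptions made purely for the demonstration.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

public class TimerDeathDemo {
    /**
     * Schedules a repeating task that throws on every run and returns how
     * many times it actually executed within roughly half a second.
     */
    static int runsBeforeTimerDies() {
        final AtomicInteger runs = new AtomicInteger();
        // Daemon timer so the demo never keeps the JVM alive.
        Timer timer = new Timer("demo-timer", true);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                runs.incrementAndGet();
                // Stand-in for the re-thrown exception in the gossip task.
                throw new RuntimeException("simulated gossip failure");
            }
        }, 0, 50); // nominally every 50 ms

        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        // The timer thread dies after the first exception, so the task
        // is not executed again despite the 50 ms period.
        System.out.println("runs = " + runsBeforeTimerDies());
    }
}
```

The task is scheduled at a 50 ms period but executes only once: the uncaught
RuntimeException kills the timer thread, and the Timer then behaves as if it
had been cancelled.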

Combine this problem with a bug like CASSANDRA-757 (not yet fixed in 0.6.x) and
you get into a state where the server keeps running but gossip has stopped,
so nodes can no longer be added or removed.

I see two potential choices:
1) Log the error but don't re-throw it, so that the GossipTimerTask will
continue to run on its next interval.
2) Shut down the server, since continuing to run without gossip subtly breaks
other functionality / knowledge of other nodes.
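
As a rough sketch of choice 1, the task body can be wrapped so that failures
are logged rather than re-thrown. This is not the actual Cassandra code: the
names ResilientTimerTaskDemo and LoggingTimerTask are hypothetical, and
java.util.logging is used only to keep the example self-contained.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ResilientTimerTaskDemo {
    private static final Logger logger =
            Logger.getLogger(ResilientTimerTaskDemo.class.getName());

    /** Hypothetical wrapper: runs the delegate, logs failures, never re-throws. */
    static class LoggingTimerTask extends TimerTask {
        private final Runnable delegate;

        LoggingTimerTask(Runnable delegate) {
            this.delegate = delegate;
        }

        @Override
        public void run() {
            try {
                delegate.run();
            } catch (Exception e) {
                // Swallow after logging so the Timer thread survives
                // and the task runs again on the next interval.
                logger.log(Level.SEVERE, "Periodic task failed; will retry", e);
            }
        }
    }

    /** Returns how many times a always-failing task ran in about half a second. */
    static int runsWithinHalfSecond() {
        final AtomicInteger runs = new AtomicInteger();
        Timer timer = new Timer("resilient-demo", true);
        timer.schedule(new LoggingTimerTask(() -> {
            runs.incrementAndGet();
            throw new IllegalStateException("simulated gossip failure");
        }), 0, 50);

        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        timer.cancel();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("runs = " + runsWithinHalfSecond());
    }
}
```

Here the task keeps running on every interval even though each run fails.
For choice 2, the catch block would instead initiate a shutdown after logging;
how best to do that inside the server is left open here.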