GossipTimerTask stops running if an Exception occurs
----------------------------------------------------

                 Key: CASSANDRA-1289
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1289
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.6.3, 0.6.2, 0.6.1, 0.6, 0.7
            Reporter: Wade Simmons


The GossipTimerTask run() method has a try/catch around its body, but it 
re-throws all Exceptions as RuntimeExceptions. An uncaught exception thrown 
from a TimerTask terminates the underlying java.util.Timer thread, so the 
GossipTimerTask never runs again and the periodic gossip status checks stop.
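
The Timer behavior described above can be reproduced in isolation. The 
following is a minimal sketch, not Cassandra code: the class name 
TimerDeathDemo, the 50 ms interval, and the simulated failure on the second 
run are all assumptions made for illustration.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; stands in for a periodic task such as GossipTimerTask.
public class TimerDeathDemo {

    /** Schedules a periodic task that throws on its second run and returns the run count. */
    static int countRuns() {
        final AtomicInteger runs = new AtomicInteger();
        Timer timer = new Timer("demo-timer", true); // daemon thread, so the JVM can exit

        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                // Simulate an unexpected failure inside the task body.
                if (runs.incrementAndGet() == 2) {
                    throw new RuntimeException("simulated gossip failure");
                }
            }
        }, 0, 50); // first run immediately, then every 50 ms

        try {
            // Leave room for roughly ten runs if the timer were still alive.
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        timer.cancel();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("runs before the timer died: " + countRuns());
    }
}
```

The task runs twice and is never invoked again: the exception escapes run(), 
the timer thread terminates, and nothing restarts it.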

Combine this problem with a bug like CASSANDRA-757 (not yet fixed in 0.6.x) and 
you get into a state where the server keeps running, but gossip is no longer 
occurring, preventing node addition / removal from happening.

I see two potential choices:
1) Log the error but don't re-throw it, so that the GossipTimerTask will 
continue to run on its next interval.
2) Shut down the server, since continuing to run without gossip subtly breaks 
other functionality / knowledge of other nodes.
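
A rough sketch of choice 1, under stated assumptions: ResilientTimerTask is a 
hypothetical wrapper, not an existing Cassandra class, and java.util.logging 
is used only to keep the example self-contained (Cassandra's own logging 
setup would be used in practice).

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical wrapper: logs failures instead of letting them kill the Timer thread.
public class ResilientTimerTask extends TimerTask {
    private static final Logger logger =
            Logger.getLogger(ResilientTimerTask.class.getName());

    private final Runnable body;

    public ResilientTimerTask(Runnable body) {
        this.body = body;
    }

    @Override
    public void run() {
        try {
            body.run();
        } catch (Exception e) {
            // Choice 1: log and swallow, so the task runs again on its next interval.
            logger.log(Level.SEVERE, "Periodic task failed; retrying on next interval", e);
        }
    }

    /** Same failing body as a bare TimerTask would have, but wrapped; returns the run count. */
    static int countRuns() {
        final AtomicInteger runs = new AtomicInteger();
        Timer timer = new Timer("demo-timer", true); // daemon thread

        timer.schedule(new ResilientTimerTask(new Runnable() {
            @Override
            public void run() {
                // Fail once, on the second run only.
                if (runs.incrementAndGet() == 2) {
                    throw new RuntimeException("simulated gossip failure");
                }
            }
        }), 0, 50);

        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        timer.cancel();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("runs with exceptions logged: " + countRuns());
    }
}
```

With the wrapper in place the task keeps running past the failed second 
round. Choice 2 would use the same catch block but trigger a shutdown there 
instead of only logging.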

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
