[jira] [Updated] (CASSANDRA-4066) Cassandra cluster stops responding on time change (scheduling not using monotonic time?)

Jonathan Ellis (Updated) (JIRA) Tue, 20 Mar 2012 09:00:02 -0700

     [ https://issues.apache.org/jira/browse/CASSANDRA-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jonathan Ellis updated CASSANDRA-4066:
--------------------------------------

          Component/s: Core
             Priority: Minor  (was: Major)
    Affects Version/s:     (was: 1.0.6)
        Fix Version/s: 1.1.1
             Assignee: Brandon Williams
               Labels: gossip  (was: )

We make extensive use of Java's ScheduledExecutorService, which does not deal 
well with the system time being pulled out from under it: 
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7139684

I'm willing to live with this for the majority of scheduled tasks; however, it 
might be worth updating Gossip to use its own thread + sleep calls to avoid 
this.
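
A minimal sketch of what "its own thread + sleep calls" could look like, 
assuming a plain Runnable stands in for one gossip round. SleepLoop and its 
methods are hypothetical names for illustration, not existing Cassandra code, 
and whether Thread.sleep itself is unaffected by clock jumps depends on the 
JVM and platform, so that would need to be verified:

```java
// Hypothetical helper, not Cassandra code: runs a task on a dedicated thread
// and pauses with Thread.sleep between rounds, instead of relying on
// ScheduledExecutorService to schedule the next run.
final class SleepLoop implements Runnable {
    private final Runnable task;          // stand-in for one gossip round
    private final long intervalMillis;    // pause between rounds
    private volatile boolean running = true;

    SleepLoop(Runnable task, long intervalMillis) {
        this.task = task;
        this.intervalMillis = intervalMillis;
    }

    @Override
    public void run() {
        while (running) {
            try {
                task.run();
            } catch (RuntimeException e) {
                // Log and continue so one failed round does not end the loop.
                e.printStackTrace();
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and exit the loop.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Asks the loop to exit after the current round and sleep.
    void stop() {
        running = false;
    }

    // Starts the loop on a named daemon thread and returns that thread.
    Thread start(String name) {
        Thread t = new Thread(this, name);
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Something like new SleepLoop(gossipRound, 1000).start("GossipLoop") would then 
replace the scheduled task, where gossipRound is again a placeholder. The 
trade-off is that the interval is measured from the end of one round to the 
start of the next, so the period drifts by however long each round takes.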

On the other hand, if Gossip were not failing visibly with UAE, it would be 
very difficult to figure out why the rest of the background tasks had stopped 
executing, and things would go bad a lot more gradually.

What do you think, Brandon?
                
> Cassandra cluster stops responding on time change (scheduling not using 
> monotonic time?) 
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4066
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Linux; CentOS6 2.6.32-220.4.2.el6.x86_64
>            Reporter: David Daeschler
>            Assignee: Brandon Williams
>            Priority: Minor
>              Labels: gossip
>             Fix For: 1.1.1
>
>
> The server installation I set up did not have ntpd installed in the base 
> installation. When I noticed that the clocks were skewing I installed ntp and 
> set the date on all the servers in the cluster. A short time later, I started 
> getting UnavailableExceptions on the clients. 
> Also, one server seemed to be unaffected by the time change. That server 
> happened to have its time pushed forward, not backwards like the other 3 in 
> the cluster. This leads me to believe something is running on a 
> timer/schedule that is not monotonic.
> I'm posting this as a bug, but I suppose it might just be part of the 
> communication protocols etc for the cluster and part of the design. But I 
> think the devs should be aware of what I saw.
> Otherwise, thank you for a fantastic product. Even after restarting 75% of 
> the cluster things seem to have recovered nicely.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
