Interesting failure scenario, SolrCloud and ZK nodes on different times

Erick Erickson Tue, 06 Aug 2013 12:57:33 -0700

I've become aware of a situation I thought I'd pass along. A SolrCloud
installation had several ZK nodes whose system clocks were very significantly
offset from one another. The installation was hitting the "ClusterState says
we are the leader, but locally we don't think we are" error when nodes were
recovering. I don't know whether recent Solr releases have already taken care
of this problem; I haven't seen it go by on the user's list for quite a while.


When the clocks were synchronized, many of the problems with recovery went
away. We're trying to reconstruct the scenario from memory, but it prompted
me to pass the incident along in case it sparks any thoughts. Specifically, I
wonder whether anything comes to mind that could go wrong when the ZK nodes
are significantly out of sync with each other time-wise.

FWIW,
Erick
