Yes, after 45 seconds a replica should take over as leader. The logs of the replica that should be taking over will likely explain why this is not happening.
- Mar

On Wed Jan 28 2015 at 2:52:32 PM Joshi, Shital <shital.jo...@gs.com> wrote:

> When leader reaches 99% physical memory on the box and starts swapping
> (stops replicating), we forcefully bring down leader (first kill -15 and
> then kill -9 if kill -15 doesn't work). This is when we are looking up to
> replica to assume leader's role and it never happens.
>
> Zookeeper timeout is 45 seconds. We can increase it up to 2 minutes and
> test.
>
> <cores adminPath="/admin/cores" defaultCoreName="collection1"
> host="${host:}" hostPort="${jetty.port:8983}"
> hostContext="${hostContext:solr}"
> zkClientTimeout="${zkClientTimeout:45000}">
>
> As per definition of zkClientTimeout, After the leader is brought down and
> it doesn't talk to zookeeper for 45 seconds, shouldn't ZK promote replica
> to leader? I am not sure how increasing zk timeout will help.
>
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Wednesday, January 28, 2015 11:42 AM
> To: solr-user@lucene.apache.org
> Subject: Re: replica never takes leader role
>
> This is not the desired behavior at all. I know there have been
> improvements in this area since 4.8, but can't seem to locate the JIRAs.
>
> I'm curious _why_ the nodes are going down though, is it happening at
> random or are you taking it down? One problem has been that the Zookeeper
> timeout used to default to 15 seconds, and occasionally a node would be
> unresponsive (sometimes due to GC pauses) and exceed the timeout. So upping
> the ZK timeout has helped some people avoid this...
>
> FWIW,
> Erick
>
> On Wed, Jan 28, 2015 at 7:11 AM, Joshi, Shital <shital.jo...@gs.com>
> wrote:
>
> > We're using Solr 4.8.0
> >
> >
> > -----Original Message-----
> > From: Erick Erickson [mailto:erickerick...@gmail.com]
> > Sent: Tuesday, January 27, 2015 7:47 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: replica never takes leader role
> >
> > What version of Solr? This is an ongoing area of improvements and several
> > are very recent.
> >
> > Try searching the JIRA for Solr for details.
> >
> > Best,
> > Erick
> >
> > On Tue, Jan 27, 2015 at 1:51 PM, Joshi, Shital <shital.jo...@gs.com>
> > wrote:
> >
> > > Hello,
> > >
> > > We have SolrCloud cluster (5 shards and 2 replicas) on 10 boxes and three
> > > zookeeper instances. We have noticed that when a leader node goes down the
> > > replica never takes over as a leader, cloud becomes unusable and we have to
> > > bounce entire cloud for replica to assume leader role. Is this default
> > > behavior? How can we change this?
> > >
> > > Thanks.
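
For anyone following this thread who wants to confirm which shards are actually leaderless once the 45 second ZooKeeper timeout has passed, here is a rough Python sketch. It is not taken from the thread. It assumes the cluster state has already been fetched and parsed into a dict shaped like Solr 4.x's clusterstate.json (collection, then "shards", then "replicas", with "state", "node_name" and an optional "leader": "true" per replica), and that the names under /live_nodes have been read into a set. The function name, host names and sample data are all made up for illustration. A killed node can still be listed as "active" in the stored state, which is why the sketch also checks the live nodes.

```python
import json


def shards_without_live_leader(cluster_state, live_nodes):
    """Return sorted (collection, shard) pairs with no usable leader.

    A replica counts as a usable leader only if it is flagged as leader,
    its recorded state is "active", and its node is in live_nodes.
    The dict layout is an assumption modelled on clusterstate.json.
    """
    missing = []
    for collection, coll_data in cluster_state.items():
        for shard, shard_data in coll_data.get("shards", {}).items():
            replicas = shard_data.get("replicas", {}).values()
            has_leader = any(
                r.get("leader") == "true"
                and r.get("state") == "active"
                and r.get("node_name") in live_nodes
                for r in replicas
            )
            if not has_leader:
                missing.append((collection, shard))
    return sorted(missing)


# Hypothetical sample: shard2's leader was killed, so its node is no longer
# live even though the stored state still says "active".
SAMPLE_STATE = json.loads("""
{
  "collection1": {
    "shards": {
      "shard1": {
        "replicas": {
          "core_node1": {"state": "active", "node_name": "host1:8983_solr", "leader": "true"},
          "core_node2": {"state": "active", "node_name": "host2:8983_solr"}
        }
      },
      "shard2": {
        "replicas": {
          "core_node3": {"state": "active", "node_name": "host3:8983_solr", "leader": "true"},
          "core_node4": {"state": "active", "node_name": "host4:8983_solr"}
        }
      }
    }
  }
}
""")
SAMPLE_LIVE_NODES = {"host1:8983_solr", "host2:8983_solr", "host4:8983_solr"}

if __name__ == "__main__":
    for collection, shard in shards_without_live_leader(SAMPLE_STATE, SAMPLE_LIVE_NODES):
        print(f"{collection}/{shard} has no live leader")
```

If a shard is still reported here well after the timeout, the surviving replica's log (core_node4 in the made-up sample) is the place to look, as suggested at the top of the thread.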