It will take a short time for a new leader to take over when a leader goes down - that's expected - and how long that takes will vary. Some code paths will do short retries to deal with this, but you are alerted that those updates failed, so you have to handle them as you would any other update failure on the client side. SolrCloud favors consistency over write availability - that's the brief window where you lose write availability.
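If you want the client to ride through that window, one option is a short retry-with-backoff around the update call. A rough sketch using SolrJ's CloudSolrClient (newer SolrJ; in 4.x the class was CloudSolrServer). The retry count, backoff, and helper class are purely illustrative, not an existing API:

    import java.io.IOException;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class RetryingIndexer {
        // Hypothetical limits - tune them to roughly cover your cluster's
        // typical leader-election time.
        private static final int MAX_ATTEMPTS = 5;
        private static final long BACKOFF_MS = 2000;

        public static void addWithRetry(CloudSolrClient client, SolrInputDocument doc)
                throws Exception {
            for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
                try {
                    client.add(doc);
                    return; // update accepted
                } catch (SolrServerException | IOException e) {
                    if (attempt == MAX_ATTEMPTS) {
                        throw e; // give up; the caller must treat this as a lost update
                    }
                    // Likely a leader transition or transient network error:
                    // back off and retry so the new leader has time to take over.
                    Thread.sleep(BACKOFF_MS);
                }
            }
        }
    }

If the retries are exhausted, the update is rejected back to the caller - which is the consistency-over-write-availability trade-off described above.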
To get a 'clean' shutdown - e.g. you want to bring the machine down and it didn't get hit by lightning - we would have to add a specific clean-stop API you can call first. By the time Jetty (or whatever container) tells Solr it's shutting down, it's too late to pull the node out gracefully. I've danced around it in the past, but have never gotten around to making that clean shutdown/stop API.

- Mark

On Jun 24, 2013, at 8:25 AM, Daniel Collins <danwcoll...@gmail.com> wrote:

> Just had an odd scenario in our current Solr system (4.3.0 + SOLR-4829
> patch): 4 shards, 2 replicas (leader + 1 other) per shard, spread across 8
> machines.
>
> We sent all our updates into a single instance, and we shut down a leader
> for maintenance, expecting it to fail over to the other replica. What I saw
> was that when the leader shard went down, the instance taking updates
> started seeing rejections almost instantly, yet the cluster state changes
> didn't occur for several seconds. During that time, we had no valid leader
> for one of our shards, so we were losing updates and queries.
>
> (shard4 leader)
> 07:10:33,124 - xxxxxx4 (shard4 leader) starts coming down
> 07:10:35,885 - cluster state change is detected
> 07:10:37,172 - nsrchnj4 publishes itself as down
> 07:10:37,869 - second cluster state change detected
> 07:10:40,202 - closing searcher
> 07:10:43,447 - cluster state change (live_nodes)
>
> (instance taking updates)
> 07:10:33,443 - starts seeing rejections from xxxxxx4
> 07:10:35,937 - detects a cluster state change (red herring)
> 07:10:37,899 - detects another cluster state change
> 07:10:43,478 - detects a live_nodes change (as the shard4 leader is really down now)
> 07:10:44,586 - detects that shard4 has no leader anymore
>
> (xxxxx8) - new shard4 leader
> 07:10:32,981 - last story FROMLEADER (xxxxxx4)
> 07:10:35,980 - cluster state change detected (red herring)
> 07:10:37,975 - another cluster state change detected
> 07:10:43,868 - running election process(!)
> 07:10:44,069 - nsrchnj8 becomes leader, tries to sync from nsrchnj4 (which
> is already rejecting requests)
>
> My question is: what should happen during leader transition? As I understand
> it, the leader publishes that it is DOWN and waits until it sees the response
> (by effectively waiting for cluster state messages), so by the time it starts
> to shut down its own readers/writers, the cluster should be aware that it is
> unavailable... The fact that our update node took 11s and had to wait for the
> live_nodes change in order to detect it had no leader for shard4 seems like a
> real hole?
>
> From what I am seeing here, though, it is like Jetty has shut down its
> HTTP interface before any of that happens, so the instance taking updates
> can't communicate with it; we see a bunch of errors like this:
>
> 2013-06-24 07:10:33,443 ERROR [qtp2128911821-403089]
> o.a.s.u.SolrCmdDistributor [SolrException.java:129] forwarding update to
> http://xxxxxx4:10600/solr/collection1/ failed - retrying ...
>
> This is with Solr 4.3.0 + the patch for SOLR-4829. I couldn't find this in
> any list of existing issues, and I thought we'd seen valid leader swaps
> before, so is this a very specific scenario we've hit? I can get full logs
> and such, and will see how reproducible it is.
>
> Surely Jetty shouldn't shut down the interface until Solr has stopped? Or
> are we doing our shutdowns wrong (we are just using the "--stop" option on
> Jetty)?
>
> Cheers, Daniel
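For anyone digging into a timeline like the one above, the leader handoff can also be watched from the client side by reading the cluster state that SolrJ keeps from ZooKeeper. A rough sketch with a recent SolrJ CloudSolrClient (method names have shifted across versions, and the collection/shard names are placeholders, so treat this as illustrative only):

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.cloud.Replica;
    import org.apache.solr.common.cloud.Slice;

    public class LeaderWatcher {
        // Prints the current leader for one shard of a collection, e.g. to see
        // how long the shard stays leaderless while a node is restarted.
        public static void printLeader(CloudSolrClient client,
                                       String collection, String shard) {
            client.connect(); // make sure the client's ZK state is initialized
            Slice slice = client.getZkStateReader()
                    .getClusterState()
                    .getCollection(collection)
                    .getSlice(shard);
            Replica leader = (slice == null) ? null : slice.getLeader();
            if (leader == null) {
                System.out.println(shard + ": no leader currently elected");
            } else {
                System.out.println(shard + ": leader is " + leader.getCoreUrl());
            }
        }
    }

Polling this during a restart makes the gap between "old leader rejects updates" and "new leader elected" visible from the same vantage point the updating instance has.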