By tracing the output in the log files we see the following sequence.
Overseer role list has POD-1, POD-2, POD-3, in that order.
POD-3 has 2 shard leaders.
POD-3 restarts.

A) Logs for the shard whose leader moves successfully from POD-3 to POD-1

On POD-1: o.a.s.c.ShardLeaderElectionContext Replaying tlog before become new leader
On POD-1: o.a.s.u.UpdateLog Starting log replay
On POD-1: o.a.s.u.UpdateLog Log replay finished.
On POD-1: o.a.s.c.SolrCore .... Registered new searcher autowarm time: 0 ms
On POD-1: o.a.s.c.ShardLeaderElectionContextBase Creating leader registration node ... after winning as ...
On POD-1: o.a.s.c.ShardLeaderElectionContext I am the new leader:....

B) Logs for the shard whose leader does not move from POD-3 to POD-1

On POD-1: o.a.s.c.ShardLeaderElectionContext Replaying tlog before become new leader
On POD-1: o.a.s.u.UpdateLog Starting log replay...
On POD-1: o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms....

< POD-3 is back up at this time >

On POD-3: o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms....

It was odd to see no INFO, WARN or ERROR log message after "Starting log replay" on POD-1 for the shard which did not get its leader elected.
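
For anyone wanting to run the same comparison on their own logs, here is a rough sketch of a checker that reports the first step of the successful sequence (case A) that is missing from a given shard's log lines. It only does substring matching on the message texts quoted above. The function name and the idea of passing in lines already filtered to one shard are my own assumptions, not anything Solr provides.

```python
from typing import Iterable, Optional

# Milestones, in order, copied from the successful election trace (case A).
# Matching is by plain substring; the real log line layout is an assumption.
ELECTION_MILESTONES = [
    "Replaying tlog before become new leader",
    "Starting log replay",
    "Log replay finished",
    "Registered new searcher",
    "Creating leader registration node",
    "I am the new leader",
]


def first_missing_milestone(lines: Iterable[str]) -> Optional[str]:
    """Return the first milestone not seen in order, or None if all appear.

    `lines` is expected to hold the log lines for a single shard on a
    single node (hypothetical pre-filtering done by the caller).
    """
    remaining = iter(ELECTION_MILESTONES)
    expected = next(remaining)
    for line in lines:
        if expected in line:
            # Advance to the next milestone; None means the sequence is complete.
            expected = next(remaining, None)
            if expected is None:
                return None
    return expected
```

Fed the case B lines from POD-1, this stops at "Log replay finished", which matches what was observed by eye: replay starts but no completion message is ever logged.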