Hi,

We're testing SolrCloud under high scale and high load (many replicas per
node, multiple collection creations, nodes up and down, backed up Overseer
queues) and are *running into shard leader election issues* when state.json
and the ZooKeeper leader registration node for the shard disagree (the leader
registration node in ZooKeeper is /collections/*<collectionName>*/leaders/
*<shardName>*/leader).
The inconsistency is caused by delayed updates to state.json when the
Overseer cluster state update queue is overloaded (that queue is, sadly,
consumed by a single Overseer thread).

I would like your opinion on *no longer tracking shard leaders in state.json
and relying only on the ephemeral shard ZK leader registration node*.

A minor side benefit would be fewer updates to state.json, and fewer watches
on it and fewer nodes fetching it.

Note that relying on state.json to determine who the leader is, as is
currently done, already implies dealing with stale or incorrect data, since
state.json is updated asynchronously via watches. I mention this because it
means that caching the content of the ZooKeeper leader registration node is
likely OK; there is no need to fetch it anew on every access.
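
To make the caching idea concrete, here is a minimal sketch in Java. It is
not existing Solr code: the class ShardLeaderCache, its methods, and the
fetchNode function (a stand-in for reading the node from ZooKeeper) are all
hypothetical names used for illustration only. It assumes something, such as
a watch on the registration node, calls invalidate() when the node changes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: caches the content of shard leader registration nodes. */
class ShardLeaderCache {
    // Assumption: stand-in for a ZooKeeper read of the node at the given path.
    // Returns null when the ephemeral node does not exist (no leader registered).
    private final Function<String, String> fetchNode;

    // Path of the leader registration node -> cached node content.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    ShardLeaderCache(Function<String, String> fetchNode) {
        this.fetchNode = fetchNode;
    }

    /** Builds /collections/<collectionName>/leaders/<shardName>/leader. */
    static String leaderPath(String collection, String shard) {
        return "/collections/" + collection + "/leaders/" + shard + "/leader";
    }

    /**
     * Returns the cached leader registration content, fetching it only on a
     * cache miss. A null result (missing node) is not cached, so the next
     * call fetches again.
     */
    String getLeader(String collection, String shard) {
        return cache.computeIfAbsent(leaderPath(collection, shard), fetchNode);
    }

    /** Drops the cached entry, e.g. when a watch reports the node changed or vanished. */
    void invalidate(String collection, String shard) {
        cache.remove(leaderPath(collection, shard));
    }
}
```

Between a leader change and the call to invalidate(), getLeader() returns the
old leader. That is the same kind of staleness callers already have to
tolerate when reading the leader from state.json.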

Thanks,
Ilan
