Hi,

We're testing SolrCloud at high scale and under high load (many replicas per node, multiple collection creations, nodes going up and down, backed-up Overseer queues) and are *running into shard leader election issues* when state.json and the ZooKeeper leader registration node for the shard disagree (the leader registration node in ZooKeeper is /collections/*<collectionName>*/leaders/*<shardName>*/leader). The inconsistency is the consequence of a delayed update of state.json, caused by an overloaded Overseer cluster state update queue (which, sadly, is consumed by a single Overseer thread).
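To make the disagreement concrete, here is a minimal Python sketch of the check, not Solr code. The helper names and the sample data are hypothetical. It assumes state.json marks the leader replica with a "leader":"true" property, and that the leader registration node content is JSON carrying a "core_node_name" field; that last point is an assumption made for illustration.

```python
import json
from typing import Optional


def leader_registration_path(collection: str, shard: str) -> str:
    """ZooKeeper path of the ephemeral leader registration node for a shard."""
    return f"/collections/{collection}/leaders/{shard}/leader"


def leader_from_state_json(state_json: str, collection: str, shard: str) -> Optional[str]:
    """Return the replica flagged as leader in state.json, or None if none is flagged."""
    state = json.loads(state_json)
    replicas = state[collection]["shards"][shard]["replicas"]
    for name, props in replicas.items():
        if props.get("leader") == "true":
            return name
    return None


def leaders_disagree(state_json: str, registration_node: Optional[str],
                     collection: str, shard: str) -> bool:
    """True when state.json and the registration node name different leaders.

    registration_node is the node content, or None if the ephemeral node is absent.
    Assumption: the content is JSON with a "core_node_name" field.
    """
    from_state = leader_from_state_json(state_json, collection, shard)
    from_zk = None
    if registration_node is not None:
        from_zk = json.loads(registration_node).get("core_node_name")
    return from_state != from_zk


# Hypothetical sample: state.json still names core_node1 (stale, update queued in
# the Overseer), while the registration node already names core_node2.
STALE_STATE_JSON = json.dumps({"c1": {"shards": {"shard1": {"replicas": {
    "core_node1": {"state": "active", "leader": "true"},
    "core_node2": {"state": "active"},
}}}}})
REGISTRATION_NODE = json.dumps({"core_node_name": "core_node2"})

if __name__ == "__main__":
    print(leader_registration_path("c1", "shard1"))
    print(leaders_disagree(STALE_STATE_JSON, REGISTRATION_NODE, "c1", "shard1"))
```

In the sample, the two sources name different replicas, which is the window in which we see the election issues.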
I'd like your opinion on *no longer tracking shard leaders in state.json and relying only on the ephemeral ZooKeeper leader registration node for the shard*. A minor side benefit would be fewer updates to state.json, and so fewer watches firing and fewer nodes fetching it.

Note that relying on state.json to determine who the leader is, as is currently done, already implies dealing with stale or incorrect data, since nodes receive state.json updates asynchronously via watches. I mention this because it means that caching the content of the ZooKeeper leader registration node is likely OK; there is no need to fetch it anew on every access.

Thanks,
Ilan