Put up a patch and let's take a look. Most anywhere that holds up the ZK processing thread for any decent amount of time is probably something waiting to be fixed.
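
For reference, here is a minimal sketch of the general idea of handing the slow check off the event thread. It is not Solr code: the single-thread executor only stands in for the ZK event thread, and `checkIfIAmLeader`, `fireWatches`, the pool size, and the thread names are hypothetical placeholders. It simply records which thread each check ran on.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncElectionSketch {

    // Name of the thread standing in for the single ZK event thread (assumption).
    static final String EVENT_THREAD = "zk-event-thread";

    /** Hypothetical stand-in for the slow leadership check (peer sync, ZK update, ...). */
    static void checkIfIAmLeader(String core, List<String> ranOn) {
        ranOn.add(core + "@" + Thread.currentThread().getName());
    }

    /**
     * Simulates "leader gone" watches firing for several cores and returns
     * "core@thread" entries recording where each check actually ran.
     */
    static List<String> fireWatches(List<String> cores, boolean async) throws InterruptedException {
        List<String> ranOn = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(cores.size());
        // One thread delivers every watch callback, in order.
        ExecutorService eventThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, EVENT_THREAD));
        // Separate pool for the election work; the size of 4 is arbitrary.
        ExecutorService electionPool =
                Executors.newFixedThreadPool(4, r -> new Thread(r, "election-worker"));
        try {
            for (String core : cores) {
                eventThread.execute(() -> {
                    Runnable check = () -> {
                        checkIfIAmLeader(core, ranOn);
                        done.countDown();
                    };
                    if (async) {
                        electionPool.execute(check); // callback returns immediately
                    } else {
                        check.run();                 // callback blocks the event thread
                    }
                });
            }
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("checks did not finish in time");
            }
            return ranOn;
        } finally {
            eventThread.shutdown();
            electionPool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> cores = List.of("core1", "core2", "core3");
        System.out.println("sync:  " + fireWatches(cores, false));
        System.out.println("async: " + fireWatches(cores, true));
    }
}
```

In the synchronous case every check runs on the event thread, one after another; in the async case the event thread only queues the work. The sketch does nothing to guard against two elections running for the same shard, so that question would still need to be answered by the actual patch.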
--
Mark Miller
about.me/markrmiller

On July 15, 2014 at 10:09:56 AM, Ramkumar R. Aiyengar ([email protected]) wrote:

> Currently when a replica is watching the current leader's ephemeral node
> and the leader disappears, it runs the leadership check along with its two
> way peer sync, ZK update etc. on the ZK event thread where the watch was
> fired.
>
> What this means is that for instances with lots of cores, you would be
> serializing leadership elections and the last in the list could take a long
> time to have a replacement elected (during which you will have no leader).
>
> I did a quick change to make the checkIfIAmLeader call async, but Solr
> cloud tests being what they are (thanks Shalin for cleaning them up btw :)
> ), I wanted to check if I am doing something stupid. If not, I will raise a
> JIRA.
>
> One contention could be if you might end up with two elections for the same
> shard, but I can't see how that might happen..
