Hi

In SolrCloud, when a Solr node loses its ZooKeeper connection, e.g. because of a session timeout, the LeaderElector ZooKeeper watchers handling its replica slices are notified with two events: a Disconnected event and a SyncConnected event. Currently the org.apache.solr.cloud.LeaderElector#checkIfIamLeader code does two "bad" things when this happens:

1. On the Disconnected event it "fails" with a session timeout when talking to ZooKeeper (it has no ZooKeeper connection at this point).
2. On the SyncConnected event it adds a new watcher to the ZooKeeper leader election node. As documented in the ZooKeeper programming guide, watchers are not removed when a ZooKeeper connection is lost (even though the watchers are notified twice), so every connection loss doubles the number of watchers.

Note that there are two watchers per replica slice (the overseer election and the collection/slice/election).
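
To make the underlying ZooKeeper behaviour concrete, here is a minimal standalone sketch (not Solr code; the class name, connect string, znode path and timeouts are made-up example values): a watcher registered via exists() is also invoked for connection-state changes (Disconnected/SyncConnected), and those events carry EventType.None. Since the original watch survives the reconnect, re-registering on such events leaks one extra watch per connection loss; only real znode events should trigger re-registration.

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.ZooKeeper;

public class SkipNoneEventsSketch {

  // Registers a one-time exists() watch on path and re-registers it only for
  // real znode events, never for EventType.None connection-state events.
  static void watchNode(final ZooKeeper zk, final String path) throws Exception {
    zk.exists(path, new Watcher() {
      @Override
      public void process(WatchedEvent event) {
        if (EventType.None.equals(event.getType())) {
          // Connection-state change (Disconnected/SyncConnected): the existing
          // watch survives the reconnect, so re-registering here would double it.
          return;
        }
        try {
          // NodeCreated/NodeDeleted/NodeDataChanged consumed the one-time
          // watch, so this is the only place to put it back.
          watchNode(zk, path);
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    });
  }

  public static void main(String[] args) throws Exception {
    // Example values only.
    ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, new Watcher() {
      @Override
      public void process(WatchedEvent event) {}
    });
    watchNode(zk, "/collections/collection1/leader_elect/shard1/election");
    Thread.sleep(60 * 1000);
    zk.close();
  }
}

This skip-on-EventType.None check is exactly what the change proposed below applies inside Solr's own election watcher.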

A fix for this could be the following change in org.apache.solr.cloud.LeaderElector#checkIfIamLeader:
            ...
            new Watcher() {

              @Override
              public void process(WatchedEvent event) {
                log.debug(seq + " watcher received event: " + event);
                // Reconnect should not add new watchers as the old watchers are still available!
                if (EventType.None.equals(event.getType())) {
                  log.debug("Skipping event: " + event);
                  return;
                }

The behaviour can be verified with the test below, added to org.apache.solr.cloud.LeaderElectionIntegrationTest.
Can someone confirm this and add it to SVN?

Thanks in advance.

Best regards, Trym

  @Test
  public void testReplicaZookeeperConnectionLoss() throws Exception {
      // who is the leader?
      String leader = getLeader();

      Set<Integer> shard1Ports = shardPorts.get("shard1");

      int leaderPort = getLeaderPort(leader);
      assertTrue(shard1Ports.toString(), shard1Ports.contains(leaderPort));

      // timeout a replica a couple of times
      System.setProperty("zkClientTimeout", "500");
      int replicaPort = 7001;
      if (leaderPort == 7001) {
        replicaPort = 7000;
      }
      assertNotSame(containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper(),
          containerMap.get(leaderPort).getZkController().getZkClient().getSolrZooKeeper());
      containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000);
      Thread.sleep(10 * 1000);
      containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000);
      Thread.sleep(10 * 1000);
      containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000);
      Thread.sleep(10 * 1000);

      // kill the leader
      if (VERBOSE) System.out.println("Killing " + leaderPort);
      shard1Ports.remove(leaderPort);
      containerMap.get(leaderPort).shutdown();

      // poll until leader change is visible
      for (int j = 0; j < 90; j++) {
        String currentLeader = getLeader();
        if(!leader.equals(currentLeader)) {
          break;
        }
        Thread.sleep(500);
      }

      leader = getLeader();
      int newLeaderPort = getLeaderPort(leader);
      int retry = 0;
      while (leaderPort == newLeaderPort) {
        if (retry++ == 20) {
          break;
        }
        Thread.sleep(1000);
        // re-read the leader, otherwise this loop can never see the change
        leader = getLeader();
        newLeaderPort = getLeaderPort(leader);
      }

      if (leaderPort == newLeaderPort) {
        fail("We didn't find a new leader! " + leaderPort + " was shut down, but it's still showing as the leader");
      }

assertTrue("Could not find leader " + newLeaderPort + " in " + shard1Ports, shard1Ports.contains(newLeaderPort));
  }
