Raintung Li created SOLR-6498:
---------------------------------

Summary: LeaderElector sometimes will appear multiple ephemeral nodes in the zookeeper
Key: SOLR-6498
URL: https://issues.apache.org/jira/browse/SOLR-6498
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.6.1
Environment: linux
Reporter: Raintung Li

The overseer_elect/collection_shard_leader_elect election path sometimes contains multiple ephemeral nodes for the same node but with different session IDs, for example:

    92427566579253248-core_node1-n_0000000032
    92427566579253249-core_node1-n_0000000033

I have not been able to trace how this happens. When it does, the newly registered node cannot be elected leader. We know that the ephemeral node from the old session ID is invalid, but not why it still exists.

A second issue is in the joinElection method:

    try {
      leaderSeqPath = zkClient.create(shardsElectZkPath + "/" + id + "-n_", null,
          CreateMode.EPHEMERAL_SEQUENTIAL, false);
      context.leaderSeqPath = leaderSeqPath;
      cont = false;
    } catch (ConnectionLossException e) {
      // we don't know if we made our node or not...
      List<String> entries = zkClient.getChildren(shardsElectZkPath, null, true);

      boolean foundId = false;
      for (String entry : entries) {
        String nodeId = getNodeId(entry);
        if (id.equals(nodeId)) {
          // we did create our node...
          foundId = true;
          break;
        }
      }
      if (!foundId) {
        cont = true;
        if (tries++ > 20) {
          throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, "", e);
        }
        try {
          Thread.sleep(50);
        } catch (InterruptedException e2) {
          Thread.currentThread().interrupt();
        }
      }
    }

If a ConnectionLossException occurs here, the ephemeral sequential node may be created twice.

My suggestion: even though I cannot trace why two ephemeral nodes are created for the same server, we can guard against it.
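
One possible shape for such a guard, as a minimal sketch rather than a patch: before joining the election, scan the children of the election path for entries that carry the same node name but a different session ID. The class StaleElectionNodes and the method findStale are hypothetical names and not part of Solr. The entry layout <sessionId>-<nodeName>-n_<sequence> is assumed from the two example entries above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Solr): finds election entries left behind by other sessions. */
final class StaleElectionNodes {

    // Assumed entry layout, taken from the examples in this report:
    //   <sessionId>-<nodeName>-n_<sequence>
    private static final Pattern ENTRY = Pattern.compile("^(-?\\d+)-(.+)-n_(\\d+)$");

    private StaleElectionNodes() {}

    /**
     * Returns the entries that belong to nodeName but were created under a
     * session other than currentSessionId. Entries that do not match the
     * assumed layout are ignored.
     */
    static List<String> findStale(List<String> entries, long currentSessionId, String nodeName) {
        String current = Long.toString(currentSessionId);
        List<String> stale = new ArrayList<>();
        for (String entry : entries) {
            Matcher m = ENTRY.matcher(entry);
            if (!m.matches()) {
                continue; // unknown format, leave it alone
            }
            boolean sameNode = m.group(2).equals(nodeName);
            boolean otherSession = !m.group(1).equals(current);
            if (sameNode && otherSession) {
                stale.add(entry);
            }
        }
        return stale;
    }
}
```

With the two example entries and a current session ID of 92427566579253249, findStale would report only the -n_0000000032 entry. What to do with the result is left open here: the caller could delete those paths before creating its own node, or just log them. Whether deleting is safe depends on why the old node still exists, which this report has not established.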