[ 
https://issues.apache.org/jira/browse/SOLR-6498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raintung Li updated SOLR-6498:
------------------------------
    Attachment: SOLR-6498.txt

> LeaderElector sometimes will appear multiple ephemeral nodes in the zookeeper
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-6498
>                 URL: https://issues.apache.org/jira/browse/SOLR-6498
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.6.1
>         Environment: linux
>            Reporter: Raintung Li
>         Attachments: SOLR-6498.txt
>
>
> Sometimes the overseer_elect/collection_shard_leader_elect election path
> contains multiple ephemeral nodes for the same node, each with a different
> session ID. For example:
> 92427566579253248-core_node1-n_0000000032
> 92427566579253249-core_node1-n_0000000033
> I can't trace how this happens. When it does, the newly registered node
> can't be elected leader. We know the ephemeral node with the old session ID
> is invalid, but not why it still exists.
> The other issue is in the joinElection method:
> try {
>   leaderSeqPath = zkClient.create(shardsElectZkPath + "/" + id + "-n_", null,
>       CreateMode.EPHEMERAL_SEQUENTIAL, false);
>   context.leaderSeqPath = leaderSeqPath;
>   cont = false;
> } catch (ConnectionLossException e) {
>   // we don't know if we made our node or not...
>   List<String> entries = zkClient.getChildren(shardsElectZkPath, null, true);
>
>   boolean foundId = false;
>   for (String entry : entries) {
>     String nodeId = getNodeId(entry);
>     if (id.equals(nodeId)) {
>       // we did create our node...
>       foundId = true;
>       break;
>     }
>   }
>   if (!foundId) {
>     cont = true;
>     if (tries++ > 20) {
>       throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR,
>           "", e);
>     }
>     try {
>       Thread.sleep(50);
>     } catch (InterruptedException e2) {
>       Thread.currentThread().interrupt();
>     }
>   }
> }
> If a ConnectionLossException occurs here, the ephemeral sequential node may
> be created twice.
> I can't trace why two ephemeral nodes are created for the same server, but
> as a suggestion we can guard against it.
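
One possible shape for such a guard, as a minimal sketch rather than the attached patch: before creating its own election node, a server could list the children of the election path and pick out entries that carry its node name under a different session ID. The class and method names below (StaleElectionNodes, findStaleEntries, nodeName) are hypothetical and not part of Solr. The sketch assumes the "<sessionId>-<nodeName>-n_<sequence>" layout seen in the example entries and a session ID printed without a leading minus sign.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr.
class StaleElectionNodes {

    // Returns the election entries that carry the same node name as currentId
    // but a different session ID. currentId is assumed to look like
    // "<sessionId>-<nodeName>", and each entry like
    // "<sessionId>-<nodeName>-n_<sequence>".
    static List<String> findStaleEntries(List<String> entries, String currentId) {
        List<String> stale = new ArrayList<>();
        String currentNode = nodeName(currentId);
        for (String entry : entries) {
            int seqStart = entry.lastIndexOf("-n_");
            if (seqStart < 0) {
                continue; // not in the assumed layout, leave it alone
            }
            String entryId = entry.substring(0, seqStart);
            if (nodeName(entryId).equals(currentNode) && !entryId.equals(currentId)) {
                stale.add(entry);
            }
        }
        return stale;
    }

    // Strips the "<sessionId>-" prefix from an id of the assumed layout.
    static String nodeName(String id) {
        return id.substring(id.indexOf('-') + 1);
    }

    public static void main(String[] args) {
        // The first two entries are the ones from the report; the third is
        // made up for illustration.
        List<String> entries = Arrays.asList(
            "92427566579253248-core_node1-n_0000000032",
            "92427566579253249-core_node1-n_0000000033",
            "92427566579253250-core_node2-n_0000000034");
        System.out.println(findStaleEntries(entries, "92427566579253249-core_node1"));
    }
}
```

The caller would then decide what to do with the returned entries, for example delete them before joining the election. That step is left out here because it depends on the ZooKeeper client in use and on whether the old session might still be alive.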



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
