[ https://issues.apache.org/jira/browse/ZOOKEEPER-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lea Morschel updated ZOOKEEPER-3890:
------------------------------------
Description:
When a ZooKeeper client session disappears, the associated ephemeral node that
is used for leader election is occasionally not deleted and persists
(indefinitely, it seems).
This leads to the leader election process frequently selecting such a stale
node as the leader because it is the oldest, so that none of the existing
redundant services that take action upon acquiring leadership ever do so.
One of the scenarios in which such a stale ephemeral node is created can be
triggered by force-killing both the client and the ZooKeeper server
({{kill -9 <pid>}}). After the server is restarted, it recreates the session on
its side, even though the actual client session is gone. From then on, the node
persists even across regular restarts. This scenario involves a
single ZooKeeper server, but the problem has also been observed in a cluster of
three.
was:
When a ZooKeeper client session runs out, the associated ephemeral node that is
used for leader election is occasionally not deleted and persists
(indefinitely, it seems).
This leads to the leader election process frequently selecting such a stale
node as the leader because it is the oldest, so that none of the existing
redundant services that take action upon acquiring leadership ever do so.
One of the scenarios in which such a stale ephemeral node is created can be
triggered by force-killing both the client and the ZooKeeper server
({{kill -9 <pid>}}). After the server is restarted, it recreates the session on
its side, even though the actual client session is gone. From then on, the node
persists even across regular restarts. This scenario involves a
single ZooKeeper server, but the problem has also been observed in a cluster of
three.
> Ephemeral node not deleted after session is gone, then elected as leader
> ------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3890
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3890
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.5.7
> Reporter: Lea Morschel
> Priority: Major
>
> When a ZooKeeper client session disappears, the associated ephemeral node
> that is used for leader election is occasionally not deleted and persists
> (indefinitely, it seems).
> This leads to the leader election process frequently selecting such a stale
> node as the leader because it is the oldest, so that none of the existing
> redundant services that take action upon acquiring leadership ever do so.
> One of the scenarios in which such a stale ephemeral node is created can be
> triggered by force-killing both the client and the ZooKeeper server
> ({{kill -9 <pid>}}). After the server is restarted, it recreates the session
> on its side, even though the actual client session is gone. From then on, the
> node persists even across regular restarts. This scenario involves a
> single ZooKeeper server, but the problem has also been observed in a cluster
> of three.
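
The effect described above can be illustrated with a minimal sketch. It
assumes the common election recipe in which each candidate creates an
ephemeral sequential node and the candidate owning the lowest sequence number
becomes leader. The node names and the helper functions below are hypothetical
and are not taken from the reporter's setup; the sketch only simulates the
selection step and does not talk to a real ZooKeeper server.

```python
import re


def sequence_number(node_name: str) -> int:
    """Return the zero-padded 10-digit counter that ZooKeeper appends
    to the name of a sequential node."""
    match = re.search(r"(\d{10})$", node_name)
    if match is None:
        raise ValueError(f"not a sequential node name: {node_name!r}")
    return int(match.group(1))


def elect_leader(children: list[str]) -> str:
    """Pick the oldest candidate, i.e. the node with the lowest sequence."""
    if not children:
        raise ValueError("no candidates")
    return min(children, key=sequence_number)


# Hypothetical children of an election znode. The first entry stands for
# the stale ephemeral node whose client was force-killed; the other two
# belong to live services that registered afterwards.
children = [
    "candidate_0000000003",  # stale: owning client is gone
    "candidate_0000000007",  # live service A
    "candidate_0000000008",  # live service B
]

leader = elect_leader(children)
print(leader)
```

As long as the stale node is listed among the children, it keeps the lowest
sequence number, so every election round selects it and neither live service
ever sees itself as leader.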
--
This message was sent by Atlassian Jira
(v8.3.4#803005)