[jira] [Updated] (ZOOKEEPER-4428) ZooKeeper leaks "SyncThread" threads when leadership connection times out and is reestablished

Ryan Ruel (Jira) Wed, 15 Dec 2021 08:23:06 -0800


     [ https://issues.apache.org/jira/browse/ZOOKEEPER-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Ryan Ruel updated ZOOKEEPER-4428:
---------------------------------
    Environment: 
# On a follower node for an established ZooKeeper ensemble, issue the following 
command to determine number of SyncThreads:
ps -T -p `pidof mdtzookeeper` | grep SyncThread | wc
 # Issue the following IP tables command on the leader to drop traffic coming 
from the follower used in Step 1:
iptables -A INPUT -s <Follower IP Address> -j DROP
 # Watch the zookeeper logs on the nodes and wait for the connection to drop 
due to timeout.
 # Issue the following IP tables command on the leader to re-enable traffic 
coming from follower used in Step 1:
iptables -D INPUT -s <Follower IP Address> -j DROP
 # Watch the zookeeper logs on the nodes and wait for the connection to the 
leader to reestablish.
 # On the follower node (used in Step 1), check the number of SyncThreads.  
That value should have increased by one and stay pinned there indefinitely: 
ps -T -p `pidof mdtzookeeper` | grep SyncThread | wc

  was:
# On a follower node for an established ZooKeeper ensemble, issue the following 
command to determine number of SyncThreads:

ps -T -p `pidof mdtzookeeper` | grep SyncThread | wc


 # Issue the following IP tables command on the leader to drop traffic coming 
from the follower used in Step 1:

 iptables -A INPUT -s <Follower IP Address> -j DROP


 # Watch the zookeeper logs on the nodes and wait for the connection to drop 
due to timeout.


 # Issue the following IP tables command on the leader to re-enable traffic 
coming from follower used in Step 1:

iptables -D INPUT -s <Follower IP Address> -j DROP


 # Watch the zookeeper logs on the nodes and wait for the connection to the 
leader to reestablish.


 # On the follower node (used in Step 1), check the number of SyncThreads.  
That value should have increased by one and stay pinned there indefinitely: 

ps -T -p `pidof mdtzookeeper` | grep SyncThread | wc


> ZooKeeper leaks "SyncThread" threads when leadership connection times out and 
> is reestablished 
> -----------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4428
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4428
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.6.3
>         Environment: # On a follower node for an established ZooKeeper 
> ensemble, issue the following command to determine number of SyncThreads:
> ps -T -p `pidof mdtzookeeper` | grep SyncThread | wc
>  # Issue the following IP tables command on the leader to drop traffic coming 
> from the follower used in Step 1:
> iptables -A INPUT -s <Follower IP Address> -j DROP
>  # Watch the zookeeper logs on the nodes and wait for the connection to drop 
> due to timeout.
>  # Issue the following IP tables command on the leader to re-enable traffic 
> coming from follower used in Step 1:
> iptables -D INPUT -s <Follower IP Address> -j DROP
>  # Watch the zookeeper logs on the nodes and wait for the connection to the 
> leader to reestablish.
>  # On the follower node (used in Step 1), check the number of SyncThreads.  
> That value should have increased by one and stay pinned there indefinitely: 
> ps -T -p `pidof mdtzookeeper` | grep SyncThread | wc
>            Reporter: Ryan Ruel
>            Priority: Major
>
> In a production environment with some connectivity problems, the ZooKeeper 
> server was found to be using over 1000 threads named "SyncThread", none of 
> which were ever freed.
> The server logs indicate that these nodes were experiencing connection 
> timeouts to the leader.
> A test environment (described in the "Environment" field of this ticket) 
> showed that these connection timeouts appear to be what leaks the threads.
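
The thread count taken in the first and last steps of the reproduction can be wrapped in a small helper so that before and after values are easy to compare. This is only a sketch: the function name count_named_threads is hypothetical and not part of ZooKeeper or the original report, and it assumes the thread listing comes from `ps -T` as in the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): count the lines of a
# thread listing, read from stdin, that contain the given thread name.
count_named_threads() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are no matches, so "|| true" keeps the function's status clean.
    grep -c -- "$1" || true
}

# Usage sketch, assuming the process name from the report (mdtzookeeper):
#   before=$(ps -T -p "$(pidof mdtzookeeper)" | count_named_threads SyncThread)
#   ... drop traffic, wait for the timeout, restore traffic, wait for reconnect ...
#   after=$(ps -T -p "$(pidof mdtzookeeper)" | count_named_threads SyncThread)
#   echo "SyncThread count: before=$before after=$after"
```

If the leak described in this ticket is present, the "after" value would be expected to be one higher than the "before" value for each timeout and reconnect cycle.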



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
