[ 
https://issues.apache.org/jira/browse/HBASE-20597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479433#comment-16479433
 ] 

Andrew Purtell edited comment on HBASE-20597 at 5/17/18 5:36 PM:
-----------------------------------------------------------------

I'm not sure whether this is only a theoretical problem, but we might as well 
fix it. If the connection problem is persistent, such as a loss of shared trust 
between the clusters, we may accumulate unclosed ZKW instances over time, each 
with its own ZK send thread and event thread, and eventually leak enough 
threads to cause an OOME (cannot allocate native thread). 



> Use a lock to serialize access to a shared reference to ZooKeeperWatcher in 
> HBaseReplicationEndpoint
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20597
>                 URL: https://issues.apache.org/jira/browse/HBASE-20597
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.2, 1.4.4
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 2.0.1, 1.4.5
>
>
> The code that closes a ZooKeeperWatcher (ZKW) that fails to initialize when 
> attempting to connect to the remote cluster is not thread-safe and can in 
> theory leak ZKW instances. Allocating a new ZKW and storing it to the shared 
> reference is not atomic. Concurrent callers might each allocate an instance 
> while only one store wins, so the overwritten instances are never closed and 
> leak. 
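
The locking approach named in the issue title can be sketched roughly as follows. This is only a minimal illustration, not the actual patch: FakeWatcher, WatcherHolder, and ReloadDemo are hypothetical stand-ins (no HBase or ZooKeeper classes are used), and the counter exists only to make leaks observable. The point is that closing the old instance and storing the new one happen under a single lock, so no instance can be overwritten without being closed.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for ZooKeeperWatcher: it only tracks whether it is open.
final class FakeWatcher {
    private final AtomicInteger openCount;
    private boolean closed;

    FakeWatcher(AtomicInteger openCount) {
        this.openCount = openCount;
        openCount.incrementAndGet(); // a real watcher would start its threads here
    }

    synchronized void close() {
        if (!closed) {
            closed = true;
            openCount.decrementAndGet();
        }
    }
}

// Hypothetical holder for the shared reference; all access goes through one lock.
final class WatcherHolder {
    private final Object lock = new Object();
    private final AtomicInteger openCount = new AtomicInteger();
    private FakeWatcher watcher; // guarded by lock

    // Close the old instance and store the new one as a single critical section.
    void reload() {
        synchronized (lock) {
            if (watcher != null) {
                watcher.close();
            }
            watcher = new FakeWatcher(openCount);
        }
    }

    void close() {
        synchronized (lock) {
            if (watcher != null) {
                watcher.close();
                watcher = null;
            }
        }
    }

    int openWatchers() {
        return openCount.get();
    }
}

final class ReloadDemo {
    // Hammer reload() from several threads and report how many watchers stay open.
    static int openAfterConcurrentReloads(int threads, int reloadsPerThread) {
        WatcherHolder holder = new WatcherHolder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < reloadsPerThread; j++) {
                    holder.reload();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return holder.openWatchers();
    }

    public static void main(String[] args) {
        System.out.println("open watchers: " + openAfterConcurrentReloads(8, 1000));
    }
}
```

Without the synchronized block in reload(), two threads could each allocate a watcher and then race on the store; the losing instance would stay open with nothing left referencing it, which is the leak described above.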



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
