[jira] [Updated] (HBASE-21228) Memory leak since AbstractFSWAL caches Thread object and never clean later

Allan Yang (JIRA) Tue, 25 Sep 2018 08:29:21 -0700


     [ https://issues.apache.org/jira/browse/HBASE-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Allan Yang updated HBASE-21228:
-------------------------------
    Description: 
In AbstractFSWAL (FSHLog in branch-1), we have a map that caches Thread 
objects together with their SyncFutures.
{code}
  /**
   * Map of {@link SyncFuture}s keyed by Handler objects. Used so we reuse SyncFutures.
   * <p>
   * TODO: Reuse FSWALEntry's rather than create them anew each time as we do SyncFutures here.
   * <p>
   * TODO: Add a FSWalEntry and SyncFuture as thread locals on handlers rather than have them get
   * them from this Map?
   */
  private final ConcurrentMap<Thread, SyncFuture> syncFuturesByHandler;
{code}

A colleague of mine found a memory leak caused by this map.

Every thread that writes to the WAL is cached in this map, and nothing removes 
the entry even after the thread is dead. 
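
To illustrate the retention pattern, here is a minimal standalone sketch. It is 
not HBase code: ThreadKeyedMapLeak, FutureStub, and the helper methods are 
hypothetical names, and FutureStub only stands in for SyncFuture. It shows that 
a ConcurrentMap keyed by Thread keeps a strong reference to every thread that 
ever registered, including threads that have already terminated.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; not HBase code.
final class ThreadKeyedMapLeak {
    // Stand-in for SyncFuture (assumption: the real class is not reproduced here).
    static final class FutureStub {}

    // Same shape as syncFuturesByHandler: strong references to Thread keys.
    static final ConcurrentMap<Thread, FutureStub> cache = new ConcurrentHashMap<>();

    // Starts n short-lived threads that each register themselves, waits for
    // each one to finish, and returns how many entries the map still holds.
    static int runShortLivedWriters(int n) {
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() ->
                cache.computeIfAbsent(Thread.currentThread(), k -> new FutureStub()));
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return cache.size();
    }

    // Counts cached keys whose thread has already terminated.
    static long countTerminated() {
        return cache.keySet().stream()
            .filter(t -> t.getState() == Thread.State.TERMINATED)
            .count();
    }

    public static void main(String[] args) {
        int retained = runShortLivedWriters(100);
        System.out.println("entries retained: " + retained);
        System.out.println("terminated threads still referenced: " + countTerminated());
    }
}
```

Because the map is never cleaned, each terminated Thread object, and everything 
reachable from it, stays on the heap for as long as the map itself is alive.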

In one of our customers' clusters, we noticed that even though there were no 
requests, the heap of the RS was almost full and CMS GC was triggered every 
second.
We dumped the heap and found more than 30 thousand threads in the Terminated 
state, all of them cached in the map above. Everything referenced by these 
threads was leaked. Most of the threads were:
1. PostOpenDeployTasksThread, which writes the Open Region marker to the WAL.
2. hconnection-0x1f838e31-shared--pool threads, which are used for short-circuit 
index writes (Phoenix); the WAL is written and synced in these threads.
3. Index writer threads (Phoenix), which are referenced by 
RegionCoprocessorHost$RegionEnvironment, then by HRegion, and finally by 
PostOpenDeployTasksThread.

We should turn this map into a thread local, so that the JVM can GC the 
terminated threads for us. 
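
A minimal sketch of the thread-local approach follows. It is an illustration 
under stated assumptions, not the actual patch: ThreadLocalSyncFutureSketch, 
SyncFutureStub, reset, and getSyncFuture are hypothetical names, and the stub 
only mimics a reusable SyncFuture. A ThreadLocal value is stored with the owning 
thread, so once the thread terminates and is otherwise unreferenced, the cached 
object becomes eligible for GC along with it.

```java
// Hypothetical sketch; names are illustrative and not taken from the HBase patch.
final class ThreadLocalSyncFutureSketch {
    // Minimal stand-in for SyncFuture: a reusable holder that can be reset.
    static final class SyncFutureStub {
        long txid;

        SyncFutureStub reset(long txid) {
            this.txid = txid;
            return this;
        }
    }

    // One cached instance per thread; no shared map keyed by Thread.
    private static final ThreadLocal<SyncFutureStub> CACHED =
        ThreadLocal.withInitial(SyncFutureStub::new);

    // Returns the calling thread's cached instance, reset for the given txid.
    static SyncFutureStub getSyncFuture(long txid) {
        return CACHED.get().reset(txid);
    }

    // Helper for the demo: fetches the instance seen by a fresh thread.
    static SyncFutureStub getSyncFutureOnNewThread(long txid) {
        SyncFutureStub[] holder = new SyncFutureStub[1];
        Thread t = new Thread(() -> holder[0] = getSyncFuture(txid));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return holder[0];
    }

    public static void main(String[] args) {
        SyncFutureStub first = getSyncFuture(1L);
        SyncFutureStub second = getSyncFuture(2L);
        System.out.println("reused within one thread: " + (first == second));
        SyncFutureStub other = getSyncFutureOnNewThread(3L);
        System.out.println("distinct across threads: " + (other != first));
    }
}
```

The reuse that the map provided is kept, since each thread still gets the same 
instance back on every call, but nothing outside the thread holds a reference 
to the Thread object itself.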


  was:
In AbstractFSWAL(FSHLog in branch-1), we have a map caches thread and 
SyncFutures.
{code}
/**
   * Map of {@link SyncFuture}s keyed by Handler objects. Used so we reuse 
SyncFutures.
   * <p>
   * TODO: Reuse FSWALEntry's rather than create them anew each time as we do 
SyncFutures here.
   * <p>
   * TODO: Add a FSWalEntry and SyncFuture as thread locals on handlers rather 
than have them get
   * them from this Map?
   */
  private final ConcurrentMap<Thread, SyncFuture> syncFuturesByHandler;
{code}

A colleague of mine find a memory leak case caused by this map.

Every thread who writes WAL will be cached in this map, And no one will clean 
the threads in the map even after the thread is dead. 

In one of our customer's cluster, we noticed that even though there is no 
requests, the heap of the RS is almost full and CMS GC was triggered every 
second.
We dumped the heap and then found out there were more than 30 thousands threads 
with Terminated state. which are all cached in this map above. Everything 
referenced in these threads were leaked. Most of the threads are:
1.PostOpenDeployTasksThread, which will write Open Region mark in WAL
2. hconnection-0x1f838e31-shared--pool, which are used to write index short 
circuit(Phoenix), and WAL will be write and sync in these threads.
3.  Index writer thread(Phoenix), which referenced by RegionEnvironment  then 
by HRegion and finally been referenced by PostOpenDeployTasksThread.

We should turn this map into a thread local one, let JVM GC the terminated 
thread for us. 



> Memory leak since AbstractFSWAL caches Thread object and never clean later
> --------------------------------------------------------------------------
>
>                 Key: HBASE-21228
>                 URL: https://issues.apache.org/jira/browse/HBASE-21228
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.1.0, 2.0.2, 1.4.7
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
