[
https://issues.apache.org/jira/browse/HBASE-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627558#comment-16627558
]
Mike Drob commented on HBASE-21228:
-----------------------------------
If we override the {{initialValue}} method on the ThreadLocal, we can simplify
the logic in {{getSyncFuture}}.
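
Roughly what I have in mind, as a minimal sketch rather than the actual patch. {{SketchSyncFuture}} and {{SyncFutureCacheSketch}} are hypothetical stand-ins, not the real HBase classes, and the {{reset}} signature is an assumption made for illustration:

```java
// Hypothetical stand-in for HBase's SyncFuture; the real class carries more state.
final class SketchSyncFuture {
    // Records the thread that created this instance (assumption, for illustration only).
    private final Thread owner = Thread.currentThread();
    private long txid = -1L;

    SketchSyncFuture reset(long txid) {
        this.txid = txid;
        return this;
    }

    long getTxid() {
        return txid;
    }

    Thread getOwner() {
        return owner;
    }
}

// Hypothetical holder that plays the role of the WAL in this sketch.
final class SyncFutureCacheSketch {
    // Overriding initialValue() means the first get() on each thread creates
    // that thread's instance, so no null check or explicit set() is needed.
    private final ThreadLocal<SketchSyncFuture> cachedSyncFutures =
        new ThreadLocal<SketchSyncFuture>() {
            @Override
            protected SketchSyncFuture initialValue() {
                return new SketchSyncFuture();
            }
        };

    // With initialValue() in place, the lookup collapses to a single expression.
    SketchSyncFuture getSyncFuture(long sequence) {
        return cachedSyncFutures.get().reset(sequence);
    }
}
```

Each thread keeps reusing its own instance across calls, and the value is no longer reachable through the ThreadLocal once the thread has terminated.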
> Memory leak since AbstractFSWAL caches Thread object and never clean later
> --------------------------------------------------------------------------
>
> Key: HBASE-21228
> URL: https://issues.apache.org/jira/browse/HBASE-21228
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.0, 2.0.2, 1.4.7
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21228.branch-2.0.001.patch
>
>
> In AbstractFSWAL (FSHLog in branch-1), we have a map that caches threads and
> their SyncFutures.
> {code}
> /**
> * Map of {@link SyncFuture}s keyed by Handler objects. Used so we reuse
> * SyncFutures.
> * <p>
> * TODO: Reuse FSWALEntry's rather than create them anew each time as we do
> * SyncFutures here.
> * <p>
> * TODO: Add a FSWalEntry and SyncFuture as thread locals on handlers
> * rather than have them get them from this Map?
> */
> private final ConcurrentMap<Thread, SyncFuture> syncFuturesByHandler;
> {code}
> A colleague of mine found a memory leak caused by this map.
> Every thread that writes to the WAL is cached in this map, and nothing removes
> a thread from the map even after the thread has died.
> In one of our customers' clusters, we noticed that even though there were no
> requests, the heap of the RS was almost full and CMS GC was triggered every
> second.
> We dumped the heap and found more than 30 thousand threads in the
> Terminated state, all of them cached in the map above.
> Everything referenced by these threads was leaked. Most of the threads are:
> 1. PostOpenDeployTasksThread, which writes the Open Region mark to the WAL.
> 2. hconnection-0x1f838e31-shared--pool, which is used to write index short
> circuit (Phoenix); the WAL is written and synced in these threads.
> 3. Index writer thread (Phoenix), which is referenced by
> RegionCoprocessorHost$RegionEnvironment, then by HRegion, and finally by
> PostOpenDeployTasksThread.
> We should turn this map into a thread-local one and let the JVM GC the
> terminated threads for us.
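
For anyone who wants to see the retention described in the report in miniature, here is a small sketch, not HBase code. {{ThreadKeyedMapLeakSketch}} is a hypothetical class, and a plain {{Object}} stands in for SyncFuture. It shows that a map keyed by Thread keeps strong references to threads long after they have terminated:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; not part of HBase.
final class ThreadKeyedMapLeakSketch {
    // Same shape as syncFuturesByHandler; Object stands in for SyncFuture.
    final ConcurrentMap<Thread, Object> byThread = new ConcurrentHashMap<>();

    // Starts `count` short-lived threads that each register themselves in the
    // map, then waits for all of them to finish.
    void runShortLivedWriters(int count) throws InterruptedException {
        Thread[] writers = new Thread[count];
        for (int i = 0; i < count; i++) {
            writers[i] = new Thread(() -> byThread.put(Thread.currentThread(), new Object()));
            writers[i].start();
        }
        for (Thread writer : writers) {
            writer.join();
        }
    }

    // Counts map keys whose thread has already terminated. These entries are
    // still strongly reachable, so the GC cannot reclaim the Thread objects.
    long terminatedEntries() {
        return byThread.keySet().stream()
            .filter(t -> t.getState() == Thread.State.TERMINATED)
            .count();
    }
}
```

After {{runShortLivedWriters}} returns, every writer thread is dead but the map still holds one entry per thread, which is the pattern the heap dump showed at a much larger scale.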