[ 
https://issues.apache.org/jira/browse/HBASE-28222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790871#comment-17790871
 ] 

Bryan Beaudreault commented on HBASE-28222:
-------------------------------------------

I found a related issue: https://issues.apache.org/jira/browse/HBASE-20433. 
I'm thinking of tracing the verifySnapshot calls and, if possible, passing a 
single shared FileSystem object through them.

> Leak in ExportSnapshot during verifySnapshot on S3A
> ---------------------------------------------------
>
>                 Key: HBASE-28222
>                 URL: https://issues.apache.org/jira/browse/HBASE-28222
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Bryan Beaudreault
>            Priority: Major
>
> Each S3AFileSystem creates an S3AInstrumentation and various metrics sources, 
> with no real way to disable that. HADOOP-18526 fixed a bug so that these are 
> no longer leaked, but the fix only takes effect if you call 
> S3AFileSystem.close() when done.
>
> In ExportSnapshot, ever since HBASE-12819 we set fs.impl.disable.cache to 
> true. It looks like that was added to prevent conflicting calls to close() 
> between the mapper and the main thread when running in a single JVM.
>
> When verifySnapshot is enabled, SnapshotReferenceUtil.verifySnapshot iterates 
> over all storefiles (could be many thousands) and calls 
> SnapshotReferenceUtil.verifyStoreFile on each. verifyStoreFile makes a number 
> of static calls which end up in CommonFSUtils.getRootDir, which does 
> Path.getFileSystem().
>
> Since the FS cache is disabled, every single call to Path.getFileSystem() 
> creates a new FileSystem instance. That FS is short-lived and gets GC'd 
> without close() ever being called, so in the case of S3AFileSystem its 
> instrumentation and metrics sources are leaked.
>
> We have two easy possible fixes:
>  # Only set fs.impl.disable.cache when running Hadoop in local mode, since 
> that was the original problem.
>  # When calling verifySnapshot, create a new Configuration which does not 
> include the fs.impl.disable.cache setting.
>
> I tested out #2 in my environment and it fixed the leak.
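
The mechanism in the quoted description, and why fix #2 helps, can be 
illustrated with a toy model in plain Java. This is only a sketch under stated 
assumptions: ToyFileSystem, METRICS_REGISTRY, CACHE, getFileSystem, 
withoutDisableCache and verifySnapshot below are hypothetical stand-ins, not 
the real Hadoop or HBase classes, and a plain Map stands in for Configuration. 
Only the fs.impl.disable.cache key is taken from the report.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are the real Hadoop/HBase classes.
class FsCacheLeakSketch {
    // Hypothetical stand-in for a process-wide metrics registry.
    static final Set<Object> METRICS_REGISTRY = new HashSet<>();
    // Hypothetical stand-in for the FileSystem cache, keyed by URI string.
    static final Map<String, ToyFileSystem> CACHE = new HashMap<>();

    // Registers a metrics source on creation; only close() unregisters it.
    static final class ToyFileSystem implements AutoCloseable {
        private final Object metricsSource = new Object();
        ToyFileSystem() { METRICS_REGISTRY.add(metricsSource); }
        @Override public void close() { METRICS_REGISTRY.remove(metricsSource); }
    }

    // With the cache disabled, every call builds a fresh instance;
    // otherwise one instance per URI is created and then reused.
    static ToyFileSystem getFileSystem(String uri, Map<String, String> conf) {
        boolean disableCache =
            Boolean.parseBoolean(conf.getOrDefault("fs.impl.disable.cache", "false"));
        if (disableCache) {
            return new ToyFileSystem();
        }
        return CACHE.computeIfAbsent(uri, u -> new ToyFileSystem());
    }

    // Models fix #2: a copy of the configuration without the cache-disabling key.
    static Map<String, String> withoutDisableCache(Map<String, String> conf) {
        Map<String, String> copy = new HashMap<>(conf);
        copy.remove("fs.impl.disable.cache");
        return copy;
    }

    // Looks up a file system once per store file and never closes it, then
    // returns how many metrics sources are still registered.
    static int verifySnapshot(Map<String, String> conf, int storeFiles) {
        METRICS_REGISTRY.clear();
        CACHE.clear();
        for (int i = 0; i < storeFiles; i++) {
            getFileSystem("s3a://bucket/hbase", conf); // placeholder URI
        }
        return METRICS_REGISTRY.size();
    }

    public static void main(String[] args) {
        Map<String, String> exportConf = new HashMap<>();
        exportConf.put("fs.impl.disable.cache", "true");
        // Cache disabled: one registered source per lookup.
        System.out.println(verifySnapshot(exportConf, 1000));
        // Fix #2: the cached instance is reused, so a single source remains.
        System.out.println(verifySnapshot(withoutDisableCache(exportConf), 1000));
    }
}
```

In this model the registered-source count grows with the number of store files 
while the cache is disabled, and stays at one when verification runs against 
the copied configuration. The original map is left untouched, so the rest of 
the export would still see fs.impl.disable.cache=true.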



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
