[ https://issues.apache.org/jira/browse/HBASE-28222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790871#comment-17790871 ]
Bryan Beaudreault commented on HBASE-28222:
-------------------------------------------
I discovered a related issue: https://issues.apache.org/jira/browse/HBASE-20433.
I'm thinking I might try tracing the verifySnapshot calls and passing a
single, unified FileSystem object through them, if possible.
> Leak in ExportSnapshot during verifySnapshot on S3A
> ---------------------------------------------------
>
> Key: HBASE-28222
> URL: https://issues.apache.org/jira/browse/HBASE-28222
> Project: HBase
> Issue Type: Bug
> Reporter: Bryan Beaudreault
> Priority: Major
>
> Each S3AFileSystem creates an S3AInstrumentation and various metrics sources,
> with no real way to disable that. In HADOOP-18526, a bug was fixed so that
> these are not leaked. But in order to use that, you must call
> S3AFileSystem.close() when done.
> In ExportSnapshot, ever since HBASE-12819 we set fs.impl.disable.cache to
> true. It looks like that was added in order to prevent conflicting calls to
> close() between mapper and main thread when running in a single JVM.
> When verifySnapshot is enabled, SnapshotReferenceUtil.verifySnapshot iterates
> all storefiles (could be many thousands) and calls
> SnapshotReferenceUtil.verifyStoreFile on them. verifyStoreFile makes a number
> of static calls which end up in CommonFSUtils.getRootDir, which does
> Path.getFileSystem().
> Since the FS cache is disabled, every single call to Path.getFileSystem()
> creates a new FileSystem instance. That FS is short-lived and gets GC'd, but
> it is never closed. In the case of S3AFileSystem, that leaks the
> S3AInstrumentation and metrics sources created with each instance.
> We have two easy possible fixes:
> # Only set fs.impl.disable.cache when running Hadoop in local mode, since
> that was the original problem.
> # When calling verifySnapshot, create a new Configuration which does not
> include the fs.impl.disable.cache setting.
> I tested out #2 in my environment and it fixed the leak.
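
To illustrate the mechanism described above, here is a minimal, self-contained sketch. It is a toy model, not Hadoop or HBase code: ToyFileSystem, ToyConf, getFileSystem and verifyAll are hypothetical stand-ins for S3AFileSystem, Configuration, Path.getFileSystem() and the verifyStoreFile loop. The per-scheme key fs.s3a.impl.disable.cache is an assumption about what "fs.impl.disable.cache" abbreviates. The sketch only counts how many metrics registrations stay live when nothing calls close(), with the cache disabled and then with a copied configuration that drops the setting (roughly fix #2).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the leak; none of these classes are Hadoop's or HBase's. */
public class FsCacheLeakSketch {
    /** Stand-in for process-wide metrics registrations that only close() releases. */
    static int liveMetricsSources = 0;

    /** Stand-in for S3AFileSystem: registers metrics on creation, releases on close(). */
    static final class ToyFileSystem implements AutoCloseable {
        ToyFileSystem() { liveMetricsSources++; }
        @Override public void close() { liveMetricsSources--; }
    }

    /** Stand-in for Configuration: a bag of string properties with a copy constructor. */
    static final class ToyConf {
        final Map<String, String> props = new HashMap<>();
        ToyConf() {}
        ToyConf(ToyConf other) { props.putAll(other.props); }
    }

    /** Stand-in for the FileSystem cache, keyed by scheme only for simplicity. */
    static final Map<String, ToyFileSystem> CACHE = new HashMap<>();

    /** Returns a cached instance unless the (assumed) per-scheme disable flag is set. */
    static ToyFileSystem getFileSystem(String scheme, ToyConf conf) {
        String key = "fs." + scheme + ".impl.disable.cache";
        boolean disableCache = Boolean.parseBoolean(conf.props.getOrDefault(key, "false"));
        if (disableCache) {
            return new ToyFileSystem(); // new instance on every call
        }
        return CACHE.computeIfAbsent(scheme, s -> new ToyFileSystem());
    }

    /**
     * Simulates verifying storeFiles files, each of which looks up the FS and never
     * closes it. Returns how many metrics registrations the loop left behind.
     */
    static int verifyAll(int storeFiles, ToyConf conf) {
        int before = liveMetricsSources;
        for (int i = 0; i < storeFiles; i++) {
            getFileSystem("s3a", conf); // result dropped without close(), as in the report
        }
        return liveMetricsSources - before;
    }

    public static void main(String[] args) {
        // Configuration as the export job would see it: cache disabled.
        ToyConf exportConf = new ToyConf();
        exportConf.props.put("fs.s3a.impl.disable.cache", "true");
        System.out.println("cache disabled: " + verifyAll(5000, exportConf)); // prints 5000

        // Sketch of fix #2: copy the configuration and drop the setting for verification.
        ToyConf verifyConf = new ToyConf(exportConf);
        verifyConf.props.remove("fs.s3a.impl.disable.cache");
        System.out.println("cache enabled:  " + verifyAll(5000, verifyConf)); // prints 1
    }
}
```

In this model the copied configuration leaves a single cached instance behind instead of one per store file, which matches the observation that fix #2 removed the leak. Whether a real patch should copy the configuration or thread one FileSystem through the verify calls is left open here.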