[ https://issues.apache.org/jira/browse/HBASE-28222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Beaudreault resolved HBASE-28222.
---------------------------------------
Fix Version/s: 2.6.0, 3.0.0-beta-1
Release Note: ExportSnapshot now uses FileSystems from the global
FileSystem cache and therefore does not close them when it finishes. Users
who run ExportSnapshot repeatedly in a single process against different
FileSystem URLs should call FileSystem.closeAll() between runs. See the JIRA
for details.
Assignee: Bryan Beaudreault
Resolution: Fixed
Pushed to master, branch-3, branch-2, branch-2.6. Thanks for the review
[~wchevreuil]!
I did not push to older branches, even though this is a bug, because the
change in behavior might be unexpected there. We can backport it if there is
a desire.
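
To illustrate the release note, here is a minimal toy model in plain Java. It is not Hadoop code: the class GlobalCacheSketch, its CACHE map, and the methods runExport and runAll are hypothetical stand-ins, and the example URLs are made up. It only sketches why a process that exports to several different FileSystem URLs accumulates cached instances unless it clears the cache between runs, which is what the release note suggests doing with FileSystem.closeAll().

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Hadoop): a process-wide cache keyed by FileSystem URL. */
final class GlobalCacheSketch {
    // Hypothetical stand-in for the global FileSystem cache.
    static final Map<String, Object> CACHE = new HashMap<>();

    /** Returns the cached instance for a URL, creating it on first use. */
    static Object get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new Object());
    }

    /** Stand-in for one export run: it uses the cached instance and never closes it. */
    static void runExport(String destUri) {
        get(destUri);
    }

    /** Stand-in for FileSystem.closeAll(): drops every cached instance. */
    static void closeAll() {
        CACHE.clear();
    }

    /** Runs one export per destination and returns the peak number of cached instances. */
    static int runAll(String[] dests, boolean closeBetweenRuns) {
        CACHE.clear();
        int peak = 0;
        for (String dest : dests) {
            runExport(dest);
            peak = Math.max(peak, CACHE.size());
            if (closeBetweenRuns) {
                closeAll();
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        String[] dests = {"s3a://bucket-a", "s3a://bucket-b", "s3a://bucket-c"};
        System.out.println(runAll(dests, false)); // prints 3
        System.out.println(runAll(dests, true));  // prints 1
    }
}
```

In this model the cache grows by one entry per distinct URL, so clearing it after each run keeps at most one instance alive at a time.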
> Leak in ExportSnapshot during verifySnapshot on S3A
> ---------------------------------------------------
>
> Key: HBASE-28222
> URL: https://issues.apache.org/jira/browse/HBASE-28222
> Project: HBase
> Issue Type: Bug
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> Each S3AFileSystem creates an S3AInstrumentation and various metrics sources,
> with no real way to disable that. In HADOOP-18526, a bug was fixed so that
> these are not leaked. To benefit from that fix, however, you must call
> S3AFileSystem.close() when done.
> In ExportSnapshot, ever since HBASE-12819 we set fs.impl.disable.cache to
> true. It looks like that was added in order to prevent conflicting calls to
> close() between mapper and main thread when running in a single JVM.
> When verifySnapshot is enabled, SnapshotReferenceUtil.verifySnapshot iterates
> all storefiles (could be many thousands) and calls
> SnapshotReferenceUtil.verifyStoreFile on them. verifyStoreFile makes a number
> of static calls which end up in CommonFSUtils.getRootDir, which does
> Path.getFileSystem().
> Since the FS cache is disabled, every single call to Path.getFileSystem()
> creates a new FileSystem instance. Each instance is short-lived and gets
> GC'd, but because close() is never called on it, an S3AFileSystem leaks its
> S3AInstrumentation and metrics sources.
> We have two easy possible fixes:
> # Only set fs.impl.disable.cache when running Hadoop in local mode, since
> that was the original problem.
> # When calling verifySnapshot, create a new Configuration which does not
> include the fs.impl.disable.cache setting.
> I tested out #2 in my environment and it fixed the leak.
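
The mechanism described above, and fix #2, can be sketched with a small toy model in plain Java. This is not the actual patch and does not use Hadoop: VerifyLeakSketch, ToyFs, the liveMetricsSources counter, and the example URL are all hypothetical, and the configuration is modeled as a simple Map. The key name is taken as written in the issue text.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Hadoop) of per-call FileSystem creation when the cache is disabled. */
final class VerifyLeakSketch {
    // Key as named in the issue description.
    static final String DISABLE_CACHE = "fs.impl.disable.cache";

    static final Map<String, ToyFs> CACHE = new HashMap<>();
    // Counts metrics sources that were registered and never released.
    static int liveMetricsSources = 0;

    /** Hypothetical stand-in for a FileSystem that registers metrics on creation. */
    static final class ToyFs {
        ToyFs() { liveMetricsSources++; }
        void close() { liveMetricsSources--; } // only close() releases the metrics
    }

    /** With the cache disabled, every call builds a fresh instance. */
    static ToyFs getFileSystem(String uri, Map<String, String> conf) {
        if ("true".equals(conf.get(DISABLE_CACHE))) {
            return new ToyFs();
        }
        return CACHE.computeIfAbsent(uri, u -> new ToyFs());
    }

    /** Looks up a FileSystem once per store file, never closing it; returns live sources. */
    static int verify(int storeFiles, Map<String, String> conf) {
        CACHE.clear();
        liveMetricsSources = 0;
        for (int i = 0; i < storeFiles; i++) {
            getFileSystem("s3a://bucket/hbase", conf);
        }
        return liveMetricsSources;
    }

    /** Fix #2 in miniature: a copy of the configuration without the disable-cache key. */
    static Map<String, String> withoutDisableCache(Map<String, String> conf) {
        Map<String, String> copy = new HashMap<>(conf);
        copy.remove(DISABLE_CACHE);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> exportConf = new HashMap<>();
        exportConf.put(DISABLE_CACHE, "true");
        System.out.println(verify(3000, exportConf));                      // prints 3000
        System.out.println(verify(3000, withoutDisableCache(exportConf))); // prints 1
    }
}
```

The copy leaves the original configuration untouched, so in this model the rest of the job would still run with the cache disabled while verification reuses a single cached instance.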