[ 
https://issues.apache.org/jira/browse/HBASE-28222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault resolved HBASE-28222.
---------------------------------------
    Fix Version/s: 2.6.0
                   3.0.0-beta-1
     Release Note: ExportSnapshot now uses FileSystems from the global 
FileSystem cache, and as such does not close those FileSystems when it 
finishes. If users plan to run ExportSnapshot repeatedly in a single process 
for different FileSystem URLs, they should call FileSystem.closeAll() between 
runs. See JIRA for details.
         Assignee: Bryan Beaudreault
       Resolution: Fixed
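
The release note above can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the patch: the class name RepeatedExportSketch, the snapshot name, and the destination URLs are hypothetical, and the -snapshot and -copy-to options are the ones shown in the HBase reference guide, so check them against your version.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.snapshot.ExportSnapshot;
import org.apache.hadoop.util.ToolRunner;

/** Hypothetical driver: exports one snapshot to several destinations in one JVM. */
public class RepeatedExportSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical snapshot name and destination URLs, for illustration only.
        String snapshot = "my_snapshot";
        String[] destinations = { "s3a://bucket-a/hbase", "s3a://bucket-b/hbase" };

        for (String dest : destinations) {
            Configuration conf = HBaseConfiguration.create();
            int exitCode;
            try {
                exitCode = ToolRunner.run(conf, new ExportSnapshot(),
                    new String[] { "-snapshot", snapshot, "-copy-to", dest });
            } finally {
                // ExportSnapshot no longer closes the cached FileSystems itself,
                // so close everything in the global cache before the next run.
                FileSystem.closeAll();
            }
            if (exitCode != 0) {
                throw new IOException("Export to " + dest + " failed with exit code " + exitCode);
            }
        }
    }
}
```

Note that FileSystem.closeAll() closes every cached FileSystem in the process, so this pattern only suits a process where nothing else is still using one of those instances.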

Pushed to master, branch-3, branch-2, branch-2.6. Thanks for the review 
[~wchevreuil]!

I did not push to older branches even though this is a bug, because it might 
be an unexpected change there. We can backport if there is a desire.

> Leak in ExportSnapshot during verifySnapshot on S3A
> ---------------------------------------------------
>
>                 Key: HBASE-28222
>                 URL: https://issues.apache.org/jira/browse/HBASE-28222
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>             Fix For: 2.6.0, 3.0.0-beta-1
>
>
> Each S3AFileSystem creates an S3AInstrumentation and various metrics sources, 
> with no real way to disable that. In HADOOP-18526, a bug was fixed so that 
> these are not leaked. To benefit from that fix, however, you must call 
> S3AFileSystem.close() when done.
> In ExportSnapshot, ever since HBASE-12819 we set fs.impl.disable.cache to 
> true. It looks like that was added in order to prevent conflicting calls to 
> close() between mapper and main thread when running in a single JVM.
> When verifySnapshot is enabled, SnapshotReferenceUtil.verifySnapshot iterates 
> all storefiles (could be many thousands) and calls 
> SnapshotReferenceUtil.verifyStoreFile on them. verifyStoreFile makes a number 
> of static calls which end up in CommonFSUtils.getRootDir, which does 
> Path.getFileSystem().
> Since the FS cache is disabled, every single call to Path.getFileSystem() 
> creates a new FileSystem instance. That FS is short-lived and gets garbage 
> collected. But in the case of S3AFileSystem, close() is never called, so its 
> S3AInstrumentation and metrics sources are leaked.
> There are two easy possible fixes:
>  # Only set fs.impl.disable.cache when running Hadoop in local mode, since 
> that was the original problem.
>  # When calling verifySnapshot, create a new Configuration which does not 
> include the fs.impl.disable.cache setting.
> I tested option 2 in my environment and it fixed the leak.
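
A minimal sketch of the mechanism described in the quoted issue, written as a toy model and not as real Hadoop or HBase code. Every class and method name here (FsCacheLeakSketch, ToyFileSystem, REGISTERED_METRICS, verify) is hypothetical, and the configuration is a plain Map that reuses the key as written in the issue. It is meant to show why one uncached lookup per store file leaves one unclosed instance registered per file, and how the second option (a copied configuration without the setting) reduces that to a single cached instance.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only: none of these classes are Hadoop or HBase APIs. */
public class FsCacheLeakSketch {
    /** Key as written in the issue text; here it is just a map key. */
    static final String DISABLE_CACHE = "fs.impl.disable.cache";

    /** Stands in for process-wide metrics sources that only close() unregisters. */
    static final Set<ToyFileSystem> REGISTERED_METRICS = new HashSet<>();

    /** Stands in for the global FileSystem cache, keyed by URI string. */
    static final Map<String, ToyFileSystem> CACHE = new HashMap<>();

    static final class ToyFileSystem implements AutoCloseable {
        ToyFileSystem() { REGISTERED_METRICS.add(this); }
        @Override public void close() { REGISTERED_METRICS.remove(this); }
    }

    /** Models a filesystem lookup: a new instance per call if the cache is disabled. */
    static ToyFileSystem getFileSystem(String uri, Map<String, String> conf) {
        if (Boolean.parseBoolean(conf.getOrDefault(DISABLE_CACHE, "false"))) {
            return new ToyFileSystem();
        }
        return CACHE.computeIfAbsent(uri, u -> new ToyFileSystem());
    }

    /** Models verification: one lookup per store file, never closing the result. */
    static int verify(int storeFiles, Map<String, String> conf) {
        REGISTERED_METRICS.clear();
        CACHE.clear();
        for (int i = 0; i < storeFiles; i++) {
            getFileSystem("s3a://bucket/hbase", conf);
        }
        // Number of metrics registrations still alive after verification.
        return REGISTERED_METRICS.size();
    }

    public static void main(String[] args) {
        Map<String, String> exportConf = new HashMap<>();
        exportConf.put(DISABLE_CACHE, "true");
        System.out.println("cache disabled: " + verify(1000, exportConf));

        // Option 2: copy the configuration and drop the setting for verification only.
        Map<String, String> verifyConf = new HashMap<>(exportConf);
        verifyConf.remove(DISABLE_CACHE);
        System.out.println("cache enabled: " + verify(1000, verifyConf));
    }
}
```

Run as written, the model reports 1000 live registrations with the cache disabled and 1 with the cache enabled.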



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
