[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791293#comment-17791293 ]

Bryan Beaudreault commented on HADOOP-15392:

Hi, sorry, I know this is pretty old. There are a number of JIRAs related to this, and I started looking at it again recently in https://issues.apache.org/jira/browse/HBASE-28222.

I can say that the problem occurs when verifySnapshot is enabled. This happens in the launcher, not the MR job. After the MR job completes, the verifySnapshot code iterates over all storefiles in the snapshot. For the examples above, they had 50k storefiles. For each storefile, it calls numerous methods which leak FileSystems. For example, multiple calls (through different paths) are made to this method in CommonFSUtils:

{code:java}
public static Path getRootDir(final Configuration c) throws IOException {
  Path p = new Path(c.get(HConstants.HBASE_DIR));
  FileSystem fs = p.getFileSystem(c);
  return p.makeQualified(fs.getUri(), fs.getWorkingDirectory());
}
{code}

When the cache is disabled (as it is in ExportSnapshot), this simple code leaks: every call creates a new FileSystem that nothing ever closes. There are other examples of this same pattern.

It's very hard to fix this, because this method (and a couple of others) is used almost everywhere in HBase. In most cases, caching is enabled since that's the default, but not for ExportSnapshot.

I've considered enabling caching for the verifySnapshot portion. That is probably the most future-proof way, given the constraints of FileSystem, but it's not bulletproof. There are cases where ExportSnapshot can be called over and over for different sources/destinations. In that case, the cache would build up over time. Granted, this would be much slower growth than what verifySnapshot produces, so that's why I think it might be worth it.

Here's an easy way to visualize the current problem of ExportSnapshot:

{code:java}
for (int i = 0; i < 1000; i++) {
  Configuration conf = new Configuration();
  conf.setBoolean("fs.impl.disable.cache", true);
  try (FileSystem fs = FileSystem.get(conf)) {
    doSomethingWithFs(fs);
    someCodeWhichCallsGetRootDirManyTimes(conf);
  }
}
{code}

You are looking at this code and saying "there's no leak, FileSystem gets closed." But you don't realize that someCodeWhichCallsGetRootDirManyTimes creates a million other transient FileSystems through a bunch of calls to getRootDir, and they don't get cached because of fs.impl.disable.cache.

We can't easily use FileSystem.closeAll(), because there's no guarantee this job gets called in isolation; you may inadvertently close an in-use FS from some other thread in the process.

It would be helpful if the FileSystem cache could be namespaced, and closed via namespace:

{code:java}
try {
  for (int i = 0; i < 1000; i++) {
    Configuration conf = new Configuration();
    // Proposed setting; it does not exist today.
    conf.set("fs.cache.namespace", "foo");
    FileSystem fs = FileSystem.get(conf);
    doSomethingWithFs(fs);
    someCodeWhichCallsGetRootDirManyTimes(conf);
  }
} finally {
  // Proposed method; it does not exist today.
  FileSystem.closeAllForCacheNamespace("foo");
}
{code}

This code would not leak or conflict with other cached FileSystems from other threads.

> S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
>
> Key: HADOOP-15392
> URL: https://issues.apache.org/jira/browse/HADOOP-15392
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.1.0
> Reporter: Voyta
> Priority: Major
>
> While using HBase S3A Export Snapshot utility we started to experience memory
> leaks of the process after version upgrade.
> By running code analysis we traced the cause to revision
> 6555af81a26b0b72ec3bee7034e01f5bd84b1564 that added the following static
> reference (singleton):
> private static MetricsSystem metricsSystem = null;
> When application uses S3AFileSystem instance that is not closed immediately
> metrics are accumulated in this instance and memory grows without any limit.
>
> Expectation:
> * It would be nice to have an option to disable metrics completely as this
> is not needed for Export Snapshot utility.
> * Usage of S3AFileSystem should not contain any static object that can grow
> indefinitely.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
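To make the namespaced-cache idea in the comment above concrete, here is a minimal sketch in plain Java with no Hadoop dependency. Everything in it is hypothetical: `NamespacedCache`, `Resource`, `get` and `closeAllForNamespace` are illustration names only, and `Resource` merely stands in for a FileSystem. It is not a Hadoop API and not a proposal for how Hadoop would implement it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a cache partitioned by namespace; not a Hadoop API. */
final class NamespacedCache {

    /** Stand-in for a FileSystem: something that must be closed exactly once. */
    public static final class Resource implements AutoCloseable {
        private final String key;
        private boolean closed;

        Resource(String key) { this.key = key; }

        public String key() { return key; }

        public boolean isClosed() { return closed; }

        @Override
        public void close() { closed = true; }
    }

    // namespace -> (key -> cached resource)
    private final Map<String, Map<String, Resource>> namespaces = new HashMap<>();

    /** Returns the cached resource for (namespace, key), creating it on first use. */
    public synchronized Resource get(String namespace, String key) {
        return namespaces
            .computeIfAbsent(namespace, ns -> new HashMap<>())
            .computeIfAbsent(key, Resource::new);
    }

    /** Closes and forgets every resource in one namespace; returns how many were closed. */
    public synchronized int closeAllForNamespace(String namespace) {
        Map<String, Resource> removed = namespaces.remove(namespace);
        if (removed == null) {
            return 0;
        }
        for (Resource r : removed.values()) {
            r.close();
        }
        return removed.size();
    }

    /** Number of live cached resources in a namespace. */
    public synchronized int size(String namespace) {
        Map<String, Resource> m = namespaces.get(namespace);
        return m == null ? 0 : m.size();
    }
}
```

The point of the design is that repeated lookups inside one namespace reuse a single instance, and closing that namespace leaves resources held by other callers untouched, which is the property FileSystem.closeAll() cannot offer.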
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550220#comment-16550220 ]

Steve Loughran commented on HADOOP-15392:

removed the 3.1.1 target marker
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474063#comment-16474063 ]

Steve Loughran commented on HADOOP-15392:

I'm thinking about whether we can reproduce this reliably enough to say "block for 3.1.1", or indeed, how to address it. WASB also sets up metrics in FS create, lots of people have been using that for a long time, and this hasn't surfaced.

I'm going to change the priority to Major, but we need more reproductions to say this is anything more than a special case. Still curious as to why it's surfacing at all. If it's only in CDH, maybe the HBase code there isn't closing the fs instances?
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471330#comment-16471330 ]

Wangda Tan commented on HADOOP-15392:

This is marked as a blocker for 3.1.1, which we plan to release soon. [~mackrorysd], [~fabbri], [~Krizek], could you give an update on the status of this Jira?
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456971#comment-16456971 ]

Aaron Fabbri commented on HADOOP-15392:

Hey [~Krizek], thanks for working with the community on this. CDH ships a pretty recent version of upstream S3A. [~mackrorysd] works with me. He did try to reproduce this on a recent CDH distribution. I've been following the thread here and am stumped based on the evidence so far.
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456141#comment-16456141 ]

Voyta commented on HADOOP-15392:

[~mackrorysd] Thank you for your investigation. It should not be any of those, as MapReduce is running in YARN and the default file system and cache are not on S3.

That said, it's worth mentioning that we run a Cloudera-forked version of Hadoop/HBase (Cloudera Manager 5.14.0). So far I haven't seen any difference in the code in question. Maybe [~fabbri] can clarify whether the forked version alters this code?
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455484#comment-16455484 ]

Sean Mackrory commented on HADOOP-15392:

[~Krizek] I'm trying to think of other ways that many instances might be getting created. Is there any chance that MapReduce is running in local mode, or that the default filesystem / dist. cache is on s3a and we're possibly loading a ton of JARs from there or something?
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454703#comment-16454703 ]

Sean Mackrory commented on HADOOP-15392:

Like Steve said, I don't see where the other almost 53k instances are coming from, but I think that's the root concern here.
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454700#comment-16454700 ]

Sean Mackrory commented on HADOOP-15392:

So I did that this morning: with a snapshot consisting of about 100 files, I still only saw 1 FS instance created for the snapshot parent directory, and 1 FS instance created for this specific snapshot (the one where fs cache is disabled). A single metrics system started up at the first instance and was reused for both instances. As each was closed, the ref count was decremented correctly, and the metrics system finally shut down when, and only when, the final instance was closed.
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453176#comment-16453176 ]

Sean Mackrory commented on HADOOP-15392:

Ooh yeah that's a good idea. I'll try an export run with a bunch of instrumentation added to ExportSnapshot and S3A.
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453160#comment-16453160 ]

Ted Yu commented on HADOOP-15392:

I meant running ExportSnapshot with the DEBUG log.
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453148#comment-16453148 ]

Sean Mackrory commented on HADOOP-15392:

[~te...@apache.org] I ran all of the s3a tests with such logging, and I checked that the ref counting stepped up and down exactly as expected. Tests that closed filesystems always reached 0 correctly - there was no skipping to -1 or getting stuck at 1.
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453144#comment-16453144 ]

Ted Yu commented on HADOOP-15392:

From S3AInstrumentation:

{code}
public void close() {
  synchronized (metricsSystemLock) {
    metricsSystem.unregisterSource(metricsSourceName);
    int activeSources = --metricsSourceActiveCounter;
    if (activeSources == 0) {
      metricsSystem.publishMetricsNow();
      metricsSystem.shutdown();
      metricsSystem = null;
    }
  }
}
{code}

How about adding a DEBUG log with the value of activeSources so that we know whether the {{activeSources == 0}} case is ever reached?
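The reference-counting pattern in the quoted close() method, together with the suggested logging of activeSources, can be sketched in plain Java as follows. This is a hypothetical stand-in, not the actual S3AInstrumentation code: `RefCountedMetrics`, its in-memory `LOG` list (used here in place of a real DEBUG logger), and the idempotent close guard are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a shared, reference-counted metrics system. */
final class RefCountedMetrics implements AutoCloseable {
    private static final Object LOCK = new Object();
    private static int activeSources = 0;
    private static boolean systemRunning = false;

    /** In-memory stand-in for DEBUG logging, so the counter can be inspected. */
    static final List<String> LOG = new ArrayList<>();

    private boolean closed = false;

    RefCountedMetrics() {
        synchronized (LOCK) {
            if (activeSources == 0) {
                systemRunning = true;   // first source starts the shared system
            }
            activeSources++;
            LOG.add("open: activeSources=" + activeSources);
        }
    }

    @Override
    public void close() {
        synchronized (LOCK) {
            if (closed) {
                return;                 // assumption: a second close() is a no-op
            }
            closed = true;
            activeSources--;
            LOG.add("close: activeSources=" + activeSources);
            if (activeSources == 0) {
                systemRunning = false;  // last source shuts the shared system down
            }
        }
    }

    static int activeSources() {
        synchronized (LOCK) {
            return activeSources;
        }
    }

    static boolean isSystemRunning() {
        synchronized (LOCK) {
            return systemRunning;
        }
    }
}
```

With this shape, an instance that is created but never closed keeps the counter above zero indefinitely, so the shared system never shuts down; the logged counter values make that visible.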
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452677#comment-16452677 ]

Steve Loughran commented on HADOOP-15392:

And of course, what do we see in ExportSnapshot [line 792|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#L942]

{code}
srcConf.setBoolean("fs." + inputRoot.toUri().getScheme() + ".impl.disable.cache", true);
{code}

That is: the entry point process does disable fs caching in both src and dest. Now, it is cleaning them up in [Line 1074|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#L1074], but it's got me worried. It'd be better if they used FileSystem.newInstance(URI, config), so there'd be no altering of configs.

Even so, I don't see from looking at the code where the other 52998 are coming from.
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452661#comment-16452661 ]

Sean Mackrory commented on HADOOP-15392:

{quote}I found contain only commented lines{quote}

Okay - thanks for checking. I believe the default is all commented out, except *.period=10, but I wasn't seeing the MutableQuantiles thread or the metrics system thread start up with only that either.

The 53,000 instances is still weird. Is fs.s3a.impl.disable.cache set to true? It's not the default, so if you're not setting it, this shouldn't be happening. Within a JVM, communication with 1 S3 bucket should usually be done with a single, cached instance of S3AFileSystem, which should only yield a single S3AInstrumentation instance, and thus the memory usage should be 1/53,000th of what you're seeing.
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452659#comment-16452659 ]

Steve Loughran commented on HADOOP-15392:

OK, it's in the launcher? And there are 53K instances of S3A FS there? This sounds like a new instance is being created every time rather than the cache being involved. There's more at stake than just metrics: each S3A instance creates a thread pool, which is very expensive if the runtime starts allocating memory for each thread's stack, plus a pool of HTTP connections, which, if kept open, will use up a lot of TCP connections. What does netstat -a say on this machine?

Alternatively, if it's just the S3AInstrumentation instances which are hanging around, maybe there's some loop of ref counting going on? I think we ought to look at the HBase problem independently of the memory management of the metrics.

Voyta: thanks for your research here, it really helps us understand what's up. One thing, do make sure that fs.s3a.impl.disable.cache is either unset or false, as setting it to true would create lots of S3A instances.

> S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
>
> Key: HADOOP-15392
> URL: https://issues.apache.org/jira/browse/HADOOP-15392
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.1.0
> Reporter: Voyta
> Priority: Blocker
>
> While using HBase S3A Export Snapshot utility we started to experience memory
> leaks of the process after version upgrade.
> By running code analysis we traced the cause to revision
> 6555af81a26b0b72ec3bee7034e01f5bd84b1564 that added the following static
> reference (singleton):
> private static MetricsSystem metricsSystem = null;
> When application uses S3AFileSystem instance that is not closed immediately
> metrics are accumulated in this instance and memory grows without any limit.
>
> Expectation:
> * It would be nice to have an option to disable metrics completely as this
> is not needed for Export Snapshot utility.
> * Usage of S3AFileSystem should not contain any static object that can grow
> indefinitely.
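To illustrate the caching point raised in the comment above, here is a minimal, stdlib-only Java sketch. It is a toy model and not Hadoop code: FsCacheModel, FakeFs and instancesCreated are hypothetical names invented for this example, and the real FileSystem cache is more involved than a map keyed on a URI string. It only shows why 53K lookups of the same location yield one instance when a cache is used and 53K instances when it is bypassed.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; none of these names are Hadoop APIs. */
final class FsCacheModel {

    /** Stand-in for an expensive client (thread pool, HTTP pool, metrics). */
    static final class FakeFs {
        final String uri;

        FakeFs(String uri) {
            this.uri = uri;
        }
    }

    private final Map<String, FakeFs> cache = new HashMap<>();
    private final boolean cacheDisabled;
    private int created; // how many FakeFs objects this model has built

    FsCacheModel(boolean cacheDisabled) {
        this.cacheDisabled = cacheDisabled;
    }

    /** Returns a cached instance per URI, or a fresh one if caching is off. */
    FakeFs get(String uri) {
        if (cacheDisabled) {
            return create(uri);
        }
        return cache.computeIfAbsent(uri, this::create);
    }

    private FakeFs create(String uri) {
        created++;
        return new FakeFs(uri);
    }

    /** Performs the given number of lookups of one URI and counts creations. */
    static int instancesCreated(boolean cacheDisabled, int lookups) {
        FsCacheModel model = new FsCacheModel(cacheDisabled);
        for (int i = 0; i < lookups; i++) {
            model.get("s3a://example-bucket/hbase"); // hypothetical location
        }
        return model.created;
    }

    public static void main(String[] args) {
        System.out.println("cache enabled:  " + instancesCreated(false, 53_000));
        System.out.println("cache disabled: " + instancesCreated(true, 53_000));
    }
}
```

In this model the cached path builds a single object no matter how many lookups are made, while the uncached path builds one per lookup; if each of those objects owned a thread pool and a connection pool, the cost would scale the same way.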
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452342#comment-16452342 ]

Voyta commented on HADOOP-15392:

[~mackrorysd] I was trying to locate hadoop-metrics2.properties, but all the files I found contain only commented-out lines, so I assume there is no metrics configuration.

bq. ExportSnapshot is essentially a MapReduce job

If I observed it correctly, it is a standalone Java process that launches multiple MapReduce jobs. The problem is in the standalone process.
[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
[ https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452326#comment-16452326 ]

Sean Mackrory commented on HADOOP-15392:

{quote}MapReduce job, but by hbase ExportSnapshot utility{quote}
{quote}It might, however, be related to https://issues.apache.org/jira/browse/HBASE-20433{quote}
Yeah, that's what I meant: ExportSnapshot is essentially a MapReduce job. I do see it closing the filesystem instances towards the end of doWork().

{quote}Yes, we need to fix this{quote}
Well, let's make sure we're fixing the right problem first. 53,000 S3AInstrumentation instances means S3AFileSystem.initialize is getting called once for every single file. That's also a lot of overhead that doesn't seem right to me. Has filesystem caching been disabled for some reason? And can you clarify what's configured in hadoop-metrics2.properties?

I was testing with a much lower number of large files, but the threads I saw growing unbounded only show up if you explicitly configure sinks for the s3a-file-system metrics. I'll try with a large number of files and verify whether this accumulation is happening in threads that do exist without explicitly enabling them.
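The failure mode described in the issue, where per-instance metrics accumulate behind a static reference, can be sketched with a small stdlib-only Java model. This is an assumption-laden illustration and not Hadoop's implementation: StaticRegistryLeakModel, Instrumentation and retainedAfter are hypothetical names, and whether the real code unregisters on close is exactly what this thread is trying to establish. The sketch only shows that anything placed in a static map stays reachable until something removes it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only; class and method names are hypothetical, not Hadoop APIs. */
final class StaticRegistryLeakModel {

    /** Stands in for a process-wide (static) metrics system. */
    private static final Map<String, Object> REGISTRY = new ConcurrentHashMap<>();

    /** Stands in for a per-filesystem instrumentation object. */
    static final class Instrumentation implements AutoCloseable {
        private final String sourceName;
        private final boolean unregisterOnClose;

        Instrumentation(int id, boolean unregisterOnClose) {
            this.sourceName = "fake-metrics-" + id;
            this.unregisterOnClose = unregisterOnClose;
            // The static map now keeps this object reachable.
            REGISTRY.put(sourceName, this);
        }

        @Override
        public void close() {
            if (unregisterOnClose) {
                REGISTRY.remove(sourceName);
            }
        }
    }

    /** Creates and closes the given number of instances, then reports what is still registered. */
    static int retainedAfter(int instances, boolean unregisterOnClose) {
        REGISTRY.clear(); // start each measurement from an empty registry
        for (int i = 0; i < instances; i++) {
            try (Instrumentation ignored = new Instrumentation(i, unregisterOnClose)) {
                // A real caller would use the filesystem here.
            }
        }
        return REGISTRY.size();
    }

    public static void main(String[] args) {
        System.out.println("retained without unregister: " + retainedAfter(53_000, false));
        System.out.println("retained with unregister:    " + retainedAfter(53_000, true));
    }
}
```

In this model, closing every instance is not enough on its own: the registry only shrinks when close() also removes the entry, which is why the number of instances created and the cleanup done on close are worth investigating separately.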