[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

Colin Patrick McCabe (JIRA) Wed, 24 Jun 2015 11:11:15 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599861#comment-14599861
 ]


Colin Patrick McCabe commented on HADOOP-12107:
-----------------------------------------------

bq. 003 patch is good. I just don't understand why 001 patch creates one 
additional thread per FileSystem object? I look at 
FileSystem#getStaticstics(..). I think it creates one Staticstics object per 
FileSystem class, so 001 patch creates one additional thread per FileSystem 
class? I'll be grateful if somebody can guide me through this.

D'oh.  You're absolutely right... there is only one thread per FileSystem 
class.  Forget what I said earlier.

bq. One other point of discussion: I'm removing the StatisticsData constructor 
that takes a Thread as the argument (along with the owner member variable) as 
it no longer needs the thread inside StatisticsData. But since StatisticsData 
is technically a public class, it could be considered an API incompatible 
change. Thoughts on this? No one outside FileSystem is using the constructor or 
the member variable, and I cannot think of why anyone would. But if we need to 
absolutely maintain the API compatibility, I cannot remove them. Let me know 
what you think.

StatisticsData is a public class, but its constructor is not public.  So there 
is no possible API breakage here.  (As a side note, even if StatisticsData were 
public, it doesn't have an interface annotation specifying that it is safe to 
use outside of Hadoop, so we're doubly safe here.)

{code}
2921              /**
2922           * This constructor is deprecated and is no longer used. One 
should remove
2923           * any use of this constructor.
2924           */
2925          @Deprecated
2926          StatisticsData(WeakReference<Thread> owner) {
{code}

Like I explained above, we don't need this.  Let's get rid of it.

StatisticsDataReference: let's put an {{\@Override}} annotation on the 
functions which implement {{PhantomReference}} APIs.

{code}
142         final int maxSeconds = 10;
143         for (int i = 0; i < maxSeconds; i++) {
144           Thread.sleep(1000L);
145           allDataSize = stats.getAllThreadLocalDataSize();
146           if (allDataSize == 0) {
147             LOG.info("cleaned up after " + (i+1) + " seconds");
148             break;
149           } else {
150             LOG.info("not cleaned up after " + (i+1) + " seconds; 
waiting...");
151           }
152         }
{code}

Let's use {{GenericTestUtils#waitFor}} here.

+1 once those are addressed.  Thanks, [~sjlee0].

> long running apps may have a huge number of StatisticsData instances under 
> FileSystem
> -------------------------------------------------------------------------------------
>
>                 Key: HADOOP-12107
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12107
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.7.0
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Minor
>         Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
> HADOOP-12107.003.patch, HADOOP-12107.004.patch
>
>
> We observed with some of our apps (non-mapreduce apps that use filesystems) 
> that they end up accumulating a huge memory footprint coming from 
> {{FileSystem$Statistics$StatisticsData}} (in the {{allData}} list of 
> {{Statistics}}).
> Although the thread reference from {{StatisticsData}} is a weak reference, 
> and thus can get cleared once a thread goes away, the actual 
> {{StatisticsData}} instances in the list won't get cleared until any of these 
> following methods is called on {{Statistics}}:
> - {{getBytesRead()}}
> - {{getBytesWritten()}}
> - {{getReadOps()}}
> - {{getLargeReadOps()}}
> - {{getWriteOps()}}
> - {{toString()}}
> It is quite possible to have an application that interacts with a filesystem 
> but does not call any of these methods on the {{Statistics}}. If such an 
> application runs for a long time and has a large amount of thread churn, the 
> memory footprint will grow significantly.
> The current workaround is either to limit the thread churn or to invoke these 
> operations occasionally to pare down the memory. However, this is still a 
> deficiency with {{FileSystem$Statistics}} itself in that the memory is 
> controlled only as a side effect of those operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

Reply via email to