[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2016-03-23 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209500#comment-15209500
 ] 

Jason Lowe commented on HADOOP-12107:
-

This appears to be triggering OOM errors in Tez during task transitions in 
container reuse for some scenarios; see HADOOP-12958 for details.  We had at 
least one job that would fail with an OOM every time during container reuse, 
and it passed once this change was reverted.

> long running apps may have a huge number of StatisticsData instances under 
> FileSystem
> -
>
> Key: HADOOP-12107
> URL: https://issues.apache.org/jira/browse/HADOOP-12107
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 2.7.0
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Priority: Critical
> Fix For: 2.8.0, 2.7.3, 2.6.4
>
> Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
> HADOOP-12107.003.patch, HADOOP-12107.004.patch, HADOOP-12107.005.patch
>
>
> We observed with some of our apps (non-mapreduce apps that use filesystems) 
> that they end up accumulating a huge memory footprint coming from 
> {{FileSystem$Statistics$StatisticsData}} (in the {{allData}} list of 
> {{Statistics}}).
> Although the thread reference from {{StatisticsData}} is a weak reference, 
> and thus can get cleared once a thread goes away, the actual 
> {{StatisticsData}} instances in the list won't get cleared until one of the 
> following methods is called on {{Statistics}}:
> - {{getBytesRead()}}
> - {{getBytesWritten()}}
> - {{getReadOps()}}
> - {{getLargeReadOps()}}
> - {{getWriteOps()}}
> - {{toString()}}
> It is quite possible to have an application that interacts with a filesystem 
> but does not call any of these methods on the {{Statistics}}. If such an 
> application runs for a long time and has a large amount of thread churn, the 
> memory footprint will grow significantly.
> The current workaround is either to limit the thread churn or to invoke these 
> operations occasionally to pare down the memory. However, this is still a 
> deficiency with {{FileSystem$Statistics}} itself in that the memory is 
> controlled only as a side effect of those operations.
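
The behavior described in the issue can be illustrated with a minimal sketch. This is a simplified, hypothetical model and not the Hadoop source: the class {{StatsSketch}}, its {{Data}} entries, and {{trackedEntries()}} are names invented for illustration, and {{WeakReference.clear()}} stands in for a thread actually being garbage collected. It shows entries piling up in the list until a getter happens to prune them.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified, hypothetical model of the pattern described in the issue.
// It is NOT the Hadoop source; names only loosely mirror FileSystem.Statistics.
class StatsSketch {
    /** Per-thread counters holding only a weak reference to the owning thread. */
    static final class Data {
        final WeakReference<Thread> owner;
        long bytesRead;

        Data(Thread owner) {
            this.owner = new WeakReference<>(owner);
        }
    }

    private long rootBytesRead;                 // totals folded in from pruned entries
    private final List<Data> allData = new ArrayList<>();

    /** Registers counters for a thread; the entry stays listed until a getter prunes it. */
    synchronized Data register(Thread t) {
        Data d = new Data(t);
        allData.add(d);
        return d;
    }

    synchronized int trackedEntries() {
        return allData.size();
    }

    /** Aggregates all counters and, as a side effect, drops entries whose owner is gone. */
    synchronized long getBytesRead() {
        long total = rootBytesRead;
        for (Iterator<Data> it = allData.iterator(); it.hasNext(); ) {
            Data d = it.next();
            total += d.bytesRead;
            if (d.owner.get() == null) {
                rootBytesRead += d.bytesRead;   // keep the totals
                it.remove();                    // release the entry
            }
        }
        return total;
    }

    public static void main(String[] args) {
        StatsSketch stats = new StatsSketch();
        for (int i = 0; i < 1000; i++) {
            Data d = stats.register(new Thread());
            d.bytesRead = 10;
            d.owner.clear();   // assumption: simulates the thread having been collected
        }
        System.out.println("entries before getter: " + stats.trackedEntries());
        System.out.println("bytes read: " + stats.getBytesRead());
        System.out.println("entries after getter: " + stats.trackedEntries());
    }
}
```

In this model the list holds 1000 entries until {{getBytesRead()}} is called, after which it holds none while the total of 10000 bytes is preserved. This mirrors the workaround of calling a getter occasionally, and shows why an application that never calls one keeps every entry.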



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2016-01-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15099221#comment-15099221
 ] 

Hudson commented on HADOOP-12107:
-

FAILURE: Integrated in Hadoop-trunk-Commit #9116 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9116/])
Update CHANGES.txt for commit of HADOOP-12107 to branch-2.7 and (jlowe: rev 
651c23e8ef8aeafd999249ce57b31e689bd2ece6)
* hadoop-common-project/hadoop-common/CHANGES.txt



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2016-01-13 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096632#comment-15096632
 ] 

Jason Lowe commented on HADOOP-12107:
-

I tried pulling this into branch-2.7.  It comes in cleanly except for 
CHANGES.txt, but the unit test failed.  Subsequent runs showed it was flaky, 
sometimes passing and sometimes not.  Looking around I noticed a number of 
precommit builds have been complaining about this test.  I can reproduce the 
flaky test on trunk, so I filed HADOOP-12706.  I'd really like this fix 
backported, but I'd also rather not add another flaky test to 2.7 and 2.6.

[~sjlee0] or [~mingma], could you take a look at the unit test?


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2016-01-08 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089972#comment-15089972
 ] 

Jason Lowe commented on HADOOP-12107:
-

I recently ran across this on a NodeManager running 2.6 that had been up for a 
while.  Any objections to this being picked back to 2.6 and 2.7?


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2016-01-08 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090150#comment-15090150
 ] 

Sangjin Lee commented on HADOOP-12107:
--

+1


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-07-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610306#comment-14610306
 ] 

Hudson commented on HADOOP-12107:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #2172 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2172/])
HADOOP-12107. long running apps may have a huge number of StatisticsData 
instances under FileSystem (Sangjin Lee via Ming Ma) (mingma: rev 
8e1bdc17d9134e01115ae7c929503d8ac0325207)
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FCStatisticsBaseTest.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-07-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610348#comment-14610348
 ] 

Hudson commented on HADOOP-12107:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #233 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/233/])
HADOOP-12107. long running apps may have a huge number of StatisticsData 
instances under FileSystem (Sangjin Lee via Ming Ma) (mingma: rev 
8e1bdc17d9134e01115ae7c929503d8ac0325207)
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FCStatisticsBaseTest.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608516#comment-14608516
 ] 

Hudson commented on HADOOP-12107:
-

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2190 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2190/])
HADOOP-12107. long running apps may have a huge number of StatisticsData 
instances under FileSystem (Sangjin Lee via Ming Ma) (mingma: rev 
8e1bdc17d9134e01115ae7c929503d8ac0325207)
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FCStatisticsBaseTest.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
* hadoop-common-project/hadoop-common/CHANGES.txt



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608460#comment-14608460
 ] 

Hudson commented on HADOOP-12107:
-

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #242 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/242/])
HADOOP-12107. long running apps may have a huge number of StatisticsData 
instances under FileSystem (Sangjin Lee via Ming Ma) (mingma: rev 
8e1bdc17d9134e01115ae7c929503d8ac0325207)
* hadoop-common-project/hadoop-common/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FCStatisticsBaseTest.java



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-30 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609256#comment-14609256
 ] 

Colin Patrick McCabe commented on HADOOP-12107:
---

Thanks, guys.


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-29 Thread Gera Shegalov (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606431#comment-14606431
 ] 

Gera Shegalov commented on HADOOP-12107:


bq. You could make the same argument to stop development on almost any patch.

I disagree with such a strong statement. It's not the case in my experience.  
Thanks for pointing out the compatibility document. It gives us a formal basis 
to go on, and not delay [~sjlee0]'s important fix. Maybe one day we'll have a 
compatibility test suite based on that doc.

bq. It's simply unreasonable to try to support users who are putting their code 
inside the org.apache.hadoop.fs 

We develop new Hadoop features, and often they do not make it upstream 
immediately. It happens that we have classes in their intended packages, but we 
can deal with this. We are not affected by this particular change, either.

+1 for both trunk and branch-2. [~mingma], do you want to exercise your 
committer rights :) ?


 






[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-29 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606271#comment-14606271
 ] 

Colin Patrick McCabe commented on HADOOP-12107:
---

Guys, we clearly define the API contract for the project.  See 
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html

You have to remember that:
1. The function that you are talking about changing (the constructor) is not 
public from Java's point of view.  It is package-private.
2. The function that you are talking about changing is not public from Hadoop's 
point of view (there is no @Public or @LimitedPrivate annotation on it).

There is simply no reason to treat this as public.

bq. However, at the expense of being too defensive, the only test I apply here: 
is there hypothetically a scenario where an API user can be broken? My answer 
is yes if you have some org.apache.hadoop.fs.Foo calling the constructor even 
though the user absolutely should not do it. 

You could make the same argument to stop development on almost any patch.  
Almost every patch changes things which are private or package-private inside 
Hadoop.  It's simply unreasonable to try to support users who are putting their 
code inside the org.apache.hadoop.fs namespace (or any other internal project 
namespace).
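
To make the Java-level distinction concrete, here is a minimal sketch using reflection. The classes {{Internal}} and {{Exposed}} are hypothetical stand-ins, not Hadoop classes, and the helper {{visibility()}} is invented for illustration. A constructor declared without a modifier is package-private and can only be called from code in the same package.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

// Hypothetical illustration; none of these classes exist in Hadoop.
class VisibilitySketch {
    /** Stand-in for a class whose constructor has no access modifier. */
    static class Internal {
        Internal(String name) { }          // package-private constructor
    }

    /** Stand-in for a class whose constructor is part of the public surface. */
    static class Exposed {
        public Exposed(String name) { }    // public constructor
    }

    /** Names the Java access level of a constructor. */
    static String visibility(Constructor<?> c) {
        int m = c.getModifiers();
        if (Modifier.isPublic(m)) return "public";
        if (Modifier.isProtected(m)) return "protected";
        if (Modifier.isPrivate(m)) return "private";
        return "package-private";
    }

    public static void main(String[] args) {
        // Each class declares exactly one constructor, so index 0 is that constructor.
        System.out.println(visibility(Internal.class.getDeclaredConstructors()[0]));
        System.out.println(visibility(Exposed.class.getDeclaredConstructors()[0]));
    }
}
```

Code outside the declaring package cannot compile against the {{Internal}} constructor, which is why changing such a constructor only affects callers that place their own classes in that package.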


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-29 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606495#comment-14606495
 ] 

Hudson commented on HADOOP-12107:
-

FAILURE: Integrated in Hadoop-trunk-Commit #8089 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8089/])
HADOOP-12107. long running apps may have a huge number of StatisticsData 
instances under FileSystem (Sangjin Lee via Ming Ma) (mingma: rev 
8e1bdc17d9134e01115ae7c929503d8ac0325207)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FCStatisticsBaseTest.java
* hadoop-common-project/hadoop-common/CHANGES.txt



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-29 Thread Ming Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606514#comment-14606514
 ] 

Ming Ma commented on HADOOP-12107:
--

Thanks also to [~walter.k.su] and [~sandyr] for the review and suggestions.


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-29 Thread Ming Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606509#comment-14606509
 ] 

Ming Ma commented on HADOOP-12107:
--

I have committed this to trunk and branch-2. Thanks [~sjlee0] for the 
contribution and [~jira.shegalov] and [~cmccabe] for the code review!


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-29 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606552#comment-14606552
 ] 

Sangjin Lee commented on HADOOP-12107:
--

Thanks [~mingma] for the commit! Many thanks to [~jira.shegalov] and [~cmccabe] 
for the invaluable review.

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Critical
 Fix For: 2.8.0

 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch, HADOOP-12107.004.patch, HADOOP-12107.005.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-26 Thread Sandy Ryza (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603218#comment-14603218
 ] 

Sandy Ryza commented on HADOOP-12107:
-

I don't know of anyone using the StatisticsData constructor.  I originally made 
the class public for Spark's needs and it shouldn't ever need that constructor.

This is neither an endorsement nor a -1 against removing the constructor.


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-26 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603211#comment-14603211
 ] 

Sangjin Lee commented on HADOOP-12107:
--

[~cmccabe], your thoughts on this?

{{StatisticsData}} was made public in 2.5 (HADOOP-10688), and I doubt there 
would be an actual case of using the constructor. IMO, removing the constructor 
(and the member) seems pretty safe, in theory and practice. [~sandyr], what do 
you think?


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-26 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603216#comment-14603216
 ] 

Sangjin Lee commented on HADOOP-12107:
--

FWIW, when I look at the spark code (which may have been the main reason for 
changes in HADOOP-10688), it doesn't look like it's using the constructor or 
the owner member variable: 
https://github.com/apache/spark/blob/3c0156899dc1ec1f7dfe6d7c8af47fa6dc7d00bf/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-26 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603433#comment-14603433
 ] 

Colin Patrick McCabe commented on HADOOP-12107:
---

OK, at the risk of being pedantic, here is my rundown.  While the 
{{StatisticsData}} class itself is public, the {{StatisticsData}} constructor 
is not.  It is package-private (the access level a member gets in Java when it 
carries no public, private, or protected keyword).  This means that a 
{{StatisticsData}} object can only be created by code in the 
{{org.apache.hadoop.fs}} package.  You can try this for yourself: write a 
program external to Hadoop that tries to create a {{StatisticsData}} object via 
this constructor.  It will not compile.  This constructor is safe to remove, so 
let's do that.

bq. Colin Patrick McCabe, good point on the one hand but on the other hand this 
constructor is package-scope, and technically usable if one creates a class with 
the same package name, regardless of how unlikely or illegal (in terms of 
specified audience) it is. How about we defensively keep that constructor for 
branch-2 at least?

No.  Users simply can't add code to the {{org.apache.hadoop.fs}} package.  If 
they do, things are not going to work-- there are going to be naming conflicts, 
class resolution issues, etc. etc.  There is no possible way we can support 
users doing this and no reason to support it.  If we tried, we would have to 
essentially freeze the API of every single class in Hadoop-- we would have to 
re-have this discussion each time we changed some package-private variable or 
function.  Private and package-private stuff is private-- it's even enforced by 
the compiler, you can't get much more private than that.
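
The access-level point can be checked with a few lines of reflection. The following is a minimal sketch under stated assumptions: {{AccessDemo}} and its nested {{Data}} class are hypothetical stand-ins (not Hadoop classes) that mimic a public class whose only constructor carries no access keyword.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

/** Hypothetical stand-in: a public nested class with a package-private constructor. */
final class AccessDemo {
  public static class Data {
    // No public/protected/private keyword: package-private access.
    Data(Thread owner) {
    }
  }

  /** Returns true if the constructor has none of the three access keywords. */
  static boolean isPackagePrivate(Constructor<?> c) {
    int m = c.getModifiers();
    return !Modifier.isPublic(m) && !Modifier.isProtected(m) && !Modifier.isPrivate(m);
  }

  /** Looks up the Data(Thread) constructor, wrapping the checked exception. */
  static Constructor<?> dataCtor() {
    try {
      return Data.class.getDeclaredConstructor(Thread.class);
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("class is public: "
        + Modifier.isPublic(Data.class.getModifiers()));
    System.out.println("constructor is package-private: "
        + isPackagePrivate(dataCtor()));
    // getConstructors() lists public constructors only.
    System.out.println("public constructors: "
        + Data.class.getConstructors().length);
  }
}
```

A class in any other package that wrote {{new AccessDemo.Data(t)}} would be rejected by the compiler, which is the behavior the comment above relies on.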


[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-26 Thread Gera Shegalov (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603982#comment-14603982
 ] 

Gera Shegalov commented on HADOOP-12107:


I agree with the sentiment about what the API isolation contract should be. 

However, at the risk of being too defensive, the only test I apply here is: is 
there hypothetically a scenario where an API user can be broken? My answer is 
yes, if you have some {{org.apache.hadoop.fs.Foo}} calling the constructor, even 
though the user absolutely should not do it.  Regarding freezing the API, given 
that the question was only about branch-2, it does not sound negative to me. 



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-25 Thread Gera Shegalov (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602230#comment-14602230
 ] 

Gera Shegalov commented on HADOOP-12107:


bq. StatisticsData is a public class, but its constructor is not public. 

[~cmccabe], good point on the one hand but on the other hand this constructor 
is package-scope, and technically usable if one creates a class with the same 
package name, regardless of how unlikely or illegal (in terms of specified 
audience) it is. How about we defensively keep that constructor for branch-2 at 
least?


 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch, HADOOP-12107.004.patch, HADOOP-12107.005.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-24 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600376#comment-14600376
 ] 

Sangjin Lee commented on HADOOP-12107:
--

I believe the test failure is unrelated. Also, the said test passes fine 
locally.

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch, HADOOP-12107.004.patch, HADOOP-12107.005.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600361#comment-14600361
 ] 

Hadoop QA commented on HADOOP-12107:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 19s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m  6s | The applied patch generated  1 new checkstyle issues (total was 142, now 140). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 50s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  21m 33s | Tests failed in hadoop-common. |
| | |  60m 29s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.security.ssl.TestReloadingX509TrustManager |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12741705/HADOOP-12107.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / afe9ea3 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/7033/artifact/patchprocess/diffcheckstylehadoop-common.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7033/artifact/patchprocess/testrun_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7033/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7033/console |


This message was automatically generated.

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch, HADOOP-12107.004.patch, HADOOP-12107.005.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-24 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599861#comment-14599861
 ] 

Colin Patrick McCabe commented on HADOOP-12107:
---

bq. 003 patch is good. I just don't understand why 001 patch creates one 
additional thread per FileSystem object? I look at 
FileSystem#getStatistics(..). I think it creates one Statistics object per 
FileSystem class, so 001 patch creates one additional thread per FileSystem 
class? I'll be grateful if somebody can guide me through this.

D'oh.  You're absolutely right... there is only one thread per FileSystem 
class.  Forget what I said earlier.

bq. One other point of discussion: I'm removing the StatisticsData constructor 
that takes a Thread as the argument (along with the owner member variable) as 
it no longer needs the thread inside StatisticsData. But since StatisticsData 
is technically a public class, it could be considered an API incompatible 
change. Thoughts on this? No one outside FileSystem is using the constructor or 
the member variable, and I cannot think of why anyone would. But if we need to 
absolutely maintain the API compatibility, I cannot remove them. Let me know 
what you think.

StatisticsData is a public class, but its constructor is not public.  So there 
is no possible API breakage here.  (As a side note, even if the constructor were 
public, the class doesn't have an interface annotation specifying that it is 
safe to use outside of Hadoop, so we're doubly safe here.)

{code}
  /**
   * This constructor is deprecated and is no longer used. One should remove
   * any use of this constructor.
   */
  @Deprecated
  StatisticsData(WeakReference<Thread> owner) {
{code}

Like I explained above, we don't need this.  Let's get rid of it.

StatisticsDataReference: let's put an {{@Override}} annotation on the 
functions which implement {{PhantomReference}} APIs.

{code}
final int maxSeconds = 10;
for (int i = 0; i < maxSeconds; i++) {
  Thread.sleep(1000L);
  allDataSize = stats.getAllThreadLocalDataSize();
  if (allDataSize == 0) {
    LOG.info("cleaned up after " + (i+1) + " seconds");
    break;
  } else {
    LOG.info("not cleaned up after " + (i+1) + " seconds; waiting...");
  }
}
{code}

Let's use {{GenericTestUtils#waitFor}} here.
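
A minimal sketch of the polling pattern this suggestion points at. The {{waitFor}} helper below is a hypothetical, self-contained stand-in written for illustration; it is not Hadoop's {{GenericTestUtils#waitFor}}, whose exact signature should be checked against the Hadoop source.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a waitFor-style test helper (not the Hadoop class). */
final class PollingSketch {
  /**
   * Polls check every checkEveryMillis until it returns true, and throws
   * TimeoutException once waitForMillis has elapsed without success.
   */
  static void waitFor(BooleanSupplier check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    final long deadline = System.nanoTime() + waitForMillis * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() >= deadline) {
        throw new TimeoutException(
            "condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }
}
```

With a helper of this shape, the hand-written loop above would likely collapse to a single call whose condition is {{stats.getAllThreadLocalDataSize() == 0}}, and a failure to clean up would surface as a timeout rather than a silent fall-through.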

+1 once those are addressed.  Thanks, [~sjlee0].

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch, HADOOP-12107.004.patch


 We observed with some of our apps (non-mapreduce apps that use filesystems) 
 that they end up accumulating a huge memory footprint coming from 
 {{FileSystem$Statistics$StatisticsData}} (in the {{allData}} list of 
 {{Statistics}}).
 Although the thread reference from {{StatisticsData}} is a weak reference, 
 and thus can get cleared once a thread goes away, the actual 
 {{StatisticsData}} instances in the list won't get cleared until any of these 
 following methods is called on {{Statistics}}:
 - {{getBytesRead()}}
 - {{getBytesWritten()}}
 - {{getReadOps()}}
 - {{getLargeReadOps()}}
 - {{getWriteOps()}}
 - {{toString()}}
 It is quite possible to have an application that interacts with a filesystem 
 but does not call any of these methods on the {{Statistics}}. If such an 
 application runs for a long time and has a large amount of thread churn, the 
 memory footprint will grow significantly.
 The current workaround is either to limit the thread churn or to invoke these 
 operations occasionally to pare down the memory. However, this is still a 
 deficiency with {{FileSystem$Statistics}} itself in that the memory is 
 controlled only as a side effect of those operations.
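
 The behaviour described above can be modelled with a small sketch. This is a simplified, hypothetical stand-in (the class {{LazyStatsSketch}} and its members are invented for illustration and are not the Hadoop implementation); it only shows how entries for dead threads stay in the list until a read method happens to prune them.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Simplified, hypothetical model of the pre-patch behaviour; not the Hadoop class. */
final class LazyStatsSketch {
  static final class Data {
    final WeakReference<Thread> owner;
    long bytesRead;
    Data(Thread t) { owner = new WeakReference<>(t); }
  }

  private final List<Data> allData = new ArrayList<>();
  private long rootBytesRead; // totals folded in from entries of dead threads

  synchronized Data register(Thread t) {
    Data d = new Data(t);
    allData.add(d);
    return d;
  }

  /** Entries of dead threads are pruned only here, as a side effect of reading. */
  synchronized long getBytesRead() {
    long total = rootBytesRead;
    for (Iterator<Data> it = allData.iterator(); it.hasNext();) {
      Data d = it.next();
      total += d.bytesRead;
      if (d.owner.get() == null) { // owner thread is gone
        rootBytesRead += d.bytesRead;
        it.remove();
      }
    }
    return total;
  }

  synchronized int size() { return allData.size(); }
}
```

 In this model, an application that registers many short-lived threads and never calls {{getBytesRead()}} keeps every {{Data}} entry alive, which is the growth pattern the report describes.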




[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-23 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597851#comment-14597851
 ] 

Sangjin Lee commented on HADOOP-12107:
--

Thanks for pointing that out, [~walter.k.su]. I think you're right that 
normally one {{Statistics}} instance is created per {{FileSystem}} class. That 
said, it is possible to create {{Statistics}} instances in other ways, simply 
by invoking the public constructor.

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-23 Thread Ming Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598388#comment-14598388
 ] 

Ming Ma commented on HADOOP-12107:
--

Thanks [~sjlee0]. Latest patch LGTM. [~cmccabe], [~jira.shegalov] any 
additional comments?

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-23 Thread Gera Shegalov (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598397#comment-14598397
 ] 

Gera Shegalov commented on HADOOP-12107:


Thanks for v3 [~sjlee0]! 

*FileSystem.java:*

*{{getThreadStatistics}}:*
Minimize the code executed under the monitor. Pull reference creation out of 
the {{synchronized}} block, as it was before. Note that currentThread is a 
native call.
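
A minimal sketch of this suggestion, under the assumption that the per-thread data is tracked through a phantom reference held in a shared set. All names here ({{ThreadStatsSketch}}, {{DataRef}}, {{Data}}) are hypothetical and are not taken from the patch; the point is only that allocation and {{Thread.currentThread()}} happen before the lock is taken.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; names are illustrative, not the patch's actual code. */
final class ThreadStatsSketch {
  static final class Data { long bytesRead; }

  /** Phantom reference to the owning thread that also keeps its Data reachable. */
  static final class DataRef extends PhantomReference<Thread> {
    final Data data;
    DataRef(Data data, Thread owner, ReferenceQueue<Thread> queue) {
      super(owner, queue);
      this.data = data;
    }
  }

  private final ReferenceQueue<Thread> queue = new ReferenceQueue<>();
  private final Set<DataRef> allData = new HashSet<>();
  private final ThreadLocal<Data> threadData = new ThreadLocal<>();

  Data getThreadStatistics() {
    Data data = threadData.get();
    if (data == null) {
      // Allocation and Thread.currentThread() happen before taking the lock.
      data = new Data();
      threadData.set(data);
      DataRef ref = new DataRef(data, Thread.currentThread(), queue);
      synchronized (this) {
        allData.add(ref); // only the shared-set mutation is guarded
      }
    }
    return data;
  }

  synchronized int size() { return allData.size(); }
}
```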


*{{Cleaner#run}}*
Catch and log InterruptedException in the while loop, so that the thread does 
not die on a spurious wakeup. It's safe since it's a daemon thread.
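
A minimal sketch of such a loop, assuming the cleaner blocks on a {{ReferenceQueue}}. The class name, counters, and thread name are hypothetical; the counter increment stands in for whatever removal the real cleaner performs.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a cleaner loop that survives interrupts. */
final class CleanerLoopSketch implements Runnable {
  final ReferenceQueue<Object> queue = new ReferenceQueue<>();
  final AtomicInteger cleaned = new AtomicInteger();
  final AtomicInteger interrupts = new AtomicInteger();

  @Override
  public void run() {
    while (true) {
      try {
        Reference<?> ref = queue.remove(); // blocks until a reference is enqueued
        ref.clear();
        cleaned.incrementAndGet();         // stand-in for removing the stats entry
      } catch (InterruptedException e) {
        // Log and keep going instead of letting the thread exit.
        interrupts.incrementAndGet();
        System.err.println("cleaner interrupted; continuing: " + e);
      }
    }
  }

  Thread start() {
    Thread t = new Thread(this, "stats-data-cleaner-sketch");
    t.setDaemon(true); // a daemon thread never blocks JVM shutdown
    t.start();
    return t;
  }
}
```

Because the thread is a daemon, swallowing the interrupt does not keep the JVM alive; the trade-off is that the loop cannot be stopped by interrupting it.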

Nits:
Can we be more specific in the naming, along the lines of STATS_DATA_CLEANER, 
STATS_DATA_REFQUEUE, StatsDataCleaner?

*{{testStatisticsThreadLocalDataCleanUp}}*
Since the test uses waits, pass some reasonable timeout to {{@Test}}.

Make 'int size' and 'int maxSeconds' final. 

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-23 Thread Walter Su (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597297#comment-14597297
 ] 

Walter Su commented on HADOOP-12107:


bq. This will create one additional thread per FileSystem object.
003 patch is good. I just don't understand why 001 patch creates one additional 
thread per FileSystem *object*?
I looked at {{FileSystem#getStatistics(..)}}. I think it creates one Statistics 
object per FileSystem *class*, so 001 patch creates one additional thread per 
FileSystem *class*? I'll be grateful if somebody can guide me through this.

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-23 Thread Gera Shegalov (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598617#comment-14598617
 ] 

Gera Shegalov commented on HADOOP-12107:


+1, 004 LGTM

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch, HADOOP-12107.004.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-23 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598570#comment-14598570
 ] 

Hadoop QA commented on HADOOP-12107:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 18s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 29s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 4s | The applied patch generated 1 new checkstyle issues (total was 142, now 141). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 51s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 21m 41s | Tests passed in hadoop-common. |
| | | 60m 38s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12741393/HADOOP-12107.004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 122cad6 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/7027/artifact/patchprocess/diffcheckstylehadoop-common.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7027/artifact/patchprocess/testrun_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7027/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7027/console |


This message was automatically generated.

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, 
 HADOOP-12107.003.patch, HADOOP-12107.004.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-22 Thread Ming Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596117#comment-14596117
 ] 

Ming Ma commented on HADOOP-12107:
--

Thanks, [~sjlee0], great find. It overall looks good to me.

* Is there a good way to write any unit test for this?
* This will create one additional thread per FileSystem object. It is fine for 
most scenarios where one FileSystem object is shared among different threads in 
the same JVM. If the application creates many FileSystem objects in the same JVM, 
this will create lots of additional threads. But it doesn't seem like something 
worth optimizing for. Just bringing it up in case.

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-22 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596543#comment-14596543
 ] 

Colin Patrick McCabe commented on HADOOP-12107:
---

Thanks for looking at this, [~sjlee0].  It looks like a good idea to fix this.

As [~mingma] commented, I think we need to have a single thread per JVM to do 
this, rather than one thread per FileSystem object.  We have a number of Hadoop 
applications which create lots of FileSystem objects, and creating a thread per 
FS object would just be too many threads.  We could end up with a thread leak 
that is worse than the memory leak described here.  (I agree that 
well-written applications should strive to avoid creating too many FileSystem 
objects, but that's a separate issue...)
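
A minimal sketch of the single-thread-per-JVM idea, assuming phantom references on a shared queue. Every name here ({{SharedCleanerSketch}}, {{Ref}}, {{QUEUE}}) is hypothetical and invented for illustration; the actual patch may be structured differently. Each reference remembers which instance it belongs to, so one daemon thread can serve all instances while locking only the instance being cleaned.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: one cleaner thread per JVM shared by all instances. */
final class SharedCleanerSketch {
  // One queue and one daemon thread for the whole JVM (class-level statics).
  static final ReferenceQueue<Thread> QUEUE = new ReferenceQueue<>();
  static {
    Thread cleaner =
        new Thread(SharedCleanerSketch::drainForever, "shared-cleaner-sketch");
    cleaner.setDaemon(true);
    cleaner.start();
  }

  /** Reference that remembers which instance it belongs to. */
  static final class Ref extends PhantomReference<Thread> {
    private final SharedCleanerSketch owner;
    Ref(SharedCleanerSketch owner, Thread thread) {
      super(thread, QUEUE);
      this.owner = owner;
    }
    void cleanUp() {
      synchronized (owner) { // per-instance lock, taken only for the removal
        owner.refs.remove(this);
      }
    }
  }

  private final Set<Ref> refs = new HashSet<>();

  synchronized Ref register(Thread thread) {
    Ref ref = new Ref(this, thread);
    refs.add(ref);
    return ref;
  }

  synchronized int size() { return refs.size(); }

  private static void drainForever() {
    while (true) {
      try {
        ((Ref) QUEUE.remove()).cleanUp();
      } catch (InterruptedException e) {
        // keep the shared thread alive
      }
    }
  }
}
```

The thread count stays at one no matter how many instances exist; the cost is that the cleaner briefly contends for each instance's lock.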

{code}
private synchronized <T> T visitAll(StatisticsAggregator<T> visitor) {
  visitor.accept(rootData);
  if (allData != null) {
{code}

The null check isn't needed any more since you changed allData to be 
initialized in the constructor.

thanks

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-22 Thread Gera Shegalov (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596951#comment-14596951
 ] 

Gera Shegalov commented on HADOOP-12107:


Between a single thread and a thread per FileSystem object, we can compromise 
on having a configurable single thread pool for the class FileSystem. 

Let us maintain fool-proof compatibility to make sure that we can safely 
backport the patch, and just deprecate the old constructor.

I don't see a test covering the scenario in this JIRA. It would be good to add 
a test to {{fs.FCStatisticsBaseTest}}



 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-22 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596806#comment-14596806
 ] 

Sangjin Lee commented on HADOOP-12107:
--

Thanks [~mingma] and [~cmccabe] for your comments!

bq. Is there a good way to write any unit test for this?
I was relying on existing unit tests on {{FileSystem.Statistics}} (there are 
several). I'm still hoping that is good enough for this, but let me know if you 
want me to look into creating more unit tests around this.

Regarding the number of threads for the cleaner: yes, it definitely occurred to 
me that this could create a large number of additional threads. The only reason 
I shied away from a global cleaner thread initially is that I wasn't entirely 
sure whether it would be completely safe and perform well in terms of locking 
(note that the cleaner needs to acquire a lock for each {{Statistics}} 
instance). Let me explore that further.

bq. The null check isn't needed any more since you changed allData to be 
initialized in the constructor.
Thanks for pointing that out. I remembered it at one point, but forgot to 
include that change in the patch.

One other point of discussion: I'm removing the {{StatisticsData}} constructor 
that takes a Thread as the argument (along with the {{owner}} member variable) 
as it no longer needs the thread inside {{StatisticsData}}. But since 
{{StatisticsData}} is technically a public class, it could be considered an API 
incompatible change. Thoughts on this? No one outside {{FileSystem}} is using 
the constructor or the member variable, and I cannot think of why anyone would. 
But if we need to absolutely maintain the API compatibility, I cannot remove 
them. Let me know what you think.

 long running apps may have a huge number of StatisticsData instances under 
 FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Minor
 Attachments: HADOOP-12107.001.patch



[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597146#comment-14597146
 ] 

Hadoop QA commented on HADOOP-12107:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 19s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 5s | The applied patch generated 1 new checkstyle issues (total was 142, now 141). |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 50s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 21m 48s | Tests passed in hadoop-common. |
| | | 60m 40s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12741209/HADOOP-12107.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 99271b7 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/7020/artifact/patchprocess/diffcheckstylehadoop-common.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7020/artifact/patchprocess/testrun_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7020/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7020/console |


This message was automatically generated.

 long running apps may have a huge number of StatisticsData instances under FileSystem
 -

 Key: HADOOP-12107
 URL: https://issues.apache.org/jira/browse/HADOOP-12107
 Project: Hadoop Common
 Issue Type: Bug
 Components: fs
 Affects Versions: 2.7.0
 Reporter: Sangjin Lee
 Assignee: Sangjin Lee
 Priority: Minor
 Attachments: HADOOP-12107.001.patch, HADOOP-12107.002.patch, HADOOP-12107.003.patch

 We observed with some of our apps (non-mapreduce apps that use filesystems) 
 that they end up accumulating a huge memory footprint coming from 
 {{FileSystem$Statistics$StatisticsData}} (in the {{allData}} list of 
 {{Statistics}}).
 Although the thread reference from {{StatisticsData}} is a weak reference, 
 and thus can get cleared once a thread goes away, the actual 
 {{StatisticsData}} instances in the list won't get cleared until any of these 
 following methods is called on {{Statistics}}:
 - {{getBytesRead()}}
 - {{getBytesWritten()}}
 - {{getReadOps()}}
 - {{getLargeReadOps()}}
 - {{getWriteOps()}}
 - {{toString()}}
 It is quite possible to have an application that interacts with a filesystem 
 but does not call any of these methods on the {{Statistics}}. If such an 
 application runs for a long time and has a large amount of thread churn, the 
 memory footprint will grow significantly.
 The current workaround is either to limit the thread churn or to invoke these 
 operations occasionally to pare down the memory. However, this is still a 
 deficiency with {{FileSystem$Statistics}} itself in that the memory is 
 controlled only as a side effect of those operations.
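
To make the accumulation concrete, here is a minimal, self-contained Java model of the behavior described above. It is only a sketch and not the Hadoop code: {{ToyStatistics}}, {{Data}}, {{register}} and {{retainedEntries}} are hypothetical names, and a thread going away is simulated by clearing the weak reference by hand so that the run does not depend on garbage collector timing.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy stand-in for FileSystem.Statistics (hypothetical, not the Hadoop class). */
final class ToyStatistics {
  /** Per-thread counters holding only a weak reference to the owner thread. */
  static final class Data {
    final WeakReference<Thread> owner;
    long bytesRead;

    Data(Thread owner) {
      this.owner = new WeakReference<>(owner);
    }
  }

  private final Data rootData = new Data(null);
  private final List<Data> allData = new ArrayList<>();

  /** Registers counters for a thread (once per thread in this model). */
  synchronized Data register(Thread t) {
    Data d = new Data(t);
    allData.add(d);
    return d;
  }

  /** Number of per-thread entries currently retained in the list. */
  synchronized int retainedEntries() {
    return allData.size();
  }

  /** Reading the total is the only operation that prunes dead entries. */
  synchronized long getBytesRead() {
    long total = rootData.bytesRead;
    for (Iterator<Data> it = allData.iterator(); it.hasNext(); ) {
      Data d = it.next();
      total += d.bytesRead;
      if (d.owner.get() == null) {
        // Owner is gone: fold its counters into rootData and drop the entry.
        rootData.bytesRead += d.bytesRead;
        it.remove();
      }
    }
    return total;
  }

  public static void main(String[] args) {
    ToyStatistics stats = new ToyStatistics();
    for (int i = 0; i < 1000; i++) {
      Data d = stats.register(new Thread());
      d.bytesRead = 10;
      d.owner.clear(); // simulate the owner thread having gone away
    }
    System.out.println("retained before read: " + stats.retainedEntries());
    System.out.println("total bytes read: " + stats.getBytesRead());
    System.out.println("retained after read: " + stats.retainedEntries());
  }
}
```

In this model all 1000 entries stay in the list until the first call to {{getBytesRead()}}, which is the "side effect" cleanup the description refers to; calling such a getter periodically corresponds to the second workaround.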



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-22 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597153#comment-14597153
 ] 

Sangjin Lee commented on HADOOP-12107:
--

The checkstyle warning:

{panel}
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java:1: File length is 3,415 lines (max allowed is 2,000).
{panel}

Nothing I can do about that. :)

Your review is greatly appreciated. Thanks!

[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597058#comment-14597058
 ] 

Hadoop QA commented on HADOOP-12107:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 18s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 32s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 5s | The applied patch generated 3 new checkstyle issues (total was 142, now 143). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 50s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 21m 56s | Tests passed in hadoop-common. |
| | | 60m 43s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12741198/HADOOP-12107.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 99271b7 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/7019/artifact/patchprocess/diffcheckstylehadoop-common.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7019/artifact/patchprocess/testrun_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7019/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7019/console |


This message was automatically generated.

[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594570#comment-14594570
 ] 

Hadoop QA commented on HADOOP-12107:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 26s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 29s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 4s | The applied patch generated 1 new checkstyle issues (total was 142, now 140). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 1m 56s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 21m 55s | Tests passed in hadoop-common. |
| | | 61m 3s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12740755/HADOOP-12107.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 20c03c9 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/6997/artifact/patchprocess/diffcheckstylehadoop-common.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-HADOOP-Build/6997/artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/6997/artifact/patchprocess/testrun_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/6997/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/6997/console |


This message was automatically generated.

[jira] [Commented] (HADOOP-12107) long running apps may have a huge number of StatisticsData instances under FileSystem

2015-06-19 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594054#comment-14594054
 ] 

Sangjin Lee commented on HADOOP-12107:
--

The key method for clearing this memory is {{Statistics.visitAll()}}:

{code}
private synchronized <T> T visitAll(StatisticsAggregator<T> visitor) {
  visitor.accept(rootData);
  if (allData != null) {
    for (Iterator<StatisticsData> iter = allData.iterator();
        iter.hasNext(); ) {
      StatisticsData data = iter.next();
      visitor.accept(data);
      if (data.owner.get() == null) {
        /*
         * If the thread that created this thread-local data no
         * longer exists, remove the StatisticsData from our list
         * and fold the values into rootData.
         */
        rootData.add(data);
        iter.remove();
      }
    }
  }
  return visitor.aggregate();
}
{code}

As part of running the visitor, it checks to see if the underlying thread is 
gone, and if so, adds the data for that thread to {{rootData}} and removes the 
instance from the list.

This pattern almost literally cries out for using a {{PhantomReference}}. That 
way, we can perform this operation as soon as the garbage collector clears up 
the threads. I'll draw up a patch based on that idea soon.
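
For illustration only, the {{PhantomReference}} idea could look roughly like the sketch below. This is not the patch attached to the issue: {{PhantomCleanupSketch}}, {{DataRef}}, {{register}}, {{drainPending}} and {{startCleaner}} are hypothetical names, and the per-thread data is reduced to a single counter. Each thread's data hangs off a phantom reference registered with a {{ReferenceQueue}}; whoever consumes the queue folds the data into the root counters without waiting for a getter call.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

/** Sketch of queue-driven cleanup; all names here are hypothetical. */
final class PhantomCleanupSketch {
  static final class Data {
    long bytesRead;
  }

  /** Phantom reference to the owner thread that also carries its counters. */
  static final class DataRef extends PhantomReference<Thread> {
    final Data data;

    DataRef(Thread owner, Data data, ReferenceQueue<Thread> queue) {
      super(owner, queue);
      this.data = data;
    }
  }

  private final ReferenceQueue<Thread> queue = new ReferenceQueue<>();
  private final Set<DataRef> allData = new HashSet<>();
  private final Data rootData = new Data();

  /** Registers counters for a thread and tracks them via a phantom reference. */
  synchronized DataRef register(Thread owner) {
    DataRef ref = new DataRef(owner, new Data(), queue);
    allData.add(ref);
    return ref;
  }

  synchronized int retainedEntries() {
    return allData.size();
  }

  synchronized long rootBytesRead() {
    return rootData.bytesRead;
  }

  /** Folds the counters of a dead thread into rootData and drops the entry. */
  private synchronized void fold(DataRef ref) {
    if (allData.remove(ref)) {
      rootData.bytesRead += ref.data.bytesRead;
    }
  }

  /** Processes references that are already enqueued, without blocking. */
  int drainPending() {
    int processed = 0;
    Reference<? extends Thread> ref;
    while ((ref = queue.poll()) != null) {
      fold((DataRef) ref);
      processed++;
    }
    return processed;
  }

  /** Starts a daemon thread that blocks on the queue and cleans up eagerly. */
  Thread startCleaner() {
    Thread cleaner = new Thread(() -> {
      try {
        while (true) {
          fold((DataRef) queue.remove());
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "stats-cleaner");
    cleaner.setDaemon(true);
    cleaner.start();
    return cleaner;
  }

  public static void main(String[] args) {
    PhantomCleanupSketch stats = new PhantomCleanupSketch();
    Thread owner = new Thread();
    DataRef ref = stats.register(owner);
    ref.data.bytesRead = 42;
    ref.enqueue(); // stand-in for the GC enqueueing the reference
    stats.drainPending();
    System.out.println("retained: " + stats.retainedEntries());
    System.out.println("root bytes read: " + stats.rootBytesRead());
  }
}
```

The {{main}} method calls {{Reference.enqueue()}} by hand as a stand-in for the garbage collector, so the result does not depend on when a collection happens. In a real process the cleaner thread from {{startCleaner()}} would do the same work whenever the collector enqueues a reference.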
