[ 
https://issues.apache.org/jira/browse/HADOOP-16248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexis Daboville updated HADOOP-16248:
--------------------------------------
    Description: 
In some circumstances (high GC, high CPU usage, creating lots of
S3AFileSystem instances), MutableQuantiles::scheduler [1] can fall behind
in processing the tasks submitted to it. Because tasks are submitted on a
regular schedule, the unbounded queue backing the {{ExecutorService}} might
grow to several gigabytes [2]. The proposed fix [3] is simple: replace
{{scheduler.scheduleAtFixedRate}} with {{scheduler.scheduleWithFixedDelay}},
so that this leak does not happen under pressure.

[1] it is single threaded and shared across all instances of 
{{MutableQuantiles}}: 
[https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableQuantiles.java#L66-L68]

[2] see attached mutable-quantiles-leak.png.

[3] mutable-quantiles.patch
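
For readers unfamiliar with the two scheduling modes named in the description, here is a minimal standalone sketch of how they differ on a single-threaded {{ScheduledExecutorService}} when the task takes longer than its period. It is not the Hadoop code or the attached patch: the class name, method name, and timings are made up for illustration. It only measures the idle gap between the end of one run and the start of the next. At a fixed rate, overdue runs start back to back; with a fixed delay, the delay is always observed after each run completes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Hadoop.
public class FixedRateVsFixedDelay {

  /**
   * Runs a slow task `runs` times on a single-threaded scheduler and returns
   * the smallest idle gap, in milliseconds, between the end of one run and
   * the start of the next.
   */
  static long minGapMillis(boolean fixedRate, long periodMs, long workMs, int runs) {
    ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    List<long[]> spans = new ArrayList<>(); // {startNanos, endNanos} per run
    CountDownLatch done = new CountDownLatch(runs);

    Runnable slowTask = () -> {
      if (done.getCount() == 0) {
        return; // enough samples collected, ignore any extra run
      }
      long start = System.nanoTime();
      try {
        Thread.sleep(workMs); // simulate a task slower than its period
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      spans.add(new long[] {start, System.nanoTime()});
      done.countDown();
    };

    try {
      if (fixedRate) {
        scheduler.scheduleAtFixedRate(slowTask, 0, periodMs, TimeUnit.MILLISECONDS);
      } else {
        scheduler.scheduleWithFixedDelay(slowTask, 0, periodMs, TimeUnit.MILLISECONDS);
      }
      done.await(); // the latch also makes the worker's writes visible here
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for runs", e);
    } finally {
      scheduler.shutdownNow();
    }

    long minGapNanos = Long.MAX_VALUE;
    for (int i = 1; i < spans.size(); i++) {
      long gap = spans.get(i)[0] - spans.get(i - 1)[1];
      minGapNanos = Math.min(minGapNanos, gap);
    }
    return TimeUnit.NANOSECONDS.toMillis(minGapNanos);
  }

  public static void main(String[] args) {
    // Period/delay of 20 ms, task duration of 60 ms, 5 runs each.
    System.out.println("fixed rate,  min gap ms: " + minGapMillis(true, 20, 60, 5));
    System.out.println("fixed delay, min gap ms: " + minGapMillis(false, 20, 60, 5));
  }
}
```

With the fixed-delay variant, the scheduler gets a guaranteed pause after every run, whereas the fixed-rate variant keeps trying to catch up with the original timetable whenever runs are slow.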

  was:
In some circumstances (high GC, high CPU usage, creating lots of
S3AFileSystem instances), MutableQuantiles::scheduler [1] can fall behind
in processing the tasks submitted to it. Because tasks are submitted on a
regular schedule, the unbounded queue backing the {{ExecutorService}} might
grow to several gigabytes [2]. The proposed fix is simple: replace
{{scheduler.scheduleAtFixedRate}} with {{scheduler.scheduleWithFixedDelay}},
so that this leak does not happen under pressure.

[1] it is single threaded and shared across all instances of 
{{MutableQuantiles}}: 
[https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableQuantiles.java#L66-L68]

[2] see attached mutable-quantiles-leak.png.


> Fix MutableQuantiles memory leak
> --------------------------------
>
>                 Key: HADOOP-16248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16248
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.2
>            Reporter: Alexis Daboville
>            Priority: Major
>         Attachments: mutable-quantiles-leak.png, mutable-quantiles.patch



