[
https://issues.apache.org/jira/browse/HADOOP-16278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran updated HADOOP-16278:
------------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
+1
committed to branch-3.1+. It didn't go into 3.0, but if someone does a patch there,
I'll add it too. I suspect it was the addition of a new quantile in 3.2 which
amplified the problem.
> With S3A Filesystem, Long Running services End up Doing lot of GC and
> eventually die
> ------------------------------------------------------------------------------------
>
> Key: HADOOP-16278
> URL: https://issues.apache.org/jira/browse/HADOOP-16278
> Project: Hadoop Common
> Issue Type: Bug
> Components: common, hadoop-aws, metrics
> Affects Versions: 3.1.0, 3.1.1, 3.1.2
> Reporter: Rajat Khandelwal
> Assignee: Rajat Khandelwal
> Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HADOOP-16278.patch, Screenshot 2019-04-30 at 12.52.42
> PM.png, Screenshot 2019-04-30 at 2.33.59 PM.png
>
>
> I'll start with the symptoms and eventually come to the cause.
>
> We are using HDP 3.1 and noticed that every couple of days the Hive Metastore
> starts doing GC, sometimes with 30-minute-long pauses, although nothing is
> collected and the heap remains fully used.
>
> Next, we looked at the heap dump and found that 99% of the memory is taken up
> by the task queue of a single ExecutorService.
>
> !Screenshot 2019-04-30 at 12.52.42 PM.png!
> The instance is created like this:
> {code:java}
> private static final ScheduledExecutorService scheduler = Executors
>     .newScheduledThreadPool(1, new ThreadFactoryBuilder().setDaemon(true)
>         .setNameFormat("MutableQuantiles-%d").build());
> {code}
>
> So all instances of MutableQuantiles share a single-threaded
> ExecutorService.
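>
> To make the heap-dump finding concrete, here is a minimal, self-contained
> sketch (my illustration; the class and field names are made up, not the real
> Hadoop classes) showing that a periodic task handed to such a static,
> single-threaded scheduler stays strongly reachable through the executor's
> work queue, together with everything it captures, until it is cancelled:
> {code:java}
> import java.util.concurrent.ScheduledThreadPoolExecutor;
> import java.util.concurrent.TimeUnit;
>
> public class SchedulerRetentionDemo {
>   // Same shape as the MutableQuantiles scheduler: static, single thread,
>   // alive for the whole lifetime of the JVM. (This is what
>   // Executors.newScheduledThreadPool(1) constructs under the hood.)
>   private static final ScheduledThreadPoolExecutor SCHEDULER =
>       new ScheduledThreadPoolExecutor(1);
>
>   public static void main(String[] args) {
>     byte[] bigState = new byte[64 * 1024 * 1024];
>     // The periodic task captures bigState; the task object, and therefore
>     // bigState, is retained by SCHEDULER's work queue until the task is
>     // cancelled, even though no other reference to it exists.
>     SCHEDULER.scheduleAtFixedRate(() -> touch(bigState), 1, 1, TimeUnit.SECONDS);
>     System.out.println("tasks retained in queue: " + SCHEDULER.getQueue().size());
>     SCHEDULER.shutdownNow();
>   }
>
>   private static void touch(byte[] state) {
>     // No-op; exists only so the lambda keeps a reference to the buffer.
>   }
> }
> {code}
>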
> The second thing to notice is this block of code in the constructor of
> MutableQuantiles:
> {code:java}
> this.scheduledTask = scheduler.scheduleAtFixedRate(
>     new MutableQuantiles.RolloverSample(this),
>     (long) interval, (long) interval, TimeUnit.SECONDS);
> {code}
> So as soon as a MutableQuantiles instance is created, one task is scheduled
> at a fixed rate. Instead, the tasks could be scheduled with a fixed delay
> (refer to HADOOP-16248).
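>
> For reference (my illustration, not code from Hadoop), the difference between
> the two scheduling modes looks like this; with a fixed delay, a slow rollover
> cannot cause executions to queue up back-to-back, which is the change
> HADOOP-16248 refers to:
> {code:java}
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.ScheduledFuture;
> import java.util.concurrent.TimeUnit;
>
> class SchedulingModes {
>   // Fixed rate (what MutableQuantiles uses today): the next run is due
>   // `interval` seconds after the *start* of the previous one, so a slow run
>   // lets executions bunch up and fire back-to-back.
>   static ScheduledFuture<?> atFixedRate(ScheduledExecutorService scheduler,
>                                         Runnable task, long interval) {
>     return scheduler.scheduleAtFixedRate(task, interval, interval, TimeUnit.SECONDS);
>   }
>
>   // Fixed delay (the alternative): the next run starts `interval` seconds
>   // after the previous one *finishes*, so slow runs can never build a backlog.
>   static ScheduledFuture<?> withFixedDelay(ScheduledExecutorService scheduler,
>                                            Runnable task, long interval) {
>     return scheduler.scheduleWithFixedDelay(task, interval, interval, TimeUnit.SECONDS);
>   }
> }
> {code}
>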
> Now coming to why it's related to S3.
>
> S3AFileSystem creates an instance of S3AInstrumentation, which creates two
> quantiles (related to S3Guard) with a hardcoded 1s interval and leaves them
> hanging; by hanging I mean perpetually scheduled. Whenever new instances of
> S3AFileSystem are created, two new quantiles are created, which in turn
> schedule two tasks that are never cancelled. This way the number of scheduled
> tasks keeps growing without ever being cleaned up, leading to GC/OOM/crash.
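>
> That growth can be reproduced outside Hadoop in a few lines. The
> Instrumentation class below is a made-up stand-in for S3AInstrumentation
> plus its two MutableQuantiles, not the real classes, but it leaks the same
> way: every construction schedules periodic work on a shared executor, and
> close() never cancels it:
> {code:java}
> import java.util.concurrent.ScheduledThreadPoolExecutor;
> import java.util.concurrent.TimeUnit;
>
> public class QuantileLeakDemo {
>   // Shared, JVM-wide scheduler, as in MutableQuantiles.
>   static final ScheduledThreadPoolExecutor SCHEDULER = new ScheduledThreadPoolExecutor(1);
>
>   /** Made-up stand-in for S3AInstrumentation and its two quantiles. */
>   static class Instrumentation implements AutoCloseable {
>     Instrumentation() {
>       // Two quantiles with a hardcoded 1s interval, as described above.
>       SCHEDULER.scheduleAtFixedRate(() -> { }, 1, 1, TimeUnit.SECONDS);
>       SCHEDULER.scheduleAtFixedRate(() -> { }, 1, 1, TimeUnit.SECONDS);
>     }
>
>     @Override
>     public void close() {
>       // The bug: the two scheduled tasks are never cancelled here.
>     }
>   }
>
>   public static void main(String[] args) {
>     for (int i = 0; i < 1000; i++) {
>       Instrumentation instr = new Instrumentation(); // as in S3AFileSystem.initialize()
>       instr.close();                                 // closing does not help
>     }
>     // Two periodic tasks per instance stay scheduled forever; this prints
>     // roughly 2000 even though every Instrumentation has been closed.
>     System.out.println("leaked periodic tasks: " + SCHEDULER.getQueue().size());
>     SCHEDULER.shutdownNow();
>   }
> }
> {code}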
>
> MutableQuantiles has a numInfo field which holds things like the name of the
> metric. From the heap dump, I found one numInfo instance and traced all
> objects referencing it.
>
> !Screenshot 2019-04-30 at 2.33.59 PM.png!
>
> There seem to be 300K objects for the same metric
> (S3Guard_metadatastore_throttle_rate).
> As expected, there are another 300K objects for the other MutableQuantiles
> created by the S3AInstrumentation class, even though there are only 4
> instances of S3AInstrumentation.
> Clearly, there is a leak. One S3AInstrumentation instance creates two
> scheduled tasks to be run every second. These tasks are left scheduled and
> are not cancelled when S3AInstrumentation.close() is called, so they are
> never cleaned up. GC cannot collect them either, since they are referenced
> by the scheduler.
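> The shape of the fix is to keep the handle returned by scheduleAtFixedRate()
> (the scheduledTask field shown earlier) and cancel it when the owning object
> is closed. A minimal sketch of that pattern follows; the names are
> hypothetical and this is not the attached patch:
> {code:java}
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.ScheduledFuture;
> import java.util.concurrent.TimeUnit;
>
> /** Sketch only: an owner of a periodic task that cancels it on close(). */
> class CancellingInstrumentation implements AutoCloseable {
>   private static final ScheduledExecutorService SCHEDULER =
>       Executors.newScheduledThreadPool(1);
>
>   // Keep the handle so the periodic task can be cancelled later.
>   private final ScheduledFuture<?> rolloverTask;
>
>   CancellingInstrumentation(long intervalSeconds) {
>     this.rolloverTask = SCHEDULER.scheduleAtFixedRate(
>         () -> { /* roll over the quantile sample here */ },
>         intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
>   }
>
>   @Override
>   public void close() {
>     // Once cancelled, the task is no longer re-scheduled and drops out of the
>     // scheduler's queue, so it (and whatever it references) can be collected.
>     rolloverTask.cancel(false);
>   }
> }
> {code}
> Doing the equivalent for the two quantiles in S3AInstrumentation.close()
> would stop the queue from growing as filesystems are created and closed.
>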
> Who creates S3AInstrumentation instances? S3AFileSystem.initialize(), which
> is called from FileSystem.get(URI, Configuration). Since the Hive Metastore
> is a service that deals with a lot of Path objects and therefore makes a lot
> of calls to FileSystem.get, it is the first to show these symptoms.
> We're seeing similar symptoms in the AM of long-running jobs (both Tez AM
> and MR AM).
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)