[ https://issues.apache.org/jira/browse/SLIDER-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gour Saha updated SLIDER-1190:
------------------------------
Parent Issue: SLIDER-1216 (was: SLIDER-1185)
> Provide solution to possible memory issues with storing app diagnostics for
> large no of containers
> --------------------------------------------------------------------------------------------------
>
> Key: SLIDER-1190
> URL: https://issues.apache.org/jira/browse/SLIDER-1190
> Project: Slider
> Issue Type: Sub-task
> Components: appmaster, client
> Affects Versions: Slider 0.91
> Reporter: Gour Saha
> Fix For: Slider 1.0.0
>
>
> [~billie.rinaldi] raised a very important point about a potential memory issue
> in SLIDER-1187.
> I wanted to capture her point and my initial thoughts on it. Let's use
> this JIRA to discuss the topic further and find the best solution.
> Billie's question: Do you think this will cause memory issues for long-lived
> AMs?
> Gour's initial thoughts: I agree with you that any list that only grows
> over time is a concern for possible memory issues. However, I checked the size
> of a single container diagnostics payload and it is typically between 4 and 5
> KB. So about 100,000 containers would consume roughly 400-500 MB. This is
> at the borderline of acceptability for a 1 GB AM container. However, on most
> production clusters I have seen, the minimum container size is set to
> 4 GB or higher. Either way, 100K containers for a single app (even one running
> for years) is very unlikely but not impossible. We can do a couple of things
> here. 1) Provide an API that can be triggered to drop the diagnostics of all
> old/dead containers except the n most recent ones (n can be
> passed as a parameter to the API). 2) Add logic so that the AM caps the number
> of old/dead containers it retains at a limit of, say, 10,000 (configurable
> per application). Nevertheless, if an app is created with 100K+ containers we
> can still be hosed, but here we are stretching our imaginations too much.
> Anyway, I don't think we should use this patch to solve this. I am going to
> create a new sub-task for this possible memory issue.
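
To make the two options above concrete, here is a minimal Java sketch of a bounded store for completed-container diagnostics. Everything in it is an assumption for illustration: the class name `BoundedDiagnosticsStore`, its methods, and the use of plain strings for diagnostics payloads are hypothetical and are not taken from the Slider codebase or from any patch on this issue. It shows one way to combine an on-demand "keep the n most recent" trim (option 1) with an automatic cap (option 2), plus the back-of-the-envelope memory estimate from the description.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch; not part of Slider. Oldest entries sit at the head. */
final class BoundedDiagnosticsStore {
    private final Deque<String> completed = new ArrayDeque<>();
    private final int maxCompleted;

    /** @param maxCompleted cap on retained old/dead container entries (e.g. 10,000). */
    BoundedDiagnosticsStore(int maxCompleted) {
        if (maxCompleted < 0) {
            throw new IllegalArgumentException("maxCompleted must be >= 0");
        }
        this.maxCompleted = maxCompleted;
    }

    /** Option 2: record a completed container and evict the oldest entries beyond the cap. */
    synchronized void recordCompleted(String diagnostics) {
        completed.addLast(diagnostics);
        while (completed.size() > maxCompleted) {
            completed.removeFirst();
        }
    }

    /** Option 1: drop everything except the n most recent entries; returns the number dropped. */
    synchronized int retainMostRecent(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be >= 0");
        }
        int dropped = 0;
        while (completed.size() > n) {
            completed.removeFirst();
            dropped++;
        }
        return dropped;
    }

    synchronized int size() {
        return completed.size();
    }

    /** Copy of the retained entries, oldest first. */
    synchronized List<String> snapshot() {
        return new ArrayList<>(completed);
    }

    /** Rough payload-only estimate; ignores JVM object overhead. */
    static long estimateBytes(long containers, long bytesPerEntry) {
        return containers * bytesPerEntry;
    }
}
```

With 5 KB per entry, `estimateBytes(100_000, 5 * 1024)` gives 512,000,000 bytes (about 488 MiB), while a cap of 10,000 entries keeps the same payload at about 49 MiB. Note that a cap on old/dead containers does nothing for an app that has 100K+ live containers, which matches the caveat in the description.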
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)