[ 
https://issues.apache.org/jira/browse/MESOS-9852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904921#comment-16904921
 ] 

longfei commented on MESOS-9852:
--------------------------------

I found that every terminated task (whether completed or unreachable) is put 
into slaves.unreachableTasks and is only erased in _doRegistryGc.

So an agent's unreachableTasks (held in the master's memory) are released only 
when that agent becomes unreachable or gone. 

If all agents are working properly, the master's memory will keep growing 
because of slaves.unreachableTasks.
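
To make the concern concrete, here is a minimal toy model of the behavior described above. It is a sketch of the hypothesis only, not the actual master code: ToyMaster, taskTerminated, registryGc and remembered are hypothetical names, and the assumption that every terminated task is recorded is exactly the point in question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the master's bookkeeping; not actual Mesos code.
struct ToyMaster {
  // Agent id -> ids of terminated tasks remembered for that agent.
  std::unordered_map<std::string, std::vector<std::string>> unreachableTasks;

  // Assumed behavior: every terminated task is recorded here.
  void taskTerminated(const std::string& agent, const std::string& task) {
    unreachableTasks[agent].push_back(task);
  }

  // Assumed behavior: entries are erased only when GC runs for an agent
  // that has become unreachable or gone.
  void registryGc(const std::string& agent) {
    unreachableTasks.erase(agent);
  }

  // Total number of task ids currently held in memory.
  std::size_t remembered() const {
    std::size_t n = 0;
    for (const auto& entry : unreachableTasks) {
      n += entry.second.size();
    }
    return n;
  }
};
```

Under these assumptions, remembered() only shrinks when registryGc is called, so a cluster whose agents never become unreachable would accumulate entries indefinitely.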

Am I right?

> Slow memory growth in master due to deferred deletion of offer filters and 
> timers.
> ----------------------------------------------------------------------------------
>
>                 Key: MESOS-9852
>                 URL: https://issues.apache.org/jira/browse/MESOS-9852
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation, master
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>            Priority: Critical
>              Labels: resource-management
>             Fix For: 1.5.4, 1.6.3, 1.7.3, 1.8.1, 1.9.0
>
>         Attachments: _tmp_libprocess.Do1MrG_profile (1).dump, 
> _tmp_libprocess.Do1MrG_profile (1).svg, _tmp_libprocess.Do1MrG_profile 
> 24hours.dump, _tmp_libprocess.Do1MrG_profile 24hours.svg, screenshot-1.png, 
> statistics
>
>
> The allocator does not keep a handle to the offer filter timer, which means 
> it cannot remove the timer overhead (in this case memory) when removing the 
> offer filter earlier (e.g. due to revive):
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L1338-L1352
> In addition, the offer filter is allocated on the heap but not deleted until 
> the timer fires (which might take forever!):
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L1321
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L1408-L1413
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L2249
> We'll need to try to backport this to all active release branches.
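
For illustration, the general idea of keeping a timer handle next to each offer filter can be sketched as below. This is a self-contained toy under stated assumptions, not the allocator's code or the actual fix: ToyTimers, ToyFilter and ToyAllocator are hypothetical, and the timer queue is a plain map rather than libprocess.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Hypothetical toy timer queue (not libprocess): timer id -> callback.
class ToyTimers {
public:
  using Id = std::uint64_t;

  // Registers a callback and returns a handle that can cancel it later.
  Id schedule(std::function<void()> callback) {
    Id id = next_++;
    pending_.emplace(id, std::move(callback));
    return id;
  }

  // Drops a pending timer; its callback will never run.
  bool cancel(Id id) { return pending_.erase(id) > 0; }

  // Simulates the timer expiring: removes it, then runs its callback.
  void fire(Id id) {
    auto it = pending_.find(id);
    if (it == pending_.end()) return;
    std::function<void()> callback = std::move(it->second);
    pending_.erase(it);
    callback();
  }

  std::size_t pending() const { return pending_.size(); }

private:
  Id next_ = 0;
  std::map<Id, std::function<void()>> pending_;
};

struct ToyFilter {};  // Placeholder for a heap-allocated offer filter.

class ToyAllocator {
public:
  explicit ToyAllocator(ToyTimers& timers) : timers_(timers) {}

  // Installs a filter and stores the timer handle alongside it.
  int addFilter() {
    int filterId = nextFilter_++;
    ToyTimers::Id timer =
        timers_.schedule([this, filterId]() { filters_.erase(filterId); });
    filters_.emplace(
        filterId, Entry{std::unique_ptr<ToyFilter>(new ToyFilter()), timer});
    return filterId;
  }

  // Early removal (e.g. on revive): cancel each timer, then free the filters.
  void revive() {
    for (const auto& kv : filters_) {
      timers_.cancel(kv.second.timer);
    }
    filters_.clear();
  }

  std::size_t filterCount() const { return filters_.size(); }

private:
  struct Entry {
    std::unique_ptr<ToyFilter> filter;  // Freed as soon as the entry is erased.
    ToyTimers::Id timer;                // Handle used to cancel the expiry.
  };

  ToyTimers& timers_;
  int nextFilter_ = 0;
  std::map<int, Entry> filters_;
};
```

In this toy, revive() leaves neither filters nor pending timers behind, whereas without the stored handle both would linger until each timer fired.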



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
