[ 
https://issues.apache.org/jira/browse/MESOS-9852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903622#comment-16903622
 ] 

longfei commented on MESOS-9852:
--------------------------------

[~bmahler]

I switched to jemalloc, which reduced the master's memory usage to about 500MB.
After running for some time, however, it grew to about 1GB.

[^statistics] is the output of the memory-profiler/statistics endpoint.

[^_tmp_libprocess.Do1MrG_profile (1).dump] is a 2-minute profiling result.

The newest commit is 8e8c6c0.

OS: Debian stretch with kernel 4.14.81.bm.11-amd64

> Slow memory growth in master due to deferred deletion of offer filters and 
> timers.
> ----------------------------------------------------------------------------------
>
>                 Key: MESOS-9852
>                 URL: https://issues.apache.org/jira/browse/MESOS-9852
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation, master
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>            Priority: Critical
>              Labels: resource-management
>             Fix For: 1.5.4, 1.6.3, 1.7.3, 1.8.1, 1.9.0
>
>         Attachments: _tmp_libprocess.Do1MrG_profile (1).dump, 
> _tmp_libprocess.Do1MrG_profile (1).svg, screenshot-1.png, statistics
>
>
> The allocator does not keep a handle to the offer filter timer, which means 
> it cannot remove the timer overhead (in this case memory) when removing the 
> offer filter earlier (e.g. due to revive):
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L1338-L1352
> In addition, the offer filter is allocated on the heap but not deleted until 
> the timer fires (which might take forever!):
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L1321
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L1408-L1413
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L2249
> We'll need to try to backport this to all active release branches.
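
For readers following along, the idea in the description above (keep a handle to the timer next to the filter, so that early removal can cancel the timer and free the filter right away) can be sketched roughly as below. This is only an illustration under stated assumptions: TimerQueue, OfferFilter and Allocator are hypothetical stand-ins written for this example. They are not the libprocess or Mesos allocator API, and this is not the actual patch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy timer facility (NOT the libprocess API).
class TimerQueue {
public:
  using Id = std::size_t;

  // Registers a callback and returns a handle that can later cancel it.
  Id schedule(std::function<void()> callback) {
    Id id = nextId_++;
    pending_.emplace(id, std::move(callback));
    return id;
  }

  // Drops a pending callback together with everything it captured.
  bool cancel(Id id) { return pending_.erase(id) > 0; }

  // Fires a timer as if its deadline had passed; no-op if cancelled.
  void fire(Id id) {
    auto it = pending_.find(id);
    if (it == pending_.end()) return;
    std::function<void()> callback = std::move(it->second);
    pending_.erase(it);
    callback();
  }

  std::size_t pending() const { return pending_.size(); }

private:
  Id nextId_ = 0;
  std::map<Id, std::function<void()>> pending_;
};

// Hypothetical filter type; the real one carries more state.
struct OfferFilter {
  std::string agent;
};

// Hypothetical allocator fragment. It must outlive the timers it schedules,
// because the expiry callback captures `this`.
class Allocator {
public:
  explicit Allocator(TimerQueue& timers) : timers_(timers) {}

  // Installs a filter and stores its timer handle alongside it.
  void addFilter(const std::string& agent) {
    removeFilter(agent);  // Replace any existing filter for this agent.
    assert(filters_.count(agent) == 0);
    TimerQueue::Id timer =
        timers_.schedule([this, agent]() { filters_.erase(agent); });
    filters_.emplace(agent, Entry{OfferFilter{agent}, timer});
  }

  // Early removal: cancel the timer and free the filter immediately.
  void removeFilter(const std::string& agent) {
    auto it = filters_.find(agent);
    if (it == filters_.end()) return;
    timers_.cancel(it->second.timer);
    filters_.erase(it);
  }

  // Revive: drop every filter and its timer, leaving nothing behind.
  void revive() {
    for (const auto& kv : filters_) timers_.cancel(kv.second.timer);
    filters_.clear();
  }

  std::size_t filterCount() const { return filters_.size(); }

private:
  struct Entry {
    OfferFilter filter;    // Owned by value, so erasing the entry frees it.
    TimerQueue::Id timer;  // Handle kept so the timer can be cancelled.
  };

  TimerQueue& timers_;
  std::map<std::string, Entry> filters_;
};
```

In this sketch, calling revive() after adding two filters leaves both filterCount() and pending() at zero, whereas a design that only frees the filter from inside the timer callback would keep both allocations alive until the timeout elapses.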



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
