> On Jan. 8, 2020, 7:07 a.m., Greg Mann wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp
> > Lines 199 (patched)
> > <https://reviews.apache.org/r/71944/diff/2/?file=2193218#file2193218line199>
> >
> >     Do we really want to do this? My concern is that this will make any 
> > non-Mesos-task processes on the node (networking and security components, 
> > for example) more likely to be OOM-killed than Mesos tasks. Perhaps we 
> > should only set the OOM score adjustment for burstable tasks. What do you 
> > think?
> 
> Qian Zhang wrote:
>     I think it depends on which has higher priority and matters more: 
> guaranteed tasks or non-Mesos-task processes. In the Kubernetes implementation 
> (https://github.com/kubernetes/kubernetes/blob/v1.16.2/pkg/kubelet/qos/policy.go#L51:L53),
>  the OOM score adjustment of a guaranteed container is set to -998, and the 
> kubelet's OOM score adjustment is set to -998 as well. I think we should do 
> the same to protect guaranteed containers and the Mesos agent. What do you 
> think?
> 
> Greg Mann wrote:
>     One significant difference in the Kubernetes case is that they have 
> user-space code which kills pod processes to reclaim memory when necessary. 
> Consequently, there will be less impact if the OOM killer shows a strong 
> preference against killing guaranteed tasks.
>     
>     My intuition is that we should not set the OOM score adjustment for 
> non-bursting processes. Even if we leave it at zero, guaranteed tasks will 
> still be treated preferentially with respect to bursting tasks, since all 
> bursting tasks will have an adjustment greater than zero.

I agree that guaranteed tasks will be treated preferentially with respect to 
bursting tasks, but I am thinking about guaranteed tasks vs. non-Mesos-tasks. 
Say two guaranteed tasks are running on a node, each with a memory 
request/limit of half the node's memory, and each has used almost all of that 
request/limit, so their OOM scores will be very high (around 490 or more). Now 
if a non-Mesos-task (e.g., a system component, or even the Mesos agent itself) 
suddenly tries to use a lot of memory, the node will run short of memory, and 
the OOM killer will almost certainly kill one of the two guaranteed tasks, 
since their OOM scores are the top two on the node. I do not think K8s has 
this issue, since its guaranteed containers' OOM score adjustment is -998.


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71944/#review219158
-----------------------------------------------------------


On Jan. 8, 2020, 11:28 p.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71944/
> -----------------------------------------------------------
> 
> (Updated Jan. 8, 2020, 11:28 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10048
>     https://issues.apache.org/jira/browse/MESOS-10048
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set container process's OOM score adjust.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.hpp 
> 27d88e91fb784179effd54781f84000fe85c13eb 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 
> 0896d37761a11f55ba4b866d235c3bd2b79dcfba 
> 
> 
> Diff: https://reviews.apache.org/r/71944/diff/3/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>
