[
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623136#comment-14623136
]
Jie Yu commented on MESOS-2652:
-------------------------------
{quote}E.g., high share ratio, revokable is idle, non-revokable consumes a ton
of cpu time (more than, say, the 1000:1 ratio), then goes idle, revokable then
has something to do and starts running ==> now what happens if the
non-revokable wants to run? Won't the revokable task continue to run until the
share ratio is equalized?{quote}
As far as I know, no such preemption mechanism exists in the kernel that we
can use. Real-time priority does allow preemption, but real-time priority is
not compatible with cgroups
(http://www.novell.com/support/kb/doc.php?id=7012851).
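To illustrate what the KB article describes, here is a minimal sketch (my
assumptions, not code from the ticket: a kernel built with
CONFIG_RT_GROUP_SCHED, the calling task already placed in a cpu cgroup whose
cpu.rt_runtime_us is 0, and an arbitrary priority value):
{code:cpp}
#include <sched.h>

#include <cerrno>
#include <cstring>
#include <iostream>

int main() {
  sched_param param{};
  param.sched_priority = 10;  // arbitrary realtime priority for the sketch

  // With CONFIG_RT_GROUP_SCHED enabled, this call fails with EPERM for a
  // task sitting in a cpu cgroup whose cpu.rt_runtime_us is 0 (the default
  // for newly created child cgroups), even when run as root.
  if (sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
    std::cerr << "sched_setscheduler failed: " << std::strerror(errno)
              << std::endl;
    return 1;
  }

  std::cout << "task is now SCHED_FIFO" << std::endl;
  return 0;
}
{code}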
{quote}I don't know the answer without reading the scheduler source code but
given that my assumption about SCHED_IDLE turned out to be incomplete/incorrect
then let's understand the preemption behavior before committing another
incorrect mechanism{quote}
Yeah, I am using the benchmark I mentioned above to see whether the new
hierarchy works as expected. I'll probably add another latency benchmark (e.g.
http://parsa.epfl.ch/cloudsuite/memcached.html) to see whether latency is
affected.
But given that we don't have a way to let the kernel preempt revocable tasks,
setting shares seems to be the only solution.
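For what it's worth, the shares approach boils down to writing a very small
weight into the revocable subtree's cpu.shares, which is essentially what the
attached "reducing cpu.share to 2" experiments did. A minimal sketch, assuming
a cgroup v1 cpu hierarchy mounted at /sys/fs/cgroup/cpu and placeholder
subtree names (the real Mesos hierarchy paths differ):
{code:cpp}
#include <fstream>
#include <iostream>
#include <string>

// Write a single value into a cgroup control file. Requires root.
static bool writeControl(const std::string& path, const std::string& value) {
  std::ofstream file(path);
  if (!file.is_open()) {
    return false;
  }
  file << value;
  return file.good();
}

int main() {
  const std::string cpu = "/sys/fs/cgroup/cpu";

  // Placeholder paths: the non-revocable subtree keeps the default weight
  // (1024) while the revocable subtree is dropped to 2, mirroring the
  // experiment shown in the attachments.
  bool ok = writeControl(cpu + "/mesos/cpu.shares", "1024") &&
            writeControl(cpu + "/mesos_revocable/cpu.shares", "2");

  std::cout << (ok ? "cpu.shares updated" : "failed to update cpu.shares")
            << std::endl;
  return ok ? 0 : 1;
}
{code}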
> Update Mesos containerizer to understand revocable cpu resources
> ----------------------------------------------------------------
>
> Key: MESOS-2652
> URL: https://issues.apache.org/jira/browse/MESOS-2652
> Project: Mesos
> Issue Type: Task
> Reporter: Vinod Kone
> Assignee: Ian Downes
> Labels: twitter
> Fix For: 0.23.0
>
> Attachments: Abnormal performance with 3 additional revocable tasks
> (1).png, Abnormal performance with 3 additional revocable tasks (2).png,
> Abnormal performance with 3 additional revocable tasks (3).png, Abnormal
> performance with 3 additional revocable tasks (4).png, Abnormal performance
> with 3 additional revocable tasks (5).png, Abnormal performance with 3
> additional revocable tasks (6).png, Abnormal performance with 3 additional
> revocable tasks (7).png, Performance improvement after reducing cpu.share to
> 2 for revocable tasks (1).png, Performance improvement after reducing
> cpu.share to 2 for revocable tasks (10).png, Performance improvement after
> reducing cpu.share to 2 for revocable tasks (2).png, Performance improvement
> after reducing cpu.share to 2 for revocable tasks (3).png, Performance
> improvement after reducing cpu.share to 2 for revocable tasks (4).png,
> Performance improvement after reducing cpu.share to 2 for revocable tasks
> (5).png, Performance improvement after reducing cpu.share to 2 for revocable
> tasks (6).png, Performance improvement after reducing cpu.share to 2 for
> revocable tasks (7).png, Performance improvement after reducing cpu.share to
> 2 for revocable tasks (8).png, Performance improvement after reducing
> cpu.share to 2 for revocable tasks (9).png
>
>
> The CPU isolator needs to properly set limits for revocable and non-revocable
> containers.
> The proposed strategy is to use a two-way split of the cpu cgroup hierarchy
> -- normal (non-revocable) and low priority (revocable) subtrees -- and to use
> a biased split of CFS cpu.shares across the subtrees, e.g., a 20:1 split
> (TBD). Containers would be present in only one of the subtrees. CFS quotas
> will *not* be set on subtree roots, only cpu.shares. Each container would set
> CFS quota and shares as done currently.
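A rough sketch of what that proposed layout could look like (illustrative
only: it assumes a cgroup v1 cpu hierarchy mounted at /sys/fs/cgroup/cpu, and
the subtree names, the 20:1 weights, and the example container sizing are
placeholders rather than the isolator's actual values):
{code:cpp}
#include <sys/stat.h>

#include <fstream>
#include <string>

// Write a single value into a cgroup control file; errors are ignored for
// brevity in this sketch. Requires root.
static bool writeControl(const std::string& path, const std::string& value) {
  std::ofstream file(path);
  file << value;
  return file.good();
}

int main() {
  const std::string root = "/sys/fs/cgroup/cpu/mesos";

  // Two-way split: only cpu.shares on the subtree roots, no CFS quota,
  // biased e.g. 20:1 in favor of the non-revocable (normal) subtree.
  mkdir((root + "/normal").c_str(), 0755);
  mkdir((root + "/revocable").c_str(), 0755);
  writeControl(root + "/normal/cpu.shares", "20480");    // 20 x 1024
  writeControl(root + "/revocable/cpu.shares", "1024");  //  1 x 1024

  // A container lives in exactly one subtree and keeps its own shares and
  // CFS quota, as the isolator sets them today; e.g. a 0.5-cpu container:
  // shares = 0.5 * 1024 = 512, quota = 50ms out of a 100ms period.
  const std::string container = root + "/revocable/container-example";
  mkdir(container.c_str(), 0755);
  writeControl(container + "/cpu.shares", "512");
  writeControl(container + "/cpu.cfs_period_us", "100000");
  writeControl(container + "/cpu.cfs_quota_us", "50000");

  return 0;
}
{code}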
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)