[ https://issues.apache.org/jira/browse/MESOS-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019733#comment-17019733 ]
Charles commented on MESOS-1807:
--------------------------------

Is there any way I could help move this forward? I just got bitten by this: my custom executor fails with random errors in the situation described by [~vinodkone] as "when the last task on the executor finishes and Containerizer::update() is called with 0 cpus or 0 mem.". See for example https://github.com/mesos/chronos/issues/428

{noformat}
ec2-__-___-___-___.compute-1.amazonaws.com E0414 00:41:50.864876 29069 slave.cpp:2344] Failed to update resources for container 867bfec1-ac28-4a4f-8904-3404e6d1e3e9 of executor shell-wrapper-executor running task ct:1428972109061:0:my-chronos-job on status update for terminal task, destroying container: Collect failed: No cpus resource given
{noformat}

In the meantime, what's the proper workaround? Always define CPU and memory resources for the executor? It's a bit annoying because it effectively means arbitrarily limiting the CPU usage of the task (e.g. if there's 1 core and we allocate 0.01 CPU to the executor, we only have 0.99 left for the task), but I guess there's not really any way around that.

Maybe [~bmahler] has an idea?

> Disallow executors with cpu only or memory only resources
> ---------------------------------------------------------
>
>                 Key: MESOS-1807
>                 URL: https://issues.apache.org/jira/browse/MESOS-1807
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Vinod Kone
>            Priority: Major
>         Attachments: Screenshot 2015-07-28 14.40.35.png
>
>
> Currently master allows executors to be launched with either only cpus or
> only memory but we shouldn't allow that.
> This is because executor is an actual unix process that is launched by the
> slave. If an executor doesn't specify cpus, what should the cpu limits be for
> that executor when there are no tasks running on it? If no cpu limits are set
> then it might starve other executors/tasks on the slave violating isolation
> guarantees. Same goes with memory.
> Moreover, the current containerizer/isolator code will throw failures when
> using such an executor, e.g., when the last task on the executor finishes and
> Containerizer::update() is called with 0 cpus or 0 mem.
> According to a source code [TODO |
> https://github.com/apache/mesos/blob/0226620747e1769434a1a83da547bfc3470a9549/src/master/validation.cpp#L400]
> this should also include checking whether requested resources are greater
> than MIN_CPUS/MIN_BYTES.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
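
For illustration only, the check the issue description asks for (reject executors that specify only cpus or only memory, and require each to meet a minimum) could be sketched roughly as follows. This is not the Mesos implementation: the function name, the dictionary representation of resources, and the numeric minimums are all assumptions made for the example, since the issue names MIN_CPUS/MIN_BYTES without giving their values.

```python
# Hypothetical minimums: the issue mentions MIN_CPUS/MIN_BYTES but gives no
# values, so these numbers are placeholders chosen for the example.
MIN_CPUS = 0.01
MIN_MEM_MB = 32


def validate_executor_resources(resources):
    """Return an error message, or None if the resources are acceptable.

    `resources` is a plain dict such as {"cpus": 0.1, "mem": 64}; this shape
    is an assumption for the sketch, not the Mesos protobuf format.
    """
    cpus = resources.get("cpus")
    mem = resources.get("mem")

    # Reject executors that leave out either resource entirely.
    if cpus is None:
        return "executor must specify cpus"
    if mem is None:
        return "executor must specify mem"

    # Reject values below the (assumed) minimums.
    if cpus < MIN_CPUS:
        return "executor cpus below minimum"
    if mem < MIN_MEM_MB:
        return "executor mem below minimum"

    return None


def cpus_left_for_task(total_cpus, executor_cpus):
    """CPU remaining for the task once the executor's share is reserved."""
    return total_cpus - executor_cpus
```

With these placeholder values, an executor declaring both `cpus` and `mem` at or above the minimums passes, while one declaring only `cpus` is rejected. `cpus_left_for_task(1, 0.01)` mirrors the trade-off raised in the comment: reserving 0.01 CPU for the executor on a 1-core machine leaves about 0.99 for the task.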