----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72216/#review219999 -----------------------------------------------------------
src/master/validation.cpp Lines 1839-1841 (patched) <https://reviews.apache.org/r/72216/#comment308245> What if the task does not set `ContainerInfo` at all? In that case, I think task's `share_cgroups` is also true (since it is true by default) and we do not want it to have resource limits set as well, right? src/master/validation.cpp Lines 1839-1845 (patched) <https://reviews.apache.org/r/72216/#comment308246> So we have this validation only for the tasks within a task group but not for the tasks launched by `LAUNCH` operation? That means for the task whose `share_cgroups` is true, if it is launched by `LAUNCH_GROUP` then it is not allowed to specify resource limits, but if it is launched by `LAUNCH` then it is allowed to specify resource limits, this may confuse users, do we want to make them consistent? Like always allow to specify resource limits for any tasks. If we do this, then I think we may need to update the comments of the `TaskInfo.limits` and `Task.limits` protobuf message fields, since that field may have different meaning depending on the value of `share_cgroups`, like if `share_cgroups==false`, it is the limits of the task itself, otherwise the task may share limits with other tasks launched by the same executor. src/master/validation.cpp Lines 2030-2036 (patched) <https://reviews.apache.org/r/72216/#comment308239> I think we can change this block to: ``` bool taskShareCgroups = (container.isSome() && container->has_linux_info() && container->linux_info().has_share_cgroups()) ? container->linux_info().share_cgroups() : true; ``` src/master/validation.cpp Lines 2054-2055 (patched) <https://reviews.apache.org/r/72216/#comment308240> A newline between. src/master/validation.cpp Lines 2059-2064 (patched) <https://reviews.apache.org/r/72216/#comment308241> I see the caller of this method already have a slave pointer `Slave* slave`, so maybe we can just pass `slave->id` into this method instead of getting it from task? And in another hand, it seems we do not validate a task group must contain at least one task (looks like an existing bug), so if framework launches an empty task group, `CHECK_SOME(slaveId);` will fail. src/master/validation.cpp Lines 2072 (patched) <https://reviews.apache.org/r/72216/#comment308242> Is it possible that `shareCgroups.isSome()` is false at this point? I think as long as the task cgroup is not empty (at least contains one task), `shareCgroups.isSome()` must be true, right? So maybe we should do `CHECK_SOME(shareCgroups)`? src/master/validation.cpp Lines 2073 (patched) <https://reviews.apache.org/r/72216/#comment308243> I think we should not check all tasks of the framework, instead we should only check the tasks launched by the same executor on this specific slave for this specific framework. Maybe we should go through `slave->tasks.at(framework.id())` and check each task launched by the same executor. Or we could do this validation in Mesos agent? - Qian Zhang On March 19, 2020, 10:10 a.m., Greg Mann wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72216/ > ----------------------------------------------------------- > > (Updated March 19, 2020, 10:10 a.m.) > > > Review request for mesos, Andrei Budnik and Qian Zhang. > > > Bugs: MESOS-10045 > https://issues.apache.org/jira/browse/MESOS-10045 > > > Repository: mesos > > > Description > ------- > > Added master validation for task resource limits and shared cgroups. > > > Diffs > ----- > > src/master/validation.cpp 084f281eadd65cb8ae0a19b7b7797dc71ccebdd2 > > > Diff: https://reviews.apache.org/r/72216/diff/5/ > > > Testing > ------- > > > Thanks, > > Greg Mann > >