> On Sept. 6, 2016, 7:38 p.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp, lines 290-291
> > <https://reviews.apache.org/r/51631/diff/1/?file=1491031#file1491031line290>
> >
> > I do not think we need this comment: if recovery fails, the agent
> > will exit, so we have no chance (and actually no need) to do any
> > cleanup.
>
> haosdent huang wrote:
> We call `cleanup` before returning `Failure` in `__recover`, so I think
> this comment is still correct here?

I took another look at `__recover()`, and I see that this method does not call `cleanup()` before returning `Failure`:
https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp#L268:L277


> On Sept. 6, 2016, 7:38 p.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp, lines 486-492
> > <https://reviews.apache.org/r/51631/diff/1/?file=1491031#file1491031line486>
> >
> > Here we may assign the pid to the cgroup of a single hierarchy
> > multiple times. For example, in the case of CPU:
> > ```
> > /cgroup/cpu,cpuacct -> cpu
> > /cgroup/cpu,cpuacct -> cpuacct
> > ```
> > With your logic here, we will call `cgroups::assign()` twice for the
> > hierarchy `/cgroup/cpu,cpuacct`.
>
> haosdent huang wrote:
> Because we have the `break` above, this would not happen.

Yes, you are right, thanks!


> On Sept. 6, 2016, 7:38 p.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp, lines 408-411
> > <https://reviews.apache.org/r/51631/diff/1/?file=1491031#file1491031line408>
> >
> > Why move this code here? Can you please let me know what the problem
> > is if we keep this code in its original location?
>
> haosdent huang wrote:
> Suppose we failed at
> ```
> if (containerConfig.has_user()) {
>   Try<Nothing> chown = os::chown(
>       containerConfig.user(),
>       path,
>       false);
>
>   if (chown.isError()) {
>     return Failure(
>         "Failed to chown the cgroup at "
>         "'" + path + "': " + chown.error());
>   }
> ```
>
> but we have
> ```
> Try<Nothing> create = cgroups::create(
>     hierarchy,
>     infos[containerId]->cgroup);
> ```
> before.
>
> Then the cgroup would not be destroyed if we don't
> `infos[containerId]->subsystems.insert(subsystem->name());`.

Got it! Then what if we fail right after `cgroups::create()` but before `infos[containerId]->subsystems.insert();`? In that case, the cgroup will not be destroyed either.
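
To make that failure window concrete, here is a minimal standalone sketch. It is not Mesos code: `Host`, `Info`, `prepare()`, `cleanup()` and `leakedAfterFailure()` are hypothetical stand-ins, and it assumes that cleanup only visits subsystems recorded in `Info::subsystems` and that destroying a cgroup which was never created is harmless.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the isolator's per-container bookkeeping.
struct Info {
  std::set<std::string> subsystems;  // subsystems that cleanup() will visit
};

// Hypothetical stand-in for the cgroups that currently exist on the host.
struct Host {
  std::set<std::string> cgroups;
};

// Simulates prepare(): creates one cgroup per subsystem and fails right
// after creating the cgroup for `failAt` (for example, a failed chown).
// If `trackFirst` is true, every subsystem is recorded before any cgroup
// is created; otherwise a subsystem is recorded only after its step
// succeeds. Returns true on success.
bool prepare(Host& host, Info& info,
             const std::vector<std::string>& subsystems,
             const std::string& failAt, bool trackFirst) {
  if (trackFirst) {
    info.subsystems.insert(subsystems.begin(), subsystems.end());
  }
  for (const std::string& name : subsystems) {
    host.cgroups.insert(name);         // models cgroups::create()
    if (name == failAt) return false;  // fails before late tracking
    if (!trackFirst) info.subsystems.insert(name);
  }
  return true;
}

// Simulates cleanup(): destroys only the cgroups of tracked subsystems.
void cleanup(Host& host, const Info& info) {
  for (const std::string& name : info.subsystems) {
    host.cgroups.erase(name);
  }
}

// Runs a failing prepare() followed by cleanup() and returns how many
// cgroups are left behind on the host.
std::size_t leakedAfterFailure(bool trackFirst) {
  Host host;
  Info info;
  bool ok = prepare(host, info, {"cpu", "memory"}, "memory", trackFirst);
  assert(!ok);  // the simulated failure always triggers
  cleanup(host, info);
  return host.cgroups.size();
}
```

In this simulation, `leakedAfterFailure(false)` leaves the `memory` cgroup behind because it was created but never tracked, while `leakedAfterFailure(true)` leaves nothing behind.
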
So I think we may need to call `infos[containerId]->subsystems.insert();` right after the new `Info` structure is created, like below:
```
infos[containerId] = Owned<Info>(new Info(
    containerId,
    path::join(flags.cgroups_root, containerId.value())));

foreachvalue (const Owned<Subsystem>& subsystem, subsystems) {
  infos[containerId]->subsystems.insert(subsystem->name());
}
```


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51631/#review147806
-----------------------------------------------------------


On Sept. 12, 2016, 10:49 a.m., haosdent huang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51631/
> -----------------------------------------------------------
>
> (Updated Sept. 12, 2016, 10:49 a.m.)
>
>
> Review request for mesos, Gilbert Song, Jie Yu, and Qian Zhang.
>
>
> Bugs: MESOS-6063
>     https://issues.apache.org/jira/browse/MESOS-6063
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Recovering newly added cgroups subsystems on existing containers would
> fail, and continuing to perform `update` and other operations of the
> newly added subsystems on them does not make sense. This patch adds
> tracking for the recovered or prepared cgroups subsystems of a
> container and skips unnecessary subsystem operations on the container
> if the subsystem was never recovered or prepared.
>
>
> Diffs
> -----
>
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 38d1428f5425566502747d2a8394e246e0b3fd9e
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 8b6dfde366caf82d30afb891c8f1337ceed12157
>
> Diff: https://reviews.apache.org/r/51631/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> haosdent huang
>
>
