> On June 2, 2014, 11:57 a.m., Ben Mahler wrote:
> > Is having the slave at the same level as containers the long term strategy here?
> >
> > Doesn't the EC need the same fix?

Yes, the long term fix is to have all containers grouped under "containers" which is at the same hierarchy level as the "slave" cgroup.

No, at least not within the Mesos side of the EC. During recovery the EC queries externally for the list of containers and therefore potential orphans, rather than examining all the cgroups as the MC isolators do.


- Ian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22156/#review44528
-----------------------------------------------------------


On June 2, 2014, 11:50 a.m., Ian Downes wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22156/
> -----------------------------------------------------------
>
> (Updated June 2, 2014, 11:50 a.m.)
>
>
> Review request for mesos, Ben Mahler and Vinod Kone.
>
>
> Bugs: MESOS-1449
>     https://issues.apache.org/jira/browse/MESOS-1449
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> Do not consider the slave cgroup (from --slave_subsystems) as an orphan during recover().
>
>
> Diffs
> -----
>
>   src/slave/containerizer/isolators/cgroups/cpushare.cpp b494a9236210245383e20fa9ab3dbac01e42f8dd
>   src/slave/containerizer/isolators/cgroups/mem.cpp 6324dcd288975872c26685c713910d778def4e10
>
> Diff: https://reviews.apache.org/r/22156/diff/
>
>
> Testing
> -------
>
> Manually verified that log message "Removing orphaned cgroup ..." did not appear in any slave recovery tests when --slave_subsystems was enabled.
>
>
> Thanks,
>
> Ian Downes
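
For readers following the thread, the idea in the quoted description (skip the slave cgroup when looking for orphans during recover()) can be sketched roughly as below. This is a minimal, standalone illustration and not the actual patch: the function name findOrphans, its parameters, and the example cgroup names are all hypothetical, and the real isolators work with Mesos's own cgroups helpers rather than plain string lists.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not Mesos code): given every cgroup found under the
// isolator's root, return the ones that should be treated as orphans.
std::vector<std::string> findOrphans(
    const std::vector<std::string>& cgroups,   // all cgroups under the root
    const std::set<std::string>& recovered,    // cgroups of recovered containers
    const std::string& slaveCgroup)            // the slave's own cgroup (assumed name)
{
  std::vector<std::string> orphans;

  for (const std::string& cgroup : cgroups) {
    // The slave's own cgroup is never an orphan, so skip it.
    if (cgroup == slaveCgroup) {
      continue;
    }

    // Cgroups that belong to a recovered container are still in use.
    if (recovered.count(cgroup) > 0) {
      continue;
    }

    // Anything else has no known owner and is a candidate for cleanup.
    orphans.push_back(cgroup);
  }

  return orphans;
}
```

With the assumed inputs {"mesos/slave", "mesos/c1", "mesos/c2"}, a recovered set of {"mesos/c1"}, and "mesos/slave" as the slave cgroup, only "mesos/c2" would be reported as an orphan.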
