> On Feb. 10, 2017, 12:24 a.m., Joseph Wu wrote: > > src/slave/containerizer/mesos/windows_launcher.cpp, lines 59-60 > > <https://reviews.apache.org/r/56362/diff/3/?file=1627948#file1627948line59> > > > > I recall some talk about dis-allowing certain `std::` helpers on the > > `hashmap` and `hashset` classes. In which case, we'd need to rethink > > https://reviews.apache.org/r/56363 > > > > Anyway, for this method, I'd recommend constructing a `hashset` of PIDs > > and checking/updating this set while recovering. Because O(N). > > Andrew Schwartzmeyer wrote: > Is this really a CPU bottleneck on the agent here? The `PosixLauncher` > does the [same O(n) > operation](https://github.com/andschwa/mesos/blob/master/src/slave/containerizer/mesos/launcher.cpp#L62) > (I point it out because this code is pretty much just a derivation of the > former).
(I don't mind a better operation; where `n` is `containers.size()` and `p` is number of PIDs to check, one time `O(n)` cache of container PIDs to a hashmap upfront, followed by `O(p)` lookups afterward for `O(p)` instead of `O(p*n)`. But then we should do this for the `PosixLauncher` too.) - Andrew ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/56362/#review165027 ----------------------------------------------------------- On Feb. 9, 2017, 6:50 p.m., Andrew Schwartzmeyer wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/56362/ > ----------------------------------------------------------- > > (Updated Feb. 9, 2017, 6:50 p.m.) > > > Review request for mesos, Alex Clemmer and Joseph Wu. > > > Bugs: MESOS-6892 > https://issues.apache.org/jira/browse/MESOS-6892 > > > Repository: mesos > > > Description > ------- > > Instead of deriving `WindowsLauncher` from `PosixLauncher`, > this commit implements a separate `WindowsLauncher` derived > from `Launcher` parallel to the `PosixLauncher`. > > This launcher is then refactored to use Windows' "Job Objects," which > are similar to Linux's cgroups, and enable us to reason about a group > of processes associated with a task/container as a "Job Object" > instead of a root PID and the tree containing its children. The latter > is not a reasonable approach on Windows, and has been the source of > subtle bugs. > > The Job Object approach creates a named job with a one-to-one mapping > to the containerizer, and assigns the first process started for the > task to the job object. After being assigned, the Windows kernel > ensures all spawned child processes > (specifically those made with the `CreateProcess` system call) are > also assigned to the named job object. Thus this job object can then > be used to query the process group's resource usage, kill the process > group, and set limits on the process group. > > So instead of seeing a process group as a tree and referring to it by > the root process's PID, the `WindowsLauncher` sees a process group as > a named Job Object, the same way the Windows kernel sees it. However, > the containerizer code which interacts with the launcher still refers > to a task group by the singular PID, and so we have a sort of shim > which maps the initial PID to the name of the job object. This is a an > unfortunate consequence of the shared containerizer code being > originally written for POSIX-like systems. > > This abstraction sets us up for implementing resource usage limits on > the process group as a whole. > > > Diffs > ----- > > src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee > src/slave/containerizer/mesos/containerizer.cpp > d2b4f75a55dbe4746bc2dfc180335fa831a554ef > src/slave/containerizer/mesos/launcher.hpp > 79f6eea0ee8e564e90b36208672df150dbc5d540 > src/slave/containerizer/mesos/launcher.cpp > 5114c130efbfb252dde1e85c081f5174e66f57af > src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION > src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION > > Diff: https://reviews.apache.org/r/56362/diff/ > > > Testing > ------- > > > Thanks, > > Andrew Schwartzmeyer > >
