> On Feb. 25, 2017, 1:24 a.m., Joseph Wu wrote:
> > This is definitely the direction we want to head towards (Job objects, 
> > without leaking handles).
> > 
> > But, due to the partial duplication of logic in: 
> > https://reviews.apache.org/r/56366/
> > I think we should move much of this logic into libprocess.
> > 
> > Here's the gist of what should be done:
> > 
> > * We'll want a new, Windows-only actor that is created in 
> > `process::initialize`.  This will look very similar to the 
> > `GarbageCollector` process in `include/process/gc.hpp`.  Something like:
> > ```
> > class JobObjectManager : public Process<JobObjectManager>
> > {
> > public:
> >   JobObjectManager() : ProcessBase("__job_object_manager__") {}
> >   virtual ~JobObjectManager() {}
> > 
> >   // NOTE: Maybe you'll want the actor to create the JobObject itself.
> >   // You could even move the CREATE_JOB parent hook into this file.
> >   void manage(const pid_t pid, const std::string& name, const SharedHandle& 
> > handle)
> >   {
> >     objects[pid] = {name, handle};
> >     
> >     process::reap(pid)
> >       .onAny(lambda::bind(&Self::cleanup, lambda::_1, pid));
> >   }
> > 
> > protected:
> >   virtual void cleanup(Future<Option<int>> exit_code, const pid_t pid)
> >   {
> >     CHECK(!exit_code.isPending());
> >     CHECK(!exit_code.isDiscarded());
> > 
> >     Try<Nothing> killJobResult = os::kill_job(objects[pid].handle.get());
> >     CHECK(!killJobResult.isError())
> >       << "Failed to kill job object: " + killJobResult.error();
> > 
> >     objects.erase(pid);
> >   }
> > 
> > private:
> >   struct JobData {
> >     std::string name;
> >     SharedHandle handle;
> >   }
> >   std::map<pid, JobData> objects;
> > };
> > ```
> > * You can either expose the actor's methods via `process.hpp` or via moving 
> > the ParentHook into the new actor.
> > * The WindowsLauncher would become a thin wrapper around the new actor.
> > * https://reviews.apache.org/r/56366/ should become much simpler.

Hm, okay, I like your idea. Let's chat and go over the design you're proposing.


> On Feb. 25, 2017, 1:24 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/windows_launcher.cpp, lines 50-62
> > <https://reviews.apache.org/r/56362/diff/4/?file=1630626#file1630626line50>
> >
> >     Erm... saving the set of PIDs in a prior loop guarantees that the 
> > `pids.contains(pid)` condition will be met.

containers != containerStates

The containers is what's already tracked in the launcher; the states is what's 
being recovered from disk.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56362/#review166640
-----------------------------------------------------------


On Feb. 11, 2017, 2:04 a.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56362/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2017, 2:04 a.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-6892
>     https://issues.apache.org/jira/browse/MESOS-6892
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Instead of deriving `WindowsLauncher` from `PosixLauncher`,
> this commit implements a separate `WindowsLauncher` derived
> from `Launcher` parallel to the `PosixLauncher`.
> 
> This launcher is then refactored to use Windows' "Job Objects," which
> are similar to Linux's cgroups, and enable us to reason about a group
> of processes associated with a task/container as a "Job Object"
> instead of a root PID and the tree containing its children. The latter
> is not a reasonable approach on Windows, and has been the source of
> subtle bugs.
> 
> The Job Object approach creates a named job with a one-to-one mapping
> to the containerizer, and assigns the first process started for the
> task to the job object. After being assigned, the Windows kernel
> ensures all spawned child processes
> (specifically those made with the `CreateProcess` system call) are
> also assigned to the named job object. Thus this job object can then
> be used to query the process group's resource usage, kill the process
> group, and set limits on the process group.
> 
> So instead of seeing a process group as a tree and referring to it by
> the root process's PID, the `WindowsLauncher` sees a process group as
> a named Job Object, the same way the Windows kernel sees it. However,
> the containerizer code which interacts with the launcher still refers
> to a task group by the singular PID, and so we have a sort of shim
> which maps the initial PID to the name of the job object. This is a an
> unfortunate consequence of the shared containerizer code being
> originally written for POSIX-like systems.
> 
> This abstraction sets us up for implementing resource usage limits on
> the process group as a whole.
> 
> 
> Diffs
> -----
> 
>   src/CMakeLists.txt 3a4ace9c8011ac8eec5067cd085fa7fe4166b9ee 
>   src/slave/containerizer/mesos/containerizer.cpp 
> d2b4f75a55dbe4746bc2dfc180335fa831a554ef 
>   src/slave/containerizer/mesos/launcher.hpp 
> 79f6eea0ee8e564e90b36208672df150dbc5d540 
>   src/slave/containerizer/mesos/launcher.cpp 
> 5114c130efbfb252dde1e85c081f5174e66f57af 
>   src/slave/containerizer/mesos/windows_launcher.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/windows_launcher.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/56362/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>

Reply via email to