Re: Review Request 17567: Added External Containerizer.

Till Toenshoff Wed, 23 Apr 2014 19:42:12 -0700


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/examples/python/test_containerizer.py, line 50
> > <https://reviews.apache.org/r/17567/diff/20/?file=562057#file562057line50>
> >
> >     s/use/usage/?


That function name is already used. 


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.hpp, lines 48-50
> > <https://reviews.apache.org/r/17567/diff/20/?file=562060#file562060line48>
> >
> >     Just FYI, if we ever decide to modify the interface of 
> > Containerizer::usage,wait,destroy then we'll effectively break external 
> > containerizer programs. If we actually had a 
> > containerizer::Usage,Wait,Destroy protobuf then we could likely add 
> > optional fields to them as a way of extending the interface in the future.

That is a very good point I did not see earlier - adding these to 
containerizer.proto now.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.hpp, lines 219-223
> > <https://reviews.apache.org/r/17567/diff/20/?file=562060#file562060line219>
> >
> >     Just like 'isDone', let's pull this into the .cpp file directly (i.e., 
> > not a member of ExternalContainerizerProcess). The big benefit here is that 
> > it's helpful to a code reader to know that this function only relies on 
> > it's parameters. Making this a static function helps a bit with this, but 
> > then we still have to think about the static members and we "pollute" the 
> > header file with this non-interface facing function. Make sense?

Yes, indeed a much better design.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, lines 229-230
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line229>
> >
> >     My guess is that an external containerizer program will want more than 
> > just the path to mesos-executor but also the path to mesos-fetcher in order 
> > to fetch packages? Perhaps this should be 'mesos_libexec_directory' instead 
> > (which just equals flags.launcher_dir)? Or something else to describe where 
> > mesos-* programs are installed?

Yes, good point.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, line 679
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line679>
> >
> >     In case you didn't know about it, there's also VLOG_IF:
> >     
> >     VLOG_IF(sandbox.user.isSome(), 2) << "user: " << sandbox.user.get();

Neat, ty


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/examples/python/test_containerizer.py, line 80
> > <https://reviews.apache.org/r/17567/diff/20/?file=562057#file562057line80>
> >
> >     Why not use sys.stdout.write like you use sys.stdin.read above?

Historic reasons that do no apply in this version anymore. Fixing....


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/examples/python/test_containerizer.py, line 218
> > <https://reviews.apache.org/r/17567/diff/20/?file=562057#file562057line218>
> >
> >     Do you ever actually destroy the process? Will tests leave around 
> > orphans?

Good point, fixing that.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.hpp, line 179
> > <https://reviews.apache.org/r/17567/diff/20/?file=562060#file562060line179>
> >
> >     Did you mean s/processes/containers/ ?

Yes :)


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.hpp, lines 213-215
> > <https://reviews.apache.org/r/17567/diff/20/?file=562060#file562060line213>
> >
> >     This doesn't rely on anything from ExternalContainerizerProcess, so 
> > let's pull it out and put it in the .cpp as a static function.

Aye


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.hpp, line 269
> > <https://reviews.apache.org/r/17567/diff/20/?file=562060#file562060line269>
> >
> >     It would be nice to move 'containerExecutorInfo' into 
> > ExternalContainerizer as a static function. Then uses of it (for example in 
> > tests) read as:
> >     ExternalContainerizer::containerExecutorInfo.
> >     
> >     That's more explicit because you see that you're getting an 
> > ExecutorInfo as deemed by the ExternalContainerizer.

Aye


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, lines 51-52
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line51>
> >
> >     This doesn't look like it's used?

Orphaned, ty!


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, lines 184-185
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line184>
> >
> >     FYI, our style is to keep '+' on previous line and indent as though it 
> > was a continued argument. Can you please do a quick pass and clean this up 
> > everywhere? Thanks Till!

return Failure("Cannot start already running container '" +
               containerId.value() + "'");

It is, as a continued argument gets indented matching the argument start, 
correct?


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, lines 196-198
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line196>
> >
> >     Would this be generally useful to move to 'executorEnvironment'? How 
> > about a TODO to do this in the future?

Yes.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, lines 205-206
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line205>
> >
> >     My two cents, the default container image would be better shared as an 
> > environment variable, i.e., MESOS_DEFAULT_CONTAINER_IMAGE. My thinking here 
> > is that the external containerizer program might actually like to know 
> > whether or not a default image was included in the task and if we set it 
> > then it will have no way of knowing. Is deimos actually assuming this will 
> > happen right now? We could always change this in the future but it'll be 
> > breaking if deimos assumes this now. Thoughts?

An environment variable seems to be fine for this purpose.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, lines 249-250
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line249>
> >
> >     So what happens if the slave discards the returned future? I guess in 
> > these semantics the subprocess might stay running forever (if there was a 
> > bug). How about a TODO above each of these calls to 'then' that suggests 
> > adding an 'onDiscard' callback? Oh how I wish we had C++11 lambdas! :-/

Fixing this specific case by inserting an onAny which checks if the future is 
ready. If that future is not ready (discarded, failed), then this callback will 
terminate the given containerId. And yes, proper lambdas would be such a code 
savior.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, line 276
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line276>
> >
> >     The 'isDone' function is a little misleading to me as the future is 
> > definitely done since you did a 'then' above. In fact, I was a little 
> > surprised that '_launch' wasn't taking a Future<Option<int>> instead of an 
> > Option<int>.
> >     
> >     Given the comment at the declaration of 'isDone' (in the header) I 
> > wonder if a better name here might be 'validate'? We used 'verify' in 
> > src/linux/cgroups.cpp code and an Option<Error> instead of a Try<Nothing>. 
> > That pattern was pretty clear to read:
> >     
> >     Option<Error> error = validate(status);
> >     if (error.isSome()) {
> >       return Failure(error.get());
> >     }
> >     
> >     Taking a step back a bit, maybe you wanted '_launch' to be invoked for 
> > an 'onAny' not a 'then'? That still doesn't change too much in my mind, you 
> > still know that the future is "done", you just don't know if the future is 
> > a failure or not, hence the suggestion for 'validate' or 'verify'.

Yes, much better.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, line 287
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line287>
> >
> >     I think the biggest question I have architecturally here is why the 
> > external containerizer program isn't returning the ExecutorInfo (and thus, 
> > also determining what the ExecutorInfo should even be). In this case it's 
> > especially weird because the ExecutorInfo that gets created is never 
> > actually passed to the external containerizer program! It looks like maybe 
> > this was done because we needed an ExecutorInfo in order to call 
> > 'executorEnvironment', but we could have easily refactored that. Maybe for 
> > now we should add a TODO here ...

Adapted towards latest related commits.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, line 300
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line300>
> >
> >     What if a launch fails!? IIUC a failure would mean that 'launched' 
> > stays in pending forever because we do a 'then' in 'launch' which means 
> > '_launch' will not get invoked on a failure ... thus 'launched' will stay 
> > pending forever.

Should be fixed now.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, line 580
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line580>
> >
> >     So what I understand here is that terminate is really just about 
> > killing the subprocess that was invoked in order to wait on a container, 
> > but passing 'destroy' to the external containerizer program is really what 
> > does the termination/cleanup of the container. The comments here make it 
> > sound like you're actually terminating the container when you're just 
> > killing the process that is waiting from the 'wait' subcommand. Just want 
> > to make sure that is what you expect as well, and if that's the case then 
> > how about we clean up some of the comments here (and where 'terminate' gets 
> > called as well)? In fact, what about calling this 'unwait' instead of 
> > 'terminate' since that's being explicit about what it's really doing?
> >     
> >     Note that an external containerizer program might try and do something 
> > fancy when/if it gets a signal here (at least if it wasn't a SIGKILL and it 
> > could react to this signal), but I don't think we should tell people to 
> > rely on this, but instead to rely on getting a 'destroy' in order to 
> > actually terminate/cleanup a container. Make sense? Or did you have 
> > something else in mind?

Correct, it is basically an unwait and that is what I shall rename it to.


> On April 19, 2014, 9:33 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/external_containerizer.cpp, line 607
> > <https://reviews.apache.org/r/17567/diff/20/?file=562061#file562061line607>
> >
> >     Just so I understand correctly, the reason you're not calling 'cleanup' 
> > here is because you assume that __wait will get invoked and you'll call 
> > cleanup then? That might be a nice comment to leave behind for somebody! ;)

Aye.


- Till


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17567/#review40828
-----------------------------------------------------------


On April 24, 2014, 2:40 a.m., Till Toenshoff wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17567/
> -----------------------------------------------------------
> 
> (Updated April 24, 2014, 2:40 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ian Downes, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-816
>     https://issues.apache.org/jira/browse/MESOS-816
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This patch adds the so-called external containerizer. This
> containerizer delegates all containerizer calls directly to 
> an external containerizer program (which can be specified on 
> start-up).
> 
> The protocol for the interactions with the external program
> is as follows:
> 
> COMMAND < INPUT-PROTO > RESULT-PROTO
> 
> launch < Launch
> update < Update
> usage < ContainerID > ResourceStatistics
> wait < ContainerID > Termination
> destroy < ContainerID
> 
> When protocol buffers need to be provided, the Mesos side of
> the external containerizer implementation will serialize the 
> protos on stdin and vice-versa for reading protos on stdout as 
> drafted in the above scheme.
> 
> 
> Diffs
> -----
> 
>   configure.ac c964452 
>   include/mesos/containerizer/containerizer.proto 6ecd82e 
>   src/Makefile.am 364d63b 
>   src/examples/python/test-containerizer.in PRE-CREATION 
>   src/examples/python/test_containerizer.py PRE-CREATION 
>   src/slave/containerizer/containerizer.hpp 9a50fba 
>   src/slave/containerizer/containerizer.cpp 344872a 
>   src/slave/containerizer/external_containerizer.hpp PRE-CREATION 
>   src/slave/containerizer/external_containerizer.cpp PRE-CREATION 
>   src/slave/containerizer/mesos_containerizer.cpp 722f3fe 
>   src/slave/flags.hpp d5c54c0 
>   src/tests/external_containerizer_test.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/17567/diff/
> 
> 
> Testing
> -------
> 
> make check and functional testing.
> 
> 
> Thanks,
> 
> Till Toenshoff
> 
>

Re: Review Request 17567: Added External Containerizer.

Reply via email to