> On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote: > > src/slave/containerizer/mesos/containerizer.cpp, lines 711-713 > > <https://reviews.apache.org/r/52446/diff/1/?file=1517218#file1517218line711> > > > > Can you introduce a helper in paths.hpp|cpp: > > > > ``` > > paths::getContainerTerminationPath(...); > > ```
Sure. None of the other files have this currently though (i.e. pid, status). Maybe we should commit this patch without this helper and then do a separate patch that adds this helper for all checkpointed files. I will hold off on this one until you confirm. > On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote: > > src/slave/containerizer/mesos/containerizer.cpp, lines 1699-1705 > > <https://reviews.apache.org/r/52446/diff/1/?file=1517218#file1517218line1699> > > > > Let's do that only for nested container. It should still work as is (since top-level containers will simply never have this file), but it's probably better to be explicit. I'll change it. > On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote: > > src/slave/containerizer/mesos/containerizer.cpp, lines 2197-2203 > > <https://reviews.apache.org/r/52446/diff/1/?file=1517218#file1517218line2197> > > > > NO need for this? I think checkpoint supports a Protobuf Message > > directly. Cool. I didn't realize this. > On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote: > > src/slave/containerizer/mesos/containerizer.cpp, line 2211 > > <https://reviews.apache.org/r/52446/diff/1/?file=1517218#file1517218line2211> > > > > Totally orthogonol issue: I am wondering if we should prevent people > > from creating nested container under a legacy container? It would definitely be a bug if they did because no runtime directory will exist for legacy containers. It would be a simple check in launch to verify this. > On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote: > > src/tests/containerizer/nested_container_tests.cpp, line 32 > > <https://reviews.apache.org/r/52446/diff/1/?file=1517221#file1517221line32> > > > > Please avoid this. Use 'using' explicitly Yeah, this must have been copied in. I will be more careful next time. > On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote: > > src/tests/containerizer/nested_container_tests.cpp, line 59 > > <https://reviews.apache.org/r/52446/diff/1/?file=1517221#file1517221line59> > > > > s/false/true/ Are we making this tru for all tests going forward? I see a mix of false/true in existing tests. > On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote: > > src/tests/containerizer/nested_container_tests.cpp, lines 95-97 > > <https://reviews.apache.org/r/52446/diff/1/?file=1517221#file1517221line95> > > > > No need for this. Copy and paste leftovers... > On Sept. 30, 2016, 9:52 p.m., Jie Yu wrote: > > src/tests/containerizer/nested_container_tests.cpp, line 116 > > <https://reviews.apache.org/r/52446/diff/1/?file=1517221#file1517221line116> > > > > This looks redundant? The fact the first 'wait' returns means that > > container has been destroyed. Yeah, it is. I only wrote it to be explicit (it actually always ends up logging that the container is already destroyed). Maybe it's more confusing this way though. - Kevin ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/52446/#review151092 ----------------------------------------------------------- On Sept. 30, 2016, 11:08 p.m., Kevin Klues wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/52446/ > ----------------------------------------------------------- > > (Updated Sept. 30, 2016, 11:08 p.m.) > > > Review request for mesos, Gilbert Song, Jie Yu, and Vinod Kone. > > > Bugs: MESOS-6287 > https://issues.apache.org/jira/browse/MESOS-6287 > > > Repository: mesos > > > Description > ------- > > Previously, when a nested container was being destroyed, it's runtime > directory was being deleted (just the same as a top-level container). > However, this meant that calling 'wait()' on a previously terminated > nested container would return 'None()' since its status had already > been reaped. The problem with this, however, is that this will cause > an entire pod to be terminated since it thinks that the container it > is calling wait on cannot be found. > > To fix this, we leave the runtime directory of nested containers > around until their top-level containers are destroyed. Additionally, > we checkpiont the entire termination state of the nested container > into its runtime directory, so that subsequent calls to 'wait()' can > retrieve the full termination state for the lifetime of the top-level > container. > > > Diffs > ----- > > src/Makefile.am f093000e0282a8d5ac17e7ba33711690ccdfe68a > src/slave/containerizer/mesos/containerizer.cpp > 522d2c37229b07b66a0824c3e246c32f8d803b10 > src/slave/containerizer/mesos/paths.hpp > 1051c219c55253d03199045b6d2f43377ae93e53 > src/slave/containerizer/mesos/paths.cpp > 6c6b4dcc39fbc00485552caab88457918e622e08 > src/tests/containerizer/nested_container_tests.cpp PRE-CREATION > > Diff: https://reviews.apache.org/r/52446/diff/ > > > Testing > ------- > > GTEST_FILTER="" make -j check > sudo src/mesos-tests > > > Thanks, > > Kevin Klues > >
