> On Sept. 8, 2016, 9:21 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/provisioner/provisioner.cpp, lines 194-197
> > <https://reviews.apache.org/r/51402/diff/3/?file=1493283#file1493283line194>
> >
> > I realized that it's not sufficient to just pass in top level orphan
> > containers to provisioners/isolators. We also want to know about known
> > child containers for both checkpointed containers and orphan containers so
> > that provisioners/isolators can clean up unknown child containers.
> >
> > Consider the following case:
> > 1) containerizer launched a child container A/B under top level
> >    container A
> > 2) isolator prepare finishes for container A/B
> > 3) agent crashes before launcher fork is called
> > 4) agent recovers
> > 5) container A is checkpointed, thus considered alive
> > 6) however, provisioners/isolators need to clean up for container A/B as
> >    it's unknown to the launcher
> >
> > Therefore, I suggest we introduce a protobuf 'ContainerRecoverInfo' in
> > `include/mesos/slave/containerizer.proto`:
> >
> > ```
> > message ContainerRecoverInfo {
> >   repeated ContainerState checkpointed_containers;
> >
> >   // Deprecated. Top level orphans.
> >   repeated ContainerID orphan_container_ids;
> >
> >   // All known containers, including child containers.
> >   repeated ContainerID known_container_ids;
> > }
> > ```
> >
> > And both Provisioner and Isolator recover interface will take this
> > protobuf.

Thanks for the detailed explanation, and sorry I did not see this comment
yesterday. In my local implementation, `ContainerRecoverInfo` is used only for
isolator::recover(), since we can simply change the provisioner::recover
interface to take only the `knownContainers` set without breaking other parts.
This reduces the complexity in provisioner::recover.


- Gilbert


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51402/#review148300
-----------------------------------------------------------


On Sept. 7, 2016, 11:49 a.m., Gilbert Song wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51402/
> -----------------------------------------------------------
>
> (Updated Sept. 7, 2016, 11:49 a.m.)
>
>
> Review request for mesos, Benjamin Hindman, Artem Harutyunyan, Jie Yu,
> Joseph Wu, and Kevin Klues.
>
>
> Bugs: MESOS-6067
>     https://issues.apache.org/jira/browse/MESOS-6067
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Added nested container check in provisioner destroy.
>
>
> Diffs
> -----
>
>   src/slave/containerizer/mesos/provisioner/provisioner.cpp 8e35ff49ec99a242e764095dcfbb541c5e41ec71
>
> Diff: https://reviews.apache.org/r/51402/diff/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Gilbert Song
>
>
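
For illustration only, the recovery case discussed in this thread can be
sketched as a set difference: anything the provisioner has state for that is
not in the set of known containers (child containers included) is a candidate
for cleanup. This is a minimal sketch, not the Mesos implementation. It
assumes containers can be modelled as path-like strings ("A", "A/B"), and the
names `ContainerPath` and `findUnknownContainers` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Hypothetical simplification: a container is identified by a path-like
// string ("A" is top level, "A/B" is a child of "A"). The real ContainerID
// is a protobuf message, not a string.
using ContainerPath = std::string;

// Returns the containers that have provisioner state but are absent from
// the set of known containers, i.e. the ones recovery should clean up.
std::set<ContainerPath> findUnknownContainers(
    const std::set<ContainerPath>& provisioned,
    const std::set<ContainerPath>& known)
{
  std::set<ContainerPath> unknown;

  // Both inputs are std::set, so they are already sorted with the same
  // ordering, which is what std::set_difference requires.
  std::set_difference(
      provisioned.begin(), provisioned.end(),
      known.begin(), known.end(),
      std::inserter(unknown, unknown.begin()));

  return unknown;
}

// Crash scenario from the thread, plus an assumed live child A/C:
// A/B was prepared but never forked, so it is not in the known set.
void checkCrashScenario()
{
  const std::set<ContainerPath> provisioned = {"A", "A/B", "A/C"};
  const std::set<ContainerPath> known = {"A", "A/C"};

  const std::set<ContainerPath> unknown =
    findUnknownContainers(provisioned, known);

  assert(unknown.size() == 1);
  assert(unknown.count("A/B") == 1);
}
```

Passing only top level containers would not be enough here: without the child
entries in the known set, a live child such as A/C could not be told apart
from the leftover A/B.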
