> On Oct. 3, 2017, 1:19 a.m., Gilbert Song wrote:
> > src/slave/containerizer/mesos/provisioner/docker/metadata_manager.cpp
> > Lines 209-211 (patched)
> > <https://reviews.apache.org/r/56721/diff/7/?file=1791108#file1791108line209>
> >
> >     three scenarios:
> >     1. checkpointed cache is cleaned up for some reason
> >     2. the `activeImages` contains `excludedImages`, where the operator
> >     specifies some images that were never pulled before on this agent.
> >
> >     and I just thought of the 3rd case:
> >     3. if a couple of big images are still being pulled (pulling started
> >     before `prune()`), it means they are not in `storedImages` yet.
> >     4. same as #3, we just unlock the store and resume the image pulling,
> >     then the requests in queue will be executed simultaneously, but another
> >     `prune()` call comes in from the operator for some reason.
> >
> >     for the case of #3 and #4, we may need to fix:
> >
> >     https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/provisioner/docker/registry_puller.cpp#L384~#L388
> >
> >     because container launch may fail if any of the layers are included in
> >     those ongoing pulls but still get marked from pre-existing unused images.

Case 1 means an operator (or something unknown) manually tampered with the checkpointed cache, so there is nothing we can really do; Case 2 is an acceptable case IMO; Case 3 and Case 4 could be real problems.

One straightforward fix I can imagine is to require mutual exclusion between image pulling and pruning. For example, if the `store` class receives a pruning request, we simply lock the store, but wait for ongoing image pulls to finish before pruning (and do not admit any new image pulls until the pruning finishes its marking phase).

I understand this has some throughput implications, but it seems the easiest to implement. What do you think?


- Zhitao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56721/#review186320
-----------------------------------------------------------


On Oct. 3, 2017, 5:13 p.m., Zhitao Li wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56721/
> -----------------------------------------------------------
>
> (Updated Oct. 3, 2017, 5:13 p.m.)
>
>
> Review request for mesos, Gilbert Song, Jason Lai, and Jie Yu.
>
>
> Bugs: MESOS-4945
>     https://issues.apache.org/jira/browse/MESOS-4945
>
>
> Repository: mesos
>
>
> Description
> -------
>
> This includes the following changes:
> - add a `pruneImages()` function on the chain of relevant classes;
> - implement prune in docker store;
> - fix mock interface to keep existing tests passing.
>
>
> Diffs
> -----
>
>   src/slave/containerizer/composing.hpp 06d68eef5de7745e32f0e808f11016bcc285dd8f
>   src/slave/containerizer/composing.cpp 587f009384f0c7ef87482686578dc822d3d5b208
>   src/slave/containerizer/containerizer.hpp 449bb5d0902936faae7bf9bae9c703b219aed842
>   src/slave/containerizer/docker.hpp b602a5698cae12686f51c4b9370a06042cda6270
>   src/slave/containerizer/docker.cpp 292eecbca246edf068ec8c262aff4f3ce9cd8c67
>   src/slave/containerizer/mesos/containerizer.hpp cc23b4d91be16fc95a131c09d07378b801e34d6f
>   src/slave/containerizer/mesos/containerizer.cpp 4d5dc13f363f5d8886983d7dd06a5cecc177c345
>   src/slave/containerizer/mesos/provisioner/docker/metadata_manager.hpp 954da1681778878c8aff6150002e52ecb648d1bb
>   src/slave/containerizer/mesos/provisioner/docker/metadata_manager.cpp d86afd2a6ff0bf87e624db1c99255c85068bf6ab
>   src/slave/containerizer/mesos/provisioner/docker/paths.hpp 232c027f8f96da0cb30b957bce4607d3695050d2
>   src/slave/containerizer/mesos/provisioner/docker/paths.cpp cd684b33eb308ce1eeb4539a5b2d51985d835db7
>   src/slave/containerizer/mesos/provisioner/docker/store.hpp 1cf68665d33bd40a7605d26c96fb7b618407fdd0
>   src/slave/containerizer/mesos/provisioner/docker/store.cpp f357710cb19aec3654b0604f7909d068eaf20095
>   src/slave/containerizer/mesos/provisioner/provisioner.hpp 7cba54ce490d1e6e17081cd7e04fd6759ceddb8e
>   src/slave/containerizer/mesos/provisioner/provisioner.cpp 450a3b32d69d2882973a6ed4e94e169a0256056b
>   src/slave/containerizer/mesos/provisioner/store.hpp 01ab83dca79e51b8c96d18ee65705912b0ac8324
>   src/slave/containerizer/mesos/provisioner/store.cpp cc5cc81e05f29bb0e11ffa13cdb8d63d4397114f
>   src/tests/containerizer.hpp a778b8581904bacea9eec3ff50c3c009959b5dac
>   src/tests/containerizer.cpp cd140f4263621a0a33a34b7e062a9ca6cf426e7a
>   src/tests/containerizer/mock_containerizer.hpp 0adcb01e6c12d6cc4abed1f14fa2df833ffc6569
>
>
> Diff: https://reviews.apache.org/r/56721/diff/9/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zhitao Li
>
>
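
To make the mutual exclusion proposed in the reply concrete, here is a minimal, standalone sketch of the bookkeeping it implies. It is not taken from the patch: `PullPruneGate` and all of its method names are hypothetical, and it uses plain counters rather than libprocess futures, on the assumption that the store would call it from a single actor context and queue any pull that is refused.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of Mesos: tracks in-flight image pulls and
// a pending prune so that the marking phase never overlaps with a pull.
class PullPruneGate
{
public:
  // Admit a new pull only if no prune has been requested. When this
  // returns false the caller is expected to queue the pull and retry
  // after `endPrune()`.
  bool tryBeginPull()
  {
    if (pruneRequested) {
      return false;
    }

    ++activePulls;
    return true;
  }

  // Record that one previously admitted pull has finished.
  void endPull()
  {
    assert(activePulls > 0);
    --activePulls;
  }

  // Record a prune request. From this point on new pulls are refused,
  // but pulls that were already admitted are allowed to finish.
  void requestPrune() { pruneRequested = true; }

  // The marking phase may start only once a prune has been requested and
  // every pull admitted before it has finished.
  bool canMark() const { return pruneRequested && activePulls == 0; }

  // The marking phase is done; pulls are admitted again.
  void endPrune() { pruneRequested = false; }

private:
  std::size_t activePulls = 0;
  bool pruneRequested = false;
};
```

With this shape, a pull that started before `prune()` (case 3) delays marking until it has stored its layers, and a pull arriving while a prune is pending (case 4) is held back until marking completes. The throughput cost mentioned in the reply shows up as the time new pulls spend refused between `requestPrune()` and `endPrune()`.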
