> On Dec. 25, 2014, 12:37 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/composing.hpp, line 68
> > <https://reviews.apache.org/r/28830/diff/1/?file=786165#file786165line68>
> >
> >     This is the biggest design issue I have with this review. We'll only 
> > ever have a single Fetcher in the slave, even after we inject a Fetcher. 
> > There are two things that are not ideal here: (1) passing a pointer between 
> > two actors and (2) reasoning about the lifetime of the Fetcher* that gets 
> > passed between actors. In the current implementation there is a race 
> > between shutting down and deleting the Slave actor and a containerizer that 
> > might be using the Fetcher* that we passed, and I'd like us to eliminate 
> > that dependency.
> >     
> >     So, let's pass the Fetcher* into a containerizer at construction time, 
> > through all of the *Containerizer::create functions. I'm fine with 
> > injecting it as a pointer like we do in the Slave and other places since 
> > reasoning about lifetime using that pattern is very standard (especially in 
> > our codebase).

Yes. Switching to passing the fetcher to containerizer constructors now. 

Keeping in mind that the fetcher will later critically depend on the slave ID, 
to keep cache dirs for different slaves on the same host separate. The slave ID 
is known late in the game when Mesos starts up, only after the slave and its 
tree of members have already been instantiated.


> On Dec. 25, 2014, 12:37 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/fetcher.hpp, line 80
> > <https://reviews.apache.org/r/28830/diff/1/?file=786172#file786172line80>
> >
> >     Why are we passing a pointer here? Are you using this as an "out" 
> > parameter? If so, return something other than Nothing instead! I'd rather 
> > not introduce something where we have to track memory across actors, it's 
> > prone to developer errors.

Yes, a good design principle to avoid passing pointers between actors. Won't do 
it again :-)

The problem at hand is now solved by moving most of the fetcher subprocess 
killing logic into FetcherProcess. Instead of the above pointer we are now 
passing a container ID and then the FetcherProcess, not the container, 
remembers what subprocess belongs to that ID. This further separates the 
containerizer from internal fetcher workings. 

Furthermore, by keeping a mapping from container ID to fetcher subprocess in 
the fetcher actor, there is now also an easy solution to implement shutting 
down all fetcher subprocesses when the slave exits. This also solves a 
TODO(tnachen): The FetcherProcess destructor kills all subprocesses in its list 
(hashmap).


> On Dec. 25, 2014, 12:37 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/fetcher.hpp, lines 110-115
> > <https://reviews.apache.org/r/28830/diff/1/?file=786172#file786172line110>
> >
> >     Does this have any dependency on the actual FetcherProcess 
> > implementation? I don't see a Fetcher::environment(...) function which 
> > means this should be a private function that only the FetcherProcess is 
> > using. The model here is "pimpl", where the FetcherProcess is the 
> > implementation, it's just asynchronous. If you want people to be able to 
> > call 'environment' then you'll need to add a call in Fetcher too. Likewise, 
> > if this doesn't rely on any state from FetcherProcess then let's make it 
> > static, and feel free to move it's implementation directly into Fetcher as 
> > that will make it even more obvious the the reader that this is a pure 
> > function.

Making this a static function in Fetcher. (The only reason why this instance 
method was public was for testing. I did not figure this would count as a 
strong enough refactoring reason.)


> On Dec. 25, 2014, 12:37 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/fetcher.hpp, line 123
> > <https://reviews.apache.org/r/28830/diff/1/?file=786172#file786172line123>
> >
> >     First, these should all be made private since they are not exposed via 
> > the public API in Fetcher. Second, why the s/run/runSubprocess/? The 
> > increased verbosity doesn't appear helpful here: it's private to the 
> > implementation and 'Subprocess' is redundant with the signature given that 
> > it's returning a Subprocess.

Partially dropped: One of them needs to stay public for testing purposes, 
unless we want to rewrite these pre-existing tests.

Fixed: Back to naming it "run".


> On Dec. 25, 2014, 12:37 p.m., Benjamin Hindman wrote:
> > src/tests/fetcher_tests.cpp, line 81
> > <https://reviews.apache.org/r/28830/diff/1/?file=786184#file786184line81>
> >
> >     Ah, this is why 'environment' should be in Fetcher! We don't make 
> > synchronous calls to actors from outside the actor itself.

See above.


> On Dec. 25, 2014, 12:37 p.m., Benjamin Hindman wrote:
> > src/slave/slave.hpp, line 417
> > <https://reviews.apache.org/r/28830/diff/1/?file=786176#file786176line417>
> >
> >     Let's inject the fetcher into the slave like we do the others please, 
> > that will make it easier to test in the future.

Yes! The field has now been eliminated. Only some containerizer processes get 
to hold on to the fetcher and it's injected there.

BTW, why isn't any of these a "* const"?


> On Dec. 25, 2014, 12:37 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/fetcher.cpp, line 91
> > <https://reviews.apache.org/r/28830/diff/1/?file=786173#file786173line91>
> >
> >     This looks like it should be the return value! You can let the caller 
> > actually wait on the exit status of the subprocess itself (they have to 
> > wait on receiving Nothing anyway, so this isn't that big of a deal).

Looking ahead, when implementing caching behavior, the subprocess will only be 
part of what this code does.


- Bernd


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28830/#review66105
-----------------------------------------------------------


On Dec. 8, 2014, 4:02 p.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28830/
> -----------------------------------------------------------
> 
> (Updated Dec. 8, 2014, 4:02 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-2172
>     https://issues.apache.org/jira/browse/MESOS-2172
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Layed the groundwork for having a fetcher object with a cache dir per slave. 
> Factored all fetcher-relevant code into a fetcher class and process. Added a 
> fetcher parameter to a lot of methods related to launching tasks, mostly in 
> containerizers. The latter are thus not holders of fetchers, only slaves are.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/composing.hpp 
> f1e60b0dae45757f3573c81a4a4a2de6fcb27aca 
>   src/slave/containerizer/composing.cpp 
> a6ae817a973d49b433519beb8eda99692c203613 
>   src/slave/containerizer/containerizer.hpp 
> 02754cd5b4499d5809c80985100d11d2663dcc44 
>   src/slave/containerizer/docker.hpp 28ebc6272cd68167fc9a0898fd8eb9d53890b815 
>   src/slave/containerizer/docker.cpp 19a6ea2b5342abe4bf10cf4ea2d5d756df9f8665 
>   src/slave/containerizer/external_containerizer.hpp 
> 8363cec490a2918e08a8d30d9b8832fd1e2dd4f5 
>   src/slave/containerizer/external_containerizer.cpp 
> efbc68f205b26cff58bc414427d3361711afcc94 
>   src/slave/containerizer/fetcher.hpp 
> 12b81b1065b6b0a0368ebd44326c75c1d4e4ba79 
>   src/slave/containerizer/fetcher.cpp 
> d702a9ccc15b21ba26fb9356aa69da15dfd5973e 
>   src/slave/containerizer/mesos/containerizer.hpp 
> 0b635d445f6cc5c188670ddadab98d4154fc5d2b 
>   src/slave/containerizer/mesos/containerizer.cpp 
> d70259b10abdf995229ceba66ec9b58cf111e056 
>   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 
> 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/containerizer.cpp a17e1e0c3f4d5fb2e369ae2b06430b8fd366bfc7 
>   src/tests/containerizer_tests.cpp 02a5f15ff970b475df2fe5e48f2aee42eeefe18e 
>   src/tests/docker_containerizer_tests.cpp 
> bed2d1005647d62b3ec013315b883d33990b896e 
>   src/tests/external_containerizer_test.cpp 
> 45cdeb57dd5b7b20294ebdecfd091a77e86941cb 
>   src/tests/fetcher_tests.cpp 9e48392d31c886bad60abc8999fe293f55f589b0 
>   src/tests/isolator_tests.cpp 01c0239553011c68fecaeeeb1d87fc097509a136 
>   src/tests/slave_recovery_tests.cpp 8bd0f145ff52d91cc0ad194e8676fb105aba2792 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/28830/diff/
> 
> 
> Testing
> -------
> 
> make check on MacOS X 10.10 and Ubuntu 13.10. 
> 
> Some of these tests fail on Linux when running "make check" as root, but not 
> always the same ones and they all seme unrelated:
> 
> [  FAILED  ] DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Override
> [  FAILED  ] DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Args
> [  FAILED  ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework
> [  FAILED  ] CgroupsAnyHierarchyTest.ROOT_CGROUPS_Tasks
> [  FAILED  ] CgroupsAnyHierarchyTest.ROOT_CGROUPS_Read
> [  FAILED  ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup, where 
> TypeParam = mesos::internal::slave::CgroupsMemIsolatorProcess
> [  FAILED  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where 
> TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess
> [  FAILED  ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where 
> TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess
> [  FAILED  ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework
> [  FAILED  ] NsTest.ROOT_setns
> [  FAILED  ] PerfTest.ROOT_SampleInit
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>

Reply via email to