----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/52241/ -----------------------------------------------------------
(Updated Sept. 26, 2016, 5:13 p.m.) Review request for mesos, Benjamin Hindman, Artem Harutyunyan, Jie Yu, Joseph Wu, Kevin Klues, and Vinod Kone. Repository: mesos Description ------- This is a regression test for a bug which was already fixed in our code. The bug was that we missed a check in containerizer::recover() to make sure we will return a failure if the container state is changed to 'DESTROYING'. Otherwise, it was possible that the container state might be reset from 'DESTORYING' to 'PREPARING' by the race between containerizer::prepare and containerizer::destory. However, while this regression test tries to reproduce the race and tests on that the issue is fixed. It was making the assumption that if the container is in PROVISIONING state and we call containerizer::destroy(), we will always reach the future of 'container->provisioning' and waiting for the call back. This is not true, because it is possible that when the call back is triggered and we call containerizer::prepare(), the destroy track does not reach the 'container->provisioning' yet. The race exists in the unit test for a while. Now, after we implemented the code for nested container support, the containerizer::destroy become slower because we changed it to be recursive (destroy child containers first before destroying the parent container). Then, it exposes the race in this unit test, since when the callback is triggered and containerizer call prepare(), the containerizer::destroy() does not change the container state to 'DESTROYING' yet. After some intestigation, because the containerizer return type is void and all its continuous method (e.g., __destroy) only take one parameter 'ContainerID' and there is no special variable we can inject in to the method, there is no technical tool for us to make sure the destroy track reachs 'container->provisioning' point and waiting for a callback. After a second thought, the implementation in the containerizer to deal with container state is handled correctly. We should remove this regression test if such a race is not fixable. Diffs (updated) ----- src/tests/containerizer/mesos_containerizer_tests.cpp 8b5a19e121ef74eaf99b39682f8fd170605bf56d Diff: https://reviews.apache.org/r/52241/diff/ Testing ------- Thanks, Gilbert Song
