> On Oct. 19, 2013, 11:35 p.m., Benjamin Hindman wrote: > > 3rdparty/libprocess/include/process/process.hpp, line 379 > > <https://reviews.apache.org/r/14686/diff/2/?file=365570#file365570line379> > > > > This is great! A wonderful step in the evolution of libprocess!
Ignore this issue, my bad. - Benjamin ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14686/#review27222 ----------------------------------------------------------- On Oct. 17, 2013, 12:11 a.m., Jie Yu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/14686/ > ----------------------------------------------------------- > > (Updated Oct. 17, 2013, 12:11 a.m.) > > > Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone. > > > Repository: mesos-git > > > Description > ------- > > This is the second patch of a series of patches that implement a catch-up > mechanism for replicated log. This patch fixed a race condition caused by > passing process wrapper pointers (e.g. Network, Replica, etc.). > > In our code base, one commonly used routine is to use a wrapper class to wrap > a libprocess so that we don't need to worry about whether we should directly > call the function or should use a dispatch. For example: > > class FooProcess : public Process<FooProcess> { > int foo(); > }; > > class Foo { > Future<int> foo() { return dispatch(process, &FooProcess::foo); } > ... > FooProcess* process; > }; > > If multiple entities want to share a libprocess (e.g. FooProcess), currently, > we pass a pointer of its wrapper (e.g. Foo*) to the users. However, this is > problematic in some cases. The main reason is that we don't know when it is > safe to destroy the libprocess (FooProcess) and delete the wrapper (Foo) if a > pointer of this wrapper has been passed to some other entities in the system > and we don't know whether they will use it or not later. > > There is a race condition in our first patch. Here is the buggy interleaving: > > T1 T2 > > appending.discard(); > ExplicitPromiseProcess::initialize() { > onDiscarded() { > terminate ExplicitPromiseProcess > } > > delete network; > > ... > network->broadcast(protocol::promise, > request) > } > > (We added a new test CoordinatorTest.FillTimeout which can reproduce this bug > and will segfault without this patch.) > > > To fix this problem, we introduce a concept called ProcessHandle to > generalize process wrappers (see process.hpp). It internally maintains a > shared pointer to the underlying libprocess. It is copyable. Multiple copies > of the same handle share the same underlying libprocess. When the last copy > is deleted, we will terminate the underlying process and delete the > libprocess. If a user wants to early terminate the libprocess, he can > explicitly call the terminate() function. > > Now, creating a wrapper for a libprocess is simply as that: > > class Foo : public ProcessHandle<FooProcess> { ... }; > > This change won't break the existing code (i.e., if you use the old way of > passing wrapper pointers, this change is like a noop). > > This is a joint work with Yan Xu. > > > Diffs > ----- > > 3rdparty/libprocess/include/process/process.hpp b502324 > src/Makefile.am a2d8242 > src/log/catchup.hpp PRE-CREATION > src/log/catchup.cpp PRE-CREATION > src/log/consensus.hpp PRE-CREATION > src/log/consensus.cpp PRE-CREATION > src/log/coordinator.hpp 3f6fb7c > src/log/coordinator.cpp 6e6466f > src/log/log.hpp 77edc7a > src/log/network.hpp d34cf78 > src/log/replica.hpp d1f5ead > src/log/replica.cpp 59a6ff3 > src/log/zookeeper.hpp PRE-CREATION > src/log/zookeeper.cpp PRE-CREATION > src/tests/log_tests.cpp ff5f86c > > Diff: https://reviews.apache.org/r/14686/diff/ > > > Testing > ------- > > bin/mesos-tests.sh --gtest_filter=CoordinatorTest*:ReplicaTest*:LogTest* > --gtest_repeat=100 > > > Thanks, > > Jie Yu > >
