On Mon, Oct 5, 2015 at 1:50 PM, Neil Conway <neil.con...@gmail.com> wrote:
>
> On Sun, Oct 4, 2015 at 6:14 PM, Maged Michael <maged.mich...@gmail.com> wrote:
> > I'd appreciate feedback on a proposal for a simulation tool for debugging
> > and testing the Mesos master and allocator.
>
> Overall, this is awesome! I'd love to see Mesos improve in this area,
> and I'd be happy to help out where I can.

Thanks. That would be awesome!

> > Simulations would--randomly but deterministically--explore the state space
> > of cloud configurations and check for invariant violations and collect
> > stats--in addition to those already in the Mesos master code.
>
> It would be useful to be able to (a) record a "trace" from a running
> (production) Mesos instance (b) replay that trace under the simulator,
> e.g., to explore the impact of changes to Mesos. For example, see
> Section 3.1 of the Borg paper [1].

Yes. I totally agree.

> > * Automated transformation of Mesos source code for integration into the
> > simulator, to allow the simulator to use simulated time instead of real
> > time and to intercept libprocess-based inter-thread and inter-node
> > communication.
>
> Can you elaborate on how you see the source code transformation working?

I have in mind three options.

(1) Text translation of Mesos source code. E.g., "process::Future" into,
say, "sim::process::Future".
- Pros: Does not require any changes to any Mesos or libprocess code.
  Replaces only what needs to be replaced in libprocess for simulation.
- Cons: Fragile.

(2) Integrate the simulation mode with the libprocess code.
- Pros: Robust. Adds only what needs to be added to libprocess for
  simulation. Partially reuses some data structures from regular-mode
  libprocess.
- Cons: Might get in the way of development and bug fixes in the regular
  libprocess code.

(3) Changes to Mesos makefiles to use alternative simulation-oriented
libprocess code.
- Pros: Robust.
- Cons: Might need to create a lot of stubs that redirect to the
  regular-mode (i.e., not for simulation) libprocess code that doesn't need
  any change under simulation.

I thought I would start incrementally with option (1) and eventually switch
to option (2) after the code of approach (1) matures, to minimize
interference with development of regular-mode libprocess code.
Option (3) can be an alternative to (2) if the combination of regular-mode
and simulation-mode code becomes too complex to maintain.

> Because of the way in which Mesos uses processes and message passing,
> you can already control timeouts and inter-process communication in a
> fairly sophisticated way -- for example, see Clock::advance(),
> Clock::settle(), FUTURE_MESSAGE(), DROP_MESSAGE(), etc. Do you think
> it would be possible to implement the simulator in a way that
> leverages (and improves!) the existing facilities in libprocess,
> rather than building new functionality? For example, to control the
> way in which processes and events are interleaved, would it be
> possible to do this by hooking into the libprocess message dispatch
> logic, rather than doing a source code transformation?

What I had in mind is to start with separate functionality. I am a newbie
to libprocess, so I'd appreciate any suggestions for reusing libprocess
code.

As an example of what I have in mind, this is a sketch of
sim::process::dispatch.

template<class T, class... Args>
// Let R be an abbreviation of typename result_of<T::*method(Args...)>::type
sim::process::Future<R> dispatch(
    const sim::process::Process<T>& pid,
    R (T::*method)(Args...),
    Args... args)
{
  /* Still running in the context of the parent simulated thread - the
     same C++/OS thread as the simulator. */
  <context switch to the simulator and back to allow event interleaving>
  /* e.g., setjmp/longjmp */

  // create a promise
  std::shared_ptr<sim::process::Promise<R>> prom(new sim::process::Promise<R>());
  <create a function object fn initialized with T::method and args>
  <associate prom with fn> // e.g., a map structure
  <enqueue fn in pid's structure>
  return prom->future();
  /* The dispatched function will start running when at some point later
     the simulator decides to switch to the child thread (pid) when pid is
     ready to run fn. */
}

> > Examples of problems to be detected:
> > * Liveness problems such as deadlock, livelock, starvation
> > * Safety problems such as oversubscription of resources, permanent loss of
> > resources or tasks, data corruption in general.
> > * Fairness problems such as sustained imbalance in allocation of resources
> > to frameworks.
> > * Performance problems such as high response time, low resource utilization.
>
> Validating that the system behaves correctly in the presence of
> network partitions would also be great.

Good point. Thanks.

> To clarify, it seems like you are primarily focused on finding
> bugs/problems in core Mesos, rather than in Mesos framework
> implementations. The latter would also be a very interesting project
> (e.g., as a framework author, we'd give you a tool that would push
> your scheduler/executor implementation through the entire state space
> of situations the framework would need to handle).

I am glad that you mentioned this because I had that in mind too. I just
didn't want to confuse the goals in the first post.

Most of the infrastructure can be shared between simulations for the
purpose of debugging and testing and simulations for experimenting with
high-level Mesos allocation and framework task scheduling policies. In the
latter mode, we can improve speed by assuming that the Mesos code is
correct at a low level (i.e., no race conditions between the master and
allocator threads); sim::process::dispatch would then just strip away the
child pid and run the dispatched function immediately.

I see correctness testing and debugging vs. performance and policy testing
as two somewhat diverging tracks that have common infrastructure. The first
track (correctness) needs investment in libprocess simulation, whereas the
second track (policies) needs investment in cloud models and interfaces to
plug in heterogeneous framework models.

I'd be happy to prioritize whichever track brings more value to the
community.
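
To make the two dispatch modes above concrete, here is a minimal, self-contained sketch, assuming C++17. It is not libprocess code and not the proposed implementation: `sim::Simulator`, `sim::Future`, the `Mode` enum, and the integer process ids are all hypothetical names chosen for illustration. It also replaces the setjmp/longjmp context switch with a plain per-process event queue, and it only handles dispatched functions that return a non-void value.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <optional>
#include <random>
#include <utility>
#include <vector>

namespace sim {

// Hypothetical stand-in for a simulated future; not the libprocess type.
template <typename R>
struct Future {
  std::shared_ptr<std::optional<R>> state;
  bool ready() const { return state->has_value(); }
  const R& get() const { assert(ready()); return **state; }
};

class Simulator {
 public:
  // Interleaved: correctness track. Immediate: policy track (fast path).
  enum class Mode { Interleaved, Immediate };

  Simulator(std::size_t processes, unsigned seed, Mode mode)
      : queues_(processes), rng_(seed), mode_(mode) {}

  // Hand fn to simulated process `pid` and return a future for its result.
  template <typename F>
  auto dispatch(std::size_t pid, F fn) -> Future<decltype(fn())> {
    using R = decltype(fn());
    Future<R> future{std::make_shared<std::optional<R>>()};
    auto state = future.state;
    auto task = [state, fn]() mutable { *state = fn(); };
    if (mode_ == Mode::Immediate) {
      task();  // Ignore the child pid and run right away.
    } else {
      queues_.at(pid).push_back(std::move(task));  // Run when scheduled.
    }
    return future;
  }

  // Run one queued event from a pseudo-randomly chosen ready process.
  // The choice depends only on the seed, so a run can be replayed.
  bool step() {
    std::vector<std::size_t> ready;
    for (std::size_t i = 0; i < queues_.size(); ++i) {
      if (!queues_[i].empty()) ready.push_back(i);
    }
    if (ready.empty()) return false;
    const std::size_t pick = ready[rng_() % ready.size()];
    auto task = std::move(queues_[pick].front());
    queues_[pick].pop_front();
    task();  // The task may itself dispatch further events.
    return true;
  }

  // Drain all queues.
  void run() { while (step()) {} }

 private:
  std::vector<std::deque<std::function<void()>>> queues_;
  std::mt19937 rng_;
  Mode mode_;
};

}  // namespace sim
```

In Interleaved mode a future stays unready until `run()` (or enough calls to `step()`) executes the queued function, and two simulators built with the same seed and fed the same dispatches pick events in the same order. In Immediate mode the future is ready as soon as `dispatch` returns. Invariant checks would naturally go between calls to `step()`.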

> Neil
>
> [1]
> https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf

--Maged