Thanks to both of you.

@David Idempotence (and functional style) should both mitigate the testing
issues.

@Sharma #3 looks impressive and I hear the pain. A few questions:
* Since you already have the state machine modeling, can't the scheduler
actions also be modeled as state machine transitions? (Rough sketch below.)
* Having a spec for the scheduler (in the form of a state machine or
otherwise) looks like an important (and hard) goal. Mocking looks like a
good approach. Is the mocking general enough to become a library available
to all, to enable *verifiably* correct scheduler behavior?
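
For concreteness, the kind of shape I have in mind (a hypothetical Python
sketch, not anyone's actual framework -- all names here are made up):

    # Hypothetical sketch: scheduler actions modeled as explicit state
    # machine transitions. State and events are plain values; a transition
    # is a pure function from (state, event) to (new_state, actions).
    from collections import namedtuple

    State = namedtuple("State", ["pending", "running"])
    TaskQueued = namedtuple("TaskQueued", ["task_id"])
    OfferReceived = namedtuple("OfferReceived", ["offer_id"])
    Launch = namedtuple("Launch", ["offer_id", "task_id"])

    def step(state, event):
        """Pure transition: next state plus the side-effect actions to run."""
        if isinstance(event, TaskQueued):
            return State(state.pending + [event.task_id], state.running), []
        if isinstance(event, OfferReceived) and state.pending:
            head, rest = state.pending[0], state.pending[1:]
            return (State(rest, state.running + [head]),
                    [Launch(event.offer_id, head)])
        return state, []  # anything else is a no-op / decline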

Again thanks for sharing your thoughts.

Thanks,
Dharmesh

On Mon, Oct 13, 2014 at 7:29 AM, David Greenberg <dsg123456...@gmail.com>
wrote:

> Specifically, with regard to the state of the framework under callback
> ordering: we ensure that our framework is written in a functional style, so
> that all callbacks atomically transform the previous state to a new state.
> By doing this, we serialize all callbacks. At this point, you can do
> generative testing to create events and run them through your system. This,
> at least, makes #3 possible.
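>
> A minimal sketch of that idea (Python, using the hypothesis library for
> the generative part; the names are made up, not our actual code):
>
>     # Sketch: callbacks are pure (state, event) -> state functions, so any
>     # serialized interleaving is just a fold over a list of events.
>     from hypothesis import given, strategies as st
>
>     def handle(state, event):
>         # Toy callback: count each kind of event we have seen.
>         kind = event[0]
>         counts = dict(state)
>         counts[kind] = counts.get(kind, 0) + 1
>         return counts
>
>     events = st.lists(st.sampled_from([("offer",), ("status",)]))
>
>     @given(events)
>     def test_fold_never_loses_events(evs):
>         final = {}
>         for e in evs:
>             final = handle(final, e)
>         assert sum(final.values()) == len(evs)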
>
> For #4, we are pretty careful to choose idempotent writes into the DB and
> a DB that supports snapshot reads. This way, you can just use at-least-once
> semantics for easy-to-implement retries. If a write fails, you just crash,
> since that means your DB is completely down. Then we test by talking
> through whether operations actually have this idempotency property, and we
> test the simple retry logic independently. This starts to get at a way to
> manage #4 and avoid learning in production.
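>
> A rough sketch of that retry shape (Python; the db client and its put call
> are placeholders, not a real driver):
>
>     # Sketch: at-least-once retries over an idempotent, keyed write.
>     # Replaying the same put after a timeout is safe by construction.
>     import time
>
>     def write_with_retries(db, key, value, attempts=3, backoff_s=0.5):
>         for i in range(attempts):
>             try:
>                 db.put(key, value)  # idempotent: same key, same value
>                 return
>             except TimeoutError:
>                 time.sleep(backoff_s * (2 ** i))
>         # The DB is likely completely down; crash and let the supervisor
>         # restart us with a clean slate.
>         raise SystemExit("giving up on write of %r after %d attempts"
>                          % (key, attempts))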
>
> On Sun, Oct 12, 2014 at 11:44 AM, Dharmesh Kakadia <dhkaka...@gmail.com>
> wrote:
>
>> Thanks David.
>>
>> Having callbacks take the state of the framework is an interesting design.
>> I am assuming the scheduler is maintaining the state and then handing
>> tasks to slaves. If that's the case, we can safely test the executor
>> (stateless: it receives an event and returns the appropriate status to the
>> scheduler). You construct scheduler tests similarly by passing different
>> states and events and observing the next state. This way you can be sure
>> that your callbacks work fine in *isolation*. I would still be concerned
>> about the state of the framework under callback ordering (or re-execution)
>> in *all possible scenarios*. Mocking is exactly what might uncover such
>> bugs, but as you pointed out, I also think it would not be trivial for
>> many frameworks.
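>>
>> Concretely, I imagine ordering tests shaped roughly like this (Python,
>> everything hypothetical), which fold the same events in every order and
>> check an invariant on the final state:
>>
>>     # Sketch: exercise callback ordering by folding each permutation of a
>>     # small event list through a pure (state, event) -> (state, actions)
>>     # transition and checking an invariant of the final state.
>>     from itertools import permutations
>>
>>     def fold(step, state, events):
>>         for e in events:
>>             state, _actions = step(state, e)
>>         return state
>>
>>     def check_invariant_under_reordering(step, initial, events, invariant):
>>         for order in permutations(events):
>>             final = fold(step, initial, list(order))
>>             assert invariant(final), "broken for order %r" % (order,)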
>>
>> At a high level, it would be important for framework developers to know
>> that:
>> 1. executors work fine in isolation on a fresh start, implementing the
>> feature.
>> 2. executors work fine when rescheduled/restarted/in the presence of
>> other executors.
>> 3. the scheduler works fine in isolation.
>> 4. the scheduler works fine in the wild (in the presence of
>> others/failures/checkpointing/...).
>>
>> 1 is easy to do traditionally. 2 is possible if your executors have no
>> side effects or if you use Docker etc.
>> 3 and 4 are not easy to do. I think a support library for testing
>> schedulers is something all framework writers would benefit from. Not
>> having to think about communication between executors and the scheduler is
>> already a big plus; can we also make it easier for developers to test
>> their scheduler behaviour? (A speculative sketch of such a harness below.)
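>>
>> To make that concrete, the surface of such a library could be as small as
>> this (pure speculation on my part; the method names only loosely follow
>> the SchedulerDriver API):
>>
>>     # Speculative sketch: a fake driver that records the actions a
>>     # scheduler takes, so tests can drive callbacks directly and assert
>>     # on what would have been sent to Mesos.
>>     class RecordingDriver(object):
>>         def __init__(self):
>>             self.launched, self.declined = [], []
>>
>>         def launchTasks(self, offer_id, tasks):
>>             self.launched.append((offer_id, tasks))
>>
>>         def declineOffer(self, offer_id):
>>             self.declined.append(offer_id)
>>
>>     def run_scenario(scheduler, events):
>>         """Feed (callback_name, args) pairs to a scheduler; return driver."""
>>         driver = RecordingDriver()
>>         for name, args in events:
>>             getattr(scheduler, name)(driver, *args)
>>         return driver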
>>
>> Thoughts?
>>
>> I would love to hear thoughts from others.
>>
>> Thanks,
>> Dharmesh
>>
>> On Sun, Oct 12, 2014 at 8:03 PM, David Greenberg <dsg123456...@gmail.com>
>> wrote:
>>
>>> For our frameworks, we don't tend to do much automated testing of the
>>> Mesos interface--instead, we construct the framework state, then "send it a
>>> message", since our callbacks take the state of the framework + the event
>>> as the argument. This way, we don't need to have Mesos running, and we can
>>> trim away the large amount of code that is necessary to connect to Mesos
>>> but unnecessary for the actual feature under test. We've also been
>>> experimenting with simulation testing by mocking out the Mesos APIs. These
>>> techniques are mostly effective when you can pretend that the executors
>>> you're using don't communicate much, or when they're trivial to mock.
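>>>
>>> As a sketch of the "send it a message" style (Python, with made-up names;
>>> our real code differs), a callback test then needs no running Mesos at
>>> all:
>>>
>>>     # Sketch: the callback takes (framework_state, event) and returns a
>>>     # new state, so a test is just a function call -- no cluster needed.
>>>     def resource_offers(state, offers):
>>>         n = min(len(offers), len(state["pending"]))
>>>         return dict(state,
>>>                     pending=state["pending"][n:],
>>>                     running=state["running"] + state["pending"][:n])
>>>
>>>     def test_offer_consumes_pending():
>>>         state = {"pending": ["t1", "t2"], "running": []}
>>>         new = resource_offers(state, ["offer-1"])
>>>         assert new == {"pending": ["t2"], "running": ["t1"]}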
>>>
>>> On Sun, Oct 12, 2014 at 9:42 AM, Dharmesh Kakadia <dhkaka...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am working on a tiny experimental framework for Mesos. I was
>>>> wondering what the recommended way of writing test cases for a framework
>>>> is. I looked at several existing frameworks, but it's still not clear to
>>>> me. I understand that I might be able to test executor functionality in
>>>> isolation through normal test cases, but testing the framework as a
>>>> whole is what I am unclear about.
>>>>
>>>> Suggestions? Is that a non-goal? How do other framework developers go
>>>> about it?
>>>>
>>>> Also, on a related note, is there a better way to debug frameworks than
>>>> sifting through logs?
>>>>
>>>> Thanks,
>>>> Dharmesh
>>>>
>>>>
>>>>
>>>
>>
>
