Re: Review Request 33465: Removed 'uuid' field from UPDATE call.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33465/
---

(Updated April 23, 2015, 5:48 a.m.)


Review request for mesos and Ben Mahler.


Changes
---

minor fix. updated tag id.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

Now that TaskStatus has 'uuid', we don't need it in UPDATE. It will also make 
the ACKNOWLEDGE semantics easier.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 
  src/examples/low_level_scheduler_libprocess.cpp 
63d34eefb60d13fe2b82905c1cec9b762340e997 
  src/examples/low_level_scheduler_pthread.cpp 
6d1f938660c02db75bfcbf7c8de0d941cff1920d 
  src/scheduler/scheduler.cpp bd9fced0f58aa3bc0ff147dbefb77cea4734a79e 

Diff: https://reviews.apache.org/r/33465/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 33274: Fix capture by reference of temporary strings in Libprocess.

2015-04-22 Thread Adam B

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33274/#review81267
---


Is it possible to do some quick benchmarking to show that this change doesn't 
compile into more copies, making Mesos run slower and fatter?

- Adam B


On April 22, 2015, 11:11 a.m., Joris Van Remoortere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33274/
> ---
> 
> (Updated April 22, 2015, 11:11 a.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
> Park, and Till Toenshoff.
> 
> 
> Bugs: MESOS-2631
> https://issues.apache.org/jira/browse/MESOS-2631
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/help.cpp 85e1bdec8d7e8f46477d0f3d88847baeca2dcc9c 
>   3rdparty/libprocess/src/process.cpp 
> 97ac09fd10b767bc46387abc3657d005a00830c8 
>   3rdparty/libprocess/src/tests/decoder_tests.cpp 
> efe364a5de38c4d39fe42df243df173e7d04479a 
>   3rdparty/libprocess/src/tests/encoder_tests.cpp 
> 784a2c71a2a7d349726f8e85bab3888b2c405aba 
>   3rdparty/libprocess/src/tests/http_tests.cpp 
> dfdb233de93552ab7040647ee1958a85d47c9482 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> eb38edce2c483ba7f963a826893a15a075238618 
>   3rdparty/libprocess/src/tests/subprocess_tests.cpp 
> dea4ed8699c6c7cf5b4f5d3f014577c474dcac95 
> 
> Diff: https://reviews.apache.org/r/33274/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Joris Van Remoortere
> 
>



Re: Review Request 32509: Documented the scheduler Event/Call protobufs.

2015-04-22 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32509/#review81266
---


Patch looks great!

Reviews applied: [32500, 32501, 32502, 32504, 32843, 32505, 32506, 32844, 
32845, 33465, 32509]

All tests passed.

- Mesos ReviewBot


On April 23, 2015, 2:55 a.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32509/
> ---
> 
> (Updated April 23, 2015, 2:55 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
> 
> Diff: https://reviews.apache.org/r/32509/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 32509: Documented the scheduler Event/Call protobufs.

2015-04-22 Thread Vinod Kone


> On April 22, 2015, 11:26 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, line 52
> > 
> >
> > Hm, what do you mean by "can be offered"? Do you want to just s/that 
> > can be// ?

done.


> On April 22, 2015, 11:26 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, lines 54-55
> > 
> >
> > Hm, this last sentence sort of implies no optimistic offers? We'll need 
> > to be sure that we update the documentation later I suppose.

yup.


> On April 22, 2015, 11:26 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, lines 60-63
> > 
> >
> > "no longer valid" seems a little specific, we could rescind in cases 
> > where the offer would have remained valid but we choose to change the 
> > resource allocations dynamically. How about just s/is no longer valid/is 
> > rescinded/ and removing the "hence" part?
> > 
> > Also, do we need the example or can we trim this down a bit?

"A Rescind is received when an offer is rescinded" doesn't seem to offer much 
value to framework writers? I used "invalid" because from framework's 
perspective the offer is no longer valid, as in, it cannot use it. Does that 
make sense?


> On April 22, 2015, 11:26 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, line 69
> > 
> >
> > "or mesos"? Slave and master might be a bit too implementation specific 
> > in the future, plus we'd like to rename "Slave".
> > 
> > On that note, before we finalize the http api, can we get rid of 
> > "slave" in the api?

"mesos" might be weird because frameworks might not consider executors sending 
updates as getting from mesos? Regarding Slave renaming, we should have a 
separate discussion around it because the api uses protobufs from mesos.proto 
which uses it.


> On April 22, 2015, 11:26 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, lines 101-102
> > 
> >
> > Specifically, we have to think about how to do reconciliation between 
> > mesos and the scheduler, yes?

yup. expanded the TODO.


> On April 22, 2015, 11:26 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, line 151
> > 
> >
> > Do you want to add something about how the "removes" the framework too?

done.


> On April 22, 2015, 11:26 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, lines 235-238
> > 
> >
> > Hm.. are we expecting schedulers to use the source to decide whether to 
> > acknowledge? That seems a bit implicit, should we make 'uuid' optional to 
> > make it more clear?
> > 
> > Also, that way, the master can tell when one of its updates is being 
> > acknowledged (empty uuid), which should also make it possible for the 
> > master to drop the acknowledgments, yes?

SGTM. I'll remove uuid from Update, now that it is part of TaskStatus. Will do 
that in a dependent review.


> On April 22, 2015, 11:26 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, lines 268-270
> > 
> >
> > Hm.. schedulers can't use Update as a replacement for this Message, 
> > since Message is scheduler -> executor but Update is executor -> scheduler..

Alex asked the same thing earlier. My response was that schedulers could use 
tasks and Updates to do 2 way communication. I'll just remove this sentence to 
avoid confusion.


> On April 22, 2015, 11:26 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, lines 113-116
> > 
> >
> > Isn't this more general than these two specific cases? Maybe we should 
> > describe it more generally instead of documenting examples like this? For 
> > example, what should the scheduler do when Error is received? Will the 
> > connection be closed? Will they need to re-subscribe?

This is there for backwards compatibility with the old driver and hence talks 
about the conditions that generate Error with the old driver. I plan to remove 
this in the new API (as the TODO says) because the errors will be delivered as 
HTTP responses instead of on the SUBSCRIBE stream. That makes sense?


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32509/#review81234
---


On April 22, 2015, 9:44 p.m

Re: Review Request 32509: Documented the scheduler Event/Call protobufs.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32509/
---

(Updated April 23, 2015, 2:55 a.m.)


Review request for mesos and Ben Mahler.


Changes
---

benm's. NNFR.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

See summary.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 

Diff: https://reviews.apache.org/r/32509/diff/


Testing
---

make check


Thanks,

Vinod Kone



Review Request 33465: Removed 'uuid' field from UPDATE call.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33465/
---

Review request for mesos and Ben Mahler.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

Now that TaskStatus has 'uuid', we don't need it in UPDATE. It will also make 
the ACKNOWLEDGE semantics easier.


Diffs
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 
  src/examples/low_level_scheduler_libprocess.cpp 
63d34eefb60d13fe2b82905c1cec9b762340e997 
  src/examples/low_level_scheduler_pthread.cpp 
6d1f938660c02db75bfcbf7c8de0d941cff1920d 
  src/scheduler/scheduler.cpp bd9fced0f58aa3bc0ff147dbefb77cea4734a79e 

Diff: https://reviews.apache.org/r/33465/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 32845: Renamed UNREGISTER call to TEARDOWN.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32845/
---

(Updated April 23, 2015, 2:52 a.m.)


Review request for mesos and Ben Mahler.


Changes
---

benm's. NNFR.


Summary (updated)
-

Renamed UNREGISTER call to TEARDOWN.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

Renamed UNREGISTER call to UNSUBSCRIBE for consistency.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 
  src/examples/low_level_scheduler_libprocess.cpp 
63d34eefb60d13fe2b82905c1cec9b762340e997 
  src/examples/low_level_scheduler_pthread.cpp 
6d1f938660c02db75bfcbf7c8de0d941cff1920d 
  src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
  src/scheduler/scheduler.cpp bd9fced0f58aa3bc0ff147dbefb77cea4734a79e 
  src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 

Diff: https://reviews.apache.org/r/32845/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 32845: Renamed UNREGISTER call to UNSUBSCRIBE.

2015-04-22 Thread Vinod Kone


> On April 22, 2015, 10:48 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, line 113
> > 
> >
> > And removes the framework?
> > 
> > Do you want to align the comments so the `//` characters are lined up?

done.


> On April 22, 2015, 10:48 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, line 211
> > 
> >
> > s/UNSUBSCRIBE/TEARDOWN/ here? Could you do a grep just in case anything 
> > else was missed?

good catch. did a grep too. this was the only one.


> On April 22, 2015, 10:48 p.m., Ben Mahler wrote:
> > src/tests/scheduler_tests.cpp, line 654
> > 
> >
> > Hm.. can you add a comment above this as to why it's needed?

doesn't look like it's needed. not sure why i added it. deleted it.


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32845/#review81232
---


On April 22, 2015, 9:43 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32845/
> ---
> 
> (Updated April 22, 2015, 9:43 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Renamed UNREGISTER call to UNSUBSCRIBE for consistency.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
>   src/examples/low_level_scheduler_libprocess.cpp 
> 63d34eefb60d13fe2b82905c1cec9b762340e997 
>   src/examples/low_level_scheduler_pthread.cpp 
> 6d1f938660c02db75bfcbf7c8de0d941cff1920d 
>   src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
>   src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
>   src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 
> 
> Diff: https://reviews.apache.org/r/32845/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 32844: Added SUBSCRIBE call and SUBSCRIBED event.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32844/
---

(Updated April 23, 2015, 2:51 a.m.)


Review request for mesos and Ben Mahler.


Changes
---

benm's. NNFR.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

Instead of REGISTER and REREGISTER we now just have SUBSCRIBE. Similarly, 
instead of REGISTERED and REREGISTERED there is only SUBSCRIBED. This will 
simplify a scheduler's registration semantics.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 
  src/examples/low_level_scheduler_libprocess.cpp 
63d34eefb60d13fe2b82905c1cec9b762340e997 
  src/examples/low_level_scheduler_pthread.cpp 
6d1f938660c02db75bfcbf7c8de0d941cff1920d 
  src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
  src/scheduler/scheduler.cpp bd9fced0f58aa3bc0ff147dbefb77cea4734a79e 
  src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 

Diff: https://reviews.apache.org/r/32844/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 32843: Updated KILL to include SlaveID.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32843/
---

(Updated April 23, 2015, 2:51 a.m.)


Review request for mesos and Ben Mahler.


Changes
---

Benm's. NNFR.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

Having SlaveID will help us do better reconciliation when the task is not 
found. Also, updated master to handle Call::Kill.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 
  src/master/master.hpp 550d2c50cd01aa5830a7e7e03ab4a0240c221b15 
  src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
  src/scheduler/scheduler.cpp bd9fced0f58aa3bc0ff147dbefb77cea4734a79e 
  src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 

Diff: https://reviews.apache.org/r/32843/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 32843: Updated KILL to include SlaveID.

2015-04-22 Thread Vinod Kone


> On April 22, 2015, 10:20 p.m., Ben Mahler wrote:
> > src/master/master.cpp, lines 2731-2732
> > 
> >
> > You may want to avoid taking a const reference to the temporary here?

fixed.


> On April 22, 2015, 10:20 p.m., Ben Mahler wrote:
> > src/master/master.cpp, line 2772
> > 
> >
> > Will this be a BadRequest so schedulers can detect the issue? Maybe a 
> > TODO?

done.


> On April 22, 2015, 10:20 p.m., Ben Mahler wrote:
> > src/master/master.cpp, line 2773
> > 
> >
> > Whoops, remove the semi-colon?

done.


> On April 22, 2015, 10:20 p.m., Ben Mahler wrote:
> > src/tests/scheduler_tests.cpp, line 417
> > 
> >
> > Did you want this extra newline?

nope. fixed.


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32843/#review81222
---


On April 22, 2015, 9:40 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32843/
> ---
> 
> (Updated April 22, 2015, 9:40 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Having SlaveID will help us do better reconciliation when the task is not 
> found. Also, updated master to handle Call::Kill.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
>   src/master/master.hpp 550d2c50cd01aa5830a7e7e03ab4a0240c221b15 
>   src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
>   src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
>   src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 
> 
> Diff: https://reviews.apache.org/r/32843/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 33257: Fixed recover tasks only by the intiated containerizer.

2015-04-22 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/#review81257
---

Ship it!



src/slave/slave.cpp


I would rather not set the container field for Mesos containers (as we 
don't set it for customized executors as well).


- Jie Yu


On April 20, 2015, 9:18 p.m., Timothy Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33257/
> ---
> 
> (Updated April 20, 2015, 9:18 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie 
> Yu, and Till Toenshoff.
> 
> 
> Bugs: MESOS-2601
> https://issues.apache.org/jira/browse/MESOS-2601
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed recover tasks only by the intiated containerizer.
> Currently both mesos and docker containerizer recovers tasks that wasn't 
> started by themselves.
> The proposed fix is to record the intended containerizer in the checkpointed 
> executorInfo, and reuse that information on recover to know if the 
> containerizer should recover or not. We are free to modify the executorInfo 
> since it's not being used to relaunch any task.
> The external containerizer doesn't need to change since it is only recovering 
> containers that are returned by the containers script.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/docker.hpp 0d5289c626bdf22420759485f2146abfb6bdaffc 
>   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
>   src/slave/containerizer/mesos/containerizer.cpp 
> e4136095fca55637864f495098189ab3ad8d8fe7 
>   src/slave/slave.cpp 8ec80ed26f338690e0a1e712065750ab77a724cd 
>   src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f 
>   src/tests/docker_containerizer_tests.cpp 
> b119a174de79c2f98a0c575e6be2f59649f509ef 
> 
> Diff: https://reviews.apache.org/r/33257/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>



Re: Review Request 33459: Fix containerizers to only recover containers they would launch.

2015-04-22 Thread Benjamin Hindman


> On April 23, 2015, 12:37 a.m., Timothy Chen wrote:
> > I already have a fix out for this:
> > https://reviews.apache.org/r/33257/
> > 
> > My patch also handles docker container tasks, since the docker 
> > containerizer uses the command executor in docker tasks, and without more 
> > handling we can't tell if it's from mesos or docker.

Cool! Well in that case I'll discard this review. Please update MESOS-2605 and 
MESOS-2601 with this proposed review.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33459/#review81252
---


On April 23, 2015, 12:33 a.m., Benjamin Hindman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33459/
> ---
> 
> (Updated April 23, 2015, 12:33 a.m.)
> 
> 
> Review request for mesos, Joris Van Remoortere, Michael Park, and Timothy 
> Chen.
> 
> 
> Bugs: MESOS-2605
> https://issues.apache.org/jira/browse/MESOS-2605
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> It's possible that a containerizer might recover a container that it
> can't properly handle, and later when any updates are attempted on
> that container the slave will actually destroy it which could leave an
> orphaned container because the containerizer doesn't actually know how
> to destroy it. See MESOS-2605.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 0bd26ea58ef56bad3b215711a8a38ca65ecd8ebd 
> 
> Diff: https://reviews.apache.org/r/33459/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benjamin Hindman
> 
>



Re: Review Request 32140: Enabled 'Resources' to handle 'Resource::ReservationInfo'.

2015-04-22 Thread Jie Yu


> On April 8, 2015, 8:16 p.m., Jie Yu wrote:
> > src/common/resources.cpp, lines 450-459
> > 
> >
> > The semantics of this function becomes a little weired now. For 
> > example, for a resource that has `role == "*"` and has reservation set, 
> > `isReserved(resource, "*")` is going to return `true`? Given that 
> > 'resource' is invalid, we should return a `false` in that case?
> 
> Alexander Rukletsov wrote:
> Can we have a resource with `role == "*"` and reservations set?
> 
> Alexander Rukletsov wrote:
> Excuse my premature comment earlier, I'm slowly starting to understand 
> what's going on here. It looks like in the case Jie describes, the function 
> returns `true` indeed. Which is invalid by now, but _may_ become valid in the 
> future. On the other side, I'm not sure that returning `false` in this case 
> is an alternative: it will read as unreserved resource, not an invalid 
> resource. We can wrap the return value in `Try` but I prefer the way it's 
> done now.
> 
> Michael Park wrote:
> Hey guys, this predicate, as with other predicates assume that the 
> resources have already been validated.
> The following note can be found at the top of the predicate declarations 
> in the header.
> 
> ```
> // NOTE: The following predicate functions assume that the given
> // resource is validated.
> ```
> 
> Is it still weird with this assumption? If it was just that the note is 
> difficult to find, I could maybe put the note as a comment on every predicate?

See my first comment in https://reviews.apache.org/r/32150/

If we move the validation to master, then this function becomes weired.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32140/#review79410
---


On April 22, 2015, 8:36 p.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32140/
> ---
> 
> (Updated April 22, 2015, 8:36 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Ben Mahler, and Jie Yu.
> 
> 
> Bugs: MESOS-2476
> https://issues.apache.org/jira/browse/MESOS-2476
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> `Resource::ReservationInfo` was introduced in 
> [r32139](https://reviews.apache.org/r/32139). We need to consider it in our 
> `Resources` class implementation.
> 
> ### `Resources::flatten`
> 
> `flatten` is used as the utility function to cross reservation boundaries 
> without affecting the given resources. Since the reservation is now specified 
> by the (`role`, `reservation`) pair, `flatten` needs to consider 
> `ReservationInfo` as well.
> 
> ### `Resources::validate`
> 
> If `role == "*"`, then `reservation` field must not be set.
> 
> ### `Resources` comparators
> 
> `operator==`, `addable` and `substractable` need to test for 
> `ReservationInfo` as well.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 56affd45e1e71e96ff5778b43271f81b9b9a9e4d 
>   src/common/resources.cpp 2c99b6884d7296099e19e2e3182cbe11b5e1e059 
>   src/tests/mesos.hpp 4294e28f904161b473c0cfa7feea4eaa4e7c97e3 
>   src/tests/resources_tests.cpp 7e0ad98c3366f647f190363a0e6b576dbfc7d415 
> 
> Diff: https://reviews.apache.org/r/32140/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 33459: Fix containerizers to only recover containers they would launch.

2015-04-22 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33459/#review81251
---


Subsumed by this https://reviews.apache.org/r/33257?

- Jie Yu


On April 23, 2015, 12:33 a.m., Benjamin Hindman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33459/
> ---
> 
> (Updated April 23, 2015, 12:33 a.m.)
> 
> 
> Review request for mesos, Joris Van Remoortere, Michael Park, and Timothy 
> Chen.
> 
> 
> Bugs: MESOS-2605
> https://issues.apache.org/jira/browse/MESOS-2605
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> It's possible that a containerizer might recover a container that it
> can't properly handle, and later when any updates are attempted on
> that container the slave will actually destroy it which could leave an
> orphaned container because the containerizer doesn't actually know how
> to destroy it. See MESOS-2605.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 0bd26ea58ef56bad3b215711a8a38ca65ecd8ebd 
> 
> Diff: https://reviews.apache.org/r/33459/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benjamin Hindman
> 
>



Re: Review Request 33459: Fix containerizers to only recover containers they would launch.

2015-04-22 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33459/#review81252
---


I already have a fix out for this:
https://reviews.apache.org/r/33257/

My patch also handles docker container tasks, since the docker containerizer 
uses the command executor in docker tasks, and without more handling we can't 
tell if it's from mesos or docker.

- Timothy Chen


On April 23, 2015, 12:33 a.m., Benjamin Hindman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33459/
> ---
> 
> (Updated April 23, 2015, 12:33 a.m.)
> 
> 
> Review request for mesos, Joris Van Remoortere, Michael Park, and Timothy 
> Chen.
> 
> 
> Bugs: MESOS-2605
> https://issues.apache.org/jira/browse/MESOS-2605
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> It's possible that a containerizer might recover a container that it
> can't properly handle, and later when any updates are attempted on
> that container the slave will actually destroy it which could leave an
> orphaned container because the containerizer doesn't actually know how
> to destroy it. See MESOS-2605.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 0bd26ea58ef56bad3b215711a8a38ca65ecd8ebd 
> 
> Diff: https://reviews.apache.org/r/33459/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benjamin Hindman
> 
>



Re: Review Request 33358: Moved implementation of StatusUpdateStream to a compilation unit.

2015-04-22 Thread Ben Mahler


> On April 22, 2015, 9:24 a.m., Alexander Rukletsov wrote:
> > Let's not repeat summary in description. How about the following:
> > "StatusUpdateStream is not a template, hence we can reduce compilation time 
> > by moving method definitions into a `.cpp` file. As a drive-by change 
> > headers are cleaned up."
> > 
> > Thank you for putting effort into reducing compilation time!

> Thank you for putting effort into reducing compilation time!

Was it measured? :)


- Ben


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33358/#review81136
---


On April 22, 2015, 9:09 a.m., Alexander Rojas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33358/
> ---
> 
> (Updated April 22, 2015, 9:09 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Joerg Schad, and Till 
> Toenshoff.
> 
> 
> Bugs: MESOS-2609
> https://issues.apache.org/jira/browse/MESOS-2609
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Moves the implementation of StatusUpdateStream to a compilation unit.
> Also cleans up headers.
> 
> 
> Diffs
> -
> 
>   src/slave/status_update_manager.hpp 
> b4d91b22b515195fdb69c89af9c2864e469e7e54 
>   src/slave/status_update_manager.cpp 
> fab8c22d46b8ab0a3c3745541ddc650b574bfbd4 
> 
> Diff: https://reviews.apache.org/r/33358/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Alexander Rojas
> 
>



Review Request 33459: Fix containerizers to only recover containers they would launch.

2015-04-22 Thread Benjamin Hindman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33459/
---

Review request for mesos, Joris Van Remoortere, Michael Park, and Timothy Chen.


Bugs: MESOS-2605
https://issues.apache.org/jira/browse/MESOS-2605


Repository: mesos


Description
---

It's possible that a containerizer might recover a container that it
can't properly handle, and later when any updates are attempted on
that container the slave will actually destroy it which could leave an
orphaned container because the containerizer doesn't actually know how
to destroy it. See MESOS-2605.


Diffs
-

  src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
  src/slave/containerizer/mesos/containerizer.cpp 
0bd26ea58ef56bad3b215711a8a38ca65ecd8ebd 

Diff: https://reviews.apache.org/r/33459/diff/


Testing
---


Thanks,

Benjamin Hindman



Re: Review Request 32509: Documented the scheduler Event/Call protobufs.

2015-04-22 Thread Ben Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32509/#review81234
---

Ship it!


Looks great, only 1 question which we can tackle separately from this review:

Is it possible to describe when to acknowledge an update in a more explicit way 
(currently source == SLAVE of source == EXECUTOR seems implicit). Can we make 
the 'uuid' optional? Also we should think about how the master can drop 
acknowledgements for its own updates.


include/mesos/scheduler/scheduler.proto


Hm, what do you mean by "can be offered"? Do you want to just s/that can 
be// ?



include/mesos/scheduler/scheduler.proto


Hm, this last sentence sort of implies no optimistic offers? We'll need to 
be sure that we update the documentation later I suppose.



include/mesos/scheduler/scheduler.proto


"no longer valid" seems a little specific, we could rescind in cases where 
the offer would have remained valid but we choose to change the resource 
allocations dynamically. How about just s/is no longer valid/is rescinded/ and 
removing the "hence" part?

Also, do we need the example or can we trim this down a bit?



include/mesos/scheduler/scheduler.proto


"or mesos"? Slave and master might be a bit too implementation specific in 
the future, plus we'd like to rename "Slave".

On that note, before we finalize the http api, can we get rid of "slave" in 
the api?



include/mesos/scheduler/scheduler.proto


Specifically, we have to think about how to do reconciliation between mesos 
and the scheduler, yes?



include/mesos/scheduler/scheduler.proto


Isn't this more general than these two specific cases? Maybe we should 
describe it more generally instead of documenting examples like this? For 
example, what should the scheduler do when Error is received? Will the 
connection be closed? Will they need to re-subscribe?



include/mesos/scheduler/scheduler.proto


Do you want to add something about how the "removes" the framework too?



include/mesos/scheduler/scheduler.proto


Hm.. are we expecting schedulers to use the source to decide whether to 
acknowledge? That seems a bit implicit, should we make 'uuid' optional to make 
it more clear?

Also, that way, the master can tell when one of its updates is being 
acknowledged (empty uuid), which should also make it possible for the master to 
drop the acknowledgments, yes?



include/mesos/scheduler/scheduler.proto


Hm.. schedulers can't use Update as a replacement for this Message, since 
Message is scheduler -> executor but Update is executor -> scheduler..


- Ben Mahler


On April 22, 2015, 9:44 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32509/
> ---
> 
> (Updated April 22, 2015, 9:44 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
> 
> Diff: https://reviews.apache.org/r/32509/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 29889: Recover Docker containers when mesos slave is in a container

2015-04-22 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29889/#review81235
---


Bad patch!

Reviews applied: [29327, 29328]

Failed command: ./support/apply-review.sh -n -r 29328

Error:
 2015-04-22 22:53:16 URL:https://reviews.apache.org/r/29328/diff/raw/ 
[2149/2149] -> "29328.patch" [1]
error: patch failed: src/slave/flags.hpp:311
error: src/slave/flags.hpp: patch does not apply
Failed to apply patch

- Mesos ReviewBot


On April 22, 2015, 10:50 p.m., Timothy Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29889/
> ---
> 
> (Updated April 22, 2015, 10:50 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Bernd Mathiske, and Till 
> Toenshoff.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This is a one mega patch and contains many reviews that's already on rb.
> 
> This review is not meant to be merged, only provided for easier review.
> 
> 
> Diffs
> -
> 
>   Dockerfile 35abf25 
>   docs/configuration.md 54c4e31 
>   docs/docker-containerizer.md a5438b7 
>   src/Makefile.am 93c7c8a 
>   src/docker/docker.hpp 3ebbc1f 
>   src/docker/docker.cpp 3a485a2 
>   src/docker/executor.cpp PRE-CREATION 
>   src/slave/containerizer/docker.hpp 0d5289c 
>   src/slave/containerizer/docker.cpp f9fb078 
>   src/slave/flags.hpp d3b1ce1 
>   src/slave/flags.cpp d0932b0 
>   src/tests/docker_containerizer_tests.cpp b119a17 
> 
> Diff: https://reviews.apache.org/r/29889/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>



Re: Review Request 33249: Send statusUpdate to scheduler on containerizer launch failure

2015-04-22 Thread Jie Yu


> On April 21, 2015, 11:25 p.m., Jie Yu wrote:
> > src/slave/slave.cpp, lines 3065-3078
> > 
> >
> > Instead of doing that in your way, can we just try to make sure 
> > `containerizer->wait` here will return a failure (or a Termination with 
> > some reason) when `containerizer->launch` fails. In that way, the 
> > `executorTerminated` will properly send status updates to the slave 
> > (TASK_LOST/TASK_FAILED).
> > 
> > Or am I missing something?
> 
> Jie Yu wrote:
> OK, I think I got confused by the ticket. There are actually two problems 
> here. The problem I am refering to is the fact that we don't send status 
> update to the scheduler if containerizer launch fails until executor 
> reregistration timeout happens. Since for docker containerizer, someone might 
> use a very large timeout value, ideally, the slave should send a status 
> update to the scheduler right after containerizer launch fails.
> 
> After chat with Jay, the problem you guys are refering to is the fact 
> that the scheduler cannot disinguish between the case where the task has 
> failed vs. the case where the configuration of a task is not correct, because 
> in both cases, the scheduler will receive a TASK_FAILED/TASK_LOST.
> 
> Jie Yu wrote:
> To address the first problem, I think the simplest way is to add a 
> containerizer->destroy(..) in executorLaunched when containerizer->launch 
> fails. In that way, it's going to trigger containerizer->wait and thus send 
> status update to the scheduler.
> 
> Jie Yu wrote:
> Regarding the second problem, IMO, we should include a reason field in 
> Termination (https://issues.apache.org/jira/browse/MESOS-2035) and let 
> sendExecutorTerminatedStatusUpdate to propagate the termination reason to the 
> scheduler.
> 
> Timothy Chen wrote:
> Reason field sounds good, I think what you proposed makes sense, in 
> docker containerizer at least we also need to make sure termination message 
> is set correctly as currently it doesn't contain all the error information 
> that we pass back to the launch future.
> 
> Jay Buffington wrote:
> Sorry for the confusion.   There are three problems that are all related. 
>  Yes, we need to send statusUpdates as soon as containerizer launch fails and 
> yes, we need to set the reason so the scheduler can distinguish a slave 
> failure from a bad request.  However, my intention with this patch is not to 
> address either of those two problems.
> 
> My goal with this patch is to simply send the containerizer launch 
> failure message back to the scheduler.  I am using Aurora with the docker 
> containerizer.  There are a myriad of reasons that dockerd could fail to 
> "docker run" a container.  Currently, when that fails the user sees a useless 
> and incorrect "Abnormal Executor Termination" message in the Aurora web ui.  
> With this patch they see the stderr output of "docker run."  This way they 
> can report meaningful error messages to the operations team.
> 
> I can update this patch to address the other two issues, but the key is 
> that when the containerizer launch fails we have to send a statusUpdate with 
> a message that contains future.failure().  Before this patch we were only 
> logging it.  The scheduler needs to get that error message.
> 
> Jie Yu wrote:
> Thanks for clarifying it, Jay! In fact, my proposal above should be able 
> to solve the third problem cleanly. Check out 
> `Slave::sendExecutorTerminatedStatusUpdate`. The containerizer should 
> properly set the message and reason fields in the Termination protobuf 
> (basically why the container gets terminated and what's the error message). 
> The slave will forward the reason and message to the scheduler through status 
> update.
> 
> I chatted with BenM about this yesterday, and there are a couple of notes 
> I want to write down here.
> 
> 1. We probably need multiple levels for TaskStatus::Reason. In other 
> words, we probably need a "repeated Reason reasons" field in status update 
> message. For instance, for a containerizer launch failure, we probably need 
> two reasons: 1) the top level reason REASON_EXECUTOR_TERMINATED; 2) the 
> second level reason REASON_DOCKER_PULL_FAILURE;
> 
> 2. We probably want to allow extension to TaskStatus::Reason enum. For 
> example, some framework/executor may want to add customized reasons. We could 
> leverage the protobuf extension support to achieve that 
> (https://developers.google.com/protocol-buffers/docs/proto#extensions).
> 
> 3. The semantics around Termination is broken currently and we need to 
> clean it up. For instance, the following code is broken,
> ```
> void Slave::sendExecutorTerminatedStatusUpdate(...)
> {
>   ...
>   if (termination.isReady() && termination.get().killed()) {
> taskState = TASK_FAILED;
> // TODO(dhamon):

Re: Review Request 29889: Recover Docker containers when mesos slave is in a container

2015-04-22 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29889/
---

(Updated April 22, 2015, 10:50 p.m.)


Review request for mesos, Benjamin Hindman, Bernd Mathiske, and Till Toenshoff.


Repository: mesos


Description
---

This is a one mega patch and contains many reviews that's already on rb.

This review is not meant to be merged, only provided for easier review.


Diffs (updated)
-

  Dockerfile 35abf25 
  docs/configuration.md 54c4e31 
  docs/docker-containerizer.md a5438b7 
  src/Makefile.am 93c7c8a 
  src/docker/docker.hpp 3ebbc1f 
  src/docker/docker.cpp 3a485a2 
  src/docker/executor.cpp PRE-CREATION 
  src/slave/containerizer/docker.hpp 0d5289c 
  src/slave/containerizer/docker.cpp f9fb078 
  src/slave/flags.hpp d3b1ce1 
  src/slave/flags.cpp d0932b0 
  src/tests/docker_containerizer_tests.cpp b119a17 

Diff: https://reviews.apache.org/r/29889/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 32845: Renamed UNREGISTER call to UNSUBSCRIBE.

2015-04-22 Thread Ben Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32845/#review81232
---

Ship it!


TEARDOWN sounds good, thanks!


include/mesos/scheduler/scheduler.proto


And removes the framework?

Do you want to align the comments so the `//` characters are lined up?



include/mesos/scheduler/scheduler.proto


s/UNSUBSCRIBE/TEARDOWN/ here? Could you do a grep just in case anything 
else was missed?



src/tests/scheduler_tests.cpp


Hm.. can you add a comment above this as to why it's needed?


- Ben Mahler


On April 22, 2015, 9:43 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32845/
> ---
> 
> (Updated April 22, 2015, 9:43 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Renamed UNREGISTER call to UNSUBSCRIBE for consistency.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
>   src/examples/low_level_scheduler_libprocess.cpp 
> 63d34eefb60d13fe2b82905c1cec9b762340e997 
>   src/examples/low_level_scheduler_pthread.cpp 
> 6d1f938660c02db75bfcbf7c8de0d941cff1920d 
>   src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
>   src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
>   src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 
> 
> Diff: https://reviews.apache.org/r/32845/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 32844: Added SUBSCRIBE call and SUBSCRIBED event.

2015-04-22 Thread Ben Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32844/#review81227
---

Ship it!



src/examples/low_level_scheduler_libprocess.cpp


Can you fit this on one line by doing 
`s/event.subscribed().framework_id()/framework.id()/` ?



src/examples/low_level_scheduler_pthread.cpp


Ditto here?



src/master/master.cpp


Do you want this further down, above the second `switch` in this method, 
since it's general to implementing calls? Whereas this first switch is just for 
SUBSCRIBE? I guess we don't really need the TODO since we have Unimplemented 
written in the code :)



src/master/master.cpp


Do we want the same id presence checking here as you had below to drop the 
call with a more informative message?

I'm curious if the set to "" case will become a pain to check everywhere?



src/scheduler/scheduler.cpp


Feel free to tease this change out and commit on its own. No need for 
review, just might be nice to have a separate bug fix commit?


- Ben Mahler


On April 22, 2015, 9:42 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32844/
> ---
> 
> (Updated April 22, 2015, 9:42 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Instead of REGISTER and REREGISTER we now just have SUBSCRIBE. Similarly, 
> instead of REGISTERED and REREGISTERED there is only SUBSCRIBED. This will 
> simplify a scheduler's registration semantics.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
>   src/examples/low_level_scheduler_libprocess.cpp 
> 63d34eefb60d13fe2b82905c1cec9b762340e997 
>   src/examples/low_level_scheduler_pthread.cpp 
> 6d1f938660c02db75bfcbf7c8de0d941cff1920d 
>   src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
>   src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
>   src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 
> 
> Diff: https://reviews.apache.org/r/32844/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 33249: Send statusUpdate to scheduler on containerizer launch failure

2015-04-22 Thread Jay Buffington


> On April 21, 2015, 4:25 p.m., Jie Yu wrote:
> > src/slave/slave.cpp, lines 3065-3078
> > 
> >
> > Instead of doing that in your way, can we just try to make sure 
> > `containerizer->wait` here will return a failure (or a Termination with 
> > some reason) when `containerizer->launch` fails. In that way, the 
> > `executorTerminated` will properly send status updates to the slave 
> > (TASK_LOST/TASK_FAILED).
> > 
> > Or am I missing something?
> 
> Jie Yu wrote:
> OK, I think I got confused by the ticket. There are actually two problems 
> here. The problem I am refering to is the fact that we don't send status 
> update to the scheduler if containerizer launch fails until executor 
> reregistration timeout happens. Since for docker containerizer, someone might 
> use a very large timeout value, ideally, the slave should send a status 
> update to the scheduler right after containerizer launch fails.
> 
> After chat with Jay, the problem you guys are refering to is the fact 
> that the scheduler cannot disinguish between the case where the task has 
> failed vs. the case where the configuration of a task is not correct, because 
> in both cases, the scheduler will receive a TASK_FAILED/TASK_LOST.
> 
> Jie Yu wrote:
> To address the first problem, I think the simplest way is to add a 
> containerizer->destroy(..) in executorLaunched when containerizer->launch 
> fails. In that way, it's going to trigger containerizer->wait and thus send 
> status update to the scheduler.
> 
> Jie Yu wrote:
> Regarding the second problem, IMO, we should include a reason field in 
> Termination (https://issues.apache.org/jira/browse/MESOS-2035) and let 
> sendExecutorTerminatedStatusUpdate to propagate the termination reason to the 
> scheduler.
> 
> Timothy Chen wrote:
> Reason field sounds good, I think what you proposed makes sense, in 
> docker containerizer at least we also need to make sure termination message 
> is set correctly as currently it doesn't contain all the error information 
> that we pass back to the launch future.
> 
> Jay Buffington wrote:
> Sorry for the confusion.   There are three problems that are all related. 
>  Yes, we need to send statusUpdates as soon as containerizer launch fails and 
> yes, we need to set the reason so the scheduler can distinguish a slave 
> failure from a bad request.  However, my intention with this patch is not to 
> address either of those two problems.
> 
> My goal with this patch is to simply send the containerizer launch 
> failure message back to the scheduler.  I am using Aurora with the docker 
> containerizer.  There are a myriad of reasons that dockerd could fail to 
> "docker run" a container.  Currently, when that fails the user sees a useless 
> and incorrect "Abnormal Executor Termination" message in the Aurora web ui.  
> With this patch they see the stderr output of "docker run."  This way they 
> can report meaningful error messages to the operations team.
> 
> I can update this patch to address the other two issues, but the key is 
> that when the containerizer launch fails we have to send a statusUpdate with 
> a message that contains future.failure().  Before this patch we were only 
> logging it.  The scheduler needs to get that error message.
> 
> Jie Yu wrote:
> Thanks for clarifying it, Jay! In fact, my proposal above should be able 
> to solve the third problem cleanly. Check out 
> `Slave::sendExecutorTerminatedStatusUpdate`. The containerizer should 
> properly set the message and reason fields in the Termination protobuf 
> (basically why the container gets terminated and what's the error message). 
> The slave will forward the reason and message to the scheduler through status 
> update.
> 
> I chatted with BenM about this yesterday, and there are a couple of notes 
> I want to write down here.
> 
> 1. We probably need multiple levels for TaskStatus::Reason. In other 
> words, we probably need a "repeated Reason reasons" field in status update 
> message. For instance, for a containerizer launch failure, we probably need 
> two reasons: 1) the top level reason REASON_EXECUTOR_TERMINATED; 2) the 
> second level reason REASON_DOCKER_PULL_FAILURE;
> 
> 2. We probably want to allow extension to TaskStatus::Reason enum. For 
> example, some framework/executor may want to add customized reasons. We could 
> leverage the protobuf extension support to achieve that 
> (https://developers.google.com/protocol-buffers/docs/proto#extensions).
> 
> 3. The semantics around Termination is broken currently and we need to 
> clean it up. For instance, the following code is broken,
> ```
> void Slave::sendExecutorTerminatedStatusUpdate(...)
> {
>   ...
>   if (termination.isReady() && termination.get().killed()) {
> taskState = TASK_FAILED;
> // TODO(dhamon): 

Re: Review Request 32509: Documented the scheduler Event/Call protobufs.

2015-04-22 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32509/#review81225
---


Patch looks great!

Reviews applied: [32500, 32501, 32502, 32504, 32843, 32505, 32506, 32844, 
32845, 32509]

All tests passed.

- Mesos ReviewBot


On April 22, 2015, 9:44 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32509/
> ---
> 
> (Updated April 22, 2015, 9:44 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
> 
> Diff: https://reviews.apache.org/r/32509/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 32843: Updated KILL to include SlaveID.

2015-04-22 Thread Ben Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32843/#review81222
---

Ship it!



src/master/master.cpp


Does this variable buy us much vs `kill.task_id()`? For example, we don't 
do this for framework->id(), I'm guessing you did it to be similar to slaveId?



src/master/master.cpp


You may want to avoid taking a const reference to the temporary here?



src/master/master.cpp


Will this be a BadRequest so schedulers can detect the issue? Maybe a TODO?



src/master/master.cpp


Whoops, remove the semi-colon?



src/tests/scheduler_tests.cpp


Did you want this extra newline?


- Ben Mahler


On April 22, 2015, 9:40 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32843/
> ---
> 
> (Updated April 22, 2015, 9:40 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Having SlaveID will help us do better reconciliation when the task is not 
> found. Also, updated master to handle Call::Kill.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
>   src/master/master.hpp 550d2c50cd01aa5830a7e7e03ab4a0240c221b15 
>   src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
>   src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
>   src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 
> 
> Diff: https://reviews.apache.org/r/32843/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 32140: Enabled 'Resources' to handle 'Resource::ReservationInfo'.

2015-04-22 Thread Michael Park


> On April 8, 2015, 8:16 p.m., Jie Yu wrote:
> > src/common/resources.cpp, lines 450-459
> > 
> >
> > The semantics of this function becomes a little weired now. For 
> > example, for a resource that has `role == "*"` and has reservation set, 
> > `isReserved(resource, "*")` is going to return `true`? Given that 
> > 'resource' is invalid, we should return a `false` in that case?
> 
> Alexander Rukletsov wrote:
> Can we have a resource with `role == "*"` and reservations set?
> 
> Alexander Rukletsov wrote:
> Excuse my premature comment earlier, I'm slowly starting to understand 
> what's going on here. It looks like in the case Jie describes, the function 
> returns `true` indeed. Which is invalid by now, but _may_ become valid in the 
> future. On the other side, I'm not sure that returning `false` in this case 
> is an alternative: it will read as unreserved resource, not an invalid 
> resource. We can wrap the return value in `Try` but I prefer the way it's 
> done now.

Hey guys, this predicate, as with other predicates assume that the resources 
have already been validated.
The following note can be found at the top of the predicate declarations in the 
header.

```
// NOTE: The following predicate functions assume that the given
// resource is validated.
```

Is it still weird with this assumption? If it was just that the note is 
difficult to find, I could maybe put the note as a comment on every predicate?


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32140/#review79410
---


On April 22, 2015, 8:36 p.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32140/
> ---
> 
> (Updated April 22, 2015, 8:36 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Ben Mahler, and Jie Yu.
> 
> 
> Bugs: MESOS-2476
> https://issues.apache.org/jira/browse/MESOS-2476
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> `Resource::ReservationInfo` was introduced in 
> [r32139](https://reviews.apache.org/r/32139). We need to consider it in our 
> `Resources` class implementation.
> 
> ### `Resources::flatten`
> 
> `flatten` is used as the utility function to cross reservation boundaries 
> without affecting the given resources. Since the reservation is now specified 
> by the (`role`, `reservation`) pair, `flatten` needs to consider 
> `ReservationInfo` as well.
> 
> ### `Resources::validate`
> 
> If `role == "*"`, then `reservation` field must not be set.
> 
> ### `Resources` comparators
> 
> `operator==`, `addable` and `substractable` need to test for 
> `ReservationInfo` as well.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 56affd45e1e71e96ff5778b43271f81b9b9a9e4d 
>   src/common/resources.cpp 2c99b6884d7296099e19e2e3182cbe11b5e1e059 
>   src/tests/mesos.hpp 4294e28f904161b473c0cfa7feea4eaa4e7c97e3 
>   src/tests/resources_tests.cpp 7e0ad98c3366f647f190363a0e6b576dbfc7d415 
> 
> Diff: https://reviews.apache.org/r/32140/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 33452: Fixed the python bindings to use implicit acknoweldgements by default.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33452/#review81221
---

Ship it!


Ship It!

- Vinod Kone


On April 22, 2015, 9:53 p.m., Ben Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33452/
> ---
> 
> (Updated April 22, 2015, 9:53 p.m.)
> 
> 
> Review request for mesos, Alexander Rojas and Vinod Kone.
> 
> 
> Bugs: MESOS-2643
> https://issues.apache.org/jira/browse/MESOS-2643
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See [MESOS-2643](https://issues.apache.org/jira/browse/MESOS-2643) and [this 
> thread](http://www.mail-archive.com/dev@mesos.apache.org/msg32291.html) for 
> the context.
> 
> 
> Diffs
> -
> 
>   src/python/native/src/mesos/native/mesos_scheduler_driver_impl.cpp 
> b0a5ef49ba7ff7cf52d2ec649045a48f7f692014 
> 
> Diff: https://reviews.apache.org/r/33452/diff/
> 
> 
> Testing
> ---
> 
> Tested manually through `test_framework.py`.
> 
> I removed the explicit argument passing and ran with and without this patch 
> to ensure the default is picked up correctly.
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>



Review Request 33452: Fixed the python bindings to use implicit acknoweldgements by default.

2015-04-22 Thread Ben Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33452/
---

Review request for mesos, Alexander Rojas and Vinod Kone.


Bugs: MESOS-2643
https://issues.apache.org/jira/browse/MESOS-2643


Repository: mesos


Description
---

See [MESOS-2643](https://issues.apache.org/jira/browse/MESOS-2643) and [this 
thread](http://www.mail-archive.com/dev@mesos.apache.org/msg32291.html) for the 
context.


Diffs
-

  src/python/native/src/mesos/native/mesos_scheduler_driver_impl.cpp 
b0a5ef49ba7ff7cf52d2ec649045a48f7f692014 

Diff: https://reviews.apache.org/r/33452/diff/


Testing
---

Tested manually through `test_framework.py`.

I removed the explicit argument passing and ran with and without this patch to 
ensure the default is picked up correctly.


Thanks,

Ben Mahler



Re: Review Request 32506: Added output stream operators for scheduler Calls and Events.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32506/
---

(Updated April 22, 2015, 9:46 p.m.)


Review request for mesos and Ben Mahler.


Changes
---

updated "depends on". NNFR.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

For readable output in logs.


Diffs
-

  include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
  src/master/master.cpp e30b951eda2b3b0d5b2a80716f0b32c6bbe041bc 

Diff: https://reviews.apache.org/r/32506/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 32509: Documented the scheduler Event/Call protobufs.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32509/
---

(Updated April 22, 2015, 9:44 p.m.)


Review request for mesos and Ben Mahler.


Changes
---

updated documentation.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

See summary.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 

Diff: https://reviews.apache.org/r/32509/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 32845: Renamed UNREGISTER call to UNSUBSCRIBE.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32845/
---

(Updated April 22, 2015, 9:43 p.m.)


Review request for mesos and Ben Mahler.


Changes
---

s/UNREGISTER/TEARDOWN/


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

Renamed UNREGISTER call to UNSUBSCRIBE for consistency.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 
  src/examples/low_level_scheduler_libprocess.cpp 
63d34eefb60d13fe2b82905c1cec9b762340e997 
  src/examples/low_level_scheduler_pthread.cpp 
6d1f938660c02db75bfcbf7c8de0d941cff1920d 
  src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
  src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
  src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 

Diff: https://reviews.apache.org/r/32845/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 32844: Added SUBSCRIBE call and SUBSCRIBED event.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32844/
---

(Updated April 22, 2015, 9:42 p.m.)


Review request for mesos and Ben Mahler.


Changes
---

Benm's comments.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

Instead of REGISTER and REREGISTER we now just have SUBSCRIBE. Similarly, 
instead of REGISTERED and REREGISTERED there is only SUBSCRIBED. This will 
simplify a scheduler's registration semantics.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 
  src/examples/low_level_scheduler_libprocess.cpp 
63d34eefb60d13fe2b82905c1cec9b762340e997 
  src/examples/low_level_scheduler_pthread.cpp 
6d1f938660c02db75bfcbf7c8de0d941cff1920d 
  src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
  src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
  src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 

Diff: https://reviews.apache.org/r/32844/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 32845: Renamed UNREGISTER call to UNSUBSCRIBE.

2015-04-22 Thread Vinod Kone


> On April 21, 2015, 1:23 a.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, line 113
> > 
> >
> > Is UNSUBSCRIBE the right name for this? It seems like "unsubscribe" 
> > would stop the subscription for events, rather than shutdown all the tasks 
> > and remove the framework. That seems prone to a bit of confusion and 
> > mistakes?
> > 
> > For example, would an operator remove a framework with an /unsubscribe 
> > endpoint?
> > 
> > Hm.. it seems like SHUTDOWN would have been the best names for this? 
> > Should we re-use KILL for executors, but use SHUTDOWN for frameworks?

Didn't want to reuse SHUTDOWN because executor shutdown is a well known term.

I'm calling this TEARDOWN instead to make the semantics more explicit.


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32845/#review80870
---


On April 20, 2015, 8:04 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32845/
> ---
> 
> (Updated April 20, 2015, 8:04 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Renamed UNREGISTER call to UNSUBSCRIBE for consistency.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
>   src/examples/low_level_scheduler_libprocess.cpp 
> 63d34eefb60d13fe2b82905c1cec9b762340e997 
>   src/examples/low_level_scheduler_pthread.cpp 
> 6d1f938660c02db75bfcbf7c8de0d941cff1920d 
>   src/master/master.cpp e30b951eda2b3b0d5b2a80716f0b32c6bbe041bc 
>   src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
>   src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 
> 
> Diff: https://reviews.apache.org/r/32845/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 32505: Added SHUTDOWN scheduler call.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32505/
---

(Updated April 22, 2015, 9:41 p.m.)


Review request for mesos, Benjamin Hindman and Ben Mahler.


Changes
---

rebased. NNFR.


Bugs: MESOS-1127 and MESOS-330
https://issues.apache.org/jira/browse/MESOS-1127
https://issues.apache.org/jira/browse/MESOS-330


Repository: mesos


Description
---

This is a new call added to explicitly shutdown an executor.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 
  src/master/master.hpp 550d2c50cd01aa5830a7e7e03ab4a0240c221b15 
  src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
  src/messages/messages.proto bdf474b6614fad6eebdd79c636e820f9c30cf0c9 
  src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
  src/slave/slave.hpp d214ddb168298071dd90ad6d7f9bc6fd3d78d986 
  src/slave/slave.cpp 60345ec0d05a8150e61d71a6510824a7f2ca9524 
  src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 

Diff: https://reviews.apache.org/r/32505/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 32844: Added SUBSCRIBE call and SUBSCRIBED event.

2015-04-22 Thread Vinod Kone


> On April 20, 2015, 11:18 p.m., Ben Mahler wrote:
> > include/mesos/scheduler/scheduler.proto, line 117
> > 
> >
> > Might help to have a hint pointing to the framework id semantics:
> > 
> > ```
> > SUBSCRIBE = 1;  // See 'framework_info' below.
> > ```

done.


> On April 20, 2015, 11:18 p.m., Ben Mahler wrote:
> > src/examples/low_level_scheduler_libprocess.cpp, lines 126-127
> > 
> >
> > Maybe just "Subscribed with ID X"?

done.


> On April 20, 2015, 11:18 p.m., Ben Mahler wrote:
> > src/examples/low_level_scheduler_pthread.cpp, lines 147-148
> > 
> >
> > Ditto here.

done.


> On April 20, 2015, 11:18 p.m., Ben Mahler wrote:
> > src/master/master.cpp, lines 1564-1567
> > 
> >
> > TODO for implementing this? I assume you held off because of the 
> > non-trivial streaming related changes needed?

done. added a top level TODO for all unimplemented calls.


> On April 20, 2015, 11:18 p.m., Ben Mahler wrote:
> > src/scheduler/scheduler.cpp, lines 542-548
> > 
> >
> > This isn't yours, but is this right? We're looking at SUBSCRIBED events 
> > when the master is `None`?

doesn't seem correct. fixed.


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32844/#review80855
---


On April 20, 2015, 8:03 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32844/
> ---
> 
> (Updated April 20, 2015, 8:03 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1127
> https://issues.apache.org/jira/browse/MESOS-1127
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Instead of REGISTER and REREGISTER we now just have SUBSCRIBE. Similarly, 
> instead of REGISTERED and REREGISTERED there is only SUBSCRIBED. This will 
> simplify a scheduler's registration semantics.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> 783a63ad1cc0edd86605d638046fb959cb6e97e8 
>   src/examples/low_level_scheduler_libprocess.cpp 
> 63d34eefb60d13fe2b82905c1cec9b762340e997 
>   src/examples/low_level_scheduler_pthread.cpp 
> 6d1f938660c02db75bfcbf7c8de0d941cff1920d 
>   src/master/master.cpp e30b951eda2b3b0d5b2a80716f0b32c6bbe041bc 
>   src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
>   src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 
> 
> Diff: https://reviews.apache.org/r/32844/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 32843: Updated KILL to include SlaveID.

2015-04-22 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32843/
---

(Updated April 22, 2015, 9:40 p.m.)


Review request for mesos and Ben Mahler.


Changes
---

rebased. NNFR.


Bugs: MESOS-1127
https://issues.apache.org/jira/browse/MESOS-1127


Repository: mesos


Description
---

Having SlaveID will help us do better reconciliation when the task is not 
found. Also, updated master to handle Call::Kill.


Diffs (updated)
-

  include/mesos/scheduler/scheduler.proto 
783a63ad1cc0edd86605d638046fb959cb6e97e8 
  src/master/master.hpp 550d2c50cd01aa5830a7e7e03ab4a0240c221b15 
  src/master/master.cpp f3462d15005e24ab28e8265484b0d3810f21bd47 
  src/scheduler/scheduler.cpp 584b042e32865fdf875bf41ebcfb7f9c327d882a 
  src/tests/scheduler_tests.cpp 4a89a7a88b50bb8c254f5076661ce07ac9fc7657 

Diff: https://reviews.apache.org/r/32843/diff/


Testing
---

make check


Thanks,

Vinod Kone



Re: Review Request 33448: Added Eclipse-specific files to the gitignore list

2015-04-22 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33448/#review81217
---


Patch looks great!

Reviews applied: [33376, 33448]

All tests passed.

- Mesos ReviewBot


On April 22, 2015, 8:36 p.m., Marco Massenzio wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33448/
> ---
> 
> (Updated April 22, 2015, 8:36 p.m.)
> 
> 
> Review request for mesos and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added Eclipse-specific files to the gitignore list
> 
> 
> Diffs
> -
> 
>   .gitignore-template 1a8e2a8677a29f06ba6386d56537a0013d38821c 
> 
> Diff: https://reviews.apache.org/r/33448/diff/
> 
> 
> Testing
> ---
> 
> N/A
> 
> 
> Thanks,
> 
> Marco Massenzio
> 
>



Re: Review Request 32140: Enabled 'Resources' to handle 'Resource::ReservationInfo'.

2015-04-22 Thread Michael Park


> On April 15, 2015, 5:36 p.m., Alexander Rukletsov wrote:
> > src/common/resources.cpp, lines 93-105
> > 
> >
> > Can we replace it by
> > ```
> > if !addable(left, right) {
> >   return false;
> > }
> > ```
> > ? Or even add `addable()` check to the chain of checks below?

Each of `operator==`, `addable` and `subtractable` has subtle differences.

`operator==` and `addable` in specific are different in that equal persistent 
volumes (same persistence ID) are not addable.
Here's Jie's comment:

```
// TODO(jieyu): Even if two Resource objects with DiskInfo have the
// same persistence ID, they cannot be added together. In fact, this
// shouldn't happen if we do not add resources from different
// namespaces (e.g., across slave). Consider adding a warning.
```


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32140/#review80201
---


On April 22, 2015, 8:36 p.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32140/
> ---
> 
> (Updated April 22, 2015, 8:36 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Ben Mahler, and Jie Yu.
> 
> 
> Bugs: MESOS-2476
> https://issues.apache.org/jira/browse/MESOS-2476
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> `Resource::ReservationInfo` was introduced in 
> [r32139](https://reviews.apache.org/r/32139). We need to consider it in our 
> `Resources` class implementation.
> 
> ### `Resources::flatten`
> 
> `flatten` is used as the utility function to cross reservation boundaries 
> without affecting the given resources. Since the reservation is now specified 
> by the (`role`, `reservation`) pair, `flatten` needs to consider 
> `ReservationInfo` as well.
> 
> ### `Resources::validate`
> 
> If `role == "*"`, then `reservation` field must not be set.
> 
> ### `Resources` comparators
> 
> `operator==`, `addable` and `substractable` need to test for 
> `ReservationInfo` as well.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 56affd45e1e71e96ff5778b43271f81b9b9a9e4d 
>   src/common/resources.cpp 2c99b6884d7296099e19e2e3182cbe11b5e1e059 
>   src/tests/mesos.hpp 4294e28f904161b473c0cfa7feea4eaa4e7c97e3 
>   src/tests/resources_tests.cpp 7e0ad98c3366f647f190363a0e6b576dbfc7d415 
> 
> Diff: https://reviews.apache.org/r/32140/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 32140: Enabled 'Resources' to handle 'Resource::ReservationInfo'.

2015-04-22 Thread Michael Park


> On April 15, 2015, 5:36 p.m., Alexander Rukletsov wrote:
> > src/tests/mesos.hpp, lines 376-378
> > 
> >
> > Why not putting the definition into `tests/mesos.cpp`? Here and below.

I kept it consistent with other utility functions that live in this file: 
`createTask`, `createDiskInfo`, `createPersistentVolume`. I think we can fix 
this as a batch in a separate patch.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32140/#review80201
---


On April 22, 2015, 8:36 p.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32140/
> ---
> 
> (Updated April 22, 2015, 8:36 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Ben Mahler, and Jie Yu.
> 
> 
> Bugs: MESOS-2476
> https://issues.apache.org/jira/browse/MESOS-2476
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> `Resource::ReservationInfo` was introduced in 
> [r32139](https://reviews.apache.org/r/32139). We need to consider it in our 
> `Resources` class implementation.
> 
> ### `Resources::flatten`
> 
> `flatten` is used as the utility function to cross reservation boundaries 
> without affecting the given resources. Since the reservation is now specified 
> by the (`role`, `reservation`) pair, `flatten` needs to consider 
> `ReservationInfo` as well.
> 
> ### `Resources::validate`
> 
> If `role == "*"`, then `reservation` field must not be set.
> 
> ### `Resources` comparators
> 
> `operator==`, `addable` and `substractable` need to test for 
> `ReservationInfo` as well.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 56affd45e1e71e96ff5778b43271f81b9b9a9e4d 
>   src/common/resources.cpp 2c99b6884d7296099e19e2e3182cbe11b5e1e059 
>   src/tests/mesos.hpp 4294e28f904161b473c0cfa7feea4eaa4e7c97e3 
>   src/tests/resources_tests.cpp 7e0ad98c3366f647f190363a0e6b576dbfc7d415 
> 
> Diff: https://reviews.apache.org/r/32140/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 32140: Enabled 'Resources' to handle 'Resource::ReservationInfo'.

2015-04-22 Thread Michael Park


> On April 22, 2015, 12:14 p.m., Alexander Rukletsov wrote:
> > src/common/resources.cpp, lines 44-49
> > 
> >
> > Won't this be equivalent to the auto-generated version?

I'm not sure what auto-generated version you're referring to? Protobuf doesn't 
auto-generate `operator==`.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32140/#review81143
---


On April 22, 2015, 8:36 p.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32140/
> ---
> 
> (Updated April 22, 2015, 8:36 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Ben Mahler, and Jie Yu.
> 
> 
> Bugs: MESOS-2476
> https://issues.apache.org/jira/browse/MESOS-2476
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> `Resource::ReservationInfo` was introduced in 
> [r32139](https://reviews.apache.org/r/32139). We need to consider it in our 
> `Resources` class implementation.
> 
> ### `Resources::flatten`
> 
> `flatten` is used as the utility function to cross reservation boundaries 
> without affecting the given resources. Since the reservation is now specified 
> by the (`role`, `reservation`) pair, `flatten` needs to consider 
> `ReservationInfo` as well.
> 
> ### `Resources::validate`
> 
> If `role == "*"`, then `reservation` field must not be set.
> 
> ### `Resources` comparators
> 
> `operator==`, `addable` and `substractable` need to test for 
> `ReservationInfo` as well.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 56affd45e1e71e96ff5778b43271f81b9b9a9e4d 
>   src/common/resources.cpp 2c99b6884d7296099e19e2e3182cbe11b5e1e059 
>   src/tests/mesos.hpp 4294e28f904161b473c0cfa7feea4eaa4e7c97e3 
>   src/tests/resources_tests.cpp 7e0ad98c3366f647f190363a0e6b576dbfc7d415 
> 
> Diff: https://reviews.apache.org/r/32140/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Review Request 33448: Added Eclipse-specific files to the gitignore list

2015-04-22 Thread Marco Massenzio

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33448/
---

Review request for mesos and Joris Van Remoortere.


Repository: mesos


Description
---

Added Eclipse-specific files to the gitignore list


Diffs
-

  .gitignore-template 1a8e2a8677a29f06ba6386d56537a0013d38821c 

Diff: https://reviews.apache.org/r/33448/diff/


Testing
---

N/A


Thanks,

Marco Massenzio



Re: Review Request 32140: Enabled 'Resources' to handle 'Resource::ReservationInfo'.

2015-04-22 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32140/
---

(Updated April 22, 2015, 8:36 p.m.)


Review request for mesos, Alexander Rukletsov, Ben Mahler, and Jie Yu.


Changes
---

Addressed some of the comments from AlexR, Tim and Jie.


Bugs: MESOS-2476
https://issues.apache.org/jira/browse/MESOS-2476


Repository: mesos


Description
---

`Resource::ReservationInfo` was introduced in 
[r32139](https://reviews.apache.org/r/32139). We need to consider it in our 
`Resources` class implementation.

### `Resources::flatten`

`flatten` is used as the utility function to cross reservation boundaries 
without affecting the given resources. Since the reservation is now specified 
by the (`role`, `reservation`) pair, `flatten` needs to consider 
`ReservationInfo` as well.

### `Resources::validate`

If `role == "*"`, then `reservation` field must not be set.

### `Resources` comparators

`operator==`, `addable` and `substractable` need to test for `ReservationInfo` 
as well.


Diffs (updated)
-

  include/mesos/resources.hpp 56affd45e1e71e96ff5778b43271f81b9b9a9e4d 
  src/common/resources.cpp 2c99b6884d7296099e19e2e3182cbe11b5e1e059 
  src/tests/mesos.hpp 4294e28f904161b473c0cfa7feea4eaa4e7c97e3 
  src/tests/resources_tests.cpp 7e0ad98c3366f647f190363a0e6b576dbfc7d415 

Diff: https://reviews.apache.org/r/32140/diff/


Testing
---

make check


Thanks,

Michael Park



Re: Review Request 33376: MESOS-2633 Moved struct Framework methods to their own implementation class.

2015-04-22 Thread Marco Massenzio

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33376/
---

(Updated April 22, 2015, 8:35 p.m.)


Review request for mesos and Joris Van Remoortere.


Changes
---

reverted changes


Bugs: MESOS-2633
https://issues.apache.org/jira/browse/MESOS-2633


Repository: mesos


Description
---

Created new file framework.cpp containing all the methods' implementations for 
the `Framework` class (declared in master/master.hpp)

Declared `operator ==` for Task in type_utils.hpp 
(it was implemented before, but not declared in the header file);

Refactored all the LOG(WARNING) to a single utility method.


Diffs (updated)
-

  include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
  src/Makefile.am 93c7c8a807a33ab639be6289535bbd32022aa85b 
  src/master/framework.cpp PRE-CREATION 
  src/master/master.hpp 550d2c50cd01aa5830a7e7e03ab4a0240c221b15 

Diff: https://reviews.apache.org/r/33376/diff/


Testing
---

All tests (make check) pass.


Thanks,

Marco Massenzio



Re: Review Request 32975: MESOS-1790 Adds chown option to CommandInfo.URI

2015-04-22 Thread Jim Klucar


> On April 8, 2015, 9:57 p.m., Vinod Kone wrote:
> > src/tests/fetcher_tests.cpp, lines 112-116
> > 
> >
> > Add a note/todo here mentioning why you are setting these fields but 
> > not doing any asserts/expects on them.
> 
> Jim Klucar wrote:
> With the type_utils.cpp change, the EXPECT_EQ on lines 140/141 should be 
> checking the chown field. Should I still add text regarding how we can't 
> check this capability without assuming the user running the test has 
> permission to chown to some other user?
> 
> Vinod Kone wrote:
> Yes. Because the EXPECT on #140 just checks that the chown field is 
> properly propagated to fetcher environment but not that fetcher actually 
> skips chown if chown is false.
> 
> If you write a FetcherTest (see examples later in the file) that actually 
> runs a fetcher process, would you able to test the semantics? I'm thinking 
> that you can create a non-existent user (random string) and set chown to 
> false. Once fetcher process finishes, you can check that the owner of the 
> fetched file hasn't changed to the user. Does that make sense?
> 
> Jim Klucar wrote:
> This makes sense. I'll create the test, back out the change, see the test 
> fail, then add the change back in.
> 
> Jim Klucar wrote:
> I ended up creating two new tests. Both extract a framework, one sets 
> chown true and expects a failure due to the unknown user name. The other sets 
> chown to false and expects the fetcher to finish without error.
> 
> Adam B wrote:
> FYI, if you prefix the test name with `ROOT_` then it will only be run 
> when `make check` is run as root. That way you can provide tests that 
> exercise the chown permissions.

I wasn't aware of that. I'll give it a shot. Is there a standard way other 
tests have done similar things? For instance, create a new user, destroy the 
user, assume some other non-root user exists, etc.


- Jim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32975/#review79433
---


On April 21, 2015, 5 p.m., Jim Klucar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32975/
> ---
> 
> (Updated April 21, 2015, 5 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-1790
> https://issues.apache.org/jira/browse/MESOS-1790
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added chown to CommandInfo.URI protocol buffer as an optional
> boolean that defaults to true, the current chown behavior.
> 
> The fetcher was updated to skip the os::chown operation if the chown
> boolean is set to false.
> 
> No documentation was updated.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 3c592d5ab3092ecbeddfaff95e0c1addc3ac58f8 
>   src/common/type_utils.cpp e92f6f36de0955784619029a016667b46bbe221b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
> 
> Diff: https://reviews.apache.org/r/32975/diff/
> 
> 
> Testing
> ---
> 
> Unit testing this functionality is difficult because it would require that 
> the user running the test to have permission to chown a file to someone other 
> than themselves. I didn't want to add that as a requirement to build. I added 
> the new field to the existing test cases just to see that they populate.
> 
> 
> Thanks,
> 
> Jim Klucar
> 
>



Re: Review Request 33446: Link to mesos-containerizer from the main documentation page.

2015-04-22 Thread Ben Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33446/#review81208
---

Ship it!


Thanks James!

- Ben Mahler


On April 22, 2015, 7:22 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33446/
> ---
> 
> (Updated April 22, 2015, 7:22 p.m.)
> 
> 
> Review request for mesos and Joerg Schad.
> 
> 
> Bugs: MESOS-2524
> https://issues.apache.org/jira/browse/MESOS-2524
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Link to mesos-containerizer from the main documentation page.
> 
> 
> Diffs
> -
> 
>   docs/home.md 6ab61f85aa7d62e0f19267b886dffb4e0aa826ea 
> 
> Diff: https://reviews.apache.org/r/33446/diff/
> 
> 
> Testing
> ---
> 
> Build the Mesos documentation and verify the link is corect using ``middleman 
> server``.
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 33275: Fix capture by reference of temporaries in Stout.

2015-04-22 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33275/#review81206
---


Patch looks great!

Reviews applied: [33271, 33272, 33275]

All tests passed.

- Mesos ReviewBot


On April 22, 2015, 6:11 p.m., Joris Van Remoortere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33275/
> ---
> 
> (Updated April 22, 2015, 6:11 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
> Park, and Till Toenshoff.
> 
> 
> Bugs: MESOS-2630
> https://issues.apache.org/jira/browse/MESOS-2630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> see summary.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/3rdparty/stout/include/stout/cache.hpp 
> 37e007e9ee3010d9aacf2de6254abb75b716412d 
>   3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp 
> fb383b463a99924483634eebf22bf34de318f920 
>   3rdparty/libprocess/3rdparty/stout/include/stout/format.hpp 
> e0b60fe2766f0cb5681b5c55c5ff4ed61cd6a4a7 
>   3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 
> 334c898906018be6e663f53815abbe047806b95c 
>   3rdparty/libprocess/3rdparty/stout/include/stout/multihashmap.hpp 
> ecda6a941077b3e4f9585674b0b5d00d35c5aa1a 
>   3rdparty/libprocess/3rdparty/stout/include/stout/multimap.hpp 
> 9714ba0668e5bd91c215eaf3c3e8728feb41887f 
>   3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp 
> 02db3c587e3f9a40282405e9496bde30e251f8bb 
>   3rdparty/libprocess/3rdparty/stout/include/stout/os/linux.hpp 
> 7a5573be8d8be537e5fd85b86e322fd5765c54ab 
>   3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp 
> 7d17249cf1d14e9344112dd7fef7230c2ad8 
>   3rdparty/libprocess/3rdparty/stout/include/stout/os/pstree.hpp 
> e7fe077f763518680ba13d609dba9ea26950210d 
>   3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp 
> 6728ad8109dfd5e490c0f63684fbf05d0e1fd1b2 
>   3rdparty/libprocess/3rdparty/stout/include/stout/strings.hpp 
> 46b79bc5c11cb1878ffccb4c0c8bb8a49ee9832d 
>   3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 
> 343f95be7f316170b37c9358627f3c2090f0e29e 
> 
> Diff: https://reviews.apache.org/r/33275/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Joris Van Remoortere
> 
>



Re: Review Request 33446: Link to mesos-containerizer from the main documentation page.

2015-04-22 Thread Joerg Schad

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33446/#review81204
---


LGTM


docs/home.md


Should we mention here that this is the default containerizer?


- Joerg Schad


On April 22, 2015, 7:22 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33446/
> ---
> 
> (Updated April 22, 2015, 7:22 p.m.)
> 
> 
> Review request for mesos and Joerg Schad.
> 
> 
> Bugs: MESOS-2524
> https://issues.apache.org/jira/browse/MESOS-2524
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Link to mesos-containerizer from the main documentation page.
> 
> 
> Diffs
> -
> 
>   docs/home.md 6ab61f85aa7d62e0f19267b886dffb4e0aa826ea 
> 
> Diff: https://reviews.apache.org/r/33446/diff/
> 
> 
> Testing
> ---
> 
> Build the Mesos documentation and verify the link is corect using ``middleman 
> server``.
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 33446: Link to mesos-containerizer from the main documentation page.

2015-04-22 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33446/
---

(Updated April 22, 2015, 7:22 p.m.)


Review request for mesos and Joerg Schad.


Bugs: MESOS-2524
https://issues.apache.org/jira/browse/MESOS-2524


Repository: mesos


Description
---

Link to mesos-containerizer from the main documentation page.


Diffs
-

  docs/home.md 6ab61f85aa7d62e0f19267b886dffb4e0aa826ea 

Diff: https://reviews.apache.org/r/33446/diff/


Testing
---

Build the Mesos documentation and verify the link is corect using ``middleman 
server``.


Thanks,

James Peach



Review Request 33446: Link to mesos-containerizer from the main documentation page.

2015-04-22 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33446/
---

Review request for mesos and Joerg Schad.


Bugs: TS-2524
https://issues.apache.org/jira/browse/TS-2524


Repository: mesos


Description
---

Link to mesos-containerizer from the main documentation page.


Diffs
-

  docs/home.md 6ab61f85aa7d62e0f19267b886dffb4e0aa826ea 

Diff: https://reviews.apache.org/r/33446/diff/


Testing
---

Build the Mesos documentation and verify the link is corect using ``middleman 
server``.


Thanks,

James Peach



Re: Review Request 33376: MESOS-2633 Moved struct Framework methods to their own implementation class.

2015-04-22 Thread Marco Massenzio


> On April 21, 2015, 3:27 a.m., Joris Van Remoortere wrote:
> > .gitignore-template, lines 22-33
> > 
> >
> > I think this got added to your commit accidentaly? If you want to add 
> > this to the repository we should make a separate JIRA and review for this.
> 
> Marco Massenzio wrote:
> it was not accidental - but, really: do you need a review to add ignore 
> files? it seems a bit overkill to me.
> 
> Joris Van Remoortere wrote:
> See my high-level comment.

reverted changes


> On April 21, 2015, 3:27 a.m., Joris Van Remoortere wrote:
> > src/Makefile.am, lines 302-304
> > 
> >
> > I don't think you meant to change the spacing of all these lines 
> > (highlighted and rest of file). Can you check your editor settings to see 
> > if you're modifying them accidentally?
> 
> Marco Massenzio wrote:
> the right-hand side continuation \ was all over the place and looked 
> honestly awful: I follow the "boyscout principle" and just lined up quite a 
> few of them around the *one* change I made (adding `framework.cpp`) - 
> hopefully this should upset no one?
> 
> Joris Van Remoortere wrote:
> I think we might have different whitespace modifiers in our editors. 
> Let's sync up as in my high-level comment.

reverted


- Marco


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33376/#review80894
---


On April 21, 2015, 7:06 p.m., Marco Massenzio wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33376/
> ---
> 
> (Updated April 21, 2015, 7:06 p.m.)
> 
> 
> Review request for mesos and Joris Van Remoortere.
> 
> 
> Bugs: MESOS-2633
> https://issues.apache.org/jira/browse/MESOS-2633
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Created new file framework.cpp containing all the methods' implementations 
> for the `Framework` class (declared in master/master.hpp)
> 
> Declared `operator ==` for Task in type_utils.hpp 
> (it was implemented before, but not declared in the header file);
> 
> Refactored all the LOG(WARNING) to a single utility method.
> 
> 
> Diffs
> -
> 
>   .gitignore-template 1a8e2a8677a29f06ba6386d56537a0013d38821c 
>   include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
>   src/Makefile.am d15a37365bcdd5c3906160b46b389635b38b1673 
>   src/master/framework.cpp PRE-CREATION 
>   src/master/master.hpp c10e7c08c191acef9d31d98018a47a2a952a4dfc 
> 
> Diff: https://reviews.apache.org/r/33376/diff/
> 
> 
> Testing
> ---
> 
> All tests (make check) pass.
> 
> 
> Thanks,
> 
> Marco Massenzio
> 
>



Re: Review Request 33276: Fix capture by reference of temporaries in Libprocess.

2015-04-22 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33276/#review81202
---


Patch looks great!

Reviews applied: [33271, 33274, 33276]

All tests passed.

- Mesos ReviewBot


On April 22, 2015, 6:11 p.m., Joris Van Remoortere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33276/
> ---
> 
> (Updated April 22, 2015, 6:11 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
> Park, and Till Toenshoff.
> 
> 
> Bugs: MESOS-2631
> https://issues.apache.org/jira/browse/MESOS-2631
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/gmock.hpp 
> a99eb8ae8fedef2b7d9e227afb3c101f82f753db 
>   3rdparty/libprocess/src/http.cpp eed42ad2725d56ef8c000c1bdedcdc6063ad28c9 
>   3rdparty/libprocess/src/process.cpp 
> 97ac09fd10b767bc46387abc3657d005a00830c8 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> eb38edce2c483ba7f963a826893a15a075238618 
>   3rdparty/libprocess/src/tests/reap_tests.cpp 
> a18d54c43fba44d084e722abdc67d32e5d4504e4 
> 
> Diff: https://reviews.apache.org/r/33276/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Joris Van Remoortere
> 
>



Re: Review Request 33271: Update style guide to disallow capturing temporaries by reference.

2015-04-22 Thread Joris Van Remoortere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33271/
---

(Updated April 22, 2015, 6:36 p.m.)


Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
Park, and Till Toenshoff.


Changes
---

verbage tweak


Bugs: MESOS-2629
https://issues.apache.org/jira/browse/MESOS-2629


Repository: mesos


Description
---

Follow up from r32630.


Diffs (updated)
-

  docs/mesos-c++-style-guide.md de1b93e9f210cd20db2355bd666991339fa4f50b 

Diff: https://reviews.apache.org/r/33271/diff/


Testing
---


Thanks,

Joris Van Remoortere



Re: Review Request 33271: Update style guide to disallow capturing temporaries by reference.

2015-04-22 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33271/#review81196
---



docs/mesos-c++-style-guide.md


`s/an explanation/the rationale`


- Michael Park


On April 22, 2015, 6:11 p.m., Joris Van Remoortere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33271/
> ---
> 
> (Updated April 22, 2015, 6:11 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
> Park, and Till Toenshoff.
> 
> 
> Bugs: MESOS-2629
> https://issues.apache.org/jira/browse/MESOS-2629
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Follow up from r32630.
> 
> 
> Diffs
> -
> 
>   docs/mesos-c++-style-guide.md de1b93e9f210cd20db2355bd666991339fa4f50b 
> 
> Diff: https://reviews.apache.org/r/33271/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Joris Van Remoortere
> 
>



Re: Review Request 33109: Allow setting environment variables in mesos-execute

2015-04-22 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33109/#review81195
---


Patch looks great!

Reviews applied: [33109]

All tests passed.

- Mesos ReviewBot


On April 22, 2015, 5:38 p.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33109/
> ---
> 
> (Updated April 22, 2015, 5:38 p.m.)
> 
> 
> Review request for mesos and Adam B.
> 
> 
> Bugs: MESOS-2023
> https://issues.apache.org/jira/browse/MESOS-2023
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Allow setting environment variables in mesos-execute
> 
> 
> Diffs
> -
> 
>   include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
>   src/cli/execute.cpp 84f70dccbc2c5dd43f68105d967f4488c82f582b 
>   src/common/parse.hpp 547b32041f39f0ff0c38179b66a32b2239134abc 
> 
> Diff: https://reviews.apache.org/r/33109/diff/
> 
> 
> Testing
> ---
> 
> ### Test without env
> 
> ```
> $ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a'
> ```
> 
> The log from mesos, we could see the output is empty
> 
> ```
> Registered executor on localhostI0423 01:27:37.180140 14061 slave.cpp:1540] 
> Sending queued task 'test' to executor 'test' of framework 
> 20150423-012731-16777343-5050-14024-
> Starting task test
> Forked command at 14133
> sh -c 'echo $a'
> 
> I0423 01:27:37.183823 14064 slave.cpp:2507] Handling status update 
> TASK_RUNNING (UUID: be8664ab-a4ca-409e-87e4-de3756506c4a) for task test of 
> framework 20150423-012731-16777343-5050-14024- from 
> executor(1)@127.0.0.1:33700
> ```
> 
> ### Test with env from command
> 
> ```
> $ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo 
> $a' --env="{\"a\": \"stdin\"}"
> ```
> 
> The log from mesos, we could see the output is "stdin"
> 
> ```
> Registered executor on localhost
> Starting task test
> sh -c 'echo $a'
> Forked command at 14783
> "stdin"
> I0423 01:30:41.750128 14063 slave.cpp:2507] Handling status update 
> TASK_RUNNING (UUID: e8b6cf52-9635-4393-94de-b16869e50c91) for task test of 
> framework 20150423-012731-16777343-5050-14024-0001 from 
> executor(1)@127.0.0.1:40051
> ```
> 
> ### Test with env from file
> 
> ```
> $ cat /tmp/env
> {"a": "file"}
> ```
> 
> ```
> $ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo 
> $a' --env="file:///tmp/env"
> ```
> 
> The log from mesos, we could see the output is "file"
> 
> ```
> Registered executor on localhost
> Starting task testsh -c 'echo $a'
> Forked command at 15258
> "file"
> I0423 01:32:49.817705 14065 slave.cpp:2507] Handling status update 
> TASK_RUNNING (UUID: 9e8945d0-4921-41e8-a8e2-d64ec36eea16) for task test of 
> framework 20150423-012731-16777343-5050-14024-0002 from 
> executor(1)@127.0.0.1:56334
> ```
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 33276: Fix capture by reference of temporaries in Libprocess.

2015-04-22 Thread Joris Van Remoortere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33276/
---

(Updated April 22, 2015, 6:11 p.m.)


Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
Park, and Till Toenshoff.


Bugs: MESOS-2631
https://issues.apache.org/jira/browse/MESOS-2631


Repository: mesos


Description
---

See summary.


Diffs
-

  3rdparty/libprocess/include/process/gmock.hpp 
a99eb8ae8fedef2b7d9e227afb3c101f82f753db 
  3rdparty/libprocess/src/http.cpp eed42ad2725d56ef8c000c1bdedcdc6063ad28c9 
  3rdparty/libprocess/src/process.cpp 97ac09fd10b767bc46387abc3657d005a00830c8 
  3rdparty/libprocess/src/tests/process_tests.cpp 
eb38edce2c483ba7f963a826893a15a075238618 
  3rdparty/libprocess/src/tests/reap_tests.cpp 
a18d54c43fba44d084e722abdc67d32e5d4504e4 

Diff: https://reviews.apache.org/r/33276/diff/


Testing
---

make check


Thanks,

Joris Van Remoortere



Re: Review Request 33275: Fix capture by reference of temporaries in Stout.

2015-04-22 Thread Joris Van Remoortere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33275/
---

(Updated April 22, 2015, 6:11 p.m.)


Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
Park, and Till Toenshoff.


Bugs: MESOS-2630
https://issues.apache.org/jira/browse/MESOS-2630


Repository: mesos


Description
---

see summary.


Diffs
-

  3rdparty/libprocess/3rdparty/stout/include/stout/cache.hpp 
37e007e9ee3010d9aacf2de6254abb75b716412d 
  3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp 
fb383b463a99924483634eebf22bf34de318f920 
  3rdparty/libprocess/3rdparty/stout/include/stout/format.hpp 
e0b60fe2766f0cb5681b5c55c5ff4ed61cd6a4a7 
  3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 
334c898906018be6e663f53815abbe047806b95c 
  3rdparty/libprocess/3rdparty/stout/include/stout/multihashmap.hpp 
ecda6a941077b3e4f9585674b0b5d00d35c5aa1a 
  3rdparty/libprocess/3rdparty/stout/include/stout/multimap.hpp 
9714ba0668e5bd91c215eaf3c3e8728feb41887f 
  3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp 
02db3c587e3f9a40282405e9496bde30e251f8bb 
  3rdparty/libprocess/3rdparty/stout/include/stout/os/linux.hpp 
7a5573be8d8be537e5fd85b86e322fd5765c54ab 
  3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp 
7d17249cf1d14e9344112dd7fef7230c2ad8 
  3rdparty/libprocess/3rdparty/stout/include/stout/os/pstree.hpp 
e7fe077f763518680ba13d609dba9ea26950210d 
  3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp 
6728ad8109dfd5e490c0f63684fbf05d0e1fd1b2 
  3rdparty/libprocess/3rdparty/stout/include/stout/strings.hpp 
46b79bc5c11cb1878ffccb4c0c8bb8a49ee9832d 
  3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 
343f95be7f316170b37c9358627f3c2090f0e29e 

Diff: https://reviews.apache.org/r/33275/diff/


Testing
---

make check


Thanks,

Joris Van Remoortere



Re: Review Request 33272: Fix capture by reference of temporary strings in Stout.

2015-04-22 Thread Joris Van Remoortere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33272/
---

(Updated April 22, 2015, 6:11 p.m.)


Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
Park, and Till Toenshoff.


Bugs: MESOS-2630
https://issues.apache.org/jira/browse/MESOS-2630


Repository: mesos


Description
---

See summary.


Diffs
-

  3rdparty/libprocess/3rdparty/stout/include/stout/bytes.hpp 
0e8385b106bd8383525df4344a4e49b4df79e0ee 
  3rdparty/libprocess/3rdparty/stout/include/stout/duration.hpp 
8a1626cc7c3fa855f57f531903d0c39e621fff1f 
  3rdparty/libprocess/3rdparty/stout/include/stout/flags/fetch.hpp 
1df982c623ad195b483a0034559e6a488df62ee7 
  3rdparty/libprocess/3rdparty/stout/tests/flags_tests.cpp 
00281195b53d2597bdb46e3fe6cd9d46a5e9b1f1 
  3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 
343f95be7f316170b37c9358627f3c2090f0e29e 

Diff: https://reviews.apache.org/r/33272/diff/


Testing
---

make check


Thanks,

Joris Van Remoortere



Re: Review Request 33274: Fix capture by reference of temporary strings in Libprocess.

2015-04-22 Thread Joris Van Remoortere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33274/
---

(Updated April 22, 2015, 6:11 p.m.)


Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
Park, and Till Toenshoff.


Bugs: MESOS-2631
https://issues.apache.org/jira/browse/MESOS-2631


Repository: mesos


Description
---

See summary.


Diffs
-

  3rdparty/libprocess/src/help.cpp 85e1bdec8d7e8f46477d0f3d88847baeca2dcc9c 
  3rdparty/libprocess/src/process.cpp 97ac09fd10b767bc46387abc3657d005a00830c8 
  3rdparty/libprocess/src/tests/decoder_tests.cpp 
efe364a5de38c4d39fe42df243df173e7d04479a 
  3rdparty/libprocess/src/tests/encoder_tests.cpp 
784a2c71a2a7d349726f8e85bab3888b2c405aba 
  3rdparty/libprocess/src/tests/http_tests.cpp 
dfdb233de93552ab7040647ee1958a85d47c9482 
  3rdparty/libprocess/src/tests/process_tests.cpp 
eb38edce2c483ba7f963a826893a15a075238618 
  3rdparty/libprocess/src/tests/subprocess_tests.cpp 
dea4ed8699c6c7cf5b4f5d3f014577c474dcac95 

Diff: https://reviews.apache.org/r/33274/diff/


Testing
---

make check


Thanks,

Joris Van Remoortere



Re: Review Request 33271: Update style guide to disallow capturing temporaries by reference.

2015-04-22 Thread Joris Van Remoortere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33271/
---

(Updated April 22, 2015, 6:11 p.m.)


Review request for mesos, Bernd Mathiske, Cody Maloney, Joerg Schad, Michael 
Park, and Till Toenshoff.


Bugs: MESOS-2629
https://issues.apache.org/jira/browse/MESOS-2629


Repository: mesos


Description
---

Follow up from r32630.


Diffs
-

  docs/mesos-c++-style-guide.md de1b93e9f210cd20db2355bd666991339fa4f50b 

Diff: https://reviews.apache.org/r/33271/diff/


Testing
---


Thanks,

Joris Van Remoortere



Re: Review Request 32975: MESOS-1790 Adds chown option to CommandInfo.URI

2015-04-22 Thread Adam B


> On April 8, 2015, 2:57 p.m., Vinod Kone wrote:
> > src/tests/fetcher_tests.cpp, lines 112-116
> > 
> >
> > Add a note/todo here mentioning why you are setting these fields but 
> > not doing any asserts/expects on them.
> 
> Jim Klucar wrote:
> With the type_utils.cpp change, the EXPECT_EQ on lines 140/141 should be 
> checking the chown field. Should I still add text regarding how we can't 
> check this capability without assuming the user running the test has 
> permission to chown to some other user?
> 
> Vinod Kone wrote:
> Yes. Because the EXPECT on #140 just checks that the chown field is 
> properly propagated to fetcher environment but not that fetcher actually 
> skips chown if chown is false.
> 
> If you write a FetcherTest (see examples later in the file) that actually 
> runs a fetcher process, would you able to test the semantics? I'm thinking 
> that you can create a non-existent user (random string) and set chown to 
> false. Once fetcher process finishes, you can check that the owner of the 
> fetched file hasn't changed to the user. Does that make sense?
> 
> Jim Klucar wrote:
> This makes sense. I'll create the test, back out the change, see the test 
> fail, then add the change back in.
> 
> Jim Klucar wrote:
> I ended up creating two new tests. Both extract a framework, one sets 
> chown true and expects a failure due to the unknown user name. The other sets 
> chown to false and expects the fetcher to finish without error.

FYI, if you prefix the test name with `ROOT_` then it will only be run when 
`make check` is run as root. That way you can provide tests that exercise the 
chown permissions.


- Adam


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32975/#review79433
---


On April 21, 2015, 10 a.m., Jim Klucar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32975/
> ---
> 
> (Updated April 21, 2015, 10 a.m.)
> 
> 
> Review request for mesos.
> 
> 
> Bugs: MESOS-1790
> https://issues.apache.org/jira/browse/MESOS-1790
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added chown to CommandInfo.URI protocol buffer as an optional
> boolean that defaults to true, the current chown behavior.
> 
> The fetcher was updated to skip the os::chown operation if the chown
> boolean is set to false.
> 
> No documentation was updated.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 3c592d5ab3092ecbeddfaff95e0c1addc3ac58f8 
>   src/common/type_utils.cpp e92f6f36de0955784619029a016667b46bbe221b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
> 
> Diff: https://reviews.apache.org/r/32975/diff/
> 
> 
> Testing
> ---
> 
> Unit testing this functionality is difficult because it would require that 
> the user running the test to have permission to chown a file to someone other 
> than themselves. I didn't want to add that as a requirement to build. I added 
> the new field to the existing test cases just to see that they populate.
> 
> 
> Thanks,
> 
> Jim Klucar
> 
>



Re: Review Request 33249: Send statusUpdate to scheduler on containerizer launch failure

2015-04-22 Thread Jie Yu


> On April 21, 2015, 11:25 p.m., Jie Yu wrote:
> > src/slave/slave.cpp, lines 3065-3078
> > 
> >
> > Instead of doing that in your way, can we just try to make sure 
> > `containerizer->wait` here will return a failure (or a Termination with 
> > some reason) when `containerizer->launch` fails. In that way, the 
> > `executorTerminated` will properly send status updates to the slave 
> > (TASK_LOST/TASK_FAILED).
> > 
> > Or am I missing something?
> 
> Jie Yu wrote:
> OK, I think I got confused by the ticket. There are actually two problems 
> here. The problem I am refering to is the fact that we don't send status 
> update to the scheduler if containerizer launch fails until executor 
> reregistration timeout happens. Since for docker containerizer, someone might 
> use a very large timeout value, ideally, the slave should send a status 
> update to the scheduler right after containerizer launch fails.
> 
> After chat with Jay, the problem you guys are refering to is the fact 
> that the scheduler cannot disinguish between the case where the task has 
> failed vs. the case where the configuration of a task is not correct, because 
> in both cases, the scheduler will receive a TASK_FAILED/TASK_LOST.
> 
> Jie Yu wrote:
> To address the first problem, I think the simplest way is to add a 
> containerizer->destroy(..) in executorLaunched when containerizer->launch 
> fails. In that way, it's going to trigger containerizer->wait and thus send 
> status update to the scheduler.
> 
> Jie Yu wrote:
> Regarding the second problem, IMO, we should include a reason field in 
> Termination (https://issues.apache.org/jira/browse/MESOS-2035) and let 
> sendExecutorTerminatedStatusUpdate to propagate the termination reason to the 
> scheduler.
> 
> Timothy Chen wrote:
> Reason field sounds good, I think what you proposed makes sense, in 
> docker containerizer at least we also need to make sure termination message 
> is set correctly as currently it doesn't contain all the error information 
> that we pass back to the launch future.
> 
> Jay Buffington wrote:
> Sorry for the confusion.   There are three problems that are all related. 
>  Yes, we need to send statusUpdates as soon as containerizer launch fails and 
> yes, we need to set the reason so the scheduler can distinguish a slave 
> failure from a bad request.  However, my intention with this patch is not to 
> address either of those two problems.
> 
> My goal with this patch is to simply send the containerizer launch 
> failure message back to the scheduler.  I am using Aurora with the docker 
> containerizer.  There are a myriad of reasons that dockerd could fail to 
> "docker run" a container.  Currently, when that fails the user sees a useless 
> and incorrect "Abnormal Executor Termination" message in the Aurora web ui.  
> With this patch they see the stderr output of "docker run."  This way they 
> can report meaningful error messages to the operations team.
> 
> I can update this patch to address the other two issues, but the key is 
> that when the containerizer launch fails we have to send a statusUpdate with 
> a message that contains future.failure().  Before this patch we were only 
> logging it.  The scheduler needs to get that error message.

Thanks for clarifying it, Jay! In fact, my proposal above should be able to 
solve the third problem cleanly. Check out 
`Slave::sendExecutorTerminatedStatusUpdate`. The containerizer should properly 
set the message and reason fields in the Termination protobuf (basically why 
the container gets terminated and what's the error message). The slave will 
forward the reason and message to the scheduler through status update.

I chatted with BenM about this yesterday, and there are a couple of notes I 
want to write down here.

1. We probably need multiple levels for TaskStatus::Reason. In other words, we 
probably need a "repeated Reason reasons" field in status update message. For 
instance, for a containerizer launch failure, we probably need two reasons: 1) 
the top level reason REASON_EXECUTOR_TERMINATED; 2) the second level reason 
REASON_DOCKER_PULL_FAILURE;

2. We probably want to allow extension to TaskStatus::Reason enum. For example, 
some framework/executor may want to add customized reasons. We could leverage 
the protobuf extension support to achieve that 
(https://developers.google.com/protocol-buffers/docs/proto#extensions).

3. The semantics around Termination is broken currently and we need to clean it 
up. For instance, the following code is broken,
```
void Slave::sendExecutorTerminatedStatusUpdate(...)
{
  ...
  if (termination.isReady() && termination.get().killed()) {
taskState = TASK_FAILED;
// TODO(dhamon): MESOS-2035: Add 'reason' to containerizer::Termination.
reason = TaskStatus::REASON_MEMORY_LIMIT;
  }
}
```
because we now have disk limit a

Re: Review Request 29889: Recover Docker containers when mesos slave is in a container

2015-04-22 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29889/
---

(Updated April 22, 2015, 5:33 p.m.)


Review request for mesos and Benjamin Hindman.


Repository: mesos


Description
---

This is a one mega patch and contains many reviews that's already on rb.

This review is not meant to be merged, only provided for easier review.


Diffs (updated)
-

  Dockerfile 35abf25 
  docs/configuration.md 54c4e31 
  docs/docker-containerizer.md a5438b7 
  src/Makefile.am 93c7c8a 
  src/docker/docker.hpp 3ebbc1f 
  src/docker/docker.cpp 3a485a2 
  src/docker/executor.cpp PRE-CREATION 
  src/slave/containerizer/docker.hpp 0d5289c 
  src/slave/containerizer/docker.cpp f9fb078 
  src/slave/flags.hpp d3b1ce1 
  src/slave/flags.cpp d0932b0 
  src/tests/docker_containerizer_tests.cpp b119a17 

Diff: https://reviews.apache.org/r/29889/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 33109: Allow setting environment variables in mesos-execute

2015-04-22 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33109/
---

(Updated April 22, 2015, 5:38 p.m.)


Review request for mesos and Adam B.


Bugs: MESOS-2023
https://issues.apache.org/jira/browse/MESOS-2023


Repository: mesos


Description
---

Allow setting environment variables in mesos-execute


Diffs (updated)
-

  include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
  src/cli/execute.cpp 84f70dccbc2c5dd43f68105d967f4488c82f582b 
  src/common/parse.hpp 547b32041f39f0ff0c38179b66a32b2239134abc 

Diff: https://reviews.apache.org/r/33109/diff/


Testing
---

### Test without env

```
$ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a'
```

The log from mesos, we could see the output is empty

```
Registered executor on localhostI0423 01:27:37.180140 14061 slave.cpp:1540] 
Sending queued task 'test' to executor 'test' of framework 
20150423-012731-16777343-5050-14024-
Starting task test
Forked command at 14133
sh -c 'echo $a'

I0423 01:27:37.183823 14064 slave.cpp:2507] Handling status update TASK_RUNNING 
(UUID: be8664ab-a4ca-409e-87e4-de3756506c4a) for task test of framework 
20150423-012731-16777343-5050-14024- from executor(1)@127.0.0.1:33700
```

### Test with env from command

```
$ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a' 
--env="{\"a\": \"stdin\"}"
```

The log from mesos, we could see the output is "stdin"

```
Registered executor on localhost
Starting task test
sh -c 'echo $a'
Forked command at 14783
"stdin"
I0423 01:30:41.750128 14063 slave.cpp:2507] Handling status update TASK_RUNNING 
(UUID: e8b6cf52-9635-4393-94de-b16869e50c91) for task test of framework 
20150423-012731-16777343-5050-14024-0001 from executor(1)@127.0.0.1:40051
```

### Test with env from file

```
$ cat /tmp/env
{"a": "file"}
```

```
$ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a' 
--env="file:///tmp/env"
```

The log from mesos, we could see the output is "file"

```
Registered executor on localhost
Starting task testsh -c 'echo $a'
Forked command at 15258
"file"
I0423 01:32:49.817705 14065 slave.cpp:2507] Handling status update TASK_RUNNING 
(UUID: 9e8945d0-4921-41e8-a8e2-d64ec36eea16) for task test of framework 
20150423-012731-16777343-5050-14024-0002 from executor(1)@127.0.0.1:56334
```


Thanks,

haosdent huang



Re: Review Request 29889: Recover Docker containers when mesos slave is in a container

2015-04-22 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29889/#review81180
---


Bad patch!

Reviews applied: [29327, 29328]

Failed command: ./support/apply-review.sh -n -r 29328

Error:
 2015-04-22 17:37:07 URL:https://reviews.apache.org/r/29328/diff/raw/ 
[2149/2149] -> "29328.patch" [1]
error: patch failed: src/slave/flags.hpp:311
error: src/slave/flags.hpp: patch does not apply
Failed to apply patch

- Mesos ReviewBot


On April 22, 2015, 5:33 p.m., Timothy Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29889/
> ---
> 
> (Updated April 22, 2015, 5:33 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Bernd Mathiske, and Till 
> Toenshoff.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This is a one mega patch and contains many reviews that's already on rb.
> 
> This review is not meant to be merged, only provided for easier review.
> 
> 
> Diffs
> -
> 
>   Dockerfile 35abf25 
>   docs/configuration.md 54c4e31 
>   docs/docker-containerizer.md a5438b7 
>   src/Makefile.am 93c7c8a 
>   src/docker/docker.hpp 3ebbc1f 
>   src/docker/docker.cpp 3a485a2 
>   src/docker/executor.cpp PRE-CREATION 
>   src/slave/containerizer/docker.hpp 0d5289c 
>   src/slave/containerizer/docker.cpp f9fb078 
>   src/slave/flags.hpp d3b1ce1 
>   src/slave/flags.cpp d0932b0 
>   src/tests/docker_containerizer_tests.cpp b119a17 
> 
> Diff: https://reviews.apache.org/r/29889/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>



Re: Review Request 33109: Allow setting environment variables in mesos-execute

2015-04-22 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33109/
---

(Updated April 22, 2015, 5:37 p.m.)


Review request for mesos and Adam B.


Bugs: MESOS-2023
https://issues.apache.org/jira/browse/MESOS-2023


Repository: mesos


Description
---

Allow setting environment variables in mesos-execute


Diffs (updated)
-

  include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
  src/cli/execute.cpp 84f70dccbc2c5dd43f68105d967f4488c82f582b 
  src/common/parse.hpp 547b32041f39f0ff0c38179b66a32b2239134abc 

Diff: https://reviews.apache.org/r/33109/diff/


Testing
---

### Test without env

```
$ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a'
```

The log from mesos, we could see the output is empty

```
Registered executor on localhostI0423 01:27:37.180140 14061 slave.cpp:1540] 
Sending queued task 'test' to executor 'test' of framework 
20150423-012731-16777343-5050-14024-
Starting task test
Forked command at 14133
sh -c 'echo $a'

I0423 01:27:37.183823 14064 slave.cpp:2507] Handling status update TASK_RUNNING 
(UUID: be8664ab-a4ca-409e-87e4-de3756506c4a) for task test of framework 
20150423-012731-16777343-5050-14024- from executor(1)@127.0.0.1:33700
```

### Test with env from command

```
$ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a' 
--env="{\"a\": \"stdin\"}"
```

The log from mesos, we could see the output is "stdin"

```
Registered executor on localhost
Starting task test
sh -c 'echo $a'
Forked command at 14783
"stdin"
I0423 01:30:41.750128 14063 slave.cpp:2507] Handling status update TASK_RUNNING 
(UUID: e8b6cf52-9635-4393-94de-b16869e50c91) for task test of framework 
20150423-012731-16777343-5050-14024-0001 from executor(1)@127.0.0.1:40051
```

### Test with env from file

```
$ cat /tmp/env
{"a": "file"}
```

```
$ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a' 
--env="file:///tmp/env"
```

The log from mesos, we could see the output is "file"

```
Registered executor on localhost
Starting task testsh -c 'echo $a'
Forked command at 15258
"file"
I0423 01:32:49.817705 14065 slave.cpp:2507] Handling status update TASK_RUNNING 
(UUID: 9e8945d0-4921-41e8-a8e2-d64ec36eea16) for task test of framework 
20150423-012731-16777343-5050-14024-0002 from executor(1)@127.0.0.1:56334
```


Thanks,

haosdent huang



Re: Review Request 33109: Allow setting environment variables in mesos-execute

2015-04-22 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33109/
---

(Updated April 22, 2015, 5:35 p.m.)


Review request for mesos and Adam B.


Bugs: MESOS-2023
https://issues.apache.org/jira/browse/MESOS-2023


Repository: mesos


Description
---

Allow setting environment variables in mesos-execute


Diffs
-

  include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
  src/cli/execute.cpp 84f70dccbc2c5dd43f68105d967f4488c82f582b 
  src/common/parse.hpp 547b32041f39f0ff0c38179b66a32b2239134abc 

Diff: https://reviews.apache.org/r/33109/diff/


Testing (updated)
---

### Test without env

```
$ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a'
```

The log from mesos, we could see the output is empty

```
Registered executor on localhostI0423 01:27:37.180140 14061 slave.cpp:1540] 
Sending queued task 'test' to executor 'test' of framework 
20150423-012731-16777343-5050-14024-
Starting task test
Forked command at 14133
sh -c 'echo $a'

I0423 01:27:37.183823 14064 slave.cpp:2507] Handling status update TASK_RUNNING 
(UUID: be8664ab-a4ca-409e-87e4-de3756506c4a) for task test of framework 
20150423-012731-16777343-5050-14024- from executor(1)@127.0.0.1:33700
```

### Test with env from command

```
$ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a' 
--env="{\"a\": \"stdin\"}"
```

The log from mesos, we could see the output is "stdin"

```
Registered executor on localhost
Starting task test
sh -c 'echo $a'
Forked command at 14783
"stdin"
I0423 01:30:41.750128 14063 slave.cpp:2507] Handling status update TASK_RUNNING 
(UUID: e8b6cf52-9635-4393-94de-b16869e50c91) for task test of framework 
20150423-012731-16777343-5050-14024-0001 from executor(1)@127.0.0.1:40051
```

### Test with env from file

```
$ cat /tmp/env
{"a": "file"}
```

```
$ ./src/mesos-execute  --master=127.0.0.1:5050 --name=test --command='echo $a' 
--env="file:///tmp/env"
```

The log from mesos, we could see the output is "file"

```
Registered executor on localhost
Starting task testsh -c 'echo $a'
Forked command at 15258
"file"
I0423 01:32:49.817705 14065 slave.cpp:2507] Handling status update TASK_RUNNING 
(UUID: 9e8945d0-4921-41e8-a8e2-d64ec36eea16) for task test of framework 
20150423-012731-16777343-5050-14024-0002 from executor(1)@127.0.0.1:56334
```


Thanks,

haosdent huang



Re: Review Request 33249: Send statusUpdate to scheduler on containerizer launch failure

2015-04-22 Thread Jay Buffington


> On April 21, 2015, 4:25 p.m., Jie Yu wrote:
> > src/slave/slave.cpp, lines 3065-3078
> > 
> >
> > Instead of doing that in your way, can we just try to make sure 
> > `containerizer->wait` here will return a failure (or a Termination with 
> > some reason) when `containerizer->launch` fails. In that way, the 
> > `executorTerminated` will properly send status updates to the slave 
> > (TASK_LOST/TASK_FAILED).
> > 
> > Or am I missing something?
> 
> Jie Yu wrote:
> OK, I think I got confused by the ticket. There are actually two problems 
> here. The problem I am refering to is the fact that we don't send status 
> update to the scheduler if containerizer launch fails until executor 
> reregistration timeout happens. Since for docker containerizer, someone might 
> use a very large timeout value, ideally, the slave should send a status 
> update to the scheduler right after containerizer launch fails.
> 
> After chat with Jay, the problem you guys are refering to is the fact 
> that the scheduler cannot disinguish between the case where the task has 
> failed vs. the case where the configuration of a task is not correct, because 
> in both cases, the scheduler will receive a TASK_FAILED/TASK_LOST.
> 
> Jie Yu wrote:
> To address the first problem, I think the simplest way is to add a 
> containerizer->destroy(..) in executorLaunched when containerizer->launch 
> fails. In that way, it's going to trigger containerizer->wait and thus send 
> status update to the scheduler.
> 
> Jie Yu wrote:
> Regarding the second problem, IMO, we should include a reason field in 
> Termination (https://issues.apache.org/jira/browse/MESOS-2035) and let 
> sendExecutorTerminatedStatusUpdate to propagate the termination reason to the 
> scheduler.
> 
> Timothy Chen wrote:
> Reason field sounds good, I think what you proposed makes sense, in 
> docker containerizer at least we also need to make sure termination message 
> is set correctly as currently it doesn't contain all the error information 
> that we pass back to the launch future.

Sorry for the confusion.   There are three problems that are all related.  Yes, 
we need to send statusUpdates as soon as containerizer launch fails and yes, we 
need to set the reason so the scheduler can distinguish a slave failure from a 
bad request.  However, my intention with this patch is not to address either of 
those two problems.

My goal with this patch is to simply send the containerizer launch failure 
message back to the scheduler.  I am using Aurora with the docker 
containerizer.  There are a myriad of reasons that dockerd could fail to 
"docker run" a container.  Currently, when that fails the user sees a useless 
and incorrect "Abnormal Executor Termination" message in the Aurora web ui.  
With this patch they see the stderr output of "docker run."  This way they can 
report meaningful error messages to the operations team.

I can update this patch to address the other two issues, but the key is that 
when the containerizer launch fails we have to send a statusUpdate with a 
message that contains future.failure().  Before this patch we were only logging 
it.  The scheduler needs to get that error message.


- Jay


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33249/#review81090
---


On April 21, 2015, 10:14 a.m., Jay Buffington wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33249/
> ---
> 
> (Updated April 21, 2015, 10:14 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
> 
> 
> Bugs: MESOS-2020
> https://issues.apache.org/jira/browse/MESOS-2020
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When mesos is unable to launch the containerizer the scheduler should
> get a TASK_FAILED with a status message that includes the error the
> containerizer encounted when trying to launch.
> 
> Introduces a new TaskStatus: REASON_CONTAINERIZER_LAUNCH_FAILED
> 
> Fixes MESOS-2020
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 3a8e8bf303e0576c212951f6028af77e54d93537 
>   src/slave/slave.cpp 8ec80ed26f338690e0a1e712065750ab77a724cd 
>   src/tests/slave_tests.cpp b826000e0a4221690f956ea51f49ad4c99d5e188 
> 
> Diff: https://reviews.apache.org/r/33249/diff/
> 
> 
> Testing
> ---
> 
> I added test case to slave_test.cpp.  I also tried this with Aurora, supplied 
> a bogus docker image url and saw the "docker pull" failure stderr message in 
> Aurora's web UI.
> 
> 
> Thanks,
> 
> Jay Buffington
> 
>