Re: Review Request 41285: Initialized `pid` to None() instead of `UPID()` in Slave.

2016-01-19 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41285/
---

(Updated Jan. 19, 2016, 10:28 p.m.)


Review request for mesos, Ben Mahler, Jie Yu, and Vinod Kone.


Changes
---

Updated Depends on and moved this out of the existing review chain. This should 
go in 0.27.


Bugs: MESOS-3550
https://issues.apache.org/jira/browse/MESOS-3550


Repository: mesos


Description
---

This was earlier pointed out by Jie on https://reviews.apache.org/r/38874.

Currently, most of our logic for finding the executor type is based on the 
fields `pid`/`http` in the `Executor` struct. We were erroneously setting the 
initial `pid` value to be `UPID()` instead of being `None()`.

The value of `pid` being `UPID()` is only possible in this scenario:

- We were unable to find the type of executor upon agent recovery i.e. no 
pid/http marker file was found. We then mark this special case as `pid=UPID()`.

This special case helps us while recovery in the following ways:

- If the agent crashed before checkpointing the pid file, both `executor->pid` 
and `executor->http` would be `None()`. This is similar to the case for HTTP 
based executors (pid/http being `None`). In order, to distinguish between these 
two cases, we set the `pid=UPID()`. This helps us in not shutting the HTTP 
executor accidently by destroying the container when the agent is recovered 
using `recovery=cleanup`
- This special value also helps us to do better logging by correctly 
distinguishing between HTTP based executors and agent dying before 
checkpointing the pid/http marker file. ( Both have `pid/http` as `None`)


Diffs
-

  src/slave/slave.cpp 90d0fecd2d83fd174134870a577ac59d79c0006f 

Diff: https://reviews.apache.org/r/41285/diff/


Testing
---

make check


Thanks,

Anand Mazumdar



Re: Review Request 41285: Initialized `pid` to None() instead of `UPID()` in Slave.

2016-01-12 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41285/
---

(Updated Jan. 12, 2016, 5:37 p.m.)


Review request for mesos, Ben Mahler, Jie Yu, and Vinod Kone.


Changes
---

Modified summary.


Summary (updated)
-

Initialized `pid` to None() instead of `UPID()` in Slave.


Bugs: MESOS-3550
https://issues.apache.org/jira/browse/MESOS-3550


Repository: mesos


Description
---

This was earlier pointed out by Jie on https://reviews.apache.org/r/38874 . We 
never observed it as we had no tests for recovery of HTTP executors/receiving 
status updates for them.


Diffs
-

  src/slave/slave.cpp 90d0fecd2d83fd174134870a577ac59d79c0006f 

Diff: https://reviews.apache.org/r/41285/diff/


Testing
---

make check


Thanks,

Anand Mazumdar



Review Request 41285: Initialized `pid` to None() instead of `UPID()` in Slave

2015-12-11 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41285/
---

Review request for mesos, Ben Mahler, Jie Yu, and Vinod Kone.


Bugs: MESOS-3550
https://issues.apache.org/jira/browse/MESOS-3550


Repository: mesos


Description
---

This was earlier pointed out by Jie on https://reviews.apache.org/r/38874 . We 
never observed it as we had no tests for recovery of HTTP executors/receiving 
status updates for them.


Diffs
-

  src/slave/slave.cpp 9bd86e122c070cd072a54d4de8097f844bd95bb0 

Diff: https://reviews.apache.org/r/41285/diff/


Testing
---

make check


Thanks,

Anand Mazumdar