-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38876/
-----------------------------------------------------------

(Updated Nov. 10, 2015, 2:22 p.m.)


Review request for mesos, Ben Mahler, Isabel Jimenez, and Vinod Kone.


Changes
-------

Address comments from BenM + Updated Description


Bugs: MESOS-3515
    https://issues.apache.org/jira/browse/MESOS-3515


Repository: mesos


Description (updated)
-------

This change adds functions to `src/slave/paths.hpp` and `src/slave/paths.cpp` 
for storing a marker file that denotes HTTP-based executors. We create the file 
as part of handling the `Subscribe` request when checkpointing is enabled. On 
recovery, the agent uses the marker to determine whether the executor was 
connected via `HTTP` before the agent restarted.
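
As a rough illustration of the idea (not the code in this patch), the marker 
can be an empty file in the executor's checkpoint directory whose mere presence 
records "this executor subscribed over HTTP". The file name `http.marker`, the 
helper names, and the use of `std::filesystem` below are assumptions made for 
the sketch; the actual helpers live in `src/slave/paths.hpp` and may differ.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical marker file name, chosen only for this sketch.
const char kHttpMarkerFile[] = "http.marker";

// Builds the marker path inside an executor's checkpoint directory.
fs::path httpMarkerPath(const fs::path& executorRunDir) {
  return executorRunDir / kHttpMarkerFile;
}

// Creates the (empty) marker while handling a subscribe request.
// Does nothing and returns false when checkpointing is disabled.
bool checkpointHttpMarker(const fs::path& executorRunDir, bool checkpointing) {
  if (!checkpointing) {
    return false;
  }

  assert(!executorRunDir.empty());

  std::error_code ec;
  fs::create_directories(executorRunDir, ec);
  if (ec) {
    return false;
  }

  // Opening the file for writing is enough; its contents are irrelevant.
  std::ofstream marker(httpMarkerPath(executorRunDir));
  return marker.good();
}

// Used on recovery: true if the executor was connected via HTTP
// before the restart (that is, the marker was checkpointed).
bool wasHttpExecutor(const fs::path& executorRunDir) {
  std::error_code ec;
  return fs::exists(httpMarkerPath(executorRunDir), ec);
}
```

Because only the file's existence matters, recovery needs no parsing step: a 
single existence check on the path is sufficient.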

-- Detailed Explanation of Changes (not to be included in the commit message)
This marker file is used when recovering HTTP-based executors (assuming 
framework checkpointing is enabled). The agent currently supports the following 
recovery options.

1. `--recover=cleanup`: If the `PID` marker file is not found, the current 
behavior is to directly destroy the container the executor was running in. With 
this `HTTP` marker file, we can now check whether the executor was previously 
connected via HTTP and, if so, send it an `Event::SHUTDOWN` when it retries the 
`Subscribe` call.
2. `--recover=reconnect`: If the `PID` marker file is not found, the current 
behavior is to just `LOG` that we were unable to reconnect to the executor. 
With the `HTTP` marker file, we can correctly distinguish a `PID`-based 
executor that failed to checkpoint its PID from an `HTTP`-based executor. An 
example: 
https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L4177
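
The two cases above boil down to a small decision table for an executor whose 
`PID` marker is missing. The sketch below only restates that table; the enum 
and function names are hypothetical and do not correspond to identifiers in 
`slave.cpp`, and the `TreatAsHttpExecutor` outcome is merely a label for "this 
is an HTTP-based executor", since this change does not spell out a follow-up 
action for it.

```cpp
#include <cassert>

// Recovery option the agent was started with (hypothetical names).
enum class RecoverMode { Cleanup, Reconnect };

// Outcome for an executor whose PID marker file was not found.
enum class Action {
  DestroyContainer,       // cleanup, no HTTP marker: existing behavior
  ShutdownOnResubscribe,  // cleanup, HTTP marker: send Event::SHUTDOWN later
  LogUnableToReconnect,   // reconnect, no HTTP marker: existing behavior
  TreatAsHttpExecutor     // reconnect, HTTP marker: not a failed PID checkpoint
};

// Maps (recovery mode, HTTP marker present) to the outcome described above.
Action actionWithoutPidMarker(RecoverMode mode, bool httpMarkerExists) {
  switch (mode) {
    case RecoverMode::Cleanup:
      return httpMarkerExists ? Action::ShutdownOnResubscribe
                              : Action::DestroyContainer;
    case RecoverMode::Reconnect:
      return httpMarkerExists ? Action::TreatAsHttpExecutor
                              : Action::LogUnableToReconnect;
  }

  assert(false && "unhandled RecoverMode");
  return Action::DestroyContainer;
}
```

Without the `HTTP` marker, both rows of the second column collapse into the 
first, which is the pre-change behavior.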


Diffs (updated)
-----

  src/slave/paths.hpp f743fb4b1ca278fade9134e0ae8f6a6450d4a977 
  src/slave/paths.cpp aab7a4b63f0e7c2104097077369bb10bcd28c6a1 

Diff: https://reviews.apache.org/r/38876/diff/


Testing
-------

make check


Thanks,

Anand Mazumdar
