Questions about Framework/Scheduler

2015-07-01 Thread Ying Ji
Hey, I am new to Mesos and have just started to investigate it. I have a
fundamental question about frameworks.

Assume I am using a long-lived framework. How does the Mesos master detect
that the framework is unavailable, for example because of a network error or
an internal error in the framework? (I cannot find it in master.cpp. Could you
please point me to the source code?) The framework has registered
with the master successfully and has been running for a while.

Thanks

Ying


Re: Questions about Framework/Scheduler

2015-07-01 Thread Adam Bordelon
See Master::exited()
https://github.com/apache/mesos/blob/0.22.1/src/master/master.cpp#L878
which derives from Process::exited()
https://github.com/apache/mesos/blob/0.22.1/3rdparty/libprocess/include/process/process.hpp#L55
In the event of a temporary network partition, the Mesos master will
continue trying to send offer/status/etc. messages to the framework
scheduler. Because status messages are delivered reliably (at least once),
they are queued up (per task) on the slave until an acknowledgement is
received from the scheduler.
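
To make the "queued up per task until acknowledged" idea concrete, here is a
minimal sketch in Python. It is not Mesos code: the names StatusUpdateQueue,
deliver, and the send callback are hypothetical, and the sketch assumes that
send() returns True only when the scheduler acknowledged the update.

```python
from collections import deque


class StatusUpdateQueue:
    """Toy per-task queue of status updates awaiting acknowledgement.

    Hypothetical illustration only; not the Mesos slave implementation.
    """

    def __init__(self):
        # task_id -> deque of updates that have not been acknowledged yet
        self.pending = {}

    def enqueue(self, task_id, update):
        self.pending.setdefault(task_id, deque()).append(update)

    def next_to_send(self, task_id):
        """Return the oldest unacknowledged update for the task, or None."""
        queue = self.pending.get(task_id)
        return queue[0] if queue else None

    def acknowledge(self, task_id, update):
        """Drop the head update, but only if the acknowledgement matches it."""
        queue = self.pending.get(task_id)
        if queue and queue[0] == update:
            queue.popleft()
            return True
        return False


def deliver(queue, task_id, send, max_attempts=5):
    """Resend the head update until it is acknowledged, then move on.

    `send` is an assumed callback that returns True when the receiver
    acknowledged the update. Returns False if one update goes
    unacknowledged for `max_attempts` consecutive tries.
    """
    attempts = 0
    while True:
        update = queue.next_to_send(task_id)
        if update is None:
            return True  # nothing left to deliver for this task
        if attempts == max_attempts:
            return False  # give up for now; the update stays queued
        attempts += 1
        if send(update):
            queue.acknowledge(task_id, update)
            attempts = 0  # fresh retry budget for the next update
```

Because an update is only dropped after its acknowledgement arrives, a lost
acknowledgement leads to a resend, so the receiver may see the same update
more than once and should tolerate duplicates.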
