Hi,

I have run into a very strange situation with my framework, which talks
to the Mesos master via the Scheduler HTTP API:

Sometimes my framework stops receiving heartbeats and task updates
from the master.
I read the *Network partitions* section of the Mesos documentation (
http://mesos.apache.org/documentation/latest/scheduler-http-api/), and
I see that if a framework does not receive heartbeats within some time,
it should reconnect to the master.

I have written a heartbeat monitor that checks whether any heartbeats
arrived in the last n seconds and reconnects if none did. However, after
every reconnection I receive an ERROR event from the Mesos master saying
that my framework has been removed.
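
For illustration, the monitor logic described above can be sketched
roughly as follows. This is a minimal Python sketch and not the actual
framework code: the names HeartbeatMonitor, timeout_seconds,
check_connection and the reconnect callback are all hypothetical, and the
HTTP calls to the master are deliberately left out.

```python
import time


class HeartbeatMonitor:
    """Tracks the time of the last heartbeat and reports when it is overdue."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        # timeout_seconds is the "n seconds" window; clock is injectable
        # so the logic can be exercised without real waiting.
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._last_heartbeat = clock()

    def record_heartbeat(self):
        # Call this whenever a HEARTBEAT event arrives on the event stream.
        self._last_heartbeat = self._clock()

    def is_overdue(self):
        # True when no heartbeat was recorded within the last timeout_seconds.
        return self._clock() - self._last_heartbeat > self.timeout_seconds


def check_connection(monitor, reconnect):
    """Invoke the (hypothetical) reconnect callback if heartbeats stopped."""
    if monitor.is_overdue():
        reconnect()
        # Restart the timer so the check does not reconnect in a tight loop.
        monitor.record_heartbeat()
        return True
    return False
```

In this sketch, check_connection would be called periodically (for
example from a timer thread), and reconnect stands in for whatever code
closes the old subscription and opens a new one.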

Why is this happening?

Regards,
Uladzimir
