Re: Weird behavior when stopping the mesos master leader of a HA mesos cluster

2015-03-17 Thread Geoffroy Jabouley
Thanks a lot, Dario, for the workaround! It works fine and can be scripted with Ansible. For the record, the GitHub issue is available here: https://github.com/mesosphere/marathon/issues/1292 2015-03-12 17:27 GMT+01:00 Dario Rexin da...@mesosphere.io: Hi Geoffrey, we identified the issue and

Re: Weird behavior when stopping the mesos master leader of a HA mesos cluster

2015-03-12 Thread Geoffroy Jabouley
Thanks, Alex, for your answer. I will have a look. Would it be better to (cross-)post this discussion on the Marathon mailing list? Anyway, the issue is fixed for 0.8.0, which is the version I'm using. 2015-03-11 22:18 GMT+01:00 Alex Rukletsov a...@mesosphere.io: Geoffroy, most probably

Re: Weird behavior when stopping the mesos master leader of a HA mesos cluster

2015-03-12 Thread Dario Rexin
Hi Geoffrey, we identified the issue and will fix it in Marathon 0.8.2. To prevent this behaviour for now, you just have to make sure that in a fresh setup (Marathon was never connected to Mesos) you first start up a single Marathon and let it register with Mesos and then start the other
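The ordering described above (one Marathon first, wait for it to register with Mesos, then the rest) could be sketched roughly as follows. This is only an illustration under stated assumptions: the preview is cut off, so it is assumed that "the other" refers to the remaining Marathon instances, and `start_node` and `is_registered` are hypothetical callbacks that you would supply yourself (for example, a service start over SSH and a check against the Mesos master's list of registered frameworks). Nothing here is taken from Marathon or Mesos itself.

```python
import time
from typing import Callable, Sequence


def start_marathon_in_order(
    nodes: Sequence[str],
    start_node: Callable[[str], None],      # hypothetical: starts Marathon on one host
    is_registered: Callable[[str], bool],   # hypothetical: True once that Marathon registered with Mesos
    timeout_s: float = 120.0,
    poll_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Start the first node alone, wait for registration, then start the rest."""
    if not nodes:
        raise ValueError("at least one Marathon node is required")

    first, rest = nodes[0], nodes[1:]
    start_node(first)

    # Poll until the first instance reports that it has registered, or give up.
    waited = 0.0
    while not is_registered(first):
        if waited >= timeout_s:
            raise TimeoutError(f"{first} did not register within {timeout_s} seconds")
        sleep(poll_s)
        waited += poll_s

    # Only now bring up the remaining instances.
    for node in rest:
        start_node(node)
```

Keeping the start and registration checks as injected callbacks leaves the environment-specific parts (service manager, endpoints, credentials) outside the sketch, which is also how the same sequence would map onto an Ansible play: one task for the first host, a wait step, then the remaining hosts.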

Re: Weird behavior when stopping the mesos master leader of a HA mesos cluster

2015-03-10 Thread Geoffroy Jabouley
Hello, thanks for your interest. The requested logs follow, which will result in a pretty big mail. Mesos/Marathon are *NOT running inside docker*; we only use Docker as our Mesos containerizer. As a reminder, here is the use case performed to get the log files:

Re: Weird behavior when stopping the mesos master leader of a HA mesos cluster

2015-03-10 Thread Adam Bordelon
This is certainly not the expected/desired behavior when failing over a mesos master in HA mode. In addition to the master logs Alex requested, can you also provide relevant portions of the slave logs for these tasks? If the slave processes themselves never failed over, checkpointing and slave

Weird behavior when stopping the mesos master leader of a HA mesos cluster

2015-03-06 Thread Geoffroy Jabouley
Hello, we are facing some unexpected issues when testing the high-availability behavior of our Mesos cluster. *Our use case:* *State*: the Mesos cluster is up (3 machines), 1 Docker task is running on each slave (started from Marathon) *Action*: stop the Mesos master leader process *Expected*:

Re: Weird behavior when stopping the mesos master leader of a HA mesos cluster

2015-03-06 Thread Alex Rukletsov
Geoffroy, could you please provide master logs (both from the killed master and the one taking over)? On Fri, Mar 6, 2015 at 4:26 AM, Geoffroy Jabouley geoffroy.jabou...@gmail.com wrote: Hello, we are facing some unexpected issues when testing the high-availability behavior of our Mesos cluster. *Our