RE: cluster confusion after zookeeper blip

2015-05-18 Thread Nikolay Borodachev
Have you tried restarting the Marathon and Mesos processes? Once you restart them, they should pick up ZooKeeper, elect leaders, etc. If you're using Docker containers, they should reattach themselves to the respective slaves. Thanks, Nikolay -Original Message- From: rasput...@gmail.com

Re: cluster confusion after zookeeper blip

2015-05-18 Thread Dick Davies
Thanks Nikolay - I checked that the framework ID in ZooKeeper (/marathon/state/frameworkId) matched the one attached to the running tasks, restarted the old Marathon leader, and everything reconnected OK (we did have to disable our service discovery pieces to avoid getting empty JSON back when
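
The check described above (comparing the framework ID stored in ZooKeeper with the one attached to the running tasks) could be sketched roughly as follows. This is only an illustration, not what was actually run: it assumes the master state has already been fetched and parsed into a dict shaped like the Mesos master's state.json output (a "frameworks" list whose entries carry "tasks" with "id" and "framework_id" fields), and the function name, sample IDs, and sample data are all hypothetical.

```python
def tasks_with_other_framework(state, expected_framework_id):
    """Return the IDs of tasks whose framework_id differs from the expected one.

    `state` is assumed to look like the parsed master state.json:
    {"frameworks": [{"id": ..., "tasks": [{"id": ..., "framework_id": ...}]}]}
    """
    mismatched = []
    for framework in state.get("frameworks", []):
        for task in framework.get("tasks", []):
            if task.get("framework_id") != expected_framework_id:
                mismatched.append(task.get("id"))
    return mismatched


# Hypothetical sample data; real IDs and task names will differ.
sample_state = {
    "frameworks": [
        {
            "id": "fw-A",
            "name": "marathon",
            "tasks": [{"id": "task-1", "framework_id": "fw-A"}],
        },
        {
            "id": "fw-B",
            "name": "marathon",
            "tasks": [{"id": "task-2", "framework_id": "fw-B"}],
        },
    ]
}

# "fw-A" stands in for the value read from /marathon/state/frameworkId.
orphans = tasks_with_other_framework(sample_state, "fw-A")
print(orphans)
```

An empty result would suggest the running tasks still belong to the framework ID that Marathon has stored, so a restarted Marathon should be able to reconnect to them.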

Re: mesos slave doesn't pick up tasks after restart

2015-05-18 Thread Cody Maloney
Running the Mesos slave inside a Docker container with working slave task recovery isn't supported at the moment. See: https://issues.apache.org/jira/browse/MESOS-2115 On Mon, May 18, 2015 at 4:47 AM, Grzegorz Graczyk gregor...@gmail.com wrote: 3-node cluster CoreOS 675.0.0 Mesos 0.22.1

Re: mesos slave doesn't pick up tasks after restart

2015-05-18 Thread Grzegorz Graczyk
Thanks a lot! :) I couldn’t find any corresponding issue. On 18 May 2015, at 19:37, Cody Maloney c...@mesosphere.io wrote: Running mesos slave inside of a docker container and having working slave task recovery isn't supported at the moment. See:

Re: make[3]: *** [check-local] Aborted (core dumped) in make test

2015-05-18 Thread haosdent
@Joerg Maurer I could not reproduce your problem on CentOS. According to this ticket [https://issues.apache.org/jira/browse/MESOS-2744], @Colin Williams also could not reproduce your problem on Ubuntu with kernel 3.13.0-35-generic. Could you confirm that the problem still exists in the latest code? Thank you

Re: Medallia powered by Mesos

2015-05-18 Thread Adam Bordelon
I have added Medallia to the Mesos adopters list. It will show up in the next website update. Thanks for using Mesos! See you at MesosCon? On Sun, May 17, 2015 at 4:56 PM, Anirudha Jadhav aniru...@nyu.edu wrote: +1 On Mon, May 18, 2015 at 2:15 AM, Mauricio Garavaglia mauri...@medallia.com

cluster confusion after zookeeper blip

2015-05-18 Thread Dick Davies
We run a 3 node marathon cluster on top of 3 mesos masters + 6 slaves. (mesos 0.21.0, marathon 0.7.5) This morning we had a network outage long enough for everything to lose zookeeper. Now our marathon UI is empty (all 3 marathons think someone else is a master, and marathons 'proxy to leader'

mesos slave doesn't pick up tasks after restart

2015-05-18 Thread Grzegorz Graczyk
3-node cluster CoreOS 675.0.0 Mesos 0.22.1 Marathon 0.8.2-RC2 Everything is run in containers; the mesos slave is run using the command: /usr/bin/docker run \ --rm \ --net=host \ --pid=host \ --name slave \ -v /data/server/mesos-slave:/data/mesos-slave \ -v /root/.dockercfg:/etc/.dockercfg \ --privileged \