Hi Nikolay, could this be the problem?

Apr 27 22:36:00 mesos1 marathon[6289]: **************************************************
Apr 27 22:36:00 mesos1 marathon[6289]: Scheduler driver bound to loopback interface! Cannot communicate with remote master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a routable IP address.
Apr 27 22:36:00 mesos1 marathon[6289]: **************************************************

This would explain why only a certain node (most likely the one that's
running on the same machine as the current Mesos leader) can start tasks.

Cheers,
Dario

> On 27 Apr 2015, at 23:49, Nikolay Borodachev <[email protected]> wrote:
>
> Dario,
>
> The logs are quite lengthy, so I sent them to you directly. Marathon version
> is 0.8.1.
>
> Thank you
> Nikolay
>
> From: Dario Rexin [mailto:[email protected]]
> Sent: Monday, April 27, 2015 4:01 PM
> To: [email protected]
> Subject: Re: Marathon chage of leader and stalled deployments
>
> Hi Nikolay,
>
> this is unexpected behavior. Could you please post the log output from the
> leading node around the time you try to scale? Also, what version of Marathon
> are you running?
>
> Thanks,
> Dario
>
> On 27.04.2015, at 20:41, Nikolay Borodachev <[email protected]> wrote:
>
> Hello All,
>
> I noticed strange behavior in a Marathon cluster. The cluster consists of 3
> mesos/marathon masters and 3 slaves.
>
> Once the cluster is freshly started I can start a process (e.g. httpd) and
> scale it up and down without any problems. Everything works as it should.
> However, if the Marathon leader goes down or gets restarted, the managed
> processes cannot be scaled anymore. The scaling request gets queued but does
> not get executed by the new Marathon leader.
> I found that the scaling request does not move while I recycle the current
> leader, until the original server becomes the leader again.
> Only when the server that was the leader when the tasks were created
> becomes the leader again can these tasks be scaled.
>
> Is this a known and expected behavior?
>
> Thanks
> Nikolay
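
Regarding the LIBPROCESS_IP hint in the log excerpt at the top: a minimal sketch of how one might sanity-check the address before starting Marathon, assuming each node knows its own routable IP. The helper names and the address 10.0.0.11 are made up for illustration; only the LIBPROCESS_IP variable name comes from the log message.

```python
import ipaddress
import os


def is_usable_libprocess_ip(addr: str) -> bool:
    """Return True if addr is an IP address other hosts could reach in principle."""
    ip = ipaddress.ip_address(addr)  # raises ValueError on malformed input
    # Loopback (127.0.0.0/8, ::1) and unspecified (0.0.0.0, ::) addresses
    # cannot be used by remote masters to talk back to the scheduler.
    return not (ip.is_loopback or ip.is_unspecified)


def marathon_env(addr: str) -> dict:
    """Build an environment with LIBPROCESS_IP set (hypothetical helper)."""
    if not is_usable_libprocess_ip(addr):
        raise ValueError(f"{addr} is not routable from other hosts")
    env = dict(os.environ)  # copy, so the current process is left untouched
    env["LIBPROCESS_IP"] = addr
    return env


if __name__ == "__main__":
    # 10.0.0.11 is a placeholder; use the node's real routable address.
    print(marathon_env("10.0.0.11")["LIBPROCESS_IP"])
```

The returned dict could be passed as the `env` argument of `subprocess.Popen` when launching Marathon, or the same variable could simply be exported in whatever init script starts the service on each node.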

