> > If I would like to allow it to restart on any node in a cluster can I use
> > Marathon to simplify the implementation or it warrants more involved
> > implementation using Zoo. Does Mesos provide any other helpers to simplify
> > this use case?

Marathon can do that, but be aware that in some edge cases, such as network partitions, two instances could end up running at the same time. For example, if the Mesos slave running your cache-updating service can't reach the Mesos master because of network problems, the master will after a while consider the slave dead and tell Marathon that the task is lost. Marathon would then launch another instance of your cache-updating service while the original is still running on the partitioned slave. The result is two instances running at the same time.

To avoid this, you can pin the service to a specific slave using the constraints provided by Marathon, as @klaus suggested above, but you would lose the flexibility of running it anywhere in the Mesos cluster. Otherwise you have to use a distributed consensus solution like ZooKeeper.

On Tue, Feb 23, 2016 at 6:06 PM, Petr Novak <[email protected]> wrote:

> Hello,
> if I need to run single stateless instance or only a single leader doing a
> work at any given time. Something I would typically implement using Zoo
> Curator LeaderSelector. Can I use Marathon to ensure this without having to
> implement mutual exclusion myself? Let's assume that other parts of the
> architecture aren't designed well to support more running workers at a time.
>
> Currently we have a service which updates cache, it runs on one node and
> when it fails it is restarted and PID file is used to ensure single
> instance. Pretty naive implementation, possibly doesn't work in all edge
> cases.
>
> If I would like to allow it to restart on any node in a cluster can I use
> Marathon to simplify the implementation or it warrants more involved
> implementation using Zoo. Does Mesos provide any other helpers to simplify
> this use case?
>
> Or is Marathon designed only to run stateless services which can possibly
> run in multiple instances?
>
> Many thanks,
> Petr
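
For reference, here is a rough sketch of the pinning option from the reply: a Marathon app definition with a single instance and a `hostname` `CLUSTER` constraint, built as a Python dict and printed as JSON. The app id `/cache-updater`, the command `./update-cache.sh`, the hostname `agent-1.example.com`, and the resource numbers are all made up for illustration; only the shape of the `constraints` field follows Marathon's documented format.

```python
import json


def pinned_single_instance_app(app_id, cmd, hostname):
    """Build a Marathon app definition that runs one task on one fixed host.

    All argument values are supplied by the caller; nothing here talks to
    Marathon, it only produces the JSON body one would submit.
    """
    return {
        "id": app_id,
        "cmd": cmd,
        "instances": 1,  # only one copy of the service is requested
        "cpus": 0.1,     # placeholder resource values, adjust as needed
        "mem": 64,
        # Tie the task to a single slave by hostname. If that host is
        # unavailable, Marathon cannot place the task anywhere else.
        "constraints": [["hostname", "CLUSTER", hostname]],
    }


# Hypothetical values, used only to show the resulting document.
app = pinned_single_instance_app(
    "/cache-updater", "./update-cache.sh", "agent-1.example.com"
)
print(json.dumps(app, indent=2))
```

The printed JSON is what would be POSTed to Marathon's `/v2/apps` endpoint. Because the task can only ever be placed on that one host, a second copy cannot be started elsewhere during a partition, at the cost of losing failover to other nodes, which is the trade-off described above.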

