Re: mesos ui best practise - mesos cluster in HA

2015-12-02 Thread Jeremy Olexa
This is what we use in haproxy:

backend master_cluster
    option httpclose
    option forwardfor
    mode http
    option httpchk GET /metrics/snapshot
    http-check expect string master\/elected":1
    server master-0 ip1:5050 check
    server master-1 ip2:5050 check
    server master-2 ip3:5050 check
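For anyone who wants to reproduce that health check outside haproxy, here is a minimal sketch of the same test in Python. It assumes the /metrics/snapshot body is a flat JSON object keyed by metric name, which is what the expected string in the haproxy check implies. The function name is_elected_master is hypothetical and not part of Mesos or haproxy. Note that haproxy does a substring match on the raw body, while this sketch parses the JSON and compares the value.

```python
import json


def is_elected_master(snapshot_body: str) -> bool:
    """Return True if a /metrics/snapshot JSON body reports master/elected == 1.

    Hypothetical helper: assumes the body is a flat JSON object keyed by
    metric name, as the haproxy expect string suggests.
    """
    try:
        metrics = json.loads(snapshot_body)
    except ValueError:
        # Body was not valid JSON, so treat the master as not elected.
        return False
    if not isinstance(metrics, dict):
        return False
    # JSON escapes "/" as "\/" optionally; json.loads decodes both forms.
    return metrics.get("master/elected") == 1
```

Only the master for which this returns True would be marked healthy, so traffic for the UI goes to the elected leader.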

Re: Mesos and Zookeeper TCP keepalive

2015-11-13 Thread Jeremy Olexa
pshot/backup” also. thanks Jojy > On Nov 11, 2015, at 6:04 PM, Jeremy Olexa > <jol...@spscommerce.com> wrote: > > Hi Joris, all, > > We are still at the default timeout values for those that you linked. In the > meantime, since the

Re: Mesos and Zookeeper TCP keepalive

2015-11-11 Thread Jeremy Olexa
nvironment to allow re-registration by the agent after the agent notices it needs to re-establish the connection. Joris Van Remoortere, Mesosphere. On Tue, Nov 10, 2015 at 5:02 AM, Jeremy Olexa <jol...@spscommerce.com> wrote: Hi Tommy, Erik, all,

Re: Mesos and Zookeeper TCP keepalive

2015-11-10 Thread Jeremy Olexa
ures are a valid state in distributed systems. If you think there is a special case you are trying to solve, I suggest proposing a design document for review. For ZK client code, I would suggest asking the zookeeper mailing list. thanks -Jojy On Nov 9, 2015, at 7:56 PM, Jeremy Olexa <jol...@

Re: Mesos and Zookeeper TCP keepalive

2015-11-09 Thread Jeremy Olexa
Mesos and Zookeeper TCP keepalive Hi Jeremy The "network" code is at "3rdparty/libprocess/include/process/network.hpp" , "3rdparty/libprocess/src/poll_socket.hpp/cpp". thanks jojy On Nov 9, 2015, at 6:54 AM, Jeremy Olexa <jol...@spscommerce.com<mailto:jol.

Mesos and Zookeeper TCP keepalive

2015-11-07 Thread Jeremy Olexa
Hello all, We have been fighting some network/session disconnection issues between datacenters and I'm curious if there is any way to enable TCP keepalive on the zookeeper/mesos sockets? If there was a way, then the sysctl TCP kernel settings would be used. I believe keepalive has to be
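As background for the question, this is a minimal sketch of what enabling TCP keepalive on a socket looks like at the sockets API level, written in Python. It is a generic illustration and not Mesos or ZooKeeper code; whether either project exposes such an option is not established in this thread. The helper name enable_keepalive is hypothetical. Once SO_KEEPALIVE is set, a Linux kernel applies its net.ipv4.tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes sysctl values unless the application overrides them per socket.

```python
import socket


def enable_keepalive(sock: socket.socket) -> int:
    """Turn on TCP keepalive for one socket and return the resulting option value.

    Hypothetical helper for illustration only. Keepalive is opt-in per
    socket; the kernel sysctl defaults only take effect after this call.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Read the option back so the caller can confirm it is now non-zero.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
```

The point of the sketch is that the application owning the socket has to make this call; the sysctl settings alone do not turn keepalive on for existing programs.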

Re: upgrade from 0.24.1 to 0.25

2015-10-15 Thread Jeremy Olexa
Hi Craig, it was posted on the marathon email list that 0.11.0 is not safe for production. https://groups.google.com/d/msg/marathon-framework/u4-FKVkh5RQ/wH-s1sdECgAJ From: craig w Sent: Thursday, October 15, 2015 4:13 AM To:

Advice on agent disconnect and reconnect

2015-10-04 Thread Jeremy Olexa
Hello, We have been observing agent process disconnects when our agents run in one datacenter, A, and access the master cluster in another datacenter, B. I would like to mitigate this issue because it ejects all the running applications and then all of the sandbox links, etc,

Master UI - Tasks section is empty

2015-08-23 Thread Jeremy Olexa
Hi all, On a new cluster, the tasks section of the left sidebar is populated as jobs are staged, started, killed, etc. I've noticed that after a rolling restart of the cluster, such as taking a node out for maintenance or restarting instances in an ASG, the Tasks section of the UI stops

Re: Mesos Whitelist syntax

2015-08-13 Thread Jeremy Olexa
: Thursday, August 13, 2015 1:56 AM To: user@mesos.apache.org Subject: Re: Mesos Whitelist syntax Hi, @Jeremy In the whitelist file, you need to add every explicit IP, one per line. If you don't specify --whitelist, or use --whitelist=*, it will accept all IPs. On Thu, Aug 13, 2015 at 6:49 AM, Jeremy Olexa
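The reply above describes a whitelist file with one explicit IP per line. A minimal sketch of generating such a file body in Python follows. The helper name render_whitelist and the addresses in the usage note are hypothetical placeholders, and the sketch assumes nothing about the format beyond what the reply states.

```python
def render_whitelist(ips):
    """Return whitelist file content with one explicit IP per line.

    Hypothetical helper: blank entries are dropped and surrounding
    whitespace is trimmed; no wildcard or range expansion is attempted.
    """
    cleaned = [ip.strip() for ip in ips if ip.strip()]
    # End with a newline only when there is at least one entry.
    return ("\n".join(cleaned) + "\n") if cleaned else ""
```

For example, render_whitelist(["10.0.0.1", "10.0.0.2"]) produces two lines, one address each, which could then be written to the file passed to --whitelist.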