Hi Jeremy,

The "network" code is in "3rdparty/libprocess/include/process/network.hpp" and "3rdparty/libprocess/src/poll_socket.hpp/cpp".
Thanks,
Jojy

> On Nov 9, 2015, at 6:54 AM, Jeremy Olexa <[email protected]> wrote:
>
> Hi all,
>
> Jojy, that is correct, but more specifically a keepalive timer from slave to
> master and slave to zookeeper. Can you send a link to the portion of the code
> that builds the socket/connection? Is there any reason not to set the
> SO_KEEPALIVE option, in your opinion?
>
> haosdent, I'm not looking for keepalive between zk quorum members, like the
> ZOOKEEPER JIRA is referencing.
>
> Thanks,
> Jeremy
>
>
> From: Jojy Varghese <[email protected]>
> Sent: Sunday, November 8, 2015 8:37 PM
> To: [email protected]
> Subject: Re: Mesos and Zookeeper TCP keepalive
>
> Hi Jeremy,
> Are you trying to establish a keepalive timer between mesos master and
> mesos slave? If so, I don't believe it's possible today, as the SO_KEEPALIVE
> option is not set on an accepting socket.
>
> -Jojy
>
>> On Nov 8, 2015, at 8:43 AM, haosdent <[email protected]> wrote:
>>
>> I think the keepalive option should be set in Zookeeper, not in Mesos. See
>> this related issue in Zookeeper.
>> https://issues.apache.org/jira/browse/ZOOKEEPER-2246?focusedCommentId=14724085&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14724085
>>
>> On Sun, Nov 8, 2015 at 4:47 AM, Jeremy Olexa <[email protected]> wrote:
>> Hello all,
>>
>> We have been fighting some network/session disconnection issues between
>> datacenters and I'm curious if there is any way to enable tcp keepalive on
>> the zookeeper/mesos sockets? If there was a way, then the sysctl tcp kernel
>> settings would be used. I believe keepalive has to be enabled by the
>> software which is opening the connection. (That is my understanding anyway)
>>
>> Here is what I see via netstat --timers -tn:
>> tcp 0 0 172.18.1.1:55842 10.10.1.1:2181 ESTABLISHED off (0.00/0/0)
>> tcp 0 0 172.18.1.1:49702 10.10.1.1:5050 ESTABLISHED off (0.00/0/0)
>>
>> Where 172 is the mesos-slave network and 10 is the mesos-master network. The
>> "off" keyword means that keepalives are not being sent.
>>
>> I've trolled through JIRA, git, etc and cannot easily determine if this is
>> expected behavior or should be an enhancement request. Any ideas?
>>
>> Thanks much!
>> -Jeremy
>>
>> --
>> Best Regards,
>> Haosdent Huang
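
For reference, the per-socket opt-in discussed in this thread can be sketched as follows. This is only an illustration in Python of the underlying socket option, not the libprocess code. The helper names `enable_keepalive` and `keepalive_enabled` are hypothetical, and the TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT overrides are Linux-specific, so the sketch applies them only when the platform exposes them. With no overrides given, the kernel's sysctl defaults (net.ipv4.tcp_keepalive_* on Linux) should apply.

```python
import socket


def enable_keepalive(sock, idle=None, interval=None, count=None):
    """Turn on TCP keepalive for one socket (hypothetical helper).

    Without SO_KEEPALIVE the kernel sends no probes on this socket,
    regardless of the sysctl settings.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional per-socket overrides of the kernel defaults.
    # Assumption: these constants exist on Linux; skip them elsewhere.
    overrides = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before the connection is dropped
    )
    for name, value in overrides:
        option = getattr(socket, name, None)
        if value is not None and option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


def keepalive_enabled(sock):
    """Report whether SO_KEEPALIVE is currently set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

A connection whose socket was opted in this way would be expected to show a "keepalive" timer in the netstat --timers output instead of "off".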

