[ https://issues.apache.org/jira/browse/ZOOKEEPER-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037851#comment-16037851 ]
ASF GitHub Bot commented on ZOOKEEPER-1748:
-------------------------------------------
GitHub user bensherman opened a pull request:
https://github.com/apache/zookeeper/pull/274
Zookeeper 1748: Add option for tcp keepalive
As referenced in https://issues.apache.org/jira/browse/ZOOKEEPER-1748 and
https://github.com/apache/zookeeper/pull/83, add the option to use TCP
keepalive on quorum connections. These connections are often idle and
long-lived, so they tend to be silently dropped by intermediate networking
infrastructure (the state tables of AWS Security Groups, for example).
This PR adds the option to use the system's keepalive functionality when
creating the socket for quorum connections.
It does not change existing behavior.
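For readers unfamiliar with the socket option, here is a minimal sketch of what such a switch amounts to in Java. The class name, the `tcpKeepAlive` parameter, and the shape of `setSockOpts` below are assumptions made for illustration and are not the code in this pull request; only `Socket.setKeepAlive` and `Socket.getKeepAlive` are standard JDK calls.

```java
import java.io.IOException;
import java.net.Socket;

// Hypothetical illustration only; not the QuorumCnxManager code from the PR.
public class KeepAliveSketch {

    // Applies socket options before the socket is connected.
    // tcpKeepAlive stands in for the new option; when it is false the
    // socket is left untouched, so existing behavior is preserved.
    static void setSockOpts(Socket sock, boolean tcpKeepAlive) throws IOException {
        if (tcpKeepAlive) {
            // SO_KEEPALIVE asks the operating system to probe an idle
            // connection periodically, which keeps it active in
            // intermediate state tables.
            sock.setKeepAlive(true);
        }
    }

    public static void main(String[] args) throws IOException {
        try (Socket sock = new Socket()) { // created unconnected
            setSockOpts(sock, true);
            System.out.println("SO_KEEPALIVE enabled: " + sock.getKeepAlive());
        }
    }
}
```

Note that Java only toggles the flag. The probe interval comes from the operating system (on Linux, `net.ipv4.tcp_keepalive_time`, which defaults to 7200 seconds), so it may need to be lowered below the idle timeout of the network element that drops the connection.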
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/bensherman/zookeeper ZOOKEEPER-1748
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/zookeeper/pull/274.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #274
----
commit 0f7ece14df76cba5170c0442ebedec31cf6fb4b9
Author: Ben Sherman <[email protected]>
Date: 2017-06-05T21:56:14Z
Added option to use tcp keep alives.
commit 820c628ea05fbdd5477083fa77db55a4f0e8812f
Author: Ben Sherman <[email protected]>
Date: 2017-06-05T22:34:59Z
document tcpKeepAlive option
----
> TCP keepalive for leader election connections
> ---------------------------------------------
>
> Key: ZOOKEEPER-1748
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1748
> Project: ZooKeeper
> Issue Type: Improvement
> Components: leaderElection
> Affects Versions: 3.4.5, 3.5.0
> Environment: Linux, Java 1.7
> Reporter: Antal Sasvári
> Assignee: Daniel Peon
> Priority: Minor
> Fix For: 3.5.4, 3.6.0
>
> Attachments: Zookeeper-1748-add_tcp_keepalive.patch
>
>
> In our system we encountered the following problem:
> If the system is stable and there is no leader election, the leader election
> port connections stay open for a very long time without any packets being
> sent on them.
> Some network elements silently drop the established TCP connection after a
> timeout if there are no packets being sent on it. In this case the ZK servers
> will not notice the connection loss. This causes additional delay later when
> the next leader election is started, as the TCP connections are not alive any
> more.
> We would like to be able to enable TCP keepalive on the leader election
> sockets in order to prevent the connection timeout in some network elements
> due to connection inactivity.
> This could be controlled by adding a new config parameter called tcpKeepAlive
> to the ZooKeeper configuration file. It would apply only to election
> algorithm 3 (TCP-based fast leader election) and would default to false.
> If tcpKeepAlive is set to true, the TCP keepalive flag should be enabled for
> the leader election sockets in QuorumCnxManager.setSockOpts() by calling
> sock.setKeepAlive(true).
> We have tested this change successfully in our environment.
> Please comment whether you see any problem with this. If not, I am going to
> submit a patch.
> I've been told that, for example, Apache ActiveMQ also has a config option
> for a similar purpose, called transport.keepalive.
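
The rule proposed in the issue (a tcpKeepAlive parameter, default false, honored only for election algorithm 3) can be sketched as follows. This is an illustration under assumptions, not the patch attached to the issue: the class and method names are hypothetical, the configuration is modeled as a plain `java.util.Properties` object, and `electionAlg` is assumed to default to 3 when absent.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical illustration of the rule proposed in the issue text.
public class TcpKeepAliveConfigSketch {

    // Returns true only when tcpKeepAlive=true and the election algorithm
    // is 3 (TCP-based fast leader election). A missing tcpKeepAlive entry
    // means false; a missing electionAlg entry is assumed to mean 3.
    static boolean keepAliveEnabled(Properties cfg) {
        int electionAlg = Integer.parseInt(cfg.getProperty("electionAlg", "3").trim());
        boolean tcpKeepAlive =
                Boolean.parseBoolean(cfg.getProperty("tcpKeepAlive", "false").trim());
        return electionAlg == 3 && tcpKeepAlive;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a configuration file; the keys other than
        // tcpKeepAlive are ordinary sample settings.
        String sample = "tickTime=2000\nelectionAlg=3\ntcpKeepAlive=true\n";
        Properties cfg = new Properties();
        cfg.load(new StringReader(sample));
        System.out.println("keepalive enabled: " + keepAliveEnabled(cfg));
    }
}
```

The resulting boolean is what would be consulted before calling sock.setKeepAlive(true) on the leader election sockets.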