Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Todd Lipcon,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/13702
to look at the new patch set (#4).
Change subject: KUDU-2192: Enable TCP keepalive for all outbound connections
......................................................................
KUDU-2192: Enable TCP keepalive for all outbound connections
This change enables TCP keepalive for all outbound connections.
This handles cases in which the remote peer drops off the network
without sending a TCP RST. For instance, a remote host could hit a
kernel panic and get power cycled, in which case the existing TCP
connection to that host becomes stale. In an idle cluster, this stale
connection may not be detected until its next use, which then results
in an RPC failure due to the TCP RST sent by the restarted peer.
By enabling TCP keepalive, we ensure that stale TCP connections
in an idle cluster will be detected and closed within a time bound
so a new connection will be created on the next use. This change
introduces 3 different flags:
--tcp_keepalive_probe_period_s: the duration in seconds a TCP connection
has to be idle before keepalive probes start being sent.
--tcp_keepalive_retry_period_s: the duration in seconds between successive
keepalive probes if previous probes didn't get an ACK from the remote peer.
--tcp_keepalive_retry_count: the maximum number of TCP keepalive probes
sent without an ACK before declaring the remote peer as dead.
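As a rough illustration of how the three flags could map onto Linux socket options, here is a minimal sketch. It is not the code from this patch: the KeepaliveConfig struct and both function names are hypothetical, and the mapping to TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT is an assumption based on the flag descriptions above. The options used are Linux-specific.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cerrno>

// Hypothetical holder for the three flag values. The field comments name
// the flag each field is assumed to correspond to.
struct KeepaliveConfig {
  int probe_period_s;  // --tcp_keepalive_probe_period_s
  int retry_period_s;  // --tcp_keepalive_retry_period_s
  int retry_count;     // --tcp_keepalive_retry_count
};

// Enables keepalive on 'fd' and applies the three tunables.
// Returns 0 on success, or the errno of the first failing setsockopt().
int EnableTcpKeepalive(int fd, const KeepaliveConfig& cfg) {
  const int on = 1;
  if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0) {
    return errno;
  }
  // Idle time before the first probe is sent.
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                 &cfg.probe_period_s, sizeof(cfg.probe_period_s)) != 0) {
    return errno;
  }
  // Interval between probes that have not been ACKed.
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                 &cfg.retry_period_s, sizeof(cfg.retry_period_s)) != 0) {
    return errno;
  }
  // Number of unACKed probes before the connection is considered dead.
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                 &cfg.retry_count, sizeof(cfg.retry_count)) != 0) {
    return errno;
  }
  return 0;
}

// Approximate upper bound, in seconds, on how long an idle connection to a
// dead peer can linger before the kernel closes it.
int WorstCaseDetectionSeconds(const KeepaliveConfig& cfg) {
  return cfg.probe_period_s + cfg.retry_period_s * cfg.retry_count;
}
```

With illustrative values of 60, 3 and 10 (not necessarily the defaults chosen in this patch), a stale idle connection would be closed after roughly 60 + 3 * 10 = 90 seconds.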
Testing:
- Used tcpdump to verify that keepalive probes are being sent periodically.
- Verified that blocking all incoming traffic to a server's port via an iptables
rule caused the keepalive probes to go unanswered and the TCP connection to
eventually be closed, after which the probes stopped.
Change-Id: Iaa1d66d83aea1cc82d07fc6217be5fc1306695bc
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
M src/kudu/rpc/rpc-test.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/net/socket.h
7 files changed, 112 insertions(+), 2 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/13702/4
--
To view, visit http://gerrit.cloudera.org:8080/13702
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaa1d66d83aea1cc82d07fc6217be5fc1306695bc
Gerrit-Change-Number: 13702
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <[email protected]>