On Thu, Aug 16, 2018 at 12:44 AM, Olle E. Johansson <[email protected]> wrote:
>
> On 16 Aug 2018, at 09:28, Mikael Abrahamsson <[email protected]> wrote:
>
> On Wed, 15 Aug 2018, Kent Watsen wrote:
>
> You bring up an interesting point, it goes to the motivation for wanting to
> do keepalives in the first place. The text doesn't yet mention maintaining
> flow state as a motivation.
>
> It's not only to maintain flow state, it's also to close the connection when
> the network goes down and doesn't work anymore, and "give up" on connections
> that don't work anymore (for some definition of "anymore").
>
> I have operationally been in the situation where a server/client application
> was implemented so that the server could only handle 256 connections (some
> file descriptor limit). Every time the firewall was rebooted and lost state,
> the connections hung around forever. So the server administrators had to go
> in and restart the process to clear these connections, otherwise there were
> 256 hung connections and no new connections could be established.
>
> Sometimes the other endpoint goes down, and doesn't come back. We will for
> instance deploy home gateways probably keeping netconf-call-home sessions to
> an NMS, and we want them to be around forever, as long as they work. TCP
> level keepalives would solve this, as if the customer just powers off the
> device, after a while the session will be cleared. Using TCP keepalives here
> means you get this kind of behaviour even if the upper-layer application
> doesn't support it (netconf might have been a bad example here). It's a
> single socket option to set, so it's very easy to do.
>
> From knowing approximately what settings people have in their NAT44 and
> firewalls etc, I'd say the recommendation should be that keepalives are set
> to around 60-300 second interval, and then kill the connection if no traffic
> has passed in 3-5 of these intervals.
> Otherwise TCP will have backed off so far anyway that it's probably faster
> to just re-try the connection instead of waiting for TCP to re-send the
> packet.
>
> I have seen so many times in my 20 years working in networking where lack of
> keepalives has caused all kinds of problems. I wish everybody would turn it
> on and keep it on.
>

Olle,

They are already on, TCP has a default keepalive of 2 hrs. The issue that
is inevitably raised is that 2 hrs. is much too long a period for
maintaining NAT state (NAT timeouts are usually far less time). But, as I
pointed out already, sending keepalives at a higher frequency is not devoid
of cost or problems.

Tom

> As more and more connections flow over mobile networks, it seems more and
> more important, even for flows you did not expect. I have to send keepalives
> over IPv6 connections - not for NAT as on IPv4, but for middlebox devices
> that have an interesting approach and attitude towards connection management.
> ;-)
>
> The SIP Outbound RFC has a lot of reasoning behind keep-alives for
> connection failover and may be good input here.
>
> https://tools.ietf.org/html/rfc5626
>
> /O
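
For reference, the "single socket option" and the 60-300 second interval with 3-5 probes discussed in the thread could look roughly like this in Python. This is only a minimal sketch: the helper name enable_keepalive and its default values are illustrative assumptions, not something proposed in the thread. TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are platform-specific (they exist on Linux), so the code only sets them where Python's socket module exposes them.

```python
import socket


def enable_keepalive(sock, idle=60, interval=60, count=5):
    """Turn on TCP keepalives and, where available, tune the timers.

    idle, interval and count are hypothetical defaults chosen from the
    60-300 second / 3-5 probe range suggested in the thread.
    """
    # The single, portable socket option: enable keepalive probes.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timer overrides; not every platform defines these names,
    # so skip any that this Python build does not expose.
    tunables = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before the connection is dropped
    )
    for name, value in tunables:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive enabled:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With these values on Linux, a peer that silently disappears would be detected roughly 60 + 5 * 60 = 360 seconds after traffic stops, instead of after the 2 hour default idle time.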
