On Thu, Aug 16, 2018 at 12:44 AM, Olle E. Johansson <[email protected]> wrote:
>
>
> On 16 Aug 2018, at 09:28, Mikael Abrahamsson <[email protected]> wrote:
>
> On Wed, 15 Aug 2018, Kent Watsen wrote:
>
> You bring up an interesting point, it goes to the motivation for wanting to
> do keepalives in the first place.  The text doesn't yet mention maintaining
> flow state as a motivation.
>
>
> It's not only to maintain flow state, it's also to close the connection when
> the network goes down and doesn't work anymore, and "give up" on connections
> that don't work anymore (for some definition of "anymore").
>
> I have operationally been in the situation where a server/client application
> was implemented so that the server could only handle 256 connections (some
> file descriptor limit). Every time the firewall was rebooted and lost state,
> the connections hung around forever. So the server administrators had to go in
> and restart the process to clear these connections, otherwise there were 256
> hung connections and no new connections could be established.
>
> Sometimes the other endpoint goes down, and doesn't come back. We will for
> instance deploy home gateways probably keeping netconf-call-home sessions to
> an NMS, and we want them to be around forever, as long as they work. TCP
> level keepalives would solve this, as if the customer just powers off the
> device, after a while the session will be cleared. Using TCP keepalives here
> means you get this kind of behaviour even if the upper-layer application
> doesn't support it (netconf might have been a bad example here). It's a
> single socket option to set, so it's very easy to do.
>
> From knowing approximately what settings people have in their NAT44 and
> firewalls etc, I'd say the recommendation should be that keepalives are set
> to around a 60-300 second interval, and that the connection is killed if no
> traffic has passed in 3-5 of these intervals. Otherwise TCP will have backed
> off so far anyway that it's probably faster to just re-try the connection
> instead of waiting for TCP to re-send the packet.
>
> I have seen so many times in my 20 years working in networking where lack of
> keepalives has caused all kinds of problems. I wish everybody would turn it
> on and keep it on.
>
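
For reference, the "single socket option" and the 60-300 second / 3-5 interval
suggestion quoted above could look roughly like the sketch below. The helper
name enable_keepalive and its default values (60 s idle, 60 s interval, 3
probes, taken from the low end of the quoted range) are assumptions for
illustration, not something proposed in the thread. The per-socket tuning
options are platform-specific, so the sketch only sets the ones the local
Python build exposes.

```python
import socket


def enable_keepalive(sock, idle=60, interval=60, count=3):
    """Turn on TCP keepalives and tune them where the platform allows it.

    The function name and default values are illustrative assumptions.
    """
    # The one portable option: ask the kernel to probe an idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket tuning knobs are not available everywhere (these names are
    # the Linux ones), so skip any the socket module does not define here.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # seconds of silence before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    print(bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

Where the tuning options are missing, only SO_KEEPALIVE takes effect and the
system-wide keepalive timers apply.
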
Olle,

They are already on: TCP has a default keepalive of 2 hours. The issue
that is inevitably raised is that 2 hours is much too long a period for
maintaining NAT state (NAT timeouts are usually far shorter). But,
as I pointed out already, sending keepalives at a higher frequency is
not without cost or problems.

Tom
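
To put rough numbers on the 2-hour default versus the shorter intervals
suggested earlier in the thread, here is a back-of-the-envelope sketch. It
assumes the commonly documented Linux defaults (tcp_keepalive_time=7200,
tcp_keepalive_intvl=75, tcp_keepalive_probes=9); other stacks differ, and the
function name is made up for illustration.

```python
def dead_peer_detection_seconds(idle, interval, probes):
    """Approximate time from the last traffic until a silent peer is given up on.

    The first probe is sent after `idle` seconds of silence; each unanswered
    probe is followed by a wait of `interval` seconds, `probes` times in total.
    """
    return idle + interval * probes


# Assumed Linux defaults: 7200 s idle, 75 s between probes, 9 probes.
print(dead_peer_detection_seconds(7200, 75, 9))  # 7875

# Low end of the range suggested in the quoted message: 60 s, 60 s, 3 probes.
print(dead_peer_detection_seconds(60, 60, 3))    # 240
```

The first probe under the assumed defaults goes out only after 7200 seconds,
far later than the NAT timeouts discussed above, while the shorter setting
sends one probe per minute on every idle connection, which is the cost side
of the trade-off.
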

>
> As more and more connections flow over mobile networks, it seems more and
> more important, even for flows you did not expect. I have to send keepalives
> over IPv6 connections - not for NAT as on IPv4, but for middlebox devices
> that have an interesting approach and attitude towards connection management.
> ;-)
>
> The SIP Outbound RFC has a lot of reasoning behind keep-alives for
> connection failover and may be good input here.
>
> https://tools.ietf.org/html/rfc5626
>
> /O
