On Aug 19, 2014, at 1:39 PM, Tero Kivinen <[email protected]> wrote:

> Les Leposo writes:
>> have you overlooked the issue of nat mappings?
> 
> Nope.
> 
>> ipsec nat keepalives are very useful for keeping nat mappings alive,
>> and in a world full of all sorts of nat devices (some behaving
>> reliably and others not), one would have to use low keepalive
>> interval... like 10-60s. 
> 
> IPsec NAT-T keepalives are completely different thing than DPD.
> 
> IPsec NAT-T keepalives are packets sent by the device behind the NAT
> as specified in the RFC3948 section 2.3. The responder SHOULD ignore
> the received NAT-keepalive packet, and MUST NOT be used to detect
> whether a connection is live (RFC 3948 section 4). Only device behind
> the NAT sends them and other end does not respond to them, or send its
> own keepalives (unless it is also behind NAT).

> 
> The dead peer detection (DPD) or liveness check is a procedure
> specified in the RFC5996 section 2.4 where it says that:
> 
>       ... If there has only been outgoing traffic on all of
>   the SAs associated with an IKE SA, it is essential to confirm
>   liveness of the other endpoint to avoid black holes.  If no
>   cryptographically protected messages have been received on an IKE SA
>   or any of its Child SAs recently, the system needs to perform a
>   liveness check in order to prevent sending messages to a dead peer.
>   (This is sometimes called "dead peer detection" or "DPD", although it
>   is really detecting live peers, not dead ones.)  Receipt of a fresh
>   cryptographically protected message on an IKE SA or any of its Child
>   SAs ensures liveness of the IKE SA and all of its Child SAs.
> 
> This is done by sending empty INFORMATIONAL message to the other end,
> and if there is response to it then other end is up and running. You
> are supposed to do this only when you suspect something is wrong, i.e.
> the traffic changed to be one way (i.e. no packets coming back), or
> you get ICMP or unauthenticated notify payload or similar.
> 
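A rough sketch of the trigger described in the quoted text above: probe only when traffic has gone one-way or an error hint arrives, not on a fixed timer. The `SaActivity` type, the `silence` threshold of 30 seconds and the `error_hint` flag are assumptions made for illustration; they are not taken from RFC 5996 or from any particular implementation.

```python
from dataclasses import dataclass


@dataclass
class SaActivity:
    """Timestamps (seconds) for one IKE SA and its Child SAs. Hypothetical type."""
    last_protected_rx: float  # last cryptographically protected packet received
    last_tx: float            # last packet we sent on any of the SAs


def needs_liveness_check(sa: SaActivity, now: float,
                         silence: float = 30.0,
                         error_hint: bool = False) -> bool:
    """Return True when an empty INFORMATIONAL probe looks warranted."""
    if now - sa.last_protected_rx <= silence:
        # A fresh protected message already proves the peer is alive.
        return False
    # We keep sending, but nothing protected has come back for a while.
    sending_into_silence = now - sa.last_tx <= silence
    # error_hint stands for an ICMP error or an unauthenticated notify.
    return sending_into_silence or error_hint
```

With these assumptions an SA that is idle in both directions is left alone, which is the point of separating DPD from NAT keepalives.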

>> Now, today's client devices need to be energy efficient - so the
>> device sleeps/hibernates to save battery. Sleeping past the nat
>> keepalives is bound to happen (either by design or error). At some
>> point the device will wake from sleep and need to test reachability
>> using dpd.
> 
> Yes, if the device sleeps a long time, it should check whether the IKE
> SA is still up by using DPD. This has nothing to do with the NAT
> keepalives. 

Paul's concerns centred around chattiness and the client-side energy cost of 
this chattiness.
Hence, I offered some suggestions for reducing that energy cost, while probing 
down the path of DPD and rekeys to get to the root of the high chattiness.

Paul's iOS issue seems pathological - linked to the IKE daemon crashing or 
being signalled. But unless there is a gold standard implementation that can 
maintain tunnels all week (on a single charge), we can't just stop at 
poor-implementation=low-battery.

But let me explain why I ventured into Keepalives and DPD.
Keepalives help maintain the network path. In their absence, the network path 
(from the server's point of view) will likely change (particularly in today's 
world full of crappy nats and ip oversubscription, made worse by 
multi-connection web pages and apps).

If a mobile device sleeps a long time, the likelihood that the device wakes to a 
network path change is significant. Aside from pure blackouts, black holes and 
Source IP changes, Source Port address changes will cause issues for some 
servers/configurations. 

Hence DPD is increasingly used to verify both path and peer, both by design 
(e.g. upon waking just do DPD, or upon waking the daemon kicks off DPD if it 
sees outgoing ESP traffic and no incoming ESP traffic) and by configuration. As 
of iOS 4 & 5, I'm sure the iPhone IKE took the former approach, and many admins 
do the latter.
The same goes for wake, where the daemon has to wait for the interface to 
return (and verify the underlying network) before trying DPD; iOS 4 & 5 did 
that.

Keepalives and DPD were merely a small illustration of the larger issue, which 
is that in today's real-world network conditions, maintaining the tunnel is a 
challenge.
Protocol chattiness increases as the client daemon adapts (itself or by the 
developer) to the lowest common denominator (the crappy nat, the roadside motel 
wifi, oversubscribed cell data network, or city apartment 'high interference' 
wifi, or the buggy/misconfigured server).
Hence I kept probing down the path of DPD and rekeys, because Paul's concerns 
centred around chattiness and its client-side energy cost.

Even with a good implementation of the latest ikev2 spec, the lowest common 
denominator will demand more energy. And so, spec tweaks and creative 
(supplementary) drafts/standards are needed (aside from additional server-side 
hardware investment):

1) Session Resumption ... ikev2 MUST (not SHOULD) both as tools for handling 
post wake/crash recovery.
2) MOBIKE & correct handling of Source Port Address Changes ... MUST (... the 
latter should have been a MUST for Ikev1).
3) additional IKEv2 drafts/standards to create 'very low cal' clients and 
improve handling of path mtu (.... another effect of the lowest common 
denominator). These 'very low cal' clients will also be chatty, but consume 
less because more of the ike operations are offloaded to the high capacity 
server.
4) Strategies for dealing with network/sleep/wake events, crash recovery and 
energy conservation... perhaps as drafts or design guidelines to improve the 
standard of all ikev2 implementations.

> 
>> And in some cases (if the sleep was more than a certain threshold),
>> rather than wait for dpd to failover, the choice is to go for rekey.
> 
> Rekey is not an option, as that would require the IKE SA to still be
> up. I think you mean starting the IKE SA from the beginning. If the
> device has been sleeping for long time, and it suspects the IKE SA is
> gone, it might try shorter period for the DPD, i.e. send few retries
> for the empty INFORMATIONAL message and if no response is received,
> then fall back to start the IKE SA from the beginning.
For your choice of algorithm, the lowest common denominators will always 
experience delays in connecting because the DPD retries max out more often than 
not. Just pointing that out.

> 
> The device of course needs to first wait that the network is up again
> before doing this test, as when you wake up from the sleep, the device
> most likely needs to find the wifi network again, do DHCP, perhaps
> even do hotel login page etc before it can actually send any packets
> to network, and doing DPD during that time, would certainly fail, even
> when the other end would still be there. 
Correct, iOS 4 & 5 used to do this sort of thing. I'm not sure about other 
smartphones.
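
The wake-up sequence discussed above (wait for the network, try a short burst of DPD, then start the IKE SA from the beginning) could be sketched roughly as follows. The three callables, the retry count and the timeouts are hypothetical placeholders and not any daemon's actual API; a real client would hook them to its own network and IKE code.

```python
import time


def recover_after_wake(network_ready, send_dpd_probe, start_new_ike_sa,
                       dpd_retries=3, dpd_timeout=2.0,
                       network_wait=30.0, poll_interval=0.5,
                       sleep=time.sleep):
    """Return 'no-network', 'sa-alive' or 'restarted'."""
    # 1. Do not probe until the interface, DHCP, captive portal etc. are done,
    #    otherwise DPD fails even though the peer is still there.
    waited = 0.0
    while not network_ready():
        if waited >= network_wait:
            return "no-network"
        sleep(poll_interval)
        waited += poll_interval

    # 2. Short DPD burst: a few empty INFORMATIONAL requests, each with a
    #    short timeout. send_dpd_probe returns True if a reply arrived.
    for _ in range(dpd_retries):
        if send_dpd_probe(dpd_timeout):
            return "sa-alive"

    # 3. No answer: assume the IKE SA is gone and start from the beginning.
    start_new_ike_sa()
    return "restarted"
```

In the worst case this costs roughly `dpd_retries * dpd_timeout` seconds before reconnecting, which is the delay on bad networks mentioned earlier.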

> 
> It is important to remember what are the fundamental restrictions set
> by the protocol, and which issues are just caused by bad
> implementations. Quite a lot of the problems we are seeing are caused
> by bad implementations... 

Not solely. The network then isn't the same as the network now. Back then the 
lowest common denominator was a laptop connected through a cell modem or campus 
wifi network; path mtu wasn't bad, certificate payloads weren't big.

Now it is a power-constrained smartphone connected to a high-loss 
oversubscribed network. Some of the issues that weren't important/visible then 
are now significant (e.g. source address/port changes are very common, large 
certificate payloads commonplace & their interaction with small or changing 
path mtu is an obstacle, detecting local ip address changes alone isn't 
enough... you might get the same address back but your ports/gw/dns changed).
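
As a small illustration of that last point, a client could compare a snapshot of its network environment taken before sleep with one taken after wake, rather than looking at the local address alone. The `PathSnapshot` fields are assumptions chosen for the example; in particular, how a client would learn its NAT-mapped source port is left open here.

```python
from typing import List, NamedTuple, Optional, Tuple


class PathSnapshot(NamedTuple):
    """Hypothetical record of what the client can observe about its path."""
    local_ip: str
    gateway: str
    dns_servers: Tuple[str, ...]
    mapped_src_port: Optional[int]  # as seen by the server, if known at all


def changed_fields(before: PathSnapshot, after: PathSnapshot) -> List[str]:
    """Names of the fields that differ between the two snapshots."""
    return [name for name in before._fields
            if getattr(before, name) != getattr(after, name)]


before = PathSnapshot("192.0.2.10", "192.0.2.1", ("192.0.2.53",), 40001)
after = PathSnapshot("192.0.2.10", "192.0.2.254", ("192.0.2.53",), 51234)
# Same local address, but the gateway and mapped port differ, so the old
# path should not be trusted without a check.
print(changed_fields(before, after))
```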

> -- 
> [email protected]

_______________________________________________
IPsec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/ipsec
