Re: OpenIKED Keepalive Broken

William Ahern Fri, 12 Aug 2016 14:57:07 -0700

On Fri, Aug 12, 2016 at 09:56:41PM +0200, fRANz wrote:
> On Sat, Aug 6, 2016 at 2:18 AM, William Ahern
> <[email protected]> wrote:
<snip>
> > isakmpd unconditionally sends NAT-T keepalive messages every 30 seconds,
> > whereas iked's ikev2_ike_sa_alive only sends a keepalive message iff
> > `!foundin && foundout`. But that presumes that the SA initiator is also the
> > initiator of traffic, which definitely isn't the case in my situation, and
> > seems dubious and unreliable even for real road warriors.
> 
> ...
> 
> > I'd be happy to create a proper patch if someone could explain the purpose
> > of the conditional logic. I wouldn't want to accidentally break something.
> >
> > I also wouldn't mind making the keepalive interval configurable--rather than
> > a compile-time constant--so users could deal with NAT gateways which
> > aggressively flush state.
> 
> Hello William,
> I did the same switch (from isakmpd to iked) with a lot of problems,
> maybe the same that you're reported.
> Did you receive any feedback from OpenBSD staff, catching the occasion
> of the 6.0 release ready to go?
> Regards,
> -f


No feedback, yet, but soon after posting I realized a few things:

1) My hack makes the tunnel much more stable, but not nearly as stable as
with isakmpd. I think it's because with isakmpd both peers are sending a
keepalive every 30 seconds, whereas I only applied the hack I posted to the
active, behind-the-NAT peer. See point #3, below.

2) The logic of ikev2_ike_sa_alive is intended, I think, to preserve the
limited lifetime of SAs. Otherwise, if keepalives were sent unconditionally and
not distinguished from real traffic, the SA might never expire. I'm not
sure why iked isn't using the standard NAT-T keepalive message format and
protocol like isakmpd does. AFAICT it's still defined by IKEv2. Maybe iked
is using a hybrid keepalive/dead peer detection solution, but the author
forgot to account for different effective behavior in some scenarios; or
maybe it was just more expedient than implementing NAT-T keepalive messages.
Figuring that out will probably help me answer what a proper solution looks
like.
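
To make the failure mode concrete, here is a minimal sketch of the condition
as I described it earlier in the thread. It is not iked's code: the function
name is hypothetical, and reading foundin/foundout as "traffic recently seen
inbound/outbound on the SA" is my assumption.

```python
def should_probe(found_in: bool, found_out: bool) -> bool:
    """Hypothetical model of the `!foundin && foundout` test.

    found_in  -- assumed: inbound traffic was recently seen on the SA
    found_out -- assumed: outbound traffic was recently seen on the SA
    """
    return (not found_in) and found_out


# Enumerate all four combinations of the two flags.
for found_in in (False, True):
    for found_out in (False, True):
        print(f"foundin={found_in!s:5} foundout={found_out!s:5} "
              f"-> probe={should_probe(found_in, found_out)}")
```

The case that bites me is foundin and foundout both false: the peer behind
NAT originates no traffic of its own, so once NAT state is gone nothing
arrives either, and under this model no message is ever sent to reopen it.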

3) I'm fairly certain the keepalive interval should be configurable. The
default UDP NAT state expiration on OpenBSD, for example, is 30 seconds. The
compile-time constant interval for keepalives in isakmpd and iked is also 30
seconds--the recommended period in the RFCs. Over time it's inevitable for a
peer's keepalive packet to miss the window for preserving NAT state,
especially considering that the peer's and router's timers are going to be
synchronized. That would explain why even with isakmpd I still need a
cronjob to ping the passive host. And it explains why isakmpd is more stable
than my hacked iked--isakmpd sends keepalives independently from both sides,
so you have two shots at making the NAT expiration window. iked's keepalive
is a round-trip message; also two packets, but the timing of the first
packet is all that matters.
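
A rough model of that timing argument, under loudly stated assumptions: any
packet in either direction refreshes the NAT state, the state dies once the
gap between packets reaches the timeout, and every second packet arrives
half a second late. The function names and the lateness figure are made up
for illustration; nothing here is measured.

```python
def arrival_times(interval, phase, count, lateness=0.0):
    """Arrival times at the NAT gateway for one keepalive sender.

    Assumption: every second packet arrives `lateness` seconds late,
    standing in for scheduling and network delay.
    """
    return [phase + i * interval + (lateness if i % 2 else 0.0)
            for i in range(count)]


def state_survives(times, nat_timeout):
    """True if no gap between consecutive packets reaches nat_timeout."""
    ts = sorted(times)
    return all(later - earlier < nat_timeout
               for earlier, later in zip(ts, ts[1:]))


NAT_TIMEOUT = 30.0  # seconds, the figure used in this message

one_sender = arrival_times(30.0, 0.0, 6, lateness=0.5)
two_senders = one_sender + arrival_times(30.0, 15.0, 6, lateness=0.5)
shorter = arrival_times(25.0, 0.0, 6, lateness=0.5)

print("one sender, 30s interval:", state_survives(one_sender, NAT_TIMEOUT))
print("two senders, 30s interval:", state_survives(two_senders, NAT_TIMEOUT))
print("one sender, 25s interval:", state_survives(shorter, NAT_TIMEOUT))
```

In this model a single 30-second sender loses the state as soon as one
packet is late, while two independent senders or a single 25-second sender
keep it alive. The 15-second offset between the two senders is the best
case; the closer their phases, the less the second sender helps.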

Of course, the NAT state could expire before an IKE keepalive for many
reasons, but a keepalive interval at least a few seconds less than the
router's NAT expiration should keep the connection stable for longer periods
of time. And rather than having a cron job run every minute, the child SA
lifetime could be set to something smallish. If and when NAT state does
expire, the active peer behind NAT will rekey the SA within a tolerable
period, reestablishing NAT state and restoring reverse traffic. Lowering
both keepalive and child SA lifetime should make these types of tunnels much
more stable and reliable, without recourse to external hacks.
