Hi James, (bringing the discussion back to the mailing list.)
> I have refined and tested my patch and let it run with a short keylife > (10 minutes) and rekeymargin (1 minute) since last Thursday. In that > time frame I have not had the problem as described Many thanks for your patch, I finally have found some time to take a closer look at the issue and your work. Sorry for the delay. > I have gotten to the bottom of the issue and have found that when both > peers rekey (auto=start) a CHILD_SA at the same time, both rekeying > attempts are successful, traffic flows on each of the new CHILD_SAs > in 1 direction and it breaks the tunnel. If a peer initiates rekeying, the responder tries to detect that (by comparing traffic selectors etc.) and reuse the requid. Reusing the same requid is crucial: We can't install identical policies in the Linux kernel, hence we only install one copy. But if we have two different requids, we can't have a working policy for both tunnels, this only works if we have the same reqid. The kernel should use always the newest SA to send traffic, but accept traffic on both the new and the rekeyed SA to have no interruption in the traffic flow. But as the reqid changes, not both policies work anymore, breaking the tunnel. > At the next rekeying attempt by either of the peers the redundant > CHILD_SA message comes, 1 of the CHILD_SAs is deleted and the tunnel > is functional again. Yes, this is the "interoperable" way we tried to solve the "redundant" Quick Mode issue. While your patch will work very well between strongSwans, I don't know how this would end when talking to a different implementation. In IKEv2 we can compare nonces to elect a CHILD_SA to delete after a collision; this is standardized. But I don't think there isn't a procedure defined for IKEv1, or is the SPI comparison used in your patch based on some recommendation? > I assume that only having 1 of the peers configured with auto=start > would solve this issue No, actually not. Rekeying works independent of the initial tunnel setup: any peer may initiate rekeying at the configured interval, and the risk of a collision always exists (even if can be reduced with a larger rekeymargin/fuzz). > [...] if both peers check for redundancy after rekeying they > should both find an extra CHILD_SA and as long as they delete the same > CHILD_SA the tunnel should function properly (feel free to poke holes > in my idea). That's true, but as indicated above, it might me non-trivial to "delete the same CHILD_SA" when talking to a third party implementation. I think the main issue has already been solved by the call to check_for_rekeyed_child() in quick_mode.c. The problem is just that the comparison does not include Quick Modes in rekeyed state, as this only happens during collisions. The attached patch changes that, and fixes traffic flow even if we have multiple identical Quick Modes for some time. The existing is_redundant() check will take care that multiple Quick Mode states won't multiply states again in the next rekey cycle. Please let me know if this patch works for you, I'll then include it for 5.0.3. Best regards Martin
>From ac48d9e458699aa68f774aaa3168533cf42f95b0 Mon Sep 17 00:00:00 2001 From: Martin Willi <[email protected]> Date: Wed, 3 Apr 2013 15:56:26 +0200 Subject: [PATCH] Reuse reqid of an existing Quick Mode, even if it has been rekeyed If two peers rekey Quick Modes at the same time, the original Quick Mode is in REKEYING state and hence the requid is not reused. This is required though, as two identical policies won't work if they have different requids. --- src/libcharon/sa/ikev1/tasks/quick_mode.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/libcharon/sa/ikev1/tasks/quick_mode.c b/src/libcharon/sa/ikev1/tasks/quick_mode.c index afdff8c..bb50ca9 100644 --- a/src/libcharon/sa/ikev1/tasks/quick_mode.c +++ b/src/libcharon/sa/ikev1/tasks/quick_mode.c @@ -925,7 +925,8 @@ static void check_for_rekeyed_child(private_quick_mode_t *this) enumerator = this->ike_sa->create_child_sa_enumerator(this->ike_sa); while (this->reqid == 0 && enumerator->enumerate(enumerator, &child_sa)) { - if (child_sa->get_state(child_sa) == CHILD_INSTALLED && + if ((child_sa->get_state(child_sa) == CHILD_INSTALLED || + child_sa->get_state(child_sa) == CHILD_REKEYING) && streq(child_sa->get_name(child_sa), this->config->get_name(this->config))) { -- 1.7.10.4
_______________________________________________ Dev mailing list [email protected] https://lists.strongswan.org/mailman/listinfo/dev
