Hi Steve,

> one or more tunnels seem to stop working

What exactly does that mean? What IKE version are you using?

> After some investigation, it seemed that these coincide with a rekey
> collision where both sides create a rekey jobs at (to the nearest
> second) the same time. When this happens I dont see any specific errors
> in the logs.

If charon detects rekey collisions, it should log that fact. But there
are many different collision scenarios; an excerpt from your log would
certainly help to analyze the issue.

> 1. Why do we keep seeing the collisions, surely the rekeyfuzz would
> make this pretty unlikely or does the way the host were built and/or
> time sync affect the randomness of rekeyfuzz?

Your system time should not have any effect; on most systems charon does
not use the system time to schedule such events anyway. With your
rekeymargin of 9m and 100% fuzz, collisions should in fact be very
(very) rare. If this is reproducible, something is seriously wrong.

For these non-cryptographic operations, charon relies on random() calls
seeded with getpid() + time(). I'm not sure how your hypervisor handles
that.

> 2. When we get a collision why dont we see an error and why doesnt it
> retry given the keyingtries parameter?

keyingtries has no effect when handling rekey collisions. I think with
5.1.1 these collisions should be handled properly.

> 3. Is it recommended that only one side should do rekeying (i.e. set
> rekey=no on the other)?

Usually it is not required: with a sane configuration collisions are
unlikely, and even if they happen, they should be handled gracefully, at
least between two strongSwan hosts.

Regards
Martin
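
For illustration, the rekeymargin/rekeyfuzz arithmetic discussed above can be sketched as follows. This is a minimal sketch, not charon's code: it uses the formula from the ipsec.conf documentation, rekeytime = lifetime - (margintime + random(0, margintime * rekeyfuzz)), assumes a lifetime of 1h (the thread does not state it), and uses Python's random.Random with a hypothetical seed value as a stand-in for a seeded random(). It shows the window in which a rekey can be scheduled, and that two generators starting from the same seed would pick the same rekey time.

```python
import random

def rekey_time(lifetime_s, margin_s, fuzz, rng):
    """Seconds after SA setup at which a rekey is scheduled.

    Formula from the ipsec.conf documentation:
    rekeytime = lifetime - (margintime + random(0, margintime * rekeyfuzz))
    """
    jitter = rng.randint(0, int(margin_s * fuzz))
    return lifetime_s - (margin_s + jitter)

LIFETIME = 60 * 60   # assumption: 1h, not stated in the thread
MARGIN = 9 * 60      # rekeymargin=9m, from the thread
FUZZ = 1.0           # rekeyfuzz=100%, from the thread

# Bounds of the rekey window implied by the formula.
earliest = LIFETIME - (MARGIN + int(MARGIN * FUZZ))
latest = LIFETIME - MARGIN

# Two hosts whose generators start from the same seed (hypothetical
# value) draw the same jitter and therefore the same rekey time.
same_a = rekey_time(LIFETIME, MARGIN, FUZZ, random.Random(12345))
same_b = rekey_time(LIFETIME, MARGIN, FUZZ, random.Random(12345))

print(earliest // 60, latest // 60)  # prints "42 51" (minutes)
print(same_a == same_b)              # prints "True"
```

With independently seeded generators the two sides draw their jitter independently from this 9-minute window; identical seeds on both hosts would remove that independence.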
