Hello group! I'm running two strongSwan-based IKEv2 VPN servers on Linux (Debian Buster with kernel 5.10 and strongSwan 5.8). Together they have been granting around 200 users a day access to my corporate network since the pandemic started.

One of the servers is available only over IPv4 and the other over both IPv4 and IPv6, and we see more than half of the clients connecting over IPv6. I have come across a problem when the server needs to talk to the client for DPD (see [1]) or rekeying (see [2]):

1. Client is using IPv6.
2. Client sends some data (ikev2_init) over UDP 500.
3. Client's home router creates a firewall state for UDP 500.
4. Client sends some data (ikev2_auth) over UDP 4500.
5. Client's home router creates a firewall state for UDP 4500.
6. IPsec is established.
7. Client tunnels traffic in ESP.
8. Client's home router creates a firewall state for ESP.
9. Some time passes.
10. States for UDP 500 and UDP 4500 are removed from the client's home router.
11. State for ESP is kept because there is active traffic flowing through it.
12. More time passes.
13. Server decides it's time for DPD or rekeying.
14. Server sends packets (parent_sa inf2 in case of DPD) to the client over UDP 4500.
15. These packets never reach the client because his home router does not forward them.
16. No responses come to the server.
17. Server decides the client is dead and removes his policies.
18. Client comes to us complaining that his "vpn broke".

This also sometimes leaves the Windows VPN client unable to reconnect.

How is it supposed to work in real life? For clients connecting over IPv4 I *think* this works fine because active traffic is encapsulated in UDP 4500. Support for IPv6 UDP IPsec has been added only recently to the Linux kernel and I'm not even sure whether Windows or macOS clients can do this. Also, why go with UDP at all? Pure ESP has lower overhead, doesn't it?

[1] When I started developing the VPN solution, I came across the problem that clients (mostly Windows and Mac) sometimes lose their connection (problems at the ISP, with home WiFi, the laptop going to sleep, etc.), reconnect and demand the same IP address. This address is granted by strongSwan, but it is impossible to install policies in the kernel as the old ones still exist.
At least that's what I documented over a year ago. In order to ensure that a client will be able to reconnect, I've pushed the DPD timers quite low:

    dpd_delay = 40s      # There's around 20 seconds from the retransmit
                         # algorithm, which makes a total of ~60s
    dpd_action = clear

So far so good, clients can now reconnect. And for most clients doing real work this is fine: they will generate some traffic at least once every 40 seconds. But when I connect from a test system (tested on Windows and Ubuntu Linux) which does nothing, it is easy to have no data sent for over 40 seconds.

[2] I have given up on rekeying; the timers are set to some absurd values, ensuring that clients can work fine the whole day.

-- 
| pozdrawiam / greetings | Powered by macOS, Debian and FreeBSD |
| Kajetan Staszkiewicz   | www: http://vegeta.tuxpowered.net    |
`------------------------^--------------------------------------'
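
For illustration, the timeline from the numbered steps can be sketched as a toy simulation in Python. This is only a rough model under stated assumptions: the `StatefulRouter` class, the flow names and the 60-second idle timeouts are all hypothetical and were not measured on any real home router.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulRouter:
    """Toy model of a home router's stateful IPv6 firewall (no NAT)."""
    idle_timeout: dict  # flow name -> idle timeout in seconds (assumed values)
    last_seen: dict = field(default_factory=dict)

    def outbound(self, flow, now):
        """Client -> server packet: creates or refreshes the state."""
        self.last_seen[flow] = now

    def inbound(self, flow, now):
        """Server -> client packet: forwarded only if a live state exists."""
        seen = self.last_seen.get(flow)
        if seen is not None and now - seen <= self.idle_timeout[flow]:
            self.last_seen[flow] = now
            return True
        return False


def simulate(server_ike_time=300):
    # Hypothetical timeouts; real routers differ and may treat ESP differently.
    router = StatefulRouter(idle_timeout={"udp500": 60, "udp4500": 60, "esp": 60})
    router.outbound("udp500", 0)    # ikev2_init creates the UDP 500 state
    router.outbound("udp4500", 1)   # ikev2_auth creates the UDP 4500 state
    # Client keeps sending ESP every 10 s, so only the ESP state stays fresh.
    for t in range(2, server_ike_time, 10):
        router.outbound("esp", t)
    # Server now sends DPD or rekey over UDP 4500, and some ESP traffic.
    ike_forwarded = router.inbound("udp4500", server_ike_time)
    esp_forwarded = router.inbound("esp", server_ike_time)
    return ike_forwarded, esp_forwarded


if __name__ == "__main__":
    ike_ok, esp_ok = simulate()
    print(f"IKE over UDP 4500 forwarded: {ike_ok}, ESP forwarded: {esp_ok}")
```

In this model the server's IKE packet is dropped while ESP still passes, which matches the symptom described: the tunnel carries data, yet the server concludes the client is dead.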