On Wed, 10 Oct 2018, Whit Blauvelt wrote:
> What's best practice for restarting a connection when the internal dead peer detection isn't enough?

Why isn't that enough? Can you describe what is going on?

> In past years with Openswan I've run a script pinging an address in each remote subnet, restarting ipsec if there are persistent failures to respond on any of them. Libreswan tunnels get into a bad state less often (Cisco ASAs on the other end); but nonetheless, despite dpd being enabled, can get into a state where traffic is failing, and an instant restart of ipsec has risk involved.

I'd like to know more about those states.

> Yesterday with one tunnel failing seemingly entirely, restarting ipsec resulted in several subnets on a second tunnel becoming unusable, and this through several restarts (although not the same subnets each time), until I waited a full minute for the restart. So to not have to be woken in the middle of the night if this gets into a similar state again, I need to get that test script up again, and presumably introduce a delay in it so it shuts down ipsec, waits something like a minute, and then starts it again. Or I need to find a better strategy. What's clear is that dpd needs an external backup to get to automated reliability. This sort of bad state is thankfully infrequent; but I have to prepare for it.

I guess I would do:

    ipsec auto --replace conn
    sleep 2
    ipsec auto --up conn

The sleep will give both ends some time to send their Delete/Notify packets.

If there are more conns sharing an IKE SA (multiple subnets between the same peers, not using leftsubnetS= / rightsubnetS=), then do all of them:

    ipsec auto --replace conn1
    ipsec auto --replace conn2
    sleep 2
    ipsec auto --up conn1
    ipsec auto --up conn2

Paul
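
For anyone wanting to automate this, the two ideas in the thread (probe one address per remote subnet, then replace and re-up every conn sharing the IKE SA after persistent failures) could be combined roughly as below. This is only a sketch, not something posted in the thread: the conn names, probe addresses, failure limit, and polling interval are placeholders, the function names are made up for illustration, and the `ping` options assume the Linux iputils version.

```shell
#!/bin/sh
# Hypothetical watchdog sketch. All values below are assumptions; adjust them.
CONNS="${CONNS:-conn1 conn2}"                 # conns sharing one IKE SA (placeholder names)
PROBES="${PROBES:-192.0.2.10 198.51.100.10}"  # one host per remote subnet (placeholder addresses)
FAIL_LIMIT="${FAIL_LIMIT:-3}"                 # consecutive failed passes before acting
SETTLE="${SETTLE:-2}"                         # pause between --replace and --up, as suggested above
INTERVAL="${INTERVAL:-60}"                    # seconds between passes

# Succeeds if the host answers a ping (iputils options: 3 packets, 2 s wait each).
probe() {
    ping -c 3 -W 2 "$1" >/dev/null 2>&1
}

# Succeeds only if every probe address answers.
all_probes_ok() {
    for host in $PROBES; do
        probe "$host" || return 1
    done
    return 0
}

# Prints the new consecutive-failure count given the previous one.
next_fails() {
    if all_probes_ok; then
        echo 0
    else
        echo $(( $1 + 1 ))
    fi
}

# Replace every conn first, pause, then bring every conn back up.
bounce_conns() {
    for c in $CONNS; do ipsec auto --replace "$c"; done
    sleep "$SETTLE"
    for c in $CONNS; do ipsec auto --up "$c"; done
}

# Poll forever; bounce the conns once FAIL_LIMIT passes in a row have failed.
watch() {
    fails=0
    while :; do
        fails=$(next_fails "$fails")
        if [ "$fails" -ge "$FAIL_LIMIT" ]; then
            bounce_conns
            fails=0
        fi
        sleep "$INTERVAL"
    done
}
```

Nothing runs until `watch` is called, so the script would need a final `watch` line (or an equivalent call from a service unit). Requiring several failed passes in a row is meant to avoid bouncing the tunnels over a single dropped ping.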
