Hi, I have a couple of seemingly random problems between Meraki MX devices and strongSwan running on pfSense, and I'm at a bit of a loss as to how to gather more information. I'm hoping for some pointers here.
The Meraki side is running their latest firmware, and the pfSense side is running FreeBSD strongSwan U5.7.1/K11.2-RELEASE-p10. I have several sets of these VPNs, but the most problematic one has around 40 phase 1 peers, each with 2 or 3 phase 2 configurations. This is on a single pfSense instance, with the 40 phase 1 peers being MX devices on the internet. These are all IKEv1 configurations.

For the most part, we have solid and reliable VPNs among the devices, but sometimes the two endpoints appear to get out of sync. This can happen every few days or every couple of hours. I see two failure patterns:

1. The strongSwan side rekeys successfully, but the Meraki side logs an SPI expiration without ever having logged an established event for that same SPI. The result is that the pfSense side will send traffic forever, but the MX apparently just discards the incoming traffic.

2. I sometimes see 5 or 6 phase 2 SPI pairs for the same network set on the same connection.

In both cases, the operational symptom is that traffic is not passing, and in both cases an ipsec down connection && ipsec up connection makes traffic flow again. I've engaged Meraki many times, including while the problems were happening, and I always get an inconclusive answer or no answer at all.

This is an example config. The configs are generally all the same across the different phase 1 and phase 2 connections:

conn con1000
	fragmentation = yes
	keyexchange = ikev1
	reauth = yes
	forceencaps = yes
	mobike = no
	rekey = yes
	installpolicy = yes
	type = tunnel
	dpdaction = restart
	dpddelay = 10s
	dpdtimeout = 60s
	auto = route
	left = leftnet
	right = rghtnet
	leftid = leftid
	ikelifetime = 28800s
	lifetime = 3600s
	ike = aes256-sha1-modp1024!
	esp = aes256-sha1,aes192-sha1,aes128-sha1!
	leftauth = psk
	rightauth = psk
	rightid = rightid
	aggressive = no
	rightsubnet = 10.1.1.0/24
	leftsubnet = 10.10.1.0/24

I suspect that some of my problems might be related to delivery problems for the encapsulated packets over the internet, but I don't know how to confirm that. I can capture packets on the WAN side of the pfSense/strongSwan devices, but I don't quite know what to look for in the network traffic.

Any pointers to help me get the data I need to make these tunnels much more reliable?

Thanks,
Mark
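
For anyone wanting to dig into WAN-side captures like the ones mentioned above, here is a minimal sketch of one thing that could be checked. It assumes that, because forceencaps = yes is set, all IKE and ESP traffic to each peer travels over UDP port 4500 and follows the RFC 3948 framing (four zero bytes before an IKE message, a single 0xFF byte for a NAT keepalive, otherwise the first four bytes are the ESP SPI). The function names and the (direction, payload) input shape are hypothetical; pulling the UDP payloads out of the capture file is assumed to be done by a separate tool and is not shown.

```python
import struct
from collections import Counter

# RFC 3948: IKE messages on UDP 4500 are prefixed with four zero bytes.
NON_ESP_MARKER = b"\x00\x00\x00\x00"


def classify_natt_payload(payload: bytes):
    """Classify one UDP/4500 payload.

    Returns ("keepalive", None), ("ike", None), ("esp", spi) or
    ("unknown", None). spi is the 32-bit ESP SPI as an int.
    """
    if payload == b"\xff":
        return ("keepalive", None)
    if len(payload) < 4:
        return ("unknown", None)
    if payload[:4] == NON_ESP_MARKER:
        return ("ike", None)
    (spi,) = struct.unpack("!I", payload[:4])  # network byte order
    return ("esp", spi)


def count_esp_by_spi(packets):
    """Count ESP packets per (direction, SPI).

    packets: iterable of (direction, payload) tuples, where direction is
    "in" or "out" as seen from the pfSense WAN interface (hypothetical
    input shape for this sketch).
    """
    counts = Counter()
    for direction, payload in packets:
        kind, spi = classify_natt_payload(payload)
        if kind == "esp":
            counts[(direction, f"{spi:08x}")] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample payloads, for illustration only.
    sample = [
        ("out", bytes.fromhex("c1a2b3c4") + b"\x00\x00\x00\x01" + b"\x00" * 16),
        ("out", bytes.fromhex("c1a2b3c4") + b"\x00\x00\x00\x02" + b"\x00" * 16),
        ("in", NON_ESP_MARKER + b"\x00" * 28),
        ("in", b"\xff"),
    ]
    for key, n in sorted(count_esp_by_spi(sample).items()):
        print(key, n)
```

Run against payloads from a real capture during an outage, a tally like this would show which SPIs are actually on the wire in each direction. Those SPIs can then be compared with the ones ipsec statusall reports and with the SPIs in the Meraki event log. Outbound ESP on an SPI that the MX never logged as established, with no inbound ESP from that peer, would match the first failure pattern described above.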
