Hi,

You can inspect the traffic inside the tunnel by using the NFLOG target in iptables.
See the CorrectTrafficDump page on the strongSwan wiki.
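
As a rough sketch of that approach (the wiki page is the authoritative version; the group number 5 is an arbitrary choice, and the RUN variable is only a hypothetical dry-run switch added for this example), the rules copy packets that match an IPsec policy to an NFLOG group, which tcpdump can then read in plaintext:

```shell
#!/bin/sh
# Dry run by default: each command is only printed.
# To really apply the rules, run as root with RUN set to an empty value: RUN= sh script.sh
RUN="${RUN-echo}"
GROUP=5   # arbitrary NFLOG group number (assumption for this sketch)

# Copy inbound packets that matched an IPsec policy (seen after decryption).
$RUN iptables -t mangle -I PREROUTING -m policy --pol ipsec --dir in -j NFLOG --nflog-group "$GROUP"

# Copy outbound packets that will match an IPsec policy (seen before encryption).
$RUN iptables -t mangle -I POSTROUTING -m policy --pol ipsec --dir out -j NFLOG --nflog-group "$GROUP"

# Read the plaintext copies from the NFLOG group.
$RUN tcpdump -s 0 -n -i "nflog:$GROUP"
```

Remember to delete the two rules again (same commands with -D instead of -I) once you are done capturing.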

Kind regards

Noel

On 29.05.2018 18:05, Arzhel Younsi wrote:
> Hello!
>
> I started to troubleshoot intermittent but large spikes of ICMP "packet too 
> big" messages on our servers running IPsec in transport mode with StrongSwan.
>
> We're tracking that issue "internally" on 
> https://phabricator.wikimedia.org/T195365 with many digressions and real 
> data, but here is a summarized version:
>
> hostA and hostB have IPsec configured such that all traffic between the two 
> hosts is encrypted. Traffic is relatively steady.
>
> At (so far) random times, a packet capture on hostA's loopback shows large 
> spikes of ICMP "packet too big" from and to hostA's interface IP.
> The payload (detailed in the phabricator task) says: hostA tried to send a 
> 1516-byte packet to hostB while hostA's interface MTU is 1500.
>
> During that spike of ICMP, running:
> "ip -s route get hostB" on hostA shows "mtu 1500". 
> This mtu mention is absent during "quiet time" (default value?).
> The ICMP spike stops before the end of the "cache" countdown. But if the ICMP 
> spike happens again, the "cache" countdown gets re-initialized. 
>
> Locking the MTU with:
> "ip route add hostB via xxx mtu lock 1400" seems to fix the issue.
>
> Our current guess is something along the lines of:
> 1/ An unknown event (e.g. congestion) triggers MTU probing by the kernel 
> (we have tcp_mtu_probing set to 1).
> (As it's all in IPsec, we can't inspect the traffic and see what is flowing 
> and how.)
> 2/ The kernel sets a temporary PMTU value based on the interface (and maybe 
> hostB)
> without taking the ESP overhead into consideration.
> 3/ Traffic is sent using that MTU of 1500, but can't get past the 
> interface after being encrypted because it is too big.
>
> But as this is still quite speculative, and for Occam's razor's sake, I'd 
> expect a misconfiguration on our side rather than a bug in the 
> kernel/StrongSwan :)
>
> How can we figure out what creates that cache entry?
> Is our guess plausible?
> How can we troubleshoot it further?
> Any help is welcome.
>
> As we have many-to-many IPsec links, I would rather avoid deploying the mtu 
> lock everywhere. It also doesn't help us understand and nail down the root of 
> the issue.
>
> Cheers
>
