On Wed, 19 Oct 2005, Mike Ireton wrote:
> Suspiciously, I also have been observing an excessive number of ICMP "Frag
> reassembly time exceeded" messages coming from this openvpn client directed
> at the server. Putting 2 and 2 together, these excess icmp messages appear to
> be being generated because the client is not receiving all fragments. And I
> think it's the encrypted payload that's not being received properly.
> I have a good test case right now. If I ping with 1393 bytes or more of
> data, it doesn't work reliably. Whereas if I ping with 1392 bytes, it does
> work reliably and without loss. Here is an example:
At first glance that seems like a classic MTU/packet fragmentation issue.
I've seen many odd problems due to packet fragmentation. Packet
fragmentation should in theory work just as reliably as no
fragmentation, with just a little performance loss, but in reality there
are many problems associated with it. For example, I've seen routers
route frags perfectly for a couple of seconds and then all of a sudden
forward only the first frag of each packet. This is very hard to debug,
as all packets not large enough to be fragmented, like smaller pings,
still pass perfectly.
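To make it concrete why the small pings never notice anything, here is a
rough Python sketch of how one IPv4 datagram gets cut up at a given MTU.
It assumes a 20-byte IP header without options and a plain path with an
MTU of 1500; the function names are just mine for illustration, and it
does not model the extra overhead OpenVPN adds inside the tunnel.

```python
IP_HEADER = 20    # IPv4 header without options (assumption)
ICMP_HEADER = 8   # ICMP echo request/reply header


def fragment_sizes(ip_payload_len, mtu):
    """Return the total size of each IPv4 fragment for one datagram."""
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_chunk = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    remaining = ip_payload_len
    while remaining > 0:
        chunk = min(remaining, max_chunk)
        sizes.append(chunk + IP_HEADER)
        remaining -= chunk
    return sizes


def ping_fragments(ping_data_len, mtu=1500):
    """Fragments produced by a ping carrying ping_data_len bytes of data."""
    return fragment_sizes(ping_data_len + ICMP_HEADER, mtu)
```

With these assumptions a ping with 1472 bytes of data still fits in one
packet, while 1473 bytes already gives a full-size first frag plus a tiny
second one. If a router drops everything but the first frag, only the
second kind of ping suffers.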
So, the bottom line is to always try to avoid packet fragmentation, or if
you have the time, get to the bottom of why frags are not getting through
properly.
--mssfix helps keep frags away for TCP sessions, but that won't help
you with PPPoE or pings. First use tcpdump to watch the network for the
OpenVPN traffic and see if you occasionally have any frags while sending
large pings.
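What you are looking for in the capture is any IP packet with the "more
fragments" flag set or a non-zero fragment offset. With tcpdump a filter
along the lines of 'ip[6:2] & 0x3fff != 0' should show only those (pick
the interface and host to match your setup). The same check written out
in Python, as a small sketch with a function name of my own choosing,
looks like this:

```python
import struct


def is_fragment(ip_header: bytes) -> bool:
    """True if the IPv4 header belongs to a fragment of a larger datagram."""
    # Bytes 6-7 of the IPv4 header hold 3 flag bits and a 13-bit offset.
    (flags_and_offset,) = struct.unpack("!H", ip_header[6:8])
    more_fragments = bool(flags_and_offset & 0x2000)  # MF flag
    fragment_offset = flags_and_offset & 0x1FFF       # in units of 8 bytes
    return more_fragments or fragment_offset != 0
```

Note that the "don't fragment" bit (0x4000) on its own does not make a
packet a fragment, which is why the mask is 0x3fff and not 0xffff.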
Then try using the --fragment option with a rather low value, which will
cause OpenVPN to do internal fragmentation to avoid IP fragmentation, and
see if the frags go away, and hopefully the problem as well!
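When experimenting with values it helps to know exactly where things
start to break, like the 1392/1393 boundary in your test. A minimal
sketch of a binary search for that boundary follows. The probe function
is hypothetical: it stands for whatever test you run at a given size
(for example several pings that must all come back), and the search
assumes that once a size fails, every larger size fails too. The size it
finds is a ping data size, not a ready-made --fragment value, since the
tunnel overhead is not accounted for here.

```python
def largest_passing_size(probe, low, high):
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes probe(low) is True and that sizes fail monotonically:
    if one size fails, all larger sizes fail as well.
    """
    if not probe(low):
        raise ValueError("even the smallest size fails")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always advances
        if probe(mid):
            low = mid        # mid works, the boundary is at mid or above
        else:
            high = mid - 1   # mid fails, the boundary is below mid
    return low
```

Since the failure you describe is intermittent, the probe should repeat
the test a number of times and only report success on zero loss.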
If that solves your problem, it would still be nice to understand whether
the cause was a broken router, or if there really is a bug
somewhere...
Cheers - Mathias