Hello Stuart,

Sorry for the delay in replying.

I think the issue in my ISP corner case was that clients were NATted to public address pool X, while link IPs within the ISP network (the IPs that might send the ICMP destination unreachable / fragmentation needed messages) would be NATted to a different IP address, so PMTU discovery inbound (behind the NAT) didn't work in that case.

(I think you are right that a catch-all NAT being missed for the private router links would also result in the PMTU frag-needed ICMP messages getting lost.)

Re:

> My preference is to try and set things up as much as possible so that
> you don't get PMTU blackholes or have to fragment the tunnel packet,
> but also clamp mss so that even if you do hit a blackhole there's no problem.
> There are some downsides to clamping MSS but they're relatively small
> and it's something done by almost every off-the-shelf home CPE so it's
> very very common on the internet.

Agreed on the above... I see a lot of 4G devices / networks clamping the hell out of TCP MSS in the wild also, which can make TCP-based VPNs (SSTP, TLS etc.) challenging, as you have to clamp the TCP MSS in anticipation of an outer clamp on the TCP MSS.

Some tunnels do fragment gracefully (if you call doubling the packets per second on your VPN device graceful), but performance takes a big hit. In testing, even when deliberately fragmenting packets (to send full layer 2 frames or full layer 3 packets in tunnels), being able to send the full packet over the fragmented tunnel does not in any way increase perf... and TCP MSS clamping gives the best throughput (in my experience)...

Thanks again,
Tom Smyth

On Sun 15 May 2022, 21:02 Stuart Henderson, <[email protected]> wrote:

> On 2022-05-15, Tom Smyth <[email protected]> wrote:
> > Hi Stuart,
> > I have huge regard for you and all you contribute to OpenBSD and the
> > community
> > I'm going to clarify what I meant and what my experience is with PMTU and
> > constrained MTUs behind NAT.
> > My humble experience is that if we have a constrained MTU behind a NAT
> > Path MTU discovery from the server to the client fails because
> >
> > [website]--- public IP MTU 1500 bytes ----------[firewall/Nat]
> > private network MTU 1492 bytes-client
> >
> > so while MTU discovery may work outbound...(from client to the website)
> > the public website to the public IP has no way to discover the
> > constrained PMTU behind the nat...
>
> There's no reason for this to fail? 1500 byte packet with DF set hits
> the firewall/nat box, route lookup, exit MTU is 1492, too big -> surely
> it just sends back frag needed?
>
> Even if you have a nat device with 1500 exit mtu and it then hits 1492
> mtu on another device, similar case but the original frag-needed is
> sourced from a private address so it gets natted on the way out.
>
> There could be some specific cases where things aren't set up to allow
> this to work but there's nothing in general to cause it to fail.
>
> The problem case is when you have router hops on private addresses
> where there is *no* nat in the path, in which case icmp is generated
> from the private address but there's nothing to translate it, so in that
> case you do often lose the message due to "no martian" packet filtering.
>
> > This corner case was discovered when I set up my ISP initially and I
> > had not many IP addresses many moons ago
> > It would be rare for a client behind a NAT to have a smaller MTU than
> > their public IP internet connection.
> >
> > Is my reasoning and analysis here correct ?
> >
> > Regarding my comment
> >> PMTU cannot properly account for underlay restrictions Inside a VPN
> >
> > what I meant was, that if you set an MTU of 1500 on a VPN Tunnel interface
> > but sending 1500 Bytes in an IP packet across the tunnel
> > requires the VPN encapsulated Packet + a Fragment Packet to be sent
> > also (on the underlay interface),
> > the Router on the VPN won't send a Fragment needed IP message to the
> > client because the MTU of the Tunnel was not exceeded
> > (but the MTU on the underlay was exceeded)
>
> This depends on the MTU stored in the route table entry used to send
> the packet over the vpn.
>
> With a separate tunnel interface the mtu on that interface and thus the
> route table can be set low enough that frag needed is sent.
>
> With standard flow-based IPsec the route used is normally the default
> route with either a standard ethernet MTU or a pppoe MTU. But if there's
> another route (route-based IPsec on OS which have this, or a dummy
> interface such as is sometimes used in combo with flow-based IPsec,
> for example a vether interface with a netmask that covers the "other
> side" of the IPsec tunnel as defined in the flow) with a lower MTU set
> on that interface, when the packet is attempted to be transmitted it
> will again see the lower MTU via the route table and be able to send
> frag-needed.
>
> It's easy to hit the blackhole case with IPsec tunnels *but* also often
> not so hard to avoid it.
>
> My preference is to try and set things up as much as possible so that
> you don't get PMTU blackholes or have to fragment the tunnel packet,
> but also clamp mss so that even if you do hit a blackhole there's no
> problem.
> There are some downsides to clamping MSS but they're relatively small
> and it's something done by almost every off-the-shelf home CPE so it's
> very very common on the internet.
>
> > I hope the clarifications help and that I'm right or at least that I
> > learn something new :)
> > Thanks
> > Tom Smyth
> >
> > On Sun, 15 May 2022 at 19:37, Stuart Henderson <[email protected]> wrote:
> >>
> >> On 2022-05-15, Tom Smyth <[email protected]> wrote:
> >> > IP fragments on internet are avoided generally through PMTU discovery
> >> > (mtu path discovery) but
> >> > PMTU does not work beyond a NAT (if a smaller MTU interface exists
> >> > behind a NAT then the smaller
> >> > MTU will not be discovered).
> >>
> >> That's not right, NAT doesn't break PMTU detection.
> >>
> >> > PMTU cannot properly account for underlay restrictions Inside a VPN
> >>
> >> Depends on the VPN type. For VPNs using a tunnel device (openvpn,
> >> WireGuard, gif/gre/l2tp etc, maybe route-based IPsec) then PMTU works
> >> like it would on another network type. Not normally for flow-based IPsec
> >> though as the MTU is taken from the route (but it can be made to work
> >> with a dummy interface covering the VPN range with a lower MTU set in
> >> it).
>
> --
> Please keep replies on the mailing list.
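
For reference, the arithmetic behind the MSS clamping and the "doubling packets per second" remarks in this thread can be sketched roughly as below. This is only an illustration, not anything taken from a real VPN or firewall: the function names are made up, the 60-byte tunnel overhead is a hypothetical figure (real overhead depends on the tunnel type and ciphers), and it assumes IPv4 with no IP or TCP options and that the overhead figure already includes the outer IPv4 header.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def clamped_mss(underlay_mtu, tunnel_overhead=0):
    """Largest TCP MSS whose packets fit the path without fragmentation.

    tunnel_overhead is the total number of bytes the tunnel adds around
    the inner IP packet (hypothetical value, supplied by the caller).
    """
    inner_mtu = underlay_mtu - tunnel_overhead
    mss = inner_mtu - IPV4_HEADER - TCP_HEADER
    if mss <= 0:
        raise ValueError("overhead leaves no room for TCP payload")
    return mss


def outer_packets(inner_len, tunnel_overhead, underlay_mtu):
    """Number of underlay packets needed to carry one inner IP packet.

    If the encapsulated packet exceeds the underlay MTU it is split into
    IPv4 fragments; every fragment except the last carries a payload
    that is a multiple of 8 bytes.
    """
    outer_len = inner_len + tunnel_overhead
    if outer_len <= underlay_mtu:
        return 1
    payload = outer_len - IPV4_HEADER
    per_fragment = (underlay_mtu - IPV4_HEADER) // 8 * 8
    return -(-payload // per_fragment)  # ceiling division


if __name__ == "__main__":
    print(clamped_mss(1500))              # plain ethernet
    print(clamped_mss(1492))              # PPPoE, as in the example above
    print(clamped_mss(1492, 60))          # PPPoE plus hypothetical tunnel
    print(outer_packets(1500, 60, 1492))  # full-size inner packet
    print(outer_packets(1432, 60, 1492))  # inner packet sized to the clamp
```

With these assumed numbers, a full 1500-byte inner packet costs two underlay packets, while a packet sized to the clamped MSS costs one, which is the packet-rate penalty described above.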

