I have been looking into a problem with Path MTU Discovery when routing packets over IPSec (gif) tunnels. I have submitted the details to the open PR kern/91412, but have had no response as to whether my patch is the correct solution to the problem.

The problem occurs when sys/netinet/ip_input.c constructs the ICMP Destination Unreachable (fragmentation needed) message with an MTU hint. This is triggered when a packet that is to be routed over the IPSec link is larger than the MTU of the link and has the Don't Fragment bit set.

There is a block of code specific to IPSec (gif) MTU discovery that attempts to calculate the MTU of the link by working out the size of the IPSec header and subtracting this, together with the gif IP header, from the MTU of the transmission interface. However, the code fails to retrieve a fully populated security policy, so the block designed to calculate the MTU never runs. The code then breaks out of the case statement and transmits the ICMP packet with the MTU hint set to 0.

If the break is removed from the IPSec code, the MTU calculation is carried out successfully by the non-IPSec code. This raises the question of whether the IPSec code is needed at all, given that the normal code works fine.
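
The control flow described above can be modelled in a few lines of userland C. This is only a sketch under stated assumptions, not the kernel code: struct toy_policy, toy_mtu_hint() and all field names are hypothetical, and the non-IPSec path is assumed to simply use the MTU of the interface the packet is routed out of.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a security policy; not the kernel's struct. */
struct toy_policy {
	bool fully_populated;   /* false models the partially filled policy */
	int  ipsec_hdr_len;     /* size of the IPSec header, in bytes */
	int  outer_ip_hdr_len;  /* size of the outer (gif) IP header, in bytes */
	int  tx_if_mtu;         /* MTU of the transmission interface */
};

/*
 * Return the MTU hint that would go into the ICMP message.
 * keep_break == true models the unpatched code, false the patched code.
 */
static int
toy_mtu_hint(const struct toy_policy *sp, int route_if_mtu, bool keep_break)
{
	int mtu = 0;

	assert(route_if_mtu > 0);

	if (sp != NULL) {
		/* IPSec-specific calculation, skipped when the policy is incomplete. */
		if (sp->fully_populated)
			mtu = sp->tx_if_mtu - sp->ipsec_hdr_len - sp->outer_ip_hdr_len;
		/* Unpatched: leave the case here, even if mtu is still 0. */
		if (keep_break)
			return mtu;
	}

	/* Non-IPSec path (assumed): MTU of the interface the packet is routed out of. */
	mtu = route_if_mtu;
	return mtu;
}
```

With an incomplete policy and a routed interface MTU of 1280, the model returns 0 when the break is kept and 1280 when it is removed, which matches the hints seen in the ping tests below. Note that in this model the fall-through overwrites any value the IPSec block computed, so it says nothing about whether the patch is correct when the policy is fully populated.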


Network Layout:

Box 1 --------- Router 2 --(IPSec tunnel)-- Router 3 --(lan) --- Box 2
        |(lan)
        |------ Router 1


Box 1: FreeBSD 5.4
Router [123]: FreeBSD 6.1
Box 2: Linux 2.6


Tests to reproduce the error:

Ping test from box 1 to box 2 with the Don't Fragment bit set and a packet larger than the path MTU:

box1# ping -s 1280 -D box2
PING box2 (10.0.0.79): 1280 data bytes
36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 051c b454   0 0000  40  01 c9fc 172.17.1.48  10.0.0.79

36 bytes from router2 (172.17.3.6): frag needed and DF set (MTU 0)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 1c05 b454   0 0000  3f  01 cafc 172.17.1.48  10.0.0.79

36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 051c b45f   0 0000  40  01 c9f1 172.17.1.48  10.0.0.79

36 bytes from router2 (172.17.3.6): frag needed and DF set (MTU 0)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 1c05 b45f   0 0000  3f  01 caf1 172.17.1.48  10.0.0.79

^C
--- box2 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

Ping test from box 1 to box 2 with the Don't Fragment bit set and a packet smaller than the path MTU:

box1# ping -s 1200 -D box2
PING box2 (10.0.0.79): 1200 data bytes
36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 04cc b472   0 0000  40  01 ca2e 172.17.1.48  10.0.0.79

1208 bytes from 10.0.0.79: icmp_seq=0 ttl=61 time=111.017 ms
36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 04cc b479   0 0000  40  01 ca27 172.17.1.48  10.0.0.79

1208 bytes from 10.0.0.79: icmp_seq=1 ttl=61 time=110.419 ms
^C
--- box2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 110.419/110.718/111.017/0.299 ms
box1#


Relevant interface configuration on box1 (from ifconfig):

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
       options=b<RXCSUM,TXCSUM,VLAN_MTU>
       inet 172.17.1.48 netmask 0xffff0000 broadcast 172.17.255.255
       ether 00:0f:1f:fa:d1:b5
       media: Ethernet autoselect (1000baseTX <full-duplex>)
       status: active


Relevant interface configuration on router2 (from ifconfig):

em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
       options=b<RXCSUM,TXCSUM,VLAN_MTU>
       inet 172.17.3.6 netmask 0xffff0000 broadcast 172.17.255.255
       ether 00:c0:9f:12:13:1b
       media: Ethernet autoselect (1000baseTX <full-duplex>)
       status: active
gif0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
       tunnel inet 63.174.xxx.xxx --> 82.195.xxx.xxx
       inet 192.168.174.10 --> 192.168.174.9 netmask 0xfffffffc



Patch:

Index: sys/netinet/ip_input.c
===================================================================
--- sys/netinet/ip_input.c      (revision 24)
+++ sys/netinet/ip_input.c      (working copy)
@@ -1990,8 +1990,8 @@
 #else /* FAST_IPSEC */
                                KEY_FREESP(&sp);
 #endif
-                               ipstat.ips_cantfrag++;
-                               break;
+//                             ipstat.ips_cantfrag++;
+//                             break;
                        }
                }
 #endif /*IPSEC || FAST_IPSEC*/



Tests after the patch has been applied:

Ping test from box 1 to box 2 with the Don't Fragment bit set and a packet larger than the path MTU:

box1# ping -s 1280 -D box2
PING box2 (10.0.0.79): 1280 data bytes
36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 051c b454   0 0000  40  01 c9fc 172.17.1.48  10.0.0.79

36 bytes from router2 (172.17.3.6): frag needed and DF set (MTU 1280)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 1c05 b454   0 0000  3f  01 cafc 172.17.1.48  10.0.0.79

36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 051c b45f   0 0000  40  01 c9f1 172.17.1.48  10.0.0.79

36 bytes from router2 (172.17.3.6): frag needed and DF set (MTU 1280)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
4  5  00 1c05 b45f   0 0000  3f  01 caf1 172.17.1.48  10.0.0.79


Any comments or suggestions on this would be greatly appreciated.

Tom J

