PMTUD issue using NAT64

Luís Rosa Wed, 24 Oct 2012 09:16:53 -0700

Hi list,

I'm doing some research on NAT64 and OpenBSD. I would like to say that
OpenBSD is very good in this area. During some tests, I observed issues
regarding PMTUD.


I was testing OpenBSD 5.2 with different MTU values on its interfaces.
When a packet is bigger than the MTU of the output interface, the
translator (pf) has to generate an error message, either ICMPv4
"fragmentation needed" or ICMPv6 "packet too big". Those messages are not
generated properly. The expected behavior is described in RFC 6145.
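
As a rough illustration of that expected behavior, here is a simplified
reading of RFC 6145 in Python. It is not pf code: the function name and the
returned action strings are made up for this example, and it assumes an
IPv4 header without options (20 bytes) and no IPv6 fragment header.

```python
# Size difference between an IPv6 header (40 bytes) and an option-less
# IPv4 header (20 bytes).
HDR_DIFF = 20


def expected_action(orig_af, orig_len, df, out_mtu):
    """Return (action, reported_mtu) for a packet entering the translator.

    orig_af  -- "inet" or "inet6", the family of the packet before af-to
    orig_len -- total IP packet length before translation, in bytes
    df       -- IPv4 "don't fragment" flag (ignored for inet6)
    out_mtu  -- MTU of the interface the translated packet leaves on
    """
    if orig_af == "inet":
        # IPv4 -> IPv6: the translated packet grows by HDR_DIFF bytes.
        translated_len = orig_len + HDR_DIFF
        if translated_len <= out_mtu:
            return ("forward", None)
        if df:
            # Error goes back to the IPv4 sender, in its own family.
            return ("icmp4-frag-needed", out_mtu - HDR_DIFF)
        # DF clear: the translator may fragment and add a fragment header.
        return ("fragment", None)

    if orig_af == "inet6":
        # IPv6 -> IPv4: the translated packet shrinks by HDR_DIFF bytes.
        translated_len = orig_len - HDR_DIFF
        if translated_len <= out_mtu:
            return ("forward", None)
        # Error goes back to the IPv6 sender, in its own family.
        return ("icmp6-packet-too-big", out_mtu + HDR_DIFF)

    raise ValueError("unknown address family: %r" % (orig_af,))


# 1440-byte UDP payload, output MTU 1400, as in the captures in this report.
print(expected_action("inet", 20 + 8 + 1440, True, 1400))
print(expected_action("inet6", 40 + 8 + 1440, False, 1400))
```

The point of the sketch is that the error must be sent in the address
family of the original packet and to its original source, with the MTU
adjusted for the header size difference.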

Notice that this scenario is different from the existing regress tests in
regress/sys/net/pf_forward, where just ICMP message forwarding (originating
from the outside) is tested.

To reproduce the behavior, you need to set up a NAT64 scenario using af-to
rules and reduce the MTU of the output interface (the one pf forwards the
translated packets out of). I used scapy to craft "big" packets with DF set.
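
For readers without scapy at hand, the IPv4 test packet can be approximated
with the Python standard library alone. This is only a sketch, not the
script used for the tests in this report: the helper names are made up, and
the addresses, ports and payload size are taken from the tcpdump output
further down. It only builds the bytes; sending them needs a raw socket and
root privileges.

```python
import socket
import struct


def inet_checksum(data: bytes) -> int:
    """16-bit one's-complement Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_udp(src, dst, sport, dport, payload_len, df=True):
    """Build an IPv4/UDP packet with payload_len filler bytes."""
    src_b = socket.inet_aton(src)
    dst_b = socket.inet_aton(dst)
    payload = b"X" * payload_len
    udp_len = 8 + payload_len

    # UDP checksum over pseudo-header + UDP header + payload. A zero UDP
    # checksum is legal in IPv4 but not in IPv6, so compute a real one.
    pseudo = src_b + dst_b + struct.pack("!BBH", 0, socket.IPPROTO_UDP, udp_len)
    udp_hdr = struct.pack("!HHHH", sport, dport, udp_len, 0)
    udp_sum = inet_checksum(pseudo + udp_hdr + payload) or 0xFFFF
    udp_hdr = struct.pack("!HHHH", sport, dport, udp_len, udp_sum)

    # IPv4 header: version 4, IHL 5 (no options); DF is bit 0x4000 of the
    # flags/fragment-offset field.
    total_len = 20 + udp_len
    flags_frag = 0x4000 if df else 0
    ip_hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, 0, flags_frag,
                         64, socket.IPPROTO_UDP, 0, src_b, dst_b)
    ip_sum = inet_checksum(ip_hdr)
    ip_hdr = ip_hdr[:10] + struct.pack("!H", ip_sum) + ip_hdr[12:]
    return ip_hdr + udp_hdr + payload


pkt = build_ipv4_udp("192.1.1.1", "192.1.1.64", 6666, 9090, 1440)
print(len(pkt), "bytes, DF set:", bool(pkt[6] & 0x40))
```

With a 1440-byte payload the IPv4 packet is 1468 bytes long, and 1488 bytes
after translation to IPv6, so it does not fit a 1400-byte output MTU.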

$ cat /etc/pf.conf
ext_if="vic0"
ext_ip4="192.1.1.64"
ext_ip6="2012::64"

int_if="vic1"
int_ip4="192.2.2.64"
int_ip6="2011::64"

int_dst4="192.2.2.2"
int_dst6="2011::11"

#NAT64 & NAT46
pass in log on $ext_if inet to $ext_ip4 af-to inet6 from $int_ip6 to $int_dst6
pass in log on $ext_if inet6 to $ext_ip6 af-to inet from $int_ip4 to $int_dst4

pass log inet6 proto icmp6
pass log inet proto icmp


$ ifconfig vic  | grep mtu
vic0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
vic1: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1400


In the IPv6 to IPv4 translation case, I observed ICMPv4 "fragmentation
needed" messages being sent to the loopback interface (with wrong addresses)
instead of ICMPv6 "packet too big" messages going back to the original
sender.

$ tcpdump -Nnti vic0
2012::11.6666 > 2012::64.90: udp 1440

$ tcpdump -Nnti lo0
192.2.2.2 > 192.2.2.64: icmp: 192.2.2.2 unreachable - need to frag (mtu 1400)


In the IPv4 to IPv6 translation case, I noticed that the af-to code
"ignores" the IPv4 "don't fragment" flag. The result is an IPv6 packet with
a fragmentation header, and no ICMP error message is sent.

$ tcpdump -Nnti vic0
192.1.1.1.6666 > 192.1.1.64.9090: udp 1440 (DF)

$ tcpdump -Nnti vic1
2011::64 > 2011::11: frag (0x4c65749a:1352@0+) 63926 > 9090:  udp 1440
2011::64 > 2011::11: frag (0x4c65749a:96@1352)


Over the last few weeks, I started to dig into the source code. In the IPv6
to IPv4 translation case, the MTU check is performed after the translation,
in pf_route, but then the icmp_error* and icmp_reflect functions cannot
properly reflect the error to the original source because it belongs to a
different address family.

In the other case, IPv4 to IPv6, the original IPv4 flags get lost during
translation. At least the DF bit is needed to decide whether to fragment
the packet or to drop it and send back an error message.

I still don't have a patch ready. I would appreciate any help or hints. My
current idea is to pass, e.g., the DF bit and the original source address to
the place where the ICMP error messages are generated. If necessary, that
function would translate back to the original af and send the error message.

-- 
Cumprimentos,
Luís Rosa
