Bug#404365: RFC 4380 advice to improve reliability of Teredo relays breaks clients behind Linux NATs in common configurations

Sam Hartman Sat, 23 Dec 2006 17:52:55 -0800


package: miredo
severity: important
version: 1.0.4-1
Tags: upstream
justification: Debian's Teredo implementation does not particularly work with 
Debian's NAT implementation



[I've copied the Miredo author because this really seems more an
upstream issue than a Debian issue.  I've copied Christian because he
may find this interesting and because he may want to consider what the
appropriate implementation advice is for future implementations.]

Section 5.4.1 of RFC 4380 suggests that to improve reliability Teredo
relays MAY send a bubble directed at the mapped IPv4 address even when
they do not believe they are behind a non-cone NAT.

Unfortunately, if you have a client behind a Linux NAT and you
receive a bubble to the mapped IPv4 address before the client sends
its bubble towards the relay, then Linux allocates the wrong mapped
port, and the bubble sent to the relay is rejected because its mapped
port does not match the Teredo address.  If you do not send the bubble
to the mapped IPv4 address then things work fine.  As a consequence,
getting clients behind Linux NATs to work with relays behind non-cone
NATs is challenging.  I think the best you can do is wait to send the
bubble to open your side of the NAT until the client has sent its
bubble.  If you're both behind Linux, well, you didn't really want
connectivity, did you?
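
To make the "mapped port does not match the Teredo address" check
concrete, here is a minimal Python sketch.  It assumes the RFC 4380
address layout (2001::/32 prefix, server IPv4 address, flags, then the
mapped port and mapped IPv4 address with all bits inverted).  The
function names are made up for illustration and are not Miredo's, and
192.0.2.45 is a documentation address standing in for the client's
real public address, which is elided in this report.

```python
import ipaddress

TEREDO_PREFIX = ipaddress.ip_network("2001::/32")  # prefix assigned by RFC 4380


def make_teredo_address(server_v4, flags, mapped_port, mapped_v4):
    """Build a Teredo address from its RFC 4380 fields (illustrative helper)."""
    server = int(ipaddress.IPv4Address(server_v4))
    port = mapped_port ^ 0xFFFF  # the port is stored with all bits inverted
    client = int(ipaddress.IPv4Address(mapped_v4)) ^ 0xFFFFFFFF  # so is the address
    value = (0x20010000 << 96) | (server << 64) | (flags << 48) | (port << 32) | client
    return ipaddress.IPv6Address(value)


def mapped_endpoint(teredo_addr):
    """Return the (mapped IPv4 address, mapped port) encoded in a Teredo address."""
    addr = ipaddress.IPv6Address(teredo_addr)
    if addr not in TEREDO_PREFIX:
        raise ValueError("not a Teredo address")
    value = int(addr)
    port = ((value >> 32) & 0xFFFF) ^ 0xFFFF
    v4 = ipaddress.IPv4Address((value & 0xFFFFFFFF) ^ 0xFFFFFFFF)
    return str(v4), port


def bubble_source_matches(teredo_addr, src_v4, src_port):
    """True if a bubble's observed UDP source equals the address's mapped endpoint."""
    return mapped_endpoint(teredo_addr) == (src_v4, src_port)


# Hypothetical client: qualified through 65.54.227.124 with mapped port 3545,
# cone bit clear (flags 0x0000).
client = make_teredo_address("65.54.227.124", 0x0000, 3545, "192.0.2.45")
print(client, mapped_endpoint(client))
print(bubble_source_matches(client, "192.0.2.45", 3545))  # bubble sent from 3545
print(bubble_source_matches(client, "192.0.2.45", 1024))  # bubble sent from 1024
```

A bubble that leaves the NAT from port 1024 fails this comparison even
though it comes from the right host, which is the failure described
above.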

Proposed solution: Miredo should gain an option to suppress the
optional bubble to the mapped IPv4 address when the cone bit is clear
and the relay is not behind a NAT.  We may want to consider whether
the advice in RFC 4380 should be qualified with an explanation about
this problem.  Someone should yell at the Linux ip_conntrack people
until they suck less.  I really don't know how to report Linux kernel
bugs effectively, so I'd appreciate help with that part.

Details:

Linux uses ip_conntrack to track connection state.  This connection
state is used for NAT bindings among other things.

Consider a simple Linux NAT with the following rule in the nat table
and no rules in other tables:

iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -j MASQUERADE

In other words, NAT outbound packets from 10.0.0.0/24.



Here's what happens when a client behind the NAT attempts to open a
session to port 143 on 2002:4519:c41c:2:216:3eff:fe5d:302f:


03:43:41.410918 IP xx.xx.xx.xx.3545 > 65.54.227.124.3544: UDP, length: 66
# to teredo server
03:43:43.238943 IP 69.25.196.28.32780 > xx.xx.xx.xx.3545: UDP, length: 40
# optional bubble from relay
03:43:43.239436 IP xx.xx.xx.xx > 69.25.196.28: icmp 76: xx.xx.xx.xx udp port 3545 unreachable
#Gee, we didn't have a mapping for that

03:43:43.326584 IP 65.54.227.124.3544 > xx.xx.xx.xx.3545: UDP, length: 48
#Here comes the bubble via the server
03:43:43.335244 IP xx.xx.xx.xx.1024 > 69.25.196.28.32780: UDP, length: 40
#And here comes the bubble  from the client to the relay
#Notice we got the wrong port outbound

03:43:47.485564 IP xx.xx.xx.xx.3545 > 65.54.227.124.3544: UDP, length: 66
#retry
03:43:49.355222 IP 69.25.196.28.32780 > xx.xx.xx.xx.3545: UDP, length: 40
03:43:49.355455 IP xx.xx.xx.xx > 69.25.196.28: icmp 76: xx.xx.xx.xx udp port 3545 unreachable
#And we still don't love the relay



What's causing us to get the wrong outbound port?
Let's look at our connection tracking tables (/proc/net/ip_conntrack)
looking for 69.25.196.28:

udp      17 3 src=69.25.196.28 dst=xx.xx.xx.xx sport=32780 dport=3545 packets=4 bytes=272 [UNREPLIED] src=xx.xx.xx.xx dst=69.25.196.28 sport=3545 dport=32780 packets=0 bytes=0 mark=0 use=1
udp      17 3 src=10.0.0.25 dst=69.25.196.28 sport=3545 dport=32780 packets=4 bytes=272 [UNREPLIED] src=69.25.196.28 dst=xx.xx.xx.xx sport=32780 dport=1024 packets=0 bytes=0 mark=0 use=1


The first line tells the horror story.  Linux sees an incoming locally
destined UDP packet.  It creates connection tracking state for remote
IP 69.25.196.28 from the Teredo relay to the Teredo client on the
local system.  Even though this packet generates an ICMP error because
there is no socket listening on the local system for that port, the
connection state is retained.  So when the client then tries to send,
it cannot obtain public port 3545 because there is existing connection
state for that port and peer.  It is assigned a new port (1024 here)
and Teredo doesn't work.  I really hope that this Linux behavior is
against draft-ietf-behave-nat-udp because it's certainly anti-social.
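
The port selection described above can be sketched as a toy model in
Python.  This is only an illustration of the behavior seen in the
traces, not the kernel's actual algorithm: the class name, the
method names and the "first free port from 1024" fallback are all
assumptions chosen to reproduce the observed result.

```python
class ToyNat:
    """Toy model of the observed port selection; not the real ip_conntrack logic."""

    def __init__(self, fallback_start=1024):
        self.fallback_start = fallback_start  # assumed start of the fallback range
        # Claimed reply-direction tuples: (remote_ip, remote_port, public_port).
        self.claimed = set()

    def inbound_unsolicited(self, remote_ip, remote_port, public_port):
        # An unsolicited inbound packet still creates tracking state, even if
        # the host only answers it with an ICMP port unreachable.
        self.claimed.add((remote_ip, remote_port, public_port))

    def outbound(self, private_port, remote_ip, remote_port):
        # Keep the client's source port if the tuple is free; otherwise pick
        # the first free port in the fallback range.
        port = private_port
        if (remote_ip, remote_port, port) in self.claimed:
            port = self.fallback_start
            while (remote_ip, remote_port, port) in self.claimed:
                port += 1
        self.claimed.add((remote_ip, remote_port, port))
        return port


relay = ("69.25.196.28", 32780)

# Relay's optional bubble arrives first and claims public port 3545.
broken = ToyNat()
broken.inbound_unsolicited(relay[0], relay[1], 3545)
print(broken.outbound(3545, *relay))  # 1024: the wrong mapped port

# Bubble suppressed (or blackholed): the client's packet goes out first.
working = ToyNat()
print(working.outbound(3545, *relay))  # 3545: matches the Teredo address
```

Note that in this model, as in the trace, packets to the Teredo server
still leave from port 3545, because the stale state only covers the
relay's address and port.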

So, what do things look like if we introduce a blackhole route near
the relay to prevent the bubble from the relay to the mapped address
from reaching the Linux box?  We will remove this route after the
client has had a chance to create NAT state.


04:18:03.174279 IP xx.xx.xx.xx.3545 > 65.54.227.124.3544: UDP, length: 66
04:18:05.131731 IP 65.54.227.124.3544 > xx.xx.xx.xx.3545: UDP, length: 48
04:18:05.140466 IP xx.xx.xx.xx.3545 > 69.25.196.28.32780: UDP, length: 40
04:18:09.255394 IP xx.xx.xx.xx.3545 > 65.54.227.124.3544: UDP, length: 66
04:18:13.311655 arp who-has 148.64.166.189 tell xx.xx.xx.xx
04:18:13.322788 arp reply 148.64.166.189 is-at 40:00:00:44:31:01
04:18:21.424976 IP xx.xx.xx.xx.3545 > 65.54.227.124.3544: UDP, length: 66
04:18:23.343698 IP 69.25.196.28.32780 > xx.xx.xx.xx.3545: UDP, length: 66
04:18:23.350882 IP xx.xx.xx.xx.3545 > 69.25.196.28.32780: UDP, length: 80
04:18:23.354926 IP xx.xx.xx.xx.3545 > 69.25.196.28.32780: UDP, length: 80
04:18:23.355278 IP xx.xx.xx.xx.3545 > 69.25.196.28.32780: UDP, length: 80
04:18:27.072693 IP 69.25.196.28.32780 > xx.xx.xx.xx.3545: UDP, length: 80
04:18:27.075442 IP 69.25.196.28.32780 > xx.xx.xx.xx.3545: UDP, length: 72
04:18:27.078220 IP 69.25.196.28.32780 > xx.xx.xx.xx.3545: UDP, length: 72
04:18:27.080359 IP xx.xx.xx.xx.3545 > 69.25.196.28.32780: UDP, length: 72
04:18:28.891810 IP 69.25.196.28.32780 > xx.xx.xx.xx.3545: UDP, length: 147
04:18:28.898799 IP xx.xx.xx.xx.3545 > 69.25.196.28.32780: UDP, length: 72
04:18:34.134020 IP xx.xx.xx.xx.3545 > 69.25.196.28.32780: UDP, length: 72
04:18:35.916735 IP 69.25.196.28.32780 > xx.xx.xx.xx.3545: UDP, length: 72
04:18:35.963221 IP xx.xx.xx.xx.3545 > 69.25.196.28.32780: UDP, length: 72

20 packets captured

And the telnet session works and we get a nice IMAP greeting.
The connection tracking state is also telling:

udp      17 165 src=10.0.0.25 dst=69.25.196.28 sport=3545 dport=32780 packets=8 bytes=792 src=69.25.196.28 dst=xx.xx.xx.xx sport=32780 dport=3545 packets=6 bytes=677 [ASSURED] mark=0 use=1

Only one entry and it actually received a reply.
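
For anyone who wants to check their own NAT for this condition, here
is a rough Python sketch that scans ip_conntrack lines for unreplied
entries that were started from outside.  The function names and the
"10.0.0." prefix test are illustrative assumptions, and the parser
only understands the line format shown in this report.

```python
def parse_conntrack_line(line):
    """Split one /proc/net/ip_conntrack line into original and reply tuples."""
    tokens = line.split()
    entry = {"proto": tokens[0], "flags": set(), "original": {}, "reply": {}}
    for token in tokens[1:]:
        if token.startswith("[") and token.endswith("]"):
            entry["flags"].add(token.strip("[]"))
        elif "=" in token:
            key, value = token.split("=", 1)
            if key in ("src", "dst", "sport", "dport"):
                # First occurrence is the original direction, second the reply.
                side = "reply" if key in entry["original"] else "original"
                entry[side][key] = value
    return entry


def inbound_unreplied(lines, inside_prefix):
    """Return public ports held by unreplied entries that were started from outside."""
    ports = []
    for line in lines:
        entry = parse_conntrack_line(line)
        started_inside = entry["original"]["src"].startswith(inside_prefix)
        if "UNREPLIED" in entry["flags"] and not started_inside:
            ports.append(int(entry["original"]["dport"]))
    return ports


# The entries quoted in this report, with the public address still elided.
BROKEN = [
    "udp      17 3 src=69.25.196.28 dst=xx.xx.xx.xx sport=32780 dport=3545 "
    "packets=4 bytes=272 [UNREPLIED] src=xx.xx.xx.xx dst=69.25.196.28 "
    "sport=3545 dport=32780 packets=0 bytes=0 mark=0 use=1",
    "udp      17 3 src=10.0.0.25 dst=69.25.196.28 sport=3545 dport=32780 "
    "packets=4 bytes=272 [UNREPLIED] src=69.25.196.28 dst=xx.xx.xx.xx "
    "sport=32780 dport=1024 packets=0 bytes=0 mark=0 use=1",
]
WORKING = [
    "udp      17 165 src=10.0.0.25 dst=69.25.196.28 sport=3545 dport=32780 "
    "packets=8 bytes=792 src=69.25.196.28 dst=xx.xx.xx.xx sport=32780 "
    "dport=3545 packets=6 bytes=677 [ASSURED] mark=0 use=1",
]

print(inbound_unreplied(BROKEN, "10.0.0."))   # [3545]
print(inbound_unreplied(WORKING, "10.0.0."))  # []
```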


I tried to find a rule in the raw table of Linux to tell ip_conntrack
to ignore incoming packets that were just going to generate ICMP
errors using the NOTRACK target.  I was unable to do so in such a way
that the reply to the client actually established connection state, so
I couldn't make that useful.

I know of no workaround on the Linux side.  The only thing I know
will work is to prevent the relay from sending the optional bubble.
I'd appreciate any comments or suggestions and I'd really appreciate a
Miredo option to suppress that bubble.

