Davin Flatten wrote:
Hello-
I am having a strange problem ever since we applied the Fedora Core 5
update to the OpenSSH RPMs. Ever since the update, when some users
connect through a NAT gateway to the NAT'ed server, the connection hangs.
This occurs only for some combinations of firewalls. Below is all the
information I could gather on the subject. Has anyone had this same
problem and found a solution?
The setup is as follows:
ssh server <-- NAT firewall #1 <-- Internet <-- NAT firewall #2 <-- ssh client
I would put money on it being this:
http://www.snailbook.com/faq/mtu-mismatch.auto.html
[quote]
Short Answer
You probably have an MTU/fragmentation problem. For each network
interface on both client and server set the MTU to 576, eg ifconfig eth0
mtu 576. If the problem goes away, read on.
Long Answer
At each routing hop, IP packets bigger than the outgoing
interface’s Maximum Transmission Unit (MTU) get fragmented. Only the
first fragment has TCP port numbers. Firewalls often behave badly in the
presence of packet fragmentation, dropping everything but the first
fragment since the subsequent ones can’t be matched against the firewall
rules. Some NAT configurations (eg many-to-one NAT or port address
translation) can’t match the fragments against their translation state
tables.
Arguably, such devices should perform packet reassembly first so as to
properly consider fragmented packets. However, this is more complicated
and so is often not done. Also, this feature would raise a possible
starvation attack against the packet filter, by sending many bogus
initial fragments and causing the device to store them for reassembly
with subsequent packets which will never come.
Logging in and using the shell will normally generate relatively small
packets, and so the initial connection proceeds normally; however, if
you do something that generates a lot of data (eg cat'ing a big file or
starting an X Windows application), you may generate a packet bigger
than the MTU.
Let's say it’s a 1500 byte IP packet and the router has 2 different
MTU's (say 1500 & 1484) and no firewall. When the router goes to forward
it, the packet is too big for the interface MTU (1484), so the router
breaks it into 2 fragments, 0 and 1. Fragment 0 contains the first 1484
bytes (including the TCP source and dest ports) and fragment 1 contains
the remaining 16 bytes. Both fragments are sent on to their destinations.
When the first fragment reaches its target, it’s held by the IP stack
until the remaining fragments arrive, at which time the IP packet is
reassembled and passed up the stack to TCP. If all fragments are not
received by the timeout, the entire IP packet is discarded and an ICMP
"timeout during reassembly" error is sent back.
Now add your firewall, which drops fragment 1. Your 1500 byte IP packet
times out during reassembly and TCP retries, by sending another 1500
byte packet. Repeat. Eventually, TCP will time out and you’ll get a
connection termination.
IP stack parameters (such as Path MTU Discovery) and external variables
(such as the MTU's of all the hops between hosts) can also affect
whether or not a given connection will have this problem.
[/quote]
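To make the 1500/1484 example from the quoted FAQ concrete, here is a rough Python sketch of the fragmentation arithmetic. It is only an illustration, not how any particular router or firewall is implemented. It assumes a 20-byte IPv4 header with no options, and the names fragment() and naive_port_filter() are made up for this example. Note that each fragment carries its own IP header, so the trailing 16 bytes of payload travel as a 36-byte packet.

```python
IP_HEADER = 20  # bytes; assumption: IPv4 header without options


def fragment(total_len, mtu, ip_header=IP_HEADER):
    """Split one IP packet into (payload_offset, on_wire_length, more_fragments) tuples."""
    if total_len <= mtu:
        return [(0, total_len, False)]
    payload = total_len - ip_header
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_payload = (mtu - ip_header) // 8 * 8
    frags = []
    offset = 0
    while offset < payload:
        size = min(max_payload, payload - offset)
        more = offset + size < payload
        frags.append((offset, size + ip_header, more))
        offset += size
    return frags


def naive_port_filter(frags):
    """Hypothetical filter that can only match on TCP ports.

    Only the fragment at offset 0 contains the TCP header, so everything
    else is dropped, as the FAQ describes.
    """
    return [f for f in frags if f[0] == 0]


frags = fragment(1500, 1484)
print(frags)                     # both fragments leave the router
print(naive_port_filter(frags))  # only the first one survives the filter
```

With an MTU of 576 on the end hosts, as the short answer suggests, the packets are small enough that fragment() returns a single entry and the filter has nothing to drop.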
Firewall #1 is an OpenBSD gateway running m0n0wall, and Firewall #2
depends on which client is connecting.
I thought m0n0wall was FreeBSD based, but if it uses PF and you're using
"keep state" rules then make sure you're only creating state on the TCP
SYN packets, eg:
pass in [...] proto tcp [...] flags S/SA keep state
If you omit the "flags S/SA" then, depending on your rules, you may end up
creating state on a reply packet. The window scale option is only sent in
the SYN and SYN+ACK packets, so state created from a later packet may leave
PF's idea of the TCP window scaling wrong. This will mostly work but can
cause all kinds of weird problems...
--
Darren Tucker (dtucker at zip.com.au)
GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4 37C9 C982 80C7 8FF4 FA69
Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.