Carl Lowenstein wrote:
This seems to have cured the problem. The next question is: why are
these ssh configuration parameters not mentioned in ssh_config(5) or
in /etc/ssh/ssh_config? Maybe there is other documentation for ssh
(OpenSSH)?
I don't remember where I saw this. All I know is that I investigated
some way to send ssh keepalives. I was being bitten by a translation
timeout when doing rsync of large (>50 GByte) files -- ssh was keeping
the socket open, but while the large block checksums were computed no
data was going over the pipe, so the PIX firewall decided the connection
was stale and removed it.
Why does my Netgear WGR614 do this to me? Looking at the Netgear user
forum, it appears that others have similar problems with a 10-minute
inactivity timeout. It is reported that changing to a different
revision of firmware may or may not solve the problem. That is such a
confidence-building statement.
Changing the firmware probably won't fix the problem. In order to
explain why your NAT appliance is doing this, I have to explain how
source NAT does its thing. SNAT is a translation technique that works
across layers 3 and 4. At layer 3 (network) we have the IP protocol,
which controls addressing and packet destination.
IP packets contain (typically) a 20-byte header. The fields, in order, are:
Version (4 bits -- always 0x4)
Header Length (4 bits -- number of 32 bit words, which is almost always 0x5)
ToS/DSCP (8 bits -- used in QoS et al)
Total Datagram Length (16 bits)
Identification (16 bits -- used for reassembly of fragmented packets)
Flags (3 bits -- controls fragmentation and a couple other settings)
Fragmentation Offset (13 bits)
Time to Live (8 bits -- number of hops a packet may traverse before being
dropped)
Protocol (8 bits -- determines whether the packet contains ICMP, TCP,
UDP, et al)
Header Checksum (16 bits -- detects a packet header that was corrupted
during delivery. It offers no protection against deliberate tampering.
Packets that fail the check are immediately dropped)
Source Address (32 bits)
Destination Address (32 bits)
Optional Other Information (arbitrary length up to ten 32 bit words
-- this is if the header length is greater than 5 words)
Data (arbitrary length up to 65,515 bytes when the header is 20 bytes --
this contains the data payload of the packet)
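As a rough sketch of the field layout listed above, here is how those 20
bytes could be unpacked and the header checksum verified in Python. The
function names and the sample addresses are made up for illustration;
only the field order and widths come from the list.

```python
import socket
import struct

IPV4_FIXED = "!BBHHHBBH4s4s"  # the 20-byte fixed part of the header

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, inverted."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def parse_ipv4_header(packet: bytes) -> dict:
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack(IPV4_FIXED, packet[:20])
    ihl = ver_ihl & 0x0F                    # header length in 32-bit words
    return {
        "version": ver_ihl >> 4,
        "ihl_words": ihl,
        "tos": tos,
        "total_length": total_len,
        "identification": ident,
        "flags": flags_frag >> 13,          # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,                  # 1 = ICMP, 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "options": packet[20:ihl * 4],
        # Summing a valid header, checksum field included, yields zero.
        "checksum_ok": ipv4_checksum(packet[:ihl * 4]) == 0,
    }

def build_sample_header() -> bytes:
    """Hypothetical header: 192.168.1.10 -> 203.0.113.5, TCP, TTL 64, DF set."""
    fields = [0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
              socket.inet_aton("192.168.1.10"), socket.inet_aton("203.0.113.5")]
    fields[7] = ipv4_checksum(struct.pack(IPV4_FIXED, *fields))
    return struct.pack(IPV4_FIXED, *fields)
```

Because the checksum covers the source address, anything that rewrites
that address has to recompute the checksum as well.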
Furthermore, protocols like TCP and UDP sit at layer 4 (transport) and
provide more information about where a packet is going. They also have a
header within the data portion of the IP packet.
We'll use TCP as an example:
source port (16 bits)
destination port (16 bits)
sequence number (32 bits -- rolling counter of the bytes sent so far)
ACK number (32 bits -- contains the sequence number of the next byte
the sender expects to receive)
Data offset (4 bits -- number of 32 bit words in TCP header -- always at
least 5)
Reserved/ECN (6 bits -- three Reserved bits must be zeroed, remaining
three bits control Explicit Congestion Notification)
Flags (6 bits -- contains the fields URG, ACK, PSH, RST, SYN, FIN --
these control TCP stateful operation)
Window Size (16 bit -- the number of bytes the sender is willing to
accept back in a message)
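Checksum (16 bits -- this covers a pseudo-header taken from the IP header
(source address, destination address, protocol, and TCP length), the TCP
header, and the data all in one. If this checksum is invalid, the
destination discards the segment and the sending station retransmits it
when no ACK arrives. If this were UDP, the packet would simply be
dropped).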
Urgent Pointer (16 bits -- if URG is set in flags, this points to last
byte in sequence of urgent data. Normally zeroed)
Optional Other Information (if the header size is greater than 5 words)
Data (Up to 65,495 bytes with minimum-size IP and TCP headers)
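The TCP fields above can be pulled apart the same way. This is only a
sketch; the function name and the sample port numbers are invented for
the example, and it reads just the six classic flag bits named in the
list.

```python
import struct

TCP_FIXED = "!HHIIHHHH"  # the 20-byte fixed part of the TCP header
TCP_FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # lowest bit first

def parse_tcp_header(segment: bytes) -> dict:
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack(TCP_FIXED, segment[:20])
    data_offset = offset_flags >> 12        # header length in 32-bit words
    flag_bits = offset_flags & 0x3F         # low 6 bits hold the flags
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "data_offset_words": data_offset,
        "flags": [n for i, n in enumerate(TCP_FLAG_NAMES) if flag_bits & (1 << i)],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
        "options": segment[20:data_offset * 4],
        "payload": segment[data_offset * 4:],
    }

# Hypothetical SYN from local port 50000 to an ssh server on port 22.
sample_syn = struct.pack(TCP_FIXED, 50000, 22, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
```

The source port and the checksum are the two fields a NAT appliance may
have to rewrite.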
OK, now that we know what a typical packet header contains, here's what
NAT does on a typical outbound packet.
First, NAT looks at the packet's source address. If the packet is
'interesting' (meaning, there is an entry in the appliance's translation
source table for that source address or range) it performs further
processing. It takes that source address and changes it to one of its
designated source addresses that it found in its translation source
table. Next, the appliance looks at the protocol field and decides
whether it can perform further translation. If it can, it recomputes the
IP header checksum and mangles the packet with the new source
information also. Let's say our arbitrary packet contains TCP data, and
can be further translated. In the case of TCP, really only two fields
may be touched. The TCP source port is examined. The appliance looks in
its TCP state table to see if that source port is already in use on the
new source address. If it is, it picks the next available port. If it
isn't, then it leaves the port field intact. The packet is then mangled
with a recomputed checksum and (if necessary) the new source port. If
this is a connection establishment request (SYN set), an entry in the
appliance's translation state table (TST) is created. If not, the TST is
queried for an applicable translation. If one exists, it is used. If one
doesn't exist, a packet with RST set is sent to the originator. The
translation state table typically contains (at least) the old source
address, the new source address, the translation timer, and any L4
protocol information like old and new source port in the case of TCP.
With inbound packets the process is a little bit different -- the
translation state table is queried first for a translation entry. If one
exists for the packet's source/destination, the reverse of the outbound
process occurs.
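The outbound and inbound steps above can be sketched as a toy translator.
This is a simplified model, not how any particular appliance is
implemented: the class and field names are hypothetical, checksums are
left out, there is a single public address, and the table is keyed on
the public source port alone.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

@dataclass
class Packet:
    src: str
    sport: int
    dst: str
    dport: int
    syn: bool = False
    rst: bool = False

class SourceNat:
    def __init__(self, inside_net: str, public_addr: str):
        self.inside = ipaddress.ip_network(inside_net)
        self.public = public_addr
        self.out = {}   # (inside addr, inside port, dst, dport) -> public port
        self.back = {}  # public port -> (inside addr, inside port)

    def outbound(self, p: Packet) -> Packet:
        if ipaddress.ip_address(p.src) not in self.inside:
            return p                        # not 'interesting': pass untouched
        key = (p.src, p.sport, p.dst, p.dport)
        if key not in self.out:
            if not p.syn:
                # No translation and not a new connection: reset the originator.
                return Packet(p.dst, p.dport, p.src, p.sport, rst=True)
            port = p.sport
            while port in self.back:        # port taken: try the next one
                port = port + 1 if port < 65535 else 1024
            self.out[key] = port
            self.back[port] = (p.src, p.sport)
        return Packet(self.public, self.out[key], p.dst, p.dport, syn=p.syn)

    def inbound(self, p: Packet) -> Optional[Packet]:
        entry = self.back.get(p.dport) if p.dst == self.public else None
        if entry is None:
            return None                     # no translation: nothing to deliver
        inside_addr, inside_port = entry
        return Packet(p.src, p.sport, inside_addr, inside_port, syn=p.syn)
```

Two inside hosts that both pick source port 50000 end up on different
public ports, which is why the table has to remember the old and new
port for each connection.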
The problem exists in appliances with small amounts of memory, so their
translation state tables aren't very big. That means they need to time
out translations that haven't been used for a set amount of time, to
avoid filling the small amount of memory allocated to translation tasks.
In your case, this is 10 minutes. Upon initial entry into the
translation state table, a timer is set on the translation entry.
Subsequent packets reset this timer. If the timer is allowed to expire,
the translation entry is removed from the table.
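The timer behaviour just described can be sketched like this. The class
is hypothetical, and the clock is passed in as a plain number of seconds
so the 10-minute timeout can be followed without waiting for it.

```python
class TranslationTable:
    """Toy translation state table with a per-entry idle timer."""

    def __init__(self, idle_timeout: float = 600.0):  # 10 minutes
        self.idle_timeout = idle_timeout
        self.entries = {}  # key -> [translation, time of last packet]

    def add(self, key, translation, now: float) -> None:
        self.entries[key] = [translation, now]

    def expire(self, now: float) -> None:
        # Drop every entry that has been idle for the full timeout.
        stale = [k for k, (_, last) in self.entries.items()
                 if now - last >= self.idle_timeout]
        for k in stale:
            del self.entries[k]

    def lookup(self, key, now: float):
        self.expire(now)
        entry = self.entries.get(key)
        if entry is None:
            return None        # translation is gone
        entry[1] = now         # each packet resets the timer
        return entry[0]
```

A connection that sends something every 500 seconds keeps its entry
indefinitely, while one that goes quiet for 600 seconds loses it.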
This brings up a question: Why wouldn't the appliance just recreate the
translation entry when subsequent data is transmitted? The answer is
twofold: a) there's no way of knowing whether the L4 information will
have changed with the new translation entry (the source port on the
translated address may be different, or the translated address itself
may be different) -- since TCP is a stateful protocol, this is
unacceptable. b) there's no way of knowing whether the other end has
already tried to send data back to the stale translation and received a
RST. In either case, a RST is sent to the originator saying, basically,
'connection closed by remote host.'
NAT breaks the transparent end-to-end connectivity that most IP
protocols were designed around. And it's true -- in some cases you can
get by without transparent E2E. But there are times when it's
necessary -- connections which sit idle are but one example.
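For an idle connection, the usual workaround is to make sure something
crosses the NAT more often than its timeout. As a sketch, this turns on
TCP-level keepalive probes for a Python socket. It is not the ssh setting
discussed earlier in the thread (those parameter names aren't given
here), the helper name and default values are assumptions, and the
tuning options are platform specific, so only the ones this platform
offers are set.

```python
import socket

def enable_keepalive(sock: socket.socket, idle: int = 300,
                     interval: int = 60, count: int = 5) -> None:
    """Ask the kernel to probe an idle connection.

    idle     -- seconds of silence before the first probe
    interval -- seconds between probes
    count    -- unanswered probes before the connection is dropped
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (("TCP_KEEPIDLE", idle),
              ("TCP_KEEPINTVL", interval),
              ("TCP_KEEPCNT", count))
    for name, value in tuning:
        opt = getattr(socket, name, None)   # not defined on every platform
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            pass                            # defined but not supported here
```

With a 10-minute translation timeout, an idle value of 300 seconds
leaves a comfortable margin.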
Hope this helps to explain how and why NAT does its thing. Let me know
if you need clarification on anything. :)
Talk to you later,
-Kelsey