In article <[EMAIL PROTECTED]>, Ramin Alidousti <[EMAIL PROTECTED]> wrote:
>On Sat, Apr 13, 2002 at 02:05:56AM -0400, Zygo Blaxell wrote:
>> Well, yes, that is what I mean.  Routers tell TCP that bandwidth is scarce
>> by dropping packets (or using the ECN bit, but PPP-over-SSH doesn't set
>> the ECN bit either).
>
>Or with higher latency.
Higher latency does not imply that bandwidth is scarce.  TCP does not
consider latency when estimating available bandwidth, except when latency
increases suddenly (in which case TCP (mis)interprets the increase as
packet loss).

>The loss of a packet is not reported by the receiver but detected by the
>sender itself (timer).

Or by SACK or duplicate ACKs, which come from the receiver.  To be strictly
correct, losses are inferred by the sender from data supplied or _not_
supplied by the receiver.

>Each TCP connection has its own set of variables which is not shared with
>the other instances. So, the underlaying TCP (SSH) might have different
>window-size, timer-value and for that matter even different heuristics than
>the encapsulated TCP riding on top of it, completely independent.

Events on the underlying IP layer (packet loss, latency, etc.) that affect
the underlying TCP will cause variables that TCP assumes are independent to
become dependent in the encapsulated TCP.  Packet loss without latency
becomes latency without packet loss.  TCP heuristics optimize in one
direction to deal with high latency (send more packets to increase
throughput) and in the *opposite* direction to deal with packet loss (send
fewer packets to reduce congestion).  The Nagle algorithm does all kinds of
damage when applied to TCP itself, especially to the TCP slow start
algorithm.  Segments retransmitted by the encapsulated TCP arrive as
duplicates (the underlying TCP delivers the delayed originals anyway),
producing duplicate ACKs at the receiver, which abuse the congestion window
in the other direction.

If it is possible to set the parameters of the underlying TCP, it can be
configured to work a little better, but in most cases the reason people
attempt TCP over TCP is also the reason they can't change those
parameters--the underlying TCP connection goes through some kind of
corporate HTTP proxy they don't control.

>This phenomenon can go on until the RTT of the encapsulated TCP is slightly
>larger than the RTT of the underlaying TCP. At this time they should stabilize.
>This delta is proportional to the SSH and PPP overhead.

This is not true at all.  The *minimum* RTT of the encapsulated TCP is
slightly larger than the RTT of the underlying TCP.  The *maximum* RTT of
the encapsulated TCP is for all practical purposes unlimited--it's the size
of the TCP window and buffers utilized divided by the minimum available
bandwidth.

If retransmission is required on the underlying TCP, it will add delay to
the encapsulated TCP.  This delay will accumulate and persist until the
next time the underlying TCP connection becomes idle, or until TCP
connections (at any level) fail due to timeout.

Further, if the TCP implementations are similar, the delay added by the
underlying TCP's retransmit will probably be slightly longer than the
encapsulated TCP's retransmission timeout--the delay after which it assumes
a packet has been lost.  This adds retransmissions to the encapsulated TCP
stream at exactly the time when the underlying TCP is retransmitting
segments itself, which increases latency, decreases throughput, and wastes
bandwidth all at once, while also increasing the amount of traffic that
must be cleared before the underlying TCP can return to an idle state.
Typically the encapsulated TCP's throughput drops to zero before that
happens.
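To see why the encapsulated TCP's timer is almost guaranteed to fire first,
look at how the retransmission timeout is computed.  Here's a
back-of-the-envelope sketch in Python--my own illustration of the standard
Jacobson-style estimator from RFC 2988, not code from any real stack, and
the 100 ms tunnel RTT is an assumed number:

    # Sketch of the standard smoothed-RTT retransmission timeout (RFC 2988).
    # Illustrative only; the 100 ms RTT below is an assumed figure.
    ALPHA, BETA, K = 0.125, 0.25, 4

    class RTOEstimator:
        def __init__(self, first_rtt):
            self.srtt = first_rtt
            self.rttvar = first_rtt / 2.0

        def sample(self, rtt):
            # Feed one RTT measurement, return the new retransmission timeout.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt
            return self.srtt + K * self.rttvar

    est = RTOEstimator(0.100)          # steady ~100 ms RTT through the tunnel
    for _ in range(20):
        rto = est.sample(0.100)
    print(round(rto, 3))               # converges toward ~0.1 s
    # Real stacks also enforce a minimum RTO, but it is still far shorter
    # than the stall caused when the underlying TCP retransmits and backs off.

So while the underlying TCP sits through its own timeout and backoff, the
encapsulated TCP's much shorter timer fires and it retransmits data the
lower layer was going to deliver anyway.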
>> The failure modes are:
>>      - steadily increasing latency when bandwidth is in use, up to
>>        about 120 seconds
>>
>>      - SSH blocks on input or output and it hangs.  Some SSH versions
>>        have fixed this problem.
>>
>>      - SSH or PPP protocol timeout (trivially easy to avoid)
>>
>>      - TCP timeout (75 second delay * 9 retransmits = TCP connection
>>        fails, IIRC)

>Case (2) and (3) are being considered as bugs or, for that matter, as
>deficiency/shortcomings and case (1) and (4) would affect both TCP sessions.

In a sane network configuration between two TCP peers, failure case 1 does
not occur.  Latency increases to somewhere near the (packet queue size) /
(bandwidth) of the devices in the network path, then stays mostly constant
thereafter--once all the buffers are full, all further packets will be
lost, so they won't add to latency.  Real network devices don't delay
packets arbitrarily; what they can't send right away gets dropped.  Dialup
and DSL modems have several seconds' worth of buffering inside them, but
they are typically used only near the extreme ends of a real Internet
route.

Most TCPs won't actually utilize all of the available queue space, because
the probability of packet loss increases as available queue space
decreases, and TCP's estimate of available bandwidth decreases
geometrically when packet loss is discovered.

Case 4 is much more common in PPP-over-SSH and similar configurations than
in IP-over-packet-carrier configurations, due to the RTT spiralling out of
control.  Cases 2 and 3 are really symptoms of the other problems.
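For a feel for how much latency a single full buffer can add, here's a
trivial back-of-the-envelope calculation (illustrative numbers of my own,
not measurements):

    # Worst-case latency added by one full device queue is roughly
    # queued bytes / link bandwidth.  Numbers below are assumed, not measured.
    def queue_delay(queue_bytes, link_bytes_per_sec):
        return queue_bytes / link_bytes_per_sec

    # e.g. a dialup modem with 16 KB of internal buffering on a ~56 kbit/s
    # (about 7 KB/s) link adds over two seconds of latency all by itself:
    print(queue_delay(16 * 1024.0, 7000.0))    # ~2.3 seconds

With a TCP tunnel underneath, the "queue" also includes the underlying
TCP's own window and socket buffers, which is why the encapsulated RTT can
keep growing instead of hitting that ceiling.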
-- 
Zygo Blaxell (Laptop) <[EMAIL PROTECTED]>
GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD