OK, another Comer/Stevens lover :-)

You beat me. I admit that I'd have to go and grab one of those two authors'
books to check exactly what happens when you encapsulate TCP within another
TCP, but I don't have time for that right now. Nevertheless, I enjoyed it.

What I can tell you is that I tried this IP-over-SSH setup in both LAN and
WAN environments, and these are the results:

*) On an Ethernet LAN (back to back) I got 6M throughput from plain TCP
   and 800K throughput from the encapsulated TCP (13% of plain).

*) On a WAN (through the Internet) I got 130K throughput from plain TCP
   and 100K throughput from the encapsulated TCP (77% of plain).
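
The percentages above are just the ratio of encapsulated to plain
throughput. A quick sketch to check the arithmetic, assuming K = 10^3 and
M = 10^6 (the unit, bits or bytes per second, cancels out in the ratio):

```python
def relative_throughput(encapsulated, plain):
    """Encapsulated throughput as a percentage of plain throughput."""
    return 100.0 * encapsulated / plain

# Figures from the two tests; K and M are assumed to be decimal multipliers.
lan = relative_throughput(800e3, 6e6)
wan = relative_throughput(100e3, 130e3)

print(f"LAN: {lan:.0f}%  WAN: {wan:.0f}%")  # LAN: 13%  WAN: 77%
```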

I must say that I did not see service degrade "down to death" at all, and
that was with almost 1G of data transferred. So, if you can tell me how to
set this environment up so that the service sucks, I'm willing to test it
again.

Ramin
PS. Poor Nagle just wanted to help you slash a 4000% overhead. Be gentler
with him.
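
In case the 4000% figure looks odd: it is the classic worst case of one byte
of payload carried under 40 bytes of headers. A minimal sketch, assuming
20-byte IP and 20-byte TCP headers without options (the 1460-byte payload is
just an illustrative full-size Ethernet segment):

```python
IP_HEADER_BYTES = 20   # assumed: IPv4 header without options
TCP_HEADER_BYTES = 20  # assumed: TCP header without options

def header_overhead_percent(payload_bytes):
    """Header bytes as a percentage of payload bytes for one segment."""
    return 100.0 * (IP_HEADER_BYTES + TCP_HEADER_BYTES) / payload_bytes

print(f"{header_overhead_percent(1):.1f}%")     # 4000.0% (one keystroke per packet)
print(f"{header_overhead_percent(1460):.1f}%")  # 2.7% (full-size segment)
```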



On Fri, Apr 19, 2002 at 09:17:48PM -0400, Zygo Blaxell wrote:

> >> Well, yes, that is what I mean.  Routers tell TCP that bandwidth is scarce
> >> by dropping packets (or using the ECN bit, but PPP-over-SSH doesn't set
> >> the ECN bit either).
> >
> >Or with higher latency.
> 
> Higher latency does not imply that bandwidth is scarce.  TCP does not
> consider latency when estimating available bandwidth, except if latency
> increases suddenly (in which case TCP (mis)interprets the latency as
> packet loss).
> 
> >The loss of a packet is not reported by the receiver but detected by the
> >sender itself (timer).
> 
> Or by SACK or duplicate ACKs, which come from the receiver.  To be
> strictly correct, losses are inferred by the sender from data supplied
> or _not_ supplied by the receiver.
> 
> >Each TCP connection has its own set of variables which is not shared with
> >the other instances. So, the underlying TCP (SSH) might have different
> >window-size, timer-value and for that matter even different heuristics than
> >the encapsulated TCP riding on top of it, completely independent.
> 
> Events on the underlying IP (packet loss, latency, etc) which affect the
> underlying TCP, will cause variables that TCP assumes are independent to
> become dependent in the encapsulated TCP.  Packet loss without latency
> becomes latency without packet loss.  TCP heuristics optimize in one
> direction to deal with high latency (send more packets to increase
> throughput) and the *opposite* direction to deal with packet loss (send
> fewer packets to reduce congestion).  The Nagle algorithm does all kinds
> of damage when applied to TCP itself, especially to the TCP slow start
> algorithm.  Encapsulated TCP retransmitted packets become duplicate ACKs
> at the receiver, which abuse the congestion window in the other direction.
> 
> If it is possible to set the parameters of the underlying TCP, then it
> can be configured to work a little better, but in most cases the reason
> why people attempt to do TCP over TCP is also the reason why they can't
> change these parameters--the underlying TCP connection is some kind of
> corporate HTTP proxy.
> 
> >This phenomenon can go on until the RTT of the encapsulated TCP is slightly
> >larger than the RTT of the underlying TCP. At this time they should stabilize.
> >This delta is proportional to the SSH and PPP overhead.
> 
> This is not true at all.
> 
> The minimum RTT in the encapsulated TCP is slightly larger than the RTT
> of the underlying TCP.  The maximum RTT of the encapsulated TCP is for
> all practical purposes unlimited--it's the size of the TCP window and
> buffers utilized divided by the minimum available bandwidth.
> 
> If retransmission is required on the underlying TCP, it will add delay
> to the encapsulated TCP.  This delay will accumulate and persist until
> the next time the underlying TCP connection becomes idle, or until TCP
> connections (at any level) fail due to timeout.  
> 
> Further, if the TCP implementations are similar, the delay added in the
> underlying TCP by the retransmit will probably be slightly longer than
> the time that the encapsulated TCP assumes means a packet has been lost.
> This will add retransmissions to the encapsulated TCP stream at exactly
> the time when the underlying TCP is retransmitting segments itself,
> which increases latency, decreases throughput, and wastes bandwidth
> all at once, and increases the amount of traffic that has to be cleared
> before the underlying TCP can reach idle state.  Typically encapsulated
> TCP throughput drops to zero before this happens.
> 
> >> The failure modes are:
> 
> >>    - steadily increasing latency when bandwidth is in use, up to
> >>    about 120 seconds
> >> 
> >>    - SSH blocks on input or output and it hangs.  Some SSH versions
> >>    have fixed this problem.
> >> 
> >>    - SSH or PPP protocol timeout (trivially easy to avoid)
> >> 
> >>    - TCP timeout (75 second delay * 9 retransmits = TCP connection
> >>    fails, IIRC)
> 
> >Cases (2) and (3) are considered bugs or, for that matter,
> >deficiencies/shortcomings, and cases (1) and (4) would affect both TCP sessions.
> 
> In a sane network configuration between two TCP peers case 1 failure does
> not occur.  Latency increases to somewhere near the (packet queue size)
> / (bandwidth) of the devices in the network path, then stays mostly
> constant thereafter--once all the buffers are full, all further packets
> will be lost, so they won't add to latency.  Real network devices don't
> delay packets arbitrarily; what they can't send right away gets dropped.
> 
> Dialup and DSL modems have several seconds' worth of buffering inside
> them, but they are typically used only near the extreme ends of a real
> Internet network route.  Most TCP's won't actually utilize all of the
> available queue space because the probability of packet loss increases
> as available queue space decreases, and TCP's estimation of available
> bandwidth decreases geometrically when packet loss is discovered.
> 
> Case 4 is much more common in PPP-over-SSH and similar configurations
> than it is in IP-over-packet-carrier configurations due to the RTT
> spiralling out of control.
> 
> Case 2 and 3 are really symptoms of the other problems.
> 
> -- 
> Zygo Blaxell (Laptop) <[EMAIL PROTECTED]>
> GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD
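
For a feel of the "practically unlimited" maximum RTT described in the quoted
text above (window and buffers in use divided by the minimum available
bandwidth), here is a rough sketch. The 64 KiB window, 64 KiB of buffering
and 56 kbit/s bottleneck are made-up numbers, not measurements from this
thread:

```python
def queueing_delay_seconds(buffered_bytes, bandwidth_bits_per_s):
    """Time needed to drain the buffered data at the given bandwidth."""
    return buffered_bytes * 8 / bandwidth_bits_per_s

# Hypothetical values: 64 KiB TCP window plus 64 KiB of other buffering,
# draining through a 56 kbit/s bottleneck.
buffered = 64 * 1024 + 64 * 1024
delay = queueing_delay_seconds(buffered, 56_000)

print(f"{delay:.1f} s of added latency")  # 18.7 s of added latency
```

Halving the bandwidth or doubling the buffered data doubles the delay, which
is why the figure has no fixed upper bound.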
