In article <[EMAIL PROTECTED]>,
Ramin Alidousti  <[EMAIL PROTECTED]> wrote:
>OK, another Comer/Stevens lover :-)

Comer?  Stevens?  [blinking innocently]  Who are they?  ;-)

>You beat me. I admit that I'd have to go and grab one of those two
>authors' books to check what exactly happens when you encapsulate TCP
>within another TCP, but I don't have time for that right now.
>Nevertheless I enjoyed it.

Most of this I learned from RFCs and Linux kernel sources.  The RFCs
are also an entertaining source of historical context (written in the
days when it was generally considered normal for large IT departments
to implement their own custom networking stacks).  I've flipped through
the Stevens book from time to time, but haven't actually sat down to
read it yet...

>What I can tell is that I tried this IP-over-SSH on both LAN and WAN
>environment and this is the result:
>
>*) On an Ethernet LAN (back to back) I had 6M throughput from a plain TCP
>   while I had 800K throughput from the encapsulated TCP. (13% throughput).
>
>*) In a WAN (through the Internet) I had 130K throughput from a plain TCP
>   while I had 100K throughput from the encapsulated TCP. (77% throughput).

I have seen similar results, although only under very good network
conditions.  Also try measuring latency--send some pings at the same
time and see what the RTT does.

I wish I knew why it actually gets *better* (although 23% overhead is
still bad) when the underlying TCP is running on a slower network.
My guess: the variance in RTT should be much lower on a slow network
(the average RTT is of course larger), which might eliminate some
spurious retransmits and the associated congestion-window thrashing
(Ethernet can be very bursty, especially as throughput approaches the
performance limits of the host CPUs, and the user-space context
switching of TCP-over-TCP just makes it worse).
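
Here's a toy sketch of why the variance matters; the RTT traces and
numbers below are pure invention on my part (compile with -lm):

    /* A toy model (mine, not from any real implementation or trace)
     * of the standard TCP retransmit timer: SRTT and RTTVAR track the
     * mean and deviation of the RTT samples, and RTO = SRTT + 4*RTTVAR.
     * An ACK arriving after the current RTO has expired means the
     * segment was already retransmitted spuriously. */
    #include <stdio.h>
    #include <math.h>

    static void run(const char *name, const double *rtt, int n)
    {
        double srtt = rtt[0], rttvar = rtt[0] / 2;  /* usual init */
        int spurious = 0;

        for (int i = 1; i < n; i++) {
            double rto = srtt + 4 * rttvar;
            if (rtt[i] > rto)       /* timer fired before the ACK */
                spurious++;
            rttvar = 0.75 * rttvar + 0.25 * fabs(srtt - rtt[i]);
            srtt   = 0.875 * srtt + 0.125 * rtt[i];
        }
        printf("%s: %d spurious retransmit(s)\n", name, spurious);
    }

    int main(void)
    {
        double lan[] = { 2, 2, 2, 30, 2, 2, 45, 2 };   /* bursty LAN */
        double wan[] = { 100, 110, 95, 105, 100, 98 }; /* steady WAN */
        run("LAN", lan, 8);
        run("WAN", wan, 6);
        return 0;
    }

The bursty trace blows through the timer twice; the slow-but-steady
trace never does, even though its average RTT is fifty times higher.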

>I must say that I did not encounter any "down to death" service decrease
>at all and this with almost 1G of data transfer. So, if you can tell me how
>to set this environment up so that the service sucks I'm willing to test it
>again.

1% random packet loss on the underlying network (typical during "busy"
Internet conditions) should produce the spiralling death case, but
should not have a significant effect on plain TCP (it will be bursty but
it should recover quickly).  10% packet loss will be usable (although
annoyingly slow) for plain TCP, but not TCP-over-PPP-over-SSH.
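
For a rough sense of scale on the plain-TCP side, the usual
back-of-envelope model (Mathis et al.) says steady-state throughput is
about MSS / (RTT * sqrt(p)).  The parameters below are assumptions,
not measurements:

    /* Back-of-envelope numbers from the Mathis et al. steady-state
     * model, BW ~= MSS / (RTT * sqrt(p)).  This only describes plain
     * TCP; the stacked case is worse because the inner TCP stalls
     * completely while the outer one retransmits. */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double mss = 1460;            /* bytes, typical Ethernet MSS */
        double rtt = 0.1;             /* 100 ms, a plausible WAN RTT */
        double loss[] = { 0.01, 0.10 };

        for (int i = 0; i < 2; i++) {
            double bw = mss / (rtt * sqrt(loss[i]));
            printf("p = %.2f -> roughly %.0f KB/s\n",
                   loss[i], bw / 1024);
        }
        return 0;
    }

That works out to roughly 140 KB/s at 1% loss and 45 KB/s at 10%:
degraded but usable, which is exactly what stacked TCP can't deliver.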

You might also get spiralling death if you open a few parallel
connections (such that the sum of the window sizes of the encapsulated
TCPs is greater than the window-plus-buffer size of the underlying
TCP--for instance, four inner connections with 64K windows can queue
256K of data against an outer connection that can only keep a fraction
of that in flight), or if you use simpler/older network stacks.

You won't get spiralling death unless the SSH buffers and TCP windows
fill up.  If you can guarantee they stay empty (e.g. by putting a
token-bucket (TBF) rate limit on the PPP interface) then you can avoid
spiralling death, but you never get better than worst-case bandwidth.
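
On Linux something along these lines should do it (the device name and
rate are made up--pick a rate safely below the tunnel's worst case,
and check the syntax against your tc version):

    tc qdisc add dev ppp0 root tbf rate 80kbit latency 50ms burst 1540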

>PS. Poor Nagle just wanted to help you slash 4000% overhead. Be more
>gentle with him.

...or 10000% if you tunnel SSH through two layers of CIPE (don't ask)...

I'm not saying Nagle didn't have a good idea, but even Nagle (the
person) admitted that there are certain specific cases where Nagle
(the algorithm) reduces throughput, and I don't think Nagle had even
considered mixing real-time-sensitive algorithms (and TCP itself is
one) with TCP.  The delays his algorithm introduces are good enough
for interactive use.

You can turn Nagle on or off per-socket with a setsockopt() of
TCP_NODELAY in Linux.  If you have to tunnel through a firewall that
allows TCP traffic at the packet level, then you can turn Nagle off
for a PPP-over-SSH connection, since you control both ends of that
socket.  PPP-over-some-corporate-proxy-TCP usually forwards the TCP
traffic via a pair of sockets, so disabling Nagle doesn't help there:
you can't do it on the exterior socket, where it actually matters.
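
For concreteness, a minimal sketch of the socket option (the wrapper
function is mine; fd would be the tunnel socket you control):

    /* Toggle Nagle on a connected TCP socket under Linux.  Setting
     * TCP_NODELAY to 1 disables Nagle; 0 re-enables it.  Returns the
     * setsockopt() result (0 on success, -1 on error). */
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    static int set_nagle(int fd, int enable)
    {
        int nodelay = !enable;  /* NODELAY is the inverse of Nagle */
        return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
                          &nodelay, sizeof(nodelay));
    }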

-- 
Zygo Blaxell (Laptop) <[EMAIL PROTECTED]>
GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD
