Nicolas Williams wrote:
> I see.  Well, we can always start with a very large SO_RCVBUF and to
> hell with tuning TCP.  My only concern with that is that this may
> reserve a large amount of memory, but the buffers should only ever get
> really big in large-delay, WAN situations.
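(For reference, reserving a large receive buffer up front is just a
setsockopt() call made before the connection is established; the 4 MB
size below is an arbitrary example, not a value from this thread.)

#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>

/*
 * Sketch only: ask the kernel for a large receive buffer on 'sock'.
 * The kernel may clamp the value to its configured maximum.
 */
int
set_large_rcvbuf(int sock)
{
	int bufsize = 4 * 1024 * 1024;	/* example size, not tuned */

	if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bufsize,
	    sizeof (bufsize)) < 0) {
		perror("setsockopt(SO_RCVBUF)");
		return (-1);
	}
	return (0);
}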
Assuming an app is well behaved, the buffer is only used when there is
data lost.  TCP needs to deliver data in sequence, so if data is lost,
TCP has to buffer subsequent data until the lost data is recovered.  If
congestion control is done right, meaning not much data is lost due to
congestion, the kernel memory issue should not matter.

> *Exactly*.  ssh/sshd could track the running average of RTTs over two
> different time periods, and when the short-term average is smaller than
> the long-term then available bandwidth is growing, and when it's higher
> we have congestion.

There has been a lot of research in this area, but it is on how TCP
itself should work.  It is not clear to me how the above will work on
top of TCP, which is also doing its own congestion control, and I think
the end result may still be limited by TCP.  The sending rate is always
limited by TCP's congestion window, and that window is increased as
fast as the underlying TCP algorithm allows, assuming there is always
data to be sent.  So if an app just keeps dumping data into TCP, TCP
will send it out as fast as it thinks it can, and TCP will react to
congestion events.

> Well, we're talking about bulk data transfers.  Congestion will be
> noticeable.  I'm more concerned about detecting when congestion is
> resolved.  Also, the application will be able to measure both RTTs and
> actual bandwidth for the connection.
>
> Yes, getting this right will be tricky.  It doesn't help that we have
> two layers of flow control.  But perhaps your comments about TCP buffer
> sizing limitations are actually a boon in disguise: just don't auto-tune
> TCP and start with very large TCP buffer sizes but small SSHv2 channel
> windows, and slow start those.

Note that my comment on buffer size is about the receiving side; an app
cannot reduce the receive buffer.  And yes, in theory, if the receiver
advertises a huge window, the TCP bulk transfer rate will be controlled
mostly by the sending side.  If the sending side's algorithm is good, it
can react well to both very low and very high bandwidth environments.

> The implementation is real dumb: fixed window sizes without relation to
> TCP buffer sizes.  (Actually the window size shrinks when the sender
> sends data and grows when the receiver drains it, but it never exceeds
> the original.)  See $SRC/cmd/ssh/libssh/common/channels.c, and search
> for "adjust" case-insensitively -- it's pretty obvious.
>
> The SSHv2 spec covering this (RFC4254) allows the channel window size to
> grow, and it would be silly to over-subscribe the connection's buffers
> for long.  Each channel has an initial window size.  Sending data
> consumes space from the window.  The receiver can send an unsigned
> integer adjustment whenever it wants.

So the issue is how to grow the window size of each channel, and only
those channels which have used up their windows need to have their
windows adjusted.  I think your scheme may work: let the sender know the
receiver's buffer size to avoid over-subscription, then apply an
appropriate fairness control and grow the bulk transfer channel's
window.  This makes sure that there is always data queued in TCP to be
sent, and TCP will send it as fast as it thinks it can.

--
K. Poon.
kacheong.poon at sun.com
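A rough illustration of the receiver-side window-growth scheme discussed
above (the struct fields and the send_window_adjust() hook are
hypothetical placeholders, not the actual channels.c code): whenever the
local consumer drains data, the receiver re-advertises the drained bytes
and, while still below a cap tied to its receive buffer, also doubles
the nominal window, so a bulk transfer channel's sender never runs out
of window and TCP always has data queued to send.

#include <stdint.h>

struct channel {
	uint32_t remote_id;	/* peer's channel number */
	uint32_t local_window;	/* bytes the peer may still send to us */
	uint32_t window_size;	/* current nominal window size */
	uint32_t window_max;	/* cap, e.g. tied to the TCP receive buffer */
};

/* Hypothetical transport hook: emit SSH_MSG_CHANNEL_WINDOW_ADJUST. */
extern void send_window_adjust(uint32_t remote_id, uint32_t bytes_to_add);

void
channel_after_drain(struct channel *c, uint32_t drained)
{
	uint32_t adjust = drained;

	/*
	 * Grow the nominal window whenever it is still below the cap,
	 * doubling it each time.  A real implementation would grow only
	 * those channels that actually run their window down to zero,
	 * and would apply a fairness policy across channels.
	 */
	if (c->window_size < c->window_max) {
		uint32_t grow = c->window_size;

		if (grow > c->window_max - c->window_size)
			grow = c->window_max - c->window_size;
		c->window_size += grow;
		adjust += grow;
	}

	c->local_window += adjust;
	send_window_adjust(c->remote_id, adjust);
}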