Hi friends,

I've been working on introducing the async approach for SFTP downloads.
Downloads are a bit different in nature than uploads: for example, we normally
get a largish buffer in each call and we read until EOF.

I first experimented with sending very large FXP_READ requests, and I learned
that OpenSSH only ever sends back 64K of data, never more. The SFTP spec is a
bit vaguely written but seems to say that implementations are only obliged to
support 32K.

So, to read SFTP really fast we create a queue of outgoing READ packets, send
them out one by one and return data as soon as we receive it. This way we get
the pipelining effect we want and thus circumvent the waiting. The fact that
we don't know the size beforehand, combined with this sort of pre-reading,
makes us send a lot of READs beyond the end of the file. That's definitely
room for improvement.
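
To make the windowing idea concrete, here is a rough sketch in C. This is NOT
the code I'm committing - the transport is completely faked and the names
(send_read_request, recv_read_response, CHUNK_SIZE, MAX_OUTSTANDING) are made
up just for this mail:

  #include <stdio.h>
  #include <string.h>

  #define CHUNK_SIZE      30000                       /* bytes per FXP_READ */
  #define MAX_OUTSTANDING (4*1000*1000 / CHUNK_SIZE)  /* ~4MB of read-ahead */

  static const size_t remote_size = 1000000;   /* fake remote file size */

  /* stand-in for writing an FXP_READ packet to the wire */
  static void send_read_request(size_t offset, size_t len)
  {
      (void)offset; (void)len;
  }

  /* stand-in for reading the matching FXP_DATA/FXP_STATUS response;
     returns the number of bytes handed back, or 0 for EOF */
  static size_t recv_read_response(size_t offset, size_t len, char *buf)
  {
      if(offset >= remote_size)
          return 0;                      /* FXP_STATUS(EOF) */
      if(offset + len > remote_size)
          len = remote_size - offset;
      memset(buf, 'x', len);             /* fake file contents */
      return len;
  }

  int main(void)
  {
      size_t next_req = 0;     /* offset of the next READ we send */
      size_t next_resp = 0;    /* offset of the next response we expect */
      size_t outstanding = 0, total = 0, got;
      char buf[CHUNK_SIZE];

      do {
          /* fill the pipeline before waiting for any answers */
          while(outstanding < MAX_OUTSTANDING) {
              send_read_request(next_req, CHUNK_SIZE);
              next_req += CHUNK_SIZE;
              outstanding++;
          }
          /* consume the oldest response and hand the data to the caller */
          got = recv_read_response(next_resp, CHUNK_SIZE, buf);
          next_resp += CHUNK_SIZE;
          outstanding--;
          total += got;
      } while(got);            /* reads past EOF come back empty */

      printf("received %zu bytes\n", total);
      return 0;
  }

The over-reads at the end fall out directly from this: when we hit EOF there
is typically still a full window of READs already on the wire that the server
can only answer with EOF statuses.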

SOME NUMBERS

   (As usual all numbers are rough, possibly wrong due to my mistakes. I've
   run all libssh2 tests using debug builds (without any kind of compiler
   optimizations enabled) and some of them used a fair amount of printf output
   during their operations.)

 - I made the pre-reading use 'buffer_size * 4' as the maximum amount of
   outstanding reads.

 - I used the sftp_nonblock.c code and bumped its own buffer to 1MB - yes,
   that makes libssh2 pre-read 4MB! Setting the max down to 'buffer_size * 4'
   cut the transfer speed by 20%! 4MB makes ~135 outstanding READ packets when
   30000 bytes are requested in each.

 - I modified the code to not write() the received data anywhere (a sketch of
   this test loop follows right after this list).

 - For my test with the 1.2.7 code, I modified the buffer to 100K just to make
   sure it was as big as possible for that code.
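
 For reference, the core of my test loop looks roughly like this - session
 setup, timing and error handling are left out, and the function name is made
 up for this mail; it's basically example/sftp_nonblock.c with a bigger buffer
 and no write():

  #include <libssh2.h>
  #include <libssh2_sftp.h>

  #define BUFSIZE (1024*1024)       /* the ~1MB buffer mentioned above */

  /* assumes 'session' is connected/authenticated and 'handle' is opened for
     reading, exactly as sftp_nonblock.c already does */
  static size_t drain_file(LIBSSH2_SESSION *session,
                           LIBSSH2_SFTP_HANDLE *handle)
  {
      static char buf[BUFSIZE];
      size_t total = 0;
      ssize_t rc;

      libssh2_session_set_blocking(session, 0);   /* non-blocking mode */

      for(;;) {
          rc = libssh2_sftp_read(handle, buf, sizeof(buf));
          if(rc == LIBSSH2_ERROR_EAGAIN)
              continue;   /* a real app would select() on the socket here */
          if(rc <= 0)
              break;      /* EOF (0) or error (negative) */
          total += rc;    /* deliberately not write()ing the data anywhere */
      }
      return total;
  }

 Since the read-ahead maximum follows the buffer size ('buffer_size * 4'),
 this is how the 1MB buffer ends up as 4MB of pre-read.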

HIGH LATENCY

 To simulate a far away server, I used a nice trick I recently learned for
 adding RTT:

  $ tc qdisc add dev lo root handle 1:0 netem delay 100msec

 Restore it back to normal again with:

  $ tc qdisc del dev lo root

 The added 100 millisecond delay applies once in each direction, so it makes
 a 200ms RTT when I ping localhost.

 A test with the original 1.2.7 code first:

  Got 10240000 bytes in 64238 ms = 159407.2 bytes/sec

 Yes, it really does perform that terribly. OpenSSH's sftp tool manages
 7.5MB/sec over the same connection.

 My first test with my new code, using the 4MB/30000 sizes:

  Got 102400000 bytes in 20585 ms = 4974496.0 bytes/sec

 Yes, that number is correct - check the number of zeroes. Ten times the data
 in a third of the time: roughly 31 times faster in total...

 So I started to experiment with sizes. My thinking is that with 200 ms of
 latency, we might want more than 200 requests in the pipe to be really
 efficient. And what do you know? If I cut down the outgoing data requests to
 ask for just 2000 bytes per "piece", I can bump the speed up by another 40%:

  Got 102400000 bytes in 14695 ms = 6968356.6 bytes/sec

 At almost 7MB/sec we're now very close to OpenSSH and roughly 43 times faster
 than 1.2.7...
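
 Rough arithmetic, in case it helps follow the reasoning: 4MB of read-ahead
 divided by 30000 bytes per READ is the ~135 outstanding requests mentioned
 above, while 2000 bytes per READ makes room for about 2000 of them -
 comfortably more than the 200 requests in the pipe I was aiming for.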

ZERO LATENCY

 When I removed the added latency again and ran the test against localhost,
 my test app seemed to get a quite stable 25MB/sec while OpenSSH ran like the
 wind at 44MB/sec. I tried changing the packet sizes between 2000 and 30000,
 as I suspected that localhost might perform better with larger sizes there,
 but I didn't see any significant difference. I believe this difference is due
 more to something in our regular transport/channel handling: we are
 noticeably slower than OpenSSH already with plain SCP, and as long as that is
 the case, we can't make SFTP compare either.

DOWNSIDE

 When we use this approach we have a significant over-read for small files.
 If we were, for example, to write an application that sweeps over a directory
 with 100 files of 20 bytes each, we would perform terribly slowly and waste a
 lot of bandwidth.
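
 To put a rough number on it: with the 4MB/30000 setup above, a single
 20-byte file can still end up triggering much of the ~135-request window,
 and the server can only answer those extra READs with EOF statuses - and
 that times 100 files.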

IMPROVEMENTS

 I think we should consider having the SFTP code do an SSH_FXP_STAT query
 first to figure out the size of the remote file, so that _no_ "over-read" is
 done and thus there is no punishment for small files. Of course, this then
 won't work exactly like today in cases where, for example, the file is being
 written to while the download is in progress.
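
 From the application side the equivalent already exists, so the sketch below
 just illustrates the idea using the public fstat call (which sends FXP_FSTAT
 on the already-open handle rather than FXP_STAT); internally the library
 would do something similar before kicking off the read-ahead:

  #include <libssh2.h>
  #include <libssh2_sftp.h>

  /* Returns 1 and fills in *size if the server reports a file size,
     0 if it doesn't (and the caller must fall back to over-reading). */
  static int remote_file_size(LIBSSH2_SFTP_HANDLE *handle,
                              libssh2_uint64_t *size)
  {
      LIBSSH2_SFTP_ATTRIBUTES attrs;

      if(libssh2_sftp_fstat(handle, &attrs) == 0 &&
         (attrs.flags & LIBSSH2_SFTP_ATTR_SIZE)) {
          *size = attrs.filesize;   /* cap the read-ahead at this offset */
          return 1;
      }
      return 0;
  }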

 I think we should consider an API that limits or disables this read-ahead
 concept, for low-memory situations or simply situations where it doesn't
 behave in a way that is favourable to the application.
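
 Just to make that suggestion concrete, something along these lines - the
 function below does not exist, its name and signature are pure speculation
 on my part:

  #include <libssh2.h>
  #include <libssh2_sftp.h>

  /* hypothetical: cap the amount of outstanding FXP_READ data per handle,
     where 0 disables the read-ahead entirely */
  void libssh2_sftp_readahead_limit(LIBSSH2_SFTP_HANDLE *handle,
                                    size_t max_bytes);

 An application about to sweep a directory full of tiny files could then turn
 the read-ahead off (or cap it) before looping over the files.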

WHAT NOW

 I'll be committing my changes soonish. I've come to think of a few quirks I
 want to look over first - not really related to my changes, but I think my
 changes expose these problems more.

 I would really appreciate it if everyone would consider taking the new code
 for a little spin to see in which ways it breaks and what mistakes I haven't
 yet found myself. My tests seem to run rather solidly, but I have a rather
 limited test environment and quite likely too limited an imagination to cause
 the real disasters!

--

 / daniel.haxx.se