tomaswolf commented on issue #854:
URL: https://github.com/apache/mina-sshd/issues/854#issuecomment-3575522132

   This way of downloading a file is always going to be slow.
   
   First, a buffer of 10MB is not going to help because servers typically don't 
return that much data in a single read request. More likely you'll be getting 
much less per read request; normally less than 256kB and typically 32kB or 
maybe 64kB. Servers do this to guard against out-of-memory conditions under 
heavy load.
   
   Second, this code will fire off a read request and then wait for the 
response to arrive before it sends the next read request. So it incurs the full 
network latency on each request.
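
   To put a rough number on the second point, here is a minimal sketch of the arithmetic. The 50 ms round-trip time and the 100 MiB file size are assumptions chosen for illustration, and `SequentialReadEstimate` is a hypothetical helper, not part of Apache MINA SSHD; the 32 kB per read is the typical value mentioned above.

```java
// Hypothetical helper for illustration only; not part of Apache MINA SSHD.
final class SequentialReadEstimate {

    /**
     * Upper bound on throughput (bytes per second) when each read request
     * is sent only after the previous response has arrived.
     */
    static double maxBytesPerSecond(int bytesPerRead, double roundTripMillis) {
        return bytesPerRead * 1000.0 / roundTripMillis;
    }

    /**
     * Lower bound on the transfer time (seconds) for a file of the given
     * size with strictly sequential reads: one round trip per request.
     */
    static double minSeconds(long fileSize, int bytesPerRead, double roundTripMillis) {
        long requests = (fileSize + bytesPerRead - 1) / bytesPerRead; // ceiling division
        return requests * roundTripMillis / 1000.0;
    }
}
```

   With these assumed numbers, `maxBytesPerSecond(32 * 1024, 50)` gives 655360 bytes/s (640 kB/s), and `minSeconds(100L * 1024 * 1024, 32 * 1024, 50)` gives 160 seconds, regardless of the available bandwidth or the size of the client-side buffer.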
   
   Third, although I don't see any multi-threading in this code: trying to 
download a single file using multiple threads is unlikely to give any speed-up. 
For one thing, all requests and data are sent over a single network connection, 
so some serialization will occur at the SSH level anyway. For another, assume 
you have one thread downloading the block from file offset 0 to 4'999'999, 
another thread downloading the block from 5'000'000 to 9'999'999, and a third 
thread downloading 10'000'000 to 14'999'999. When these threads receive data, 
they have to write it at the correct offsets in the local file. So you need 
random access on the local file, and you'd be jumping around, frequently 
resetting file offsets. That is going to kill performance when writing to the 
local file. On the server side, if the read requests end up operating on the 
same handle/file object, such re-positioning may also occur, making things even 
worse.
   
   I suggest you use something like
   ```
   try (InputStream in = sftpClient.read(filename)) {
       Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
   }
   ```
   to download a file. This
   * uses a reasonable buffer size internally,
   * sends off multiple read requests and handles the responses when they come 
in, amortizing network latency,
   * writes the local file sequentially, avoiding performance problems with 
file positioning in the local file.
   
   You might also want to look at the upload tests in project 
`sshd-benchmarks`. They show various ways to upload a file; similar ways to 
download files also exist.
   
   It might make sense to download multiple files using multiple threads (one 
thread per file). This may give a small speed improvement if network 
operations can be overlapped with local file writing in other threads, but it 
may not bring much because in the end it's still a single network connection. 
In any case it will complicate thread synchronization (waiting until all 
downloads are done; error handling if some downloads fail).
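
   The synchronization mentioned above (waiting for all downloads, collecting failures) could be sketched roughly as follows. `ParallelDownloads` and its `Downloader` callback are hypothetical names introduced for this example; in real code the callback would presumably wrap the `sftpClient.read(...)` plus `Files.copy(...)` snippet shown earlier. The sketch assumes distinct file names and makes no claim about how well a single SFTP client handles concurrent use.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration only; not part of Apache MINA SSHD.
final class ParallelDownloads {

    /** Hypothetical callback that downloads one remote file. */
    interface Downloader {
        void download(String remoteName) throws Exception;
    }

    /**
     * Runs one download task per file on a fixed-size thread pool, waits
     * until every task has finished, and returns the failures keyed by
     * file name. An empty map means all downloads succeeded.
     */
    static Map<String, Throwable> downloadAll(List<String> names, Downloader downloader, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit all tasks first so they can overlap.
            Map<String, Future<?>> pending = new LinkedHashMap<>();
            for (String name : names) {
                pending.put(name, pool.submit(() -> {
                    downloader.download(name);
                    return null; // returning a value makes this a Callable, which may throw
                }));
            }
            // Then wait for each one and record what went wrong, if anything.
            Map<String, Throwable> failures = new LinkedHashMap<>();
            for (Map.Entry<String, Future<?>> entry : pending.entrySet()) {
                try {
                    entry.getValue().get();
                } catch (ExecutionException e) {
                    failures.put(entry.getKey(), e.getCause());
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }
}
```

   A caller would pass the list of remote file names and decide what to do with the returned map, for example retrying the failed names or reporting them. Whether this is any faster than downloading the files one after the other would have to be measured.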


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

