[
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766228#comment-16766228
]
Michael Osipov commented on WAGON-537:
--------------------------------------
Awesome, thank you very much for retesting. Funnily enough, this change revealed a
serious bug in JSch.
> Maven transfer speed of large artifacts is slow due to unsuitable buffer
> strategy
> ---------------------------------------------------------------------------------
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
> Issue Type: Improvement
> Components: wagon-http, wagon-provider-api
> Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s
> network connection.
> Reporter: Olaf Otto
> Assignee: Michael Osipov
> Priority: Major
> Labels: performance
> Fix For: 3.3.0, 3.3.1
>
> Attachments: wagon-issue.png
>
>
> We are using Maven for build process automation with Docker. This sometimes
> involves uploading and downloading artifacts of a few gigabytes in size.
> Here, Maven's transfer speed is consistently and reproducibly slow. For
> instance, an artifact of 7.5 GB took almost two hours to transfer, despite a
> 100 MB/s connection whose full speed is reproducibly reached when downloading
> the same artifact from the remote Nexus artifact repository with a browser.
> The same is true when uploading such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream
> output, int requestType, long maxSize ) method used for remote artifacts and
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 KB buffer. Whenever data
> is received, it is pushed to downstream listeners via fireTransferProgress.
> These listeners (or rather consumers) perform expensive tasks.
> Now, the underlying InputStream implementation used in transfer will return
> from calls to read(buffer, offset, length) as soon as *some* data is available.
> That is, fireTransferProgress may well be invoked with an average number of
> bytes less than half the buffer capacity (this varies with the underlying
> network and hardware architecture). Consequently, fireTransferProgress is
> invoked *millions of times* for large files. As this is a blocking operation,
> the time spent in fireTransferProgress dominates and drastically slows down
> the transfer by at least one order of magnitude.
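> For illustration, a minimal, hypothetical sketch of the copy-loop pattern
> described above (class and listener names are invented; this is not the
> literal Wagon source): every successful read(), however small, triggers a
> listener notification.
>
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.OutputStream;
>
> class TransferLoopSketch
> {
>     private static final int BUFFER_SIZE = 4 * 1024; // 4 KB buffer, as in the current implementation
>
>     interface ProgressListener
>     {
>         // stand-in for the listeners notified via fireTransferProgress
>         void transferProgress( byte[] buffer, int length );
>     }
>
>     // Simplified stand-in for AbstractWagon#transfer: read() may return after only a
>     // partial buffer, and every single read fires a (potentially expensive) progress event.
>     static void copy( InputStream input, OutputStream output, ProgressListener listener )
>         throws IOException
>     {
>         byte[] buffer = new byte[ BUFFER_SIZE ];
>         int n;
>         while ( ( n = input.read( buffer, 0, buffer.length ) ) != -1 )
>         {
>             listener.transferProgress( buffer, n ); // once per read, i.e. millions of times for multi-GB artifacts
>             output.write( buffer, 0, n );
>         }
>     }
> }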
> !wagon-issue.png!
> In our case, we found the download time increased from a theoretical optimum
> of ~80 seconds to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers /
> listeners invoked via fireTransferProgress aware of their potential impact on
> download speed, but rather refactor the transfer method such that it uses a
> buffer strategy that reduces the number of fireTransferProgress invocations.
> This should be done with regard to the expected file size of the transfer,
> such that fireTransferProgress is invoked often enough but not too frequently.
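> One possible shape for such a strategy (an illustrative sketch only, with
> invented names and thresholds, not necessarily the fix that was committed):
> derive the buffer size from the expected content length, clamped between a
> floor and a ceiling, so that the number of progress notifications stays
> roughly constant regardless of file size.
>
> class BufferSizeSketch
> {
>     private static final int MIN_BUFFER = 4 * 1024;      // 4 KB floor for small or unknown sizes
>     private static final int MAX_BUFFER = 512 * 1024;    // ceiling keeps memory use bounded
>     private static final int TARGET_EVENT_COUNT = 500;   // rough upper bound on fireTransferProgress calls
>
>     // Pick a buffer size such that a transfer of the expected length produces on the
>     // order of TARGET_EVENT_COUNT progress events instead of millions.
>     static int bufferSizeFor( long expectedLength )
>     {
>         if ( expectedLength <= 0 )
>         {
>             return MIN_BUFFER; // length unknown: fall back to the small default
>         }
>         long size = expectedLength / TARGET_EVENT_COUNT;
>         return (int) Math.min( MAX_BUFFER, Math.max( MIN_BUFFER, size ) );
>     }
> }
>
> With these (hypothetical) bounds, a 7.5 GB artifact copied with a 512 KB buffer
> produces on the order of 15,000 progress events instead of millions, while a
> small POM still uses the 4 KB default.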
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)