On Fri, 10 Feb 2023, Jeroen Ooms wrote:

Ah ok that is better than I thought. I was under the impression that it would immediately start with 6 connections, even before considering multiplexing.

There is also CURLOPT_PIPEWAIT that you can use to make libcurl prefer multiplexing to starting new connections.

However, I noticed that even when setting CURLMOPT_MAX_HOST_CONNECTIONS to 1, GitHub still drops the connections at some point. Perhaps the issue isn't just the concurrency: we may be hitting a proxy_read_timeout or similar because the downloads idle for too long under the high concurrency...

On a single connection libcurl should fill it up to the maximum number of streams, and none of those should go idle. The transfers that haven't been able to start yet because of the traffic jam are simply queued up in memory by libcurl and will not appear idle to anyone, as they haven't actually started.

I did find that the problems disappear when I disable multiplexing, and performance isn't much worse (about 6 minutes for downloading the 25k files), so this solves my immediate problem.

If you're not terribly far away and each transfer mostly saturates the pipe, doing them serially is not going to be much different from doing them in parallel in the grand total. Multiplexing should still be slightly faster, since you avoid the RTT gaps and slow-starts that serial transfers suffer, but if those are just small fractions of a second each, they may not add up to much.

--

 / daniel.haxx.se
 | Commercial curl support up to 24x7 is available!
 | Private help, bug fixes, support, ports, new features
 | https://curl.se/support.html
--
Unsubscribe: https://lists.haxx.se/listinfo/curl-library
Etiquette:   https://curl.se/mail/etiquette.html