Hey Michael,

Sorry that the Windows version of Wget2 doesn't support multi-threading. This is a known issue when cross-building wget2.exe on Linux. As far as I know, you need a native Windows build, but I can't help you much with that.

There is actually no obvious reason why you would see a slowdown with the Linux version. Running your wget2 command here (same wget2 version) takes 21.5s to download all 69 files. (Just for the record: using 69 threads instead of 32 doesn't download much faster. The limit is on the server side; my bandwidth is far from being exhausted.)

My observation is that the www.uog.edu server doesn't support HTTP/2.
This unnecessarily slows down multiple-file downloads, because for every single file a new TCP connection must be established. On top of that, the TLS layer needs to be established as well, which is relatively expensive.
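
A quick way to check which HTTP version a server negotiates is sketched below. It assumes curl is installed and was built with HTTP/2 support; the function names negotiated_http_version and describe_http_version are made up for this illustration.

```shell
#!/bin/sh
# Print the HTTP version curl ends up using for a URL (e.g. "1.1" or "2").
# Assumes a curl build with HTTP/2 support.
negotiated_http_version() {
    curl -sI --http2 -o /dev/null -w '%{http_version}\n' "$1"
}

# Hypothetical helper: turn the version string into a short explanation.
describe_http_version() {
    case "$1" in
        2|3) echo "HTTP/$1: parallel requests can share one connection" ;;
        *)   echo "HTTP/$1: each parallel download needs its own TCP+TLS connection" ;;
    esac
}
```

Usage would be something like: describe_http_version "$(negotiated_http_version https://www.uog.edu/)"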

But going from 25 seconds to 138 seconds is weird.
Here (Debian Testing, kernel 6.5.3-1), even with just 5 threads, it takes only 32s.

My guess is, Fedora somehow managed to build wget2 without multi-threading support, likely accidentally.
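
One way to test that guess is to time the same download with different thread counts: if the elapsed time barely changes between 1 and 32 threads, the binary is probably not running in parallel. The following is only a sketch. The helper names elapsed and compare_threads are invented here, the wget2 options are the ones from your command, and it assumes testlistuog is in the current directory (the downloaded pages land there too).

```shell
#!/bin/sh
# Print the wall-clock time, in whole seconds, that a command takes.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Run the same wget2 download once per thread count given as an argument.
compare_threads() {
    for n in "$@"; do
        secs=$(elapsed wget2 -q --max-threads="$n" --secure-protocol=PFS \
                   --base="https://www.uog.edu/" -i testlistuog)
        printf '%s threads: %ss\n' "$n" "$secs"
    done
}
```

For example: compare_threads 1 5 32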

You can either build your own wget2 binary to test with or open a bug report on Fedora (and let the experts investigate). Or both :)

Regards, Tim

On 10/8/23 11:08, Michael D. Setzer II via Primary discussion list for GNU Wget wrote:
I have used wget2 to download 69 to 70 pages from a University
College Campus directory. The process has worked with no
problems for many years and reduced the time to about 25 seconds.

But now I get errors if I set it to more than 32 threads.
wget2  --max-threads=32 --secure-protocol=PFS
--base="https://www.uog.edu/" -i testlistuog

works fine
testlistuog contains
directory/?page=01
directory/?page=02
...
...
directory/?page=68
directory/?page=69
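
For reference, a list in this format could be generated with a short loop. This is only a sketch: make_page_list is a hypothetical helper name, and the last page number is passed in as an argument rather than discovered from page 1.

```shell
#!/bin/sh
# Print one relative URL per line: directory/?page=01 .. directory/?page=NN,
# where NN is the last page number given as $1 (zero-padded to two digits).
make_page_list() {
    i=1
    while [ "$i" -le "$1" ]; do
        printf 'directory/?page=%02d\n' "$i"
        i=$((i + 1))
    done
}
```

For example: make_page_list 69 > testlistuog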

I know that wget2 was recently updated in the Fedora 38 repo:
GNU Wget2 2.1.0 - multithreaded metalink/file/website
downloader

+digest +https +ssl/gnutls +ipv6 +iri +large-file +nls -ntlm -opie
+psl -hsts +iconv +idn2 +zlib -lzma +brotlidec +zstd -bzip2 -lzip
+http2 +gpgme

I don't know if that change did something with threads, or perhaps
some other update did?

I had found that the Windows version of wget2 did not work well
with threads, so I have it run with threads set to 1.
Time with Windows to download is:
Time to Download Campus Directory 154.332887 Seconds

The Linux version with 32 threads now takes:
Time to Download Campus Directory 138.430772 Seconds
while previously it ran in about 25 seconds with 70 threads.

Original lines in program:
Call to get page 1 to find total number of pages in directory.
     system("wget2 --restrict-file-names=windows --secure-protocol=PFS -q "
            "\"https://www.uog.edu/directory/?page=01\"");

Creates the testlistuog file with ?page=01 to ?page=lastpage number

Call with Linux (runs wget2 in the background and loops to display progress while it downloads):
     system("wget2 --restrict-file-names=windows --max-threads=70 "
            "--secure-protocol=PFS -q "
            "--base=\"https://www.uog.edu/directory/\" -i testlistuog 2>error & PID=$! ; "
            "printf '[' ; while ps hp $PID >/dev/null ; do  printf  '▓'; sleep 1 ; done ; "
            "printf '] done!\n'");
This produces individual files for each page, and then combines them into one 
allraw.uog when done.

With Windows it uses a single thread, downloads pages 1 to last, and sends
output to allraw.uog.
     system("wget2 --max-threads=1 --restrict-file-names=windows "
            "--secure-protocol=PFS "
            "--progress=none --base=\"https://www.uog.edu/directory/\" -O \"allraw.uog\" -i "
            "testlistuog");

I ran the wget2 commands outside the C++ program to make sure the program
wasn't causing the issue.

Going from 25 seconds to 138 isn't a huge problem, but seeing the change in how
the program is working is concerning.

Perhaps the maximum number of threads was changed, or perhaps some other
update in Fedora or in the kernel (6.5.5-200.fc38.x86_64) is responsible?


+------------------------------------------------------------+
  Michael D. Setzer II - Computer Science Instructor (Retired)
  mailto:mi...@guam.net
  mailto:msetze...@gmail.com
  Guam - Where America's Day Begins
  G4L Disk Imaging Project maintainer
  http://sourceforge.net/projects/g4l/
+------------------------------------------------------------+

