Public bug reported:

Hello, I wanted a mirror of the IRC logs hosted on
https://irclogs.ubuntu.com/ and started the project with:

wget --mirror https://irclogs.ubuntu.com/

This worked okay but was very slow, as there are probably hundreds of
thousands of links to traverse.

I switched to wget2 to get multiple simultaneous connections, and
ran it with:

wget2 --mirror https://irclogs.ubuntu.com

I assumed that wget2 would try to accomplish the same thing: mirror that
site *and only that site*.
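
To make that expectation concrete, here is a minimal Python sketch of the
host check I assumed a mirror would apply by default. It is not wget2's
actual logic; the helper name same_host, the example log path, and the
https scheme on the ubuntu.com link are assumptions for illustration only.

```python
from urllib.parse import urlsplit


def same_host(start_url: str, link: str) -> bool:
    """Return True if link points at the same host as start_url.

    Hypothetical helper: it only illustrates the expected scoping rule.
    """
    start_host = (urlsplit(start_url).hostname or "").lower()
    link_host = (urlsplit(link).hostname or "").lower()
    return link_host == start_host


start = "https://irclogs.ubuntu.com/"
links = [
    # Assumed example path on the mirrored host.
    "https://irclogs.ubuntu.com/2024/03/16/",
    # A link of the kind that was followed off-site (scheme assumed).
    "https://ubuntu.com/security?q=&package=openssh&offset=40",
]

# Keep only links whose host matches the starting URL's host.
in_scope = [u for u in links if same_host(start, u)]
print(in_scope)
```

Under this rule, ubuntu.com is a different host from irclogs.ubuntu.com,
so the second link would be skipped.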

What actually happened was that it followed a link on that site to
ubuntu.com and downloaded two and a half million files like this:

$ find ubuntu.com/ -ls | head -20
 11190888 995121 drwxr-xr-x  48 sarnold  sarnold   2417914 Mar 16 01:47 ubuntu.com/
  6717440     37 -rw-r--r--   1 sarnold  sarnold     73591 Mar 16 01:23 ubuntu.com/security?q=&package=epiphany&offset=40
  6717456     29 -rw-r--r--   1 sarnold  sarnold     73469 Mar 16 01:23 ubuntu.com/security?q=&package=openssh&offset=40
  6717468     37 -rw-r--r--   1 sarnold  sarnold     73687 Mar 16 01:23 ubuntu.com/security?q=&package=webkitgtk&offset=0
  6717648     29 -rw-r--r--   1 sarnold  sarnold     73527 Mar 16 01:23 ubuntu.com/security?q=&package=openssh&offset=0
  6717662     29 -rw-r--r--   1 sarnold  sarnold     73555 Mar 16 01:23 ubuntu.com/security?q=&package=grub2-unsigned&offset=60
  6717758     37 -rw-r--r--   1 sarnold  sarnold     73625 Mar 16 01:23 ubuntu.com/security?q=&package=openssh&offset=80
  6717786     37 -rw-r--r--   1 sarnold  sarnold     73693 Mar 16 01:23 ubuntu.com/security?q=&package=php8.0&offset=0
  6717790     37 -rw-r--r--   1 sarnold  sarnold     73591 Mar 16 01:23 ubuntu.com/security?q=&package=grub2-unsigned&offset=80
  6717980     29 -rw-r--r--   1 sarnold  sarnold     73435 Mar 16 01:23 ubuntu.com/security?q=&package=epiphany&offset=80
  6717984     37 -rw-r--r--   1 sarnold  sarnold     73589 Mar 16 01:23 ubuntu.com/security?q=&package=openssh&offset=20
  6717986     37 -rw-r--r--   1 sarnold  sarnold     73649 Mar 16 01:23 ubuntu.com/security?q=&package=awstats&offset=40
  6718000     29 -rw-r--r--   1 sarnold  sarnold     73495 Mar 16 01:23 ubuntu.com/security?q=&package=grub2-unsigned&offset=60
  6718034     37 -rw-r--r--   1 sarnold  sarnold     73649 Mar 16 01:23 ubuntu.com/security?q=&package=mozjs60&offset=60
  6718176     29 -rw-r--r--   1 sarnold  sarnold     73555 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=0
  6718210     37 -rw-r--r--   1 sarnold  sarnold     73629 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=20
  6718248     37 -rw-r--r--   1 sarnold  sarnold     73617 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=60
  6718266     37 -rw-r--r--   1 sarnold  sarnold     73673 Mar 16 01:23 ubuntu.com/security?q=&package=mozjs60&offset=60
  6718292     37 -rw-r--r--   1 sarnold  sarnold     73593 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=40
  6718354     29 -rw-r--r--   1 sarnold  sarnold     73545 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=60


This is unexpected and unpleasant.

Thanks

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: wget2 1.99.1-2.1
ProcVersionSignature: Ubuntu 5.4.0-166.183-generic 5.4.252
Uname: Linux 5.4.0-166-generic x86_64
NonfreeKernelModules: lkp_Ubuntu_5_4_0_166_183_generic_101 zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu27.27
Architecture: amd64
CasperMD5CheckResult: skip
Date: Sat Mar 16 01:50:30 2024
SourcePackage: wget2
UpgradeStatus: Upgraded to focal on 2020-01-24 (1512 days ago)

** Affects: wget2 (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug focal third-party-packages

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2058082

Title:
  wget2 --mirror leaves the specified host

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/wget2/+bug/2058082/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
