Public bug reported:
This command should theoretically fetch all PDFs on a page:
$ wget -v -d -r --level 1 --adjust-extension --no-clobber --no-directories \
    --accept-regex 'administrative-orders/.*/administrative-order-matter-' \
    --accept-regex 'administrative-orders.*.pdf' \
    --accept-regex 'administrative-orders.page[^&]*$' \
    --directory-prefix=/tmp \
    'https://www.ncua.gov/regulation-supervision/enforcement-actions/administrative-orders?page=56'
But it fails to grab any of them, giving the output:
---
Deciding whether to enqueue "https://www.ncua.gov/files/administrative-orders/AO14-0241-R4.pdf".
https://www.ncua.gov/files/administrative-orders/AO14-0241-R4.pdf is excluded/not-included through regex.
Decided NOT to load it.
---
That's bogus. The workaround is to remove this option:
--accept-regex 'administrative-orders.page[^&]*$'
But that should not be necessary. Adding an --accept-* clause should
never invalidate another --accept-* clause, and it should never shrink
the set of fetched files.
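
A possible alternative workaround, sketched under an assumption this report does not confirm: wget may keep only the last --accept-regex it is given, rather than combining them. If so, merging the three patterns into one POSIX extended regex with alternation should keep all of them in effect. The `pattern` variable and `accepts` helper below are illustrative names. The grep -E check only stands in for wget's matcher so the regex can be tried offline. The dot before `pdf` is escaped here, unlike in the original command.

```shell
# Assumption (not confirmed in this report): only one --accept-regex
# value is honored, so the three patterns are merged with '|'.
pattern='administrative-orders/.*/administrative-order-matter-|administrative-orders.*\.pdf|administrative-orders.page[^&]*$'

# Hypothetical helper: succeeds if the URL matches the merged pattern.
# grep -E is used as an offline stand-in for wget's regex matching.
accepts() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

url='https://www.ncua.gov/files/administrative-orders/AO14-0241-R4.pdf'
if accepts "$url"; then
    echo "accepted"
else
    echo "rejected"
fi

# The fetch itself would then use a single option (not run here):
# wget -r --level 1 --adjust-extension --no-clobber --no-directories \
#     --accept-regex "$pattern" \
#     --directory-prefix=/tmp \
#     'https://www.ncua.gov/regulation-supervision/enforcement-actions/administrative-orders?page=56'
```

With the PDF URL from the debug output above, the check prints "accepted". Whether wget then enqueues the file still has to be verified against the real site.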
** Affects: wget (Ubuntu)
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1937874
Title:
one --accept-regex expression negates another