
[bug #58354] Wget doesn't parse URIs starting with http:/

2020-05-12 Thread Luca Bernardi

Follow-up Comment #1, bug #58354 (project wget):

PS: This bug occurred while crawling a website that uses the default WordPress
template.

___
  Message sent via Savannah
  https://savannah.gnu.org/

[bug #58354] Wget doesn't parse URIs starting with http:/

2020-05-12 Thread Luca Bernardi

URL:
  

         Summary: Wget doesn't parse URIs starting with http:/
         Project: GNU Wget
    Submitted by: f0ff
    Submitted on: Tue 12 May 2020 10:45:17 AM UTC
        Category: None
        Severity: 3 - Normal
        Priority: 5 - Normal
          Status: None
         Privacy: Public
     Assigned to: None
 Originator Name:
Originator Email:
     Open/Closed: Open
         Release: 1.14
 Discussion Lock: Any
Operating System: GNU/Linux
 Reproducibility: Every Time
   Fixed Release: None
 Planned Release: None
      Regression: None
   Work Required: None
  Patch Included: No

___

Details:

Hi,
Wget refuses to parse URIs that start with http:/ (note the single slash), e.g.
http:/wp-includes/css/dist/block-library/style.min.css?ver=5.4.1. Browsers
widely accept such URIs.
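
To illustrate the lenient behaviour I mean, here is a minimal Python sketch (not
Wget code). It assumes that a reference with the same scheme as the page but no
host should be resolved against the page's URL, which is roughly what browsers
appear to do. The function name `resolve_lenient` and the host `example.com` are
made up for the example.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def resolve_lenient(base, ref):
    """Resolve ref against base, tolerating 'http:/path' style references.

    Assumption: when ref has the same scheme as base but no host, the scheme
    is dropped and the remainder is treated as an ordinary relative reference.
    """
    ref_parts = urlsplit(ref)
    base_parts = urlsplit(base)
    same_scheme = ref_parts.scheme and ref_parts.scheme == base_parts.scheme
    if same_scheme and not ref_parts.netloc:
        # Rebuild the reference without scheme or host, e.g. "/wp-includes/...".
        relative = urlunsplit(
            ("", "", ref_parts.path, ref_parts.query, ref_parts.fragment)
        )
        return urljoin(base, relative)
    # Everything else goes through normal resolution.
    return urljoin(base, ref)


# Hypothetical page URL; the reference is the one from this report.
print(resolve_lenient(
    "http://example.com/blog/post/",
    "http:/wp-includes/css/dist/block-library/style.min.css?ver=5.4.1",
))
# prints http://example.com/wp-includes/css/dist/block-library/style.min.css?ver=5.4.1
```

Fully qualified URLs (with a host after the double slash) pass through this
sketch unchanged.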

The command I used: `wget --user-agent=Mozilla --content-disposition
--page-requisites --adjust-extension --restrict-file-names=windows -d -e
robots=off -m -k -E -r -l 10 -p -N -F -P crawl -nH $IP`



___

File Attachments:


---
Date: Tue 12 May 2020 10:45:17 AM UTC  Name: out.txt  Size: 17KiB   By: f0ff


