Re: Wget recursive option not working correctly with scheme relative URLs

2023-07-01 Thread Tim Rühsen

Hey Jan,

On 7/1/23 15:16, Jan Bidler via Primary discussion list for GNU Wget wrote:

Hello,
I want to mirror part of a website (`example.com/index.html`) that 
contains scheme-relative URLs (`//otherexample.com/image.png`). Trying to 
download these with the -r flag results in wget converting them to a wrong URL 
(`example.com//otherexample.com`).

So using
`wget -r example.com/index.html`
will produce links such as 
`https://example.com/index.html\/\/otherexample.com\/image.png` in the output.
Using the debug flag reveals this:
`merge(»example.com/index.html «, » //otherexample.com/image.png«) -> 
https://example.com/index.html\/\/otherexample.com\/image.png`


This is unexpected, since these kinds of links are relatively common and 
so far nobody has complained about them.


I just added a new test function for uri_merge(), the function that does 
this job. It has no trouble merging a relative URL like 
'//otherexample.com/image.png'.
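
For reference, here is a minimal sketch of the result one would expect when a scheme-relative reference is resolved against the page URL. It uses Python's standard `urllib.parse.urljoin` purely as an illustration of the expected behaviour; it is not wget's uri_merge(), and the helper name `resolve` and the `https://` base URL are assumptions made for the example.

```python
from urllib.parse import urljoin


def resolve(base: str, reference: str) -> str:
    """Resolve a link found in a page against that page's URL."""
    return urljoin(base, reference)


# Assumed base URL: the page from the report, fetched over https.
base = "https://example.com/index.html"

# A scheme-relative reference keeps the base scheme but replaces the host.
print(resolve(base, "//otherexample.com/image.png"))
# -> https://otherexample.com/image.png

# A plain relative reference stays on the original host.
print(resolve(base, "image.png"))
# -> https://example.com/image.png
```

If the links in the real page resolve differently from this, the exact text of the link as it appears in the HTML source would help narrow down the cause.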


So would it be possible to share a real-world wget command line that 
reproduces the issue?


Regards, Tim




Wget recursive option not working correctly with scheme relative URLs

2023-07-01 Thread Jan Bidler via Primary discussion list for GNU Wget
Hello,
I want to mirror part of a website (`example.com/index.html`) that 
contains scheme-relative URLs (`//otherexample.com/image.png`). Trying to 
download these with the -r flag results in wget converting them to a wrong URL 
(`example.com//otherexample.com`).

So using
`wget -r example.com/index.html`
will produce links such as 
`https://example.com/index.html\/\/otherexample.com\/image.png` in the output.
Using the debug flag reveals this:
`merge(»example.com/index.html «, » //otherexample.com/image.png«) -> 
https://example.com/index.html\/\/otherexample.com\/image.png`