Re: R: wget fails with this url

2022-05-21 Thread Tim Rühsen

On 21.05.22 12:20, i...@mbsoft.biz wrote:

I have found the solution

Adding option --no-iri works as expected.

Problem closed.



Thanks. If anyone is interested in the details, see also 
https://gitlab.com/gnuwget/wget/-/issues/10.


Regards, Tim




R: wget fails with this url

2022-05-21 Thread info
I have found the solution

Adding option --no-iri works as expected.

Problem closed.




Re: wget fails with this url

2022-05-21 Thread Tim Rühsen

On 20.05.22 19:39, Gisle Vanem wrote:

ge...@mweb.co.za wrote:

I tried this from South Africa and I am getting the exact behaviour as 
the OP.


Okay. Now I tested with various other older Wgets:
   the one bundled with Ruby (a msys2 built version) -> 403 Forbidden
   the one bundled with GNU Octave-6.4 -> 403 Forbidden
   Lumito's 'wget2.exe', same bleeding 403.

I forgot to state, I got no '403' with my home-built
Wget from git master (on Windows-10). So this is probably
a bug that's been fixed. So you could upgrade and try
again.


Thanks for your tests.
I also think this is something (not necessarily a bug) that has been 
fixed or at least changed.


Interestingly, wget2 seems to have the same (or a similar) issue.

The 'Location:' header in the redirection contains several '+' chars in 
the query part. Internally, these get unescaped to ' ' (space) and 
later, for the GET, the spaces are escaped to %20 by wget2 and to '+' by 
current wget.


For some servers this makes a difference, others don't care (%20 and '+' 
should be unescaped to space on the server side). I can imagine that 
performance-optimized caching proxies skip the normalization and take 
the input as-is to perform the lookup.

In other cases, wget2 may succeed with %20 while wget doesn't with '+' !?
I might have to rethink normalization / escaping...
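
The two re-escaping outcomes described above can be reproduced outside of wget. The following is a minimal Python sketch, not wget's or wget2's actual code: it uses the standard library's urllib.parse, and the input is the query fragment from the posted log, joined across the line wrap. The variable names are chosen for illustration only.

```python
from urllib.parse import quote, quote_plus, unquote_plus

# Query fragment from the redirect's Location header in the posted log
# (joined across the line wrap in the archive).
received = "LINK+STATICO+-+NON+RIMUOVERE%21"

# Step 1: unescape. In a query string, '+' decodes to a space.
decoded = unquote_plus(received)

# Step 2: re-escape for the outgoing GET, in two different ways.
as_percent20 = quote(decoded, safe="")   # spaces become %20
as_plus = quote_plus(decoded, safe="")   # spaces become '+'

print(decoded)
print(as_percent20)
print(as_plus)
```

Both re-escaped strings decode to the same text, but they differ byte for byte, so a proxy that compares the query string as-is would treat them as different requests.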

Regards, Tim




Re: wget fails with this url

2022-05-20 Thread Petr Pisar
On Fri, May 20, 2022 at 06:18:15PM +0200, ge...@mweb.co.za wrote:
> (not sure if reply all is appropriate here)
> 
> I tried this from South Africa and I am getting the exact behaviour as the 
> OP. 
> 
It works from the Czech Republic. Both Firefox and wget.

> (fwiw: The downloaded audio sounds Italian to me.)
>
Yes.

> Of course, the 403 is the server's response to what it finds out about the
> requestor in the request. I just wonder what the difference is between the
> browser-generated request and the wget request and how the server could
> react in this way - cookies, maybe?
> 
In my case I cannot see any cookies involved. Nonetheless, wget supports
cookies by default. That should not be a problem.

In Firefox, press F12 to open the developer tools, select the Network tab, and
request the URL from the address bar. Then you can study the HTTP headers sent
and received in the Headers tab for each of the three listed requests.

Wget has the -S option to print the received headers. Sent headers can only be
seen in a WARC dump file created with the --warc-file option.

You can use the --header option to manually inject or override a particular
HTTP header.
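
For a side-by-side comparison with the browser, it can help to assemble the request headers in a script and print them before anything goes over the wire. This is a minimal Python sketch using the standard library's urllib.request; the User-Agent and Referer values are placeholders, not values taken from this thread, and build_request is a name made up for the example.

```python
from urllib.request import Request

URL = "https://api.spreaker.com/v2/episodes/6645151/download.mp3"

def build_request(url, extra_headers):
    """Return a Request carrying the given headers; nothing is sent."""
    req = Request(url, method="GET")
    for name, value in extra_headers.items():
        req.add_header(name, value)
    return req

# Placeholder header values (assumptions); copy the real ones from the
# browser's Network tab when comparing.
req = build_request(URL, {
    "User-Agent": "Mozilla/5.0 (placeholder)",
    "Referer": "https://www.spreaker.com/",
})

# Print the headers that would accompany the request.
for name, value in req.header_items():
    print(f"{name}: {value}")
```

Note that urllib normalizes header names (for example "User-agent"), so compare names case-insensitively.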

The server actually uses CloudFront proxies. I guess the proxies in your
region behave differently.

Maybe pinning the d1bxy2pveef3fq.cloudfront.net server to a particular IP address
could help. My system used 65.9.96.42. But be aware that the same IP address
does not guarantee anything. The same address might be assigned to multiple
hosts and routed differently based on the autonomous system of the client
(anycast).

-- Petr




Re: wget fails with this url

2022-05-20 Thread Gisle Vanem

ge...@mweb.co.za wrote:


I tried this from South Africa and I am getting the exact behaviour as the OP.


Okay. Now I tested with various other older Wgets:
  the one bundled with Ruby (a msys2 built version) -> 403 Forbidden
  the one bundled with GNU Octave-6.4 -> 403 Forbidden
  Lumito's 'wget2.exe', same bleeding 403.

I forgot to state, I got no '403' with my home-built
Wget from git master (on Windows-10). So this is probably
a bug that's been fixed. So you could upgrade and try
again.




Re: wget fails with this url

2022-05-20 Thread ge...@mweb.co.za
(not sure if reply all is appropriate here)

I tried this from South Africa and I am getting the exact behaviour as the OP. 

I tried sending a Referer and a User-Agent copied from my browser (a well-aged 
Firefox), without changing the result. 

(fwiw: The downloaded audio sounds Italian to me.)

Of course, the 403 is the server's response to what it finds out about the 
requestor in the request. I just wonder what the difference is between the 
browser-generated request and the wget request and how the server could react 
in this way - cookies, maybe?

Gerd





- Original Message -
From: "Gisle Vanem" 
To: "bug-wget" 
Sent: Friday, May 20, 2022 5:53:50 PM
Subject: Re: wget fails with this url

i...@mbsoft.biz wrote:

> I'm trying to download an audio file from this url:
> 
> https://api.spreaker.com/v2/episodes/6645151/download.mp3
> 
...

> Resolving d1bxy2pveef3fq.cloudfront.net (d1bxy2pveef3fq.cloudfront.net)...
> 18.66.200.93, 18.66.200.230, 18.66.200.32, ...
> Connecting to d1bxy2pveef3fq.cloudfront.net
> (d1bxy2pveef3fq.cloudfront.net)|18.66.200.93|:443... connected.
> HTTP request sent, awaiting response... 403 Forbidden
> 2022-05-20 01:19:11 ERROR 403: Forbidden.

Works fine for me in Wget. But I'm in Norway and you seem
to be in Italy. So it could be a licence/copyright issue.

-- 
--gv



Re: wget fails with this url

2022-05-20 Thread Gisle Vanem

i...@mbsoft.biz wrote:


I'm trying to download an audio file from this url:

https://api.spreaker.com/v2/episodes/6645151/download.mp3


...


Resolving d1bxy2pveef3fq.cloudfront.net (d1bxy2pveef3fq.cloudfront.net)...
18.66.200.93, 18.66.200.230, 18.66.200.32, ...
Connecting to d1bxy2pveef3fq.cloudfront.net
(d1bxy2pveef3fq.cloudfront.net)|18.66.200.93|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-05-20 01:19:11 ERROR 403: Forbidden.


Works fine for me in Wget. But I'm in Norway and you seem
to be in Italy. So it could be a licence/copyright issue.

--
--gv



wget fails with this url

2022-05-20 Thread info
I'm trying to download an audio file from this url:

https://api.spreaker.com/v2/episodes/6645151/download.mp3

It works fine from any browser, but wget fails with a 403 error. Is this a bug
or am I missing something?

Log:
--2022-05-20 01:19:11--
https://api.spreaker.com/v2/episodes/6645151/download.mp3
Resolving api.spreaker.com (api.spreaker.com)... 18.66.196.55, 18.66.196.83,
18.66.196.111, ...
Connecting to api.spreaker.com (api.spreaker.com)|18.66.196.55|:443...
connected.
HTTP request sent, awaiting response... 302 Found
Location:
https://d1bxy2pveef3fq.cloudfront.net/v1/download/episodes/original/36735214
?a=it=https%3A%2F%2Fapi.spreaker.com%2Fepisode%2F6645151=5982994=25
=256=10461127=395546=2015-09-18=6645151=980985=https%3A%2F%2Fwww
.spreaker.com%2Fshow%2F980985%2Fepisodes%2Ffeed=36735214=LINK+STATICO+-+
NON+RIMUOVERE%21=%5B%22IAB6-7%22%2C%22IAB7-39%22%2C%22IAB11-4%22%2C%22IAB2
6%22%5D=%5B%22news%22%5D=%5B%22hosting_plan_broadcaster%22%5D=attachme
nt%3Bfilename%3D%0519_africaoggi_conbase.mp3%22=1653088752
ture=Znjwo2ObnrvhRKF3U2wkEyBGnKmlwH2ixQ7%7EgNt8dDAuMDeLrcg-VRYxtovBi317pDbkD
fy6cN5pZy2kuaYkPW1lUXzj-1IA1cTe1JLNejsYRnlFSzYNwN9CluDF-PbHSA7LrgtNK1uFIhNcw
5fyi4kVHyufrqC%7E5%7Ef6ywlorFM9-c7-HfC4cSIeD-cTCgddE8IGLBTlVgRLCQdFREy-W3Ek7
UG5DK7xEJ38TIFlfi6MV%7E%7EnTWVsBnTE1kccqhqKHKvpM%7ES4c4TVd-x-wFRrwqRm6n5FjNS
Bg2pyNHTHA5pbrbv%7Eq-kaM4l7kpGJeRXPI2knjlUdacSW4kpcEBm7-Q__=K2KS
ORR5FSJ5FK [following]
--2022-05-20 01:19:11--
https://d1bxy2pveef3fq.cloudfront.net/v1/download/episodes/original/36735214
?a=it=https%3A%2F%2Fapi.spreaker.com%2Fepisode%2F6645151=5982994=25
=256=10461127=395546=2015-09-18=6645151=980985=https%3A%2F%2Fwww
.spreaker.com%2Fshow%2F980985%2Fepisodes%2Ffeed=36735214=LINK+STATICO+-+
NON+RIMUOVERE!=%5B%22IAB6-7%22%2C%22IAB7-39%22%2C%22IAB11-4%22%2C%22IAB26%
22%5D=%5B%22news%22%5D=%5B%22hosting_plan_broadcaster%22%5D=attachment
%3Bfilename%3D%0519_africaoggi_conbase.mp3%22=1653088752
re=Znjwo2ObnrvhRKF3U2wkEyBGnKmlwH2ixQ7~gNt8dDAuMDeLrcg-VRYxtovBi317pDbkDfy6c
N5pZy2kuaYkPW1lUXzj-1IA1cTe1JLNejsYRnlFSzYNwN9CluDF-PbHSA7LrgtNK1uFIhNcw5fyi
4kVHyufrqC~5~f6ywlorFM9-c7-HfC4cSIeD-cTCgddE8IGLBTlVgRLCQdFREy-W3Ek7UG5DK7xE
J38TIFlfi6MV~~nTWVsBnTE1kccqhqKHKvpM~S4c4TVd-x-wFRrwqRm6n5FjNSBg2pyNHTHA5pbr
bv~q-kaM4l7kpGJeRXPI2knjlUdacSW4kpcEBm7-Q__=K2KSORR5FSJ5FK
Resolving d1bxy2pveef3fq.cloudfront.net (d1bxy2pveef3fq.cloudfront.net)...
18.66.200.93, 18.66.200.230, 18.66.200.32, ...
Connecting to d1bxy2pveef3fq.cloudfront.net
(d1bxy2pveef3fq.cloudfront.net)|18.66.200.93|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-05-20 01:19:11 ERROR 403: Forbidden.