Re: Semicolon not allowed in userinfo

2024-05-10 Thread Jeffrey Walton
On Fri, May 10, 2024 at 1:51 PM Bachir Bendrissou 
wrote:

> Hi Jeff,
>
> Thank you for your reply.
>
> The colon has nothing to do with the issue. If we remove the colon, the
> issue still persists:
>
> *curl* "a;bc@xyz"
> curl: (6) Could not resolve host: xyz
>
> *wget* "a;bc@xyz"
> wget: unable to resolve host address ‘a;bc@xyz’
>
> *wget* "abc@xyz"
> wget: unable to resolve host address ‘xyz’
>
> So, when the semicolon is included in *userinfo*, wget treats *userinfo*
> as part of the hostname. You can replicate this after disconnecting from
> your network first.
>

My bad, I was only answering the question, "Why is the semicolon not
allowed in userinfo, despite the fact that other special characters are
allowed?"

I did not try to figure out what was wrong with the script.

Jeff


Re: Semicolon not allowed in userinfo

2024-05-10 Thread Bachir Bendrissou
Hi Jeff,

Thank you for your reply.

The colon has nothing to do with the issue. If we remove the colon, the
issue still persists:

*curl* "a;bc@xyz"
curl: (6) Could not resolve host: xyz

*wget* "a;bc@xyz"
wget: unable to resolve host address ‘a;bc@xyz’

*wget* "abc@xyz"
wget: unable to resolve host address ‘xyz’

So, when the semicolon is included in *userinfo*, wget treats *userinfo* as
part of the hostname. You can replicate this after disconnecting from your
network first.

Thank you,
Bachir

On Mon, Feb 5, 2024 at 10:08 PM Jeffrey Walton  wrote:

> On Mon, Feb 5, 2024 at 4:57 PM Bachir Bendrissou 
> wrote:
> >
> > The url attached example contains a semicolon in the userinfo segment.
> >
> > Wget rejects this url with the following error message:
> >
> > *Bad port number.*
> >
> > It seems that Wget sees "c" as a port number. When "c" is replaced by a
> > digit, Wget accepts the url and attempts to resolve "xyz".
> >
> > It's worth noting that both curl and aria2 accept the url example.
> >
> > Why is the semicolon not allowed in userinfo, despite the fact that other
> > special characters are allowed?
>
> A colon in the userinfo is deprecated but not forbidden. However, an
> application can choose to reject it. From RFC 3986, Uniform Resource
> Identifier (URI): Generic Syntax, Section 3.2,
> .
>
>The userinfo subcomponent may consist of a user name and, optionally,
>scheme-specific information about how to gain authorization to access
>the resource.  The user information, if present, is followed by a
>commercial at-sign ("@") that delimits it from the host.
>
>   userinfo= *( unreserved / pct-encoded / sub-delims / ":" )
>
>Use of the format "user:password" in the userinfo field is
>deprecated.  Applications should not render as clear text any data
>after the first colon (":") character found within a userinfo
>subcomponent unless the data after the colon is the empty string
>(indicating no password).  Applications may choose to ignore or
>reject such data when it is received as part of a reference and
>should reject the storage of such data in unencrypted form.
>
> According to the BNF in Appendix A, the semicolon ';' is allowed as a
> sub-delims token. It does not need to be percent encoded.
>
> Jeff
>


Re: question on using Wget v.1.20.3 built on mingw32

2024-04-07 Thread Darshit Shah
Well, it looks like the server you're connecting to is sending a redirection. In 
this case there's nothing that Wget can do. It is following the response it got 
from the server. 

Maybe check with the server admin if you think that the redirection is 
incorrect?

On Sun, Apr 7, 2024, at 21:09, Delta Impresa wrote:
> Hello!
> I try to download the file:
> https://cgamos.ru/images/MB_LS/01-0203-0745-001157/0001.jpg
>
> But instead Wget downloads another file:
> https://cgamos.ru/images/qr_pobeda2.png
>
> Please see the screenshot attached.
> It says after awaiting response: *302 Moved Temporarily*
> and then *it uses a location different from one that I supply in my
> url-list.txt file*.
>
> Could you please help me to fix this problem?
>
> Best regards,
> Vladimir
>
> Attachments:
> * 2024-04-07_21-56-54.png



Re: not working with ssl/ipv6?

2024-03-28 Thread Darshit Shah



On Wed, Mar 27, 2024, at 22:05, Brian Vargo wrote:
> Should probably include this:
> $ wget --version
> GNU Wget 1.21.2 built on linux-gnu.
> $ ufw down; wget
> http://ftp.us.debian.org/debian/pool/main/f/foliate/foliate_4.~really3.1.0-0.1_all.deb
> #no firewall
> sudo: ufw: command not found
> --2024-03-27 16:45:23--
> http://ftp.us.debian.org/debian/pool/main/f/foliate/foliate_4.~really3.1.0-0.1_all.deb
> Resolving ftp.us.debian.org (ftp.us.debian.org)... 2600:3402:200:227::2,
> 2600:3404:200:237::2, 2620:0:861:2:208:80:154:139, ...
> Connecting to ftp.us.debian.org (ftp.us.debian.org
> )|2600:3402:200:227::2|:80...
>
> I see this isn't over ssl, just 80.  But it's IPV6.  Is there some way to
> turn that off (for now until I figure out why IPv4 requests work and the
> others don't?

You can use the `-4` option to force IPv4 only mode. 
>
> Suggestions on where else I should start looking for the problem with IPv6?
> (*sigh*)  Firewall's not the issue as you can see unless there's some IP
> tables thing I didn't do.
>
No idea. But as an additional anecdote, I also seem to have trouble with IPv6 
connectivity to GitHub servers in particular. 

> also what resolver is feeding it IPv6 addresses instead IPv4 addresses?  I
> could stop it there too...
>
Your default system resolver. We just make a libc getaddrinfo() call.
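
Not Wget's source, just a minimal standalone sketch of the libc call in question:
with hints.ai_family set to AF_INET the resolver returns only IPv4 addresses
(roughly what `-4` asks for), while AF_UNSPEC returns both families in whatever
order the system prefers. The host name and port below are only examples.

```
/* Illustrative sketch only; not Wget's code.  Restrict a name lookup
 * to IPv4 by passing AF_INET to getaddrinfo(). */
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
    struct addrinfo hints, *res, *p;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* AF_UNSPEC would return IPv6 and IPv4 */
    hints.ai_socktype = SOCK_STREAM;

    int err = getaddrinfo("ftp.us.debian.org", "80", &hints, &res);
    if (err != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
        return 1;
    }
    for (p = res; p != NULL; p = p->ai_next) {
        char buf[INET6_ADDRSTRLEN];
        const void *addr = (p->ai_family == AF_INET)
            ? (const void *) &((struct sockaddr_in *) p->ai_addr)->sin_addr
            : (const void *) &((struct sockaddr_in6 *) p->ai_addr)->sin6_addr;
        inet_ntop(p->ai_family, addr, buf, sizeof buf);
        printf("%s\n", buf);              /* prints only IPv4 addresses here */
    }
    freeaddrinfo(res);
    return 0;
}
```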
>
> On Wed, Mar 27, 2024 at 2:01 PM Tim Rühsen  wrote:
>
>> On 3/27/24 17:32, Brian Vargo wrote:
>> > I've not been able to use wget on ssl with github and debian.  I ^C'ed
>> out
>> > of the gitusercontent one (it would have timed out) and the second one is
>> > just a random example:
>> >
>> > ```
>> > $ wget
>> >
>> https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
>> > --2024-03-27 12:27:25--
>> >
>> https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
>> > Resolving raw.githubusercontent.com (raw.githubusercontent.com)...
>> > 2606:50c0:8000::154, 2606:50c0:8002::154, 2606:50c0:8003::154, ...
>> > Connecting to raw.githubusercontent.com
>> > (raw.githubusercontent.com)|2606:50c0:8000::154|:443...
>> > ^C
>>
>> Works fine for me on Debian.
>> Did you check your network / firewall?
>>
>> $ wget
>>
>> https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
>> --2024-03-27
>> 
>> 18:49:21--
>>
>> https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
>> Resolving raw.githubusercontent.com (raw.githubusercontent.com)...
>> 2606:50c0:8002::154, 2606:50c0:8003::154, 2606:50c0:8000::154, ...
>> Connecting to raw.githubusercontent.com
>> (raw.githubusercontent.com)|2606:50c0:8002::154|:443... connected.
>> HTTP request sent, awaiting response... 200 OK
>> Length: 1251 (1.2K) [text/plain]
>> Saving to: ‘docker-compose.yml’
>>
>> docker-compose.yml
>> 100%[===>]   1.22K
>> --.-KB/sin 0s
>>
>> 2024-03-27 18:49:21 (42.8 MB/s) - ‘docker-compose.yml’ saved [1251/1251]
>>
>>
>> $ wget --version
>> GNU Wget 1.24.5 built on linux-gnu.
>>
>> -cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls
>> +ntlm +opie +psl +ssl/gnutls
>>



Re: not working with ssl/ipv6?

2024-03-27 Thread Brian Vargo
Should probably include this:
$ wget --version
GNU Wget 1.21.2 built on linux-gnu.
$ ufw down; wget
http://ftp.us.debian.org/debian/pool/main/f/foliate/foliate_4.~really3.1.0-0.1_all.deb
#no firewall
sudo: ufw: command not found
--2024-03-27 16:45:23--
http://ftp.us.debian.org/debian/pool/main/f/foliate/foliate_4.~really3.1.0-0.1_all.deb
Resolving ftp.us.debian.org (ftp.us.debian.org)... 2600:3402:200:227::2,
2600:3404:200:237::2, 2620:0:861:2:208:80:154:139, ...
Connecting to ftp.us.debian.org (ftp.us.debian.org
)|2600:3402:200:227::2|:80...

I see this isn't over ssl, just 80.  But it's IPv6.  Is there some way to
turn that off (for now, until I figure out why IPv4 requests work and the
others don't)?

Suggestions on where else I should start looking for the problem with IPv6?
(*sigh*)  Firewall's not the issue as you can see unless there's some IP
tables thing I didn't do.

also, what resolver is feeding it IPv6 addresses instead of IPv4 addresses?  I
could stop it there too...


On Wed, Mar 27, 2024 at 2:01 PM Tim Rühsen  wrote:

> On 3/27/24 17:32, Brian Vargo wrote:
> > I've not been able to use wget on ssl with github and debian.  I ^C'ed
> out
> > of the gitusercontent one (it would have timed out) and the second one is
> > just a random example:
> >
> > ```
> > $ wget
> >
> https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
> > --2024-03-27 12:27:25--
> >
> https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
> > Resolving raw.githubusercontent.com (raw.githubusercontent.com)...
> > 2606:50c0:8000::154, 2606:50c0:8002::154, 2606:50c0:8003::154, ...
> > Connecting to raw.githubusercontent.com
> > (raw.githubusercontent.com)|2606:50c0:8000::154|:443...
> > ^C
>
> Works fine for me on Debian.
> Did you check your network / firewall?
>
> $ wget
>
> https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
> --2024-03-27
> 
> 18:49:21--
>
> https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
> Resolving raw.githubusercontent.com (raw.githubusercontent.com)...
> 2606:50c0:8002::154, 2606:50c0:8003::154, 2606:50c0:8000::154, ...
> Connecting to raw.githubusercontent.com
> (raw.githubusercontent.com)|2606:50c0:8002::154|:443... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 1251 (1.2K) [text/plain]
> Saving to: ‘docker-compose.yml’
>
> docker-compose.yml
> 100%[===>]   1.22K
> --.-KB/sin 0s
>
> 2024-03-27 18:49:21 (42.8 MB/s) - ‘docker-compose.yml’ saved [1251/1251]
>
>
> $ wget --version
> GNU Wget 1.24.5 built on linux-gnu.
>
> -cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls
> +ntlm +opie +psl +ssl/gnutls
>


Re: not working with ssl/ipv6?

2024-03-27 Thread Tim Rühsen

On 3/27/24 17:32, Brian Vargo wrote:

I've not been able to use wget on ssl with github and debian.  I ^C'ed out
of the gitusercontent one (it would have timed out) and the second one is
just a random example:

```
$ wget
https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
--2024-03-27 12:27:25--
https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)...
2606:50c0:8000::154, 2606:50c0:8002::154, 2606:50c0:8003::154, ...
Connecting to raw.githubusercontent.com
(raw.githubusercontent.com)|2606:50c0:8000::154|:443...
^C


Works fine for me on Debian.
Did you check your network / firewall?

$ wget 
https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
--2024-03-27 18:49:21-- 
https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 
2606:50c0:8002::154, 2606:50c0:8003::154, 2606:50c0:8000::154, ...
Connecting to raw.githubusercontent.com 
(raw.githubusercontent.com)|2606:50c0:8002::154|:443... connected.

HTTP request sent, awaiting response... 200 OK
Length: 1251 (1.2K) [text/plain]
Saving to: ‘docker-compose.yml’

docker-compose.yml 
100%[===>]   1.22K 
--.-KB/s    in 0s


2024-03-27 18:49:21 (42.8 MB/s) - ‘docker-compose.yml’ saved [1251/1251]


$ wget --version
GNU Wget 1.24.5 built on linux-gnu.

-cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls
+ntlm +opie +psl +ssl/gnutls


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [bug-wget] Version jump from 1.21.4 to 1.24.5

2024-03-16 Thread Darshit Shah
Hi Brian,

Thanks for pointing out the erroneous jump in the version numbers.
The jump in version numbers happened because I made a typo when finalizing the 
tag just before the release. Unfortunately, this typo went unseen until now.

Fear not, I will not be retracting this version. The version is out and it will 
stay as it is. The missing versions will remain a modern mystery :)

On Sat, Mar 16, 2024, at 16:17, Brian Inglis wrote:
> Hi folks,
>
> Why was the version incremented from 1.21.4 to 1.24.5 rather than 
> 1.21.5 or 1.22.1?
> I want to be certain that this will not be changed before I release a 
> package.
>
> -- 
> Take care. Thanks, Brian Inglis  Calgary, Alberta, Canada
>
> La perfection est atteinte   Perfection is achieved
> non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
> mais lorsqu'il n'y a plus rien à retirer but when there is no more to cut
>  -- Antoine de Saint-Exupéry



Re: wget-1.24.5 released [stable]

2024-03-11 Thread Sam James
Darshit Shah  writes:

> This is to announce wget-1.24.5, a stable release.
>
> This is another relatively slow release with minor bug fixes, the main
> one being a correction in how subdomains of Top-Level Domains (TLDs)
> are treated when checking for suffixes during HSTS lookups. This is a
> very low criticality vulnerability that has now been patched.
>
> There have been 33 commits by 6 people in the 43 weeks since 1.21.4.
>
> See the NEWS below for a brief summary.
>
> Thanks to everyone who has contributed!
> The following people contributed changes to this release:
>
>   Christian Weisgerber (1)
>   Darshit Shah (20)
>   Jan Palus (1)
>   Jan-Michael Brummer (1)
>   Tim Rühsen (9)
>   Yaakov Selkowitz (1)
>
> Darshit Shah
>  [on behalf of the wget maintainers]
> ==
>
> Here is the GNU wget home page:
> https://gnu.org/s/wget/
>
> For a summary of changes and contributors, see:
> https://git.sv.gnu.org/gitweb/?p=wget.git;a=shortlog;h=v1.24.5
> or run this command from a git-cloned wget directory:
>   git shortlog v1.21.4..v1.24.5
>
> Here are the compressed sources:
> https://ftpmirror.gnu.org/wget/wget-1.24.5.tar.gz (5.0MB)
> https://ftpmirror.gnu.org/wget/wget-1.24.5.tar.lz (2.5MB)
>
> Here are the GPG detached signatures:
> https://ftpmirror.gnu.org/wget/wget-1.24.5.tar.gz.sig
> https://ftpmirror.gnu.org/wget/wget-1.24.5.tar.lz.sig
>
> Use a mirror for higher download bandwidth:
> https://www.gnu.org/order/ftp.html
>
> Here are the SHA1 and SHA256 checksums:
>
>   62525de6f09486942831ca2e352ae6802fc2c3dd  wget-1.24.5.tar.gz
>   +i3DW6tRhOy8Rqnvg97yqqo/TJ88l9S9GdywfU2mN94=  wget-1.24.5.tar.gz
>   01659f427c2e90c7c943805db69ea00f5da79b07  wget-1.24.5.tar.lz
>   V6EHFR5O+U/flK/+z6xZiWPzcvEyk+2cdAMhBTkLNu4=  wget-1.24.5.tar.lz
>
> Verify the base64 SHA256 checksum with cksum -a sha256 --check
> from coreutils-9.2 or OpenBSD's cksum since 2007.
>
> Use a .sig file to verify that the corresponding file (without the
> .sig suffix) is intact.  First, be sure to download both the .sig file
> and the corresponding tarball.  Then, run a command like this:
>
>   gpg --verify wget-1.24.5.tar.gz.sig
>
> The signature should match the fingerprint of the following key:
>
>   pub   rsa4096 2015-10-14 [SC]
>     7845 120B 07CB D8D6 ECE5  FF2B 2A17 43ED A91A 35B6
>   uid   Darshit Shah 
>   uid   Darshit Shah 
>
> If that command fails because you don't have the required public key,
> or that public key has expired, try the following commands to retrieve
> or refresh it, and then rerun the 'gpg --verify' command.
>
>   gpg --locate-external-key g...@darnir.net
>
>   gpg --recv-keys 64FF90AAE8C70AF9
>
>   wget -q -O-
> 'https://savannah.gnu.org/project/release-gpgkeys.php?group=wget=1'
> | gpg --import -
>

The version of your key in this keyring seems to be expired. Could you
upload a new one? Thanks.

> As a last resort to find the key, you can try the official GNU
> keyring:
>
>   wget -q https://ftp.gnu.org/gnu/gnu-keyring.gpg
>   gpg --keyring gnu-keyring.gpg --verify wget-1.24.5.tar.gz.sig
>
> This release was bootstrapped with the following tools:
>   Autoconf 2.72
>   Automake 1.16.5
>   Gnulib v0.1-7211-gd15237a22b
>
> NEWS
>
> * Noteworthy changes in release 1.24.5 (2024-03-10) [stable]
>
> ** Fix how subdomain matches are checked for HSTS.
>    Fixes a minor issue where cookies may be leaked to the wrong domain
>
> ** Wget will now also parse the srcset attribute in <img> HTML tags
>
> ** Support reading fetchmail style "user" and "passwd" fields from netrc
>
> ** In some cases, prevent the confusing "Cannot write to... (success)"
>error messages
>
> ** Support extremely fast download speeds (TB/s).
>    Previously this would cause Wget to crash when printing the speed
>
> ** Improve portability on OpenBSD to run the test suite
>
> ** Ensure that CSS URLs are correctly quoted (Bug: 64082)
>
> [2. OpenPGP public key --- application/pgp-keys; 
> OpenPGP_0x2A1743EDA91A35B6.asc]...



Re: Is it possible to get wget --content-disposition to overwrite an existing file.

2024-03-03 Thread Mark Sapiro

On 3/3/24 10:35 AM, Tim Rühsen wrote:


Wget tries not to overwrite files.

But if you want to remove the old file, no matter what, why don't you 
remove (or rename) that file before calling wget?



Thanks for the response. I understand that I can do that, but the wget 
invocation is part of an automated Python script. I had already modified that 
script to rename the downloaded file, but in the interest of 
simplification I wanted to just do the whole thing with a single call to 
`wget --content-disposition`.


Anyway, this is all moot now. Originally, the script was using Python's 
urllib.request module to retrieve the file, but the server recently 
started blocking GETs from that module so I started using wget instead. 
I then realized I could set the User-Agent header in the GET request to 
avoid the block, and that's what I'm doing now.


--
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan




Re: Integer overflows in parse_content_range() and gethttp()

2024-03-03 Thread Tim Rühsen

Thanks vulnerabilityspotter,

are there any insights why this specific integer overflow is a 
vulnerability? ("potential" is not enough)


TL;DR for the audience:
An AI generated vulnerability report. I mostly agree with what has been 
written for curl at https://github.com/curl/curl/issues/12983


But I am not saying that the code is good code, it can definitely be 
improved. And also, I like the approach of automated searching for 
defects in software, be it an AI tool or whatever. Keep on improving!
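
As an illustration only (this is neither Wget's current parse_content_range()
nor its actual fix), a bounds-checked version of the digit accumulation flagged
below, num = 10 * num + (*hdr - '0'), can refuse the value before the
multiply/add wraps. The helper name and the int64_t width are assumptions of
the sketch:

```
/* Illustrative sketch only.  Accumulate a decimal number from a header
 * string, rejecting any value that would overflow instead of silently
 * wrapping.  int64_t stands in for whatever width the real parser uses. */
#include <stdint.h>
#include <stdbool.h>
#include <ctype.h>

static bool
parse_decimal (const char *hdr, int64_t *out)
{
  int64_t num = 0;

  if (!isdigit ((unsigned char) *hdr))
    return false;
  for (; isdigit ((unsigned char) *hdr); hdr++)
    {
      int digit = *hdr - '0';
      /* Would 10 * num + digit exceed INT64_MAX? */
      if (num > (INT64_MAX - digit) / 10)
        return false;
      num = 10 * num + digit;
    }
  *out = num;
  return true;
}
```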


In case you want to report more findings, please do not use the ML, as 
all posts here are public. Either report to Savannah or to Gitlab in 
"private" mode.


Regards, Tim

On 2/24/24 23:38, vulnerabilityspotter--- via Primary discussion list 
for GNU Wget wrote:

Security Vulnerability Report

File: src/http.c

Functions: parse_content_range() and gethttp()

Vulnerability Type: Integer Overflow

Location: Lines 936, 942, 955 and 3739

Severity: High

Description:

In the parse_content_range() function, at lines 936, 942, 955, there exists a 
vulnerability related to an integer overflow. The vulnerability arises from the 
calculation of the variable num, which is assigned the value of

num = 10 * num + (*hdr - '0');

Both the multiplication and the addition can overflow, leading to unexpected 
behavior, due to the lack of validation.

Furthermore similarly to 
[curl/curl#12983](https://github.com/curl/curl/issues/12983), at line 3739 of 
function gethttp(), the calculation of the contlen variable can also overflow:

contlen = last_byte_pos - first_byte_pos + 1;

Exploitation Scenario:

An attacker may craft a malicious request with carefully chosen values in the 
Content-Range header, triggering an integer overflow during the calculation of 
num and contlen. This could potentially lead to various security issues, such 
as memory corruption, buffer overflows, or unexpected behavior, depending on 
how the num and contlen variables are subsequently used.

Impact:

The impact of this vulnerability could be severe, potentially leading to:

Memory Corruption: If the calculated num and contlen values are used to allocate 
memory or perform operations such as copying data, an integer overflow could 
result in memory corruption, leading to crashes or arbitrary code execution.

Security Bypass: In scenarios where num and contlen values are used to enforce 
boundaries or permissions, an attacker may exploit the integer overflow to 
bypass security checks or gain unauthorized access to sensitive resources.

Denial of Service (DoS): A carefully crafted request exploiting the integer 
overflow could cause the application to enter an unexpected state or consume 
excessive resources, leading to a denial of service condition.

Recommendations:

Bounds Checking: Implement proper bounds checking to ensure that the values of 
num and contlen are within acceptable ranges before performing calculations.

Safe Arithmetic Operations: Consider using safer arithmetic operations or 
alternative calculation methods to prevent integer overflows, especially when 
dealing with potentially large or close-to-boundary values.

Input Validation: Validate input parameters to ensure they adhere to expected 
ranges and constraints before performing calculations.

Error Handling: Implement robust error handling mechanisms to gracefully handle 
scenarios where input parameters result in unexpected or invalid calculations.

Severity Justification:

The presence of an integer overflow vulnerability at lines 936, 942, 955 and 
3739 poses a high risk to the security and stability of the application. 
Exploitation of this vulnerability could lead to severe consequences, including 
memory corruption, security bypass, or denial of service conditions.

Affected Versions:

This vulnerability affects all versions of the application that include the 
vulnerable parse_content_range() and gethttp() functions.

References:

OWASP Integer Overflow
CWE-190: Integer Overflow or Wraparound
CERT Secure Coding - INT32-C

Conclusion:

The presence of an integer overflow vulnerability at lines 936, 942, 955 in the 
parse_content_range() function and line 3739 of gethttp() poses a high risk to 
the security and stability of the application. It is imperative to address this 
vulnerability promptly by implementing appropriate bounds checking and error 
handling mechanisms to prevent potential exploitation and associated security 
risks.

Sent with [Proton Mail](https://proton.me/) secure email.


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Is it possible to get wget --content-disposition to overwrite an existing file.

2024-03-03 Thread Tim Rühsen

On 2/25/24 03:40, Mark Sapiro wrote:
As far as I can tell, the only way to get wget to overwrite an existing 
file is with the `-O` option, but I want to use `--content-disposition` 
to write the output to the filename in the Content-Disposition: header. 
This works, but if the filename already exists it writes to filename.1 
instead. Is there a way to get it to overwrite the existing file?




Wget tries not to overwrite files.

But if you want to remove the old file, no matter what, why don't you 
remove (or rename) that file before calling wget?


Regards, Tim


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: wget input/output using stdin/stdout

2024-03-01 Thread Darshit Shah
Hi Dan,

For this use case, I would highly recommend using the successor to GNU Wget, 
GNU Wget2. It is not available in most distribution repositories. See 
https://gitlab.com/gnuwget/wget2

Wget2 supports reading from stdin throughout the life of the program. 
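
For reference, here is a minimal sketch of the parent-side plumbing Dan
describes below (pipe + fork + exec around "wget -i - -O -"). It is only an
illustration of the setup, not a statement about how wget or wget2 buffer their
output, and the URL is just a placeholder:

```
/* Illustrative sketch: drive a child process through pipes, writing URLs
 * to its stdin and reading the downloaded data from its stdout. */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    int to_child[2], from_child[2];
    if (pipe(to_child) < 0 || pipe(from_child) < 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                        /* child: becomes the downloader */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[1]); close(from_child[0]);
        execlp("wget", "wget", "-q", "-i", "-", "-O", "-", (char *) NULL);
        perror("execlp"); _exit(127);
    }
    close(to_child[0]); close(from_child[1]);

    /* Parent: send one URL, then close stdin so the child sees end-of-input. */
    dprintf(to_child[1], "https://www.gnu.org/software/wget/\n");
    close(to_child[1]);

    char buf[4096];
    ssize_t n;
    while ((n = read(from_child[0], buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t) n, stdout);
    close(from_child[0]);
    waitpid(pid, NULL, 0);
    return 0;
}
```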

On Sat, Mar 2, 2024, at 09:35, Dan Lewis via Primary discussion list for GNU 
Wget wrote:
> Greetings,
>
> I have a program that loads and executes wget using the following command
> line:
>
> wget -i - -O -
>
>
> and dups wget's stdin, stdout (and stderr) handles so that I can write URLs
> to wget's stdin and read the responses from wget's stdout. What I wanted to
> do was to write a sequence of URLs to wget's stdin, reading each response
> before the next URL is sent. Rather, wget buffers its output so that it
> doesn't output anything until I close its stdin. As a result, it seems that I
> can only send all of the URLs to wget, close its stdin, and then read all
> of the responses.
>
> Is there any wget command line option that will cause wget to output a
> response after each URL without waiting for me to close its stdin?
>
> Thanks!
> Dan



Re: Wget fails to download some URLs from www.investing.com

2024-02-18 Thread Tim Rühsen
On 2/14/24 11:21, Chris Smith via Primary discussion list for GNU Wget 
wrote:

Hi guys,
This is not so much a bug as requests being blocked by the cloudflare server.

Checkout:
 https://www.investing.com/crypto/bitcoin/btc-usd-historical-data


The server just gives a 403 Forbidden.



The URL works in Firefox, but fails to download using cURL or Wget.
I have tried various user-agent strings, so that is not the problem.


Maybe the page requires a Referer: header? In theory it could be 
anything, e.g. the page is HTTP/3 only or whatnot :|


You can start with F12 (developer console) in Firefox, load the page and 
check the request headers. Or better: there is a copy-as-curl when 
right-clicking on the request line that puts a full curl command line 
into your clipboard.




The following URL works for Firefox, cURL and Wget:
 https://www.investing.com/equities/lloyds-banking-grp-historical-data

Kind regards,
Chris Smith 

_

Your E-Mail. Your Cloud. Your Office. eclipso Mail Europe. 
https://www.eclipso.de





OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Semicolon not allowed in userinfo

2024-02-05 Thread Jeffrey Walton
On Mon, Feb 5, 2024 at 4:57 PM Bachir Bendrissou  wrote:
>
> The url attached example contains a semicolon in the userinfo segment.
>
> Wget rejects this url with the following error message:
>
> *Bad port number.*
>
> It seems that Wget sees "c" as a port number. When "c" is replaced by a
> digit, Wget accepts the url and attempts to resolve "xyz".
>
> It's worth noting that both curl and aria2 accept the url example.
>
> Why is the semicolon not allowed in userinfo, despite the fact that other
> special characters are allowed?

A colon in the userinfo is deprecated but not forbidden. However, an
application can choose to reject it. From RFC 3986, Uniform Resource
Identifier (URI): Generic Syntax, Section 3.2,
.

   The userinfo subcomponent may consist of a user name and, optionally,
   scheme-specific information about how to gain authorization to access
   the resource.  The user information, if present, is followed by a
   commercial at-sign ("@") that delimits it from the host.

  userinfo= *( unreserved / pct-encoded / sub-delims / ":" )

   Use of the format "user:password" in the userinfo field is
   deprecated.  Applications should not render as clear text any data
   after the first colon (":") character found within a userinfo
   subcomponent unless the data after the colon is the empty string
   (indicating no password).  Applications may choose to ignore or
   reject such data when it is received as part of a reference and
   should reject the storage of such data in unencrypted form.

According to the BNF in Appendix A, the semicolon ';' is allowed as a
sub-delims token. It does not need to be percent encoded.

Jeff
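
Not Wget's URL parser, just a standalone sketch of the rule quoted above: it
answers whether a byte may appear literally in the userinfo component
(percent-encoding handling is omitted and ASCII/the C locale is assumed). The
semicolon is accepted because it is one of the sub-delims:

```
/* Illustrative sketch of the RFC 3986 userinfo rule:
 *   userinfo = *( unreserved / pct-encoded / sub-delims / ":" )   */
#include <ctype.h>
#include <string.h>
#include <stdbool.h>

static bool
userinfo_char_ok (unsigned char c)
{
  static const char sub_delims[] = "!$&'()*+,;=";

  /* unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" */
  if (isalnum (c) || c == '-' || c == '.' || c == '_' || c == '~')
    return true;
  if (c != '\0' && strchr (sub_delims, c) != NULL)
    return true;
  return c == ':';   /* allowed, though "user:password" is deprecated */
}
```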



Re: Long time no pkt running with limit-rate

2024-01-30 Thread Darshit Shah
Hi,

Thanks for the report. 
 If I understand you correctly, then this is expected behavior. The way the 
rate limiting is implemented is that it allows a few packets through at the 
maximum bandwidth and then simply sleeps for a time such that the average rate 
is roughly equal to the one set by the user. 

So, waiting for about 70 seconds between packets seems about correct. I don't 
really know how one could implement a better rate limiting algorithm entirely 
in user space. 
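
This is not Wget's actual code, only a sketch of the scheme described above:
chunks are read at full speed, then the loop sleeps until the running average
falls back to the configured limit. With a 700K file and a 10K/s cap, a long
pause right after a fast burst is consistent with this mechanism. The function
name and the double-seconds timestamps are assumptions of the sketch.

```
/* Illustrative sketch of sleep-based rate limiting in user space. */
#include <stdint.h>
#include <time.h>
#include <unistd.h>

static void
throttle (uint64_t bytes_so_far, double limit_bytes_per_sec, double started_at)
{
  struct timespec now;
  clock_gettime (CLOCK_MONOTONIC, &now);

  double elapsed  = (now.tv_sec + now.tv_nsec / 1e9) - started_at;
  /* How long the bytes transferred so far "should" have taken at the limit. */
  double expected = bytes_so_far / limit_bytes_per_sec;
  if (expected > elapsed)
    usleep ((useconds_t) ((expected - elapsed) * 1e6));
}
```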

On Tue, Jan 30, 2024, at 10:00, Lei B Bao wrote:
> Hi Wget team.
>
> Now we’re using wget version 1.21 to download a 700K file with a 
> fixed rate-limit of 10K, and we expect the packets to keep flowing for 
> the ~70s transfer, but we found there is a 60s+ period with no packets:
>
>
> root@3c578c2b5c87:/# wget -V
>
> GNU Wget 1.21 built on linux-gnu.
>
>
>
> root@3c578c2b5c87:/# wget 
> http://192.168.200.30:80/wget_traffic_id_2_700KB
>  
> -O /dev/null --limit-rate=10K
>
>
>
>
>
> 
>
> 02:29:28.007124 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [.], ack 698203, win 1391, options [nop,nop,TS val 3141532161 ecr 
> 3882564233], length 0
>
> 02:29:28.007126 IP 192.168.200.30.80 > 192.168.200.200.44544: Flags 
> [P.], seq 698203:700259, ack 151, win 28, options [nop,nop,TS val 
> 3882564233 ecr 3141532160], length 2056: HTTP
>
> 02:29:28.007131 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [.], ack 700259, win 1408, options [nop,nop,TS val 3141532161 ecr 
> 3882564233], length 0
>
>
>
> < 60s+ no pkts running
>
>
>
>
>
> 02:30:36.358950 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [F.], seq 151, ack 700259, win 1408, options [nop,nop,TS val 3141600513 
> ecr 3882564233], length 0
>
> 02:30:36.568323 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [F.], seq 151, ack 700259, win 1408, options [nop,nop,TS val 3141600723 
> ecr 3882564233], length 0
>
> 02:30:36.776340 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [F.], seq 151, ack 700259, win 1408, options [nop,nop,TS val 3141600931 
> ecr 3882564233], length 0
>
> 02:30:37.184305 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [F.], seq 151, ack 700259, win 1408, options [nop,nop,TS val 3141601339 
> ecr 3882564233], length 0
>
> 02:30:38.032310 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [F.], seq 151, ack 700259, win 1408, options [nop,nop,TS val 3141602187 
> ecr 3882564233], length 0
>
> 02:30:39.696306 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [F.], seq 151, ack 700259, win 1408, options [nop,nop,TS val 3141603851 
> ecr 3882564233], length 0
>
> 02:30:42.960334 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [F.], seq 151, ack 700259, win 1408, options [nop,nop,TS val 3141607115 
> ecr 3882564233], length 0
>
> 02:30:49.808344 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [F.], seq 151, ack 700259, win 1408, options [nop,nop,TS val 3141613963 
> ecr 3882564233], length 0
>
> ^@02:31:03.120308 IP 192.168.200.200.44544 
> > 192.168.200.30.80: Flags [F.], seq 151, ack 700259, win 1408, options 
> [nop,nop,TS val 3141627275 ecr 3882564233], length 0
>
> 02:31:29.232320 IP 192.168.200.200.44544 > 192.168.200.30.80: Flags 
> [F.], seq 151, ack 700259, win 1408, options [nop,nop,TS val 3141653387 
> ecr 3882564233], length 0
>
> We also tested with the old version 1.20.1; it seems good, but there is 
> also about 5 s with no packets:
>
>
> root@ebe3ce58547c:/# wget -V
>
> GNU Wget 1.20.1 built on linux-gnu.
>
>
>
> root@ebe3ce58547c:/# wget 
> http://192.168.200.30:80/wget_traffic_id_2_700KB
>  
> -O /dev/null --limit-rate=10K
>
>
>
> 02:42:00.187604 IP 192.168.200.200.46556 > 192.168.200.30.80: Flags 
> [.], ack 683507, win 195, options [nop,nop,TS val 265728065 ecr 
> 265435757], length 0
>
> 02:42:00.187631 IP 192.168.200.30.80 > 192.168.200.200.46556: Flags 
> [P.], seq 683507:692455, ack 165, win 489, options [nop,nop,TS val 
> 265435757 ecr 265728065], length 8948: HTTP
>
> 02:42:00.187633 IP 192.168.200.30.80 > 192.168.200.200.46556: Flags 
> [P.], seq 692455:698203, ack 165, win 489, options [nop,nop,TS val 
> 265435757 ecr 265728065], length 5748: HTTP
>
> 02:42:00.187633 IP 192.168.200.30.80 > 192.168.200.200.46556: Flags 
> [P.], seq 698203:700259, ack 165, win 489, options [nop,nop,TS val 
> 265435757 ecr 265728065], length 2056: HTTP
>
> 02:42:00.231702 IP 192.168.200.200.46556 > 192.168.200.30.80: Flags 
> [.], ack 700259, win 94, options [nop,nop,TS val 265728110 ecr 
> 265435757], length 0
>
>
>
> <<< ~5s no pkts running.
>
>
>
> 02:42:05.787250 IP 192.168.200.200.46556 > 192.168.200.30.80: Flags 
> [.], ack 700259, win 411, options [nop,nop,TS val 265733665 ecr 
> 265435757], length 0
>
> 02:42:07.746996 IP 192.168.200.200.46556 > 192.168.200.30.80: Flags 
> [F.], seq 165, ack 700259, win 443, options [nop,nop,TS val 265735625 
> ecr 265435757], length 0
>
> 

Re: Semicolon not allowed in userinfo

2024-01-29 Thread Bachir Bendrissou
Hi all,

Thank you for your replies.

The URL I posted is not the one you received and does not contain any
space. The url may have been botched by the mailing list. I attach the url
here for your reference.

"http://a;b:c@xyz;

Best,
Bachir

On Thu, Oct 5, 2023 at 9:47 PM Tim Rühsen  wrote:

> On 10/4/23 14:04, Bachir Bendrissou wrote:
> > Hi Tim,
> >
> > Wget doesn't follow the current specs and the parsing is lenient to
> >> accept some types of badly formatted URLs seen in the wild.
> >>
> >
> > Did you mean to say that the parsing is overly strict, and needs to be
> more
> > permissive?
>
> I tried to make clear that it is not the semicolon.
> What was unclear?
>
> > Also, as Daniel pointed out, your curl input example appears to have a
> > space.
>
> Sorry, the curl URL was not mine, it was yours. May I cite the URL from
> your original email? There is a space, no?
>
>  > *http://a ;b:c@xyz*
>
> Regards, Tim
>
> >
> > Bachir
> >
> > On Tue, Oct 3, 2023 at 1:45 PM Daniel Stenberg  wrote:
> >
> >> On Tue, 3 Oct 2023, Tim Rühsen wrote:
> >>
> >>> My  version of curl (8.3.0) doesn't accept it:
> >>>
> >>> curl -vvv 'http://a ;b:c@xyz'
> >>> * URL rejected: Malformed input to a URL function
> >>
> >> That's in no way a legal URL (accortding to RFC 3986) and it is not the
> >> semicolon that causes curl to reject it. It is the space.
> >>
> >> But I don't know if that is maybe your clients or the mailing list
> >> software
> >> that botched it so badly?
> >>
> >> --
> >>
> >>/ daniel.haxx.se
>
http://a;b:c@xyz

Re: [PATCH] * src/html-url.c: Parse attributes

2024-01-28 Thread Tim Rühsen

On 1/1/24 13:56, blankie via Primary discussion list for GNU Wget wrote:

Hello,

Attached is a patch to fix https://savannah.gnu.org/bugs/?55087. This 
patch is slightly hacky ( in a media element is allowed), 
but the code as-is also allows for  in a  and 
 outside of a  or media element.


Thank you for the contribution.

I decided to implement this in a slightly different way in commit 
4100339a2ba8b0c0fdf36558222f05c27aa7808a and added a missing test.


Without you, issue 55087 would have gone unseen for a much longer time.

Regards, Tim


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: read error

2024-01-27 Thread Tim Rühsen

On 1/11/24 05:40, 王强 via Primary discussion list for GNU Wget wrote:

Wget is consistently encountering a byte read error; increasing the number of 
attempts and the waiting time does not solve the problem.
(2024-01-11 12:37:20 (17.3 KB/s) - Read error at byte 2122610 
(Success).Retrying.)
url:https://zinc15.docking.org/substances/subsets/in-trials.sdf?count=all




Thanks for the report.

Just to confirm: The issue is reproducible with wget 1.21.4 on Linux.

The download works as expected with wget2 2.1.0.

Regards, Tim


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [bug #65009] wget refuses to use legitimate self signed CAs provided with the --ca-certificate flag

2024-01-25 Thread Jeffrey Walton
On Mon, Dec 11, 2023 at 2:32 PM Jeffrey Walton  wrote:
>
> On Mon, Dec 11, 2023 at 9:54 AM anonymous  wrote:
> >
> > URL:
> >   
> >
> >  Summary: wget refuses to use legitimate self signed CAs
> > provided with the --ca-certificate flag
> >Group: GNU Wget
> >Submitter: None
> >Submitted: Mon 11 Dec 2023 02:53:19 PM UTC
> > Category: Program Logic
> > Severity: 3 - Normal
> > Priority: 5 - Normal
> >   Status: None
> >  Privacy: Public
> >  Assigned to: None
> >  Originator Name: David Hadas
> > Originator Email: david.ha...@gmail.com
> >  Open/Closed: Open
> >  Release: None
> >  Discussion Lock: Any
> > Operating System: Mac OS
> >  Reproducibility: Every Time
> >Fixed Release: None
> >  Planned Release: None
> >   Regression: None
> >Work Required: None
> >   Patch Included: None
> >
> >
> > ___
> >
> > Follow-up Comments:
> >
> >
> > ---
> > Date: Mon 11 Dec 2023 02:53:19 PM UTC By: Anonymous
> > Release: 1.21
> >
> > ---
> >
> > Using mTLS with self signed certificates with various tools, it seems wget
> > misbehaves and does not add a legitimate self signed CA provided with the
> > --ca-certificate flag to the ca pool used internally.
> > (I expect that the same issue exists with TLS).
> >
> > The CA pem is legitimate and well structured as it is used successfully with
> > other tools: (1) curl (see below), (2) standard go client and server.
> >
> > Wget indicates "Self-signed certificate encountered" as an output although 
> > the
> > CA pem is provided using --ca-certificate
> > Wget provides the same response with and without the --ca-certificate...
> >
> > ---
> >
> > Here is an example:
> > % ./hack/ping.sh
> >
> > Connect to remote server using mTLS and self signed certificates
> >
> > Try Curl:
> >
> > + curl
> > https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> > --key prk.pem --cert cert.pem --cacert ca.pem
> > <<< Response from the server
> > Hello little client,<<< Response from the server
> > happy to serve you today<<< Response from the server
> > <<< Response from the server
> > + set +x
> >
> > Try Wget:
> >
> > + wget
> > https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> > --private-key prk.pem --certificate cert.pem --ca-certificate ca.pem
> > --2023-12-09 08:43:37--
> > https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud/
> > Resolving
> > myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> > (myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud)...
> > 169.63.244.138
> > Connecting to
> > myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> > (myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud)|169.63.244.138|:443...
> > connected.
> > ERROR: cannot verify
> > myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud's
> > certificate, issued by ‘CN=test,O=test.research.ibm.com’:
> >   Self-signed certificate encountered.
> > To connect to
> > myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> > insecurely, use `--no-check-certificate'.
> > + set +x
> >
> > ---
> >
> > Example running with debug mode:
> > % ./hack/ping.sh
> >
> > Connect to remote server using mTLS and self signed certificates
> >
> > Try Curl:
> >
> > + curl -v
> > https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> > --key prk.pem --cert cert.pem --cacert ca.pem
> > *   Trying 169.63.244.138:443...
> > * Connected to
> > myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> > (169.63.244.138) port 443 (#0)
> > * ALPN: offers h2,http/1.1
> > * (304) (OUT), TLS handshake, Client hello (1):
> > *  CAfile: ca.pem
> > *  CApath: none
> > * (304) (IN), TLS handshake, Server hello (2):
> > * (304) (IN), TLS handshake, Unknown (8):
> > * (304) (IN), TLS handshake, Request CERT (13):
> > * (304) (IN), TLS handshake, Certificate (11):
> > * (304) (IN), TLS handshake, CERT verify (15):
> > * (304) (IN), TLS handshake, Finished (20):
> > * (304) (OUT), TLS handshake, Certificate (11):
> > * (304) (OUT), TLS handshake, CERT verify (15):
> > * (304) (OUT), TLS handshake, Finished (20):
> > * SSL connection using TLSv1.3 / 

Re: Navigation Bar cannot be displayed after wgetting www.doxygen.nl

2024-01-11 Thread Gisle Vanem

Haowei Hsu wrote:


*wget --mirror --convert-links --adjust-extension --page-requisites
--no-parent https://www.doxygen.nl/ *

However, everything seems well except that the Navigation Bar cannot be
displayed.

[image: image.png]

What happened? Is this a bug of Wget? If so, how to fix this?


Trying this myself on Win-10, I get "Too many open files"
after 505 files were saved! As if Wget or Gnulib is not closing
files correctly (?).


- OS version: Windows 11
- Wget version: 1.21.4


Same as my wget version.


--
--gv



Re: Bug report

2023-12-15 Thread Darshit Shah
And what is the problem?

On Mon, Dec 11, 2023, at 18:45, Ritick sethi wrote:
> riticksethi@d7-138-10 homebrew % ln -sf ../Cellar/wget/1.16.1/bin/wget 
> ~/homebrew/bin/wget
>
> ln: /Users/riticksethi/homebrew/bin/wget: No such file or directory
> riticksethi@d7-138-10 homebrew % mkdir -p ~/homebrew/bin
> riticksethi@d7-138-10 homebrew % ln -sf 
> ../../Cellar/wget/1.16.1/bin/wget ~/homebrew/bin/wget
> #
> zsh: command not found: #
> riticksethi@d7-138-10 homebrew % ln -sf 
> ../../Cellar/wget/1.16.1/bin/wget ~/homebrew/bin/wget
> riticksethi@d7-138-10 homebrew % wget --version
> GNU Wget 1.21.4 built on darwin23.0.0.
>
> -cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls 
> +ntlm +opie -psl +ssl/openssl 
>
> Wgetrc: 
> /opt/homebrew/etc/wgetrc (system)
> Locale: 
> /opt/homebrew/Cellar/wget/1.21.4/share/locale 
> Compile: 
> clang -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/opt/homebrew/etc/wgetrc" 
> -DLOCALEDIR="/opt/homebrew/Cellar/wget/1.21.4/share/locale" -I. 
> -I../lib -I../lib -I/opt/homebrew/opt/openssl@3/include 
> -I/opt/homebrew/Cellar/libidn2/2.3.4_1/include -DNDEBUG -g -O2 
> Link: 
> clang -I/opt/homebrew/Cellar/libidn2/2.3.4_1/include -DNDEBUG -g 
> -O2 -L/opt/homebrew/Cellar/libidn2/2.3.4_1/lib -lidn2 
> -L/opt/homebrew/opt/openssl@3/lib -lssl -lcrypto -ldl -lz 
> ../lib/libgnu.a -liconv -lintl -Wl,-framework -Wl,CoreFoundation 
> -lunistring 
>
> Copyright (C) 2015 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> .
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
>
> Originally written by Hrvoje Niksic .
> Please send bug reports and questions to .
>
> Regards
>
> Ritick sethi
> MSc student Sensor System Technology
> Hochschule Karlsruhe
> +49 176 2549 3112



Re: Issues on installation

2023-12-14 Thread Gisle Vanem

Noah Kpogo wrote:


Please I'm  trying to install nethunter using wget on termux, they only
keep saying invalid option. What should I do?


A typo. Here, a
  wget -O install-nethunder-termux htrps://offs.ec/2MceZW

gives "htrps://offs.ec/2MceZW: Unsupported scheme."

Fixing that, I get "ERROR 404: Not Found."

--
--gv



Re: Issues on installation

2023-12-14 Thread Darshit Shah
That looks like a 0 (Zero). The option you're looking for is -O (Capital O)

On Fri, Dec 15, 2023, at 06:12, Noah Kpogo wrote:
> Please I'm  trying to install nethunter using wget on termux, they only
> keep saying invalid option. What should I do?
>
> Attachments:
> * Screenshot_20231215-050807.png



Re: [bug #65009] wget refuses to use legitimate self signed CAs provided with the --ca-certificate flag

2023-12-11 Thread Jeffrey Walton
On Mon, Dec 11, 2023 at 9:54 AM anonymous  wrote:
>
> URL:
>   
>
>  Summary: wget refuses to use legitimate self signed CAs
> provided with the --ca-certificate flag
>Group: GNU Wget
>Submitter: None
>Submitted: Mon 11 Dec 2023 02:53:19 PM UTC
> Category: Program Logic
> Severity: 3 - Normal
> Priority: 5 - Normal
>   Status: None
>  Privacy: Public
>  Assigned to: None
>  Originator Name: David Hadas
> Originator Email: david.ha...@gmail.com
>  Open/Closed: Open
>  Release: None
>  Discussion Lock: Any
> Operating System: Mac OS
>  Reproducibility: Every Time
>Fixed Release: None
>  Planned Release: None
>   Regression: None
>Work Required: None
>   Patch Included: None
>
>
> ___
>
> Follow-up Comments:
>
>
> ---
> Date: Mon 11 Dec 2023 02:53:19 PM UTC By: Anonymous
> Release: 1.21
>
> ---
>
> Using mTLS with self signed certificates with various tools, it seems wget
> misbehaves and does not add a legitimate self signed CA provided with the
> --ca-certificate flag to the ca pool used internally.
> (I expect that the same issue exists with TLS).
>
> The CA pem is legitimate and well structured as it is used successfully with
> other tools: (1) curl (see below), (2) standard go client and server.
>
> Wget indicates "Self-signed certificate encountered" as an output although the
> CA pem is provided using --ca-certificate
> Wget provides the same response with and without the --ca-certificate...
>
> ---
>
> Here is an example:
> % ./hack/ping.sh
>
> Connect to remote server using mTLS and self signed certificates
>
> Try Curl:
>
> + curl
> https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> --key prk.pem --cert cert.pem --cacert ca.pem
> <<< Response from the server
> Hello little client,<<< Response from the server
> happy to serve you today<<< Response from the server
> <<< Response from the server
> + set +x
>
> Try Wget:
>
> + wget
> https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> --private-key prk.pem --certificate cert.pem --ca-certificate ca.pem
> --2023-12-09 08:43:37--
> https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud/
> Resolving
> myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> (myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud)...
> 169.63.244.138
> Connecting to
> myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> (myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud)|169.63.244.138|:443...
> connected.
> ERROR: cannot verify
> myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud's
> certificate, issued by ‘CN=test,O=test.research.ibm.com’:
>   Self-signed certificate encountered.
> To connect to
> myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> insecurely, use `--no-check-certificate'.
> + set +x
>
> ---
>
>
> Example running with debug mode:
> % ./hack/ping.sh
>
> Connect to remote server using mTLS and self signed certificates
>
> Try Curl:
>
> + curl -v
> https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> --key prk.pem --cert cert.pem --cacert ca.pem
> *   Trying 169.63.244.138:443...
> * Connected to
> myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
> (169.63.244.138) port 443 (#0)
> * ALPN: offers h2,http/1.1
> * (304) (OUT), TLS handshake, Client hello (1):
> *  CAfile: ca.pem
> *  CApath: none
> * (304) (IN), TLS handshake, Server hello (2):
> * (304) (IN), TLS handshake, Unknown (8):
> * (304) (IN), TLS handshake, Request CERT (13):
> * (304) (IN), TLS handshake, Certificate (11):
> * (304) (IN), TLS handshake, CERT verify (15):
> * (304) (IN), TLS handshake, Finished (20):
> * (304) (OUT), TLS handshake, Certificate (11):
> * (304) (OUT), TLS handshake, CERT verify (15):
> * (304) (OUT), TLS handshake, Finished (20):
> * SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
> * ALPN: server accepted h2
> * Server certificate:
> *  subject: O=test.research.ibm.com; CN=test
> *  start date: Dec  9 06:42:29 2023 GMT
> *  expire date: Jan  8 06:42:29 2024 GMT
> *  subjectAltName: host
> 

Re: [bug #65007] wget uses non-standard way to print IPv6

2023-12-11 Thread Stephane Ascoet





I hesitate to change this because the format has existed for a long time and we
can be sure that changing it will break many scripts.

Hi, very good point. Commands like free, rm, mv, top... change their 
output from time to time and it's hell. The right answer would be to 
use a parameter to enable the new output.


--
Cordialement, Stephane Ascoet




Re: wget refuses to use legitimate self signed CAs provided with the --ca-certificate flag

2023-12-09 Thread Jeffrey Walton
On Sat, Dec 9, 2023 at 2:38 AM David Hadas  wrote:
>
> Using mTLS with self signed certificates with various tools, it seems wget
> misbehaves and does not add a legitimate self signed CA provided with the
> --ca-certificate flag to the ca pool used internally.
> (I expect that the same issue exists with TLS).
>
> The CA pem is legitimate and well structured as it is used successfully
> with other tools: (1) curl (see below), (2) standard go client and server.

Please show your CA certificate. Issue:

openssl x509 -in  -inform PEM -text -noout

The command assumes your cert is in PEM format.

> Wget indicates "Self-signed certificate encountered" as an output although
> the CA pem is provided using --ca-certificate
> Wget provides the same response with and without the --ca-certificate...
>
> [...]
> ERROR: cannot verify
> myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud's
> certificate, issued by ‘CN=test,O=test.research.ibm.com’:
>  Self-signed certificate encountered.

This may be a different problem. It sounds like the chain is
malformed, but you have not shown the chain. It may be due to your CA
cert, or it may not.

Please show the output of the TLS handshake. Issue:

export 
host=myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
openssl s_client -connect ${host}:443 -servername ${host} -showcerts

Jeff



Re: wget refuses to use legitimate self signed CAs provided with the --ca-certificate flag

2023-12-09 Thread Tim Rühsen

Hi,

Yeah, I'd expect --ca-certificate to work. It would be interesting 
to see whether --ca-directory works for you.


Which TLS library is your wget binary linked with? (Run "wget --version"; 
either openssl or gnutls will be listed.)


Regards, Tim

On 12/9/23 08:11, David Hadas wrote:

Hi,

Using mTLS with self signed certificates with various tools, it seems wget
misbehaves and does not add a legitimate self signed CA provided with the
--ca-certificate flag to the ca pool used internally.
(I expect that the same issue exists with TLS).

The CA pem is legitimate and well structured as it is used successfully
with other tools: (1) curl (see below), (2) standard go client and server.

Wget indicates "Self-signed certificate encountered" as an output although
the CA pem is provided using --ca-certificate
Wget provides the same response with and without the --ca-certificate...


Here is an example:

% ./hack/ping.sh

Connect to remote server using mTLS and self signed certificates

Try Curl:

+ curl
https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
--key prk.pem --cert cert.pem --cacert ca.pem

Hello little client,
happy to serve you today

+ set +x

Try Wget:

+ wget
https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
--private-key prk.pem --certificate cert.pem --ca-certificate ca.pem
--2023-12-09 08:43:37--
https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud/
Resolving
myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
(myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud)...
169.63.244.138
Connecting to
myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
(myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud)|169.63.244.138|:443...
connected.
ERROR: cannot verify
myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud's
certificate, issued by ‘CN=test,O=test.research.ibm.com’:
   Self-signed certificate encountered.
To connect to
myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
insecurely, use `--no-check-certificate'.
+ set +x




When running with debug mode:

./hack/ping.sh

Connect to remote server using mTLS and self signed certificates

Try Curl:

+ curl -v
https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
--key prk.pem --cert cert.pem --cacert ca.pem
*   Trying 169.63.244.138:443...
* Connected to
myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
(169.63.244.138) port 443 (#0)
* ALPN: offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: ca.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Request CERT (13):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Certificate (11):
* (304) (OUT), TLS handshake, CERT verify (15):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: O=test.research.ibm.com; CN=test
*  start date: Dec  9 06:42:29 2023 GMT
*  expire date: Jan  8 06:42:29 2024 GMT
*  subjectAltName: host
"myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud"
matched cert's
"myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud"
*  issuer: O=test.research.ibm.com; CN=test
*  SSL certificate verify ok.
* using HTTP/2
* h2 [:method: GET]
* h2 [:scheme: https]
* h2 [:authority:
myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud]
* h2 [:path: /]
* h2 [user-agent: curl/8.1.2]
* h2 [accept: */*]
* Using Stream ID: 1 (easy handle 0x147811e00)

GET / HTTP/2
Host:

myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud

User-Agent: curl/8.1.2
Accept: */*


< HTTP/2 200
< content-type: text/plain; charset=utf-8
< content-length: 51
< date: Sat, 09 Dec 2023 06:53:45 GMT
<

Hello little client,
happy to serve you today

* Connection #0 to host
myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
left intact
+ set +x

Try Wget:

+ wget -d
https://myapp-default.myos-e621c7d733ece1fad737ff54a8912822-.us-south.containers.appdomain.cloud
--private-key prk.pem --certificate cert.pem --ca-certificate ca.pem
Setting --private-key (privatekey) to prk.pem
Setting --certificate (certificate) to cert.pem
Setting --ca-certificate (cacertificate) to ca.pem
DEBUG 

Re: fail to download big files correctly

2023-11-17 Thread Jeffrey Walton
On Fri, Nov 17, 2023 at 6:29 PM Tim Rühsen  wrote:
>
> On 11/17/23 20:34, grafgrim...@gmx.de wrote:
>  > I use Linux and so not exe files. I use Gentoo Linux.
>  >
>  > Command line example:
>  > One line (wget and the url):
>  >
>  > wget
>  >
> http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz
>  >
>  > result: a file with a wrong checksum.
>
> Just a guess:
>
> If you have a bad network and your connection drops, wget does retries
> by default.
> These retries may result in multiple incomplete files, so that the
> checksums are different. Can you do a 'ls -la' to see which size these
> files have?
>
> I currently can't simulate it - none of the "bad network" emulators for
> Linux do random connection drops.

This one has always made me laugh:
. If it is as bad as it sounds,
then you should be able to experience a dropped connection without
unplugging your ethernet cable.

(Comcast has a bad reputation in the US. I experienced it first hand
in the past).

Jeff



Re: fail to download big files correctly

2023-11-17 Thread Tim Rühsen

On 11/17/23 20:34, grafgrim...@gmx.de wrote:
> I use Linux and so not exe files. I use Gentoo Linux.
>
> Command line example:
> One line (wget and the url):
>
> wget
> 
http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz

>
> result: a file with a wrong checksum.

Just a guess:

If you have a bad network and your connection drops, wget does retries 
by default.
These retries may result in multiple incomplete files, so that the 
checksums are different. Can you do a 'ls -la' to see which size these 
files have?


I currently can't simulate it - none of the "bad network" emulators for 
Linux do random connection drops.


Regards, Tim


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [bug #64808] When I use wget to download some files from a web server, files with russian names do not get proper names

2023-11-17 Thread Tim Rühsen

On 11/17/23 20:39, Eli Zaretskii wrote:

Date: Fri, 17 Nov 2023 20:34:37 +0100
From: grafgrim...@gmx.de

I use Linux and so not exe files. I use Gentoo Linux.

Command line example:
One line (wget and the url):

wget
http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz

result: a file with a wrong checksum.


But the above file name has no Russian characters, so why did you say
"files with russian names do not get proper names"?  What am I
missing?


The author just replied to the wrong email/thread.

Regards, Tim


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [bug #64808] When I use wget to download some files from a web server, files with russian names do not get proper names

2023-11-17 Thread Michael D. Setzer II
On 17 Nov 2023 at 20:34, grafgrim...@gmx.de wrote:

Date sent:  Fri, 17 Nov 2023 20:34:37 +0100
From:   grafgrim...@gmx.de
To: bug-wget@gnu.org
Subject:Re: [bug #64808] When I use wget to download
some files from a web
server, files with russian names do not get proper
names

> I use Linux and so not exe files. I use Gentoo Linux.
>
> Command line example:
> One line (wget and the url):
>
> wget
> http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz
>
> result: a file with a wrong checksum.
>
> Greetings
> Graf Grimm
>

What exact error are you seeing? I downloaded with wget and wget2 and both worked fine.
Downloaded
wget 
http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz

--2023-11-18 08:27:18--  
http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz
Resolving releases.mozilla.org (releases.mozilla.org)... 34.117.35.28
Connecting to releases.mozilla.org (releases.mozilla.org)|34.117.35.28|:80... 
connected.
HTTP request sent, awaiting response... 200 OK
Length: 524717896 (500M) [application/x-tar]
Saving to: ‘firefox-119.0.1.source.tar.xz’

firefox-119.0.1.source.tar.xz 
100%[=>] 500.41M  5.89MB/s
in 83s

2023-11-18 08:28:42 (6.00 MB/s) - ‘firefox-119.0.1.source.tar.xz’ saved 
[524717896/524717896]

Moved the file to subdirectory zz and then ran tar -xvf on it with no issues.

Downloaded with wget2
[root@setzconote ~]# wget2 http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz
firefox-119.0.1.sour  100% [=>]  500.40M  6.08MB/s
  [Files: 1  Bytes: 500.40M [5.94MB/s] Redirects: 0  Todo: 0  Errors: 0  ]
[root@setzconote ~]# cmp firefox-119.0.1.source.tar.xz 
zz/firefox-119.0.1.source.tar.xz
no difference in files.



> On Fri, 17 Nov 2023 14:12:28 -0500 (EST)
> invalid.nore...@gnu.org wrote:
> > Follow-up Comment #4, bug #64808 (project wget):
> >
> > Windows character encodings may be special.
> > Is this issue reproducible on e.g. GNU/Linux?
> > I am willing to test it on GNU/Linux, but I need a full command line
> > example from you.
> >
> > Out of curiosity, can you test wget2.exe from
> > https://gitlab.com/gnuwget/wget2/-/releases (there is also a .sig
> > file / PGP signature in case you want to verify the origin).
> >
> >
> >
> > ___
> >
> > Reply to this item at:
> >
> >   <https://savannah.gnu.org/bugs/?64808>
> >
> > ___
> > Message sent via Savannah
> > https://savannah.gnu.org/
> >
> >
> >
>


++
 Michael D. Setzer II - Computer Science Instructor (Retired)
 mailto:mi...@guam.net
 mailto:msetze...@gmail.com
 Guam - Where America's Day Begins
 G4L Disk Imaging Project maintainer
 http://sourceforge.net/projects/g4l/
++






Re: [bug #64808] When I use wget to download some files from a web server, files with russian names do not get proper names

2023-11-17 Thread Eli Zaretskii
> Date: Fri, 17 Nov 2023 20:34:37 +0100
> From: grafgrim...@gmx.de
> 
> I use Linux and so not exe files. I use Gentoo Linux.
> 
> Command line example:
> One line (wget and the url):
> 
> wget
> http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz
> 
> result: a file with a wrong checksum.

But the above file name has no Russian characters, so why did you say
"files with russian names do not get proper names"?  What am I
missing?



Re: [bug #64808] When I use wget to download some files from a web server, files with russian names do not get proper names

2023-11-17 Thread grafgrimm77
I use Linux and so not exe files. I use Gentoo Linux.

Command line example:
One line (wget and the url):

wget
http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz

result: a file with a wrong checksum.

Greetings
Graf Grimm

On Fri, 17 Nov 2023 14:12:28 -0500 (EST)
invalid.nore...@gnu.org wrote:
> Follow-up Comment #4, bug #64808 (project wget):
>
> Windows character encodings may be special.
> Is this issue reproducible on e.g. GNU/Linux?
> I am willing to test it on GNU/Linux, but I need a full command line
> example from you.
>
> Out of curiosity, can you test wget2.exe from
> https://gitlab.com/gnuwget/wget2/-/releases (there is also a .sig
> file / PGP signature in case you want to verify the origin).
>
>
>
> ___
>
> Reply to this item at:
>
>   
>
> ___
> Message sent via Savannah
> https://savannah.gnu.org/
>
>
>



Re: fail to download big files correctly

2023-11-17 Thread Tim Rühsen

On 11/16/23 13:45, grafgrim...@gmx.de wrote:

Dear wget developers,

I use "GNU Wget 1.21.4" and I have problems to download big files with
wget. Unsure, what "big" exactly means.

For example I download firefox-source, noto-font, linux-firmware and
get checksum failures when using wget. These files are 340 MB to 1 GB.

size in bytes:
356057052  linux-firmware-2023.tar.xz
524717896  firefox-119.0.1.source.tar.xz
1062488324 noto-20231031.tar.gz

Download is okay when using a web browser or curl.

So my weak guess is that wget can not handle big file downloads.
When I use wget several times to get the file, i always get another
checksum after download finished with wget.

No problems when downloading small files. Unsure, what "small" exactly
means.


I can't reproduce with the 1.21.4 version from Debian testing.

Can you send the output of 'wget --version'?
Where did you get the wget binary from?
Are you using a proxy? If yes, can you provide name and version?
Can you provide full links to the test files?
And anything else that let us reproduce your issue.



Greetings
Graf Grimm



Regards, Tim


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: fail to download big files correctly

2023-11-16 Thread Michael D. Setzer II
On 16 Nov 2023 at 13:45, grafgrim...@gmx.de wrote:

Date sent:  Thu, 16 Nov 2023 13:45:42 +0100
From:   grafgrim...@gmx.de
To: bug-wget@gnu.org
Subject:fail to download big files correctly

> Dear wget developers,
> 
> I use "GNU Wget 1.21.4" and I have problems to download big files with
> wget. Unsure, what "big" exactly means.
> 
> For example I download firefox-source, noto-font, linux-firmware and
> get checksum failures when using wget. These files are 340 MB to 1 GB.
> 
> size in bytes:
> 356057052  linux-firmware-2023.tar.xz
> 524717896  firefox-119.0.1.source.tar.xz
> 1062488324 noto-20231031.tar.gz
> 
> Download is okay when using a web browser or curl.
> 
> So my weak guess is that wget can not handle big file downloads.
> When I use wget several times to get the file, i always get another
> checksum after download finished with wget.
> 
> No problems when downloading small files. Unsure, what "small" exactly
> means.

Not sure on windows version, but tested downloading a 

2099451904 Fedora-Workstation-Live-x86_64-38-1.6.iso
using linux wget and wget2 programs.
Both downloaded files on my local 1G network and both times files 
matched 100% with a binary compare. Downloaded, compared,
deleted the file and repeated. Got about 90MB download speed.

I do know wget2 had an issue with multi-threading in the 2.0 
version, but the 2.1 version seems to handle up to 10 threads 
without issues. 
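A minimal sketch of that (assuming wget2's --max-threads and --chunk-size options
behave as documented; extra threads only help when several URLs are given or the
download is split into chunks):

wget2 --max-threads=10 --chunk-size=10M http://releases.mozilla.org/pub/firefox/releases/119.0.1/source/firefox-119.0.1.source.tar.xz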

Only time I've used it with large files, and only locally with my 
local machines. Only have a 50Mb download and 3Mb upload, so 
not wanting to test. 

There is a windows build with the 2.1 version 
https://gitlab.com/gnuwget/wget2/-/releases
Has 64 bit version for windows.

> 
> Greetings
> Graf Grimm
> 


++
 Michael D. Setzer II - Computer Science Instructor (Retired) 
 mailto:mi...@guam.net
 mailto:msetze...@gmail.com
 Guam - Where America's Day Begins
 G4L Disk Imaging Project maintainer 
 http://sourceforge.net/projects/g4l/
++






RE: fail to download big files correctly

2023-11-16 Thread gerdd
But then 1.21.4 should be the latest, at least on windows, according to eternallybored.

Gerd

Sent from my Galaxy

 Original message 
From: gerdd
Date: 2023/11/16 15:35 (GMT+02:00)
To: grafgrim...@gmx.de, bug-wget@gnu.org
Subject: RE: fail to download big files correctly

Not sure what hit you there. I once had the problem on windows with files
greater than 4GB. Downloading the latest version of wget fixed that for me.

Gerd.

Sent from my Galaxy

 Original message 
From: grafgrim...@gmx.de
Date: 2023/11/16 15:16 (GMT+02:00)
To: bug-wget@gnu.org
Subject: fail to download big files correctly

Dear wget developers,

I use "GNU Wget 1.21.4" and I have problems to download big files with
wget. Unsure, what "big" exactly means.

For example I download firefox-source, noto-font, linux-firmware and
get checksum failures when using wget. These files are 340 MB to 1 GB.

size in bytes:
356057052  linux-firmware-2023.tar.xz
524717896  firefox-119.0.1.source.tar.xz
1062488324 noto-20231031.tar.gz

Download is okay when using a web browser or curl.

So my weak guess is that wget can not handle big file downloads.
When I use wget several times to get the file, i always get another
checksum after download finished with wget.

No problems when downloading small files. Unsure, what "small" exactly
means.

Greetings
Graf Grimm

RE: fail to download big files correctly

2023-11-16 Thread gerdd
Not sure what hit you there. I once had the problem on windows with files
greater than 4GB. Downloading the latest version of wget fixed that for me.

Gerd.

Sent from my Galaxy

 Original message 
From: grafgrim...@gmx.de
Date: 2023/11/16 15:16 (GMT+02:00)
To: bug-wget@gnu.org
Subject: fail to download big files correctly

Dear wget developers,

I use "GNU Wget 1.21.4" and I have problems to download big files with
wget. Unsure, what "big" exactly means.

For example I download firefox-source, noto-font, linux-firmware and
get checksum failures when using wget. These files are 340 MB to 1 GB.

size in bytes:
356057052  linux-firmware-2023.tar.xz
524717896  firefox-119.0.1.source.tar.xz
1062488324 noto-20231031.tar.gz

Download is okay when using a web browser or curl.

So my weak guess is that wget can not handle big file downloads.
When I use wget several times to get the file, i always get another
checksum after download finished with wget.

No problems when downloading small files. Unsure, what "small" exactly
means.

Greetings
Graf Grimm

Re: Links Not Parsing Correctly?

2023-11-15 Thread Stephane Ascoet

On 14/11/2023 at 19:22, Derek Tombrello wrote:

I appreciate that. I'll check that out. In the mean time, I came up with
a bash script to fix the issue with the ones I've already downloaded. In
case anyone else is interested or needs it, two simple commands run in
the same directory as the index.html files:


rename 's/index\.html\?page=([0-9]+)\&/index$1.html/' *
sed -Ei 's/index\.html\?page=([0-9]+)&/index\1.html/g' *.html


Hi, that's the sort of thing I do too. In a lot of the huge on-line 
archives of the Web of the past that I've made over the last years, even when 
the mirroring has worked mostly right, there are always some corrections like 
this to be done.



/"First they came for the Communists, but I was not a Communist so I did



Very long quotations (longer than the actual content). In French, we've 
got a little book telling this story, called "Les matins bruns". It sold 
well, but sadly without great effect on people's minds. A derivative 
short film, with the same title, very strange, has been made from it too.

--
Sincerely, Stephane Ascoet




Re: Links Not Parsing Correctly?

2023-11-14 Thread Derek Tombrello
I appreciate that. I'll check that out. In the mean time, I came up with 
a bash script to fix the issue with the ones I've already downloaded. In 
case anyone else is interested or needs it, two simple commands run in 
the same directory as the index.html files:



rename 's/index\.html\?page=([0-9]+)\&/index$1.html/' *
sed -Ei 's/index\.html\?page=([0-9]+)&/index\1.html/g' *.html





✞ Derek Tombrello (KM4JAG)
www.RobotsAndComputers.com


/"First they came for the Communists, but I was not a Communist so I did 
not speak out.
Then they came for the Socialists and the Trade Unionists, but I was 
neither, so I did not speak out.

Then they came for the Jews, but I was not a Jew so I did not speak out.
And when they came for me, there was no one left to speak out for me."
/


/"Every record has been destroyed or falsified, every book rewritten, 
every picture has been repainted,
every statue and street building has been renamed, every date has been 
altered. And the process is continuing
day by day and minute by minute. History has stopped. Nothing exists 
except an endless present in which the Party

is always right." - George Orwell, "1984" /


On 11/13/23 02:33, Stephane Ascoet wrote:

On 12/11/2023 at 18:00, bug-wget-requ...@gnu.org wrote:

From: Derek Tombrello 
To: bug-wget@gnu.org
Subject: Links Not Parsing Correctly?

 From the main 'index.html' page, if you click on 'page 2', the address
bar reflects that it is displaying 'index.html?page=2&' but the actual
content is still that of the original 'index.html' page. I can double
click on the 'index.html?page=2&' file itself in the file manager and it
does, in fact, display the page associated with page 2.




Hi, I had almost exactly the same problem a few months ago and got no 
solution except migrating to WebHTTrack. You probably can find the 
thread in the archives, beginning on the 19/8/2023


Re: Links Not Parsing Correctly?

2023-11-13 Thread Stephane Ascoet

On 12/11/2023 at 18:00, bug-wget-requ...@gnu.org wrote:

From: Derek Tombrello 
To: bug-wget@gnu.org
Subject: Links Not Parsing Correctly?

 From the main 'index.html' page, if you click on 'page 2', the address
bar reflects that it is displaying 'index.html?page=2&' but the actual
content is still that of the original 'index.html' page. I can double
click on the 'index.html?page=2&' file itself in the file manager and it
does, in fact, display the page associated with page 2.




Hi, I had almost exactly the same problem a few months ago and got no 
solution except migrating to WebHTTrack. You probably can find the 
thread in the archives, beginning on the 19/8/2023

--
Sincerely, Stephane Ascoet




Re: How?

2023-11-12 Thread Taylor
> Thanks to the feedback from Michael I could create and upload a
> wget2.exe (x86, 64-bit, v2.1.0, stripped)

https://savannah.gnu.org/bugs/?61038

Can anyone compile 32-bit binaries?



Re: How?

2023-11-04 Thread Tim Rühsen
Thanks to the feedback from Michael I could create and upload a 
wget2.exe (x86, 64-bit, v2.1.0, stripped).


Find it at https://gitlab.com/gnuwget/wget2/-/releases/v2.1.0

Regards, Tim

On 10/25/23 14:23, Michael D. Setzer II via Primary discussion list for 
GNU Wget wrote:

On 25 Oct 2023 at 13:44, ge...@mweb.co.za wrote:

Date sent:  Wed, 25 Oct 2023 13:44:27 +0200 (SAST)
From:   "ge...@mweb.co.za" 
To: Fernando Cassia 
Copies to:  ENG WKJC , bug-wget

Subject:    Re: How?


Hi,

thanks for that link again, Fernando (I had misplaced my note on it:-)

And now I wonder if anyone has ported wget2 to Windows?

Thanks,

Gerd


The wget2 site has the older 2.0.1 wget with a windows exe file,
but I've been able to create the 2.1.0 version as well with info
using docker? Seems to support up to 10 threads, which the 2.0.1
version doesn't seem to.

https://gitlab.com/gnuwget/wget2/-/releases

Built the wget2 using the docker, and it creates a wget2.exe file.
24993339 Oct 16 16:36 wget2.exe resulted file
   7137339 Oct 16 16:37 wget2x.exe (compressed with upx).

GNU Wget2 2.1.0 - multithreaded metalink/file/website
downloader

+digest +https +ssl/gnutls +ipv6 +iri +large-file -nls -ntlm -opie
+psl -hsts
-iconv +idn2 +zlib -lzma -brotlidec -zstd -bzip2 -lzip +http2
-gpgme

Copyright (C) 2012-2015 Tim Ruehsen
Copyright (C) 2015-2021 Free Software Foundation, Inc.

License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please send bug reports and questions to .

Get same results as with the linux wget2 program.







- Original Message -
From: "Fernando Cassia" 
To: "ENG WKJC" 
Cc: "bug-wget" 
Sent: Wednesday, October 25, 2023 1:14:17 AM
Subject: Re: How?

On Tue, 24 Oct 2023, 17:16 ENG WKJC,  wrote:


Guys,

Been looking for a solution that can run on Win7 or higher for HTTPS
downloading.  However, I don't understand the whole GNU thing, other than
it's
open source and generally free software.
I've looked at the software links and don't know what to do here.



HI Marv

You can find pre-built versions of wget for Windows on this Web site

32bit
https://eternallybored.org/misc/wget/1.21.4/32/wget.exe

And 64 bit
https://eternallybored.org/misc/wget/1.21.4/64/wget.exe

You just download the exe required for your system.

How to tell if your Windows installation is 32bit or 64bit
https://support.microsoft.com/en-us/windows/32-bit-and-64-bit-windows-frequently-asked-questions-c6ca9541-8dce-4d48-0415-94a3faa2e13d

... then manually copy to a folder in your computer, and then place that
folder in your "path" (the path is a listing of folder locations that
Windows uses to look for exe files)

(I personally use c:\utils and place all command line utilities there).

---
 From the command prompt:
MD C:\Utils
Copy %userprofile%\downloads\wget.exe c:\utils
---

Here is a tutorial on how to add that folder you just created to the
system path

https://www.architectryan.com/2018/03/17/add-to-the-path-on-windows-10/

You can then run wget from the command prompt from any directory or folder
just by calling it with 'wget'

Eg to get a listing of all options available
wget --help

Others will be able to guide you after that.

If you want to see a video, sometimes a picture is better than a thousand
words

https://youtu.be/cvvcG1a7dOM?si=Q5hO2GzlFFG0oRhC

Best,

FC
Buenos Aires, Argentina





++
  Michael D. Setzer II - Computer Science Instructor (Retired)
  mailto:mi...@guam.net
  mailto:msetze...@gmail.com
  Guam - Where America's Day Begins
  G4L Disk Imaging Project maintainer
  http://sourceforge.net/projects/g4l/
++




OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [Feature Request] Add a short option for --content-disposition

2023-11-04 Thread Tim Rühsen
On 10/29/23 21:08, No-Reply-Wolfietech via Primary discussion list for 
GNU Wget wrote:

Nowadays it seems increasingly common to find a file that is not being hosted 
where its actually stored, for access control presumably, and it seems to make 
no sense in having to type content-disposition when a single letter flag is all 
that is needed?


Well, we can't simply change the default behavior. That would break lots 
of workflows.


And enabling it is also a matter of trusting the server, which is not 
always the case.


Maybe you can enable this "by default" in your environments by adding the 
flag to ~/.wgetrc or /etc/wgetrc. Or specify a different config file in 
$WGETRC.
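For example, a minimal sketch of such an entry (assuming the standard wgetrc
command that corresponds to --content-disposition):

# in ~/.wgetrc or /etc/wgetrc
content_disposition = on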


Regards, Tim


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: wget claims "Success" when it failed to write to local directory

2023-10-29 Thread Tim Rühsen

On 10/27/23 11:16, Christian Rosentreter wrote:


Thanks Tim!
Seems to work well for me.


You are welcome, Christian :)


Took me a while to get to the "commit". wget's primary homepage under 
https://www.gnu.org/software/wget/
doesn't make it very easy to find any information about the repository. But 
luckily I eventually located
the "savannah" page with the required details. :-)


Sorry about that. We still keep the Savannah repository up-to-date, but 
develop and CI test on https://gitlab.com/gnuwget/wget.


Regards, Tim


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: wget claims "Success" when it failed to write to local directory

2023-10-27 Thread Christian Rosentreter


Thanks Tim!
Seems to work well for me.


Took me a while to get to the "commit". wget's primary homepage under 
https://www.gnu.org/software/wget/
doesn't make it very easy to find any information about the repository. But 
luckily I eventually located
the "savannah" page with the required details. :-)


> On 22 Oct 2023, at 2:08 PM, Tim Rühsen  wrote:
> 
> Hey Christian, Andries,
> 
> Thanks for the analysis and the patch.
> 
> I moved the store/restore of errno a bit up the call chain in order to make 
> it independent of the TLS backend.
> 
> Pushed as commit 25525f80372dbdfa367da7ab8592a6a747fc1f68.
> 
> Regards, Tim




Re: How?

2023-10-25 Thread Michael D. Setzer II
On 25 Oct 2023 at 13:44, ge...@mweb.co.za wrote:

Date sent:  Wed, 25 Oct 2023 13:44:27 +0200 (SAST)
From:   "ge...@mweb.co.za" 
To: Fernando Cassia 
Copies to:  ENG WKJC , bug-wget 

Subject:    Re: How?

> Hi, 
> 
> thanks for that link again, Fernando (I had misplaced my note on it:-)
> 
> And now I wonder if anyone has ported wget2 to Windows? 
> 
> Thanks, 
> 
> Gerd

The wget2 site has the older 2.0.1 wget with a windows exe file, 
but I've been able to create the 2.1.0 version as well using docker. 
It seems to support up to 10 threads, which the 2.0.1 
version doesn't seem to.

https://gitlab.com/gnuwget/wget2/-/releases

Built the wget2 using the docker, and it creates a wget2.exe file.
24993339 Oct 16 16:36 wget2.exe resulted file
  7137339 Oct 16 16:37 wget2x.exe (compressed with upx).

GNU Wget2 2.1.0 - multithreaded metalink/file/website 
downloader

+digest +https +ssl/gnutls +ipv6 +iri +large-file -nls -ntlm -opie 
+psl -hsts
-iconv +idn2 +zlib -lzma -brotlidec -zstd -bzip2 -lzip +http2 
-gpgme

Copyright (C) 2012-2015 Tim Ruehsen
Copyright (C) 2015-2021 Free Software Foundation, Inc.

License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please send bug reports and questions to .

Get same results as with the linux wget2 program. 



> 
> 
> 
> - Original Message -
> From: "Fernando Cassia" 
> To: "ENG WKJC" 
> Cc: "bug-wget" 
> Sent: Wednesday, October 25, 2023 1:14:17 AM
> Subject: Re: How?
> 
> On Tue, 24 Oct 2023, 17:16 ENG WKJC,  wrote:
> 
> > Guys,
> >
> > Been looking for a solution that can run on Win7 or higher for HTTPS
> > downloading.  However, I don't understand the whole GNU thing, other than
> > it's
> > open source and generally free software.
> > I've looked at the software links and don't know what to do here.
> 
> 
> HI Marv
> 
> You can find pre-built versions of wget for Windows on this Web site
> 
> 32bit
> https://eternallybored.org/misc/wget/1.21.4/32/wget.exe
> 
> And 64 bit
> https://eternallybored.org/misc/wget/1.21.4/64/wget.exe
> 
> You just download the exe required for your system.
> 
> How to tell if your Windows installation is 32bit or 64bit
> https://support.microsoft.com/en-us/windows/32-bit-and-64-bit-windows-frequently-asked-questions-c6ca9541-8dce-4d48-0415-94a3faa2e13d
> 
> ... then manually copy to a folder in your computer, and then place that
> folder in your "path" (the path is a listing of folder locations that
> Windows uses to look for exe files)
> 
> (I personally use c:\utils and place all command line utilities there).
> 
> ---
> From the command prompt:
> MD C:\Utils
> Copy %userprofile%\downloads\wget.exe c:\utils
> ---
> 
> Here is a tutorial on how to add that folder you just created to the
> system path
> 
> https://www.architectryan.com/2018/03/17/add-to-the-path-on-windows-10/
> 
> You can then run wget from the command prompt from any directory or folder
> just by calling it with 'wget'
> 
> Eg to get a listing of all options available
> wget --help
> 
> Others will be able to guide you after that.
> 
> If you want to see a video, sometimes a picture is better than a thousand
> words
> 
> https://youtu.be/cvvcG1a7dOM?si=Q5hO2GzlFFG0oRhC
> 
> Best,
> 
> FC
> Buenos Aires, Argentina
> 



++
 Michael D. Setzer II - Computer Science Instructor (Retired) 
 mailto:mi...@guam.net
 mailto:msetze...@gmail.com
 Guam - Where America's Day Begins
 G4L Disk Imaging Project maintainer 
 http://sourceforge.net/projects/g4l/
++




Re: How?

2023-10-25 Thread ge...@mweb.co.za
Hi, 

thanks for that link again, Fernando (I had misplaced my note on it:-)

And now I wonder if anyone has ported wget2 to Windows? 

Thanks, 

Gerd



- Original Message -
From: "Fernando Cassia" 
To: "ENG WKJC" 
Cc: "bug-wget" 
Sent: Wednesday, October 25, 2023 1:14:17 AM
Subject: Re: How?

On Tue, 24 Oct 2023, 17:16 ENG WKJC,  wrote:

> Guys,
>
> Been looking for a solution that can run on Win7 or higher for HTTPS
> downloading.  However, I don't understand the whole GNU thing, other than
> it's
> open source and generally free software.
> I've looked at the software links and don't know what to do here.


HI Marv

You can find pre-built versions of wget for Windows on this Web site

32bit
https://eternallybored.org/misc/wget/1.21.4/32/wget.exe

And 64 bit
https://eternallybored.org/misc/wget/1.21.4/64/wget.exe

You just download the exe required for your system.

How to tell if your Windows installation is 32bit or 64bit
https://support.microsoft.com/en-us/windows/32-bit-and-64-bit-windows-frequently-asked-questions-c6ca9541-8dce-4d48-0415-94a3faa2e13d

... then manually copy to a folder in your computer, and then place that
folder in your "path" (the path is a listing of folder locations that
Windows uses to look for exe files)

(I personally use c:\utils and place all command line utilities there).

---
From the command prompt:
MD C:\Utils
Copy %userprofile%\downloads\wget.exe c:\utils
---

Here is a tutorial on how to add that folder you just created to the
system path

https://www.architectryan.com/2018/03/17/add-to-the-path-on-windows-10/

You can then run wget from the command prompt from any directory or folder
just by calling it with 'wget'

Eg to get a listing of all options available
wget --help

Others will be able to guide you after that.

If you want to see a video, sometimes a picture is better than a thousand
words

https://youtu.be/cvvcG1a7dOM?si=Q5hO2GzlFFG0oRhC

Best,

FC
Buenos Aires, Argentina



Re: How?

2023-10-24 Thread Fernando Cassia
On Tue, 24 Oct 2023, 17:16 ENG WKJC,  wrote:

> Guys,
>
> Been looking for a solution that can run on Win7 or higher for HTTPS
> downloading.  However, I don't understand the whole GNU thing, other than
> it's
> open source and generally free software.
> I've looked at the software links and don't know what to do here.


HI Marv

You can find pre-built versions of wget for Windows on this Web site

32bit
https://eternallybored.org/misc/wget/1.21.4/32/wget.exe

And 64 bit
https://eternallybored.org/misc/wget/1.21.4/64/wget.exe

You just download the exe required for your system.

How to tell if your Windows installation is 32bit or 64bit
https://support.microsoft.com/en-us/windows/32-bit-and-64-bit-windows-frequently-asked-questions-c6ca9541-8dce-4d48-0415-94a3faa2e13d

... then manually copy to a folder in your computer, and then place that
folder in your "path" (the path is a listing of folder locations that
Windows uses to look for exe files)

(I personally use c:\utils and place all command line utilities there).

---
From the command prompt:
MD C:\Utils
Copy %userprofile%\downloads\wget.exe c:\utils
---

Here is a tutorial on how to add that folder you just created to the
system path

https://www.architectryan.com/2018/03/17/add-to-the-path-on-windows-10/
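A command-line alternative for the current session only (a sketch, assuming the
folder created above is C:\Utils; it does not change the permanent PATH):

set PATH=%PATH%;C:\Utils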

You can then run wget from the command prompt from any directory or folder
just by calling it with 'wget'

Eg to get a listing of all options available
wget --help

Others will be able to guide you after that.

If you want to see a video, sometimes a picture is better than a thousand
words

https://youtu.be/cvvcG1a7dOM?si=Q5hO2GzlFFG0oRhC

Best,

FC
Buenos Aires, Argentina


Re: wget claims "Success" when it failed to write to local directory

2023-10-22 Thread Tim Rühsen

Hey Christian, Andries,

Thanks for the analysis and the patch.

I moved the store/restore of errno a bit up the call chain in order to 
make it independent of the TLS backend.


Pushed as commit 25525f80372dbdfa367da7ab8592a6a747fc1f68.

Regards, Tim

On 10/21/23 23:58, Christian Rosentreter wrote:


Hi Andries,

I redid my wget build with the following patch. It fixes my issue and it now
properly reports "(Permission denied)" instead of "(Success)" in case of error
when wget lacks write permissions to the directory it operates in.


This confirms your past findings/ analysis.

OpenSSL's shutdown clobbers errno, which wget fails to properly cache/handle.
Hence we don't see the issue with the system default wget from Raspbian: that
copy is built with GnuTLS.




--- src/openssl.c.orig  2023-05-11 00:18:48.0 +0200
+++ src/openssl.c   2023-10-21 23:42:06.0 +0200
@@ -757,7 +757,16 @@
   struct openssl_transport_context *ctx = arg;
   SSL *conn = ctx->conn;
 
+  /* HOTFIX:  OpenSSL's SSL_shutdown clobbers 'errno' which
+   *  wget fails to properly cache in http.c leading
+   *  to curious "Success" messages in failure cases.
+   *
+   * WARNING: This is not a PROPER fix!
+   */
+  int olderrno = errno;
   SSL_shutdown (conn);
+  errno = olderrno;
+
   SSL_free (conn);
   xfree (ctx->last_error);
   xfree (ctx);






On 21 Oct 2023, at 9:03 PM, Andries E. Brouwer  wrote:

Hi Tim,

That reminds me of some earlier discussion, see
https://lists.gnu.org/archive/html/bug-wget/2021-05/msg00012.html

I rechecked wget 1.21.4, and it still gives Cannot write to ‘’ (Success).
I suppose the same analysis still applies.

Andries


On Sat, Oct 21, 2023 at 07:22:25PM +0200, Tim Rühsen wrote:

Hi,

do you run the latest wget (1.21.4)?

With that version, you get a

  Cannot write to 'index.html' (Permission denied).

Regards, Tim

On 10/21/23 17:16, Christian Rosentreter wrote:


Hi there,

There's a minor cosmetic bug in wget 1.x where it claims "Success" when it in 
fact entirely failed to write to the local
disk, e.g. because of missing permissions/ write access to the current directory. The 
return code is "3" however, so it's
basically only the message that it prints on screen that is funny in a 
suspicious way:


### Prepare situation…
$ mkdir foobar
$ chmod -w foobar   # remove write access
$ cd foobar


### Note: the "Permission denied" and "Cannot write to" messages, but we
###   get a "(Success)" anyway:
$ wget https://www.christianrosentreter.com/
--2023-10-21 17:05:35--  https://www.christianrosentreter.com/
Resolving www.christianrosentreter.com (www.christianrosentreter.com)... 
85.13.142.16
Connecting to www.christianrosentreter.com 
(www.christianrosentreter.com)|85.13.142.16|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
index.html: Permission denied

Cannot write to 'index.html' (Success).


### The return code seems to be reasonable though:
$ echo $?
3


### Version tested:
$ wget --version
GNU Wget 1.21.4 built on darwin14.5.0.

+cares +digest -gpgme +https +ipv6 +iri +large-file +metalink -nls
+ntlm +opie +psl +ssl/openssl

…cut…













OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: wget claims "Success" when it failed to write to local directory

2023-10-21 Thread Christian Rosentreter


Hi Andries,

I redid my wget build with the following patch. It fixes my issue and it now
properly reports "(Permission denied)" instead of "(Success)" in case of error
when wget lacks write permissions to the directory it operates in.


This confirms your past findings/ analysis.

OpenSSL's shutdown clobbers errno, which wget fails to properly cache/handle.
Hence we don't see the issue with the system default wget from Raspbian: that
copy is built with GnuTLS.




--- src/openssl.c.orig  2023-05-11 00:18:48.0 +0200
+++ src/openssl.c   2023-10-21 23:42:06.0 +0200
@@ -757,7 +757,16 @@
   struct openssl_transport_context *ctx = arg;
   SSL *conn = ctx->conn;
 
+  /* HOTFIX:  OpenSSL's SSL_shutdown clobbers 'errno' which
+   *  wget fails to properly cache in http.c leading
+   *  to curious "Success" messages in failure cases.
+   *
+   * WARNING: This is not a PROPER fix!
+   */
+  int olderrno = errno;
   SSL_shutdown (conn);
+  errno = olderrno;
+  
   SSL_free (conn);
   xfree (ctx->last_error);
   xfree (ctx);





> On 21 Oct 2023, at 9:03 PM, Andries E. Brouwer  wrote:
> 
> Hi Tim,
> 
> That reminds me of some earlier discussion, see
> https://lists.gnu.org/archive/html/bug-wget/2021-05/msg00012.html
> 
> I rechecked wget 1.21.4, and it still gives Cannot write to ‘’ (Success).
> I suppose the same analysis still applies.
> 
> Andries
> 
> 
> On Sat, Oct 21, 2023 at 07:22:25PM +0200, Tim Rühsen wrote:
>> Hi,
>> 
>> do you run the latest wget (1.21.4)?
>> 
>> With that version, you get a
>> 
>>  Cannot write to 'index.html' (Permission denied).
>> 
>> Regards, Tim
>> 
>> On 10/21/23 17:16, Christian Rosentreter wrote:
>>> 
>>> Hi there,
>>> 
>>> There's a minor cosmetic bug in wget 1.x where it claims "Success" when it 
>>> in fact entirely failed to write to the local
>>> disk, e.g. because of missing permissions/ write access to the current 
>>> directory. The return code is "3" however, so it's
>>> basically only the message that it prints on screen that is funny in a 
>>> suspicious way:
>>> 
>>> 
>>> ### Prepare situation…
>>> $ mkdir foobar
>>> $ chmod -w foobar   # remove write access
>>> $ cd foobar
>>> 
>>> 
>>> ### Note: the "Permission denied" and "Cannot write to" messages, but we
>>> ###   get a "(Success)" anyway:
>>> $ wget https://www.christianrosentreter.com/
>>> --2023-10-21 17:05:35--  https://www.christianrosentreter.com/
>>> Resolving www.christianrosentreter.com (www.christianrosentreter.com)... 
>>> 85.13.142.16
>>> Connecting to www.christianrosentreter.com 
>>> (www.christianrosentreter.com)|85.13.142.16|:443... connected.
>>> HTTP request sent, awaiting response... 200 OK
>>> Length: unspecified [text/html]
>>> index.html: Permission denied
>>> 
>>> Cannot write to 'index.html' (Success).
>>> 
>>> 
>>> ### The return code seems to be reasonable though:
>>> $ echo $?
>>> 3
>>> 
>>> 
>>> ### Version tested:
>>> $ wget --version
>>> GNU Wget 1.21.4 built on darwin14.5.0.
>>> 
>>> +cares +digest -gpgme +https +ipv6 +iri +large-file +metalink -nls
>>> +ntlm +opie +psl +ssl/openssl
>>> 
>>> …cut…
>>> 
>>> 
>>> 
>>> 
>>> 
> 
> 
> 




Re: wget claims "Success" when it failed to write to local directory

2023-10-21 Thread Andries E. Brouwer
Hi Tim,

That reminds me of some earlier discussion, see
https://lists.gnu.org/archive/html/bug-wget/2021-05/msg00012.html

I rechecked wget 1.21.4, and it still gives Cannot write to ‘’ (Success).
I suppose the same analysis still applies.

Andries


On Sat, Oct 21, 2023 at 07:22:25PM +0200, Tim Rühsen wrote:
> Hi,
> 
> do you run the latest wget (1.21.4)?
> 
> With that version, you get a
> 
>   Cannot write to 'index.html' (Permission denied).
> 
> Regards, Tim
> 
> On 10/21/23 17:16, Christian Rosentreter wrote:
> > 
> > Hi there,
> > 
> > There's a minor cosmetic bug in wget 1.x where it claims "Success" when it 
> > in fact entirely failed to write to the local
> > disk, e.g. because of missing permissions/ write access to the current 
> > directory. The return code is "3" however, so it's
> > basically only the message that it prints on screen that is funny in a 
> > suspicious way:
> > 
> > 
> > ### Prepare situation…
> > $ mkdir foobar
> > $ chmod -w foobar   # remove write access
> > $ cd foobar
> > 
> > 
> > ### Note: the "Permission denied" and "Cannot write to" messages, but we
> > ###   get a "(Success)" anyway:
> > $ wget https://www.christianrosentreter.com/
> > --2023-10-21 17:05:35--  https://www.christianrosentreter.com/
> > Resolving www.christianrosentreter.com (www.christianrosentreter.com)... 
> > 85.13.142.16
> > Connecting to www.christianrosentreter.com 
> > (www.christianrosentreter.com)|85.13.142.16|:443... connected.
> > HTTP request sent, awaiting response... 200 OK
> > Length: unspecified [text/html]
> > index.html: Permission denied
> > 
> > Cannot write to 'index.html' (Success).
> > 
> > 
> > ### The return code seems to be reasonable though:
> > $ echo $?
> > 3
> > 
> > 
> > ### Version tested:
> > $ wget --version
> > GNU Wget 1.21.4 built on darwin14.5.0.
> > 
> > +cares +digest -gpgme +https +ipv6 +iri +large-file +metalink -nls
> > +ntlm +opie +psl +ssl/openssl
> > 
> > …cut…
> > 
> > 
> > 
> > 
> > 






Re: wget claims "Success" when it failed to write to local directory

2023-10-21 Thread Christian Rosentreter


Hello Tim,

thanks for your reply.

Yes, I'm running the latest version (running on a rusty OS X "Yosemite"), also 
see the
text you quoted at the bottom (unless there are silent updates w/o version 
bump?). My copy
was built on July 25, 2023. Pretty up-to-date by my standards. :-) All used 
dependencies
are also pretty recent (maybe a day or 2 days older than wget, most was built 
for wget).
Full output of `wget --version`, see below at the end of my reply.

I only applied this required single-line patch from the bug tracker to fix FTPS 
crashes
for one of my use cases: https://savannah.gnu.org/bugs/?62137   I don't see how 
this should
break other things however.

I always get "(Success)" here, never "(Permission denied)". I tried `chown 
root`, or just
`chmod -w`, etc., as soon as wget has no permissions it will still report 
"(Success)". I also tried
with `--no-config` just in case (I only have "local-encoding = UTF-8" in 
there), same result.


I tested an older wget (1.20.1) on my Raspberry PI under whatever Linux it runs 
("Raspbian
GNU/Linux 10 (buster)" according to "/etc/os-release") and it works as you 
describe and reports
"Permission denied" in this case. But I also build the latest 1.21.4 version 
and it fails
with "(Success)" under my Linux too:

   $ mkdir ~/Desktop/wget_tmp
   $ cd ~/Desktop/wget_tmp/
   $ wget https://ftp.gnu.org/gnu/wget/wget-1.21.4.tar.gz
   $ tar -xvzf wget-1.21.4.tar.gz
   $ cd wget-1.21.4/
   $ ./configure --with-ssl=openssl  # GnuTLS not available
   $ make

   $ mkdir ~/Desktop/foobar
   $ cd ~/Desktop/foobar
   $ chmod -w .

   $ ../wget_tmp/wget-1.21.4/src/wget https://www.christianrosentreter.com/
   --2023-10-21 20:53:19--  https://www.christianrosentreter.com/
   Resolving www.christianrosentreter.com... 85.13.142.16
   Connecting to www.christianrosentreter.com|85.13.142.16|:443... connected.
   HTTP request sent, awaiting response... 200 OK
   Length: unspecified [text/html]
   index.html: Permission denied

   Cannot write to ‘index.html’ (Success).


   $ ../wget_tmp/wget-1.21.4/src/wget --version
   GNU Wget 1.21.4 built on linux-gnueabihf.

   -cares +digest -gpgme +https +ipv6 -iri +large-file -metalink +nls 
   +ntlm +opie -psl +ssl/openssl 

   Wgetrc: 
   /usr/local/etc/wgetrc (system)
   Locale: 
   /usr/local/share/locale 
   Compile: 
   gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/usr/local/etc/wgetrc" 
   -DLOCALEDIR="/usr/local/share/locale" -I. -I../lib -I../lib 
   -DHAVE_LIBSSL -DNDEBUG -g -O2 
   Link: 
   gcc -DHAVE_LIBSSL -DNDEBUG -g -O2 -lssl -lcrypto -lz 
   ../lib/libgnu.a 

   Copyright (C) 2015 Free Software Foundation, Inc.
   License GPLv3+: GNU GPL version 3 or later
   .
   This is free software: you are free to change and redistribute it.
   There is NO WARRANTY, to the extent permitted by law.

   Originally written by Hrvoje Niksic .
   Please send bug reports and questions to .



I'm happy to investigate further, but I would need a direction on where to 
look. I'm
not familiar with wget from a development/ code perspective. Basically I'm just 
a wget user
that's already happy when it builds w/o major hassle and works w/o crashing 
afterwards. :-)





Here's the full uncut output of `wget --version` on my OS X machine:

   $ wget --version
   GNU Wget 1.21.4 built on darwin14.5.0.

   +cares +digest -gpgme +https +ipv6 +iri +large-file +metalink -nls 
   +ntlm +opie +psl +ssl/openssl 

   Wgetrc: 
   /usr/local/etc/wget/wgetrc (system)
   Compile: 
   gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/usr/local/etc/wget/wgetrc" 
   -DLOCALEDIR="/usr/local/silo/wget/1.21.4/share/locale" -I. -I../lib 
   -I../lib -I/usr/local/silo/libiconv/latest/include 
   -I/usr/local/silo/libunistring/latest/include 
   -I/usr/local/silo/libmetalink/latest/include 
   -I/usr/local/silo/c-ares/latest/include 
   -I/usr/local/silo/pcre2/latest/include 
   -I/usr/local/silo/uuid-ossp/latest/include 
   -I/usr/local/silo/libidn2/latest/include 
   -I/usr/local/silo/openssl/latest@3/include -DHAVE_LIBSSL 
   -I/usr/local/silo/zlib/latest/include 
   -I/usr/local/silo/libpsl/latest/include -DNDEBUG -g -O2 
   Link: 
   gcc -I/usr/local/silo/libmetalink/latest/include 
   -I/usr/local/silo/c-ares/latest/include 
   -I/usr/local/silo/pcre2/latest/include 
   -I/usr/local/silo/uuid-ossp/latest/include 
   -I/usr/local/silo/libidn2/latest/include 
   -I/usr/local/silo/openssl/latest@3/include -DHAVE_LIBSSL 
   -I/usr/local/silo/zlib/latest/include 
   -I/usr/local/silo/libpsl/latest/include -DNDEBUG -g -O2 
   -L/usr/local/silo/libiconv/latest/lib 
   -L/usr/local/silo/libunistring/latest/lib 
   -L/usr/local/silo/libmetalink/latest/lib -lmetalink 
   -L/usr/local/silo/c-ares/latest/lib -lcares 
   -L/usr/local/silo/pcre2/latest/lib -lpcre2-8 
   -L/usr/local/silo/uuid-ossp/latest/lib -luuid 

Re: wget claims "Success" when it failed to write to local directory

2023-10-21 Thread Michael D. Setzer II
On 21 Oct 2023 at 19:22, Tim Rühsen wrote:

Date sent:  Sat, 21 Oct 2023 19:22:25 +0200
Subject:Re: wget claims "Success" when it failed to write to local 
directory
To: Christian Rosentreter , bug-wget@gnu.org
From:   Tim Rühsen 

Did a test with wget2 and got this result, with error (2), when there is no write
permission:
wget2 https://www.christianrosentreter.com/
Failed to open 'index.html' (2)
index.html   100% [===>]  1.89K  --.-KB/s
  [Files: 1  Bytes: 1.89K [2.23KB/s] Redirects: 0  Todo: 0  Errors: 0]

With write permission it downloads index.html fine.
wget2 https://www.christianrosentreter.com/
index.html   100% [===>]  1.89K  --.-KB/s
  [Files: 1  Bytes: 1.89K [2.27KB/s] Redirects: 0  Todo: 0  Errors: 0   ]





> Hi,
>
> do you run the latest wget (1.21.4)?
>
> With that version, you get a
>
>Cannot write to 'index.html' (Permission denied).
>
> Regards, Tim
>
> On 10/21/23 17:16, Christian Rosentreter wrote:
> >
> > Hi there,
> >
> > There's a minor cosmetic bug in wget 1.x where it claims "Success" when it 
> > in fact entirely failed to write to the local
> > disk, e.g. because of missing permissions/ write access to the current 
> > directory. The return code is "3" however, so it's
> > basically only the message that it prints on screen that is funny in a 
> > suspicious way:
> >
> >
> > ### Prepare situation…
> > $ mkdir foobar
> > $ chmod -w foobar   # remove write access
> > $ cd foobar
> >
> >
> > ### Note: the "Permission denied" and "Cannot write to" messages, but we
> > ###   get a "(Success)" anyway:
> > $ wget https://www.christianrosentreter.com/
> > --2023-10-21 17:05:35--  https://www.christianrosentreter.com/
> > Resolving www.christianrosentreter.com (www.christianrosentreter.com)... 
> > 85.13.142.16
> > Connecting to www.christianrosentreter.com 
> > (www.christianrosentreter.com)|85.13.142.16|:443... connected.
> > HTTP request sent, awaiting response... 200 OK
> > Length: unspecified [text/html]
> > index.html: Permission denied
> >
> > Cannot write to 'index.html' (Success).
> >
> >
> > ### The return code seems to be reasonable though:
> > $ echo $?
> > 3
> >
> >
> > ### Version tested:
> > $ wget --version
> > GNU Wget 1.21.4 built on darwin14.5.0.
> >
> > +cares +digest -gpgme +https +ipv6 +iri +large-file +metalink -nls
> > +ntlm +opie +psl +ssl/openssl
> >
> > …cut…
> >
> >
> >
> >
> >



++
 Michael D. Setzer II - Computer Science Instructor (Retired)
 mailto:mi...@guam.net
 mailto:msetze...@gmail.com
 Guam - Where America's Day Begins
 G4L Disk Imaging Project maintainer
 http://sourceforge.net/projects/g4l/
++




Re: wget claims "Success" when it failed to write to local directory

2023-10-21 Thread Michael D. Setzer II
On 21 Oct 2023 at 17:16, Christian Rosentreter wrote:

From:   Christian Rosentreter 
Subject:wget claims "Success" when it failed to write to
local directory
Date sent:  Sat, 21 Oct 2023 17:16:44 +0200
To: bug-wget@gnu.org

>
> Hi there,
>
> There's a minor cosmetic bug in wget 1.x where it claims "Success" when it in 
> fact entirely failed to write to the local
> disk, e.g. because of missing permissions/ write access to the current 
> directory. The return code is "3" however, so it's
> basically only the message that it prints on screen that is funny in a 
> suspicious way:
>

I'm a user of wget2, but looking at the man pages, they report that error
code 3 is a File I/O error.

Exit Status
   Wget2 may return one of several error codes if it encounters
problems.

0   No problems occurred.
1   Generic error code.
2   Parse error. For instance, when parsing
command-line options, the .wget2rc or .netrc...
3   File I/O error.

So, I would agree the "Success" is probably an incorrect message, but
since the error code did report that there was a File I/O error, the wget
program worked as intended.
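A minimal sketch of a script relying on that exit status rather than on the
printed message (the URL is the one from the report above):

wget -q https://www.christianrosentreter.com/
status=$?
if [ $status -ne 0 ]; then
    echo "wget failed with exit status $status" >&2
fi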


>
> ### Prepare situation…
> $ mkdir foobar
> $ chmod -w foobar   # remove write access
> $ cd foobar
>
>
> ### Note: the "Permission denied" and "Cannot write to" messages, but we
> ###   get a "(Success)" anyway:
> $ wget https://www.christianrosentreter.com/
> --2023-10-21 17:05:35--  https://www.christianrosentreter.com/
> Resolving www.christianrosentreter.com (www.christianrosentreter.com)... 
> 85.13.142.16
> Connecting to www.christianrosentreter.com 
> (www.christianrosentreter.com)|85.13.142.16|:443... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: unspecified [text/html]
> index.html: Permission denied
>
> Cannot write to 'index.html' (Success).
>
>
> ### The return code seems to be reasonable though:
> $ echo $?
> 3
>
>
> ### Version tested:
> $ wget --version
> GNU Wget 1.21.4 built on darwin14.5.0.
>
> +cares +digest -gpgme +https +ipv6 +iri +large-file +metalink -nls
> +ntlm +opie +psl +ssl/openssl
>
> …cut…
>
>
>
>
>


++
 Michael D. Setzer II - Computer Science Instructor (Retired)
 mailto:mi...@guam.net
 mailto:msetze...@gmail.com
 Guam - Where America's Day Begins
 G4L Disk Imaging Project maintainer
 http://sourceforge.net/projects/g4l/
++






Re: wget claims "Success" when it failed to write to local directory

2023-10-21 Thread Tim Rühsen

Hi,

do you run the latest wget (1.21.4)?

With that version, you get a

  Cannot write to 'index.html' (Permission denied).

Regards, Tim

On 10/21/23 17:16, Christian Rosentreter wrote:


Hi there,

There's a minor cosmetic bug in wget 1.x where it claims "Success" when it in 
fact entirely failed to write to the local
disk, e.g. because of missing permissions/ write access to the current directory. The 
return code is "3" however, so it's
basically only the message that it prints on screen that is funny in a 
suspicious way:


### Prepare situation…
$ mkdir foobar
$ chmod -w foobar   # remove write access
$ cd foobar


### Note: the "Permission denied" and "Cannot write to" messages, but we
###   get a "(Success)" anyway:
$ wget https://www.christianrosentreter.com/
--2023-10-21 17:05:35--  https://www.christianrosentreter.com/
Resolving www.christianrosentreter.com (www.christianrosentreter.com)... 
85.13.142.16
Connecting to www.christianrosentreter.com 
(www.christianrosentreter.com)|85.13.142.16|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
index.html: Permission denied

Cannot write to 'index.html' (Success).


### The return code seems to be reasonable though:
$ echo $?
3


### Version tested:
$ wget --version
GNU Wget 1.21.4 built on darwin14.5.0.

+cares +digest -gpgme +https +ipv6 +iri +large-file +metalink -nls
+ntlm +opie +psl +ssl/openssl

…cut…







OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Problematic default file naming system (BUG?)

2023-10-15 Thread ge...@mweb.co.za
Functioning as designed ... 

(Disclaimer: I am not an expert user of this program, but I have some 
experience that may help you:)

I guess you are Windows users. Unlike Unix and Linux systems, in Windows the 
last part of a file name (anything following the last ("rightmost") period) is 
considered the file extension and can be used to determine what application 
would open the file by default (e.g. a .html file would be opened by a browser, 
a .doc (or nowadays a .docx) file would be given to a word processor, such as 
Microsoft Office's word.exe (the .exe indicating that this file contains 
executable code), etc.) 

That is the one aspect of what is going on here - you downloaded something that 
was a .html file, but you didn't give it a name. Somewhere in the documentation 
it will tell you that (and presumably why) it will give such a file a default 
file name of "index" followed by the file extension. 

The other aspect is what will happen if you download a file to a location where 
a file of the same name and extension is already present. There are a few 
options, between which you can choose using parameters on the command line - 
and these options make good sense in certain circumstances and none at all in 
certain other circumstances. (I'll let you dig through the documentation of 
wget, since that is an important part of testing (evaluating) the program as 
part of your project ;-) 

The most obvious choices you may want to try out are the following (and they 
apply regardless of whether you are downloading a file named index.html or an 
image file named JamesBond007.jpg - I'll go with index.html for an example): 

First option: 

Your existing file index.html is now outdated and the new version - with the 
same file name - will overwrite it. (hint: in the language of the 
documentation, it will "clobber" the file.) 

Second option: 

Your existing file should not be overwritten ("clobbered"), so even though your 
new file was meant to have the same name, it will be called index.html.1 or 
index.html.2 or - eventually index.html.4711 and so on. This may not be pretty, 
but it is effective. Windows users typically would expect to see a different 
syntax (but wget is not just for Windows) - index (1).html, index (2).html, 
..., index (4711).html might look more acceptable to you ...

Third option:

When downloading files across a notoriously unreliable line the process may be 
interrupted by line failure before the file is complete. Wget gives you the 
option then to continue downloading by adding the additional data from retrying 
the download to the end of the existing file - in my life that has been the 
option I used most, especially since Murphy's Law stipulates that the worse 
your line, the bigger your files. 
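For reference, a sketch of how these choices map to wget's command-line parameters 
(the URLs are placeholders; see the manual for details): 

wget https://example.com/index.html        # default: keep the old file, save the new one as index.html.1, .2, ...
wget -O index.html https://example.com/    # write to a fixed name, overwriting any existing file
wget -c https://example.com/big-file.iso   # continue/resume an interrupted download
wget -nc https://example.com/index.html    # no-clobber: skip the download if the file already exists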

Obviously, wget can't make the decision for you, which of these options you 
need in any given situation. And it is pretty much impossible to fix the 
results after the fact if you chose the wrong one. What you can do, though, is 
rename all the .1, .2, .3, etc. files to something more sensible. And when you 
plan to download complete web sites or similar groups of files, wget offers you 
ways to drop them with sensible names (most likely taken from your source) into 
a suitable directory structure (e.g. to duplicate the source structure.) 

Study the documentation that came with your downloaded copy of wget (or find it 
elsewhere on the web) and play with the program a bit more. Do come back here 
for more advice if/when needed. And I'll let the experts answer when their 
input is needed ;-)

Good luck, 

Gerd

 


- Original Message -
From: "Joel F Leppänen" 
To: "bug-wget" 
Sent: Sunday, October 15, 2023 4:44:33 PM
Subject: Problematic default file naming system (BUG?)

Hi all,

We’re testing wget version 1.24.4 for a school project. When downloading an 
.html file, if you don’t name it and download additional .html files, also 
unnamed, it saves the second and the following files after that in formats that 
don’t exist. The first one is saved as ”index.html” and the second one as 
”index.html.1”, the third one as ”index.html.2” and so forth. The files can of 
course be changed back to .html-formats afterwards, but I feel like this is a 
bug that affects user experience negatively (or it’s intended, but I can’t 
figure out why that would be).

Regards,
Joel Leppänen and Werneri Punavaara
LUT University



Re: wget -p shall honor -H (isn´t used unless -r is given)

2023-10-09 Thread Elmar Stellnberger

Hi Tim

  No, the images are simply displayed as a plain part of the web page without any Javascript, otherwise the 
grep -o | sed would not have worked. The point is that the images do not 
get downloaded by a usual invocation, because their domain name is 
different from www.esquire.de, that is static.esquire.de instead of 
www.esquire.de. Note that for wget, -D remains without any effect as 
long as you do not specify -H. The -r should not be needed for a wget -p 
as these are two different semantics: download a single web page and 
recursively download a sub-directory, one or more whole domains.


  The problem about:
wget -p -r -l 1 -N -H -D static.esquire.de 
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur
... is that it will only download the directly referenced .-html page 
from www.esquire.de, no supporting .css or .js files from www.-. Note 
also that when -H isn´t given, it won´t even download a single file from 
static.esquire.de. That is a -D different.domain without an additional 
-H results in -p and -r being disregarded.


  Also:
wget -p -r -l 1 -N -H -D esquire.de --exclude-domains www.esquire.de 
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur
does nothing different with wget 1.21.3 (In the former email I had used 
a newer version of wget).


  That is, apparently you first need a wget -p for www.esquire.de and 
then download the images at static.esquire.de with -r in a second run. 
Not ideal.
  Forget the posted screen output from grep -o/sed download of the 
static.-images in my last email; wget must not behave like that 
(binary/cpu exposes wrong behaviour). It is not a bug (see also: 
https://www.elstel.org/uni/, SAT-solver master thesis, Epilogue, 
starting on from point 6 as well as countless other examples).


  Having a try with wget2 will be an interesting thing, though I´d 
personally consider wget -p to be a standard feature and I´d like to see 
it work there also when you have more than one domain on a specific page 
and that currently needs to be given with -H -D xy.tld. "-D xy.tld" is 
needed if you don´t want to download Javascripts from Google, although 
this may as well be interesting whenever you intend to view the page 
entirely offline, afterwards. Look at my last posting "wget 
--page-requisites/-p should download as for a web browser" which is 
pretty much about similar stuff:

https://lists.gnu.org/archive/html/bug-wget/2023-10/msg8.html

  To me wget[2] -r -D without -H would be an interesting thing for to 
tell that additional domains shall be considered but without recursively 
fetching content from there (useful for -p as well as -r, as long as you 
don´t download recursively from two or more domains by one invocation). 
That is the first example in this email made sense also when -H is not 
given.
  Basically if you implement a feature like this you could make it work 
for multi-domain recursive downloads as well, that is you could have 
separate options for both behaviours like -H xy.tld,wz.tld -D 
add-non-recursive.tld. As by that example the parameter of -H would need 
to start without a minus and you needed wget -H -- https://..., as I 
usually do that, here to avoid searching for ://.).


Regards,
Elmar


On 08.10.23 at 19:48, Tim Rühsen wrote:

Hey Elmar,

did you try the following?

wget2 -p -r -l 1 -N -D static.esquire.de 
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur


It downloads 94 files, 44 are .jpg files in static.esquire.de/.

TBH, I am not 100% sure what you are trying to do, so excuse me if I am 
off the track.
The -p option is for downloading the files you need for displaying a 
page (e.g. inlined images). If the images are just links, they are not 
downloaded by -p. In this case, -r -l 1 help. If images that are 
displayed in the browser are downloaded/displayed by javascript, 
wget/wget2 won't help you.


Regards, Tim

On 9/13/23 00:14, Elmar Stellnberger wrote:

Hi to all!

   Today I wanted to download the following web page for means of
archiving:
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur

   The following command line did not do what I want:
wget -p -N -H -D esquire.de --tries=10 
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur


   The following seemed to do:
wget -p -r -N -H -D esquire.de --exclude-domains www.esquire.de 
--tries=10 
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur

: files downloaded:
now/static.esquire.de/1200x630/smart/images/2023-08/gettyimages-1391653079.jpg
now/www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur
: dld.log:
...
BEENDET --2023-09-12 23:18:01--
Verstrichene Zeit: 1,2s
Geholt: 2 Dateien, 246K in 0,07s (3,62 MB/s)
i.e. diz "two files fetched, no error"

   Without -r & --exclude-domains it did download 52 files (most of them
.js), all from www.esquire.de and none from static.esquire.de. Finally I
succeeded to download the images desired by me by á: (here 

Re: wget -p shall honor -H (isn´t used unless -r is given)

2023-10-08 Thread Tim Rühsen

Hey Elmar,

did you try the following?

wget2 -p -r -l 1 -N -D static.esquire.de 
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur


It downloads 94 files, 44 are .jpg files in static.esquire.de/.

TBH, I am not 100% sure what you are trying to do, so excuse me if I am 
off the track.
The -p option is for downloading the files you need for displaying a 
page (e.g. inlined images). If the images are just links, they are not 
downloaded by -p. In this case, -r -l 1 help. If images that are 
displayed in the browser are downloaded/displayed by javascript, 
wget/wget2 won't help you.


Regards, Tim

On 9/13/23 00:14, Elmar Stellnberger wrote:

Hi to all!

   Today I wanted to download the following web page for means of
archiving:
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur

   The following command line did not do what I want:
wget -p -N -H -D esquire.de --tries=10 
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur

   The following seemed to do:
wget -p -r -N -H -D esquire.de --exclude-domains www.esquire.de --tries=10 
https://www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur
: files downloaded:
now/static.esquire.de/1200x630/smart/images/2023-08/gettyimages-1391653079.jpg
now/www.esquire.de/life/reisen/schoenste-wasserfaelle-welt-natur
: dld.log:
...
BEENDET --2023-09-12 23:18:01--
Verstrichene Zeit: 1,2s
Geholt: 2 Dateien, 246K in 0,07s (3,62 MB/s)
i.e. diz "two files fetched, no error"

   Without -r & --exclude-domains it did download 52 files (most of them
.js), all from www.esquire.de and none from static.esquire.de. Finally I
succeeded to download the images desired by me by á: (here starting from
the second file as I did a manual download of the first)
grep -o "https://static.esquire.de/[^ ]*\.jpg" schoenste-wasserfaelle-welt-natur.html | 
sed -n '2,500/./p' | while read line; do wget -p "$line"; done

   Might (theoretically) be a bug of wget 1.21.4 (1.mga9, i.e. Mageia 9
i686) that it did not download more than two files at the second attempt,
though that may also be supposed to be a public-avail-silicon fallacy by
whomever wants it to assume.

   BTW: 'wpdld' is my scriptlet to archive the web pages I read. Regarding
the pages it works for (using wget) I prefer this over a Firefox
save-page, as it keeps the web page more or less in pristine state to be
mirrored like at the Wayback machine, if necessary. Not to save on disk
what I read is something I have experienced that it can be nasty, caus´
not every article in news is kept online forever, or be it that it is
just deleted from the indexes of search engines (and on-page searches).
I would also have 'wpv' for viewing, but alas that isn´t multidomain or
non-relative link ready - Hi, what about a make-relative feature of
already downloaded web pages on disk for wget2? (would be my desire as I
prefer to download non-relative and doing that on disk allows a 'dircmp'
(another self-written program to compare (and sync) directories; using it
more or less since 2008).)

Regards,
Elmar Stellnberger





Re: Not sure if this list for wget2?? Have an issue after recent updates.

2023-10-08 Thread Tim Rühsen

Hey Michael,

Sorry that the Windows version of Wget2 doesn't support multi-threading. 
This is a known issue when cross-building wget2.exe on Linux.
As far as I know, you need a native Windows build, but I can't help you 
much with that.


There is actually no obvious reason why you experience a slowdown with 
the Linux. Running your command wget2 here (same wget2 version) needs 
21.5s to download all 69 files. (Just for the record: using 69 threads 
instead of 32 doesn't download much faster. Server limitations, my 
bandwidth is far from being exhausted.)


My observation is that the www.uog.edu server doesn't support HTTP/2.
This unnecessarily slows down multiple-file downloads, because for every 
single file a new TCP connection must be established. On top of that, 
the TLS layer needs to be established as well, which is relatively 
expensive.
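For reference, one way to check that from the command line (assuming a 
curl built with HTTP/2 support; the %{http_version} write-out variable 
needs curl >= 7.50):

curl -sI --http2 -o /dev/null -w '%{http_version}\n' https://www.uog.edu/

It prints the negotiated protocol version, i.e. 1.1 when the server never 
upgrades to HTTP/2.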


But 138 seconds from prior 25 seconds is weird.
Here (Debian Testing, kernel 6.5.3-1), even with just 5 threads, it 
takes only 32s for me.


My guess is, Fedora somehow managed to build wget2 without 
multi-threading support, likely accidentally.


You can either build your own wget2 binary to test with or open a bug 
report on Fedora (and let the experts investigate). Or both :)


Regards, Tim

On 10/8/23 11:08, Michael D. Setzer II via Primary discussion list for 
GNU Wget wrote:

I have used wget2 to download 69 to 70 pages from a University
College Campus directory. The process has worked with no
problems for many years and reduced time to about 25 seconds,

But now I get errors if I set it to more than 32 threads.
wget2  --max-threads=32 --secure-protocol=PFS
--base="https://www.uog.edu/; -i testlistuog

works fine
testlistuog contains
directory/?page=01
directory/?page=02
...
...
directory/?page=68
directory/?page=69

Now wget2 was recently updated in the Fedora 38 repo,
GNU Wget2 2.1.0 - multithreaded metalink/file/website
downloader

+digest +https +ssl/gnutls +ipv6 +iri +large-file +nls -ntlm -opie
+psl -hsts +iconv +idn2 +zlib -lzma +brotlidec +zstd -bzip2 -lzip
+http2 +gpgme

Don't know if that change did something with threads? or perhaps
some other update?

I had found that the windows version of wget2 did not work well
with threads so have it run with threads set to 1.
Time with windows to download is:
Time to Download Campus Directory 154.332887 Seconds

The linux version with 32 threads now takes.
Time to Download Campus Directory 138.430772 Seconds
While previously it was running about 25 seconds with 70 threads?

Original lines in program
Call to get page 1 to find total number of pages in directory.
 system("wget2 --restrict-file-names=windows --secure-protocol=PFS -q
\"https://www.uog.edu/directory/?page=01\";);

Creates the testlistuog file with ?page=01 to ?page=lastpage number

Call with linux (Runs the wget in background and loops to display while it downloads
 system("wget2 --restrict-file-names=windows --max-threads=70 
--secure-protocol=PFS -q
--base=\"https://www.uog.edu/directory/\; -i testlistuog 2>error & PID=$! ; 
printf '[' ; while ps hp $PID

/dev/null ; do  printf  '▓'; sleep 1 ; done ; printf '] done!\n'");

This produces individual files for each page, and then combines them into one 
allraw.uog when done.

With windows it uses single thread and downloads pages 1 to last and sends 
output to allraw.uog.
 system("wget2 --max-threads=1 --restrict-file-names=windows 
--secure-protocol=PFS
--progress=none --base=\"https://www.uog.edu/directory/\" -O \"allraw.uog\" -i 
testlistuog");

Run wget2 commands outside cpp program to make sure it wasn't that causing 
issue.

Going from 25 seconds to 138 isn't a huge problem, but seeing the change in how 
the program is
working is concerning.

Perhaps a change in max number of threads was done, or perhaps some other 
update in Fedora or
within kernels? 6.5.5-200.fc38.x86_64







++
  Michael D. Setzer II - Computer Science Instructor (Retired)
  mailto:mi...@guam.net
  mailto:msetze...@gmail.com
  Guam - Where America's Day Begins
  G4L Disk Imaging Project maintainer
  http://sourceforge.net/projects/g4l/
++






Re: Semicolon not allowed in userinfo

2023-10-05 Thread Tim Rühsen

On 10/4/23 14:04, Bachir Bendrissou wrote:

Hi Tim,

Wget doesn't follow the current specs and the parsing is lenient to

accept some types of badly formatted URLs seen in the wild.



Did you mean to say that the parsing is overly strict, and needs to be more
permissive?


I tried to make clear that it is not the semicolon.
What was unclear?


Also, as Daniel pointed out, your curl input example appears to have a
space.


Sorry, the curl URL was no mine, it was yours. May I cite the URL from 
your original email? There is a space, no?


> *http://a ;b:c@xyz*

Regards, Tim



Bachir

On Tue, Oct 3, 2023 at 1:45 PM Daniel Stenberg  wrote:


On Tue, 3 Oct 2023, Tim Rühsen wrote:


My  version of curl (8.3.0) doesn't accept it:

curl -vvv 'http://a ;b:c@xyz'
* URL rejected: Malformed input to a URL function


That's in no way a legal URL (accortding to RFC 3986) and it is not the
semicolon that causes curl to reject it. It is the space.

But I don't know if that is maybe your clients or the mailing list
software
that botched it so badly?

--

   / daniel.haxx.se




Re: Semicolon not allowed in userinfo

2023-10-04 Thread Bachir Bendrissou
Hi Tim,

Wget doesn't follow the current specs and the parsing is lenient to
> accept some types of badly formatted URLs seen in the wild.
>

Did you mean to say that the parsing is overly strict, and needs to be more
permissive?

Because not allowing a semicolon is strict parsing, which needs to be
relaxed.

Also, as Daniel pointed out, your curl input example appears to have a
space.

Bachir

On Tue, Oct 3, 2023 at 1:45 PM Daniel Stenberg  wrote:

> On Tue, 3 Oct 2023, Tim Rühsen wrote:
>
> > My  version of curl (8.3.0) doesn't accept it:
> >
> > curl -vvv 'http://a ;b:c@xyz'
> > * URL rejected: Malformed input to a URL function
>
> That's in no way a legal URL (accortding to RFC 3986) and it is not the
> semicolon that causes curl to reject it. It is the space.
>
> But I don't know if that is maybe your clients or the mailing list
> software
> that botched it so badly?
>
> --
>
>   / daniel.haxx.se


Re: Semicolon not allowed in userinfo

2023-10-03 Thread Daniel Stenberg

On Tue, 3 Oct 2023, Tim Rühsen wrote:


My  version of curl (8.3.0) doesn't accept it:

curl -vvv 'http://a ;b:c@xyz'
* URL rejected: Malformed input to a URL function


That's in no way a legal URL (accortding to RFC 3986) and it is not the 
semicolon that causes curl to reject it. It is the space.


But I don't know if that is maybe your clients or the mailing list software 
that botched it so badly?


--

 / daniel.haxx.se


Re: Semicolon not allowed in userinfo

2023-10-03 Thread Tim Rühsen

Hi,

On 10/2/23 10:55, Bachir Bendrissou wrote:

Hi,

The following url example contains a semicolon in the userinfo segment:


*http://a ;b:c@xyz*
Wget rejects this url with the following error message:

*http://a ;b:c@xyz: Bad port number.*

It seems that Wget sees "c" as a port number. When "c" is replaced by a
digit, Wget accepts the url and attempts to resolve "xyz".


Wget doesn't follow the current specs and the parsing is lenient to 
accept some types of badly formatted URLs seen in the wild.


But we should possibly become more strict and compliant to current specs.



It's worth noting that curl and aria2 both accept the url example.


My  version of curl (8.3.0) doesn't accept it:

curl -vvv 'http://a ;b:c@xyz'
* URL rejected: Malformed input to a URL function
* Closing connection
curl: (3) URL rejected: Malformed input to a URL function

All the URL parsers are slightly different when it comes to edge cases.
I'd consider curl as a good reference.


Why is the semicolon not allowed in userinfo, despite that other special
characters are allowed?


First of all, userinfo does not allow spaces at all (look at 
https://datatracker.ietf.org/doc/html/rfc3986).

  userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )
  unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
  pct-encoded = "%" HEXDIG HEXDIG



Thank you,
Bachir


Regards, Tim




Re: URL query syntax

2023-10-02 Thread Petr Pisar
V Mon, Oct 02, 2023 at 10:54:54AM +0100, Bachir Bendrissou napsal(a):
> Hi,
> 
> Are there any query strings that are invalid and should be rejected by the
> Wget parser?
> 
> Wget seems to accept all sorts of strings in the query segment. For example:
> 
> 
> 
> *"https://example.com/?a=ba=a "*
> The URL is accepted with no errors reported, despite missing a delimiter.
> 
> Is this correct?
> 
> Thank you,
> Bachir

Please do not cross post. The very same question on curl-list
 already got an
answer.

-- Petr




Re: Unable to execute command

2023-09-29 Thread Dalingcebo Dlamini
Yes, you are right. Thank you, appreciated

On Fri, Sep 29, 2023, 17:19 ge...@mweb.co.za  wrote:

> Hi,
>
> could it be that you were trying to specify an output file by using the
> "-o" option and by mistake entered a "-0" (dash Zero instead of dash
> lowercase O) ?
>
> Regards,
>
> Gerd (not one of the resident experts here ;-)
>
>
>
> - Original Message -
> From: "Dalingcebo Dlamini" 
> To: "bug-wget" 
> Sent: Friday, September 29, 2023 4:40:44 PM
> Subject: Unable to execute command
>
> Good evening, I have an issue whereby when I try to enter this command
> (wget -0 install-nethunter-termux https://offs.ec/2MceZWr) I get the
> following issue.
>
> wget: invalid option -- '0'
> Usage: wget [OPTION]... [URL]...
>
> Please assist
>


Re: Unable to execute command

2023-09-29 Thread ge...@mweb.co.za
Hi, 

could it be that you were trying to specify an output file by using the "-o" 
option and by mistake entered a "-0" (dash Zero instead of dash lowercase O) ?
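If the intent was to save the downloaded script under that name, the usual form would be 
(a capital O writes the download to the given file name): 

wget -O install-nethunter-termux https://offs.ec/2MceZWr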

Regards, 

Gerd (not one of the resident experts here ;-)



- Original Message -
From: "Dalingcebo Dlamini" 
To: "bug-wget" 
Sent: Friday, September 29, 2023 4:40:44 PM
Subject: Unable to execute command

Good evening, I have an issue whereby when I try to enter this command
(wget -0 install-nethunter-termux https://offs.ec/2MceZWr) I get the
following issue.

wget: invalid option -- '0'
Usage: wget [OPTION]... [URL]...

Please assist



Re: Trying to mirror some blogs before destruction this month's 21

2023-08-30 Thread Stephane Ascoet

Le 20/08/2023 à 14:22, Tim Rühsen a écrit :

Hi,

which version of Wget are you using ? (wget --version)


Hi, I've seen your mail only today, it's 1.21



Wget only replaces links of successfully downloaded pages.
Can you give 1-2 examples of links that haven't been converted?
Can you also send your /tmp/speak as it might contain information that
helps debugging?


The platform is now supposed to be offline, so we can't pursue tests. We 
could try from the Wayback Machine, but this one causes other problems 
because of the way it rewrites URIs (I sent a request to them about this 
a long time ago). Anyway, I succeeded with HTTrack...



When testing, the website pretty quickly seem to block my IP.
So I can not really reproduce anything.


I guess they had overload with hundreds of people sucking Websites the 
last day... I had shortages too while downloading...




These combinations of options are long-used by me and happen to work,
even if I already had to correct links manually (thanks Sed!)


What do you think has changed? Did you update Wget and this may be a
regression?


It was on another Websites, with 1.18





--
Regards, Stephane Ascoet




Re: build bug

2023-08-26 Thread Freedom Dev

I can't figure out why the build test failed.
Now going fine.
I'm sorry for the inconvenience



On 25.08.2023 17:22, Tim Rühsen wrote:
I have seen this with WolfSSL, setting a single certificate does in 
fact *add* the certificate to whatever the system environment provides.


So instead of seeing the expected error, the executed wget2 command 
succeeds.


Is there a chance that you built with WolfSSL. Maybe provide the 
config.log file, these details are all in there.


Regards, Tim






Re: build bug

2023-08-25 Thread Tim Rühsen
I have seen this with WolfSSL, setting a single certificate does in fact 
*add* the certificate to whatever the system environment provides.


So instead of seeing the expected error, the executed wget2 command 
succeeds.


Is there a chance that you built with WolfSSL. Maybe provide the 
config.log file, these details are all in there.


Regards, Tim




Re: Trying to mirror some blogs before destruction this month's 21

2023-08-20 Thread Tim Rühsen

Hi,

which version of Wget are you using ? (wget --version)

On 8/19/23 05:37, Stephane Ascoet wrote:

Hi, I launch
'd0="/media/demo/PartitionNTFS/" && cd $d0 && d1="speakerinertl" && echo 
"On va supprimer "$d1" dans "`pwd` ; sleep 9 ; rm -rvf $d1 ; mkdir $d1 ; 
cd $d1 && wget -c --no-check-certificate -m -k -K -E --show-progress -np 
-nH -p https://speakerinertl.skyrock.com/ -o /tmp/speak' but Wget 
doesn't replace links (however *html and *html.orig are different).


Wget only replaces links of successfully downloaded pages.
Can you give 1-2 examples of links that haven't been converted?
Can you also send your /tmp/speak as it might contain information that 
helps debugging?


And, since, as usual with blogs, pictures are stored on another domain, 
I've tried 'd0="/media/demo/PartitionNTFS/" && cd $d0 && 
d1="speakerinertl" && echo "On va supprimer "$d1" dans "`pwd` ; sleep 9 
; rm -rvf $d1 ; mkdir $d1 ; cd $d1 && wget -c --no-check-certificate -H 
-Di.skyrock.net,speakerinertl.skyrock.com/ -m -k -K -E --show-progress 
-np -nH -p https://speakerinertl.skyrock.com/' but something even more 
weird happens: only pictures are downloaded, zero Webpages!


When testing, the website pretty quickly seem to block my IP.
So I can not really reproduce anything.



These combinations of options are long-used by me and happen to work, 
even if I already had to correct links manually (thanks Sed!)


What do you think has changed? Did you update Wget and this may be a 
regression?




Thanks for help despite the short delay


Regards, Tim




Re: Rejecting 'index.html*' files causes recursion to include parent-directories

2023-08-16 Thread Carl Ponder via Primary discussion list for GNU Wget



Ok here's what worked:

   wget -P dir -r -R 'index.html*' -R '..' -nH -np --cut-dirs 
3 https://site.org/X/Y/Z


Can anyone tell me why the behavior was happening in the first place, 
though? That excluding "index.html" would cause recursion in the 
parent-directories, when it had been disabled?


Re: Wget crash in printf - bugfix

2023-08-15 Thread Mark Esler
Hi Tim,

Thank you for your response.

In a Launchpad bug I've marked this bug as confirmed but, not security
relevant: https://bugs.launchpad.net/ubuntu/+source/wget/+bug/2029930/

Kind regards,
Mark



Re: Wget crash in printf - bugfix

2023-08-12 Thread Tim Rühsen

Hey Mark,

On 8/8/23 23:38, Mark Esler wrote:

Hi Tim,

Will this issue receive a CVE? Would you like help assigning a CVE?


We, the maintainers, are understaffed and not even able to fix all 
incoming bugs. So while we appreciate when you request a CVE for the 
issue, please understand that we can't be of much help here.


Sorry about that :|

Regards, Tim



Thank you,
Mark Esler




Re: Wget crash in printf - bugfix

2023-08-08 Thread Mark Esler
Hi Tim,

Will this issue receive a CVE? Would you like help assigning a CVE?

Thank you,
Mark Esler



Re: Wget crash in printf - bugfix

2023-08-03 Thread Tim Rühsen

Thanks,
your patch is correct. I also added a unit test for retr_rate() to 
reproduce the issue.


Regards, Tim

On 8/2/23 15:31, Wiebe Cazemier wrote:

Hi,

We're getting the following segfault. We haven't been able to reproduce it with 
debug builds or builds from 'apt-get source wget', so here's a trace from the 
release build 1.21.2-2ubuntu1 (from Ubuntu 22.04):

dmesg line: wget[3522173]: segfault at 1 ip 7f17a81a023c sp 
7fff7b14e7f8 error 4 in libc.so.6[7f17a8016000+195000]


#0  __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77
#1  0x7f111424cdb1 in __vfprintf_internal (s=s@entry=0x7ffc2e5c50d0, 
format=format@entry=0x55e763577735 "%.*f %s", ap=ap@entry=0x7ffc2e5c5250, 
mode_flags=mode_flags@entry=2) at ./stdio-common/vfprintf-internal.c:1517
#2  0x7f111425e51a in __vsnprintf_internal (string=0x55e763591080 "7.95 GB/s", 
maxlen=, format=0x55e763577735 "%.*f %s", args=args@entry=0x7ffc2e5c5250, 
mode_flags=2) at ./libio/vsnprintf.c:114
#3  0x7f111430ace5 in ___snprintf_chk (s=, maxlen=, 
flag=, slen=, format=) at 
./debug/snprintf_chk.c:38
#4  0x55e76353d69c in ?? ()
#5  0x55e763538656 in ?? ()
#6  0x55e763542c8b in ?? ()
#7  0x55e763545482 in ?? ()
#8  0x55e763517cee in ?? ()
#9  0x7f11141ffd90 in __libc_start_call_main 
(main=main@entry=0x55e763516260, argc=argc@entry=4, 
argv=argv@entry=0x7ffc2e5c5cd8) at ../sysdeps/nptl/libc_start_call_main.h:58
#10 0x7f11141ffe40 in __libc_start_main_impl (main=0x55e763516260, argc=4, 
argv=0x7ffc2e5c5cd8, init=, fini=, 
rtld_fini=, stack_end=0x7ffc2e5c5cc8) at ../csu/libc-start.c:392
#11 0x55e7635192d5 in ?? ()


Attached is a patch to fix something that at least looks like it can cause a crash, but 
looking at this stack trace, which already shows the formatted string "7.95 
GB/s" in the output string, I'm not sure if that is really the fix/cause.

Regards,

Wiebe




Re: tests: portability fix

2023-08-03 Thread Tim Rühsen

Thanks, pushed.

On 8/2/23 11:35, Christian Weisgerber wrote:

There is a portability problem in wget's tests/Makefile, which
variously refers to "unit-tests" and "./unit-tests".  GNU make
recognizes "foo" and "./foo" as the same target.  Other make(1)
implementations may not; OpenBSD's doesn't.

Unbreak the test suite:

--- tests/Makefile.am.orig
+++ tests/Makefile.am
@@ -156,7 +156,7 @@ AM_CFLAGS = $(WERROR_CFLAGS) $(WARN_CFLAGS)
  
  CLEANFILES = *~ *.bak core core.[0-9]*
  
-TESTS = ./unit-tests$(EXEEXT) $(PX_TESTS)

+TESTS = unit-tests$(EXEEXT) $(PX_TESTS)
  TEST_EXTENSIONS = .px
  PX_LOG_COMPILER = $(PERL)
  AM_PX_LOG_FLAGS = -I$(srcdir)




Re: unit tests fail due to extra output file wget-log

2023-07-26 Thread Nam Nguyen
Tim Rühsen writes:

> I am currently not sure how to best solve it except by not running the
> tests in the background.

I propose these two patches to the perl and python testing scripts. They
ignore wget-log. This makes the test suite more robust and these errors
go away.

--8<---cut here---start->8---
Index: testenv/conf/expected_files.py
--- testenv/conf/expected_files.py.orig
+++ testenv/conf/expected_files.py
@@ -27,7 +27,7 @@ class ExpectedFiles:
 # pubring.gpg, pubring.kbx, dirmngr.conf, gpg.conf will be 
created by libgpgme
 #   if $HOME doesn't contain the .gnupg directory.
 # setting $HOME to CWD (in base_test.py) breaks two Metalink 
tests, so we skip this file here.
-if name in [ 'pubring.gpg', 'pubring.kbx', 'dirmngr.conf', 
'gpg.conf' ]:
+if name in [ 'pubring.gpg', 'pubring.kbx', 'dirmngr.conf', 
'gpg.conf', 'wget-log' ]:
 continue
 
 f = {'content': ''}

Index: tests/WgetTests.pm
--- tests/WgetTests.pm.orig
+++ tests/WgetTests.pm
@@ -356,7 +356,7 @@ sub _verify_download
 __dir_walk(
 q{.},
 sub {
-if (!(exists $self->{_output}{$_[0]} || $self->{_existing}{$_[0]}))
+if (!(exists $self->{_output}{$_[0]} || $self->{_existing}{$_[0]}) 
&& ($_[0] ne "wget-log"))
 {
 push @unexpected_downloads, $_[0];
 }
--8<---cut here---end--->8---

SKIP: Test-https-badcerts.px

Testsuite summary for wget 1.21.4

# TOTAL: 94
# PASS:  85
# SKIP:  9
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0


SKIP: Test-no_proxy-env.py

Testsuite summary for wget 1.21.4

# TOTAL: 45
# PASS:  44
# SKIP:  1
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0


On OpenBSD I can reproduce what you see: (OpenBSD uses OpenBSD make and
not gnu's gmake.)
- manually going to the build directory and `make check &' results in
  many errors
- same but foregrounded with `make check' results in resolving these errors
- `make test' from the ports directory /usr/ports/net/wget (not build
  directory) results in many errors



Re: Bug in Termux

2023-07-08 Thread Tim Rühsen
You likely chose the wrong URL. Search the web for nethunter 
installation on termux and try again.


Regards, Tim

On 7/6/23 15:07, Antreas Zaxo wrote:

Hello, I wanted to say about one problem that i have in termux, i put the
code "wget -O install-nethunder-termux https://offs.ec/2MecZWr; and the
termux said "--2023-07-06 14:57:34-- https://offs.ec/2MecZWr
Resolving offs.ec (offs.ec)... 67.199.248.12, 67.199.248.13
Connecting to offs.ec (offs.ec)|67.199.248.12|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2023-07-06 14:57:34 ERROR 404: Not Found."
What should I do???




Re: Wget recursive option not working correctly with scheme relative URLs

2023-07-01 Thread Tim Rühsen

Hey Jan,

On 7/1/23 15:16, Jan Bidler via Primary discussion list for GNU Wget wrote:

Hello,
I have part of a website (`example.com/index.html`) I want to mirror which 
contains scheme relative URLs (`//otherexample.com/image.png`). Trying to 
download these with the -r flag, results in wget converting them to a wrong URL 
(`example.com//otherexample.com`).

So using
`wget -r example.com/index.html`
Will cause links with 
`https://example.com/index.html//otherexample.com/image.png` in the output.
Using the debug flag reveals this:
`merge(»example.com/index.html«, »//otherexample.com/image.png«) -> 
https://example.com/index.html//otherexample.com/image.png`


This is unexpected since these kinds of links are relatively common and 
so far nobody has complained about it.


I just added a new test function for uri_merge(), the function that does 
this job. It has no issue to merge a relative URL like 
'//otherexample.com/image.png'.


So is it possible to share a real world wget command line to reproduce 
the issue ?


Regards, Tim




Re: unit tests fail due to extra output file wget-log

2023-07-01 Thread Tim Rühsen

Hey Nam,

I see that wget unexpectedly writes a log file (from test-suite.log):

   Redirecting output to ‘wget-log’.

In log.c, L973 (function check_redirect()), writing the log is caused by
  pid_t foreground_pgrp = tcgetpgrp (STDIN_FILENO);

  if (foreground_pgrp != -1 && foreground_pgrp != getpgrp () && 
!opt.quiet)

{
  /* Process backgrounded */
  redirect_output (true,NULL);
}

By knowing this, I can reproduce your issue now (on Linux) with a `make 
check &`, running the test suite in the background.
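The same check can be triggered outside the test suite; a sketch (any URL 
works, example.com is only a placeholder, and the transfer has to run long 
enough for the periodic check to fire):

  wget -O /dev/null https://example.com/ &
  # backgrounded job -> tcgetpgrp(stdin) != getpgrp(), so wget prints
  # "Redirecting output to 'wget-log'." and continues logging to wget-log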


Looks like commit dd5c549f6af8e1143e1a6ef66725eea4bcd9ad50 introduced 
this behavior.


I am currently not sure how to best solve it except by not running the 
tests in the background.


Regards, Tim

On 7/1/23 02:21, Nam Nguyen wrote:

I am trying to get unit tests to pass for the openbsd port of wget
1.21.4.

80 unit tests in test/ and more in testenv/ currently fail.

test-suite.log:
http://namtsui.com/public/wget-test-suite.txt

All failing tests are of the form:

--8<---cut here---start->8---
Test failed: unexpected downloaded files [wget-log]
FAIL Test-auth-basic.px (exit status: 1)
--8<---cut here---end--->8---

With the following two patches, tests pass, no tests fail and a small
number are skipped. These patches get rid of the error on extra
files. Perhaps the generation of wget-log is messing with the test
suite?

--8<---cut here---start->8---
Index: testenv/conf/expected_files.py
--- testenv/conf/expected_files.py.orig
+++ testenv/conf/expected_files.py
@@ -55,4 +55,3 @@ class ExpectedFiles:
  raise TestFailed('Expected file %s not found.' % file.name)
  if local_fs:
  print(local_fs)
-raise TestFailed('Extra files downloaded.')

Index: tests/WgetTests.pm
--- tests/WgetTests.pm.orig
+++ tests/WgetTests.pm
@@ -365,8 +365,6 @@ sub _verify_download
);
  if (@unexpected_downloads)
  {
-return 'Test failed: unexpected downloaded files [' .
-  (join ', ', @unexpected_downloads) . "]\n";
  
  }

--8<---cut here---end--->8---

  
before:


--8<---cut here---start->8---

Testsuite summary for wget 1.21.4

# TOTAL: 94
# PASS:  5
# SKIP:  9
# XFAIL: 0
# FAIL:  80
# XPASS: 0
# ERROR: 0

See tests/test-suite.log
--8<---cut here---end--->8---


after patches:

--8<---cut here---start->8---
SKIP: Test-https-pfs.px
SKIP: Test-https-tlsv1x.px
SKIP: Test-https-selfsigned.px
SKIP: Test-https-tlsv1.px
SKIP: Test-https-clientcert.px
SKIP: Test-https-badcerts.px
SKIP: Test-https-weboftrust.px
SKIP: Test-https-crl.px

Testsuite summary for wget 1.21.4

# TOTAL: 94
# PASS:  85
# SKIP:  9
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

SKIP: Test-no_proxy-env.py

Testsuite summary for wget 1.21.4

# TOTAL: 45
# PASS:  44
# SKIP:  1
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

--8<---cut here---end--->8---





Re: wget refuses IPv6 link local addresses as invalid

2023-07-01 Thread Tim Rühsen

Hey Moritz,

On 6/28/23 15:40, Moritz Wilhelmy wrote:

Dear wget maintainers,

I tried to connect to IPv6 link local address using both wget 1.20 as well as 
1.21.3.
I've replaced the address by ... for privacy reasons but since IPv6LL addresses 
are
automatically assigned on modern operating systems you can easily try this at 
home :)

The correct way to specify an IPv6 link local URL is as follows,
%25 being the url-encoded version of the percent sign:
$ wget -O- 'http://[fe80::...%25br0]:9090/topology/status'
http://[fe80::...%25br0]:9090/topology/status: Invalid IPv6 numeric address.

The following (technically invalid but otherwise harmless) method of specifying 
the zone
identifier also isn't accepted by wget - curl does seem to accept it, though:
$ wget -O- 'http://[fe80::...%br0]:9090/topology/status'
http://[fe80::...%br0]:9090/topology/status: Invalid IPv6 numeric address.

Link local addresses need a zone identifier to be valid, otherwise it isn't 
clear which
interface to bind to for sending the request since they're valid only in the 
local network
segment and not routed.
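(For comparison, curl accepts the same URL-encoded zone form, e.g.
curl -g 'http://[fe80::...%25br0]:9090/topology/status'
with -g/--globoff so the brackets are not treated as a glob pattern.)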

As expected, if the zone identifier is omitted, wget fails as follows:
$ wget -O- 'http://[fe80::...]:9090/topology/status'
--2023-06-28 13:21:45--  http://[fe80::...]:9090/topology/status
Connecting to [fe80::...]:9090... failed: Invalid argument.


Thanks for bringing this up.
Tbh, I wasn't even aware of the zone id :)

If I get it right, it specifies the network interface. Well, I have to 
read up on it anyways.


Did you try to use --bind-address ?

Regards, Tim




Best regards,

Moritz





Re: Adding MPTCP option to wget

2023-06-05 Thread Gisle Vanem

Bastien Wiaux wrote:

diff --git a/src/connect.h b/src/connect.h
index d03a1708..e50fafe7 100644
--- a/src/connect.h
+++ b/src/connect.h
@@ -86,4 +86,38 @@ int select_fd_nb (int, double, int);
 #define select_fd_nb select_fd
 #endif

+#ifdef ENABLE_MPTCP
+#ifndef IPPROTO_MPTCP
+#define IPPROTO_MPTCP 262
+#endif
+#include 
+#ifndef SOL_MPTCP
+#define SOL_MPTCP 284
+#endif

---

This won't compile for non-Linux. Should perhaps be:

#if defined(ENABLE_MPTCP)
  #if defined(__linux__)
  #include 
  #else
  #error "Only Linux is supported for 'ENABLE_MPTCP'"
  #endif
#endif

/* make the rest compile with 'opt.mptcp == true'
 */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

#ifndef SOL_MPTCP
#define SOL_MPTCP 284
#endif


--
--gv



Re: css.c: No such file or directory when lex lib not found during build

2023-05-24 Thread Darshit Shah
Hi,

Thanks for reporting this. I think we can modify the configure script to make 
the check on flex a hard error.

I'll do that soon(tm)
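
A minimal sketch of that change (assuming the check lives in configure.ac, 
uses the standard AC_PROG_LEX variables and currently only warns):

AS_IF([test -z "$LEXLIB"],
      [AC_MSG_ERROR([required lex library not found; install flex (and libfl), then re-run configure])])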


On Wed, May 24, 2023, at 11:01, David Cepelik wrote:
> Hi all,
>
> I noticed some strange behavior when building recent wget (fbbdf9ea)
> from sources using recent Flex (d30bdf4) built from sources. I tracked
> the issue down to Flex [1], but I thought that wget's toolchain made the
> problem harder to debug, so I figured it might be a good think to report
> it here as well.
>
> In a nutshell, when lex library is not detected, configure will output
> the following:
>
> [...]
> checking for flex... flex
> checking for lex output file root... lex.yy
> checking for lex library... not found
> configure: WARNING: required lex library not found; giving up on flex
> [...]
>
> which gets drowned in the rest of the output. Even though the library is
> "required" as per the warning above, the build process continues and
> eventually crashes with,
>
> [...]
> echo '#include "wget.h"' > css_.c
> cat css.c >> css_.c
> cat: css.c: No such file or directory
> make[3]: *** [Makefile:3156: css_.c] Error 1
> [...]
>
> This is because LEX defaults to : in src/Makefile:
>
> [...]
>   CC   convert.o
>   CC   cookies.o
> :  -ocss.c css.l
>   CC   ftp.o
>   CC   css-url.o
> [...]
>
> Is this expected?
>
> Best, David
>
> [1] https://github.com/westes/flex/issues/565



Re: Test failures on i686-linux

2023-05-20 Thread Andreas Enge
Am Wed, May 10, 2023 at 06:44:52PM +0200 schrieb Andreas Enge:
> a quick reminder, it would be great if we could get a new wget1 release!

We did on the same day, thanks a lot!

Andreas




Re: bug reporting

2023-05-18 Thread Darshit Shah
And what is the bug? What issue are you facing?

On Thu, May 18, 2023, at 05:59, 亦君羊心 via Primary discussion list for GNU Wget 
wrote:
> == Reinstalling wget
>
> == Pouring wget--1.21.3_1.arm64_monterey.bottle.1.tar.gz
>
>  /opt/homebrew/Cellar/wget/1.21.3_1: 89 files, 4.2MB
>
> == Running `brew cleanup wget`...
>
> Disable this behaviour by setting HOMEBREW_NO_INSTALL_CLEANUP.
>
> Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).
>
> ➜ ~ wget --version
>
> GNU Wget 1.21.3 built on darwin21.6.0.
>
>
>
>
> -cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls
>
> +ntlm +opie -psl +ssl/openssl
>
>
>
>
> Wgetrc:
>
>   /opt/homebrew/etc/wgetrc (system)
>
> Locale:
>
>   /opt/homebrew/Cellar/wget/1.21.3_1/share/locale
>
> Compile:
>
>   clang -DHAVE_CONFIG_H 
> -DSYSTEM_WGETRC="/opt/homebrew/etc/wgetrc"
>
>   
> -DLOCALEDIR="/opt/homebrew/Cellar/wget/1.21.3_1/share/locale" -I.
>
>   -I../lib -I../lib -I/opt/homebrew/opt/openssl@3/include
>
>   -I/opt/homebrew/Cellar/libidn2/2.3.4_1/include -DNDEBUG 
> -g -O2
>
> Link:
>
>   clang -I/opt/homebrew/Cellar/libidn2/2.3.4_1/include 
> -DNDEBUG -g
>
>   -O2 -L/opt/homebrew/Cellar/libidn2/2.3.4_1/lib -lidn2
>
>   -L/opt/homebrew/opt/openssl@3/lib -lssl -lcrypto -ldl -lz
>
>   ../lib/libgnu.a -liconv -lintl -Wl,-framework 
> -Wl,CoreFoundation
>
>   -lunistring
>
>
>
>
> Copyright © 2015 Free Software Foundation, Inc.
>
> License GPLv3+: GNU GPL version 3 or later
>
> 
> This is free software: you are free to change and redistribute it.
>
> There is NO WARRANTY, to the extent permitted by law.
>
>
>
>
> Originally written by Hrvoje Nikšić 
> Please send bug reports or suggestions to 
>
>
>
>
>
>
>
>
>
>
>
> 亦君羊心
> 450415...@qq.com
>
>
>
> 



Re: Recursive downloading of pages through the "action" attributes of the following "form" tags

2023-05-14 Thread BERBAR Florian
I reproduce this issue with the latest version (1.21.4) with the 
following pages :


form.html:
form.html:    
form.html:        
form.html:            
form.html:            
form.html:        
form.html:    
form.html:

post.html:
post.html:    
post.html:        link
post.html:    
post.html:

link.html:
link.html:    
link.html:        form
link.html:    
link.html:

A basic recursive command only downloads the form.html page when I 
expected to download all 3 pages.


wget-1.21.4$ ./src/wget -r http://127.0.0.1/form.html
--2023-05-15 01:08:55-- http://127.0.0.1/form.html
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 145 [text/html]
Saving to: '127.0.0.1/form.html'

127.0.0.1/form.html 100%[===>] 145 --.-KB/s    in 0s

2023-05-15 01:08:55 (18.6 MB/s) - '127.0.0.1/form.html' saved [145/145]

FINISHED --2023-05-15 01:08:55--
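
For reference, the three test pages can be recreated like this (a minimal 
reconstruction; the exact markup is an assumption matching the structure 
described above: form.html posts to post.html, post.html links to 
link.html, link.html links back to form.html):

cat > form.html <<'EOF'
<html><body>
  <form action="post.html" method="post">
    <input type="text" name="q">
    <input type="submit" value="send">
  </form>
</body></html>
EOF
cat > post.html <<'EOF'
<html><body><a href="link.html">link</a></body></html>
EOF
cat > link.html <<'EOF'
<html><body><a href="form.html">form</a></body></html>
EOF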

Regards,

Florian

On 4/17/23 21:22, BERBAR Florian wrote:

Hi folk,

I have a question about recursive downloading of webpages. Trying to 
download all pages from a website using the recursive option (--recursive) 
on wget 1.21, the webpage processing seems not to follow the form 
"action" attributes of "form" tags.


- Is this the expected behavior?
- Is there a combination of options to download all pages of a website 
with the "action" attribute?



Example with 3 HTML pages:

- Page 1 - form.html : HTML form with an "action" attribute pointing to 
"Page 2"

- Page 2 - post.html : HTML page with a link to "Page 3".
- Page 3 - link.html : HTML page without links.

I tried this command to download all three pages but only "Page 1" was 
downloaded:


$ wget -r https://host/form.html


I tried "--follow-tags=form" option but the same behavior was observed.


Regards,

Florian




Re: wget-1.21.4 released [stable]

2023-05-14 Thread Tim Rühsen

Darshit, thank you for taking the time to publish a new release !

Regards, Tim

On 5/11/23 03:29, Darshit Shah wrote:

This is to announce wget-1.21.4, a stable release.

This is a slow release, with not many exciting things to talk about. The main 
reason is to allow HSTS tests to function again on i686 systems.

There have been 29 commits by 3 people in the 62 weeks since 1.21.3.

See the NEWS below for a brief summary.

Thanks to everyone who has contributed!
The following people contributed changes to this release:

   Darshit Shah (6)
   Tim Rühsen (22)
   jinfuchiang (1)

Darshit
  [on behalf of the wget maintainers]
==

Here is the GNU wget home page:
 http://gnu.org/s/wget/

For a summary of changes and contributors, see:
   http://git.sv.gnu.org/gitweb/?p=wget.git;a=shortlog;h=v1.21.4
or run this command from a git-cloned wget directory:
   git shortlog v1.21.3..v1.21.4

Here are the compressed sources:
   https://ftpmirror.gnu.org/wget/wget-1.21.4.tar.gz   (4.9MB)
   https://ftpmirror.gnu.org/wget/wget-1.21.4.tar.lz   (2.4MB)

Here are the GPG detached signatures:
   https://ftpmirror.gnu.org/wget/wget-1.21.4.tar.gz.sig
   https://ftpmirror.gnu.org/wget/wget-1.21.4.tar.lz.sig

Use a mirror for higher download bandwidth:
   https://www.gnu.org/order/ftp.html

Here are the SHA1 and SHA256 checksums:

   c6dc52cbda882c14fa5c3401d039901a0ba823fc  wget-1.21.4.tar.gz
   gVQvXO+4+qzDm7vGyC3tgOPkqIUFrnLqUd8nUlvN4Ew=  wget-1.21.4.tar.gz
   42384273c1937458c9db3766a5509afa636a2f00  wget-1.21.4.tar.lz
   NoNhml9Q7cvMsXIKeQBvo3v5uaJVqMW0gEi8PHqHS9k=  wget-1.21.4.tar.lz

Verify the base64 SHA256 checksum with cksum -a sha256 --check
from coreutils-9.2 or OpenBSD's cksum since 2007.

Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact.  First, be sure to download both the .sig file
and the corresponding tarball.  Then, run a command like this:

   gpg --verify wget-1.21.4.tar.gz.sig

The signature should match the fingerprint of the following key:

   pub   rsa4096 2015-10-14 [SC]
 7845 120B 07CB D8D6 ECE5  FF2B 2A17 43ED A91A 35B6
   uid   Darshit Shah 
   uid   Darshit Shah 

If that command fails because you don't have the required public key,
or that public key has expired, try the following commands to retrieve
or refresh it, and then rerun the 'gpg --verify' command.

   gpg --locate-external-key g...@darnir.net

   gpg --recv-keys 64FF90AAE8C70AF9

   wget -q -O- 
'https://savannah.gnu.org/project/release-gpgkeys.php?group=wget=1' | 
gpg --import -

As a last resort to find the key, you can try the official GNU
keyring:

   wget -q https://ftp.gnu.org/gnu/gnu-keyring.gpg
   gpg --keyring gnu-keyring.gpg --verify wget-1.21.4.tar.gz.sig

This release was bootstrapped with the following tools:
   Autoconf 2.71
   Automake 1.16.5
   Gnulib v0.1-6178-gdfdf33a466

NEWS

* Noteworthy changes in release 1.21.4 (2023-05-11)

** Document --retry-on-host-error in help text

** Increase read buffer size to 64k. This should speed up downloads on gigabit 
and faster connections

** Update deprecated option '--html-extension' to '--adjust-extension' in 
documentation

** Update gnulib compatibility layer.
Fixes HSTS test failures on i686. (Thanks to Andreas Enge for ponting it 
out)






Re: wget-1.21.4 released [stable]

2023-05-11 Thread Luna Jernberg
Updated Arch Linux PKGBUILD

On Thu, May 11, 2023 at 7:31 AM Darshit Shah  wrote:
>
> This is to announce wget-1.21.4, a stable release.
>
> This is a slow release, with not many exciting things to talk about. The main 
> reason is to allow HSTS tests to function again on i686 systems.
>
> There have been 29 commits by 3 people in the 62 weeks since 1.21.3.
>
> See the NEWS below for a brief summary.
>
> Thanks to everyone who has contributed!
> The following people contributed changes to this release:
>
>   Darshit Shah (6)
>   Tim Rühsen (22)
>   jinfuchiang (1)
>
> Darshit
>  [on behalf of the wget maintainers]
> ==
>
> Here is the GNU wget home page:
> http://gnu.org/s/wget/
>
> For a summary of changes and contributors, see:
>   http://git.sv.gnu.org/gitweb/?p=wget.git;a=shortlog;h=v1.21.4
> or run this command from a git-cloned wget directory:
>   git shortlog v1.21.3..v1.21.4
>
> Here are the compressed sources:
>   https://ftpmirror.gnu.org/wget/wget-1.21.4.tar.gz   (4.9MB)
>   https://ftpmirror.gnu.org/wget/wget-1.21.4.tar.lz   (2.4MB)
>
> Here are the GPG detached signatures:
>   https://ftpmirror.gnu.org/wget/wget-1.21.4.tar.gz.sig
>   https://ftpmirror.gnu.org/wget/wget-1.21.4.tar.lz.sig
>
> Use a mirror for higher download bandwidth:
>   https://www.gnu.org/order/ftp.html
>
> Here are the SHA1 and SHA256 checksums:
>
>   c6dc52cbda882c14fa5c3401d039901a0ba823fc  wget-1.21.4.tar.gz
>   gVQvXO+4+qzDm7vGyC3tgOPkqIUFrnLqUd8nUlvN4Ew=  wget-1.21.4.tar.gz
>   42384273c1937458c9db3766a5509afa636a2f00  wget-1.21.4.tar.lz
>   NoNhml9Q7cvMsXIKeQBvo3v5uaJVqMW0gEi8PHqHS9k=  wget-1.21.4.tar.lz
>
> Verify the base64 SHA256 checksum with cksum -a sha256 --check
> from coreutils-9.2 or OpenBSD's cksum since 2007.
>
> Use a .sig file to verify that the corresponding file (without the
> .sig suffix) is intact.  First, be sure to download both the .sig file
> and the corresponding tarball.  Then, run a command like this:
>
>   gpg --verify wget-1.21.4.tar.gz.sig
>
> The signature should match the fingerprint of the following key:
>
>   pub   rsa4096 2015-10-14 [SC]
> 7845 120B 07CB D8D6 ECE5  FF2B 2A17 43ED A91A 35B6
>   uid   Darshit Shah 
>   uid   Darshit Shah 
>
> If that command fails because you don't have the required public key,
> or that public key has expired, try the following commands to retrieve
> or refresh it, and then rerun the 'gpg --verify' command.
>
>   gpg --locate-external-key g...@darnir.net
>
>   gpg --recv-keys 64FF90AAE8C70AF9
>
>   wget -q -O- 
> 'https://savannah.gnu.org/project/release-gpgkeys.php?group=wget=1' 
> | gpg --import -
>
> As a last resort to find the key, you can try the official GNU
> keyring:
>
>   wget -q https://ftp.gnu.org/gnu/gnu-keyring.gpg
>   gpg --keyring gnu-keyring.gpg --verify wget-1.21.4.tar.gz.sig
>
> This release was bootstrapped with the following tools:
>   Autoconf 2.71
>   Automake 1.16.5
>   Gnulib v0.1-6178-gdfdf33a466
>
> NEWS
>
> * Noteworthy changes in release 1.21.4 (2023-05-11)
>
> ** Document --retry-on-host-error in help text
>
> ** Increase read buffer size to 64k. This should speed up downloads on 
> gigabit and faster connections
>
> ** Update deprecated option '--html-extension' to '--adjust-extension' in 
> documentation
>
> ** Update gnulib compatibility layer.
>    Fixes HSTS test failures on i686. (Thanks to Andreas Enge for pointing it 
> out)
>
>


PKGBUILD
Description: Binary data


Re: Test failures on i686-linux

2023-05-10 Thread Andreas Enge
Hello,

a quick reminder, it would be great if we could get a new wget1 release!

Thanks,

Andreas


Am Mon, Apr 17, 2023 at 09:18:33AM +0200 schrieb Darshit Shah:
> I'll try and make a new release this week. 
> 
> On Sun, Apr 16, 2023, at 20:51, Andreas Enge wrote:
> > Hi Tim,
> >
> > Am Sun, Apr 16, 2023 at 06:38:32PM +0200 schrieb Tim Rühsen:
> >> Hm, cb114... looks like it's the needed commit. Maybe also cherry-pick
> >> 27d3fcba3331a981bcb8807c663a16b8fa4ebeb3 (gnulib update).
> >
> > it looks like this is definitely needed. But integrating it into our build
> > system is tricky, since it is not just a matter of applying a patch to
> > the tarball. (Actually, it looks like the gnulib update is the only one
> > that is needed. When I run ./bootstrap with the new gnulib, then git
> > checkout v1.21.3, ./configure and make dist, I get a tarball that works
> > for us on i686.)
> >
> >> > Have you got an idea which other commit would be crucial? Or do you think
> >> > you could make a new release soonish?
> >> We should indeed make a release soon. Do you have some spare time @Darshit 
> >> ?
> >
> > That would indeed be most welcome! I would be happy to test a release
> > candidate. The one I got and put there:
> >https://www.multiprecision.org/wget-1.21.3.24-2b723.tar.lz
> > works with the core-updates branch of Guix on i686 and x86_64.
> >
> > Andreas



Re: Termux error

2023-05-07 Thread Tim Rühsen

On 5/1/23 13:04, Abdullah Basharat wrote:

wget -0 install- nethunter-termux https://offs.ec/2MceZWr  please correct
this error for upgrade turmux


Correct command line:
wget -O install-nethunter-termux https://offs.ec/2MceZWr


OpenPGP_signature
Description: OpenPGP digital signature


Re: Recursive downloading of pages through the "action" attributes of the following "form" tags

2023-04-22 Thread BERBAR Florian

Hello Tim,

The 3 pages used during my tests are the following :

form.html:
form.html:    
form.html:        
form.html:            
form.html:        
form.html:    
form.html:

post.html:
post.html:    
post.html:        
post.html:    
post.html:

link.html:
link.html:    
link.html:        
link.html:    
link.html:


I tried to download all the three pages with recursive mode using the 
following command but only the first page was downloaded (form.html) :


$ wget -r http://127.0.0.1/form.html


Regards,

Florian

On 4/22/23 20:21, Tim Rühsen wrote:

On 17.04.23 21:22, BERBAR Florian wrote:

Hi folk,

I have a question about recursive downloading of webpages. Trying to 
download all pages from a website using the recursive option 
(--recursive) on wget 1.21, the webpage processing seems not to 
follow the form "action" attributes of "form" tags.


- Is this the expected behavior?
- Is there a combination of options to download all pages of a 
website with the "action" attribute?



Example with 3 HTML pages:

- Page 1 - form.html : HTML form with an "action" attribute pointing to 
"Page 2"

- Page 2 - post.html : HTML page with a link to "Page 3".
- Page 3 - link.html : HTML page without links.

I tried this command to download all three pages but only "Page 1" was 
downloaded:


$ wget -r https://host/form.html


I tried "--follow-tags=form" option but the same behavior was observed.


Generally, Wget supports form tags with action attributes.
So maybe you encounter malformed HTML or there is a bug in Wget.

Could you please give us a copy of that page, or at least the HTML 
part containing the form tags ?


Regards, Tim




Regards,

Florian






Re: Recursive downloading of pages through the "action" attributes of the following "form" tags

2023-04-22 Thread Tim Rühsen

On 17.04.23 21:22, BERBAR Florian wrote:

Hi folk,

I have a question about recursive downloading of webpages. Trying to 
download all pages from a website using the recursive option (--recursive) 
on wget 1.21, the webpage processing seems not to follow the form 
"action" attributes of "form" tags.


- Is this the expected behavior?
- Is there a combination of options to download all pages of a website 
with the "action" attribute?



Example with 3 HTML pages:

- Page 1 - form.html : HTML form with an "action" attribute pointing to 
"Page 2"

- Page 2 - post.html : HTML page with a link to "Page 3".
- Page 3 - link.html : HTML page without links.

I tried this command to download all three pages but only "Page 1" was 
downloaded:


$ wget -r https://host/form.html


I tried "--follow-tags=form" option but the same behavior was observed.


Generally, Wget supports form tags with action attributes.
So maybe you encounter malformed HTML or there is a bug in Wget.

Could you please give us a copy of that page, or at least the HTML part 
containing the form tags ?


Regards, Tim




Regards,

Florian




Re: Regarding unable to run wget on compute node

2023-04-18 Thread Petr Pisar
V Tue, Apr 18, 2023 at 02:02:41PM +0530, Vanshika Saxena napsal(a):
>I have a file.txt that contains several ftp links to various SRR files
> in the following format:-
> wget '
> ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR848/002/SRR8482202/SRR8482202_1.fastq.gz
> '
> When, I use these files on my local Ubuntu 22.04 system or HPC CLuster
> login node, the program runs and returns the fastq files but when, this
> program is run through a bash script and executed on compute node it gives
> the following error:
> --2023-04-18 12:17:07--
> ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR848/002/SRR8482202/SRR8482202_1.fastq.gz
> => ‘SRR8482202_1.fastq.gz’
> Resolving ftp.sra.ebi.ac.uk (ftp.sra.ebi.ac.uk)... failed: Temporary
> failure in name resolution.
> wget: unable to resolve host address ‘ftp.sra.ebi.ac.uk’

It seems that your compute node has broken domain name resolution. Does
"getent hosts ftp.sra.ebi.ac.uk" command work on that compute node? Or any
other downloader, like curl?

-- Petr




Re: Test failures on i686-linux

2023-04-17 Thread Darshit Shah
I'll try and make a new release this week. 

On Sun, Apr 16, 2023, at 20:51, Andreas Enge wrote:
> Hi Tim,
>
> Am Sun, Apr 16, 2023 at 06:38:32PM +0200 schrieb Tim Rühsen:
>> Hm, cb114... looks like it's the needed commit. Maybe also cherry-pick
>> 27d3fcba3331a981bcb8807c663a16b8fa4ebeb3 (gnulib update).
>
> it looks like this is definitely needed. But integrating it into our build
> system is tricky, since it is not just a matter of applying a patch to
> the tarball. (Actually, it looks like the gnulib update is the only one
> that is needed. When I run ./bootstrap with the new gnulib, then git
> checkout v1.21.3, ./configure and make dist, I get a tarball that works
> for us on i686.)
>
>> > Have you got an idea which other commit would be crucial? Or do you think
>> > you could make a new release soonish?
>> We should indeed make a release soon. Do you have some spare time @Darshit ?
>
> That would indeed be most welcome! I would be happy to test a release
> candidate. The one I got and put there:
>https://www.multiprecision.org/wget-1.21.3.24-2b723.tar.lz
> works with the core-updates branch of Guix on i686 and x86_64.
>
> Andreas



Re: Test failures on i686-linux

2023-04-16 Thread Andreas Enge
Hi Tim,

Am Sun, Apr 16, 2023 at 06:38:32PM +0200 schrieb Tim Rühsen:
> Hm, cb114... looks like it's the needed commit. Maybe also cherry-pick
> 27d3fcba3331a981bcb8807c663a16b8fa4ebeb3 (gnulib update).

it looks like this is definitely needed. But integrating it into our build
system is tricky, since it is not just a matter of applying a patch to
the tarball. (Actually, it looks like the gnulib update is the only one
that is needed. When I run ./bootstrap with the new gnulib, then git
checkout v1.21.3, ./configure and make dist, I get a tarball that works
for us on i686.)
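Roughly, the sequence is (a sketch; the gnulib path is a placeholder):
   git clone https://git.savannah.gnu.org/git/wget.git && cd wget
   ./bootstrap --gnulib-srcdir=/path/to/current/gnulib
   git checkout v1.21.3
   ./configure && make dist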

> > Have you got an idea which other commit would be crucial? Or do you think
> > you could make a new release soonish?
> We should indeed make a release soon. Do you have some spare time @Darshit ?

That would indeed be most welcome! I would be happy to test a release
candidate. The one I got and put there:
   https://www.multiprecision.org/wget-1.21.3.24-2b723.tar.lz
works with the core-updates branch of Guix on i686 and x86_64.

Andreas




Re: Test failures on i686-linux

2023-04-16 Thread Tim Rühsen

Hi,

On 16.04.23 15:58, Andreas Enge wrote:

Hello,

when trying to build wget for i686-linux in Guix, we are getting test
failures (x86_64 works well):
FAIL: Test-hsts
Error: Expected file hw not found..
FAIL: Test--https
Error: Expected file File1 not found..
FAIL: Test-pinnedpubkey-der-https
Error: Expected file File1 not found..
FAIL: Test-pinnedpubkey-hash-https
Error: Expected file File1 not found..
FAIL: Test-pinnedpubkey-pem-https
Error: Expected file File1 not found..

Using a tarball rolled with "make dist" from the latest git HEAD solves the
problem, just backporting commit cb114fbbf73eb687d28b01341c8d4266ffa96c9d
does not seem to be enough.


Hm, cb114... looks like it's the needed commit. Maybe also cherry-pick 
27d3fcba3331a981bcb8807c663a16b8fa4ebeb3 (gnulib update).



Have you got an idea which other commit would be crucial? Or do you think
you could make a new release soonish?


We should indeed make a release soon. Do you have some spare time @Darshit ?



Thanks for your help,

Andreas




Regards, Tim




Re: [PATCH] Update deprecated option in Wget documentation

2023-04-09 Thread Tim Rühsen

On 09.04.23 09:38, chiang jinfu wrote:

Hello Wget maintainers,

Since Wget 1.12 renames '--html-extension' to '--adjust-extension', it is 
better to update the usage in the documentation.  I've attached a patch to 
update the deprecated option '--html-extension' to its new name 
'--adjust-extension' in the Examples chapter. Please review and let me know if 
any additional changes are necessary.

Thanks
Jinfu


Thanks, applied :)

Regards, Tim




Re: wget core dump on ubuntu 22.04

2023-03-19 Thread ge...@mweb.co.za
Even TLS 1.2 is not quite "current" - it dates back to 2008, TLS 1.3 was 
published in 2018. 

Quoting wikipedia: 

Website protocol support (as per May 2022) 

Protocol version    Website support[87]   Security
SSL 2.0             0.3%                  Insecure
SSL 3.0             2.5%                  Insecure
TLS 1.0             37.1%                 Deprecated
TLS 1.1             40.6%                 Deprecated
TLS 1.2             99.7%                 Depends on cipher and client mitigations
TLS 1.3             54.2%                 Secure

I suppose the main reason why TLS 1.2 is not deprecated yet is that the support 
for TLS 1.3 in the field is not quite everywhere yet. 

But anything called SSL is akin to no security at all. (There may still be 
valid excuses for supporting it, especially if you don't really need the 
connection to be secure and the effort for reconfiguration seems too high.)
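
For anything that does need to be secure, wget can instead be pinned to a 
modern protocol, e.g. (TLSv1_2 and TLSv1_3 are accepted --secure-protocol 
values in recent wget releases):

wget --secure-protocol=TLSv1_2 https://example.com/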

Gerd




- Original Message -
From: "Jeffrey Walton" 
To: "Art Mellor" 
Cc: "bug-wget" 
Sent: Sunday, March 19, 2023 6:38:17 PM
Subject: Re: wget core dump on ubuntu 22.04

On Tue, Feb 28, 2023 at 7:08 PM Art Mellor  wrote:
>
> *% wget -v --tries=1 --secure-protocol=SSLv3 -T 3 --no-check-certificate
> https://localhost:1234 <https://localhost:1234>*
> https://localhost:1234 *
> OpenSSL: unimplemented 'secure-protocol' option value 2
> Please report this issue to bug-wget@gnu.org
> Aborted

Off topic, I would be interested in learning your use case for SSLv3.

Nowadays, it is usually TLS 1.2 and above. I even see TLS 1.0 and
above on occasion.

But not SSLv2 or SSLv3. SSLv2 died about 25 years ago, and SSLv3 died
about 10 years ago.

Jeff



Re: wget core dump on ubuntu 22.04

2023-03-19 Thread Art Mellor
It's for QA on some software we provide, which supports some really old
stuff that some of our long-time customers still use. So we like to make
sure it works. It's not ideal, but we get paid to support it. :-/

: Art Mellor : Skelmir LLC : 414-678-9011 : a...@skelmir.com :



On Sun, Mar 19, 2023 at 11:38 AM Jeffrey Walton  wrote:

> On Tue, Feb 28, 2023 at 7:08 PM Art Mellor  wrote:
> >
> > *% wget -v --tries=1 --secure-protocol=SSLv3 -T 3 --no-check-certificate
> > https://localhost:1234 *
> > --2023-02-28 12:15:09--  https://localhost:1234/
> > OpenSSL: unimplemented 'secure-protocol' option value 2
> > Please report this issue to bug-wget@gnu.org
> > Aborted
>
> Off topic, I would be interested in learning your use case for SSLv3.
>
> Nowadays, it is usually TLS 1.2 and above. I even see TLS 1.0 and
> above on occasion.
>
> But not SSLv2 or SSLv3. SSLv2 died about 25 years ago, and SSLv3 died
> about 10 years ago.
>
> Jeff
>


Re: wget core dump on ubuntu 22.04

2023-03-19 Thread Jeffrey Walton
On Tue, Feb 28, 2023 at 7:08 PM Art Mellor  wrote:
>
> *% wget -v --tries=1 --secure-protocol=SSLv3 -T 3 --no-check-certificate
> https://localhost:1234 *
> --2023-02-28 12:15:09--  https://localhost:1234/
> OpenSSL: unimplemented 'secure-protocol' option value 2
> Please report this issue to bug-wget@gnu.org
> Aborted

Off topic, I would be interested in learning your use case for SSLv3.

Nowadays, it is usually TLS 1.2 and above. I even see TLS 1.0 and
above on occasion.

But not SSLv2 or SSLv3. SSLv2 died about 25 years ago, and SSLv3 died
about 10 years ago.

Jeff


