Re: URL query syntax

2023-10-02 Thread Petr Pisar
On Mon, Oct 02, 2023 at 10:54:54AM +0100, Bachir Bendrissou wrote:
> Hi,
> 
> Are there any query strings that are invalid and should be rejected by the
> Wget parser?
> 
> Wget seems to accept all sorts of strings in the query segment. For example:
> 
> 
> 
> "https://example.com/?a=ba=a"
> The URL is accepted with no errors reported, despite missing a delimiter.
> 
> Is this correct?
> 
> Thank you,
> Bachir

Please do not cross post. The very same question on the curl list already got
an answer.
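For background (an editorial note, not part of the original exchange): the query component of a URI is nearly free-form under RFC 3986, so "a=ba=a" is still a well-formed query; a parser simply treats the second "=" as part of the value. A small Python sketch:

```python
from urllib.parse import urlsplit, parse_qs

url = "https://example.com/?a=ba=a"
query = urlsplit(url).query   # the query survives as-is: "a=ba=a"
pairs = parse_qs(query)       # the second "=" ends up inside the value
print(query, pairs)
```

This matches wget's behaviour: the query is opaque data to the URL parser, and splitting it into key/value pairs is the server's business.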

-- Petr


signature.asc
Description: PGP signature


Re: Regarding unable to run wget on compute node

2023-04-18 Thread Petr Pisar
On Tue, Apr 18, 2023 at 02:02:41PM +0530, Vanshika Saxena wrote:
>I have a file.txt that contains several ftp links to various SRR files
> in the following format:-
> wget '
> ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR848/002/SRR8482202/SRR8482202_1.fastq.gz
> '
> When, I use these files on my local Ubuntu 22.04 system or HPC CLuster
> login node, the program runs and returns the fastq files but when, this
> program is run through a bash script and executed on compute node it gives
> the following error:
> --2023-04-18 12:17:07--
> ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR848/002/SRR8482202/SRR8482202_1.fastq.gz
> => ‘SRR8482202_1.fastq.gz’
> Resolving ftp.sra.ebi.ac.uk (ftp.sra.ebi.ac.uk)... failed: Temporary
> failure in name resolution.
> wget: unable to resolve host address ‘ftp.sra.ebi.ac.uk’

It seems that your compute node has broken domain name resolution. Does the
"getent hosts ftp.sra.ebi.ac.uk" command work on that compute node? Or does
any other downloader, like curl?
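The same resolver check can be scripted; here is a small editorial sketch using Python's socket module (it performs the same system-resolver lookup that "getent hosts" does; "localhost" is used only as a stand-in that resolves without network access):

```python
import socket

def can_resolve(host: str) -> bool:
    """Rough Python analogue of `getent hosts HOST`: query the system resolver."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False

# On the compute node one would test "ftp.sra.ebi.ac.uk"; "localhost" is used
# here only because it resolves even without a working network.
print(can_resolve("localhost"))
```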

-- Petr




[bug #63253] install wget sha256sum impersonated from Stockholm's android servers

2022-10-23 Thread Petr Pisar
Follow-up Comment #1, bug #63253 (project wget):

What URL do you download and what tool do you use for that?


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




Re: 32 bit to 64 bit casting

2022-06-30 Thread Petr Pisar
On Thu, Jun 30, 2022 at 07:04:43AM -0500, KeithG wrote:
> On Wed, Jun 22, 2022 at 12:16 PM Petr Pisar  wrote:
> >
> > Can you post here a complete certificate chain the server presents to wget?
> > You can use "openssl s_client -connect THE_HOST:https -showcerts to obtain 
> > it.
> > If it so, than the only fix is to recompile your system with 
> > "-D_TIME_BITS=64
> > -D_FILE_OFFSET_BITS=64" CFLAGS. (Provided your platform supports it.)
> >
> > -- Petr
> 
> Petr,
> 
> I have done a bit more checking and it appears some configurational
> issue with Arch on armv7 is the cause. I have not been able to pin it
> down. I did verify that the binary is built with the 2 flags you
> mention:
> This is a snippet of the config.log for the binary that does not work:
> 
> | #define HAVE_FSEEKO 1
> | #define _FILE_OFFSET_BITS 64
> | #define _TIME_BITS 64
> | #define _FILE_OFFSET_BITS 64
> | /* end confdefs.h.  */
> | #if defined __aarch64__ && !(defined __ILP32__ || defined _ILP32)
> |int ok;
> |   #else
> |error fail
> |   #endif
> |

It's normal that some configure tests fail. This is how configure explores
a system. Specifically this test checks whether the system is 64-bit. Because
armv7 isn't, the compilation fails and configure sets
gl_cv_host_cpu_c_abi_32bit=yes.

I recommed you reading sources in ./m4 directory instead of ./configure.

> configure:11550: result: yes
> configure:11559: checking for ELF binary format
> configure:11583: result: yes
> configure:11635: checking for the common suffixes of directories in
> the library search path
> configure:11704: result: lib,lib,lib
> configure:12229: checking for CFPreferencesCopyAppValue
> configure:12248: gcc -o conftest -DNDEBUG -march=armv7-a
> -mfloat-abi=hard -mfpu=vfpv3-d16 -O2 -pipe -fstack-protector-strong
> -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat
> -Werror=format-security -fstack-clash-protection
> -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now conftest.c
> -Wl,-framework -Wl,CoreFoundation >&5
> conftest.c:74:10: fatal error: CoreFoundation/CFPreferences.h: No such
> file or directory
>74 | #include 
>   |  ^~~~
> compilation terminated.
> distcc[3295] ERROR: compile conftest.c on localhost failed
> configure:12248: $? = 1
> configure: failed program was:
> | /* confdefs.h */
> 

> I tried it on RpiOS using the same architecture (armv7) which builds a
> binary that passes all the tests and works properly and this part of
> the config.log is dramatically different:
> 
> | #define HAVE_FSEEKO 1
> | #define _FILE_OFFSET_BITS 64
> | /* end confdefs.h.  */
> |
> |   #include 
> |   /* Check that time_t can represent 2**32 - 1 correctly.  */
> |   #define LARGE_TIME_T \
> | ((time_t) (((time_t) 1 << 30) - 1 + 3 * ((time_t) 1 << 30)))
> |   int verify_time_t_range[(LARGE_TIME_T / 65537 == 65535
> |&& LARGE_TIME_T % 65537 == 0)
> |   ? 1 : -1];
> |
> configure:9815: result: no
> configure:9818: checking for 64-bit time_t with _TIME_BITS=64
> configure:9838: gcc -c -DNDEBUG -g -O2  conftest.c >&5
> conftest.c:78:43: warning: integer overflow in expression of type
> 'long int' results in '-1073741824' [-Woverflow]
>78 | ((time_t) (((time_t) 1 << 30) - 1 + 3 * ((time_t) 1 << 30)))
>   |   ^
> conftest.c:79:28: note: in expansion of macro 'LARGE_TIME_T'
>79 |   int verify_time_t_range[(LARGE_TIME_T / 65537 == 65535
>   |^~~~
> conftest.c:78:43: warning: integer overflow in expression of type
> 'long int' results in '-1073741824' [-Woverflow]
>78 | ((time_t) (((time_t) 1 << 30) - 1 + 3 * ((time_t) 1 << 30)))
>   |   ^
> conftest.c:80:31: note: in expansion of macro 'LARGE_TIME_T'
>80 |&& LARGE_TIME_T % 65537 == 0)
>   |   ^~~~
> conftest.c:79:7: error: variably modified 'verify_time_t_range' at file scope
>79 |   int verify_time_t_range[(LARGE_TIME_T / 65537 == 65535
>   |   ^~~
> configure:9838: $? = 1
> configure: failed program was:
> | /* confdefs.h */
> 
> I am not that well versed in what I should do at this point, but it
> appears that the tests for 64 bit time are different and I wonder why
> they are and what should be fixed.
> 
I think you compare apples and oranges. 32-bit time_t on Debian and 64-bit
t
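The time_t range probe quoted above can be emulated in Python; the wraparound reproduces the -1073741824 value from the compiler warning in the log (an editorial sketch, not from the thread):

```python
def to_int32(x: int) -> int:
    """Wrap an integer into the signed 32-bit two's-complement range."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

# The probe's LARGE_TIME_T constant evaluates to 2**32 - 1.
large_time_t = ((1 << 30) - 1) + 3 * (1 << 30)
assert large_time_t == 2**32 - 1

# The intermediate 3 * 2**30 already overflows a 32-bit signed type, which is
# exactly the -1073741824 value in the -Woverflow warning; the full constant
# wraps to -1, so a 32-bit time_t cannot represent 2**32 - 1.
print(to_int32(3 * (1 << 30)))
print(to_int32(large_time_t))
```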

Re: 32 bit to 64 bit casting

2022-06-24 Thread Petr Pisar
On Wed, Jun 22, 2022 at 06:42:44PM -0500, KeithG wrote:
> On Wed, Jun 22, 2022 at 12:16 PM Petr Pisar  wrote:
> > That patch does not seem right. gnutls_x509_crt_get_expiration_time() 
> > returns
> > time_t and now is also time_t.
> >
> > Either there is a bug in gnutls, or glibc, or simply the expiration time of
> > the certificate is not representable with 32-bit time_t type. I would not be
> > surprised if it were the last case.
> >
> > Can you post here a complete certificate chain the server presents to wget?
> > You can use "openssl s_client -connect THE_HOST:https -showcerts to obtain 
> > it.
> > If it so, than the only fix is to recompile your system with 
> > "-D_TIME_BITS=64
> > -D_FILE_OFFSET_BITS=64" CFLAGS. (Provided your platform supports it.)
> >
> > -- Petr
> 
> I am not sure what I need to reply when the command completes. I typed
> '0'. This is on an armv7 running arch linux:
> 
> # openssl s_client -connect google.com:https -showcerts
> CONNECTED(0003)
> depth=2 C = US, O = Google Trust Services LLC, CN = GTS Root R1
> verify return:1
> depth=1 C = US, O = Google Trust Services LLC, CN = GTS CA 1C3
> verify return:1
> depth=0 CN = *.google.com
> verify return:1
> ---
> Certificate chain
>  0 s:CN = *.google.com
>i:C = US, O = Google Trust Services LLC, CN = GTS CA 1C3

The certificates look good. Their timestamps fit into 32-bit time_t
resolution.

Are you sure the machine where wget fails has a correctly set clock? The
server certificate expires on Aug 29 08:29:45 2022 GMT.

I tried wget-1.21.3 built with GnuTLS on an i686 machine, which is also a
32-bit platform, with the Fedora operating system, and I don't observe any
failure.

Can you try using the GnuTLS client directly on the affected system? Make sure
the argument of the --x509cafile option points to a file with all CA
certificates:

gnutls-cli --x509cafile /etc/ssl/certs/ca-bundle.crt --port https google.com

If this is a bug in GnuTLS (or some of its libraries), it should fail,
too.
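As a quick editorial check of the "timestamps fit into 32-bit time_t" claim, Python's ssl module can convert the quoted expiry date to epoch seconds and compare it against the signed 32-bit maximum:

```python
import ssl

# notAfter of the server certificate discussed in this thread
expiry = ssl.cert_time_to_seconds("Aug 29 08:29:45 2022 GMT")

# Well below 2**31 - 1, so a 32-bit time_t represents it without trouble.
print(expiry, expiry <= 2**31 - 1)
```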

-- Petr




Re: 32 bit to 64 bit casting

2022-06-22 Thread Petr Pisar
On Tue, Jun 21, 2022 at 10:54:47PM -0500, KeithG wrote:
> With the latest wget 1.21.3 on arch linux on armv7, I was experiencing
> a 'certificate expired' error. This has been going on for a while but
> only on my 32 bit machine. The 64 bit version never had a problem on
> x86_64 or aarch64. 1.21.1 did not have this problem on 32 bit
> machines, either. I was able to resolve the issue by a patch found in
> this thread:
> https://archlinuxarm.org/forum/viewtopic.php?f=57=16070=69630#p69630
> I still have test problems for the iri tests, but this patch does fix
> the https certificate bug on arch arm armv7 and maybe other 32 bit OS.
> 

   time_t now = time (NULL);
   [...]
--- wget-1.21.3.org/src/gnutls.c2022-02-26 15:47:42.0 +0100
+++ wget-1.21.3/src/gnutls.c2022-06-21 20:51:40.244552644 +0200
@@ -1085,7 +1085,7 @@
   logprintf (LOG_NOTQUIET, _("The certificate has not yet been 
activated\n"));
   success = false;
 }
-  if (now >= gnutls_x509_crt_get_expiration_time (cert))
+  if (now >= (unsigned long) gnutls_x509_crt_get_expiration_time (cert))
 {
   logprintf (LOG_NOTQUIET, _("The certificate has expired\n"));
   success = false;

That patch does not seem right. gnutls_x509_crt_get_expiration_time() returns
time_t, and now is also time_t.

Either there is a bug in GnuTLS, or in glibc, or the expiration time of the
certificate is simply not representable in a 32-bit time_t type. I would not
be surprised if it were the last case.

Can you post here the complete certificate chain the server presents to wget?
You can use "openssl s_client -connect THE_HOST:https -showcerts" to obtain it.
If so, then the only fix is to recompile your system with the "-D_TIME_BITS=64
-D_FILE_OFFSET_BITS=64" CFLAGS. (Provided your platform supports it.)
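An editorial sketch of why an unrepresentable expiration time would trigger the "certificate has expired" branch (the timestamp below is hypothetical, chosen just past the 2038 rollover):

```python
def to_int32(x: int) -> int:
    """Wrap an integer into the signed 32-bit two's-complement range."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

now = 1_656_000_000            # roughly June 2022
expiry = 2**31 + 1_000_000     # hypothetical expiry shortly after 2038-01-19

# On a 32-bit time_t the expiry wraps negative, so `now >= expiry` holds and
# the certificate is reported as expired even though it is not.
wrapped = to_int32(expiry)
print(wrapped, now >= wrapped)
```

Casting the wrapped value to unsigned, as the quoted patch does, merely hides the comparison rather than restoring the real timestamp.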

-- Petr




Re: wget fails with this url

2022-05-20 Thread Petr Pisar
On Fri, May 20, 2022 at 06:18:15PM +0200, ge...@mweb.co.za wrote:
> (not sure if reply all is appropriate here)
> 
> I tried this from South Africa and I am getting the exact behaviour as the 
> OP. 
> 
It works from the Czech Republic. Both Firefox and wget.

> (fwiw: The downloaded audio sounds Italian to me.)
>
Yes.

> Of course, the 403 is the server's response to what it finds out about the
> requestor in the request. I just wonder what the difference is between the
> browser-generated request and the wget request and how the server could
> react in this way - cookies, maybe?
> 
In my case I cannot see any cookies involved. Nonetheless, wget supports
cookies by default, so that should not be a problem.

In Firefox, press F12 to invoke the debugger, select the Network tab, and
request the URL in the address bar. Then you can study the HTTP headers sent
and received in the Headers tab for each of the three listed requests.

Wget has the -S option to print the received headers. Sent headers can only be
seen in a WARC dump file created with the --warc-file option.

You can use the --header option to manually inject or override a particular
HTTP header.

The server actually uses CloudFront proxies. I guess the proxies in your
region behave differently.

Maybe pinning the d1bxy2pveef3fq.cloudfront.net server to a particular IP
address could help. My system used 65.9.96.42. But be aware that the same IP
address does not guarantee anything: the same address might be assigned to
multiple hosts and routed differently based on the autonomous system of the
client (anycast).

-- Petr




Re: wget - uninplemented 'secure-protocol' option value 1

2022-05-15 Thread Petr Pisar
On Sun, May 15, 2022 at 09:44:56AM +0200, kg.202205@kgsw.de wrote:
> $ wget --no-check-certificate --secure-protocol=SSLv2 
> https://192.168.2.106/CN8000
> --2022-05-15 09:29:30--  https://192.168.2.106/CN8000
> OpenSSL: unimplemented 'secure-protocol' option value 1

I think this error message says that your OpenSSL library does not support
SSLv2. This is not a problem in wget, but in your OpenSSL.

See the "openssl s_client -help" output. If there is no -ssl2 option listed
among options like -tls1_3, then your OpenSSL does not support it.

> Please report this issue to bug-wget@gnu.org

Maybe wget could be smarter and recommend contacting OpenSSL in this case.

-- Petr




[bug #62110] HSTS broken on 32 bit big endian devices

2022-03-18 Thread Petr Pisar
Follow-up Comment #11, bug #62110 (project wget):

Indeed. A 32-bit big-endian MIPS:

$ file staging_dir/toolchain-mips_24kc_gcc-8.4.0_musl/lib/gcc/mips-openwrt-linux-musl/8.4.0/crtbegin.o
staging_dir/toolchain-mips_24kc_gcc-8.4.0_musl/lib/gcc/mips-openwrt-linux-musl/8.4.0/crtbegin.o:
ELF 32-bit MSB relocatable, MIPS, MIPS32 rel2 version 1 (SYSV), not stripped

I compiled a simple program with it to check the sizes: time_t is 4 bytes, int
is 4 bytes, int64_t is supported and has 8 bytes.

I tried adding glibc's -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64, but that did
not change anything; the musl C library does not support it. musl supports
64-bit time_t since version 1.2.0. And of course one needs a "recent" Linux.
I guess this is not the case for OpenWRT.






[bug #62110] HSTS broken on 32 bit big endian devices

2022-03-17 Thread Petr Pisar
Follow-up Comment #9, bug #62110 (project wget):

Yes, i686 is 32-bit little-endian and GCC supports int64_t there. However, the
original reporter wrote "broken on 32 bit big endian devices".






[bug #62110] HSTS broken on 32 bit big endian devices

2022-03-17 Thread Petr Pisar
Follow-up Comment #7, bug #62110 (project wget):

That gnulib "int64" code does not work on all platforms. It only works on
platforms where a 64-bit integer type is supported by the compiler (int64_t in
stdint.h). For the sake of completeness, it is supported e.g. by GCC on 32-bit
little-endian x86. I have no idea about the platform of the original reporter.





Re: wget BUG: header host assign connection address

2021-06-22 Thread Petr Pisar
On Tue, Jun 22, 2021 at 02:55:29PM +0430, Pejman Taslimi wrote:
> The following command with any random IP retrieves google.com! Here I've
> just set a header, but wget connects really to google.com instead of
> 192.168.15.15.
> 
> $ wget -O- http://192.168.15.15 --header="Host: www.google.com"
> 
> However if I change http to https, it behaves as expected:
> 
> $ wget -O- https://192.168.15.15 --header="Host: www.google.com"
> 
> happens to the following version:
> wget-1.20.3-4.fc32.x86_64
> wget-1.21.1-2.fc32.x86_64
> 
> I wish this is the correct way of bug report, for this beloved wget!!
> 
I cannot reproduce it (wget 1.21.1 on Gentoo). I think you have set up an HTTP
proxy which intercepts the requests and forwards them to google.com. That
would explain why HTTPS does not work for you.

Either you have the proxy set in your environment, in which case wget should
print something like:

$ wget -O /dev/null http://192.168.15.15 --header='Host: www.google.com'
--2021-06-22 17:50:31--  http://192.168.15.15/
Resolving router.bayer.uni.cx (router.bayer.uni.cx)... 2001:470:993c::1, 
10.0.0.1
Connecting to router.bayer.uni.cx 
(router.bayer.uni.cx)|2001:470:993c::1|:8118... connected.
Proxy request sent, awaiting response... 503 Connect failed
2021-06-22 17:50:31 ERROR 503: Connect failed.

or you have a transparent proxy on your network; in that case a command like
this should manifest the same problem:

$ printf "GET http://192.168.15.15/ HTTP/1.1\r\nUser-Agent: Wget/1.21.1\r\nAccept: */*\r\nAccept-Encoding: identity\r\nHost: www.google.com\r\nConnection: Keep-Alive\r\nProxy-Connection: Keep-Alive\r\n\r\n" | nc 192.168.15.15 80
(UNKNOWN) [192.168.15.15] 80 (http) : No route to host

-- Petr




[bug #60494] Percent character in filename gets escaped twice

2021-05-16 Thread Petr Pisar
Follow-up Comment #6, bug #60494 (project wget):

You cannot state a question like that because a random string is ambiguous by
its nature.

According to the specification, there is no such thing as an unescaped URI.
A URI is always escaped by definition.

Look at the original report: You have a file name
"qtwebengine-everywhere-src-5.15.2-%231904652.patch.gz". It's a file name, not
a URI. If you construct a URL for the file name using the
"https://mirrors.slackware.com/slackware/slackware64-current/source/l/qt5/patches/"
base URL, then you need to escape the file name string and append it after the
path delimiter of the base URL. I.e. you convert the file name to
"qtwebengine-everywhere-src-5.15.2-%25231904652.patch.gz" and then append it
to the base, resulting in the
"https://mirrors.slackware.com/slackware/slackware64-current/source/l/qt5/patches/qtwebengine-everywhere-src-5.15.2-%25231904652.patch.gz"
URL. This URL is passed to the wget command. Thus wget should not escape it
again. It could validate it and report an error. But not escape it.

I will quote the specification here:

   Under normal circumstances, the only time when octets within a URI
   are percent-encoded is during the process of producing the URI from
   its component parts.  This is when an implementation determines which
   of the reserved characters are to be used as subcomponent delimiters
   and which can be safely used as data.  Once produced, a URI is always
   in its percent-encoded form.

Please pay attention to the last sentence.

Of course wget could state that its argument is a byte stream without any
other constraints. But the wget(1) manual states something different: its
argument is a URL:

SYNOPSIS
   wget [option]... [URL]...

Hence wget should not attempt any escaping.
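The escape-once rule described above can be illustrated with Python's urllib (an editorial sketch, not part of the bug comment):

```python
from urllib.parse import quote

base = "https://mirrors.slackware.com/slackware/slackware64-current/source/l/qt5/patches/"
fname = "qtwebengine-everywhere-src-5.15.2-%231904652.patch.gz"

# Escaping happens exactly once, while producing the URI from its components;
# the literal "%" in the file name becomes "%25".
url = base + quote(fname)
print(url)
```

Running quote() again on the resulting URL would escape the "%25" a second time, which is precisely the double-escaping the bug report complains about.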





Re: Unpleasant particularity Wget

2021-03-09 Thread Petr Pisar
On Wed, Mar 10, 2021 at 02:34:05AM +0200, kmb...@yandex.ru wrote:
> Hello, Bug-wget.
> 
> Unpleasant particularity Wget
> In version 1.11.4 when I downloade file from http://sourceforge.net/ I get it.
> For example
> https://sourceforge.net/projects/sevenzip/files/7-Zip/17.01/7z1701.exe/download
> I get file with name 7z1701.exe
> 
> In new version in this case I get file with name "download".
> It is how 7z1701.exe but has name "download".
> Maybe you will correct it.
> OS: WinXP SP3 NTFS
> 
You've already sent this question under the 475094225.20210307021...@yandex.ru
message ID and got an answer.

-- Petr




Re: Unpleasant particularity Wget

2021-03-07 Thread Petr Pisar
On Sun, Mar 07, 2021 at 02:15:40AM +0200, kmb...@yandex.ru wrote:
> Unpleasant particularity Wget
> In version 1.11.4 when I downloaded file from http://sourceforge.net/ got it.
> For example
> https://sourceforge.net/projects/sevenzip/files/7-Zip/17.01/7z1701.exe/download
> I got file with name 7z1701.exe
> wget.exe -x -c --no-check-certificate 
> https://sourceforge.net/projects/sevenzip/files/7-Zip/17.01/7z1701.exe/download
> 
> In new version in this case I got file with name "download".
> It is how 7z1701.exe but has name "download".

You need to pass the --content-disposition option to wget to make it respect
the name sent by the server in a Content-Disposition HTTP header. Otherwise
the new wget uses the last component of the URL as the local file name.
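A small editorial sketch of the header in question, using Python's email parser to extract the file name the server suggests (this is the value --content-disposition makes wget honour):

```python
from email.message import Message

# Build a stand-in response header block and read the suggested file name.
resp = Message()
resp["Content-Disposition"] = 'attachment; filename="7z1701.exe"'
print(resp.get_filename())
```

Without that header (or without the option), only the last URL path component, here "download", is left to use as the local name.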

-- Petr





Re: Timeout never happens with https://www.tesco.com

2020-12-06 Thread Petr Pisar
On Sun, Dec 06, 2020 at 06:48:13PM +0000, Tom Crane wrote:
> tmp$ strace -s999 -f -e trace=network wget -S -v --no-check-certificate 
> --waitretry=1 --tries=1 --timeout=1 --read-timeout=1 --wait=1 -O 
> /tmp/280200433.tmp https://www.tesco.com/groceries/en-GB/products/280200433
[...]
> connect(4, {sa_family=AF_INET, sin_port=htons(443), 
> sin_addr=inet_addr("104.82.193.249")}, 16) = 0
> connected.
> HTTP request sent, awaiting response...
> 
> but the timeout never happens,
Do not restrict strace to network syscalls. We cannot see where it blocks.
The timeout should occur in select() syscall.

-- Petr




Re: Bad arg length for Socket::inet_ntoa

2020-11-29 Thread Petr Pisar
On Sun, Nov 29, 2020 at 05:36:50PM +0100, Tim Rühsen wrote:
> Ya, this bug is fixed upstream since a while: 
> https://github.com/libwww-perl/HTTP-Daemon/issues/21.
> 
> Just for the record:
> - the patch just touches a single line of perl code.
> - it took since 2011 to fix it (report is from 2011-10-01).
> 
> Some "stable" OSes still won't upgrade... ¯\_(ツ)_/¯
> 
But that does not exclude that the stable OSes can port the IPv6 feature back
to their versions, like RHEL did. However, this discussion is not productive
here. If you have a problem with Ubuntu, you need to talk to Ubuntu, not to
wget.

-- Petr




Re: Is this a bug?

2020-06-14 Thread Petr Pisar
On Sun, Jun 14, 2020 at 04:02:49PM +, George R Goffe wrote:
> I occasionally get a message from wget(?) that states "just got xxx of yyy
> bytes".
> 
> Is this a failure in wget? A server problem? I have the source for wget1 and
> wget2 and don't see any strings "just got" in any of the code. Maybe it
> comes from one of the external libraries wget* uses?
> 
libwget/http.c in wget2:

if (nbytes < 0)
        error_printf(_("Failed to read %zd bytes (%d)\n"), nbytes, errno);
if (body_len < resp->content_length) {
        resp->length_inconsistent = true;
        error_printf(_("Just got %zu of %zu bytes\n"), body_len, resp->content_length);
} else if (body_len > resp->content_length) {
        resp->length_inconsistent = true;
        error_printf(_("Body too large: %zu instead of %zu bytes\n"), body_len, resp->content_length);
}

It happens when a server claimed that the response body had xxx bytes, but
then the server only sent yyy bytes.

It is either a bug in the server, or a network error. There also was a bug in
wget2 when resuming a file retrieval.

-- Petr




Re: SSL certificate issue

2020-06-02 Thread Petr Pisar
On Wed, Jun 03, 2020 at 12:35:06AM +0800, Fai Yip wrote:
> hi,
>   I am a web developer, and I use wget for some of my routines.  I am
> getting a certificate error, and I have no idea how to solve this.  I
> already check my date, and ca-certificates.
>   Please help
> 
> Thanks
> Fai
> 
> 
> fai@pi205:~ $ date
> Tue  2 Jun 12:32:14 EDT 2020
> fai@pi205:~ $ openssl s_client -connect www.odysseyofthemind.com:443
> 2>/dev/null |openssl x509 -noout  -dates
> notBefore=Jan 26 00:00:00 2020 GMT
> notAfter=Feb  7 23:59:59 2021 GMT
> 
> fai@pi205:~ $ wget  https://www.google.com
> --2020-06-02 12:32:23--  https://www.google.com/
> Resolving www.google.com (www.google.com)... 172.217.161.164,
> 2404:6800:4005:80f::2004
> Connecting to www.google.com (www.google.com)|172.217.161.164|:443...
> connected.
> HTTP request sent, awaiting response... 200 OK
> Length: unspecified [text/html]
> Saving to: ‘index.html.1’
> 
> index.html.1[ <=>
> 
>   ]  12.89K  --.-KB/sin
> 0.001s
> 
> 2020-06-02 12:32:23 (8.93 MB/s) - ‘index.html.1’ saved [13202]
> 
> fai@pi205:~ $ wget  https://www.odysseyofthemind.com
> --2020-06-02 12:32:28--  https://www.odysseyofthemind.com/
> Resolving www.odysseyofthemind.com (www.odysseyofthemind.com)...
> 144.208.72.199
> Connecting to www.odysseyofthemind.com
> (www.odysseyofthemind.com)|144.208.72.199|:443...
> connected.
> ERROR: The certificate of ‘www.odysseyofthemind.com’ is not trusted.
> ERROR: The certificate of ‘www.odysseyofthemind.com’ has expired.
> fai@pi205:~ $

See . In
short, update your GnuTLS library.

-- Petr




Re: Sectigo root CA expiry issue

2020-05-30 Thread Petr Pisar
On Sat, May 30, 2020 at 07:57:22PM +0200, Tenboro wrote:
> Today I started getting some errors with a maintenance script that makes
> use of wget, where it claims that a certificate has expired.
> 
> DEBUG output created by Wget 1.19.5 on linux-gnu.
> 
> Reading HSTS entries from /root/.wget-hsts
> URI encoding = ‘UTF-8’
> --2020-05-30 17:29:58--  https://ehwiki.org/
> Certificates loaded: 154
> Resolving ehwiki.org (ehwiki.org)... 94.100.29.76
> Caching ehwiki.org => 94.100.29.76
> Connecting to ehwiki.org (ehwiki.org)|94.100.29.76|:443... connected.
> Created socket 4.
> Releasing 0x5633a3c84880 (new refcount 1).
> ERROR: The certificate of ‘ehwiki.org’ is not trusted.
> ERROR: The certificate of ‘ehwiki.org’ has expired.
> 
> However, the certificate does not expire until March 2021.

Yes. That's a badly worded error message from wget. The issue is not with the
ehwiki.org certificate; the issue is with its authority's certificate.

> Doing the same
> with curl on the same box produces no errors, so it does not seem to be an
> issue with the system CA certs. Based on some slouching around, it seems to
> have something to do with wget not correctly handling the expiry of the
> Sectigo AddTrust root certificate:
> 
> https://support.sectigo.com/articles/Knowledge/Sectigo-AddTrust-External-CA-Root-Expiring-May-30-2020
> 
[...]
> The issue is present on CentOS 6, CentOS 7 and CentOS 8 installations with
> all updates applied.
> 
> I'm not sure if this is a distro issue or an issue with wget itself?

I experience it on Gentoo too. The problem is not in wget:

$ wget --version
GNU Wget 1.20.3 built on linux-gnu.

-cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls 
-ntlm +opie -psl +ssl/gnutls 

but in GnuTLS library:

$ gnutls-cli --port https ehwiki.org
Processed 158 CA certificate(s).
Resolving 'ehwiki.org:https'...
Connecting to '94.100.29.76:443'...
- Certificate type: X.509
- Got a certificate list of 3 certificates.
- Certificate[0] info:
 - subject `CN=ehwiki.org,OU=Gandi Standard SSL,OU=Domain Control Validated', 
issuer `CN=Gandi Standard SSL CA 2,O=Gandi,L=Paris,ST=Paris,C=FR', serial 
0x63a5ea656ff9efdfe68ec85d3025466c, RSA key 2048 bits, signed using RSA-SHA256, 
activated `2019-01-31 00:00:00 UTC', expires `2021-03-12 23:59:59 UTC', 
pin-sha256="wPbqFLlZqQbuF+thnCarsf0k8CbvM8wbbjhcT45lx78="
Public Key ID:
sha1:63ddc827cb0c5efda0634864ececc9855001c8bc

sha256:c0f6ea14b959a906ee17eb619c26abb1fd24f026ef33cc1b6e385c4f8e65c7bf
Public Key PIN:
pin-sha256:wPbqFLlZqQbuF+thnCarsf0k8CbvM8wbbjhcT45lx78=

- Certificate[1] info:
 - subject `CN=Gandi Standard SSL CA 2,O=Gandi,L=Paris,ST=Paris,C=FR', issuer 
`CN=USERTrust RSA Certification Authority,O=The USERTRUST Network,L=Jersey 
City,ST=New Jersey,C=US', serial 0x05e4dc3b9438ab3b8597cba6a19850e3, RSA key 
2048 bits, signed using RSA-SHA384, activated `2014-09-12 00:00:00 UTC', 
expires `2024-09-11 23:59:59 UTC', 
pin-sha256="WGJkyYjx1QMdMe0UqlyOKXtydPDVrk7sl2fV+nNm1r4="
- Certificate[2] info:
 - subject `CN=USERTrust RSA Certification Authority,O=The USERTRUST 
Network,L=Jersey City,ST=New Jersey,C=US', issuer `CN=AddTrust External CA 
Root,OU=AddTrust External TTP Network,O=AddTrust AB,C=SE', serial 
0x13ea28705bf4eced0c36630980614336, RSA key 4096 bits, signed using RSA-SHA384, 
activated `2000-05-30 10:48:38 UTC', expires `2020-05-30 10:48:38 UTC', 
pin-sha256="x4QzPSC810K5/cMjb05Qm4k3Bw5zBn4lTdO/nEW/Td4="
- Status: The certificate is NOT trusted. The certificate chain uses expired 
certificate. 
*** PKI verification of server certificate failed...
*** Fatal error: Error in the certificate.

It seems that GnuTLS stops on a failure in the first certificate chain, while
other libraries like OpenSSL explore other chains before giving up.

It would help if the ehwiki.org server did not send the expired certificate in
the certificate chain of the TLS handshake and instead sent the alternative one
that has not yet expired, as advertised on the Sectigo web page you linked.

-- Petr


signature.asc
Description: PGP signature


Re: Undefined reference to gnutls_protocol_set_priority() when compiling latest wget version

2020-05-12 Thread Petr Pisar
On Tue, May 12, 2020 at 05:34:22PM -0600, Stephen Kirby wrote:
> I'm using GnuTLS version 3.6.13.  I believe it is the latest.  If anyone
> knows otherwise please let me know.
> 
> Sorry for the delay in getting back to you Tim (was swamped this morning)
> and thanks for your fast response!  I double-checked the versions of GnuTLS
> and wget I am using.  Both are the absolute latest (gnutls-3.6.13 and
> wget-1.20.3).  As such, I am not sure why the latest wget (in src/gnutls.c)
> would employ a deprecated/removed function, specifically,
> "gnutls_protocol_set_priority()?  Do you recommend stepping back to an
> older version of GnuTLS to get around this and if so which one would work?
> Otherwise, would anyone know of a patch for the wget source code,
> specifically, for the file  /src/gnutls.c so I can use the latest versions
> of GnuTLS and wget?  Thanks so much.
> 
I also have these latest versions and I do not observe your problem.

Indeed, GnuTLS 3.6.13 does not provide the gnutls_protocol_set_priority
symbol; it provides gnutls_priority_set_direct. You can check that by
inspecting the library:

$ nm -D /usr/lib64/libgnutls.so.30.27.0 |grep gnutls_priority_set_direct
00052bc0 T gnutls_priority_set_direct
$ nm -D /usr/lib64/libgnutls.so.30.27.0 |grep gnutls_protocol_set_priority

If you read the wget code, you will find out that the
gnutls_protocol_set_priority() function is used only if the
HAVE_GNUTLS_PRIORITY_SET_DIRECT C preprocessor macro is not defined. Please
check src/config.h generated after running ./configure. I bet it defines it.

If that's so, you need to find out why the configure check was unable to
discover support for gnutls_priority_set_direct. configure does this:

for ac_func in gnutls_priority_set_direct
do :
  ac_fn_c_check_func "$LINENO" "gnutls_priority_set_direct" 
"ac_cv_func_gnutls_priority_set_direct"
if test "x$ac_cv_func_gnutls_priority_set_direct" = xyes; then :
  cat >>confdefs.h <<_ACEOF
#define HAVE_GNUTLS_PRIORITY_SET_DIRECT 1
_ACEOF

I recommend reading config.log (around the "checking for
gnutls_priority_set_direct" line) to find that out.

I suspect your GnuTLS installation is botched. Probably the library and header
files do not match.

-- Petr


signature.asc
Description: PGP signature


Re: wget in cron issue with dns resolution of local server

2019-12-17 Thread Petr Pisar
On Tue, Dec 17, 2019 at 03:45:57PM +0100, Álvaro Pinel Bueno wrote:
> El mar., 17 dic. 2019 a las 15:36, Tim Rühsen ()
> escribió:
> 
> > are you sure that " and * are not somehow removed / expanded ?
> >
> > To test, put the wget command into a shell script and start that script
> > in the crontab. Does this make a difference ?
> >
> 
> Yes, I fact I realized this behavior using wget inside a bash script in
> crontab, I copied the wget line from the script and added isolated to
> root-cron and the behavior keep going.
> 
Isn't the unwanted resolution of the machine hostname performed by cron (or
a sendmail command)? Cron can send the output of a job through e-mail, and by
default it sends it to the owner of the crontab at the hostname. I would
replace the wget command with something that prints to stdout and stderr and
maybe exits with a non-zero exit code to verify that.

-- Petr


signature.asc
Description: PGP signature


Re: [Bug-wget] /usr/bin/env: invalid option -- 'S'

2019-06-02 Thread Petr Pisar
On Sun, Jun 02, 2019 at 02:00:54PM +0200, Tim Rühsen wrote:
> On 31.05.19 21:15, Petr Pisar wrote:
> > On Thu, May 30, 2019 at 09:56:33AM -0400, Jeffrey Walton wrote:
> >> I used PERL5LIB to put teests/ on path for Perl. It looks like at
> >> least one Debian machine I have is back to the Socket::inet_ntoa
> >> problems.
> >>
> >> I'm calling it good.
> >>
> >> The Perl people need to fix Socket::inet_ntoa, and the Debian people
> >> need to make it available. I'm guessing Debian is the holdup. They
> >> will leave things broke rather than supplying an update. It is a waste
> >> of time to file a Debian bug report.
> >>
> > You can report your issues directly to Socket authors if you believe the 
> > issue
> > is not specific to Debian.
> > 
> > May I know what's your issue with Socket::inet_ntoa?
> 
> It's not about Socket::inet_ntoa (sorry for not correcting this before).
> IMO, it's about https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=887590
> 
I see. Those are the IPv6 patches for HTTP::Daemon I wrote two years ago for
RHEL.

The patches are indeed quite large and have an effect on other packages that
use HTTP::Daemon. Especially on tests. Because various packages are not
prepared for HTTP::Daemon listening on an IPv6 socket. I understand why Debian
does not want to apply them to a stable distribution. Applying them would
change a behavior and people could get mad at them.

I can see two solutions for wget. Either use 127.0.0.1 instead of localhost
everywhere, or skip the particular test if HTTP::Daemon is unable to listen on
IPv6 while plain Socket (or IO::Socket::IP) is able to.

-- Petr


signature.asc
Description: PGP signature


Re: [Bug-wget] /usr/bin/env: invalid option -- 'S'

2019-05-31 Thread Petr Pisar
On Thu, May 30, 2019 at 09:56:33AM -0400, Jeffrey Walton wrote:
> I used PERL5LIB to put teests/ on path for Perl. It looks like at
> least one Debian machine I have is back to the Socket::inet_ntoa
> problems.
> 
> I'm calling it good.
> 
> The Perl people need to fix Socket::inet_ntoa, and the Debian people
> need to make it available. I'm guessing Debian is the holdup. They
> will leave things broke rather than supplying an update. It is a waste
> of time to file a Debian bug report.
> 
You can report your issues directly to Socket authors if you believe the issue
is not specific to Debian.

May I know what's your issue with Socket::inet_ntoa?

-- Petr


signature.asc
Description: PGP signature


Re: [Bug-wget] strange behaviour

2018-05-20 Thread Petr Pisar
On Thu, May 10, 2018 at 10:27:35AM +, VINEETHSIVARAMAN wrote:
> My server is behind a firewall and a  proxy, but when i give  2 "wget" in
> command  gives me a DNS resolution but not with the single wget !
> 
[...]
> [~]$ nslookup google.com
> 
> Non-authoritative answer:
> Name:   google.com
> Address: 74.125.24.102
> Name:   google.com
> Address: 74.125.24.101
> Name:   google.com
> Address: 74.125.24.139
> Name:   google.com
> Address: 74.125.24.113
> Name:   google.com
> Address: 74.125.24.138
> Name:   google.com
> Address: 74.125.24.100
> 
> [~]$ wget google.com --no-proxy -d
> DEBUG output created by Wget 1.14 on linux-gnu.
> 
> URI encoding = ‘UTF-8’
> Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8)
> Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8)
> --2018-05-10 06:24:33--  http://google.com/
> Resolving google.com (google.com)... failed: Name or service not known.

nslookup bypasses the system domain name resolver and queries DNS servers
directly, unlike wget and most other programs.

Do you experience the same issue with other programs that use the system
resolver, e.g. "getent hosts google.com"? Maybe one of your name servers
misbehaves and only the second query to the second one succeeds. Maybe your
system resolver validates DNSSEC signatures and your network or name servers
block EDNS packets. What's your system resolver (/etc/nsswitch.conf)? Do you
use the nscd or sssd caching daemons? If you do, what happens if you flush
their caches (e.g. nscd --invalidate hosts)? Capturing and studying network
packets while experiencing the issue would also help.

-- Petr


signature.asc
Description: PGP signature


Re: [Bug-wget] Windows cert store support

2015-12-11 Thread Petr Pisar
On Fri, Dec 11, 2015 at 01:22:48PM +0200, Eli Zaretskii wrote:
> > Date: Thu, 10 Dec 2015 01:12:37 +0100
> > From: Ángel González 
> > Cc: bug-wget 
> > 
> > On 09/12/15 03:06, Random Coder wrote:
> > > I'm not sure if the wget maintainers would be interested, but I've
> > > been carrying this patch around in my private builds of wget for a
> > > while.  It allows wget to load SSL certs from the default Windows cert
> > > store.
> > >
> > > The patch itself is fairly straightforward, but as it changes the
> > > default SSL behavior, and no care was taken to follow coding convents
> > > when I wrote it, so it's probably not ready for inclusion in the
> > > codebase.  Still, if it's useful, feel free to use it for ideas.
> > Wow, supporting the OS store would certainly be very cool.
> > 
> > I would probably move it to windows.c and attempt to make it also work 
> > in gnutls, but in general it looks good.
> 
> Wget compiled with GnuTLS already supports this feature: it calls
> gnutls_certificate_set_x509_system_trust when the GnuTLS library
> supports that.  gnutls_certificate_set_x509_system_trust does
> internally what the proposed patch does.
> 
> So I think this code should indeed go only to openssl.c, as gnutls.c
> already has its equivalent.
> 
AFAIK the OpenSSL source contains a crypto engine that delegates all operations
to the native Windows cryptographic subsystem. It's only a matter of default
configuration.

-- Petr


signature.asc
Description: PGP signature


Re: [Bug-wget] [PATCH] UTF-8-ify contributors' names in wget.texi (#40472)

2015-03-16 Thread Petr Pisar
On Mon, Mar 16, 2015 at 07:55:47PM +0100, Giuseppe Scrivano wrote:
 ./texi2pod.pl -D VERSION=1.16.3.6-b74a-dirty ./wget.texi wget.pod
 /usr/bin/pod2man --center=GNU Wget --release=GNU Wget 1.16.3.6-b74a-dirty 
 wget.pod  wget.1
 Wide character in printf at /usr/share/perl5/vendor_perl/Pod/Simple.pm line 
 541.
 wget.pod around line 2395: Non-ASCII character seen before =encoding in 
 'Nikšić'. Assuming UTF-8
 POD document had syntax errors at /usr/bin/pod2man line 69.
 
For a year or so, podlators have required specifying the POD encoding if the
POD is not ASCII or ISO-8859-1. Future versions will default to CP1251.
I recommend writing:

=encoding utf8

above first POD line.

Also, for some time podlators have had pretty verbose diagnostics and complain
about things which older versions did not complain about. The actual output
depends on how old your perl or CPAN installation is.

-- Petr


signature.asc
Description: PGP signature


Re: [Bug-wget] SSL Poodle attack

2014-10-15 Thread Petr Pisar
On Wed, Oct 15, 2014 at 11:57:47AM +0200, Tim Rühsen wrote:
 (means, the libraries defaults are used, whatever that is).
 
 Should we break compatibility and map 'auto' to TLSv1 ?
 For the security of the users.

Please no. Instead of changing each TLS program, one should patch only the TLS
library. This is the reason why we have shared libraries.

So just report the issue to your vendor; they will fix the few TLS
implementations they deliver and all applications will get fixed automatically.

-- Petr



pgpt4V8rcL4IP.pgp
Description: PGP signature


Re: [Bug-wget] [PATCH] Allow to redefine ciphers list for OpenSSL

2014-07-09 Thread Petr Pisar
On Tue, Jul 08, 2014 at 10:00:24AM -0400, Tomas Hozza wrote:
 I'm afraid this is not suitable for us. We need to be able to define the
 policy somewhere in /etc, where the user is not able to change it (only
 the system administrator).

I hope you can also prevent the user from running his own wget executable,
LD_PRELOAD-ing a modified OpenSSL library, or intercepting open(2) calls to
provide a fake /etc file.

 Also the main intention to have a single place to set the policy for all
 system components, therefore wgetrc is not the right place for us.
 
What about changing wget to call OPENSSL_config(NULL) instead of setting some
hard-coded preference string? Then you can teach OpenSSL to load your /etc
configuration instead of patching each application.

-- Petr


pgpieMgSxd4PH.pgp
Description: PGP signature


Re: [Bug-wget] glibc detected

2014-06-28 Thread Petr Pisar
On Sat, Jun 28, 2014 at 10:39:09AM +0200, Martin Jašek wrote:
 
 wget -q -O spoje.html
 www.idos.cz/olomouc/odjezdy/\?f=Trnkova\t=Hlavní%20nádraží\submit=true
 http://www.idos.cz/olomouc/odjezdy/%5C?f=Trnkova%5Ct=Hlavn%C3%AD%20n%C3%A1dra%C5%BE%C3%AD%5Csubmit=true

Could you send the URL once again, and this time as plain text (no HTML)?
The text which ended up in the mailing list consists of two different
addresses, and both of them return 404 Not Found.

 in some cases (about one half of tries) it throws glibc detected:
 
 *** glibc detected *** wget: double free or corruption (!prev): 0x0843cb80
 ***
 === Backtrace: =
 /lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x70f01)[0xb7457f01]
 /lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x72768)[0xb7459768]
 /lib/i386-linux-gnu/i686/cmov/libc.so.6(cfree+0x6d)[0xb745c81d]
 wget[0x8076c44]
 /lib/i386-linux-gnu/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xb73fde46]
 wget[0x804c931]
 (...)

Please install the package with debugging data for wget and try again, and
provide a longer backtrace.

Also, running the command under gdb or valgrind would be beneficial.

 
 I am using wget and system as following:
 wget --version
 GNU Wget 1.13.4 sestaven na systému linux-gnu.
 
There is 1.14 in the wild. Does it crash too?

-- Petr


pgpzQ9UUP8kCa.pgp
Description: PGP signature


Re: [Bug-wget] glibc detected

2014-06-28 Thread Petr Pisar
On Sat, Jun 28, 2014 at 01:43:15PM -0400, Ray Satiro wrote:
 In Ubuntu 14.04 / wget 1.15 I don't see anything unusual. Logs are 
 attached with version information and --debug output and a valgrind on 
 Ubuntu 13.10/wget 1.14 but I don't have the debugging package.. did you 
 mean wget debug package Petr?

The wget114out4.txt with the valgrind output is promising. The first reported
issue can be a wget or glibc fault. The other ones look like wget faults.
However, without debugging symbols available, one cannot see which source
code line the mistake is on (valgrind shows only ??? in that case). Yes, you
need to install the wget debug package.

-- Petr



pgpapAKF2vq4p.pgp
Description: PGP signature


Re: [Bug-wget] SINGLE QUOTE in wget output is replaced by UTF-8 value of SINGLE QUOTE

2012-08-09 Thread Petr Pisar
On Thu, Aug 09, 2012 at 04:50:46PM +0530, Avinash wrote:
 2) Have built a rpm on top of this source for FEDORA CORE 3

Do you know this is a really ancient piece of software?

 3) It get installed correctly on FC3 and wget downloads files as well
 4) But the output of wget shows UTF-8 value of SINGLE QUOTE instead of
 showing SINGLE QUOTE i.e. '
 
[...]
 Can somebody please help as to why it is replacing SINGLE QUOTE by its
 UTF-8 equivalent ?

What's your locale value (run the command `locale')? If it is UTF-8, then you
get UTF-8 characters. Try running `LC_ALL=C wget'.

-- Petr


pgpx2pUtohCc0.pgp
Description: PGP signature


Re: [Bug-wget] [FEATURE-REQUEST] Pinning SSL certificates / check SSL fingerprints

2012-07-08 Thread Petr Pisar
On Sat, Jul 07, 2012 at 01:25:49PM -0600, Daniel Kahn Gillmor wrote:
 On 07/07/2012 12:50 PM, Ángel González wrote:
  On 06/07/12 01:01, pro...@secure-mail.biz wrote:
  Because SSL CA's have failed many times (Comodo, DigiNotar, ...) I wish to 
  have an option to pin a SSL certificate. The fingerprint may be optionally 
  provided through a new option.
  Have you tried using --ca-certificate option?
 
 I believe the OP wants to pin the certificate of the remote server (that
 is, the end entity certificate), whereas --ca-certificate pins the
 certificate of the issuing authority.
 
Indeed? I thought --ca-certificate just makes the certificate trusted, so
declaring the server certificate using --ca-certificate could be enough.

Though there can be a problem with HTTP redirects, and of course some picky TLS
libraries can insist on the CA=true X.509 attribute. Also, some TLS
implementations check the server hostname against the certificate name.

So if the TLS library cannot be cheated with the --ca-certificate option,
overriding the root of trust in another way is a good idea.

I'm just a little worried about the digest algorithm. One can claim that MD5 is
too weak. There have been real successful attacks exploiting MD5 collisions of
signed objects in X.509 certificates. The ability to specify a different
algorithm is necessary.

Also remember the HTTP redirect scenario where you need to verify two (or
more) servers. It's necessary to be able to supply multiple pinning options.

Maybe an option pinning a certificate to a hostname would be the best choice.
No hashes. Just supply the peer certificate content like with --ca-certificate,
e.g. `--peer-certificate example.com:/tmp/example.cert'.

-- Petr


pgpsNRlttrc9I.pgp
Description: PGP signature


[Bug-wget] Message faults in wget 1.13-pre1

2012-06-24 Thread Petr Pisar
Hello,

while updating the Czech translation for wget 1.13-pre1, I found the following
mistakes.


# FIXME: Double dot
#: src/http.c:3121
msgid Cannot write to WARC file..\n

# FIXME: missing space after comma
#: src/main.c:1197
#, c-format
msgid 
Both --no-clobber and --convert-links were specified,only --convert-links 
will be used.\n

-- Petr


pgpSkWZNah6gx.pgp
Description: PGP signature


Re: [Bug-wget] alpha release 1.13.4.59-2b1dd

2012-05-27 Thread Petr Pisar
On Sat, May 26, 2012 at 03:00:26PM +0200, Giuseppe Scrivano wrote:
 I have just uploaded a new alpha version with the few changes of the
 last days:
 
 ftp://alpha.gnu.org/gnu/wget/wget-1.13.4.59-2b1dd.tar.bz2
 
 If nothing horrible happens, then I will make an official release in few
 days.
 
When are you going to publish a new translation catalog template on the
Translation Project (TP)? I checked the Czech translation in the tarball and
there are 24 untranslated messages. If you want to get all translations ready
for the new release, please push the wget.pot into the TP and give translators
a few days to update the translations.

-- Petr


pgptliG8EDiun.pgp
Description: PGP signature


Re: [Bug-wget] Wget issues with delicious (SAN)

2012-01-31 Thread Petr Pisar
On Tue, Jan 31, 2012 at 06:08:43PM +0100, Sven Herzberg wrote:
 Hi,
 
 I used to execute a script to backup my bookmarks from delicious.com.
 However, since quite some time, wget doesn't connect to the remote site
 anymore, the error is:
 
  $ wget 'https://api.del.icio.us'
  --2012-01-31 12:58:52--  https://api.del.icio.us/
  Resolving api.del.icio.us... 184.72.40.0, 184.72.44.135, 50.18.156.75, ...
  Connecting to api.del.icio.us|184.72.40.0|:443... connected.
  ERROR: certificate common name `d-static.com' doesn't match requested host 
  name `api.del.icio.us'.
  To connect to api.del.icio.us insecurely, use `--no-check-certificate'.
  Unable to establish SSL connection.
 
 This looks like a trivial error, one might think. However, while the common
 name of the certificate indeed doesn't api.del.icio.us, it includes that
 name (among others) in the list of subject alternative names:
 
[...]
 Firefox, Chrome and Safari seem to be happy with this setup, wget isn't. Is
 there a reason behind this or is it just awaiting a patch to become fixed?
 
Subject alternative names have been supported by wget since 1.13
(http://bzr.savannah.gnu.org/lh/wget/trunk/revision/2317).

-- Petr


pgpqBaP3W6jUl.pgp
Description: PGP signature


Re: [Bug-wget] Storing the URL in extended attributes

2011-09-04 Thread Petr Pisar
On Mon, Oct 18, 2010 at 12:48:14PM +0200, Petr Pisar wrote:
 On Mon, Oct 18, 2010 at 12:27:24PM +0200, Michelle Konzack wrote:
  Am 2010-10-17 22:38:51, hacktest Du folgendes herunter:
   I created a patch to store the URL inside the user xattrs of the
   downloaded file; this way, its origin can be identified afterwards.
   
   I uploaded the change to my Github account and attached the diff, and
   I am still working on portability issues, but I'd like to hear some
   opinions on this:
   
   http://github.com/wertarbyte/wget/tree/xattrurl
  
  I am right that this works only on Windos and not GNU/Linux?
  
 See my patch for wget
 http://article.gmane.org/gmane.comp.web.wget.general/7894. It works on
 GNU/Linux, IRIX and Darwin. Other option is to use libattr as the API is not
 standardized (it support IRIX and GNU/Linux only). FreeBSD has yet another
 API.
 

In case somebody is still interested (maybe the new wget maintainer), a rebased
patch against 1.13.3 is still available on http://xpisar.wz.cz/wget-xattr/.

-- Petr



pgpNWXhdvliTt.pgp
Description: PGP signature


Re: [Bug-wget] Need to handle wild card certificates

2010-12-03 Thread Petr Pisar
On Fri, Dec 03, 2010 at 08:17:37AM +0100, Petr Pisar wrote:
 
 I communicated the copyright assignement with Mica by direct e-mails. Please
 ask him or FSF.)

Or I can send you a poor scan of the assignment signed by both parties.

-- Petr



pgpnDHMnZn8eh.pgp
Description: PGP signature


Re: [Bug-wget] Need to handle wild card certificates

2010-12-02 Thread Petr Pisar
On Thu, Dec 02, 2010 at 10:21:29PM +0100, Giuseppe Scrivano wrote:
 I am not sure yet about the next release, I can't apply a patch because
 the author hasn't assigned copyright to the FSF yet.  I don't think
 there will be a release before 2-3 weeks.
 
This concrete server certificate provides DNS subject alternative names (SAN),
which are implemented in wget by a patch already committed into trunk. The SAN
patch was written by me, and the copyright assignment to the FSF was handled
more than a year ago. (Otherwise Micah would not have committed the code.)

(BTW, referencing the commit by ordinal number in the Bazaar VCS is nonsense,
as the ordinal numbers are not globally stable. The correct identifier is the
revision-id `petr.pi...@atlas.cz-20091024230644-bawcvao7wi71y1ky'.)

I communicated the copyright assignment with Micah by direct e-mails. Please
ask him or the FSF.

-- Petr


pgpSHSHTCdoh9.pgp
Description: PGP signature


Re: [Bug-wget] Re: Re: Re: Storing the URL in extended attributes

2010-10-19 Thread Petr Pisar
On Tue, Oct 19, 2010 at 10:21:39PM +0200, Michelle Konzack wrote:
 Note:   Programs which disallow to disable such things are crap.
 
 So for wget it should only  be  used  if  the  user  set  explicite  a
 commandline option or in the wgetrc.  Of course, it should not activated
 hardcoded or as a default in the wgetrc.
 
Dear angry user, it's a matter of configuration. You can compile curl/wget
without a feature, you can disable it in the system-wide configuration file,
and a user can do the same in his own configuration file or explicitly in the
arguments of the utility.

We can discuss what the ideal default option is, you can talk about it with
your operating system vendor, or you can do it yourself (yes, Gentoo rules :).

-- Petr


pgpmfkoy6TeEI.pgp
Description: PGP signature


Re: [Bug-wget] new alpha release 1.12-2416

2010-08-08 Thread Petr Pisar
On Sun, Aug 08, 2010 at 02:13:56PM +0200, Giuseppe Scrivano wrote:
 I have just uploaded a new alpha version of wget.
 
Great. Could you please upload the new gettext template to the Translation
Project? The last one is 1.12_pre6 from 2009-09-08
(http://translationproject.org/domain/wget.html). It would help translators
get time to update the translations and to get them into the new stable wget
release.

-- Petr


pgpQ1BkixFay6.pgp
Description: PGP signature


[Bug-wget] Wrong msgids in 1.12-pre3

2009-07-27 Thread Petr Pisar
Hello,

I hit some inconsistencies in the wget-1.12-pre3 message template:

 msgid ""
 msgstr ""
 "Project-Id-Version: PACKAGE VERSION\n"
 "Report-Msgid-Bugs-To: w...@sunsite.dk\n"
[…]

The new mailing list address should be here. This is a msgstr; however, it's
distributed from po/wget.pot.

 #: src/main.c:574
 msgid ""
 "       --auth-no-challenge Send Basic HTTP authentication information\n"
 "                           without first waiting for the server's\n"
 "                           challenge.\n"

The first letter should be lower-cased.

 #: src/main.c:434
 msgid ""
 "  -B,  --base=URL            resolves HTML input-file links (-i -F)\n"
 "                             relative to URL,\n"

The last symbol should be a full stop, not a comma.

-- Petr


pgp7RDEStSvDa.pgp
Description: PGP signature


[Bug-wget] Some flaws in wget-1.12-pre1 messages

2009-07-05 Thread Petr Pisar
Hello,

translating wget-1.12-pre1, I found some flaws in msgids:

# TODO: msgid bug: explicit quotation
#: src/gnutls.c:293
#, c-format
msgid "The certificate's owner does not match hostname '%s'\n"
msgstr "Jméno vlastníka certifikátu se neshoduje se jménem počítače „%s“\n"

# TODO: msgid bug: missing space after colon
#: src/http.c:2785
#, c-format
msgid "%s URL:%s %2d %s\n"
msgstr "%s URL:%s %2d %s\n"

# TODO: msgid bug: old copyright year
#. TRANSLATORS: When available, an actual copyright character
#. (cirle-c) should be used in preference to (C).
#: src/main.c:836
msgid "Copyright (C) 2008 Free Software Foundation, Inc.\n"
msgstr "Copyright © 2008 Free Software Foundation, Inc.\n"

# TODO: msgid bug: explicit quotation
#: src/netrc.c:421
#, c-format
msgid "%s: %s:%d: unknown token \"%s\"\n"
msgstr "%s: %s:%d: neznámý token „%s“\n"


Also some text in wget --version is not internationalized (marked with ^):

$ ./src/wget --version
GNU Wget 1.12-b2525 built on linux-gnu.

+digest +ipv6 +nls +ntlm +opie +md5/openssl +https -gnutls +openssl 
+gettext +iri 

Wgetrc:
/home/petr/.wgetrc (user)
                   ^^^^^^
/usr/local/etc/wgetrc (system)
                      ^^^^^^^^

-- Petr


pgpKaKkK5875N.pgp
Description: PGP signature


Re: [Bug-wget] Wget and IPv6 link local addresses

2009-06-01 Thread Petr Pisar
On Sat, May 30, 2009 at 12:18:02PM +0100, Fabian Hugelshofer wrote:
 
 Wget 1.11.4 does not support IPv6 link local addresses. Addresses with a
 zone identifier are being rejected with Invalid IPv6 numeric address.
 
 e.g.:
 $ wget http://[fe80::1%25eth0]/
 http://[fe80::1%25eth0]/: Invalid IPv6 numeric address.
 
 %25 is encoded for '%', the delimiter for the zone identifier.
 
 Such an URL gets rejected by is_valid_ipv6_address() in host.c.

The fix is not easy because the zone identifier should be considered only when
creating local sockets. It should be avoided in upper-layer (HTTP, SSL, FTP)
processing.

You can try the following patch, which is a quick and dirty hack for HTTP only.
A proper patch would require significant changes all over the code.

-- Petr

commit cd898b30ae67881df842f35e844c99afd7e9585b
Author: Petr Písař petr.pi...@atlas.cz
Date:   Mon Jun 1 20:58:38 2009 +0200

Naive IPv6 zone identifier implementation for HTTP.

diff --git a/src/host.c b/src/host.c
index fb5a2cb..48756d5 100644
--- a/src/host.c
+++ b/src/host.c
@@ -478,7 +478,12 @@ is_valid_ipv6_address (const char *str, const char *end)
 
   if (str == end)
     return false;
-  
+ 
+  /* RFC 4007: link scope address with zone identifier */
+  const char *realend = strchr(str, '%');
+  if (realend == NULL || realend > end)
+      realend = end;
+
   /* Leading :: requires some special handling. */
   if (*str == ':')
     {
@@ -491,7 +496,7 @@ is_valid_ipv6_address (const char *str, const char *end)
   saw_xdigit = false;
   val = 0;
 
-  while (str < end)
+  while (str < realend)
     {
       int ch = *str++;
 
@@ -517,7 +522,7 @@ is_valid_ipv6_address (const char *str, const char *end)
           colonp = str + tp;
           continue;
         }
-      else if (str == end)
+      else if (str == realend)
         return false;
       if (tp > ns_in6addrsz - ns_int16sz)
         return false;
@@ -529,13 +534,13 @@ is_valid_ipv6_address (const char *str, const char *end)
 
       /* if ch is a dot ... */
       if (ch == '.' && (tp <= ns_in6addrsz - ns_inaddrsz)
-          && is_valid_ipv4_address (curtok, end) == 1)
+          && is_valid_ipv4_address (curtok, realend) == 1)
         {
           tp += ns_inaddrsz;
           saw_xdigit = false;
           break;
         }
-
+
       return false;
     }
 
@@ -556,6 +561,16 @@ is_valid_ipv6_address (const char *str, const char *end)
   if (tp != ns_in6addrsz)
     return false;
 
+  /* The zone identifier has to be only a decimal natural number in a
+   * numerical address.  Otherwise, it's a platform-specific zone string
+   * (e.g. an interface name) and it has to be resolved. */
+  const char *ch;
+  for (ch = realend + 1; ch < end; ch++)
+    {
+      if (*ch < '0' || *ch > '9')
+        return false;
+    }
+
   return true;
 }
 
diff --git a/src/http.c b/src/http.c
index bdfe100..e6ee993 100644
--- a/src/http.c
+++ b/src/http.c
@@ -1508,6 +1508,13 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, struct url *proxy)
        becomes ambiguous and needs to be rewritten as Host:
        [3ffe:8100:200:2::2]:1234.  */
   {
+    /* Hide the IPv6 zone identifier.  Currently it has meaning only on
+     * link-local connections and the identifier is local-node specific.
+     * XXX: This ugly hack mangles data inside struct url *u.  Not
+     * thread-safe! */
+    char *zone_delimiter = strchr(u->host, '%');
+    if (zone_delimiter)
+      *zone_delimiter = '\0';
+
     /* Formats arranged for hfmt[add_port][add_squares].  */
     static const char *hfmt[][2] = {
       { "%s", "[%s]" }, { "%s:%d", "[%s]:%d" }
@@ -1517,6 +1524,9 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, struct url *proxy)
     request_set_header (req, "Host",
                         aprintf (hfmt[add_port][add_squares], u->host, u->port),
                         rel_value);
+    /* Unhide the zone delimiter. */
+    if (zone_delimiter)
+      *zone_delimiter = '%';
   }
 
   if (!inhibit_keep_alive)
diff --git a/src/url.c b/src/url.c
index 5f61e35..de5e43e 100644
--- a/src/url.c
+++ b/src/url.c
@@ -717,7 +717,14 @@ url_parse (const char *url, int *error)
 
 #ifdef ENABLE_IPV6
       /* Check if the IPv6 address is valid. */
-      if (!is_valid_ipv6_address(host_b, host_e))
+      /* XXX: host_b is not decoded yet.  This is not a perfect check for
+       * the zone delimiter '%'. */
+      p = strchr(host_b, '%');
+      if (p > host_e)
+        {
+          p = NULL;
+        }
+      if (!is_valid_ipv6_address(host_b, p ? p : host_e))
         {
           error_code = PE_INVALID_IPV6_ADDRESS;
           goto error;


pgp8kXIs5NzJb.pgp
Description: PGP signature


Re: [Bug-wget] Bug with multiname certificates

2009-05-26 Thread Petr Pisar
On Tue, May 26, 2009 at 09:51:08AM +0200, Marc MAURICE wrote:

 Trying to wget https://www.me.etsglobal.org/ returns a certificate error 
 (domain mismatch).
 However, Firefox and IE7 report no problem.
 I think wget doesn't handle certificates with several names correctly.


This is known bug #23934
(http://savannah.gnu.org/bugs/?func=detailitem&item_id=23934). You are free to
test the patch that is available on the bug page.

-- Petr


pgplZ67z6ganU.pgp
Description: PGP signature


Re: [Bug-wget] can wget uncompress gzip ?

2009-05-19 Thread Petr Pisar
On Tue, May 19, 2009 at 10:52:53AM +0200, fabrice régnier wrote:

 My web server send gzip pages.

 Wget can download this gzip pages with this command:

 # wget -S --header='Accept-Encoding: gzip,deflate' http://my_url

 But it stays gzip compressed. When i open these pages with firefox, i'd 
 like to see them in a more clear format ;)

This is an interesting problem. I have a web server behind a slow link, thus I
keep all bigger files compressed (on the file system) and I send them to
clients compressed, with the proper Content-Encoding header.

 My questions are:
 * should wget uncompress gzip on the fly while downloading ?
I think it would be convenient for the user. (This feature should be driven by
the Content-Encoding header only, not by Content-Type.)

 * should firefox should uncompress gzip while opening a local gzip file ?
The problem is, Firefox doesn't investigate the content of the file and its
MIME type guess is based on the file extension only. Firefox also doesn't
consider content encoding at all on local files. (It's a pain to browse locally
saved web pages with compressed SVG images in Firefox.)

 * should i manually uncompress all gzip files before i can open them with 
 firefox ?
That's equivalent to the first question: is Content-Encoding just a transport
encoding (like base64 or quoted-printable in e-mail), or is it a permanent
mark saying `this file is compressed'?

Additionally, I have another approach: the MIME type and Content-Encoding can
be saved into extended file attributes, and then any application can reuse this
info to figure out the type of the data easily, without a file name extension
or file content magic. (This feature has been discussed on this list already
and my patch has been dismissed.)

-- Petr


pgpO5TedS7Ci4.pgp
Description: PGP signature


Re: [Bug-wget] download page-requisites with spanning hosts

2009-04-30 Thread Petr Pisar
On Wed, Apr 29, 2009 at 06:50:11PM -0500, Jake b wrote:
 
 The wGet command I am using:
 wget.exe -p -k -w 15
 "http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330"
 
 It has 2 problems:
 
 1) Rename file:
 
 Instead of creating something like: 912.html or index.html it instead
 becomes: viewtopic@t=29807&postdays=0&postorder=asc&start=27330

That's normal because the server doesn't provide any useful alternative name
via HTTP headers; such a name could otherwise be obtained using wget's
--content-disposition option.

If you want to get the number of the page of the gallery, you need to parse the
HTML code by hand to obtain it (e.g. using grep).

However, I guess a better naming convention is the value of the start URL
parameter (in your example, the number 27330).
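That convention can be sketched in plain shell; the parameter extraction below uses ordinary parameter expansion and is only an illustration, not a wget feature:

```shell
# Derive a local file name from the value of the "start" URL parameter.
URL='http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330'
START=${URL##*start=}               # strip everything up to the last "start="
echo "page-$START.html"             # the name we would pass to wget -O
# wget -O "page-$START.html" "$URL"
```

This prints `page-27330.html`, which can then be handed to `wget -O`.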

 2) images that span hosts are failing.
 
 I have page-resuisites on, but, since some pages are on tinypic, or
 imageshack, etc it is not downloading them. Meaning it looks like
 this:
 
 sijun/page912.php
   imageshack.com/1.png
   tinypic.com/2.png
   randomguyshost.com/3.png
 
 
 Because of this, I cannot simply list all domains to span. I don't
 know all the domains, since people have personal servers.
 
 How do I make wget download all images on the page? I don't want to
 recurse other hosts, or even sijun, just download this page, and all
 images needed to display it.
 
That's not an easy task, especially because all the big desktop images are stored on
other servers. I think wget is not powerful enough to do it all on its own.

I propose using other tools to extract the image URLs and then downloading them
using wget. E.g.:

wget -O - 'http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330' | grep -o -E 'http:\/\/[^"]*\.(jpg|jpeg|png)' | wget -i -

This command downloads the HTML code, uses grep to find all image files
stored on other servers (deciding by file name extensions and absolute
addresses), and finally downloads those images.
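The extraction step can be tried offline on a small, made-up HTML sample (the image hosts below are just the ones mentioned in the question):

```shell
# A tiny HTML sample resembling a forum page with hotlinked images.
cat > sample.html <<'EOF'
<img src="http://imageshack.com/1.png">
<a href="http://tinypic.com/photo.jpg">photo</a>
<a href="http://example.com/page.html">not an image</a>
EOF

# Same regex as above: absolute URLs ending in an image extension.
grep -o -E 'http:\/\/[^"]*\.(jpg|jpeg|png)' sample.html
```

Only the two image URLs are printed; the plain HTML link is skipped.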

There is a small problem: not all of the images still exist, and some servers
return a dummy page instead of a proper error code, so you can sometimes get
non-image files.

 [ This one is a lower priority, but someone might already know how to
 solve this ]
 3) After this is done, I want to loop to download multiple pages. It
 would be cool If I downloaded pages 900 to 912, and each pages next
 link work correctly to link to the local versions.
 
[…]
 Either way, I have a simple script that can convert 900 to 912 into
 the correct URLs, and pausing in between each request.
 
Wrap your script inside a counted for-loop:

for N in $(seq 900 912); do
# the variable N contains the right number here
echo $N
done
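Combining the loop with the download command could look like the sketch below. The 30-posts-per-page stride is an assumption inferred from the start=27330 value in the original URL (911 × 30 = 27330), and the wget line is left commented out so the sketch stays offline:

```shell
# Map gallery page numbers to the forum's "start" offsets.
for N in $(seq 900 912); do
    START=$(( (N - 1) * 30 ))   # assumed stride: 30 posts per page
    echo "page $N -> start=$START"
    # wget -p -k -w 15 "http://forums.sijun.com/viewtopic.php?t=29807&start=$START"
done
```

Uncommenting the wget line (and keeping -w 15 as the pause between requests) turns this into the actual downloader.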

Actually, I assume you use some Unix environment, where you have a powerful
collection of external tools (grep, seq) and amazing shell scripting
abilities (like conditions and loops) available.

-- Petr


pgptCJIEFQP9c.pgp
Description: PGP signature


Re: [Bug-wget] download page-requisites with spanning hosts

2009-04-30 Thread Petr Pisar
On Thu, Apr 30, 2009 at 03:31:21AM -0500, Jake b wrote:
 On Thu, Apr 30, 2009 at 3:14 AM, Petr Pisar petr.pi...@atlas.cz wrote:
 
  On Wed, Apr 29, 2009 at 06:50:11PM -0500, Jake b wrote:
 but I'm not sure how to tell wget what the output html file should be named.
 
wget -O OUTPUT_FILE_NAME

   How do I make wget download all images on the page? I don't want to
   recurse other hosts, or even sijun, just download this page, and all
   images needed to display it.
  
  That's not easy task. Especially because all big desktop images are stored
  on other servers. I think wget is not enough powerfull to do it all on its
  own.
 
 Are you saying because some services show a thumbnail, then click to do the
 full image? 
[…]
 Would it be simpler to say something like: download page 912, recursion
 level=1 ( or 2? ), except for non-image links. ( so it only allows recursion
 on images, ie: downloading randomguyshost.com/3.png
 
You can limit downloads according to file name extensions (option -A); however,
this will remove the main HTML file itself and prevent recursion. And
no, there is no option to download only files referenced from a specific HTML
element like IMG.

Without the -A option, you get a lot of useless files (regardless of spanning).

If you look at the locations of the files you are interested in, you will see
that they are all located outside the Sijun domain. Every page contains only a
small number of such files. Thus it's more efficient, and friendlier to the
servers, to extract these URLs first and then download only them.

 But the problem that it does not span any hosts? Is there a way I can
 achieve this, if I do the same, except, allow span everybody, recurse
 lvl=1, and only recurse non-images.

There is the -H option for spanning. The following wget-only command does what
you want, but as I said, it produces a lot of useless requests and files.

wget -p -l 1 -H 'http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330'
 


  I propose using other tools to extract the image ULRs and then to download
  them using wget. E.g.:
 
 I guess I could use wget to get the html, and parse that for image tags
 manually, but, then I don't get the forum thread comments. Which isn't
 required, but would be nice.

You can do both: extract image URLs and extract comments.

 
  wget -O - 'http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330' | grep -o -E 'http:\/\/[^"]*\.(jpg|jpeg|png)' | wget -i -
 
 Ok, will have to try it out. ( In windows ATM so I can't pipe. )
 
AFAIK the Windows shells command.com and cmd.exe support pipes.

 Using python, and I have dual boot if needed.
 
Or you can execute programs connected through pipes in Python.

-- Petr


pgpkvyaPL0mO2.pgp
Description: PGP signature


[Bug-wget] Re: Fwd: wget file-size parsing bug when doing FTP mirroring?

2009-04-01 Thread Petr Pisar
On 2009-04-01, Niek Bergboer niekbergb...@gmail.com wrote:

 [...snip...]
 -rw-rw-r--   1 guest    everyone 91892436 Sep 13  2003 Sent Items.dbx
 [...snip...]

 Mostly as expected. However, wget then tells me:

 The sizes do not match (local 91892436) -- retrieving.
 ftp://username:*passwo...@server/some/remote/dir/Sent%20Items.dbx

 ... and gets the entire ~90MB file again, despite the fact that the
 sizes are in fact equal. This seems to occur for all files >=
 10,000,000 bytes.

Can you reproduce it?

I tried to reproduce it, but without any success. I mirrored a subdirectory
with a file (having the same size, date, and name as yours) from a vsftpd FTP
server running on Linux to a Linux client using wget-1.11.4. On the second
mirror I got the message:

Remote file no newer than local file `router/pub/TEMP/testdir/Sent
Items.dbx' -- not retrieving.

So for some reason your wget does not check the timestamp.

-- Petr





[Bug-wget] [Bug #23934] Support for subjectAltNames

2009-03-31 Thread Petr Pisar
Hello,

I have hit the problem that wget doesn't support subject alternative names in
HTTPS communication. It's bug #23934, https://savannah.gnu.org/bugs/?23934.

Some time ago I wrote a patch solving this particular problem. It's attached
to the bug report
https://savannah.gnu.org/file/wget-1.11.4-subjectAltNames.diff?file_id=17403.

Could the maintainer (Micah) look at it? Is it acceptable?

-- Petr


pgp1JRxumrDwi.pgp
Description: PGP signature