Re: Timeout never happens with https://www.tesco.com
08.12.2020 8:28, Tom Crane пишет: I suspect you use OpenSSL for TLS and wget is missing the fcntl() call from src/openssl.c. That was it. Checking the Slackware distro's build script http://slackware.osuosl.org/slackware64-current/source/n/wget/wget.SlackBuild shows it specifies, Release 1.20.3 doesn't suppose fcntl() call, when ssl=openssl, it uses run_with_timeout(). And there was/is a bug in src/openssl.c: openssl_peek() - SSL_peek() is called directly wo run_with_timeout() - reason of "the hang" In current trunk src this has been fixed: - ./configure --with-ssl=openssl write(2, "\n---request begin---\nGET /groceries/en-GB/products/280200433 HTTP/1.1\r\nUser-Agent: Wget/1.20.3.206-794b7\r\nAccept: */*\r\nAccept-Encoding: identity\r\nHost: www.tesco.com\r\nConnection: Keep-Alive\r\n\r\n---request end---\n", 211 ---request begin--- GET /groceries/en-GB/products/280200433 HTTP/1.1 User-Agent: Wget/1.20.3.206-794b7 Accept: */* Accept-Encoding: identity Host: www.tesco.com Connection: Keep-Alive ---request end--- ) = 211 _newselect(5, NULL, [4], NULL, {tv_sec=1, tv_usec=0}) = 1 (out [4], left {tv_sec=0, tv_usec=94}) write(4, "\...", 194) = 194 ioctl(0, TIOCGPGRP, [20708]) = 0 getpgrp() = 20708 write(2, "HTTP request sent, awaiting response... ", 40HTTP request sent, awaiting response... ) = 40 fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR) fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 clock_gettime64(CLOCK_MONOTONIC, {tv_sec=1225, tv_nsec=8033473}) = 0 read(4, 0x1d70e03, 5) = -1 EAGAIN (Resource temporarily unavailable) _newselect(5, [4], NULL, NULL, {tv_sec=1, tv_usec=0}) = 1 (in [4], left {tv_sec=0, tv_usec=946914}) clock_gettime64(CLOCK_MONOTONIC, {tv_sec=1225, tv_nsec=61924252}) = 0 read(4, "\27\3\3\1\n", 5) = 5 read(4, "f\...", 266) = 266 read(4, "\27\3\3\1\n", 5) = 5 read(4, "\...", 266) = 266 read(4, 0x1d70e03, 5) = -1 EAGAIN (Resource temporarily unavailable) _newselect(5, [4], NULL, NULL, {tv_sec=0, tv_usec=946109}) = 0 (Timeout) fcntl64(4, F_SETFL, O_RDWR) = 0 ioctl(0, TIOCGPGRP, [20708]) = 0 getpgrp() = 20708 write(2, "Read error (Connection timed out) in headers.\n", 46Read error (Connection timed out) in headers. ) = 46 write(4, "\27\3\3\0\23\307\257=\26\16ha\2204\310*R\302\225'_d\205\360", 24) = 24 close(4) = 0 ioctl(0, TIOCGPGRP, [20708]) = 0 getpgrp() = 20708 write(2, "Closed 4/SSL 0x01d6a310\n", 24Closed 4/SSL 0x01d6a310 ) = 24 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1229, ...}) = 0 ioctl(0, TIOCGPGRP, [20708]) = 0 getpgrp() = 20708 write(2, "Giving up.\n\n", 12Giving up. ) = 12 close(3) = 0 exit_group(4) = ? +++ exited with 4 +++ - ./configure --with-ssl=openssl CPPFLAGS="-DOPENSSL_RUN_WITHTIMEOUT" write(2, "\n---request begin---\nGET /groceries/en-GB/products/280200433 HTTP/1.1\r\nUser-Agent: Wget/1.20.3.206-794b7\r\nAccept: */*\r\nAccept-Encoding: identity\r\nHost: www.tesco.com\r\nConnection: Keep-Alive\r\n\r\n---request end---\n", 211 ---request begin--- GET /groceries/en-GB/products/280200433 HTTP/1.1 User-Agent: Wget/1.20.3.206-794b7 Accept: */* Accept-Encoding: identity Host: www.tesco.com Connection: Keep-Alive ---request end--- ) = 211 _newselect(5, NULL, [4], NULL, {tv_sec=1, tv_usec=0}) = 1 (out [4], left {tv_sec=0, tv_usec=93}) write(4, "\...", 194) = 194 ioctl(0, TIOCGPGRP, [10869]) = 0 getpgrp() = 10869 write(2, "HTTP request sent, awaiting response... ", 40HTTP request sent, awaiting response... ) = 40 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 rt_sigaction(SIGALRM, {sa_handler=0x454ba0, sa_mask=[ALRM], sa_flags=SA_RESTART}, {sa_handler=SIG_DFL, sa_mask=[ALRM], sa_flags=SA_RESTART}, 8) = 0 setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=1, tv_usec=0}}, NULL) = 0 read(4, "\27\3\3\1\n", 5) = 5 read(4, "s\...", 266) = 266 read(4, "\27\3\3\1\n", 5) = 5 read(4, "~\...", 266) = 266 read(4, 0x1e0edf3, 5) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigaction(SIGALRM, {sa_handler=SIG_DFL, sa_mask=[ALRM], sa_flags=SA_RESTART}, {sa_handler=0x454ba0, sa_mask=[ALRM], sa_flags=SA_RESTART}, 8) = 0 ioctl(0, TIOCGPGRP, [10869]) = 0 getpgrp() = 10869 write(2, "Read error (Connection timed out) in headers.\n", 46Read error (Connection timed out)
Re: Timeout never happens with https://www.tesco.com
On Mon, Dec 7, 2020 at 9:29 PM Tom Crane wrote: > > On Mon, 7 Dec 2020, Petr Pisar wrote: > > > V??Mon, Dec 07, 2020 at 03:44:32AM +,??Tom Crane napsal(a): > ... > > > > I suspect you use OpenSSL for TLS and wget is missing the fcntl() call from > > src/openssl.c. > > That was it. Checking the Slackware distro's build script > http://slackware.osuosl.org/slackware64-current/source/n/wget/wget.SlackBuild > shows it specifies, > > ./configure --with-ssl=openssl There's also --with-libssl-prefix when OpenSSL is not in a standard location. I believe the same option is needed when using an updated GnuTLS. (I build against OpenSSL because OpenSSL is easier to update. OpenSSL has fewer dependencies than GnuTLS. I don't know how --with-libssl-prefix behaves with GnuTLS). You might need --with-libssl-prefix on a system like OS X. OS X provides an antique version OpenSSL 0.9.x, so you usually need to build and install OpenSSL 1.1.x. For example: # OS X 10.5 (PowerMac) $ /usr/bin/openssl version OpenSSL 0.9.7l 28 Sep 2006 # OS X 10.9 $ /usr/bin/openssl version OpenSSL 0.9.8zg 14 July 2015 # OS X 10.12 $ /usr/bin/openssl version OpenSSL 0.9.8zh 14 Jan 2016 Jeff
Re: Timeout never happens with https://www.tesco.com
On Mon, 7 Dec 2020, Petr Pisar wrote: V??Mon, Dec 07, 2020 at 03:44:32AM +,??Tom Crane napsal(a): [analysis cut] Many thanks for your analysis. I suspect you use OpenSSL for TLS and wget is missing the fcntl() call from src/openssl.c. That was it. Checking the Slackware distro's build script http://slackware.osuosl.org/slackware64-current/source/n/wget/wget.SlackBuild shows it specifies, ./configure --with-ssl=openssl Changing that to '--with-ssl=gnutls' [default] fixed the problem. Best regards Tom.
Re: Timeout never happens with https://www.tesco.com
07.12.2020 9:44, Tom Crane пишет: On Sun, 6 Dec 2020, Petr Pisar wrote: V Sun, Dec 06, 2020 at 06:48:13PM +, Tom Crane napsal(a): tmp$ strace -s999 -f -e trace=network wget -S -v --no-check-certificate --waitretry=1 --tries=1 --timeout=1 --read-timeout=1 --wait=1 -O /tmp/280200433.tmp https://www.tesco.com/groceries/en-GB/products/280200433 [...] connect(4, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("104.82.193.249")}, 16) = 0 connected. HTTP request sent, awaiting response... but the timeout never happens, Do not restrict strace to network syscalls. We cannot see where it blocks. The timeout should occur in select() syscall. -- Petr Sure and thanks for the follow-up. Please see, https://www.mklab.rhul.ac.uk/~tom/tempwget/wget-strace.log After waiting several minutes I -C out of the strace/wget session. Out of interest, can you reproduce the hang with that URL on your system/wget version? Tom. timeout happens .. ---request begin--- GET /groceries/en-GB/products/280200433 HTTP/1.1 User-Agent: Wget/1.20.3.206-794b7 Accept: */* Accept-Encoding: identity Host: www.tesco.com Connection: Keep-Alive ---request end--- ) = 211 _newselect(5, NULL, [4], NULL, {tv_sec=1, tv_usec=0}) = 1 (out [4], left {tv_sec=0, tv_usec=94}) sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\27\3\3\0\275\16\236S\263\307\7k\274\357\371\340\273\10x\264s\336\206\335WY!5\363w\207\255X\211\230\21\342\245\254\7\265\266\277t\335\2349\204o\243\5\373eE\240\235\210\224\4<,g\256aE\327z\4\206\33\346\320\371\344\350\204\351N,z\203\264\4\\kN\205\363\324\251\366F\363s\332\272\345\355|]\274\202\35\32\210;Y\262", iov_len=194}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 194 ioctl(0, TIOCGPGRP, [21671]) = 0 getpgrp() = 21671 write(2, "HTTP request sent, awaiting response... ", 40HTTP request sent, awaiting response... ) = 40 fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR) fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 clock_gettime64(CLOCK_MONOTONIC, {tv_sec=3245, tv_nsec=533340227}) = 0 recv(4, "\27\3\3\1\n", 5, 0) = 5 recv(4, "\272\377,?\211\246-\245\253\376\247\263\317\247\343j \244\nB\377\353\305\366/\354\341\204\377\243\226\354\205\272\203\316F\"\10\276N\10\22\37\312\273\335\347e\33)\276;\326\215|Ne5!\325m\326Nzf\233m\34\7O\254&6;\326\345-DE>\336\372g@\231\243\325\307\213S\321\235\351\25\376\374\10oq\37\265i\32\fS\6\2262]\34\315\201\20\353\t\3\36s\231&3>\1\211\302\276\347\255\223\211\351*z\2651Q\223\231~\247\311\22\16\332\f;h\212\277o\310\222U\7Y\274p\331lA\222\337\206\312\315K\34)\r\3734\256H\316bJ+\30o\5\370BH\252\315O\373(s\316\22xY\345\266w:\\\316$\250\37\213\214)D\33\244\251\336\0\374\207\336\261\241:\222\260\324\333%\346VlE\304\213\240\303=\2420\262\34\372\25\357\323J\272\245u\202\310-\277a]\n\256\376c;\17~@t\306|\262O\36\353X", 266, 0) = 266 clock_gettime64(CLOCK_REALTIME, {tv_sec=1607339907, tv_nsec=865035973}) = 0 _newselect(5, [4], NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) fcntl64(4, F_SETFL, O_RDWR) = 0 ioctl(0, TIOCGPGRP, [21671]) = 0 getpgrp() = 21671 write(2, "Read error (Connection timed out) in headers.\n", 46Read error (Connection timed out) in headers. ) = 46 close(4) = 0 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1229, ...}) = 0 ioctl(0, TIOCGPGRP, [21671]) = 0 getpgrp() = 21671 write(2, "Giving up.\n\n", 12Giving up. ) = 12 close(3) = 0 exit_group(4) = ? +++ exited with 4 +++ wget -S -dv --no-check-certificate --waitretry=1 --tries=1 --timeout=1 --read-timeout=1 --wait=1 -O ./280200433.tmp https://www.tesco.com/groceries/en-GB/products/280200433 Setting --verbose (verbose) to 1 Setting --verbose (verbose) to 1 Setting --check-certificate (checkcertificate) to 0 Setting --check-certificate (checkcertificate) to 0 Setting --waitretry (waitretry) to 1 Setting --waitretry (waitretry) to 1 Setting --tries (tries) to 1 Setting --tries (tries) to 1 Setting --timeout (timeout) to 1 Setting --timeout (timeout) to 1 Setting --read-timeout (readtimeout) to 1 Setting --read-timeout (readtimeout) to 1 Setting --wait (wait) to 1 Setting --wait (wait) to 1 Setting --output-document (outputdocument) to ./280200433.tmp Setting --output-document (outputdocument) to ./280200433.tmp DEBUG output created by Wget 1.20.3 on linux-gnu. Reading HSTS entries from /home/slv/.wget-hsts URI encoding = ‘UTF-8’ --2020-12-07 17:16:33-- https://www.tesco.com/groceries/en-GB/products/280200433 Certificates loaded: 252 Resolving www.tesco.com (www.tesco.com)... 23.54.9.112 Caching www.tesco.com => 23.54.9.112 Connecting to www.tesco.com (www.tesco.com)|23.54.9.112|:443... connected. Created socket 4. Releasing 0x02050ba0 (new refcount 1). ---request begin--- GET
Re: Timeout never happens with https://www.tesco.com
On Sun, 6 Dec 2020, Petr Pisar wrote: V Sun, Dec 06, 2020 at 06:48:13PM +, Tom Crane napsal(a): tmp$ strace -s999 -f -e trace=network wget -S -v --no-check-certificate --waitretry=1 --tries=1 --timeout=1 --read-timeout=1 --wait=1 -O /tmp/280200433.tmp https://www.tesco.com/groceries/en-GB/products/280200433 [...] connect(4, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("104.82.193.249")}, 16) = 0 connected. HTTP request sent, awaiting response... but the timeout never happens, Do not restrict strace to network syscalls. We cannot see where it blocks. The timeout should occur in select() syscall. -- Petr Sure and thanks for the follow-up. Please see, https://www.mklab.rhul.ac.uk/~tom/tempwget/wget-strace.log After waiting several minutes I -C out of the strace/wget session. Out of interest, can you reproduce the hang with that URL on your system/wget version? Tom.
Re: Timeout never happens with https://www.tesco.com
V Sun, Dec 06, 2020 at 06:48:13PM +, Tom Crane napsal(a): > tmp$ strace -s999 -f -e trace=network wget -S -v --no-check-certificate > --waitretry=1 --tries=1 --timeout=1 --read-timeout=1 --wait=1 -O > /tmp/280200433.tmp https://www.tesco.com/groceries/en-GB/products/280200433 [...] > connect(4, {sa_family=AF_INET, sin_port=htons(443), > sin_addr=inet_addr("104.82.193.249")}, 16) = 0 > connected. > HTTP request sent, awaiting response... > > but the timeout never happens, Do not restrict strace to network syscalls. We cannot see where it blocks. The timeout should occur in select() syscall. -- Petr signature.asc Description: PGP signature
Timeout never happens with https://www.tesco.com
I am using the following command in a script (shown here running under strace), tmp$ strace -s999 -f -e trace=network wget -S -v --no-check-certificate --waitretry=1 --tries=1 --timeout=1 --read-timeout=1 --wait=1 -O /tmp/280200433.tmp https://www.tesco.com/groceries/en-GB/products/280200433 --2020-12-06 17:52:25-- https://www.tesco.com/groceries/en-GB/products/280200433 Resolving www.tesco.com (www.tesco.com)... socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 4 connect(4, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 4 connect(4, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 4 setsockopt(4, SOL_IP, IP_RECVERR, [1], 4) = 0 connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, 16) = 0 sendmmsg(4, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\357\317\1\0\0\1\0\0\0\0\0\0\3www\5tesco\3com\0\0\1\0\1", iov_len=31}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=31}, {msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\376\303\1\0\0\1\0\0\0\0\0\0\3www\5tesco\3com\0\0\34\0\1", iov_len=31}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=31}], 2, MSG_NOSIGNAL) = 2 recvfrom(4, "\357\317\201\0\0\1\0\1\0\0\0\0\3www\5tesco\3com\0\0\1\0\1\300\f\0\1\0\1\0\0\1,\0\4hR\301\371", 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, [28->16]) = 47 recvfrom(4, "\376\303\201\2\0\1\0\0\0\0\0\0\3www\5tesco\3com\0\0\34\0\1", 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.254")}, [28->16]) = 31104.82.193.249 Connecting to www.tesco.com (www.tesco.com)|104.82.193.249|:443... socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4 connect(4, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("104.82.193.249")}, 16) = 0 connected. HTTP request sent, awaiting response... but the timeout never happens, it just hangs indefinitely with the TCP connection remaining in the ESTABLISHED state according to netstat. I am wondering if the problem is something to do with the initial part of the SSL negotiation since it happens very early on after the connection is established, and checking old messages in this list, there were bug reports about this a few years back. Accessing the same URL from a web browser works fine, rendering the content OK. This may of course involve client-side scripting but that is another matter. I am assuming wget should be able to read and store whatever HTML is produced and not just hang. I know I could run wget under the Gnutils timeout command as a workaround but I would prefer to get wget's timeout to work. Any thoughts? Thanks Tom Crane System/wget details: OS: Slackware64-current Kernel: 5.4.6 $ wget --version GNU Wget 1.20.3 built on linux-gnu. -cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls +ntlm +opie +psl +ssl/openssl Wgetrc: /home/tom/.wgetrc (user) /etc/wgetrc (system) Locale: /usr/share/locale Compile: gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc" -DLOCALEDIR="/usr/share/locale" -I. -I../lib -I../lib -DHAVE_LIBSSL -DNDEBUG -O2 -fPIC Link: gcc -DHAVE_LIBSSL -DNDEBUG -O2 -fPIC -lpcre2-8 -luuid -lidn2 -lssl -lcrypto -lz -lpsl ftp-opie.o openssl.o http-ntlm.o ../lib/libgnu.a -lunistring $ cat ~/.wgetrc robots = off user-agent = Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0