Re: Does wget support SOCKS proxy?

2021-01-13 Thread Peng Yu
I know about the environment variables http_proxy and https_proxy, but I don't know whether they can be used for a SOCKS proxy, because the manpage does not mention SOCKS at all. On 1/13/21, Jeffrey Walton wrote: > On Wed, Jan 13, 2021 at 5:05 PM Peng Yu wrote: >> >> Hi

Does wget support SOCKS proxy?

2021-01-13 Thread Peng Yu
Hi, I can't tell from the manpage whether wget supports SOCKS proxies. Does wget support SOCKS? If it does, could anybody let me know how to use it on the command line (not via .wgetrc)? Thanks. -- Regards, Peng
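wget (unlike curl with its --socks5 option) has no native SOCKS support, and http_proxy/https_proxy only point at HTTP proxies. A common workaround is to put an HTTP-to-SOCKS bridge such as privoxy in front of the SOCKS endpoint. A sketch; the SOCKS address 127.0.0.1:1080 and privoxy's default listen port 8118 are assumptions, not taken from the thread:

```shell
# privoxy can forward everything to a SOCKS5 endpoint; this composes
# the forwarding rule you would add to /etc/privoxy/config
# (hypothetical SOCKS endpoint):
socks_host=127.0.0.1
socks_port=1080
forward_line="forward-socks5 / ${socks_host}:${socks_port} ."
echo "$forward_line"

# With the bridge running, wget then uses its ordinary HTTP proxy
# variables:
#   https_proxy=http://127.0.0.1:8118 wget -q -O page.html https://example.com/
```

A socksifier such as proxychains4 (`proxychains4 wget ...`) is an alternative that needs no bridge at all.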

Re: Meaning of --timeout

2020-08-14 Thread Peng Yu
orrect error code to distinguish whether it is a timeout error or an error from wget? > > Regards, Tim > > On 09.08.20 13:54, Peng Yu wrote: >> I saw that in the man. What are the rest of the time besides dns-time, >> connect-time, read-time? Thanks. >> >

Re: Meaning of --timeout

2020-08-09 Thread Peng Yu
> setting --dns-timeout + --connect-timeout + --read-timeout. > > For such tasks you can easily use the `timeout` command from GNU coreutils. > > Regards, Tim > > On 08.08.20 21:05, Peng Yu wrote: > > I want to set the time by which wget must finish. But it seems > > --t

Meaning of --timeout

2020-08-08 Thread Peng Yu
I want to set the time by which wget must finish, but it seems --timeout doesn't do so. If I set it to N, wget cannot guarantee finishing within N seconds. Could anybody explain why --timeout cannot be used for this purpose, and how to achieve this goal? -- Regards, Peng
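As the reply in this thread suggests, --timeout (and the per-phase --dns-timeout/--connect-timeout/--read-timeout it sets) bounds individual network operations, so a run with retries or many reads can still exceed N seconds overall. For a hard wall-clock deadline, wrapping wget in GNU coreutils `timeout` works; a sketch with an illustrative URL and limit:

```shell
# Hard 30-second deadline on the entire wget run; `timeout` exits
# with 124 when the deadline expires:
#   timeout 30 wget -q -O out.html https://example.com/

# The exit-code convention, demonstrated with a stand-in command
# that sleeps past a 1-second deadline:
status=0
timeout 1 sleep 5 || status=$?
echo "$status"   # 124: killed by the deadline, not an error from the command
```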

How to send a POST request by wget same to a httpie request?

2020-07-01 Thread Peng Yu
$ http --form POST localhost:9000 f...@1.txt The above httpie (https://httpie.org/) command sends the following POST request. Could anybody let me know the equivalent wget command to achieve the same HTTP request? Thanks. POST / HTTP/1.1 Host: localhost:9000 User-Agent: HTTPie/2.2.0
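httpie's --form sends a multipart/form-data body. wget has no multipart option, but the body can be assembled by hand and posted with --post-file plus an explicit Content-Type header. A sketch; the field name f, the file 1.txt, the boundary string, and the localhost:9000 endpoint mirror the example above but are otherwise assumptions:

```shell
# Build a minimal multipart/form-data body manually:
boundary='----wgetform12345'
body=$(mktemp)
{
  printf '%s\r\n' "--$boundary"
  printf 'Content-Disposition: form-data; name="f"; filename="1.txt"\r\n'
  printf 'Content-Type: text/plain\r\n\r\n'
  printf 'hello world\r\n'            # the contents of 1.txt would go here
  printf '%s\r\n' "--$boundary--"
} > "$body"

# Post it with the matching Content-Type header:
#   wget --header="Content-Type: multipart/form-data; boundary=$boundary" \
#        --post-file="$body" -O- http://localhost:9000/
grep -c 'Content-Disposition' "$body"
```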

Refresh header and Wget

2020-06-22 Thread Peng Yu
Could processing of the Refresh header be added to Wget? Thanks. On 6/22/20, Tim Rühsen wrote: > Hi, > > the Refresh header is not processed by Wget. > > Regards, Tim > > On 22.06.20 00:39, Peng Yu wrote: >> I see Refresh response header as described below.

[no subject]

2020-06-21 Thread Peng Yu
I see the Refresh response header as described below. It seems that wget does not support it. Could anybody confirm whether this is the case? Thanks. http://www.otsukare.info/2015/03/26/refresh-http-header -- Regards, Peng

how to capture "20 redirections exceeded." error?

2020-05-11 Thread Peng Yu
Hi, wget returns 8 when it sees "20 redirections exceeded.". But how can the program calling wget capture this specific error? Thanks. -- Regards, Peng
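Since exit code 8 covers every server-side error, distinguishing the redirect case needs wget's log text as well as its exit status. A sketch combining the two (the log filename is illustrative; the message text is the one wget prints):

```shell
# Real invocation: capture wget's messages, then combine exit status
# and log contents:
#   wget -o wget.log "$url"; status=$?
#   if [ "$status" -eq 8 ] && grep -q 'redirections exceeded' wget.log; then
#       echo "redirect limit hit"
#   fi

# The detection logic, demonstrated with a canned log line:
log=$(mktemp)
echo '20 redirections exceeded.' > "$log"
status=8
if [ "$status" -eq 8 ] && grep -q 'redirections exceeded' "$log"; then
    echo "redirect limit hit"
fi
```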

Why doesn't wget download data when the HTTP status is 4xx?

2020-05-10 Thread Peng Yu
Hi, $ curl --verbose ... ... < HTTP/1.1 403 Forbidden ... < { [359 bytes data] 100 814980 814980 0 50214 0 --:--:-- 0:00:01 --:--:-- 50183 When the above 403 occurs, curl can still download the data, but wget just gives up. Is there a way to let wget download the data

The relation of HTTP status code and wget exit status code

2020-05-09 Thread Peng Yu
It is not clear what the mapping between HTTP status codes and wget exit status codes is. I know that HTTP 404 results in exit status 8, but I am looking for a complete mapping table. Is it available somewhere? Thanks. -- Regards, Peng
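There is no per-HTTP-status mapping: the manpage (wget 1.12 and later) defines nine exit codes, and every HTTP 4xx/5xx response collapses into the single code 8, so the HTTP status is not recoverable from the exit code alone. The documented table, captured in a small helper:

```shell
# Exit codes as documented in the wget manpage:
explain_wget_exit() {
  case "$1" in
    0) echo "no problems occurred" ;;
    1) echo "generic error code" ;;
    2) echo "parse error (command line or rc files)" ;;
    3) echo "file I/O error" ;;
    4) echo "network failure" ;;
    5) echo "SSL verification failure" ;;
    6) echo "username/password authentication failure" ;;
    7) echo "protocol errors" ;;
    8) echo "server issued an error response (any HTTP 4xx/5xx)" ;;
    *) echo "unknown" ;;
  esac
}
explain_wget_exit 8
```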

How to get 404 error from wget?

2020-05-09 Thread Peng Yu
Hi, When a 404 error occurs, wget returns 8. How can I tell that the HTTP error code was 404 when wget returns exit status 8? Thanks. -- Regards, Peng
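Because exit code 8 hides the HTTP status, the usual trick is to have wget print the server headers with -S and parse the last status line (the last one matters, since redirects produce several). A sketch using the -S/-o- combination that appears elsewhere in these threads:

```shell
# Real invocation: -S writes "  HTTP/1.1 404 Not Found"-style lines
# into the log, which -o- sends to stdout:
#   code=$(wget -S -o- -O /dev/null "$url" \
#          | awk '/^  HTTP\//{c=$2} END{print c}')

# The parsing step, demonstrated on a canned -S transcript:
headers='  HTTP/1.1 301 Moved Permanently
  HTTP/1.1 404 Not Found'
code=$(printf '%s\n' "$headers" | awk '/^  HTTP\//{c=$2} END{print c}')
echo "$code"   # 404
```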

Can --post-file support pipes?

2020-04-20 Thread Peng Yu
Hi, It seems that --post-file supports only regular files, not pipes. Can it be made to support pipes? Thanks. -- Regards, Peng

Can multiple wget processes use the same cookie file for --load-cookies cookie.txt --keep-session-cookies --save-cookies cookie.txt?

2020-02-16 Thread Peng Yu
Hi, I'd like to use multiple wget processes with the same cookie file, but I am afraid multiple writes to the same cookie file may end up corrupting it. Is corruption of the cookie file possible? If so, is there a way to allow multiple wget processes to update the cookie file safely? Thanks.
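wget is not documented to lock cookie files, so concurrent --save-cookies writes to one file can interleave. Serializing the runs externally with flock(1) is a reasonable precaution; a sketch (the lock path is illustrative):

```shell
# Serialize every wget that touches the shared cookie file:
#   flock /tmp/cookies.lock wget --load-cookies cookie.txt \
#         --keep-session-cookies --save-cookies cookie.txt "$url"

# flock's mutual exclusion, demonstrated with two competing writers:
lock=$(mktemp)
out=$(mktemp)
flock "$lock" sh -c "echo first  >> $out"
flock "$lock" sh -c "echo second >> $out"
wc -l < "$out"   # both lines arrive intact
```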

Re: Can wget work with google chrome cookie?

2020-02-16 Thread Peng Yu
cookies.txt by NOT having a leading dot for the domain. > E.g. '.example.com' means NOT hostOnly (= cookie is valid also for all > subdomains). > > Our cookie format does not have 'sameSite' and there is no 'storeId'. > > Everything else should be mapped 1-1. > > Regards, Tim &g

Can wget work with google chrome cookie?

2020-02-15 Thread Peng Yu
Hi, https://apple.stackexchange.com/questions/232433/where-are-google-chrome-cookies-stored-on-a-mac "Session cookies are only stored in memory, but the rest are in ~/Library/Application Support/Google/Chrome/Default/Cookies, it's an sqlite3 database." I see that session cookies are not stored

Override or append entries in cookie file with --header "cookie: xxx"?

2020-02-15 Thread Peng Yu
Hi, I have a cookie file. I'd like to override an entry in it or specify an extra cookie entry with -H. But it seems that wget just ignores what is in the cookie file if -H "cookie: xxx" is specified. Is there a way to override entries in the cookie file? Thanks. $ wget --load-cookies

Re: Are --load-cookies and --save-cookies effective at the same time?

2020-02-15 Thread Peng Yu
> > $ wget --load-cookie=cookies.txt --save-cookies=cookies.txt google.de > $ sum cookies.txt > 27483 1 > > What version of wget do you use ? > > Regards, Tim > > On 14.02.20 23:11, Peng Yu wrote: >> Can the options for --load-cookies and --save-cookies

Re: Are --load-cookies and --save-cookies effective at the same time?

2020-02-14 Thread Peng Yu
txt isn't updated. You can run with --debug and > maybe that gives you a hint. > > Regards, Tim > > On 14.02.20 22:46, Peng Yu wrote: >> Hi, >> >> I want to load cookies at the beginning of wget run and save the >> cookies at the finish. >> >> I spec

Are --load-cookies and --save-cookies effective at the same time?

2020-02-14 Thread Peng Yu
Hi, I want to load cookies at the beginning of the wget run and save the cookies at the end. I specified the following options, but it seems that the timestamp of the cookies.txt file is not changed when the wget run finishes. Is this normal? Or if the cookies.txt file is not changed, the time stamp

Re: wget GET vs HEAD

2020-02-03 Thread Peng Yu
> No. Wget does not perform this optimization. As mentioned by Tim, there > are many valid usecases where one would want to actually download the > body, but not store it. > However, some servers do not > follow this (*cough* Google *cough*). If it works fine for you, yes, you > can simply use

wget GET vs HEAD

2020-02-03 Thread Peng Yu
Hi, I'd like to understand the following two commands. One uses GET, the other uses HEAD. wget -q -O /dev/null -S -o- URL wget -q --spider -S -o- URL Does the first still download the response body? Does wget know that its output is /dev/null, so that it just downloads the header and ignores the response body?

Re: What does status bar show when --compression=auto is enabled?

2020-01-25 Thread Peng Yu
I don't know which URLs are compressed. For example, I can't find one on http://httpbin.org that is compressed. Do you know which one is good for testing purposes? Thanks. On 1/25/20, Tim Rühsen wrote: > On 25.01.20 19:09, Peng Yu wrote: >> Hi, >> >> When --compre

What does status bar show when --compression=auto is enabled?

2020-01-25 Thread Peng Yu
Hi, When --compression=auto is enabled, is the size shown in the status bar the raw file size or the compressed file size? Thanks. -- Regards, Peng

How to control the filename shown by wget in the progress bar?

2020-01-25 Thread Peng Yu
Hi, wget shows the same name (in this case tmp.ItHx8WTPu7) in the progress bar as given by the -O option. $ wget -q --show-progress -O tmp.ItHx8WTPu7 -- https://httpbin.org/get tmp.ItHx8WTPu7 100%[>] 306 --.-KB/s in 0s I'd like to set it to something

What can cause exit_status 8?

2019-11-02 Thread Peng Yu
Hi, Besides reaching the maximum number of redirects, is there any other possibility for exit status 8? Can I have a specific exit status code just for reaching the maximum number of allowed redirections? Thanks. $ wget -q --spider -S -o /dev/null https://httpbin.org/absolute-redirect/3 || echo $? $ wget -q

[Bug-wget] Standard cookie file extension

2019-10-23 Thread Peng Yu
Hi, I am wondering if there is a standard file extension for cookie files written by wget. So far, I have only seen filenames like cookie.txt or cookies.txt, so the extension is just .txt. But .txt is not a specific extension for cookie files. I'd like an extension dedicated to cookie files for

Re: [Bug-wget] Is there an option same as curl --compressed?

2019-09-27 Thread Peng Yu
t uses homebrew to install > dependencies. > > Regards, Tim > > > On 27.09.19 15:50, Peng Yu wrote: >> I don't find wget2 on homebrew. Can anybody make a formula for it? >> >> On Fri, Sep 27, 2019 at 5:53 AM Tim Rühsen > <mailto:tim.rueh...@gmx.de&g

Re: [Bug-wget] Is there an option same as curl --compressed?

2019-09-27 Thread Peng Yu
p > > Regards, Tim > > On 27.09.19 05:03, Peng Yu wrote: > > Hi, > > > > curl has the option `--compressed` which will decompress the data > > automatically. But I don't think wget's option --compression can > > automatically decompress the data. > >

Re: [Bug-wget] Is there an option same as curl --compressed?

2019-09-27 Thread Peng Yu
ts OSX and we build Wget2 on it, using homebrew to install > dependencies. So anyone making up a homebrew formula might take it as > quick starter. > > On 27.09.19 17:20, Peng Yu wrote: >> What is the pros and cons of TravisCI vs homebrew? >> >> On 9/27/19, Tim Rühsen wrot

[Bug-wget] Is there an option same as curl --compressed?

2019-09-26 Thread Peng Yu
Hi, curl has the option `--compressed`, which decompresses the data automatically. But I don't think wget's --compression option automatically decompresses the data. Is there a way to let wget automatically decompress the data? Thanks. -- Regards, Peng
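Whether --compression decodes the response depends on the wget build (the option appeared in 1.19.2 and is not always compiled in). A build-independent fallback is to request gzip explicitly and inflate it in the pipeline; a sketch:

```shell
# Ask for gzip and decompress it yourself (note: a server that
# ignores Accept-Encoding will send identity, and gunzip will fail):
#   wget -q --header='Accept-Encoding: gzip' -O- "$url" | gunzip > page.html

# The inflate step of that pipeline, demonstrated locally:
out=$(printf 'hello\n' | gzip | gunzip)
echo "$out"
```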

Re: [Bug-wget] Does `wget -q -O /dev/null -S -o- url` ignore response body?

2019-08-12 Thread Peng Yu
, curl displays the file size and last modification time only. On 8/9/19, Tim Rühsen wrote: > On 09.08.19 18:06, Peng Yu wrote: >> Hi, >> >> I just want to retrieve the response header instead of the response body. >> >> Does `wget -q -O /dev/null -S -o- url` st

[Bug-wget] Why limit --max-redirect can results in exit_status 8?

2019-08-09 Thread Peng Yu
Hi, Exit status 8 means "Server issued an error response." But the exit status should arguably not be considered an error in this specific use case. Is there a way to make wget return 0 instead? $ wget -q --spider -S -o /dev/null --max-redirect 1 https://httpbin.org/absolute-redirect/3 || echo
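If stopping at the redirect limit is the expected outcome, the caller can rewrite exit code 8 back to success in the shell. A sketch:

```shell
# Real invocation, treating "server issued an error response" as OK:
#   wget -q --spider -S -o /dev/null --max-redirect 1 "$url"
#   s=$?
#   [ "$s" -eq 8 ] && s=0

# The rewrite step with a canned status:
s=8
[ "$s" -eq 8 ] && s=0
echo "$s"   # 0
```

Note that this also masks genuine HTTP 4xx/5xx errors, which share exit code 8; inspecting wget's log distinguishes them.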

[Bug-wget] Does `wget -q -O /dev/null -S -o- url` ignore response body?

2019-08-09 Thread Peng Yu
Hi, I just want to retrieve the response header instead of the response body. Does `wget -q -O /dev/null -S -o- url` still download the response body, but then dump it to /dev/null? Or is wget smart enough to know that the destination is /dev/null, so that it will not download the response body at

[Bug-wget] --content-disposition not support some cases

2019-08-08 Thread Peng Yu
Hi, I have an HTTP response header which contains the following lines. Content-Type: application/pdf; name=main.pdf content-disposition: inline; filename=main.pdf But wget --content-disposition is not able to name the output file accordingly, although this is supported by Chrome.

Re: [Bug-wget] Does wget have the "outstanding read data" problem?

2019-05-18 Thread Peng Yu
> Just try it out and report back if there is a problem. It is hard to reproduce the problem. -- Regards, Peng

[Bug-wget] How to print just the download time to stderr along with a tag when the output is to stdout?

2019-05-18 Thread Peng Yu
Hi, When I use wget to download and output to stdout, I want to print a tag (any string that a user can specify in the command line) and the download time (separated by a TAB) to stderr. Is this possible with wget? Thanks. -- Regards, Peng
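wget has no built-in hook for this, but a small wrapper can emit a tag and the elapsed seconds, TAB-separated, on stderr while the body still flows to stdout. A sketch (the function name and the sleep stand-in are assumptions; the real wget call is shown commented):

```shell
# Emit "<tag><TAB><seconds>" on stderr around a download whose body
# goes to stdout:
fetch_timed() {
  tag=$1
  url=$2
  start=$(date +%s)
  # wget -q -O- "$url"                # real download: body -> stdout
  sleep 1                             # offline stand-in for the download
  end=$(date +%s)
  printf '%s\t%s\n' "$tag" "$((end - start))" >&2
}

# Usage: body to a file, timing line appended to a TSV:
tsv=$(mktemp)
fetch_timed demo http://example.com/ 2> "$tsv"
cat "$tsv"
```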

Re: [Bug-wget] Does wget have the "outstanding read data" problem?

2019-05-18 Thread Peng Yu
On 5/18/19, Tim Rühsen wrote: > On 17.05.19 23:09, Peng Yu wrote: >> Hi, >> >> curl has this problem "(18) transfer closed with outstanding read data >> remaining". Does wget have a similar problem like this? Thanks. >> >> https://curl.haxx.se/

Re: [Bug-wget] Does wget have the "outstanding read data" problem?

2019-05-17 Thread Peng Yu
The question is whether wget can fix the problem by resuming from where it failed; curl cannot do this, as mentioned in the email. On Fri, May 17, 2019 at 4:20 PM Daniel Stenberg wrote: > On Fri, 17 May 2019, Peng Yu wrote: > > > curl has this problem "(18) transfer closed with

[Bug-wget] Does wget have the "outstanding read data" problem?

2019-05-17 Thread Peng Yu
Hi, curl has this problem: "(18) transfer closed with outstanding read data remaining". Does wget have a similar problem? Thanks. https://curl.haxx.se/mail/archive-2018-10/0015.html -- Regards, Peng

[Bug-wget] How to show a user specified string to represent a file with wget --show-progress

2018-12-15 Thread Peng Yu
Hi, The following command only shows '-' as the file name in the output. $ wget -q --show-progress https://httpbin.org/get -O- | gzip > /tmp/1.txt.gz - 100%[>] 266 --.-KB/s in 0s But since the output is piped to a different file, I'd rather manually specify a string

[Bug-wget] How does --connect-timeout work?

2018-10-24 Thread Peng Yu
Hi, The second command hangs forever. I am not sure what is wrong with it. My understanding is that if I set connect-timeout short enough, wget should fail. Could anybody let me know how to get this behavior? (See the curl output below as a reference.) $ time wget -qO- http://httpbin.org/get {

Re: [Bug-wget] exit status problem with pipe

2018-10-22 Thread Peng Yu
a > result, we simply ignore the SIGPIPE handler. So when Wget dies in the > command > you shared, its not really killed by SIGPIPE. Hence, in theory, it would be > incorrect for Wget to exit with a code of 141. > > * Peng Yu [181022 16:29]: >> Hi, >> >> wget ret

[Bug-wget] exit status problem with pipe

2018-10-21 Thread Peng Yu
Hi, wget returns the following exit code when dealing with a pipe, but it does not follow the common practice. Should this behavior be fixed? $ wget -qO- http://httpbin.org/get | echo $ echo ${PIPESTATUS[@]} 3 0 $ seq 10 | echo $ echo ${PIPESTATUS[@]} 141 0 -- Regards, Peng
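For context, the 141 that `seq` reports is the shell's 128+signal convention (SIGPIPE is 13), while wget traps SIGPIPE, which is why it reports its own code 3 (file I/O error) instead. A bash demonstration of the convention, with no wget involved:

```shell
# `yes` is reliably killed by SIGPIPE once `head` closes the pipe;
# bash then reports 128 + 13 = 141 in PIPESTATUS:
ps0=$(bash -c 'yes | head -n 1 > /dev/null; echo "${PIPESTATUS[0]}"')
echo "$ps0"   # 141
```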

Re: [Bug-wget] How to intercept wget to extract the raw requests and the raw responses?

2018-02-15 Thread Peng Yu
On Wed, Feb 14, 2018 at 12:47 PM Bykov Alexey wrote: > Greetings > > Did You tried "--warc-file" option? > > wget --warc-file=httpbin -qO- https://httpbin.org/get How can I convert the WARC format to the actual headers of the requests and responses? >

Re: [Bug-wget] How to intercept wget to extract the raw requests and the raw responses?

2018-02-10 Thread Peng Yu
On Sat, Feb 10, 2018 at 3:46 PM, Tim Ruehsen <tim.rueh...@gmx.de> wrote: > Am Samstag, den 10.02.2018, 10:34 -0600 schrieb Peng Yu: >> > Use 'wget -d -olog -qO- http://httpbin.org/get'. >> >> This requires the compilation of wget with debugging support. Is >> th

Re: [Bug-wget] How to intercept wget to extract the raw requests and the raw responses?

2018-02-10 Thread Peng Yu
> Use 'wget -d -olog -qO- http://httpbin.org/get'. This requires wget to be compiled with debugging support. Is there another way that does not require recompiling wget? Thanks. $ wget -d -o/tmp/log -qO- http://httpbin.org/get Debugging support not compiled in. Ignoring --debug

[Bug-wget] How to remove a header?

2018-02-08 Thread Peng Yu
Hi, wget sets some headers. Is there a way to remove some headers, e.g., Accept-Encoding? Thanks. $ wget -qO- http://httpbin.org/get { "args": {}, "headers": { "Accept": "*/*", "Accept-Encoding": "identity", "Connection": "close", "Host": "httpbin.org", "User-Agent":

[Bug-wget] Should wget -qO- be used? (when "Read error at byte 2416640 (Success).Retrying." is seen)

2016-04-08 Thread Peng Yu
Hi, I see a message like "Read error at byte 2416640 (Success).Retrying." when I use `wget -qO- url`. I am wondering whether stdout should be used as the output file when an error like this may occur. When an error occurs, will wget sometimes rewind the output file a bit (in this case, the output

Re: [Bug-wget] 'or' and flexible expression of wget options

2011-08-08 Thread Peng Yu
wrote: Hi Peng, GNU find has to follow POSIX specifications, while wget has not.  IMHO, the best way to expand wget to do such cool things is to support GNU guile to evaluate expressions, or executing an external process. Cheers, Giuseppe Peng Yu pengyu...@gmail.com writes: Hi

[Bug-wget] Bug in processing url query arguments that have '/'

2011-08-07 Thread Peng Yu
Hi, The following code is in utils.c. # in acceptable (const char *s) while (l && s[l] != '/') --l; if (s[l] == '/') s += (l + 1); It essentially gets the substring after the last '/'. However, when a query contains '/', this is problematic. For example, the above code snippet will extract

Re: [Bug-wget] Bug in processing url query arguments that have '/'

2011-08-07 Thread Peng Yu
a different 'acceptable' function should be used. if (opt.match_query_string) full_file = concat_strings (u->file, "?", u->query, (char *) 0); if (!acceptable (full_file)) { DEBUGP (("%s (%s) does not match acc/rej rules.\n", url, full_file)); goto out; } } Peng Yu pengyu

[Bug-wget] 'or' and flexible expression of wget options

2011-08-07 Thread Peng Yu
Hi, It seems that all the options checked in download_child_p() are AND'ed. In GNU find, options can be AND'ed or OR'ed much more flexibly. I have looked at the wget source, and it is not clear to me whether flexible expressions can be supported by the current wget cmdline option parsing framework. Does anybody

[Bug-wget] --match-query-string [bug #31147]

2011-08-06 Thread Peng Yu
Hi, https://savannah.gnu.org/bugs/?31147 I don't see that this patch has been applied to wget, as the latest version of wget had been published before the patch was submitted. Would you please let me know how to apply this patch to wget? -- Regards, Peng

Re: [Bug-wget] --match-query-string [bug #31147]

2011-08-06 Thread Peng Yu
On Sat, Aug 6, 2011 at 11:25 PM, Peng Yu pengyu...@gmail.com wrote: Hi, https://savannah.gnu.org/bugs/?31147 I don't see that this patch has been applied to wget, as the latest version of wget had been published before the patch was submitted. Would you please let me know how to apply

[Bug-wget] How to download all the links on a webpage which are in some directory?

2011-08-01 Thread Peng Yu
Suppose I want to download www.xxx.org/somefile/aaa.sfx and the links therein (but restricted to the directory www.xxx.org/somefile/aaa/). I tried the option '--mirror -I /somefile/aaa', but it only downloads www.xxx.org/somefile/aaa.sfx. I'm wondering what the correct option is to do so. --

[Bug-wget] How to just download cookies?

2011-07-31 Thread Peng Yu
Hi, I use the following command to download the cookies, but it always downloads some_page. Is there a way to download just the cookies? wget --post-data='something' --directory-prefix=/tmp --save-cookies=cookies_file --keep-session-cookies http://xxx.com/some_page > /dev/null -- Regards, Peng

[Bug-wget] How to exclude link?

2010-07-10 Thread Peng Yu
Suppose I need to exclude links that start with xyz with the following options. However, wget --mirror \ --limit-rate=400k \ --no-parent \ --no-host-directories \ --wait=5 \ --random-wait \ --convert-links \ --cut-dirs=2 \ --reject xyz* \

Re: [Bug-wget] How to set -l to be zero? (Or how to download a single webpage and convert the absolute links to relative links with wget?)

2010-06-02 Thread Peng Yu
On Tue, Jun 1, 2010 at 10:40 PM, Micah Cowan mi...@cowan.name wrote: On 06/01/2010 05:53 PM, Peng Yu wrote: On Tue, Jun 1, 2010 at 6:48 PM, Micah Cowan mi...@cowan.name wrote: On 06/01/2010 04:36 PM, Peng Yu wrote: I need to use the option --convert-links to download only one webpage, because

Re: [Bug-wget] How to set -l to be zero? (Or how to download a single webpage and convert the absolute links to relative links with wget?)

2010-06-02 Thread Peng Yu
Ah, sorry, I misunderstood what you wanted. What I described will convert relative links to absolute links, not vice versa. No problem. You're right, to get what you want, then you need recursion; wget only converts links to point at pages locally, if it directly knows they've been

Re: [Bug-wget] How to ignore link like index.html?lang=ja?

2010-06-01 Thread Peng Yu
On Sat, May 29, 2010 at 12:11 PM, Micah Cowan mi...@cowan.name wrote: Unfortunately, wget doesn't currently let you match query strings. Yes, this is a major shortcoming. Peng Yu pengyu...@gmail.com wrote: There is the link index.html?lang=ja in index.html. I want to ignore such links. I use

[Bug-wget] How to set -l to be zero? (Or how to download a single webpage and convert the absolute links to relative links with wget?)

2010-06-01 Thread Peng Yu
I need to use the option --convert-links to download only one webpage, because I want to convert absolute links to relative links if the links are under the host directory where the webpage is. Since I am only interested in one page, I'd like to set -l to zero. But it seems that if I set it to zero,

[Bug-wget] How to convert-links when the download has finished?

2010-05-31 Thread Peng Yu
I have downloaded a web directory (with the options --mirror --timestamping --no-parent), but I forgot to specify the option --convert-links. Is there a way to post-process the already-downloaded files to convert the links without having to redownload all the files? -- Regards, Peng

[Bug-wget] How to ignore link like index.html?lang=ja?

2010-05-29 Thread Peng Yu
There is the link index.html?lang=ja in index.html. I want to ignore such links. I use the following command. Would you please let me know how to ignore index.html?lang=ja? wget --mirror \ --timestamping \ --no-parent \ url/index.html -- Regards, Peng

[Bug-wget] How to download files from website that require password?

2010-05-27 Thread Peng Yu
For example, I need to download some webpages from my amazon account, which requires a password. Could you please let me know how to download such webpages? -- Regards, Peng