submitting wget bugs is like a black hole

2007-01-03 Thread Dan Jacobson
From the user's perspective, sending bugs to [EMAIL PROTECTED] is like a black hole. This is in contrast to other systems like the Debian bug tracking system. No, don't move to Bugzilla, else we won't be able to send bugs by email.

-Y is gone from the man page

2006-11-16 Thread Dan Jacobson
-Y is gone from the man page, except for one tiny mention, and --help doesn't show its arguments. I didn't check Info.

wget --save-even-if-error

2006-11-12 Thread Dan Jacobson
One discovers that wget silently (it's not documented) throws away the body of the response if there was an error (404, 503, etc.). So there needs to be a --save-even-if-error switch.
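
For reference, later wget releases added a flag for exactly this, --content-on-error, which keeps the body of an error response; a minimal sketch (URL and filename are hypothetical):
$ wget --content-on-error -O error-page.html http://example.com/missing-page
$ cat error-page.html    # the server's 404 page, instead of nothing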

give error count at end

2006-08-02 Thread Dan Jacobson
"Downloaded: 735,142 bytes in 24 files" looks great. But if "09:49:46 ERROR 404: WWWOFFLE Host Not Got." flew off the screen, one will never know. That's why you should say "Downloaded: 735,142 bytes in 24 files. 3 files not downloaded."
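
A rough workaround under the current behaviour, assuming the URLs live in a hypothetical url-list file, is to count the ERROR lines oneself:
$ wget -nv -i url-list 2>&1 | grep -c 'ERROR'    # number of fetches that failed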

--force-html -i file.html

2006-07-04 Thread Dan Jacobson
$ man wget: "-i file, --input-file=file: The file need not be an HTML document (but no harm if it is)---it is enough if the URLs are just listed sequentially." Well, even with -i file.html, one still needs --force-html. So "yes harm if it is". GNU Wget 1.10.2
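
In other words, with an HTML input file one currently has to spell it out; a sketch with a hypothetical links.html:
$ wget --force-html -i links.html    # parse the file as HTML and fetch the linked URLs
$ wget -i links.html                 # treats each line as a plain URL, which generally fails for HTML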

can't recurse if no index.html

2006-03-22 Thread Dan Jacobson
I notice that with server-created directory listings, one can't recurse. $ lynx -dump http://localhost/~jidanni/test|head Index of /~jidanni/test Icon [1]Name [2]Last modified [3]Size [4]Description _

HTTP 1.1?

2006-02-21 Thread Dan Jacobson
The documentation doesn't say whether or how one can force wget to send HTTP/1.1 requests instead of 1.0. Maybe it is simply not ready yet?

make clear that no --force-html means .txt

2006-02-12 Thread Dan Jacobson
Man page says: -i file --input-file=file Read URLs from file. If - is specified as file, URLs are read from the standard input. (Use ./- to read from a file literally named -.) If this function is used, no URLs need be present on the comm

Re: --random-wait: users can no longer specify a minimum wait

2006-02-04 Thread Dan Jacobson
H> Maybe it should rather vary between 0.5*wait and 1.5*wait? There you go again making assumptions about what the user wants. H> I think it'd be a shame to spend more arguments on such a rarely-used H> feature. --random-wait[=a,b,c,d...] loaded with lots of downwardly compatible arguments that are

--random-wait: users can no longer specify a minimum wait

2006-02-02 Thread Dan Jacobson
"--random-wait causes the time between requests to vary between 0 and 2 * wait seconds, where wait was specified using the --wait option, " So one can no longer specify a minimum wait time! The 2 and at least the 0 should be user configurable floating numbers.

hard to tell -Y off means --no-proxy

2006-01-20 Thread Dan Jacobson
Nowadays in the documentation it is very hard to tell that -Y off means --no-proxy. You must be phasing out -Y or something. No big deal. OK. Also reported already, I think: "For more information about the use of proxies with Wget," and then nothing, on the man page. GNU Wget 1.10.2

Wishlist: support the file:/// protocol

2005-12-11 Thread Dan Jacobson
Wishlist: support the file:/// protocol: $ wget file:///home/jidanni/2005_safe_communities.html
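
For comparison, curl already accepts the scheme; a sketch:
$ curl -o 2005_safe_communities.html file:///home/jidanni/2005_safe_communities.html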

mention -r on the Recursive Download Info page

2005-11-08 Thread Dan Jacobson
In Info "3 Recursive Download", mention "-r, --recursive"! Also don't threaten to remove -L!

I have a whole file of --headers

2005-10-21 Thread Dan Jacobson
What if I have a whole file of headers I want to use: $ sed 1d /var/cache/wwwoffle/outgoing/O6PxpG00D+DBLAI8puEtOew|col|colrm 22 Host: www.hsr.gov.tw User-Agent: Mozilla/5 Accept: text/xml,appl Accept-Language: en-u Accept-Encoding: gzip Accept-Charset: Big5, Keep-Alive: 300 Proxy-Connection: kee R
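
A rough workaround, assuming a hypothetical headers.txt with one "Name: value" header per line: turn each line into its own --header option (a bash sketch):
$ args=(); while IFS= read -r h; do args+=(--header "$h"); done < headers.txt
$ wget "${args[@]}" http://www.hsr.gov.tw/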

curl has --max-filesize

2005-10-21 Thread Dan Jacobson
Curl has this impressive looking feature: $ man curl --max-filesize Specify the maximum size (in bytes) of a file to download. If the file requested is larger than this value, the transfer will not start and curl will return with exit code 63. NOTE: The file size is not always known prio
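
A usage sketch of the curl feature being cited (the size limit and URL are arbitrary):
$ curl --max-filesize 1000000 -O http://example.com/big-file.iso
$ echo $?    # 63 if the file was larger than the limit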

-Y not mentioned fully in man and info

2005-08-20 Thread Dan Jacobson
-Y not mentioned fully in man and info: $ wget --help|grep -- -Y -Y, --proxy explicitly turn on proxy. $ man wget|col -b|grep -- -Y if Wget crashes while downloading wget -rl0 -kKE -t5 -Y0 $ wget -V GNU Wget 1.10.1-beta1 ... Originally written by Hrvoje Niksic <[EMAI

say where -l levels start

2005-07-24 Thread Dan Jacobson
In the man page: "-l depth, --level=depth: Specify recursion maximum depth level depth. The default maximum depth is 5." Say what levels 0 and 1 do, so one gets an idea of which depth means 'this page only' and which means 'just the links on this page, and no further'.

add --print-uris or --dry-run

2005-06-28 Thread Dan Jacobson
Wget needs a --print-uris or --dry-run option, to show what it would get/do without actually doing it! Not only could one check whether e.g. -B will do what one wants before actually doing it, one could also use wget as a general URL extractor, etc. --debug is not what I'm talking about. I'm more talki

Re: why must -B need -F to take effect?

2005-06-28 Thread Dan Jacobson
OK, then here: "-B URL, --base=URL: When used in conjunction with -F, prepends URL to relative links in the file specified by -i." Don't mention -F!

why must -B need -F to take effect?

2005-06-26 Thread Dan Jacobson
Why must -B need -F to take effect? Why can't one do xargs wget -B http://bla.com/ -i - <

flag to display just the errors

2005-03-16 Thread Dan Jacobson
I see I must do wget --spider -i file -nv 2>&1|awk '!/^$|^200 OK$/' as the only way to get just the errors. There is no flag that will let only the errors through and silence the rest; -q silences everything. Wget 1.9.1

Re: bug-wget still useful

2005-03-15 Thread Dan Jacobson
P> I don't know why you say that. I see bug reports and discussion of fixes P> flowing through here on a fairly regular basis. All I know is my reports for the last few months didn't get the usual (any!) cheery replies. However, I saw them on Gmane, yes.

bug-wget still useful

2005-03-15 Thread Dan Jacobson
Is it still useful to mail to [EMAIL PROTECTED]? I don't think anybody's home. Shall the address be closed?

--header with more than one cookie

2005-02-12 Thread Dan Jacobson
In the man page, show how one does this: wget --cookies=off --header "Cookie: =" with more than one cookie.
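
For what it's worth, multiple cookies go into a single Cookie header separated by "; "; a sketch with made-up cookie names and values:
$ wget --cookies=off --header "Cookie: session=abc123; lang=en" http://example.com/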

-N vs. Last-modified header missing

2005-02-08 Thread Dan Jacobson
1. Anybody home? 2. No way to make wget not refetch the file when: Last-modified header missing -- time-stamps turned off. 09:55:20 URL:http://bm2ddp.myweb.hinet.net/b3.htm [16087] -> "uris.d/bm2ddp.myweb.hinet.net/b3.htm" [1] when using wget -s -w 2 -e robots=off -P bla.d -p -t 1 -N -nv

weird

2005-01-29 Thread Dan Jacobson
Anybody home? This looks weird: $ wget --spider -S -r -l 1 http://www.noise.com.tw/eia/product.htm --09:22:13-- http://www.noise.com.tw/eia/product.htm => `www.noise.com.tw/eia/product.htm' Resolving localhost... 127.0.0.1 Connecting to localhost[127.0.0.1]:8080... connected. Proxy req

error message contents thrown away

2005-01-08 Thread Dan Jacobson
There is no way to see what $ lynx -dump http://wapp8.taipower.com.tw/ can show me when $ wget -O - -S -s http://wapp8.taipower.com.tw/ gives "08:54:48 ERROR 403: Access Forbidden.", i.e., no way to see the site's own error message contents.

-p vs. ftp

2004-11-22 Thread Dan Jacobson
>>>>> "D" == Derek B Noonburg <[EMAIL PROTECTED]> writes: D> On 20 Nov, Dan Jacobson wrote: D> Can you try the binary on my web site? D> ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.00-linux.tar.gz) >> >> But my batch script to wget it doesn't

No URLs found in -

2004-11-04 Thread Dan Jacobson
Odd: $ ssh debian.linux.org.tw wget -e robots=off --spider -t 1 -i - < a.2 gives "No URLs found in -." Or is this wget just too old? P.S., no cheery responses received recently.

incomplete sentence on man page

2004-09-25 Thread Dan Jacobson
On the man page, the sentence "For more information about the use of proxies with Wget," is left hanging: it is followed immediately by "-Q quota".

document -N -p

2004-09-06 Thread Dan Jacobson
To Info node "Time-Stamping Usage" add a clarification about what happens when -N and -p are used together: are e.g., all the included images also checked, or just the main page?

mention that -p turns on -x

2004-08-21 Thread Dan Jacobson
Mention that -p turns on or implies -x in both the -p and -x parts of both the man and info pages.

tmp names

2004-08-01 Thread Dan Jacobson
Perhaps a useful option would be to have files use a temporary name until the download is complete, then move them to the permanent name.
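
A minimal shell-level approximation of that behaviour, with a hypothetical URL and filename:
$ wget -O page.html.part http://example.com/page.html && mv page.html.part page.html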

Re: --post-data --spider

2004-07-29 Thread Dan Jacobson
BTW, because wget 1.9.1 has no way to save "session cookies" yet, that example will fail often. Hopefully the user will soon be able to control which cookies are saved, no matter what the cookies themselves say.

-x vs. file.1

2004-07-28 Thread Dan Jacobson
$ man wget When running Wget without -N, -nc, or -r, downloading the same file in the same directory will result in the original copy of file being preserved and the second copy being named file.1. $ wget -x http://static.howstuffworks.com/flash/toilet.swf $ wget

--post-data --spider

2004-07-28 Thread Dan Jacobson
$ man wget This example shows how to log to a server using POST and then proceed to download the desired pages, presumably only accessible to authorized users: # Log in to the server. This can be done only once. You mean "we only do this once".

Re: parallel fetching

2004-07-22 Thread Dan Jacobson
H> I suppose forking would not be too hard, but dealing with output from H> forked processes might be tricky. Also, people would expect `-r' to H> "parallelize" as well, which would be harder yet. OK, maybe add a section to the manual, showing that you have considered parallel fetching, but the c

Re: parallel fetching

2004-07-18 Thread Dan Jacobson
Phil> How about Phil> $ wget URI1 & wget URI2 Mmm, OK, but unwieldy if many. I guess I'm thinking about e.g., $ wget --max-parallel-fetches=11 -i url-list (hmm, with default=1 meaning not parallel, but sequential.)
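
A workaround that exists today, assuming a url-list file with one URL per line: let xargs run several wget processes at once.
$ xargs -P 4 -n 1 wget -q < url-list    # up to 4 downloads in parallel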

only depend on the timestamp, not size

2004-07-18 Thread Dan Jacobson
Man page: When running Wget with -N, with or without -r, the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and size of the file. I have an application where I want it only to depend on the timestamp. Too bad ther

parallel fetching

2004-07-13 Thread Dan Jacobson
Maybe add an option so e.g., $ wget --parallel URI1 URI2 ... would get them at the same time instead of in turn.

--print-uris

2004-06-21 Thread Dan Jacobson
Wget should have a --print-uris option, to tell us what it is planning to get, so we can adjust things without committing yet. Perhaps useful with -i or -r...
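
The closest thing available now is --spider, which requests without saving; a sketch for the -i case (url-list is hypothetical, and note the FTP caveat reported further down):
$ wget --spider -nv -i url-list 2>&1    # reports each URL's status without saving anything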

--random-wait but no --wait

2004-06-21 Thread Dan Jacobson
The man page doesn't say what will happen if one specifies --random-wait but no --wait has been used. Perhaps just say under --wait that the default is --wait=0 when not otherwise set.

mention -e in the same paragraph

2004-06-20 Thread Dan Jacobson
In Info where you mention: Most of these commands have command-line equivalents (*note Invoking::), though some of the more obscure or rarely used ones do not. You should also mention -e in the same paragraph.

say what circumstances wget will return non-zero

2004-06-17 Thread Dan Jacobson
The docs should mention the return value... In fact it should be an item in the Info Concept Index, i.e., how to depend on $ wget ... && bla || mla. So say under what circumstances wget will return non-zero.
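
For the record, later wget releases do document distinct exit codes (0 success, 4 network failure, 8 server error response, and so on), so shell tests like this become reliable; a sketch with a hypothetical URL:
$ wget -q http://example.com/file.tar.gz && echo "got it" || echo "failed, exit $?"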

Say "older or the same age"

2004-06-16 Thread Dan Jacobson
$ info The time-stamping in GNU Wget is turned on using `--timestamping' (`-N') option, or through `timestamping = on' directive in `.wgetrc'. With this option, for each file it intends to download, Wget will check whether a local file of the same name exists. If it does, and the remo

Re: save more cookies between invocations

2004-05-29 Thread Dan Jacobson
H> Do you really need an option to also save expired cookies? You should allow the user power over all aspects...

save more cookies between invocations

2004-05-28 Thread Dan Jacobson
Wishlist: giving a way to save the types of cookies you say you won't in: `--save-cookies FILE' Save cookies to FILE at the end of session. Cookies whose expiry time is not specified, or those that have already expired, are not saved. so we can carry state between wget invocations,
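
Later versions grew --keep-session-cookies for the session-cookie half of this; a sketch (login URL, form data, and cookie file are made up):
$ wget --save-cookies cookies.txt --keep-session-cookies --post-data 'user=x&pass=y' http://example.com/login
$ wget --load-cookies cookies.txt http://example.com/members-only.html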

-O vs. -nc

2004-04-27 Thread Dan Jacobson
On the man page, the interaction between -O and -nc is not mentioned! Nor, perhaps, -O vs. -N. Indeed, why not raise an error when both -O and -nc are used, if you don't intend to let -nc work with -O, which would actually be best.

wget has no way to just show the size of an FTP URL without fetching it?

2004-04-26 Thread Dan Jacobson
True, the man page doesn't say --spider will tell me the size of a file without fetching it, but I already got used to that for HTTP; for FTP, wget --spider -Y off -S ftp://gmt.soest.hawaii.edu/pub/gmt/4/GMT_high.tar.bz2 just gives some messages, ending in 227 Entering Passive Mode (128,171,159,1

Re: apt-get via Windows with wget

2004-02-18 Thread Dan Jacobson
It seems one cannot just use the wget .exe without the DLLs, even if one only wants to connect to just http sites, not any https sites. So one cannot just click on the wget .exe from inside Unzip's filelist.

Re: apt-get via Windows with wget

2004-01-30 Thread Dan Jacobson
H> For getting Wget you might want to link directly to H> ftp://ftp.sunsite.dk/projects/wget/windows/wget-1.9.1b-complete.zip, OK, but too bad there's no stable second link .../latest.zip so I don't have to update my web page to follow the link. Furthermore, they don't need SSL, but I don't see an

check just the size on ftp

2004-01-29 Thread Dan Jacobson
Normally, if I want to check out how big a page is before committing to download it, I use wget -S --spider URL You might give this as a tip in the docs. However, for FTP it doesn't get one the file size. At least for wget -S --spider ftp://ftp.sunsite.dk/projects/wget/windows/wget-1.9.1b-complet

apt-get via Windows with wget

2004-01-29 Thread Dan Jacobson
I suppose Windows users don't have a way to get more than one file at once, hence to have a Windows user download 500 files and burn them onto a CD, as in http://jidanni.org/comp/apt-offline/index_en.html so one needs wget? Any tips on the concept in my web page? I don't have Windows to try it. C

Re: wget -s -O pp --spider

2004-01-27 Thread Dan Jacobson
> "Hrvoje" == Hrvoje Niksic <[EMAIL PROTECTED]> writes: Hrvoje> Please send bug reports to [EMAIL PROTECTED], or at least make sure Hrvoje> that they don't go only to me. Yes, but needing a confirmation message over and over has driven me nuts.

--spider gets file if ftp !

2003-12-08 Thread Dan Jacobson
--spider ...it will not download the pages... $ wget -Y off --spider ftp://alpha.gnu.org/gnu/coreutils/coreutils-5.0.91.tar.bz2 --12:13:37-- ftp://alpha.gnu.org/gnu/coreutils/coreutils-5.0.91.tar.bz2 => `coreutils-5.0.91.tar.bz2' Resolving alpha.gnu.org... done. Conne

Re: feature request: --second-guess-the-dns

2003-11-17 Thread Dan Jacobson
H> It's not very hard to fix `--header' to replace Wget-generated H> values. H> Is there consensus that this is a good replacement for H> `--connect-address'? I don't want to tamper with headers. I want to be able to do experiments leaving all variables alone except for IP address. Thus --connec

Re: non-subscribers have to confirm each message to bug-wget

2003-11-17 Thread Dan Jacobson
>> And stop making me have to confirm each and every mail to this list. Hrvoje> Currently the only way to avoid confirmations is to subscribe to the Hrvoje> list. I'll try to contact the list owners to see if the mechanism can Hrvoje> be improved. subscribe me with the "nomail" option, if it can

Re: feature request: --second-guess-the-dns

2003-11-17 Thread Dan Jacobson
> "P" == Post, Mark K <[EMAIL PROTECTED]> writes: P> You can do this now: P> wget http://216.46.192.85/ P> Using DNS is just a convenience after all, not a requirement. but then one doesn't get the HTTP Host field set to what he wants.

Re: feature request: --second-guess-the-dns

2003-11-17 Thread Dan Jacobson
By the way, I did edit /etc/hosts to do one experiment http://groups.google.com/groups?threadm=vrf7007pbg2136%40corp.supernews.com i.e. <[EMAIL PROTECTED]> to test an IP/name combination, without waiting for DNS's to update. Good thing I was root so I could do it. I sure hope that when one sees

if anything bad happens, return non-zero

2003-11-17 Thread Dan Jacobson
$ wget --spider BAD_URL GOOD_URL; echo $? 0 $ wget --spider GOOD_URL BAD_URL; echo $? 1 I say they both should be 1. If anything bad happens, return 1 or some other non-zero value. By BAD, I mean a producer of e.g., ERROR 503: Service Unavailable. --spider or not, too. And stop making me have to

feature request: --second-guess-the-dns

2003-11-15 Thread Dan Jacobson
I see there is --bind-address=ADDRESS When making client TCP/IP connections, "bind()" to ADDRESS on the local machine. ADDRESS may be specified as a hostname or IP address. This option can be useful if your machine is bound to multiple IPs. But I want a

-T default really 15 minutes?

2003-10-31 Thread Dan Jacobson
Man says: -T seconds ... The default timeout is 900 seconds Ok, then why does this take only 3 minutes to give up?: --07:58:54-- http://linux.csie.nctu.edu.tw/OS/Linux/distributions/debian/dists/sid/main/binary-i386/Packages.gz => `Packages.gz' Resolving linux.csie

-q and -S are incompatible

2003-10-06 Thread Dan Jacobson
-q and -S are incompatible and should perhaps produce errors and be noted thus in the docs. BTW, there seems no way to get the -S output, but no progress indicator. -nv, -q kill them both. P.S. one shouldn't have to confirm each bug submission. Once should be enough.

dug long to find how to not look in .netrc

2003-08-27 Thread Dan Jacobson
The man page says "To prevent the passwords from being seen, store them in .wgetrc or .netrc". The problem is that if you just happen to have a .netrc entry for a certain machine, but you don't want wget to notice it, then what to do? Can you believe $ wget --http-user=

-O --spider

2003-07-24 Thread Dan Jacobson
> You can view the map at: > http://home.sara.nl/~bram/debchart.jpeg < WARNING: this image is ENORMOUS. OK, so I will use wget -O --spider -Y off http://home.sara.nl/~bram/debchart.jpeg to see how big before biting with my modem, I thought. But I mistyped -O for -S and ended up getting the whole

want date too

2003-07-07 Thread Dan Jacobson
"--15:33:01--" is not adequate for beyond 24 hours. Wish there was a way to put more date info into this message, like syslog does, without stepping outside wget.

can't turn off good messages without taking the bad too

2003-07-04 Thread Dan Jacobson
I was hoping to separate the usual news, $ wget http://abc.iis.sinica.edu.tw/ --09:26:00-- http://abc.iis.sinica.edu.tw/ => `index.html' Resolving localhost... done. Connecting to localhost[127.0.0.1]:8080... connected. from the bad news, Proxy request sent, awaiting response... 503

wget --continue vs. wwwoffle

2003-06-24 Thread Dan Jacobson
The following message is a courtesy copy of an article that has been posted to gmane.network.wwwoffle.user as well. As we wwwoffle users all might know, wget has -c, --continue resume getting a partially-downloaded file. Quite handy when a large download got interrupted. However

return codes

2001-03-07 Thread Dan Jacobson
No documentation found on what wget's return codes are. E.g., a reasonable wish: $ wget -N URL && echo got it|mail john. Please add to the docs what the policy is, even if it is 'none at present'. -- http://www.geocities.com/jidanni Tel886-4-25854780

no status option

2001-02-23 Thread Dan Jacobson
I'm thinking wget could have a status option, like bash's $ set -o (which prints "allexport off", "braceexpand on", "errexit off", ...); perhaps a plain $ wget -d might be a good place. -- http://www.geocities.com/jidanni Tel886-4-25854780