why must -B need -F to take effect?
Why does -B need -F to take effect? Why can't one do xargs wget -B http://bla.com/ -i - <
Re: why must -B need -F to take effect?
Ok, then here:
   -B URL
   --base=URL
       When used in conjunction with -F, prepends URL to relative links in
       the file specified by -i.
Don't mention -F!
add --print-uris or --dry-run
Wget needs a --print-uris or --dry-run option, to show what it would get/do without actually doing it!  Not only could one check whether, e.g., -B will do what one wants before actually doing it; one could also use wget as a general URL extractor, etc.  --debug is not what I'm talking about.  I'm talking about something more like apt-get's --print-uris.
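(For the URL-extractor use, about the closest one can get today is to fetch just the top page and pull the href values out by hand -- a rough sketch with a made-up URL, and it does not resolve relative links:)
$ wget -q -O - http://www.example.com/ | grep -o 'href="[^"]*"' | sed 's/^href="//;s/"$//'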
say where -l levels start
In the man page:
   -l depth
   --level=depth
       Specify recursion maximum depth level depth.  The default maximum
       depth is 5.
Say what levels 0 and 1 do, so one gets an idea of which depth means 'this page only' and which means 'just the links in this page, and no further'.
-Y not mentioned fully in man and info
-Y not mentioned fully in man and info:
$ wget --help|grep -- -Y
  -Y,  --proxy                  explicitly turn on proxy.
$ man wget|col -b|grep -- -Y
       if Wget crashes while downloading wget -rl0 -kKE -t5 -Y0
$ wget -V
GNU Wget 1.10.1-beta1
...
Originally written by Hrvoje Niksic <[EMAIL PROTECTED]>
P.S., also say bug address here too else he will get bugs.
curl has --max-filesize
Curl has this impressive looking feature:
$ man curl
       --max-filesize
              Specify the maximum size (in bytes) of a file to download.  If
              the file requested is larger than this value, the transfer
              will not start and curl will return with exit code 63.
              NOTE: The file size is not always known prior to download, and
              for such files this option has no effect even if the file
              transfer ends up being larger than this given limit.  This
              concerns both FTP and HTTP transfers.*
Anyway, wget could also have --max-filesize.  WWWOFFLE could do something on a per-URL basis.  We modem users then wouldn't have to worry as much about something going hog wild.
(*Well, they ought to have another option about what to do if the size is not known: get up to max=0, XXX bytes, or infinity.  Wait... couldn't some track be kept of how many bytes have been swallowed so far, and a stop be put to it if that exceeds the limit?  Indeed, wget prints those progress messages, showing it is keeping track, header info or not.  And at least we can still see the top part of some whopping .JPG, etc.  Too bad .pdfs are seemingly useless if truncated.)
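(In the meantime a rough workaround is to ask for the size first with --spider and only then download.  It only helps when the server reports a Content-Length; the URL and limit here are made up for illustration:)
url=http://www.example.com/big.jpg
limit=500000   # bytes
size=$(wget -Y off -S --spider "$url" 2>&1 | grep -i 'Content-Length:' | tail -1 | awk '{print $NF}' | tr -d '\r')
if [ -n "$size" ] && [ "$size" -le "$limit" ]
then wget "$url"
else echo "skipping $url: size ${size:-unknown} over $limit or not reported"
fi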
I have a whole file of --headers
What if I have a whole file of headers I want to use:
$ sed 1d /var/cache/wwwoffle/outgoing/O6PxpG00D+DBLAI8puEtOew|col|colrm 22
Host: www.hsr.gov.tw
User-Agent: Mozilla/5
Accept: text/xml,appl
Accept-Language: en-u
Accept-Encoding: gzip
Accept-Charset: Big5,
Keep-Alive: 300
Proxy-Connection: kee
Referer: http://www.h
Why isn't there an option where I can give the whole file to wget?  Why must one do painstaking scripts like
perl -anwe 'BEGIN{$a=0}s/\r//;chomp;if(/^GET/){print "\nwget \"$F[1]\" "};
  if(/^(Refer|Cook|etc.etc.)/){print "--header=\"$_\" "}' O*
just to feed them one by one to wget?!
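(A rough do-it-yourself sketch in the meantime, not a wget feature: build one --header option per line of a saved header file.  The file name and URL are made up, and it assumes the file holds a single request's header lines with the request line already stripped off.)
#!/bin/bash
headers=()
while IFS= read -r line; do
    line=${line%$'\r'}                                # drop any trailing CR
    [ -n "$line" ] && headers=("${headers[@]}" --header="$line")
done < headers.txt
wget "${headers[@]}" http://www.example.com/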
mention -r on the Recursive Download Info page
In Info "3 Recursive Download", mention "-r, --recursive"! Also don't threaten to remove -L!
Wishlist: support the file:/// protocol
Wishlist: support the file:/// protocol: $ wget file:///home/jidanni/2005_safe_communities.html
hard to tell -Y off means --no-proxy
Nowadays in the documents it is very hard to tell that -Y off means --no-proxy.  You must be phasing out -Y or something.  No big deal, OK.  Also, reported already I think: on the man page there is "For more information about the use of proxies with Wget," and then nothing -- the sentence just stops.  GNU Wget 1.10.2
--random-wait: users can no longer specify a minimum wait
"--random-wait causes the time between requests to vary between 0 and 2 * wait seconds, where wait was specified using the --wait option, " So one can no longer specify a minimum wait time! The 2 and at least the 0 should be user configurable floating numbers.
Re: --random-wait: users can no longer specify a minimum wait
H> Maybe it should rather vary between 0.5*wait and 1.5*wait?
There you go again, making assumptions about what the user wants.
H> I think it'd be a shame to spend more arguments on such a rarely-used
H> feature.
--random-wait[=a,b,c,d...], loaded with lots of downwardly compatible arguments that are [?] ignored if this is an older wget.
make clear that no --force-html means .txt
Man page says:
   -i file
   --input-file=file
       Read URLs from file.  If - is specified as file, URLs are read from
       the standard input.  (Use ./- to read from a file literally named -.)
       If this function is used, no URLs need be present on the command
       line.  If there are URLs both on the command line and in an input
       file, those on the command lines will be the first ones to be
       retrieved.  The file need not be an HTML document (but no harm if it
       is)---it is enough if the URLs are just listed sequentially.
Say that if you don't use --force-html then it had better not be HTML!
       However, if you specify --force-html, the document will be regarded
       as html.  In that case you may have problems with relative links,
       which you can solve either by adding "<base href="url">" to the
       documents or by specifying --base=url on the command line.
Also, one can't use -i file:///bla.html
HTTP 1.1?
The documents don't say how or why one can/not force wget to send HTTP 1.1 headers, instead of 1.0. Maybe it is simply not ready yet?
can't recurse if no index.html
I notice that with server-created directory listings, one can't recurse.
$ lynx -dump http://localhost/~jidanni/test|head
   Index of /~jidanni/test
   Icon   [1]Name          [2]Last modified   [3]Size  [4]Description
   ___
   [DIR]  [5]Parent Directory                 -
   [TXT]  [8]cd.html       23-Feb-2006 20:55  931
$ wget --spider -S -r http://localhost/~jidanni/test/
localhost/~jidanni/test/index.html: No such file or directory
--force-html -i file.html
$ man wget
   -i file
   --input-file=file
       The file need not be an HTML document (but no harm if it is)---it is
       enough if the URLs are just listed sequentially.
Well, even with -i file.html one still needs --force-html.  So "yes harm if it is".  GNU Wget 1.10.2
give error count at end
Downloaded: 735,142 bytes in 24 files
looks great.  But if
09:49:46 ERROR 404: WWWOFFLE Host Not Got.
flew off the screen, one will never know.  That's why you should say
Downloaded: 735,142 bytes in 24 files.  3 files not downloaded.
wget --save-even-if-error
One discovers that wget secretly (not documented) throws away the content of the response if there was an error (404, 503, etc.). So there needs to be a --save-even-if-error switch.
-Y is gone from the man page
-Y is gone from the man page, except one tiny mention, and --help doesn't show its args. Didn't check Info.
submitting wget bugs is like a blackhole
From the user's perspective, sending bugs to [EMAIL PROTECTED] is like a black hole.  This is in contrast to other systems like the Debian bug tracking system.  No, don't move to bugzilla, else we won't be able to send email.
no status option
I'm thinking wget could have a status option, like bash's
$ set -o
allexport       off
braceexpand     on
errexit         off
...
Perhaps a plain $ wget -d might be a good place.
--
http://www.geocities.com/jidanni Tel886-4-25854780
return codes
No documentation found on what wget's return codes are.  E.g., a reasonable wish:
$ wget -N URL && echo got it|mail john
Please add to the docs what the policy is, even if 'none at present'.
--
http://www.geocities.com/jidanni Tel886-4-25854780
wget --continue vs. wwwoffle
The following message is a courtesy copy of an article that has been posted to gmane.network.wwwoffle.user as well.  As we wwwoffle users all might know, wget has
   -c, --continue   resume getting a partially-downloaded file.
Quite handy when a large download gets interrupted.  However, there are some caveats when using it through wwwoffle.  It would be neat if the right combination of wget switches were known for doing this without causing wwwoffle to get the whole file again, and without resorting to -Y off and bypassing wwwoffle.  Another case is where we pull the partially downloaded (larva, pupa, whatever) file out of the wwwoffle cache by hand, rename it to what wget expects, and continue with wget -Y off -c.  Wget doesn't know that there is an HTTP header swelling the file, so that must first be chopped off.  There are probably other considerations too.
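(For the by-hand route, here is a rough sketch of the header-chopping step.  The cache file name and URL are made up, and it assumes the stored HTTP header ends at the first blank or CR-only line; a real tool would be more careful:)
in=/var/cache/wwwoffle/http/www.example.com/D_xxxx    # the half-downloaded cache entry
out=big.iso                                           # the name wget -c will look for
perl -ne 'print if $body; $body=1 if /^\r?$/' "$in" > "$out"
wget -Y off -c http://www.example.com/big.iso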
can't turn off good messages without taking the bad too
I was hoping to separate the usual news,
$ wget http://abc.iis.sinica.edu.tw/
--09:26:00--  http://abc.iis.sinica.edu.tw/
           => `index.html'
Resolving localhost... done.
Connecting to localhost[127.0.0.1]:8080... connected.
from the bad news,
Proxy request sent, awaiting response... 503 Connect failed
09:26:00 ERROR 503: Connect failed.
but I see they both go to stderr.  Also, none of the command line switches affect one without affecting the other.  Yes, stdout is reserved for -O -, but there ought to be a switch that will cause no output on stdout unless it is real error output... even -nv doesn't do that.  I must now write
t=/tmp/site-checker
if wget -Y off -t 1 --spider http://jidanni.org/index.html > $t 2>&1
then :
else cat $t >> $HOME/errors
fi
rm $t
when instead
wget --real-errors-only-please -q ... >> $HOME/errors
would have done.  Then a test -s $HOME/errors is all that would be needed to know there had been trouble.
want date too
"--15:33:01--" is not adequate for beyond 24 hours. Wish there was a way to put more date info into this message, like syslog does, without stepping outside wget.
-O --spider
> You can view the map at:
> http://home.sara.nl/~bram/debchart.jpeg
< WARNING: this image is ENORMOUS.
OK, so I thought I would use
$ wget -O --spider -Y off http://home.sara.nl/~bram/debchart.jpeg
to see how big it is before biting with my modem.  But I had mistyped -O for -S, so -O took --spider as its file-name argument, and I ended up getting the whole file anyway.  So next time I wish wget would treat this as a missing argument.  We can write ./--spider if that is really where we want to put the output.
dug long to find how to not look in .netrc
The man page says
   To prevent the passwords from being seen, store them in .wgetrc or .netrc,
The problem is: if you just happen to have a .netrc entry for a certain machine, but you don't wish wget to notice it, then what to do?  Can you believe
$ wget --http-user= --http-passwd= http://debian.linux.org.tw/
$ wget --http-user=x --http-passwd=x http://debian.linux.org.tw/
don't even override .netrc?!
$ wget http://:@debian.linux.org.tw/
http://:@debian.linux.org.tw/: Invalid user name.
$ wget http://x:[EMAIL PROTECTED]/
OK, that overrides it, but still, one can't achieve no username and password.  OK, far away in an Info page do I finally find .wgetrc's netrc=off ... this should be noted everywhere the docs mention .netrc.  Also there should be a way to do it from the command line -- I'm making a script that shouldn't blow up just because the user has an account on some mirror.  BTW, the man page also says "For more information about security issues with Wget," but then the sentence just stops.  GNU Wget 1.8.2
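(Presumably the same .wgetrc directive could also be given on the command line via -e, e.g. something like
$ wget -e netrc=off http://debian.linux.org.tw/
though the docs should say whether that works.)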
-q and -S are incompatible
-q and -S are incompatible; that should perhaps produce an error, and be noted in the docs.  BTW, there seems to be no way to get the -S output but no progress indicator: -nv and -q kill them both.  P.S. one shouldn't have to confirm each bug submission.  Once should be enough.
-T default really 15 minutes?
Man says:
   -T seconds ... The default timeout is 900 seconds
OK, then why does this take only 3 minutes to give up?
--07:58:54--  http://linux.csie.nctu.edu.tw/OS/Linux/distributions/debian/dists/sid/main/binary-i386/Packages.gz
           => `Packages.gz'
Resolving linux.csie.nctu.edu.tw... done.
Connecting to linux.csie.nctu.edu.tw[140.113.17.250]:80... failed: Connection timed out.
Giving up.
--08:02:07--  http://debian.csie.ntu.edu.tw/debian/dists/sid/main/binary-i386/Packages.gz
I used
$ wget -t 1 -Y off -S --spider url1 url2 ...
So there are other factors involved that the man page does not mention.  I want to limit that 3-minute timeout above to only 30 seconds, but it appears -T is not what affects this case, or else it should have waited 15 minutes as documented.  Nothing here messing things up:
$ grep ^[^#] $HOME/.wgetrc /etc/wgetrc
/home/jidanni/.wgetrc:netrc=off
/etc/wgetrc:passive_ftp = on
/etc/wgetrc:waitretry = 10
$ wget --version
GNU Wget 1.8.2
...
Originally written by Hrvoje Niksic <[EMAIL PROTECTED]>.
I'd put the bug address there too, or instead.
feature request: --second-guess-the-dns
I see there is
   --bind-address=ADDRESS
       When making client TCP/IP connections, "bind()" to ADDRESS on the
       local machine.  ADDRESS may be specified as a hostname or IP address.
       This option can be useful if your machine is bound to multiple IPs.
But I want a --second-guess-the-dns=ADDRESS so I can do
$ wget http://jidanni.org/
Resolving jidanni.org... done.
Connecting to jidanni.org[216.46.203.182]:80... connected.
HTTP request sent, awaiting response... 503 Service Unavailable
$ wget --second-guess-the-dns=216.46.192.85 http://jidanni.org/
Connecting to jidanni.org[216.46.192.85]:80... connected...
Even allow different port numbers there, even though we can already add them after the URL:
$ wget --second-guess-the-dns=216.46.192.85:66 http://jidanni.org:888/
or whatever.  Also pick a better name than --second-guess-the-dns -- that is just a first guess at a name.  Perhaps the user should do all this in the name server or something, but let's say he isn't root, and doesn't want to use netcat etc. either.
if anything bad happens, return non-zero
$ wget --spider BAD_URL GOOD_URL; echo $?
0
$ wget --spider GOOD_URL BAD_URL; echo $?
1
I say they both should be 1.  If anything bad happens, return 1 or some other non-zero value.  By BAD, I mean a producer of, e.g., ERROR 503: Service Unavailable.  --spider or not, too.  And stop making me have to confirm each and every mail to this list.
Re: feature request: --second-guess-the-dns
By the way, I did edit /etc/hosts to do one experiment, http://groups.google.com/groups?threadm=vrf7007pbg2136%40corp.supernews.com i.e. <[EMAIL PROTECTED]>, to test an IP/name combination without waiting for DNSs to update.  Good thing I was root so I could do it.  I sure hope that when one sees
Connecting to jidanni.org[216.46.192.85]:80... connected.
there is no interference along the way -- that that IP is really where we are going, to wget's best ability.  By the way, /etc/hosts affects other users on the system, and other jobs than the current one; and one might be using various caching DNSs, etc.  Just one more justification for this wishlist item.  --connect-address sounds ok... whatever.
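(The experiment was roughly of this form, done as root, with the name/address pair from the messages above -- and the line has to be removed again afterwards:)
# echo '216.46.192.85 jidanni.org' >> /etc/hosts
# wget http://jidanni.org/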
Re: feature request: --second-guess-the-dns
> "P" == Post, Mark K <[EMAIL PROTECTED]> writes: P> You can do this now: P> wget http://216.46.192.85/ P> Using DNS is just a convenience after all, not a requirement. but then one doesn't get the HTTP Host field set to what he wants.
Re: non-subscribers have to confirm each message to bug-wget
>> And stop making me have to confirm each and every mail to this list.
Hrvoje> Currently the only way to avoid confirmations is to subscribe to the
Hrvoje> list.  I'll try to contact the list owners to see if the mechanism can
Hrvoje> be improved.
Subscribe me with the "nomail" option, if it can't be fixed.  Often I come back from a long vacation, only to find my last reply is waiting for a confirmation that has probably expired.
Re: feature request: --second-guess-the-dns
H> It's not very hard to fix `--header' to replace Wget-generated
H> values.
H> Is there consensus that this is a good replacement for
H> `--connect-address'?
I don't want to tamper with headers.  I want to be able to do experiments leaving all variables alone except for the IP address.  Thus --connect-address is still needed.
--spider gets file if ftp !
--spider ... it will not download the pages ...
$ wget -Y off --spider ftp://alpha.gnu.org/gnu/coreutils/coreutils-5.0.91.tar.bz2
--12:13:37--  ftp://alpha.gnu.org/gnu/coreutils/coreutils-5.0.91.tar.bz2
           => `coreutils-5.0.91.tar.bz2'
Resolving alpha.gnu.org... done.
Connecting to alpha.gnu.org[199.232.41.11]:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /gnu/coreutils ... done.
==> PASV ... done.    ==> RETR coreutils-5.0.91.tar.bz2 ... done.
Length: 4,183,673 (unauthoritative)
 0% [                    ] 40,544    3.38K/s    ETA 19:56
... excuse me, you said it will not download the file.  I just wanted to know how big it was, not get it.  GNU Wget 1.8.2
Re: wget -s -O pp --spider
> "Hrvoje" == Hrvoje Niksic <[EMAIL PROTECTED]> writes: Hrvoje> Please send bug reports to [EMAIL PROTECTED], or at least make sure Hrvoje> that they don't go only to me. Yes, but needing a confirmation message over and over has driven me nuts.
apt-get via Windows with wget
I suppose Windows users don't have a way to get more than one file at once; hence, to have a Windows user download 500 files and burn them onto a CD, as in http://jidanni.org/comp/apt-offline/index_en.html , one needs wget?  Any tips on the concept in my web page?  I don't have Windows to try it.  Certainly something will go wrong?  Also note http://groups.google.com/groups?threadm=1i3A5-1YO-13%40gated-at.bofh.it
check just the size on ftp
Normally, if I want to check out how big a page is before committing to download it, I use
$ wget -S --spider URL
You might give this as a tip in the docs.  However, for FTP it doesn't get one the file size, at least not for
$ wget -S --spider ftp://ftp.sunsite.dk/projects/wget/windows/wget-1.9.1b-complete.zip
Of course, one need only get the directory listing to see the size.
P.S.
H> Yes, I have now changed the behavior of qconfirm for the wget lists to
H> only ask for confirmation once per envelope-sender.
H> SunSITE.dk Staff
Great!
Re: apt-get via Windows with wget
H> For getting Wget you might want to link directly to
H> ftp://ftp.sunsite.dk/projects/wget/windows/wget-1.9.1b-complete.zip,
OK, but too bad there's no stable second link .../latest.zip so I don't have to update my web page to follow the link.  Furthermore, they don't need SSL, but I don't see any 'diet' versions...
H> Oh, and the Windows users should preferably be ones who know how to
H> run a command-line application, but I assume you've got that covered.
Exactly not.  I recall being able to get to a little window where one enters a command...  Anyway, can you give an example of all the steps needed to do
wget -x -i fetch_list.txt -B http://debian.linux.org.tw/debian/pool/main/
You could probably add this example to the web page too (without the [] lines):
[Click on fetch_list.txt; save it to a file.]
Click on ..wget...zip URL
UNzip it [yes, can get this far, I remember]
then what
then what
wget [options]
[nero [OK, they can handle that.]
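(A guess at the minimal steps, to be checked by someone who actually has Windows -- the folder name is made up:)
1. Save fetch_list.txt, and unzip the wget .zip, all into one folder, say C:\aptoff
2. Start -> Run, type: cmd   (or "command" on Win9x) -- that little window
3. In it type:
   cd C:\aptoff
   wget -x -i fetch_list.txt -B http://debian.linux.org.tw/debian/pool/main/
4. Then burn the resulting directory tree onto the CD with Nero as usual.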
Re: apt-get via Windows with wget
It seems one cannot use the wget .exe without the DLLs, even if one only wants to connect to plain http sites, not any https sites.  So one cannot just click on the wget .exe from inside Unzip's file list.
wget has no tool to just show the size of a FTP url without fetching it?
True, the man page doesn't say --spider will tell me the size of a file without fetching it, but I already got used to that for http.  For ftp, though,
$ wget --spider -Y off -S ftp://gmt.soest.hawaii.edu/pub/gmt/4/GMT_high.tar.bz2
just gives some messages ending in
227 Entering Passive Mode (128,171,159,169,154,34)
-- not really much that assures the user the file is there, and still no idea of how big it is, without trying to fetch it.
> then just FTP the directory, the size usually can be seen there.
OK, so wget has no tool to just show the size of an FTP URL without fetching it?
-O vs. -nc
On the man page, the interaction between -O and -nc is not mentioned!  Nor, perhaps, -O vs. -N.  Indeed, why not produce an error when both -O and -nc are found, if you don't intend to let -nc work with -O -- though letting them work together would actually be best.
save more cookies between invocations
Wishlist: give a way to save the kinds of cookies you say you won't in
   --save-cookies FILE
       Save cookies to FILE at the end of session.  Cookies whose expiry
       time is not specified, or those that have already expired, are not
       saved.
so we can carry state between wget invocations, without having to dig them out of -S output.  There should be several levels of saving allowed, including overriding expiry dates, etc.
Re: save more cookies between invocations
H> Do you really need an option to also save expired cookies?
You should allow the user power over all aspects...
Say "older or the same age"
$ info
   The time-stamping in GNU Wget is turned on using `--timestamping' (`-N')
   option, or through `timestamping = on' directive in `.wgetrc'.  With this
   option, for each file it intends to download, Wget will check whether a
   local file of the same name exists.  If it does, and the remote file is
   older, Wget will not download it.
Say "older or the same age", not just "older".  On another info page:
   The `Last-Modified' header is examined to find which file was modified
   more recently (which makes it "newer").  If the remote file is newer, it
   will be downloaded; if it is older, Wget will give up.(1)
Mention what happens if they are the same age.
   (1) As an additional check, Wget will look at the `Content-Length'
   header, and compare the sizes; if they are not the same, the remote file
   will be downloaded no matter what the time-stamp says.
Mention what happens if we get "Length: unspecified".  Apparently that will not trigger a download.  (Good.)
say what circumstances wget will return non-zero
The docs should mention the return value... In fact it should be an item in the Info Concept Index, i.e., how to depend on
$ wget ... && bla || mla
So say under what circumstances wget will return non-zero.
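(The sort of thing one wants to be able to rely on -- URL and addressee made up:)
if wget -N http://www.example.com/file.tar.gz
then echo got it | mail john
else echo "wget failed, exit status $?" >> $HOME/errors
fi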
mention -e in the same paragraph
In Info, where you mention:
   Most of these commands have command-line equivalents (*note Invoking::),
   though some of the more obscure or rarely used ones do not.
you should also mention -e in the same paragraph.
--random-wait but no --wait
The man page doesn't say what will happen if one specifies --random-wait but no --wait.  Perhaps just say under --wait that --wait defaults to 0 if not set.
--print-uris
Wget should have a --print-uris option, to tell us what it is planning to get, so we can adjust things without making a commitment yet.  Perhaps useful with -i or -r...
parallel fetching
Maybe add an option so e.g., $ wget --parallel URI1 URI2 ... would get them at the same time instead of in turn.
only depend on the timestamp, not size
Man page:
   When running Wget with -N, with or without -r, the decision as to whether
   or not to download a newer copy of a file depends on the local and remote
   timestamp and size of the file.
I have an application where I want it only to depend on the timestamp.  Too bad there's no way to decouple the two conditions.
Re: parallel fetching
Phil> How about
Phil> $ wget URI1 & wget URI2
Mmm, OK, but unwieldy if many.  I guess I'm thinking about, e.g.,
$ wget --max-parallel-fetches=11 -i url-list
(hmm, with default=1 meaning not parallel, but sequential.)
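(In the meantime something close can be had with GNU xargs -- a rough sketch, assuming url-list has one URL per line and none of them contain spaces:)
$ xargs -P 11 -n 1 wget < url-list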
Re: parallel fetching
H> I suppose forking would not be too hard, but dealing with output from
H> forked processes might be tricky.  Also, people would expect `-r' to
H> "parallelize" as well, which would be harder yet.
OK, maybe add a section to the manual, showing that you have considered parallel fetching, but that the complications outweigh the gains.
--post-data --spider
$ man wget
   This example shows how to log to a server using POST and then proceed to
   download the desired pages, presumably only accessible to authorized
   users:
       # Log in to the server.  This can be done only once.
You mean "we only do this once".
       wget --save-cookies cookies.txt \
            --post-data 'user=foo&password=bar' \
            http://server.com/auth.php
Say, sometimes I bet --spider could be added to make it even more efficient.  Mention that.  (WWWOFFLE note: WWWOFFLE turns HEADs into GETs, and strips any --post-data content.  Maybe WWWOFFLE should tell the user in such cases, or something.)
-x vs. file.1
$ man wget
   When running Wget without -N, -nc, or -r, downloading the same file in
   the same directory will result in the original copy of file being
   preserved and the second copy being named file.1.
$ wget -x http://static.howstuffworks.com/flash/toilet.swf
$ wget -x http://static.howstuffworks.com/flash/toilet.swf
The second clobbered the first.  So better fix the docs.  Also there is no way to get file.1 with -x.  I suggest you make a way.
Re: --post-data --spider
BTW, because wget 1.9.1 has no way to save "session cookies" yet, that example will often fail.  Hopefully the user will soon be able to control which cookies get saved, no matter what the cookies themselves say.
tmp names
Perhaps a useful option would be to have files use a temporary name until the download is complete, then move them to the permanent name.
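(By hand the effect is roughly this, with made-up names:)
$ wget -O index.html.part http://www.example.com/ && mv index.html.part index.html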
mention that -p turns on -x
Mention that -p turns on or implies -x in both the -p and -x parts of both the man and info pages.
document -N -p
To Info node "Time-Stamping Usage" add a clarification about what happens when -N and -p are used together: are e.g., all the included images also checked, or just the main page?
incomplete sentence on man page
On the man page:
   For more information about the use of proxies with Wget,
   -Q quota
The sentence just stops, and then the next option begins.
No URLs found in -
Odd:
$ ssh debian.linux.org.tw wget -e robots=off --spider -t 1 -i - < a.2
No URLs found in -.
Or is this wget just too old?  P.S., no cheery responses received recently.
-p vs. ftp
>>>>> "D" == Derek B Noonburg <[EMAIL PROTECTED]> writes: D> On 20 Nov, Dan Jacobson wrote: D> Can you try the binary on my web site? D> ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.00-linux.tar.gz) >> >> But my batch script to wget it doesn't get it. >> I used wget -w 2 -e robots=off -p -t 1 -N -nv --random-wait D> Looks like "-p" doesn't work correctly (or at least doesn't do the D> expected thing) with ftp URLs.
error message contents thrown away
There is no way to see what
$ lynx -dump http://wapp8.taipower.com.tw/
can show me, when
$ wget -O - -S -s http://wapp8.taipower.com.tw/
08:54:48 ERROR 403: Access Forbidden.
i.e., the site's error message contents.
weird
Anybody home?  This looks weird:
$ wget --spider -S -r -l 1 http://www.noise.com.tw/eia/product.htm
--09:22:13--  http://www.noise.com.tw/eia/product.htm
           => `www.noise.com.tw/eia/product.htm'
Resolving localhost... 127.0.0.1
Connecting to localhost[127.0.0.1]:8080... connected.
Proxy request sent, awaiting response...
 1 HTTP/1.0 200 OK
 2 Date: Thu, 27 Jan 2005 00:00:07 GMT
 3 Server: Apache/1.3.20 (Unix) PHP/4.3.10
 4 Last-Modified: Wed, 19 Jan 2005 02:39:31 GMT
 5 ETag: "9f071-124a-41edc863"
 6 Accept-Ranges: bytes
 7 Content-Type: text/html
 8 Connection: close
 9 Proxy-Connection: close
200 OK
www.noise.com.tw/eia/product.htm: No such file or directory
FINISHED --09:22:13--
Downloaded: 0 bytes in 0 files
-N vs. Last-modified header missing
1. Anybody home?
2. No way to make wget not refetch the file when:
Last-modified header missing -- time-stamps turned off.
09:55:20 URL:http://bm2ddp.myweb.hinet.net/b3.htm [16087] -> "uris.d/bm2ddp.myweb.hinet.net/b3.htm" [1]
when using
wget -s -w 2 -e robots=off -P bla.d -p -t 1 -N -nv --random-wait -i -
--header with more than one cookie
In the man page, show how one does this
wget --cookies=off --header "Cookie: <name>=<value>"
with more than one cookie.
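(For the record, multiple cookies go into one Cookie header separated by "; ", so presumably something like the following, with made-up names and values:)
wget --cookies=off --header "Cookie: name1=value1; name2=value2" http://www.example.com/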
bug-wget still useful
Is it still useful to mail to [EMAIL PROTECTED]?  I don't think anybody's home.  Shall the address be closed?
Re: bug-wget still useful
P> I don't know why you say that.  I see bug reports and discussion of fixes
P> flowing through here on a fairly regular basis.
All I know is that my reports for the last few months didn't get the usual (any!) cheery replies.  However, I saw them on Gmane, yes.
flag to display just the errors
I see I must do
$ wget --spider -i file -nv 2>&1|awk '!/^$|^200 OK$/'
as the only way to get just the errors.  There is no flag that will let only the errors through and silence the rest; -q silences everything.  Wget 1.9.1