I use this rule in .wgetrc:
accept = *[?]*
and
reject = *\.[zZ][iI][pP]*
I expected these rules to exclude everything matching *.zip* from download,
but in my test a URL like
http://domain.com/price/source/10-20030915.zip?PHPSESSID=0cd4eb0801c656a292e33c9b8134c899
was still downloaded.
What is wrong: wget or my regexp?
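One way to narrow this down is to check the two patterns outside wget. The
sketch below uses Python's fnmatch module as a stand-in for shell-style
wildcard matching. It is not wget's own matcher, and which string wget 1.8.2
actually compares against the patterns (the file name with or without the
query string) is an assumption that still has to be verified. Python's fnmatch
does not treat the backslash as an escape, so the reject pattern is written
with a plain dot, and the session ID is shortened.

```python
from fnmatch import fnmatchcase

ACCEPT = "*[?]*"              # accept rule from .wgetrc
REJECT = "*.[zZ][iI][pP]*"    # reject rule, with "\." written as "."


def check(name):
    """Return (matches_accept, matches_reject) for one candidate name."""
    return fnmatchcase(name, ACCEPT), fnmatchcase(name, REJECT)


# Two candidate strings a matcher might be given for the test URL.
with_query = "10-20030915.zip?PHPSESSID=0cd4eb08"
without_query = "10-20030915.zip"

for name in (with_query, without_query):
    accept_hit, reject_hit = check(name)
    print(f"{name!r}: accept={accept_hit} reject={reject_hit}")
```

With the query string included, both patterns match the name; without it, only
the reject pattern matches. So the patterns themselves behave as wildcards
should, and the open question is how wget combines the two lists for this URL.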
`--no-clobber' is a very useful option, but I retrieve documents with
suffixes other than .html/.htm.
Please add an option that, like -A/-R, defines the accepted/rejected rules
for the -nc option.
I use Wget 1.8.2.
When I try to retrieve a page with the '-nc' option and the server returns a
302 with a new URL, wget does not check that URL against the '-nc' rules; it
downloads it and overwrites the existing file.
I think wget does not apply the command-line option rules when it parses the
server response header!
Is it a bug?
I use wget 1.8.2.
When I try to recursively download the site site.com, where the first page
site.com/ redirects to site.com/xxx.html, and the first link in that page
points back to site.com/,
wget downloads only xxx.html and stops.
The other links from xxx.html are not followed!
Does wget have any rules to convert the retrieved URL into the stored URL?
Or might it have them in the future?
For example:
Get - site.com/index.php?PHPSESSID=123124324
Filter - /PHPSESSID=[a-z0-9]+//i
Save as - site.com/index.php
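The retrieve-to-store rule asked for above could look like the following
sketch. It is only an illustration in Python of the filter from the example,
not an existing wget option; the function name store_name and the handling of
a leftover '?' or '&' are assumptions.

```python
import re

# The filter from the example: /PHPSESSID=[a-z0-9]+//i
# A trailing '&' is also consumed so that later parameters stay well-formed.
_SESSION = re.compile(r"PHPSESSID=[a-z0-9]+&?", re.IGNORECASE)


def store_name(url):
    """Return the name to store a retrieved URL under (hypothetical helper)."""
    cleaned = _SESSION.sub("", url)
    # Drop a '?' or '&' left dangling once the session parameter is gone.
    return cleaned.rstrip("?&")


print(store_name("site.com/index.php?PHPSESSID=123124324"))
# prints: site.com/index.php
```

Other query parameters are kept, so site.com/index.php?a=1&PHPSESSID=abc
would be stored as site.com/index.php?a=1.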
I use wget 1.8.2
I tried to recursively download www.map-by.info/index.html, but wget stops at
the first page.
Why?
index.html has links to other pages.
/usr/local/bin/wget -np -r -N -nH --referer=http://map-by.info -P
/tmp/www.map-by.info -D map-by.info http://map-by.info
http://www.map-by.info
I think wget verifies link syntax too strictly:
a href=about_rus.html onMouseOver=img_on('main21');
onMouseOut=img_off('main21')
That link has an incorrect symbol ';' not quoted in a
-Original Message-
From: Sergey Vasilevsky [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 16, 2003 10:15
Thanks for explaining the reasons.
And I have another problem:
in .wgetrc I use
reject = *.[zZ][iI][pP]*,*.[rR][aA][rR]*,*.[gG][iI][fF]*,*.[jJ][pP][gG]*,*.[Ee][xX][Ee]*,*[=]http*
accept = *.yp*,*.pl*,*.dll*,*.nsf*,*.[hH][tT][mM]*,*.[pPsSjJ][hH][tT][mM]*,*.[pP][hH]
Wget 1.9.1
.wgetrc:
reject = *.[Ee][xX][Ee]*
follow_ftp = off
Command line:
wget -np -nv -r -N -nH --referer=http://www.orion.by -P
/tmp/www.orion.by -D orion.by http://www.orion.by
Output:
Last-modified header missing -- time-stamps turned off.
13:15:08