Clicking on that link redirects to this page:
https://lists.man.lodz.pl/mailman/listinfo
and files are then unnecessarily downloaded from all the links on that
page (I do not want that page or its subpages).
So how can I block it?
Could you use -X /mailman/listinfo?
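For illustration, a sketch of how -X (--exclude-directories) might be
combined with a mirror of the list archive; the start URL and the other
options here are placeholders rather than your exact invocation:
wget -m -k -X /mailman/listinfo http://lists.man.lodz.pl/pipermail/chemfan/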
I
I believe 1.9.1 had a bug in this area when -m (which implies -l0) was
used. Could you try specifying -l50 along with the other options, and
after -m?
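For instance, something along these lines, where the URL and the
remaining options are stand-ins borrowed from elsewhere in the thread;
the point is only that -l50 placed after -m should override the
unlimited depth that -m implies:
wget -m -l50 -nv -k -K -E -np -t 1000 http://znik.wbc.lublin.pl/ChemFan/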
It still downloaded everything.
a.
Yup. So I assume that the problem you see is not that of wget mirroring, but
a combination of saving to a custom dir (with --cut-dirs and the like) and
conversion of the links. Obviously, the link to
http://znik.wbc.lublin.pl/Mineraly/Ftp/UpLoad/index.html, which would be
correct for a
With that patch the mirror seems correct in the 2nd run. Please let
me know if it works for you.
*After* I deleted the files with the wrong URLs, the patched wget 1.9.1
retrieved the files correctly, and a second run did not change the
URLs to wrong ones. So it worked on the
There is no index.html under this link:
http://znik.wbc.lublin.pl/Mineraly/Ftp/UpLoad/
but when I mirror the whole http://znik.wbc.lublin.pl/Mineraly/ web site
with the command:
cd $HOME/web/mineraly
wget -m -nv -k -K -E -nH --cut-dirs=1 -np -t 1000 -D wbc.lublin.pl -o
Yes, because this is in the HTML file itself:
http://znik.wbc.lublin.pl/Mineraly/Ftp/UpLoad/index.html;
It does not work in a browser, so why should it work in wget?
It works in the browser:
http://znik.wbc.lublin.pl/Mineraly/Ftp/UpLoad/
There is no index.html, and the content of the directory is
When you specify a directory, it is up to the web server to determine
what resource gets returned. Some web servers will return a directory
listing, some will return some file (such as index.html), and others will
return an error.
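One way to see which of those cases applies is to ask wget to print the
server's reply headers with -S (--server-response); --spider avoids
saving anything:
wget -S --spider http://znik.wbc.lublin.pl/Mineraly/Ftp/UpLoad/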
I know! But that is intentionally left without
That was under 1.9.1 version.
No, sorry, I've just checked that it was under 1.8.1.
IMO, this is not correct. index.html will contain the information the
directory listing held at the point of download.
This works for me with znik.wbc.lublin.pl/Mineraly/Ftp/UpLoad/ as well,
which seemed to be the problem according to your other post.
Yes, it works for me as well when I already
Which link? The non-working one on your incorrect mirror or the working one
on my correct mirror on my HDD?
The non-working one on my mirror.
No need to get snappy, Andrzej.
You're right, I am *really* sorry!
The problem is
solved, though, by running the 1.9.1 wget version.
I still am
Thus it seems that it should not matter in what order the options are
given. If it does, I suggest that the developers of wget put appropriate
info in the manual.
Yes, you're right. Anyway, I've often found that it's sometimes quite
tricky to set up the command line to get exactly
I agree with your criticism, if not with your tone. We are working on
improving Wget, and I believe that the problems you have seen will be
fixed in the versions to come. (I plan to look into some of them for
the 1.11 release.)
OK. Thanks. Good to hear that. Looking forward impatiently to
Hi!
I want to mirror the website:
http://web.pertus.com.pl/~andyk/
into the directory
/www/wyciszanie.pc
I use it like this:
wget -m -nv -k -K -E -nH -p -np -t 1000 -D web.pertus.com.pl -o
$HOME/logiwget/logwyciszanie -P /www/wyciszanie.pc
http://web.pertus.com.pl/~andyk/
Unfortunately it
Using the -p option should guarantee downloading of all the graphics etc.:
wget -m -nv -k -K -E -nH -p -np -t 1000 -D andyk.feedle.com -o
$HOME/logiwget/logandyk -P /www/andyk http://andyk.feedle.com/
but it doesn't work here. Why?
The wget version installed on the system is 1.7.
That command:
wget -m -nv -k -K -E -nH -p -np -t 1000 -o $HOME/logiwget/logminerals -P
/infos/www/data/org/chemfan/minerals http://minerals.feedle.com/
should download the content of the page http://minerals.feedle.com/ and
of http://lists.feedle.net/pipermail/minerals/
but instead in the
When I set it like this:
wget -m -nv -k -K -E -nH -p -np -t 1000 -D wbc.lublin.pl -o
$HOME/logiwget/logchemfan -P /infos/www/data/org/chemfan/chemfan
http://znik.wbc.lublin.pl/ChemFan/
then everything is downloaded into the subdirectory ChemFan:
/infos/www/data/org/chemfan/chemfan/ChemFan/
while
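For what it's worth, a sketch of the same command with --cut-dirs=1
added (the log option is omitted here), which should strip the leading
ChemFan component so the files land directly in the -P directory:
wget -m -nv -k -K -E -nH --cut-dirs=1 -np -t 1000 -D wbc.lublin.pl \
  -P /infos/www/data/org/chemfan/chemfan http://znik.wbc.lublin.pl/ChemFan/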
Thanks, Patrick, for the reply.
AFAICS your command line is somewhat mixed up.
Usually I call wget and first give it the path where it should save all
the files, followed by more options, and finally the URL to fetch from
(usually in quotation marks, to be safe).
According to man
Because a ~andyk directory is in the URL. If you don't want it, use
either -nd or --cut-dirs=1, depending on whether you want to get rid
of the whole directory hierarchy or just that one dir.
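To illustrate the difference with the URL from your question (a sketch,
not your full command line):
wget -m -nH -nd http://web.pertus.com.pl/~andyk/
saves every file flat into the current directory, while
wget -m -nH --cut-dirs=1 http://web.pertus.com.pl/~andyk/
keeps the hierarchy but removes only the leading ~andyk component.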
--cut-dirs=1 solves the problem. Thanks.
BTW, why are there two dashes -- instead of one - before
Images are loaded from other sites. Use -H to allow Wget to go to
other sites and -D feedle.com to limit the search to *.feedle.com.
But with -m, -H would download the whole internet, and I don't want
that.
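For reference, a simplified sketch of that suggestion; here -D is
exactly what keeps -H from wandering off across the whole internet:
wget -m -p -H -D feedle.com http://andyk.feedle.com/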
Sometimes it creates problems, as I described before:
Sorry about that.
When can I expect it to work (ver. 1.10?)?
The manual regarding the -p option says: Note that Wget will
behave as if -r had been specified, but only that single page and
its requisites will be downloaded. Links from that page to external
documents will not be
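As a minimal illustration of that paragraph (URL reused from the
thread), a command like
wget -p -k http://znik.wbc.lublin.pl/ChemFan/
should fetch that one page plus the images and stylesheets needed to
display it, and nothing else.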
I mirrored the chemfan site using these options:
wget -m -nv -k -K -E -nH --cut-dirs=1 -np -t 1000 -D wbc.lublin.pl -o
$HOME/logiwget/logchemfan.pl -P $HOME/web/chemfan.pl -p
http://znik.wbc.lublin.pl/ChemFan/
and unfortunately the links are not converted properly in the mirror:
I only want to mirror a web site whose content is on two different
servers (different domains), and do not want to mirror parent
directories.
It seems that it would be good if the -np option worked for the settings
given in the -I option, and/or if in the -D option it were possible to put not only
Unfortunately not. Wget is run by volunteers, and if this extension
is not interesting enough for a programmer to pick it up, it won't
get done.
I think it is very interesting and, above all, useful. Without it the
real, smooth, fully automatic mirroring of web sites, which have their
Hi!
I want to make a mirror of this site:
http://znik.wbc.lublin.pl/ChemFan/
However that site also includes (has a link to) an archive on another
server under the addresses:
http://lists.man.lodz.pl/pipermail/chemfan/
ftp://ftp.man.lodz.pl/pub/doc/LISTY-DYSKUSYJNE/CHEMFAN
so I want to download
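A hedged sketch of one possible starting point; the -I paths are
guesses, and note that -np would only protect the starting URL's parent,
which is exactly the limitation discussed elsewhere in this thread:
wget -m -k -H -D wbc.lublin.pl,man.lodz.pl \
  -I /ChemFan,/pipermail/chemfan,/pub/doc/LISTY-DYSKUSYJNE \
  http://znik.wbc.lublin.pl/ChemFan/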
How can I force wget to download JavaScript links:
http://znik.wbc.lublin.pl/ChemFan/kalkulatory/javascript:wrzenie():
17:04:44 ERROR 404: Not Found.
http://znik.wbc.lublin.pl/ChemFan/kalkulatory/javascript:cisnienia():
17:04:45 ERROR 404: Not Found.
Or maybe it can download it, but there is just
I'm using the following command to mirror a web page:
wget -m -nv -k -K -nH -t 100 -D wbc.lublin.pl -o logchemfanpl -P
public_html/mirror http://znik.wbc.lublin.pl/ChemFan/
For some strange reason wget tries to download all the http:// addresses
that visitors have put in a Guest Book:
Hi,
please use the wget command as follows:
wget -t 1 -T 1 1.1.1.1
or with any other inaccessible address.
After executing this command I get the following messages:
--08:57:37-- http://1.1.1.1/
=> `index.html'
Connecting to 1.1.1.1:80...
and after a much longer period than 1
to download all available domains, I suppose?
So why is this happening, and how can I avoid it?
Regards
Andrzej.
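If I recall correctly, in wget of that era -T covered only the read
timeout, not the connect attempt, which is why the command hangs for
much longer than 1 second. Later versions (1.10 and up, I believe) add
a separate knob, so with a newer wget something like this should cap
the connect phase as well:
wget -t 1 --connect-timeout=1 http://1.1.1.1/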
On 18 Aug 2003 at 13:49, Post, Mark K wrote:
man wget shows:
-D domain-list
--domains=domain-list
Set domains to be followed. domain-list is a comma-separated
list of domains.
Note that it does not turn on -H.
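In other words, -D is only a filter once spanning is enabled; a hedged
example:
wget -r -H -D wbc.lublin.pl http://znik.wbc.lublin.pl/ChemFan/
With -H present, wget may leave the starting host, but -D limits it to
hosts matching wbc.lublin.pl; without -H it never leaves the starting
host, so -D alone changes nothing.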
Right, but by default wget should not
How could I download this with wget:
mms://mms.itvp.pl/bush_archiwum/bush.wmv
If wget cannot manage it, then what can?
Cheers!
Andy
Hi!
I sent a question to the list yesterday without subscribing, hoping that
I would read the answers in the archives; however, none of the 3 archives
works! Please forward any replies to yesterday's e-mail to me privately.
Cheers!
ak