Hello,
I would like to talk about the --page-requisites (-p) and -H (host spanning)
options. The manual says that -p « causes Wget to download all the files
that are necessary to properly display a given HTML page. This includes
such things as inlined images, sounds, and referenced stylesheets ».
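For concreteness, a combined invocation might look like this (a sketch;
example.com and images.example.net are placeholder hosts, and the exact
interaction of -p with host spanning differs between wget versions):

# -p fetches the page plus its requisites, -H lets requisites that live on
# another host be fetched too, and -D limits host spanning to these domains.
wget -p -H -D example.com,images.example.net http://example.com/page.html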
Hrvoje Niksic wrote:
I think you have a point there -- -A shouldn't so blatantly invalidate
-p. That would be IMHO the best fix to the problem you're
encountering.
Frank mentioned that limitation in his first reply.
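To make the limitation concrete, a hedged example (placeholder URL; the
behaviour is the one described above, where the accept list is also applied
to page requisites):

# -p asks for the page's requisites, but -A rejects everything that is not
# .jpg or .html, so requisites such as stylesheets end up being discarded.
wget -p -A jpg,html http://example.com/page.html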
Tobias Tiederle wrote:
I just set up my compile environment for Wget again.
When I added regex support, I had the same problem with exclusion, so I
introduced a new parameter, --follow-excluded-html
(which is of course the default), but you can turn it off with
--no-follow-excluded-html...
See
Mauro Tortonesi wrote:
although i really dislike the name --no-follow-excluded-html, i
certainly agree on the necessity to introduce such a feature into
wget.
can we come up with a better name (and reach consensus on that)
before i include this feature in wget 1.11?
I agree, "no" shouldn't be
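For reference, a hypothetical invocation with Tobias's patch applied (the
option name comes from his message above; it is not part of any released
wget, and the URL is a placeholder):

wget -r --no-follow-excluded-html http://example.com/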
Hello,
I want to archive an HTML page and « all the files that are necessary to
properly display » it (Wget manual), plus all the linked images
(<a href=linked_image_url><img src=inlined_image_url></a>). I tried most
options and features: recursive archiving, including and excluding
directories and
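As the reply below explains, stock wget cannot do exactly this; the closest
approximation I know of is to recurse one level from the page, which
over-fetches (placeholder URL):

# -r -l 1 follows the <a href=...> links one level deep, so the linked
# images come along with the page requisites, but so does every other
# document linked from the page; trimming that with -A runs into the
# -p/-A conflict mentioned above.
wget -p -r -l 1 http://example.com/gallery.html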
Frank McCown wrote:
I'm afraid wget won't do exactly what you want it to do. Future
versions of wget may enable you to specify a wildcard to select which
files you'd like to download, but I don't know when you can expect
that behavior.
The more I use wget, the more I like it, even if I use
Frank McCown wrote:
I'm afraid wget won't do exactly what you want it to do. Future
versions of wget may enable you to specify a wildcard to select which
files you'd like to download, but I don't know when you can expect
that behavior.
I have another opinion about that limitation. Could it
Tony Lewis wrote:
The --convert-links option changes the website path to a local file
system path. That is, it changes the directory, not the file name.
Thanks, I didn't understand it that way.
IMO, your suggestion has merit, but it would require wget to maintain
a list of MIME types and
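A hedged illustration of that behaviour (placeholder URLs, default local
layout as wget creates it):

wget -p --convert-links http://example.com/article/page.html
# In the saved example.com/article/page.html, a reference such as
#   http://example.com/images/photo.jpg
# is rewritten to the relative local path
#   ../images/photo.jpg
# i.e. the directory part is mapped onto the local tree while the file
# name itself is left unchanged.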
Hello,
Since I began to post here I have received some spam from [EMAIL PROTECTED].
Mostly it sends replies to my posts and never subscribed to any mailing
list. Will unsubscribing from the list stop it?
Thanks and sorry, I'm not accustomed to mailing lists, but I don't understand
how come I got
Hello,
I was looking for an alternative to HTTrack to archive single pages and
found wget. It works like a charm thanks to the --page-requisites
option. However, I would like to post-process the archived files. I thought
of using the logs but it seems they are... just a bunch of messages. I
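One way to get something machine-readable out of wget (a sketch; the file
names are placeholders and the log format differs slightly between
versions): with -nv each retrieved file is logged on a single line with the
local name in double quotes, which GNU grep can pull out.

wget -p -nv -o fetch.log http://example.com/page.html
grep -o '"[^"]*"' fetch.log | tr -d '"' > saved-files.txt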
Jonathan wrote:
I think you should be using a tool like linklint (www.linklint.org)
not wget.
Thanks, I didn't know about that tool. However, as I'm not really into Perl
scripting, I wonder if you know of any PHP equivalent. And if I understand
correctly how it works, it seems link checkers like Linklint are
Shahram Bakhtiari wrote:
I would like to share my experience of a failed attempt at using
wget to get a list of files.
I used the following command to get a list of all existing mp3 files,
without really downloading them:
wget --http-user=peshvar2000 --http-passwd=peshvar2000 -r -np
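Not from the thread, but a hedged workaround, assuming that transferring
(and then discarding) the files is acceptable: --delete-after removes each
file once it has been retrieved, and the terse -nv log keeps a record of
every mp3 that was found (the URL is a placeholder, since the original
command is cut off):

wget --http-user=peshvar2000 --http-passwd=peshvar2000 -r -np -A mp3 \
     --delete-after -nv -o mp3-list.log http://example.com/mp3/
grep 'URL:.*\.mp3' mp3-list.log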