--page-requisites host spanning

2006-05-17 Thread Jean-Marc Molina
Hello, I would like to talk about the --page-requisites (-p) and -H (host spanning) options. From the manual we can read that -p « causes Wget to download all the files that are necessary to properly display a given html page. This includes such things as inlined images, sounds, and referenced
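For readers skimming the archive, a minimal invocation combining the two options might look like the following (the URL is a placeholder); -H lets page requisites hosted on other servers, such as images served from a separate media host, be fetched as well, and -k is added only so the saved page points at the local copies:

    wget -p -H -k http://example.com/page.html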

Re: Download all the necessary files and linked images

2006-05-17 Thread Jean-Marc Molina
Hrvoje Niksic wrote: I think you have a point there -- -A shouldn't so blatantly invalidate -p. That would be IMHO the best fix to the problem you're encountering. Frank mentioned that limitation in his first reply.
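To illustrate the interaction under discussion, a sketch such as the one below (URL and extensions are placeholders) asks for page requisites while restricting downloads with an accept list; because -A is also applied to the requisites, stylesheets or even the HTML page itself can end up rejected, which is the behaviour being called into question:

    wget -p -A jpg,jpeg,png,gif http://example.com/page.html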

Re: Download all the necessary files and linked images

2006-03-11 Thread Jean-Marc MOLINA
Tobias Tiederle wrote: I just set up my compile environment for Wget again. When I added regex support, I had the same problem with exclusion, so I introduced a new parameter, --follow-excluded-html (which is of course the default), but you can turn it off with --no-follow-excluded-html... See
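As a sketch of how the proposed switch would be used (it comes from the patch described above and is not part of stock wget; the excluded directory and URL are placeholders), the default behaviour would still parse excluded HTML for links, and the negative form turns that off:

    wget -r -X /excluded --no-follow-excluded-html http://example.com/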

Re: Download all the necessary files and linked images

2006-03-11 Thread Jean-Marc MOLINA
Mauro Tortonesi wrote: although i really dislike the name --no-follow-excluded-html, i certainly agree on the necessity to introduce such a feature into wget. can we come up with a better name (and reach consensus on that) before i include this feature in wget 1.11? I agree no shouldn't be

Download all the necessary files and linked images

2006-03-09 Thread Jean-Marc MOLINA
Hello, I want to archive an HTML page and « all the files that are necessary to properly display » it (Wget manual), plus all the linked images (<a href="linked_image_url"><img src="inlined_image_url"></a>). I tried most options and features: recursive archiving, including and excluding directories and
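One workaround sometimes tried for this (a sketch only, with a placeholder URL) is to combine -p with a single level of recursion and an accept list so that the linked full-size images are fetched along with the inlined thumbnails; as the rest of the thread notes, the accept list can then interfere with the page requisites themselves:

    wget -p -k -r -l 1 -A jpg,jpeg,png,gif,html http://example.com/gallery.html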

Re: Download all the necessary files and linked images

2006-03-09 Thread Jean-Marc MOLINA
Frank McCown wrote: I'm afraid wget won't do exactly what you want it to do. Future versions of wget may enable you to specify a wildcard to select which files you'd like to download, but I don't know when you can expect that behavior. The more I use wget, the more I like it, even if I use

Re: Download all the necessary files and linked images

2006-03-09 Thread Jean-Marc MOLINA
Frank McCown wrote: I'm afraid wget won't do exactly what you want it to do. Future versions of wget may enable you to specify a wildcard to select which files you'd like to download, but I don't know when you can expect that behavior. I have another opinion about that limitation. Could it

Re: bug retrieving embedded images with --page-requisites

2005-11-09 Thread Jean-Marc MOLINA
Tony Lewis wrote: The --convert-links option changes the website path to a local file system path. That is, it changes the directory, not the file name. Thanks, I didn't understand it that way. IMO, your suggestion has merit, but it would require wget to maintain a list of MIME types and
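As a small illustration of Tony's point (paths and URL are hypothetical), --convert-links rewrites a reference such as http://example.com/images/photo.jpg in the saved page into a relative local path like images/photo.jpg; the file name itself, including its extension, is left untouched:

    wget -p -k http://example.com/page.html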

How come I get spammed by wget@sunsite.dk?

2005-11-09 Thread Jean-Marc MOLINA
Hello, Since I began posting here I have been getting some spam from [EMAIL PROTECTED]. Mostly it sends replies to my posts, and I never subscribed to any mailing list. Will unsubscribing from the list stop it? Thanks and sorry, I'm not accustomed to mailing lists but don't understand how come I got

logfile and log messages parser

2005-11-08 Thread Jean-Marc MOLINA
Hello, I was looking for an alternative to HTTrack to archive single pages and found wget. It works like a charm thanks to the --page-requisites option. However I would like to post-process the archived files. I thought of using the logs but it seems they are... just a bunch of messages. I
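As a rough sketch of the kind of post-processing being considered (the log wording varies between wget versions, so the grep pattern is an assumption rather than a documented interface), the names of saved files can be pulled out of a log written with -o using ordinary text tools:

    wget -p -k -o fetch.log http://example.com/page.html
    grep 'saved' fetch.log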

Re: Getting list of files

2005-11-08 Thread Jean-Marc MOLINA
Jonathan wrote: I think you should be using a tool like linklint (www.linklint.org) not wget. Thanks, I didn't know about that tool. However, as I'm not really into Perl scripting, I wonder if you know of any PHP equivalent. And if I understand correctly how it works, it seems link checkers like Linklint are

Re: Getting list of files

2005-11-02 Thread Jean-Marc MOLINA
Shahram Bakhtiari wrote: I would like to share my experience of a failed attempt at using wget to get a list of files. I used the following command to get a list of all existing mp3 files, without really downloading them: wget --http-user=peshvar2000 --http-passwd=peshvar2000 -r -np
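Depending on the wget version, one commonly suggested alternative (a sketch with a placeholder URL and log name; any authentication options from the original command would carry over) is spider mode, which follows links and records the visited URLs in the log without keeping the downloaded bodies:

    wget --spider -r -np -o listing.log http://example.com/music/
    grep 'http://' listing.log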