Re: -R and HTML files

Matthias Vill Wed, 22 Aug 2007 23:54:01 -0700

Micah Cowan wrote:
> Josh Williams wrote:
>> On 8/22/07, Micah Cowan <[EMAIL PROTECTED]> wrote:
>>> What would be the appropriate behavior of -R then?
>> I think the default option should be to download the html files to
>> parse the links, but it should discard them afterwards if they do not
>> match the acceptance list.
> 
> Heh, that _is_ the current default. But I'm not convinced that's what
> the naïve user is going to expect the default to be. Especially since
> the manpage doesn't mention it, and the info page only mentions it if
> you dig into the details section.
> 
> OTOH, it has a history, so choosing to change it is not a small decision.
>


To me, downloading HTML files that match the rejection patterns makes no
sense.
Of course, there is the case where you want "the whole site, but", let's
say, none of the pictures because they are too big.

I don't know whether this is actually a good idea, but I would suggest
combining MIME types and paths in the accept/reject lists, so that you could say
-R "image/jpeg:*" -A "text/html:*,*:*static*"

That way you don't get any JPEGs, but accept all text/html and also everything
"static". If the first part is left out, it could default to everything but
text/html, to stay compatible with prior versions.
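
To make the idea a bit more concrete, here is a rough Python sketch of how such combined rules might be evaluated. None of this is existing wget behaviour or code: the function names (parse_rules, rule_matches, should_keep) are made up for illustration, the HTML type is written with its registered name text/html, and I am assuming that reject rules win over accept rules and that an empty accept list accepts everything.

```python
from fnmatch import fnmatchcase

HTML = "text/html"  # registered MIME type for HTML documents


def parse_rules(spec):
    """Split a comma-separated 'mime:path' list into (mime, path) pairs.

    A rule without a colon is treated as a legacy path-only pattern and
    gets mime=None (hypothetical convention for this sketch).
    """
    rules = []
    for item in spec.split(","):
        mime, sep, path = item.partition(":")
        if not sep:
            mime, path = None, item
        rules.append((mime, path))
    return rules


def rule_matches(rule, mime_type, path):
    """Return True if both the MIME part and the path part match."""
    mime_pat, path_pat = rule
    if mime_pat is None:
        # Legacy rule: applies to everything except HTML, as proposed.
        mime_ok = mime_type != HTML
    else:
        mime_ok = fnmatchcase(mime_type, mime_pat)
    return mime_ok and fnmatchcase(path, path_pat)


def should_keep(mime_type, path, accept, reject):
    """Assumed policy: reject wins; an empty accept list accepts all."""
    if any(rule_matches(r, mime_type, path) for r in reject):
        return False
    return not accept or any(rule_matches(r, mime_type, path) for r in accept)


reject = parse_rules("image/jpeg:*")
accept = parse_rules("text/html:*,*:*static*")
for mime_type, path in [
    ("image/jpeg", "/static/photo.jpg"),
    ("text/html", "/index.html"),
    ("image/png", "/static/logo.png"),
    ("image/png", "/img/logo.png"),
]:
    print(mime_type, path, should_keep(mime_type, path, accept, reject))
```

With the legacy form, a plain -R "*.html" would turn into a rule that never matches an HTML document, so such pages would still be kept, while a non-HTML file with that name would be rejected.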

Cheers

Matthias
