"Tony Lewis" <[EMAIL PROTECTED]> writes:

> Philip Mateescu wrote:
>
>> A warning message would be nice when for not so obvious reasons wget
>> doesn't behave as one would expect.
>>
>> I don't know if there are other tags that could change wget's behavior
>> (like -r and meta name="robots" do), but if they happen it would be
>> useful to have a message.
>
> I agree that this is worth a notable mention in the wget output. At the very
> least, running with -d should provide more guidance on why the links it has
> appended to urlpos are not being followed. Buried in the middle of hundreds
> of lines of output is:
>
> no-follow in index.php
>
> On the other hand, if other rules prevent a URL from being followed, you
> might see something like:
>
> Deciding whether to enqueue "http://www.othersite.com/index.html".
> This is not the same hostname as the parent's (www.othersite.com and
> www.thissite.com).
> Decided NOT to load it.

There's a practical reason for this discrepancy.  All these other
links are examined one by one and rejected one by one.  On the other
hand, when nofollow is specified, it causes Wget to not even
*consider* any of the links for download.
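
To illustrate the difference, here is a minimal Python sketch. It is
not Wget's actual code (Wget is written in C); the function name
links_to_enqueue, its parameters, and the log wording are all
hypothetical, and the only per-link rule modelled is the hostname
check from the debug output quoted above.

```python
from urllib.parse import urlparse


def links_to_enqueue(links, parent_host, nofollow, log):
    """Return the links a crawler would follow from one page.

    Hypothetical helper: 'log' is a plain list that collects the
    debug messages a user would see.
    """
    if nofollow:
        # The whole page is skipped at once, so there is only one
        # message and no per-link decision is ever logged.
        log.append("no-follow in page: ignoring %d links" % len(links))
        return []

    accepted = []
    for url in links:
        host = urlparse(url).hostname
        if host != parent_host:
            # Each rejected link gets its own explanation.
            log.append(
                "Decided NOT to load %s (%s is not %s)"
                % (url, host, parent_host)
            )
            continue
        accepted.append(url)
    return accepted
```

With nofollow set, the log holds a single line no matter how many
links the page contains, which is why that case is so easy to miss
in the -d output.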

Another tweak that should be added (easily, I think): Wget should
ignore robots when downloading the page requisites.
