Hi,

I am mirroring a friendly site with Wget. The site excludes robots in
general but is supposed to allow my mirror, which identifies itself as
"FriendlyMirror".
For this purpose I asked the webmaster to set up his robots.txt as follows:

User-agent: FriendlyMirror
Disallow:

User-agent: *
Disallow: /

I start Wget with:

wget --user-agent FriendlyMirror -m http://Friendly.Site

Wget indeed identifies itself as user-agent "FriendlyMirror" to
Friendly.Site, but it still considers itself to be user-agent "Wget" when
applying the rules of robots.txt. It therefore falls under the "*" record
and is excluded.
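
To show the behaviour I expected, here is a minimal sketch that evaluates
the same robots.txt with Python's standard urllib.robotparser. This is not
Wget's own parser, only an illustration of how the two records apply to
each agent name. The helper allowed() and the path /index.html are made up
for the example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from above, inlined so the sketch needs no network access.
ROBOTS_TXT = """\
User-agent: FriendlyMirror
Disallow:

User-agent: *
Disallow: /
"""


def allowed(agent, url="http://Friendly.Site/index.html"):
    """Return True if `agent` may fetch `url` under ROBOTS_TXT.

    Hypothetical helper; the default URL path is an assumption.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Compare the name sent via --user-agent with Wget's built-in name.
    for agent in ("FriendlyMirror", "Wget"):
        print(agent, allowed(agent))
```

With this parser, "FriendlyMirror" should be allowed and "Wget" refused,
which is the distinction I would like Wget itself to make.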

I think it would be nice if Wget could be told to interpret robots.txt
using the name given with --user-agent, so that only my FriendlyMirror,
and not every other robot using Wget, is allowed to continue the
automatic download.

Any ideas?

Cheers,

Christian
