Ignoring robots.txt [was Re: wget default behavior...]

2007-10-17 Thread Tony Godshall
> ... Perhaps it should be one of those things that one can do
> oneself if one must but is generally frowned upon (like making a
> version of wget that ignores robots.txt).

Damn.  I was only joking about ignoring robots.txt, but now I'm
thinking[1] there may be good reasons to do so...  maybe it should be
in mainline wget.

T

[1] http://web.archive.org/web/20041013225557/http://www.differentstrings.info/archives/002813.html


Re: Ignoring robots.txt [was Re: wget default behavior...]

2007-10-17 Thread Micah Cowan
Tony Godshall wrote:
> ... Perhaps it should be one of those things that one can do
> oneself if one must but is generally frowned upon (like making a
> version of wget that ignores robots.txt).
>
> Damn.  I was only joking about ignoring robots.txt, but now I'm
> thinking[1] there may be good reasons to do so...  maybe it should be
> in mainline wget.

Actually, it is. -e robots=off. :)

This also turns off obedience to the nofollow attribute sometimes
found in <meta> and <a> tags.
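
For anyone who wants to try it, here is a rough sketch of a full
invocation. The URL, the recursion depth and the one-second wait are
only assumptions picked for illustration; the one part that comes from
this thread is -e robots=off. The snippet only prints the command so
it can be reviewed before being run by hand.

```shell
# Hypothetical target URL; substitute the site you actually want to fetch.
url="https://example.com/docs/"

# -e robots=off : run the wgetrc command "robots=off" for this invocation
# -r -l 2       : recurse, at most two levels deep (illustrative choice)
# --wait=1      : pause one second between requests (illustrative choice)
cmd="wget -e robots=off -r -l 2 --wait=1 $url"

# Print the command for review; paste it into a shell to run it.
echo "$cmd"
```

Since -e simply executes a wgetrc command, the same setting can be
made persistent by putting the line "robots = off" in ~/.wgetrc.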

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: Ignoring robots.txt [was Re: wget default behavior...]

2007-10-17 Thread Tony Godshall
> Tony Godshall wrote:
>> ... Perhaps it should be one of those things that one can do
>> oneself if one must but is generally frowned upon (like making a
>> version of wget that ignores robots.txt).
>>
>> Damn.  I was only joking about ignoring robots.txt, but now I'm
>> thinking[1] there may be good reasons to do so...  maybe it should be
>> in mainline wget.
>
> Actually, it is. -e robots=off. :)
>
> This also turns off obedience to the nofollow attribute sometimes
> found in meta and a tags.

Ah, my ignorance is showing.

I stand corrected.