Hi all! My name is Sean Burke, and I'm writing a book for O'Reilly that is basically to replace Clinton Wong's now out-of-print /Web Client Programming with Perl/. In my draft so far, I haven't discussed actual recursive spiders; I've only discussed getting a given page and then every page it links to that is on the same host, since I think that most readers who think they want a recursive spider really don't. But it has been suggested that I cover recursive spiders, just for the sake of completeness.
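
For concreteness, here's roughly the sort of one-level, same-host fetcher I mean. This is a quick sketch, not the book's actual code; the start URL and agent name are just placeholders.

  #!/usr/bin/perl -w
  use strict;
  use LWP::UserAgent;
  use HTML::LinkExtor;
  use URI;

  my $start = URI->new('http://www.example.com/');  # placeholder start page
  my $ua = LWP::UserAgent->new;
  $ua->agent('example-fetcher/0.1');                # identify your robot

  # Gather the <a href="..."> links; giving $start as the base makes
  # HTML::LinkExtor hand back absolute URI objects.
  my @links;
  my $extor = HTML::LinkExtor->new(
    sub { my($tag, %attr) = @_;
          push @links, $attr{'href'} if $tag eq 'a' and $attr{'href'};
    },
    $start,
  );
  my $resp = $ua->get($start);
  die "Can't get $start: ", $resp->status_line, "\n" unless $resp->is_success;
  $extor->parse($resp->content);

  # One level deep, same host only -- no recursion.
  my %seen;
  for my $u (@links) {
    next if $seen{$u}++;
    next unless $u->scheme eq 'http';        # skip mailto:, ftp:, etc.
    next unless $u->host eq $start->host;    # stay on the starting host
    my $r = $ua->get($u);
    print "$u => ", $r->status_line, "\n";
    sleep 1;                                 # crude politeness pause
  }

The %seen hash is just there so duplicate links don't get fetched twice; since the loop never enqueues anything, the whole thing stops after one level no matter what the pages link to.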
Aside from the basic concepts (don't hammer the server; always obey robots.txt; don't span hosts unless you're really sure you want to), are there any particular bits of wisdom that list members would want me to pass on to my readers? (Below is roughly how I'd render those basics in code, in case it helps frame the question.)
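
This sketch leans on LWP::RobotUA, which fetches and obeys each host's robots.txt for you and enforces a minimum delay between requests to the same host. Again, the URL, agent name, and contact address are placeholders, and the page cap is only there to keep a sketch from running away.

  #!/usr/bin/perl -w
  use strict;
  use LWP::RobotUA;
  use HTML::LinkExtor;
  use URI;

  # LWP::RobotUA checks robots.txt before every request and spaces
  # out requests to any one host.  delay() is in minutes.
  my $ua = LWP::RobotUA->new('example-spider/0.1', 'you@your.site');
  $ua->delay(10/60);   # at least 10 seconds between hits on a host

  my $start = URI->new('http://www.example.com/');  # placeholder start page
  my @queue = ($start);
  my %seen  = ($start => 1);
  my $max_pages = 20;          # hard stop so the sketch can't run away

  while (@queue and $max_pages-- > 0) {
    my $url  = shift @queue;
    my $resp = $ua->get($url);
    print "$url => ", $resp->status_line, "\n";
    next unless $resp->is_success
            and $resp->content_type eq 'text/html';

    # Pull out this page's links and enqueue the same-host ones.
    my $extor = HTML::LinkExtor->new(undef, $url);
    $extor->parse($resp->content);
    for my $link ($extor->links) {
      my($tag, %attr) = @$link;
      next unless $tag eq 'a' and $attr{'href'};
      my $u = $attr{'href'};
      next unless $u->scheme eq 'http';       # skip mailto:, ftp:, etc.
      next unless $u->host eq $start->host;   # don't span hosts
      $u->fragment(undef);                    # "page#section" is still "page"
      push @queue, $u unless $seen{$u}++;
    }
  }

Note the content_type check before parsing: a spider that feeds GIFs and tarballs to an HTML parser wastes everyone's bandwidth. A request forbidden by robots.txt just comes back as an unsuccessful response, so the loop skips it naturally.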
--
Sean M. Burke  [EMAIL PROTECTED]  http://www.spinn.net/~sburke/