That's a curious remark about readers and their supposedly misplaced desire for recursive spiders. A recursive spider lets its user drill down into a particular information domain and ultimately exhaust it, if the spider is capable enough. This is of enormous benefit to the information researcher looking for a complete and accurate view of the domain, as opposed to the relevancy-scored aggregate data provided by most search engines. It may not be appropriate for all sites or all topics, but it can certainly provide an abundant yield given the proper parameters.
-----Original Message-----
From: Sean M. Burke [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 07, 2002 3:51 AM
To: [EMAIL PROTECTED]
Subject: [Robots] Perl and LWP robots

Hi all! My name is Sean Burke, and I'm writing a book for O'Reilly, which is basically to replace Clinton Wong's now out-of-print /Web Client Programming with Perl/.

In my book draft so far, I haven't discussed actual recursive spiders (I've only discussed getting a given page, and then every page it links to that is also on the same host), since I think that most readers who think they want a recursive spider really don't. But it has been suggested that I cover recursive spiders, just for the sake of completeness.

Aside from the basic concepts (don't hammer the server; always obey robots.txt; don't span hosts unless you are really sure that you want to), are there any particular bits of wisdom that list members would want me to pass on to my readers?

--
Sean M. Burke [EMAIL PROTECTED] http://www.spinn.net/~sburke/

--
This message was sent by the Internet robots and spiders discussion list ([EMAIL PROTECTED]). For list server commands, send "help" in the body of a message to "[EMAIL PROTECTED]".
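The basic concepts listed above (throttle requests, obey robots.txt, stay on one host) can be sketched as a small breadth-first crawler. This is a minimal illustration, not code from the thread or the book: in Perl the idiomatic equivalent would use LWP::RobotUA and HTML::LinkExtor, but the sketch below uses only Python's standard library so it is self-contained. The agent string, page limit, and delay are all hypothetical parameters.

```python
import time
import urllib.request
import urllib.robotparser
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

AGENT = "ExampleSpider/0.1"  # hypothetical user-agent string


class LinkExtractor(HTMLParser):
    """Collect href targets from <a> tags, resolved against a base URL."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base, value))


def same_host(seed, candidate):
    """Don't span hosts: only follow links on the seed URL's host."""
    return urlparse(candidate).netloc == urlparse(seed).netloc


def spider(start, max_pages=50, delay=1.0):
    """Breadth-first crawl from `start`, honoring robots.txt and a delay."""
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(urljoin(start, "/robots.txt"))
    rp.read()  # fetch and parse the host's robots.txt once

    queue, seen = [start], {start}
    while queue and max_pages > 0:
        url = queue.pop(0)
        if not rp.can_fetch(AGENT, url):  # always obey robots.txt
            continue
        try:
            with urllib.request.urlopen(url) as resp:
                html = resp.read().decode("utf-8", "replace")
        except OSError:
            continue
        max_pages -= 1
        parser = LinkExtractor(url)
        parser.feed(html)
        for link in parser.links:
            if link.startswith("http") and same_host(start, link) \
                    and link not in seen:
                seen.add(link)
                queue.append(link)
        time.sleep(delay)  # don't hammer the server
    return seen
```

The queue-plus-seen-set structure is what keeps a recursive spider from looping forever on circular links; the `max_pages` cap and per-request delay are the crude but essential politeness controls the message alludes to.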
