That's a curious remark about readers and their misplaced desire for recursive
spiders.  A recursive spider lets its user drill down into a particular
information domain and, if the spider is capable enough, ultimately exhaust it.
That is of enormous benefit to the information researcher looking for a
complete and accurate view of the domain, as opposed to the relevance-scored
aggregate data provided by most search engines.  It may not be appropriate for
all sites or all topics, but given the proper parameters it can certainly
provide an abundant yield.
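
To make that concrete, here is a rough sketch of the kind of recursive spider
I have in mind, in Perl with LWP.  The start URL, agent name, and contact
address are made up for illustration.  LWP::RobotUA takes care of fetching and
honoring robots.txt and of spacing out requests to the server, and the host
check keeps the crawl from wandering off the starting site; the loop just
keeps pulling URLs off a queue until nothing new turns up, which is the
"exhaust the domain" behaviour described above.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::RobotUA;
use HTML::LinkExtor;
use URI;

# Made-up starting point and robot identity -- change these for a real crawl.
my $start = URI->new('http://www.example.com/');
my $ua    = LWP::RobotUA->new(
    agent => 'ExampleSpider/0.1',
    from  => 'spider-admin@example.com',
);
$ua->delay(1);    # wait at least one minute between requests to the same host

my %seen  = ($start => 1);
my @queue = ($start);

while (my $url = shift @queue) {
    my $resp = $ua->get($url);    # RobotUA checks robots.txt before fetching
    next unless $resp->is_success and $resp->content_type eq 'text/html';
    print "$url\n";

    # Collect every <a href>, absolutized against the page's own URL.
    my @links;
    HTML::LinkExtor->new(
        sub {
            my ($tag, %attr) = @_;
            push @links, $attr{href} if $tag eq 'a' and $attr{href};
        },
        $url,
    )->parse($resp->content);

    for my $link (@links) {
        my $uri = URI->new($link)->canonical;
        next unless $uri->scheme eq 'http';        # skip mailto:, ftp:, etc.
        next unless $uri->host eq $start->host;    # don't span hosts
        $uri->fragment(undef);                     # page#a and page#b are one page
        next if $seen{$uri}++;
        push @queue, $uri;
    }
}

A real spider would also want a page-count or depth cap and somewhere to store
what it fetched, but the skeleton is the same.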

-----Original Message-----
From: Sean M. Burke [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 07, 2002 3:51 AM
To: [EMAIL PROTECTED]
Subject: [Robots] Perl and LWP robots



Hi all!
My name is Sean Burke, and I'm writing a book for O'Reilly, which is basically
to replace Clinton Wong's now out-of-print /Web Client Programming with
Perl/.  In my book draft so far, I haven't discussed actual recursive spiders
(I've only discussed getting a given page, and then every page that it links
to which is also on the same host), since I think that most readers who think
they want a recursive spider really don't.
But it has been suggested that I cover recursive spiders, just for the sake
of completeness.

Aside from basic concepts (don't hammer the server; always obey the
robots.txt; don't span hosts unless you are really sure that you want to),
are there any particular bits of wisdom that list members would want me to
pass on to my readers?

--
Sean M. Burke    [EMAIL PROTECTED]    http://www.spinn.net/~sburke/


--
This message was sent by the Internet robots and spiders discussion list
([EMAIL PROTECTED]).  For list server commands, send "help" in the
body of a message to "[EMAIL PROTECTED]".
