Excellent.  I have a copy of Wong's book at home and like that topic
(i.e. I'm a potential customer :))  When will it be published?
I think lots of people do want to know about recursive spiders, and I
bet some of the most frequent obstacles are issues like queueing, depth-
vs. breadth-first crawling, (memory-)efficient storage of extracted and
crawled links, etc.
I think that if you covered those topics well, lots of people would be
very grateful.
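The queueing and depth- vs. breadth-first issues come down to one data
structure choice. Here's a minimal sketch — in Python for brevity, though
the book's examples are in Perl — using a toy in-memory link graph as a
stand-in for real fetched pages, with a "seen" set so each link is stored
and queued only once:

```python
from collections import deque

# Toy link graph standing in for real fetched pages (illustration only;
# a real spider would fetch each URL and extract its links).
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/c"],
    "/c": ["/"],
}

def crawl(start, breadth_first=True):
    seen = {start}           # dedupe: remember every URL ever queued
    frontier = deque([start])
    order = []
    while frontier:
        # popleft() makes the frontier a queue (breadth-first);
        # pop() makes it a stack (depth-first).
        url = frontier.popleft() if breadth_first else frontier.pop()
        order.append(url)
        for link in LINKS.get(url, []):
            if link not in seen:   # check before queueing, not after
                seen.add(link)
                frontier.append(link)
    return order
```

The same loop does both traversals; only which end of the frontier you
take from changes. For big crawls, the seen-set is the part that eats
memory, so that's where compact storage matters most.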

Thank you for asking, I hope this helps.
Otis

--- "Sean M. Burke" <[EMAIL PROTECTED]> wrote:
> 
> Hi all!
> My name is Sean Burke, and I'm writing a book for O'Reilly, which is
> basically to replace Clinton Wong's now out-of-print /Web Client
> Programming with Perl/.  In my book draft so far, I haven't discussed
> actual recursive spiders (I've only discussed getting a given page, and
> then every page that it links to that is also on the same host), since
> I think that most readers who think they want a recursive spider really
> don't.  But it has been suggested that I cover recursive spiders, just
> for the sake of completeness.
> 
> Aside from basic concepts (don't hammer the server; always obey the
> robots.txt; don't span hosts unless you are really sure that you want
> to), are there any particular bits of wisdom that list members would
> want me to pass on to my readers?
> 
> --
> Sean M. Burke    [EMAIL PROTECTED]    http://www.spinn.net/~sburke/
> 
> 
> --
> This message was sent by the Internet robots and spiders discussion
> list ([EMAIL PROTECTED]).  For list server commands, send "help" in
> the body of a message to "[EMAIL PROTECTED]".
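Sean's three ground rules could be sketched as a single gatekeeper check
that a spider runs before every fetch. This is a Python illustration
(again, not Perl as in the book), with a hypothetical host name and an
inlined robots.txt standing in for one fetched from the server:

```python
import time
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# In a real spider this would be fetched from http://HOST/robots.txt;
# it is inlined here so the sketch runs without network access.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

START_HOST = "www.example.com"   # hypothetical starting host
DELAY = 2.0                      # minimum seconds between requests
_last_fetch = 0.0

def may_fetch(url):
    """Gatekeeper: host check, robots.txt check, then politeness delay."""
    global _last_fetch
    if urlparse(url).netloc != START_HOST:   # don't span hosts
        return False
    if not rp.can_fetch("*", url):           # always obey robots.txt
        return False
    wait = DELAY - (time.time() - _last_fetch)
    if wait > 0:
        time.sleep(wait)                     # don't hammer the server
    _last_fetch = time.time()
    return True
```

The ordering matters: the cheap host and robots checks reject a URL
before any sleeping happens, so only requests that will actually go out
pay the politeness delay.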


__________________________________________________
Do You Yahoo!?
Try FREE Yahoo! Mail - the world's greatest free email!
http://mail.yahoo.com/

--
This message was sent by the Internet robots and spiders discussion list 
([EMAIL PROTECTED]).  For list server commands, send "help" in the body of a message 
to "[EMAIL PROTECTED]".
