Excellent. I have a copy of Wong's book at home and like that topic (i.e. I'm a potential customer :)) When will it be published? I think lots of people do want to know about recursive spiders, and I bet some of the most frequent obstacles are issues like: queueing, depth-first vs. breadth-first crawling, (memory-)efficient storage of extracted and crawled links, etc. If you covered those topics well, I think lots of people would be very grateful.
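To make the queueing point concrete, here is a minimal sketch (in Python rather than Perl, purely for brevity; the site map, URLs, and function name are all invented for illustration). The frontier is a double-ended queue, and the choice between popping from the front or the back is exactly the breadth-first vs. depth-first decision; the `seen` set is the structure whose memory footprint becomes the problem on large crawls.

```python
from collections import deque

def crawl_order(links, start, breadth_first=True, max_pages=100):
    """Return the order pages would be visited in a toy crawl.

    `links` maps each URL to the URLs it links to (a stand-in for
    actually fetching a page and extracting its links).  The `seen`
    set records every URL ever queued, so each page is enqueued at
    most once -- keeping this set compact is the storage question.
    """
    frontier = deque([start])
    seen = {start}
    order = []
    while frontier and len(order) < max_pages:
        # popleft() makes the frontier a FIFO queue (breadth-first);
        # pop() makes it a LIFO stack (depth-first).
        url = frontier.popleft() if breadth_first else frontier.pop()
        order.append(url)
        for link in links.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order

# A made-up two-level site for demonstration.
site = {
    "/": ["/a", "/b"],
    "/a": ["/a1"],
    "/b": ["/b1"],
}
print(crawl_order(site, "/"))                       # breadth-first order
print(crawl_order(site, "/", breadth_first=False))  # depth-first order
```

Breadth-first visits both top-level pages before any second-level page; depth-first finishes one branch before starting the next. The `max_pages` cap is the simplest defense against crawling forever on a site with dynamically generated links.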
Thank you for asking, I hope this helps.

Otis

--- "Sean M. Burke" <[EMAIL PROTECTED]> wrote:
> Hi all!
> My name is Sean Burke, and I'm writing a book for O'Reilly, which is to
> basically replace Clinton Wong's now out-of-print /Web Client
> Programming with Perl/. In my book draft so far, I haven't discussed
> actual recursive spiders (I've only discussed getting a given page, and
> then every page that it links to which is also on the same host), since I
> think that most readers that think they want a recursive spider,
> really don't.
> But it has been suggested that I cover recursive spiders, just for sake of
> completeness.
>
> Aside from basic concepts (don't hammer the server; always obey the
> robots.txt; don't span hosts unless you are really sure that you want to),
> are there any particular bits of wisdom that list members would want me to
> pass on to my readers?
>
> --
> Sean M. Burke [EMAIL PROTECTED] http://www.spinn.net/~sburke/

--
This message was sent by the Internet robots and spiders discussion list ([EMAIL PROTECTED]). For list server commands, send "help" in the body of a message to "[EMAIL PROTECTED]".
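Since the quoted questions center on robots.txt compliance, here is a minimal sketch of rule checking using Python's standard-library urllib.robotparser (the Perl-side equivalents are WWW::RobotRules and LWP::RobotUA). The robots.txt rules, bot name, and URLs below are all made up for illustration; a real spider would fetch the live robots.txt from each host it visits.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; in a real spider you would fetch
# http://example.com/robots.txt and feed its lines in instead.
rules = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
]

rp = RobotFileParser()
rp.parse(rules)  # parse the rules directly, no network access needed

# can_fetch(useragent, url) applies the parsed rules to a candidate URL.
print(rp.can_fetch("MyBot/1.0", "http://example.com/cgi-bin/search"))  # False
print(rp.can_fetch("MyBot/1.0", "http://example.com/index.html"))      # True
```

Checking `can_fetch` before every request, and caching one parsed rule set per host, covers the "always obey robots.txt" point with very little code; politeness (not hammering the server) still needs a separate per-host delay.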
