[Robots] Re: Perl and LWP robots

2002-03-07 Thread Chris Skepper
Aside from basic concepts (don't hammer the server; always obey the robots.txt; don't span hosts unless you are really sure that you want to), are there any particular bits of wisdom that list members would want me to pass on to my readers? Look at
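
Two of the basic concepts listed above (not hammering the server, not spanning hosts) can be sketched in a few lines. This is a minimal illustration in Python rather than Perl/LWP; the names `same_host` and `Throttle` are hypothetical, not from any library the thread mentions, and a suitable delay value is an assumption that depends on the site.

```python
import time
from urllib.parse import urlsplit


def same_host(seed_url, candidate_url):
    """Return True when candidate_url is on exactly the same host as seed_url."""
    return urlsplit(seed_url).hostname == urlsplit(candidate_url).hostname


class Throttle:
    """Enforce a minimum gap between successive requests (hypothetical helper)."""

    def __init__(self, min_delay_seconds):
        self.min_delay = min_delay_seconds
        self.last_request = None

    def wait(self):
        # Sleep only for whatever part of the delay has not already elapsed.
        now = time.monotonic()
        if self.last_request is not None:
            remaining = self.min_delay - (now - self.last_request)
            if remaining > 0:
                time.sleep(remaining)
        self.last_request = time.monotonic()
```

A crawler would call `wait()` before each fetch and skip any link for which `same_host()` is false. Note that the comparison is exact, so `www.example.com` and `example.com` count as different hosts in this sketch.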

[Robots] Re: Perl and LWP robots

2002-03-07 Thread Otis Gospodnetic
Excellent. I have a copy of Wong's book at home and like that topic (i.e. I'm a potential customer :)). When will it be published? I think lots of people do want to know about recursive spiders, and I bet the most frequent obstacles are issues like queueing, depth vs. breadth first
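
The queueing and depth-vs-breadth question raised above often comes down to which end of the frontier the spider takes its next page from. The sketch below is an assumption-laden illustration in Python, not Perl/LWP: `crawl_order` and the `SITE` link map are hypothetical, and the map stands in for real fetching and link extraction.

```python
from collections import deque


def crawl_order(links, start, breadth_first=True, max_depth=2):
    """Return pages in the order a spider would visit them.

    links is a hypothetical in-memory map of page -> outgoing links.
    """
    frontier = deque([(start, 0)])
    seen = {start}
    order = []
    while frontier:
        # Taking from the left (queue) gives breadth-first order;
        # taking from the right (stack) gives depth-first order.
        page, depth = frontier.popleft() if breadth_first else frontier.pop()
        order.append(page)
        if depth >= max_depth:
            continue  # do not expand links beyond the depth limit
        for target in links.get(page, []):
            if target not in seen:  # avoid revisiting and link loops
                seen.add(target)
                frontier.append((target, depth + 1))
    return order


SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a1", "/"],
    "/b": ["/b1"],
}

print(crawl_order(SITE, "/", breadth_first=True))
# ['/', '/a', '/b', '/a1', '/b1']
print(crawl_order(SITE, "/", breadth_first=False))
# ['/', '/b', '/b1', '/a', '/a1']
```

The `seen` set is what keeps the link from `/a` back to `/` from causing a loop, and `max_depth` bounds how far the recursion goes in either mode.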

[Robots] Re: Perl and LWP robots

2002-03-07 Thread Michael Lange
Hi Sean, You might want to consider exploring the not-yet-approved update to the robots.txt standard, which covers allow rules, and how to apply them to your spider. This may help raise the level of awareness of the robots.txt standard. You could also talk about how to use the robots.txt with your
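
As one illustration of how allow rules can interact with disallow rules, here is a small sketch using Python's standard `urllib.robotparser` (not Perl/LWP). The robots.txt content, the `ExampleBot` agent name, and the `allowed` helper are made up for the example. Python's parser applies rules in file order and stops at the first match, so the `Allow` line is placed before the broader `Disallow`; other parsers may resolve overlapping rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: open one subdirectory inside a closed tree.
ROBOTS_TXT = """\
User-agent: *
Allow: /archive/public/
Disallow: /archive/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network access


def allowed(path, agent="ExampleBot"):
    """Ask the parsed rules whether agent may fetch path (hypothetical helper)."""
    return parser.can_fetch(agent, "http://example.com" + path)


print(allowed("/archive/public/index.html"))  # True: matches the Allow rule first
print(allowed("/archive/private.html"))       # False: falls through to Disallow
print(allowed("/index.html"))                 # True: no rule matches
```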

[Robots] Re: Perl and LWP robots

2002-03-07 Thread Matthew Meadows
That's a curious remark about readers and their misplaced desire for recursive spiders. A recursive spider allows its user to drill down into a particular information domain and ultimately exhaust it if the spider is capable enough. This is of enormous benefit to the information researcher

[Robots] Re: Perl and LWP robots

2002-03-07 Thread Avi Rappoport
I've found that image maps, framesets, redirects, funky relative links, JavaScript links and dynamic URLs generated from backend systems are the main problems with robots. Bad HTML on pages, such as unclosed li tags, also confuses the robot when it parses them. I have written up a checklist
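
Of the problems listed above, relative links are the easiest to show in code. The sketch below uses Python's standard `urllib.parse` (not Perl/LWP) to resolve a link against the page it was found on and to drop the fragment; the `BASE` URL and the `resolve` helper are hypothetical. It does not address image maps, framesets, JavaScript links, or redirects.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical URL of the page on which the links were found.
BASE = "http://example.com/docs/guide/index.html"


def resolve(href, base=BASE):
    """Resolve href against base and drop any #fragment."""
    absolute, _fragment = urldefrag(urljoin(base, href))
    return absolute


print(resolve("../faq.html"))           # http://example.com/docs/faq.html
print(resolve("/top.html"))             # http://example.com/top.html
print(resolve("./a/../b.html#sec"))     # http://example.com/docs/guide/b.html
print(resolve("//other.example.org/x")) # http://other.example.org/x
```

Resolving every link to an absolute, fragment-free form before queueing it also makes duplicate detection simpler, since the same page is less likely to appear under several spellings.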

[Robots] Re: Perl and LWP robots

2002-03-07 Thread Klaus Johannes Rusch
Sean M. Burke writes: Aside from basic concepts (don't hammer the server; always obey the robots.txt; don't span hosts unless you are really sure that you want to), are there any particular bits of wisdom that list members would want me to pass on to