[EMAIL PROTECTED] wrote:
I've created a robot, www.dead-links.com, and I wonder if this list is alive.
It is alive, but very, very quiet.
Nick
___
Robots mailing list
[EMAIL PROTECTED]
http://www.mccmedia.com/mailman/listinfo/robots
On Nov 3, 2003, at 11:16 PM, Nick Arnett wrote:
[EMAIL PROTECTED] wrote:
I've created a robot, www.dead-links.com, and I wonder if this list is
alive.
It is alive, but very, very quiet.
Yeah, this robots thing is just a fad, it'll never catch on. -Tim
On Tue, 4 Nov 2003, Alan Perkins wrote:
Here's a question to test whether the list is alive and active...
I have a feeling the bandwidth and other resources of web sites have
gone up so much that really robots do not pose a DoS threat any more. Hit
me as hard as you like as long as I am in
Alan Perkins writes:
What's the current accepted practice for hit rate?
In general, leave an interval several times longer than the time
taken for the last response. e.g. if a site responds in 20 ms,
you can hit it again the same second. If a site takes 4 seconds
to respond, leave it at least
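
The rule of thumb in this reply (wait several times as long as the last response took) could be sketched roughly as below. This is only an illustration under stated assumptions, not anyone's actual crawler: the names polite_delay and crawl_host, the 3x multiplier standing in for "several times", and the 60-second cap are all made up for the example.

```python
import time
import urllib.request


def polite_delay(last_response_seconds, multiplier=3.0, max_delay=60.0):
    """Return the pause before the next request to the same host.

    The 3x multiplier and 60 s cap are assumptions for illustration;
    the original post only says "several times longer".
    """
    return min(last_response_seconds * multiplier, max_delay)


def default_fetch(url):
    """Download one URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def crawl_host(urls, fetch=default_fetch, sleep=time.sleep, clock=time.monotonic):
    """Fetch URLs from one host one at a time, pausing in proportion
    to how long the previous response took."""
    urls = list(urls)
    pages = []
    for index, url in enumerate(urls):
        started = clock()
        pages.append(fetch(url))
        elapsed = clock() - started
        # No need to wait after the final request.
        if index < len(urls) - 1:
            sleep(polite_delay(elapsed))
    return pages
```

With these assumed settings, a site that answers in 20 ms is hit again about 60 ms later, while a site that takes 4 seconds is left alone for 12 seconds. The fetch, sleep, and clock parameters are injectable only so the pacing can be checked without touching the network.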
I thought I would post some of my experience with download rates.
We have built a large-scale crawler that has crawled over 2.4 billion URLs and continues
to crawl at upwards of 500 pages/second. In tuning the download policy we
found that both the hit rate and number of pages downloaded per
Hello Robots list
Well, maybe this list can finally put to rest a great deal of the 30-second wait
issue.
Can we all collectively research into an adaptive routine?
We all need a common code routine that all our spidering modules and connective
programs can use.
Especially when we wish to
On Tue, 4 Nov 2003 [EMAIL PROTECTED] wrote:
Hello Robots list
Well, maybe this list can finally put to rest a great deal of the 30-second wait
issue.
Can we all collectively research into an adaptive routine?
Interesting topic...
With one hat on, I operate one of those little servers
--On Tuesday, November 4, 2003 10:05 AM + Alan Perkins [EMAIL PROTECTED] wrote:
What's the current accepted practice for hit rate?
Ultraseek uses one request at a time for a server with no
extra pause in between. Each file is parsed before sending
the next request, so there is a bit of