--On Saturday, March 13, 2004 2:22 AM -0800 [EMAIL PROTECTED] wrote:
> 
>> Does crawl-delay allow decimals?
> 
> You think people really want to be able to tell a crawler to fetch a
> page at most every 5.6 seconds, and not 5?

0.5s would be useful. Ultraseek has used a float for the delay for
the past six years.
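Something like this would be enough (just an illustration, not
taken from any real site's robots.txt):

    User-agent: *
    Crawl-delay: 0.5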

>> Could this spec be a bit better quality?
> 
> It's not a spec, it's an implementation, ...
>
>> The words "positive integer" would improve things a lot.
> 
> That's just common sense to me. :)

Well, different people's common sense leads to incompatible
implementations, which is why these things should be specified.
I think negative delays would be goofy, too, but we all know
that someone will try it.
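
Here's a rough sketch, in Python, of the kind of defensive parsing
I mean: accept fractional values, and ignore anything negative or
non-numeric rather than guessing. The function name and fall-back
behavior are my own choices, since there's nothing written down
to follow.

    def parse_crawl_delay(value, default=None):
        """Parse a Crawl-delay value as a non-negative float.

        Returns `default` (meaning "no delay specified") for anything
        that is not a non-negative number, rather than guessing.
        """
        try:
            delay = float(value.strip())
        except (ValueError, AttributeError):
            return default
        if delay < 0:   # negative delays are meaningless; ignore them
            return default
        return delay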

> I am sure their people are on the list, they are just being quiet, and
> will probably remain silent now that their idea has been called dumb.

Nah, they would have e-mailed me directly by now. I used to work
with them at Inktomi.

I called it a dumb idea because it has obvious problems. These
could have been solved by trying to learn from the rest of the
robot community. Crawl-delay isn't useful in our crawler, and
there have been better rate-limit approaches proposed as
far back as 1996. Most sites have a pages/day or bytes/day limit,
not instantaneous rate limits, so crawl-delay is controlling
the wrong thing.
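
To make that concrete, here's a sketch (my own illustration, not
how Ultraseek or anyone else actually does it) of turning a
per-site daily page budget into pacing, which is what a pages/day
limit really asks for:

    SECONDS_PER_DAY = 86400

    def interval_for_budget(pages_per_day):
        """Spread a daily page budget evenly across the day."""
        return SECONDS_PER_DAY / float(pages_per_day)

    # A site allowing 5000 pages/day works out to one fetch
    # roughly every 17 seconds:
    #   interval_for_budget(5000) -> 17.28

A fixed Crawl-delay can't express that budget; it only caps the
instantaneous rate.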

Note that Google has implemented Allow lines with a limited
wildcard syntax, so Yahoo isn't alone in being incompatible.
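
For what it's worth, matching an Allow line with a '*' wildcard is
easy enough. This sketch assumes '*' matches any run of characters
and the rule is otherwise a prefix match, which is my reading of
the extension, not a statement of Google's exact behavior:

    import re

    def rule_matches(pattern, path):
        """Prefix-match a robots.txt rule that may contain '*' wildcards.

        Assumes '*' stands for any run of characters; this is an
        assumption about the extended syntax, not Google's spec.
        """
        regex = '.*'.join(re.escape(part) for part in pattern.split('*'))
        return re.match(regex, path) is not None

    # rule_matches('/public/*.html', '/public/docs/index.html') -> True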

wunder
--
Walter Underwood
Principal Architect
Verity Ultraseek
