--On Saturday, March 13, 2004 2:22 AM -0800 [EMAIL PROTECTED] wrote:

>> Does crawl-delay allow decimals?
>
> You think people really want to be able to tell a crawler to fetch a
> page at most every 5.6 seconds, and not 5?
0.5s would be useful. Ultraseek has used a float for the delay for the
past six years.

>> Could this spec be a bit better quality?
>
> It's not a spec, it's an implementation, ...

>> The words "positive integer" would improve things a lot.
>
> That's just common sense to me. :)

Well, different people's common sense leads to incompatible
implementations, which is why these things should be specified. I think
negative delays would be goofy, too, but we all know that someone will
try it.

> I am sure their people are on the list, they are just being quiet, and
> will probably remain silent now that their idea has been called dumb.

Nah, they would have e-mailed me directly by now. I used to work with
them at Inktomi.

I called it a dumb idea because it has obvious problems, which could
have been solved by learning from the rest of the robot community.
Crawl-delay isn't useful in our crawler, and better rate-limit
approaches have been proposed as far back as 1996. Most sites have a
pages/day or bytes/day limit, not an instantaneous rate limit, so
crawl-delay is controlling the wrong thing.

Note that Google has implemented Allow lines with a limited wildcard
syntax, so Yahoo isn't alone in being incompatible.

wunder
--
Walter Underwood
Principal Architect
Verity Ultraseek

_______________________________________________
Robots mailing list
[EMAIL PROTECTED]
http://www.mccmedia.com/mailman/listinfo/robots
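[Editor's note: a minimal sketch of the two points above, assuming nothing beyond the discussion in this thread. `parse_crawl_delay` accepts fractional Crawl-delay values as floats (as Ultraseek does) and rejects negative or non-numeric ones; `DailyBudget` is a hypothetical helper illustrating a pages/day limit, the kind of control the post argues sites actually want. Neither is part of any robots.txt standard.]

```python
import math


def parse_crawl_delay(value):
    """Parse a Crawl-delay value as a non-negative float.

    Returns the delay in seconds, or None if the value is invalid
    (non-numeric, negative, or non-finite).
    """
    try:
        delay = float(value)
    except ValueError:
        return None  # e.g. "fast" or an empty string
    if not math.isfinite(delay) or delay < 0:
        return None  # negative delays are goofy; ignore them
    return delay


class DailyBudget:
    """Hypothetical pages/day limiter: a daily budget rather than an
    instantaneous rate limit."""

    def __init__(self, pages_per_day):
        self.pages_per_day = pages_per_day
        self.fetched_today = 0  # reset by the caller once per day

    def allow_fetch(self):
        """Return True and count the fetch if today's budget allows it."""
        if self.fetched_today >= self.pages_per_day:
            return False
        self.fetched_today += 1
        return True
```

For example, `parse_crawl_delay("0.5")` yields `0.5`, while `parse_crawl_delay("-3")` and `parse_crawl_delay("fast")` both yield `None`.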