Angus McIntyre wrote:
> Obviously, a robot would be free to scan any page to see if it contained
> content in a format it's interested in, but a 'hint' of this kind might
> allow the robot to prioritise processing of pages that the author claims
> contain information in a specific format.

I can't speak for crawlers and parsers in general, but I designed and ran crawlers for the Internet Archive and Alexa Internet. In my experience, crawlers parse all HTML anyway, so an HTML-embedded tag that says "hey, this is a microformat page" probably won't make life much easier: by the time the crawler sees the tag, the page is already being parsed, so there is nothing left to prioritize.

A more useful hint would be some way of presenting a list of the URLs on a site that contain microformats data, similar to how robots.txt works. It's easier to prioritize a list of URLs and feed them to the crawler and parser than it is to crawl, parse, and then prioritize.

--Pat

_______________________________________________
microformats-discuss mailing list
[email protected]
http://microformats.org/mailman/listinfo/microformats-discuss
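
The point about prioritizing before fetching can be illustrated with a rough sketch. It assumes a site has published a plain list of URLs said to contain microformats data; no such hint file is defined anywhere in this thread, and the function name `prioritise` and its arguments are made up for illustration. The crawler simply moves the hinted URLs to the front of its queue, with no parsing needed to decide the order.

```python
import heapq
from urllib.parse import urldefrag


def prioritise(discovered_urls, hinted_urls):
    """Return URLs in fetch order: hinted URLs first, then the rest.

    hinted_urls stands in for a hypothetical site-published list of pages
    that contain microformats data. Fragments are dropped and duplicates
    are skipped, so each page is queued once.
    """
    hinted = {urldefrag(u).url for u in hinted_urls}
    heap, seen = [], set()
    # Hinted URLs are fed in first so they are queued even if the crawler
    # has not discovered them through links yet.
    candidates = list(hinted_urls) + list(discovered_urls)
    for order, raw in enumerate(candidates):
        url = urldefrag(raw).url
        if url in seen:
            continue
        seen.add(url)
        # Priority 0 sorts ahead of priority 1; `order` keeps ties stable.
        priority = 0 if url in hinted else 1
        heapq.heappush(heap, (priority, order, url))
    ordered = []
    while heap:
        ordered.append(heapq.heappop(heap)[2])
    return ordered
```

The ordering is decided from the URL list alone, before any page is fetched, which is the difference from an in-page tag that can only be seen once parsing has already started.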
