I've only recently figured out what XMDPs are and what they're
capable of. I notice that your hcard/hreview/hcalendar aggregator is
structured so that you can pick an attribute and search for the bits
of data that contain it. Have you tried detecting and parsing XMDPs,
not for validation, but to discover new microformats that aren't
documented at microformats.org? Don't get me wrong, I understand the
importance of strong standardization. But say someone creates a niche
format, with an XMDP, for their own site and related sites in their
community. Do you think your aggregator could use that XMDP to create
new searchable attributes in your search engine?
Sorry, this is a bit of a tangent, but the idea fascinated me.
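
To make the question concrete, here is a rough Python sketch of what
"parsing an XMDP for new attributes" might look like. It assumes the
profile follows the usual XMDP shape: a <dl class="profile"> whose
top-level <dt id="..."> entries name HTML attributes (class, rel, ...)
and whose nested <dt id="..."> entries name the values defined for
them. The "recipe" profile, the SAMPLE string, and the function name
xmdp_terms are made up for illustration; this is not how the
aggregator works today.

```python
from html.parser import HTMLParser


class XMDPTermParser(HTMLParser):
    """Collect the terms an XMDP profile defines, grouped by HTML attribute."""

    def __init__(self):
        super().__init__()
        self.depth = 0            # nesting depth of <dl> elements inside the profile
        self.current_attr = None  # most recent top-level <dt id>, e.g. "class"
        self.terms = {}           # e.g. {"class": ["recipe", "ingredient"]}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "dl":
            classes = (attrs.get("class") or "").split()
            # Start counting at the profile <dl>; count every <dl> nested in it.
            if self.depth > 0 or "profile" in classes:
                self.depth += 1
        elif tag == "dt" and self.depth > 0 and attrs.get("id"):
            if self.depth == 1:
                # Top level: the id names an HTML attribute.
                self.current_attr = attrs["id"]
                self.terms.setdefault(self.current_attr, [])
            elif self.current_attr is not None:
                # Nested: the id names a value defined for that attribute.
                self.terms[self.current_attr].append(attrs["id"])

    def handle_endtag(self, tag):
        if tag == "dl" and self.depth > 0:
            self.depth -= 1


def xmdp_terms(html):
    """Return {attribute: [values]} for the XMDP profile in `html`."""
    parser = XMDPTermParser()
    parser.feed(html)
    return parser.terms


# Hypothetical profile for a niche "recipe" format (not a real microformat).
SAMPLE = """
<dl class="profile">
  <dt id="class">class</dt>
  <dd>
    <dl>
      <dt id="recipe">recipe</dt><dd>A single recipe.</dd>
      <dt id="ingredient">ingredient</dt><dd>One ingredient.</dd>
    </dl>
  </dd>
</dl>
"""

print(xmdp_terms(SAMPLE))
```

The values listed under "class" would be the candidates for new
searchable attributes. The sketch relies on each <dt> carrying an id
equal to the term it defines, so a profile that doesn't follow that
convention would need different handling.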
On Mar 24, 2006, at 5:40 PM, Scott Reynen wrote:
On Mar 24, 2006, at 4:20 PM, Ryan King wrote:
Hmm, this sounds to me like a theoretical argument. I'd like to
hear what experience people have had here. Has anyone here worked
on crawling to index microformats? If so, what challenges did you
face?
Yes. The two I know of are reevoo, which aggregates hreviews:
http://www.reevoo.com/
and my own effort, which aggregates hcards, hcalendars, and hreviews:
http://randomchaos.com/microformats/base/
My main challenges have been a lack of space to store the data
(which has nothing to do with microformats) and the lack of a
parser that can read invalid X(HT)ML (which is only an issue
because I haven't installed Tidy on my server). If microformat
site maps existed, I would use them as starting points to know
where to look, but I wouldn't trust them as any sort of accurate
listing of what's on a domain just because I know I would likely
forget to update my own if I had one. So I'd still be reading the
same number of documents, just in a different order.
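
A minimal sketch of that "same documents, different order" idea, in
Python. It assumes the crawler already has a list of URLs it would
visit anyway, plus a list of hints taken from a (hypothetical)
microformat site map. The function name crawl_order and the
example.org URLs are invented for illustration.

```python
def crawl_order(discovered, hinted):
    """Visit hinted URLs first, then everything else the crawler found.

    Hints only change the order. Nothing is dropped from `discovered`,
    because the site map is not trusted as a complete listing.
    """
    hinted_set = set(hinted)
    seen = set()
    order = []
    for url in list(hinted) + [u for u in discovered if u not in hinted_set]:
        if url not in seen:  # skip duplicates while keeping first position
            seen.add(url)
            order.append(url)
    return order


# Hypothetical URLs for illustration.
discovered = [
    "http://example.org/",
    "http://example.org/about",
    "http://example.org/events",
]
hinted = ["http://example.org/events"]

for url in crawl_order(discovered, hinted):
    print(url)
```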
Peace,
Scott
_______________________________________________
microformats-discuss mailing list
[email protected]
http://microformats.org/mailman/listinfo/microformats-discuss