[uf-discuss] hoard.it

Jim O'Donnell Thu, 03 Jul 2008 15:19:02 -0700

Hello,

This might be of interest to members of this group, as it deals with extracting data from semantic HTML. Prior to this year's Mashed Museum event at the University of Leicester, Dan Zambonini put together a prototype which aggregates data by spidering online museum catalogues:

http://hoardit.pbwiki.com/

It's a pretty fantastic demo of how information can be extracted from well-structured HTML, even before you think of putting microformats etc. on top.

In particular, it does a pretty good job of figuring out when an object was made:

http://feeds.boxuk.com/museums/object_100yrs.php

The date parser is based on some code Dan & I knocked together at Mashed Museum 2007, which looks at strings like 'late Victorian', 'early 20th Century', '4th January 1853' and so on, and converts them to machine-readable ISO dates.
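
As a rough illustration of that kind of conversion (this is not the hoard.it code, just a minimal sketch), something like the following would turn the three example strings into ISO 8601 date ranges. Everything named here is an assumption made for the example: the function names parse_date and _third, the ERAS table, treating 'Victorian' as 1837-1901, and splitting a period into thirds for 'early', 'mid' and 'late'.

```python
import re
from datetime import date

MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Assumption: era boundaries are hard-coded; Victorian = 1837-1901.
ERAS = {"victorian": (1837, 1901)}


def _third(start, end, qualifier):
    """Narrow an inclusive year range to its early, mid or late third.

    Assumption: 'early'/'mid'/'late' each mean roughly a third of the period.
    """
    step = (end - start + 1) // 3
    if qualifier == "early":
        return start, start + step - 1
    if qualifier == "mid":
        return start + step, end - step
    if qualifier == "late":
        return end - step + 1, end
    return start, end


def parse_date(text):
    """Return a (start, end) pair of ISO 8601 dates, or None if unrecognised."""
    s = text.strip().lower()

    # Exact day, e.g. '4th January 1853'.
    m = re.fullmatch(r"(\d{1,2})(?:st|nd|rd|th)?\s+([a-z]+)\s+(\d{4})", s)
    if m and m.group(2) in MONTHS:
        try:
            d = date(int(m.group(3)), MONTHS[m.group(2)], int(m.group(1)))
        except ValueError:  # e.g. '31st February 1900'
            return None
        return d.isoformat(), d.isoformat()

    # Optional qualifier followed by a century or a named era.
    m = re.fullmatch(r"(?:(early|mid|late)[\s-]+)?(.+)", s)
    if not m:
        return None
    qualifier, rest = m.group(1), m.group(2)

    c = re.fullmatch(r"(\d{1,2})(?:st|nd|rd|th)\s+century", rest)
    if c and int(c.group(1)) >= 1:
        n = int(c.group(1))
        start, end = (n - 1) * 100 + 1, n * 100  # 20th century = 1901-2000
    elif rest in ERAS:
        start, end = ERAS[rest]
    else:
        return None

    start, end = _third(start, end, qualifier)
    return f"{start:04d}-01-01", f"{end:04d}-12-31"


for example in ("late Victorian", "early 20th Century", "4th January 1853"):
    print(example, "->", parse_date(example))
```

Returning a start/end pair rather than a single date is a design choice for the sketch: a vague string like 'late Victorian' is a range, and an exact day is just a range whose two ends are the same.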

Our original idea, which we never got round to actually implementing, was that this would be useful as a web service - you give it a string, it gives you a machine-parsable representation of that string. The recent discussion here about dates has made me wonder if such a web service would be useful for microformats parsers. What do others think?


Cheers
Jim

Jim O'Donnell
[EMAIL PROTECTED]
http://eatyourgreens.org.uk
http://flickr.com/photos/eatyourgreens



_______________________________________________
microformats-discuss mailing list
microformats-discuss@microformats.org
http://microformats.org/mailman/listinfo/microformats-discuss
