On 9/18/07, ken <[EMAIL PROTECTED]> wrote:
>
> The problem with *not* storing the original markup is that, if there are
> changes in the "standards" (which it seems there certainly will be), you
> won't know which of your data need to be changed.
>

If there are changes in standards/conventions, then you'd probably want to
re-scan the originals in any case. Caching the originals may or may not be
a good optimization strategy, but the basic idea is that the semantic
content is what's important.

If I were doing an experiment along these lines (as opposed to, say,
writing a whole-Internet scale production system) I'd either:

a) Have some fun and check out an RDF data store. Flexible, fun, and
   lots of relatively unproven (and therefore interesting) technology.

b) Be super corporate about it and come up with a relational schema
   that matched the _semantics_ (not the markup) of each microformat,
   then publish the schema to the group for others to use. Kinda brittle
   in the face of standards changes, but it should be clear how to
   proceed and people would probably find the research useful.

(a) is one fairly likely future, (b) is clearly the present, and it all
sounds like an enjoyable way to spend some free time.

If you're doing a real, live, highly scalable production system, then
never mind the above, you've got a whole different set of problems :-)

-cks

--
Christopher St. John
http://artofsystems.blogspot.com
_______________________________________________
microformats-discuss mailing list
[email protected]
http://microformats.org/mailman/listinfo/microformats-discuss
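
For what it's worth, option (b) could be sketched roughly as follows for a single microformat. This is only an illustration under assumptions: it uses hCard as the example, keeps just the fn, org and email properties, and the table names, column names and helper functions (open_store, store_hcard, pages_to_rescan) are all made up here rather than taken from any published schema. The schema stores the semantics only, plus the source URL, so the originals can be re-scanned if the conventions change.

```python
import sqlite3

# Hypothetical schema: one row per hCard, modelled on its meaning
# (a name, an organisation, some email addresses), not on its markup.
SCHEMA = """
CREATE TABLE hcard (
    id         INTEGER PRIMARY KEY,
    source_url TEXT NOT NULL,   -- where the card was found, for re-scanning
    fn         TEXT NOT NULL,   -- formatted name
    org        TEXT             -- organisation, optional
);
CREATE TABLE hcard_email (
    hcard_id INTEGER NOT NULL REFERENCES hcard(id),
    value    TEXT NOT NULL      -- a card may carry several addresses
);
"""


def open_store(path=":memory:"):
    """Open (or create) the store and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def store_hcard(conn, source_url, card):
    """Insert one already-parsed hCard (a plain dict) and return its row id."""
    cur = conn.execute(
        "INSERT INTO hcard (source_url, fn, org) VALUES (?, ?, ?)",
        (source_url, card["fn"], card.get("org")),
    )
    card_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO hcard_email (hcard_id, value) VALUES (?, ?)",
        [(card_id, address) for address in card.get("email", [])],
    )
    conn.commit()
    return card_id


def pages_to_rescan(conn):
    """List every page that contributed data, e.g. after a spec change."""
    rows = conn.execute(
        "SELECT DISTINCT source_url FROM hcard ORDER BY source_url"
    )
    return [row[0] for row in rows]
```

Parsing the markup into the dict is deliberately left out; the point is only that the stored shape follows the meaning of the data, and that the source URL is enough to go back to the original if needed.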
