On 11/28/05, Simon Kittle <[EMAIL PROTECTED]> wrote:
>
> So it's not that he was after a way to identify these things (for which
> there exists a URI scheme already) but a way to put them into context, like
> grouping a bunch of tags under class='vcard' and marking one class='url'
> puts it into context.

Right, that's what I'm after. I'm all for URIs. I'm all for the info protocol, and already use it internally. I *don't* want a new syntax for identifiers. What I'm looking for is a standard way to indicate which identifiers being *presented to humans* on a web page are relevant identifiers for the items on that page.

I disagree that "an identifier isn't useful if you don't know what it is". There are many identifier systems (DOI, UUID, etc.) in widespread use that are generic in presentation. The benefit of those systems is that the resolvable meaning of the identifiers can be equivalent across systems. I know there's no need to explain that here, but it seems important to clarify (given the discussion) that in the context of wiring arbitrary systems together web2.0-style, when you are potentially moving objects around across the boundaries of individual webapps, it can be very useful to base connections on identifiers that are arbitrarily human-meaningful yet unambiguously resolvable, with equivalent meaning across contexts.

Since I'm new and obviously struggling to make my case I'll just go to our use cases (where it seems I should have started, sorry). It'll take several paragraphs, bear with me, can't help it. :\

In libraries we use the OpenURL (ANSI/NISO Z39.88) specs to pass content references by-reference (using identifiers) or by-value (using defined attribute-value sets per media type) across systems. It's been mentioned on-list recently so I won't go into great detail, but here's a summary.

The immediate scenario for which this spec was designed: you're a scholar doing research online. You read a relevant article in Journal A from Publisher1 and need to chase down its references. Of its 25 references, 22 are in different journals from different publishers, so you *want* to click right to those articles.
But since university libraries subscribe to various content packages from various publishers and those subscriptions vary widely among institutions, you need a way to (a) pass the reference to your library, (b) have the library figure out which subscription/interface has the content, and (c) get to the article online in another system, or request it through interlibrary loan, or save the reference in your folder, etc.

In this scenario the ability to pass identifiers across systems is a huge win because otherwise you can only match on field/value pairs you get from the original source, which vary widely (author names, titles, vol/iss/page/yr/etc.). The OpenURL spec details how to pass that information in a URL, with definitions for how to specify ContextObjects, aka "the things [usually references] you want to do something with", in GET strings or POSTed XML.

So the problem with OpenURL *implementations* is that every publisher who supports OpenURL-style linking publishes their OpenURL links using inconsistent HTML *human*-readable formats. You'd think OpenURL should provide great leverage for rewiring apps -- it certainly could -- but the upshot is that if you can't identify which bits on a page comprise the OpenURL, you can't write software that rewrites it usefully.

To deal with this problem we've written an ad-hoc spec called "COinS", i.e. "ContextObjects in Spans". This spec says: "put your ContextObjects in HTML span elements with a class value of 'Z3988'", and nothing more. It's an anti-microformat, in a way, since ContextObjects themselves are almost never human-readable. In any case, COinS have been implemented in numerous systems: CiteULike, Citebase, some online journals, unalog, and weblogs including wordpress and pyblosxom.

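To make the COinS convention concrete, here is a minimal Python sketch, not a reference implementation. It assumes the ContextObject is carried as a URL-encoded key/value string in the span's title attribute, which is how COinS is commonly published. The helper names (make_coins, CoinsParser) are made up for illustration.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlencode


def make_coins(fields):
    """Render key/value ContextObject fields as a COinS span (hypothetical helper)."""
    # ctx_ver identifies the Z39.88-2004 version of the ContextObject format.
    query = urlencode([("ctx_ver", "Z39.88-2004")] + list(fields.items()))
    # The query string lives in an HTML attribute, so '&' must become '&amp;'.
    return '<span class="Z3988" title="%s"></span>' % escape(query, quote=True)


class CoinsParser(HTMLParser):
    """Collect ContextObjects from <span class="Z3988" title="..."> elements."""

    def __init__(self):
        super().__init__()
        self.context_objects = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "Z3988" in classes and attrs.get("title"):
            # HTMLParser has already unescaped '&amp;' in the attribute value,
            # so the title can be parsed like an ordinary query string.
            self.context_objects.append(parse_qs(attrs["title"]))
```

A script (bookmarklet-style or server-side) would feed a page to CoinsParser and then rewrite each ContextObject into a link to the reader's own resolver. Each parsed ContextObject is a dict mapping keys such as rft_id to lists of values.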
The first main benefit of COinS is that if you're a small publisher or a weblogger, you can just put COinS in your pages, and people at institutions that can resolve OpenURLs (most major universities and many large libraries and corporations) can get to the actual articles. Like, talk about a research article in your blog, and your readers can link to the article at their libraries. To support this we've generated ~900 institution-specific COinS-resolving bookmarklets and greasemonkey userscripts, on a trial basis, based on data from an OCLC international registry of OpenURL resolvers, and it works nicely, for a demo.

But to the point: lately some of us have experimented with doing more with COinS. Since OpenURL specifies where to put object identifiers, and many of our systems have OAI-PMH interfaces that let you get metadata for a given identifier through a simple GET, why not metadata autodiscovery? It's easy to wire up identifiers, as specified in COinS, to relevant OAI-PMH services, with their URLs in link tags, so you can script access to robust metadata for objects on a page right from within the browser.

To restate, more simply: for any content we publish on any site, if you specify identifiers for items on a page and how to query for more information about those items using the identifiers, you can pull metadata for those items with a simple AJAX call. This could be huge leverage in wiring new systems together.

To demonstrate all this, I wrote a greasemonkey script that looks for COinS-with-identifiers and link tags for OAI-PMH services. When it finds both (a combination we started calling "COinS-PMH", because we suck at naming things), it pops up a left-side list of links to all the metadata records for the items on the page. This doesn't do much besides giving a visual indicator of what's possible. But we're working on extending various personal-collection-system thingies we work on to support this stuff and do things more magically.

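The autodiscovery step can be sketched roughly as follows, in Python rather than greasemonkey JavaScript. This is an illustration under assumptions, not the actual userscript: the rel value "oai-pmh" on the link tag is a placeholder I made up, and identifiers are assumed to sit in the rft_id key of each COinS title. The verb, identifier, and metadataPrefix parameters are the standard OAI-PMH GetRecord arguments.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlencode

OAI_REL = "oai-pmh"  # hypothetical rel value marking an OAI-PMH base URL


class CoinsPmhParser(HTMLParser):
    """Collect OAI-PMH base URLs from link tags and identifiers from COinS."""

    def __init__(self):
        super().__init__()
        self.base_urls = []
        self.identifiers = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == OAI_REL and attrs.get("href"):
            self.base_urls.append(attrs["href"])
        elif tag == "span" and "Z3988" in (attrs.get("class") or "").split():
            # Assumption: the identifier is the rft_id key of the ContextObject.
            kev = parse_qs(attrs.get("title") or "")
            self.identifiers.extend(kev.get("rft_id", []))


def get_record_urls(html_text, metadata_prefix="oai_dc"):
    """Return one OAI-PMH GetRecord URL per identifier found on the page."""
    parser = CoinsPmhParser()
    parser.feed(html_text)
    parser.close()
    if not parser.base_urls:
        return []
    base = parser.base_urls[0]  # sketch: use the first advertised service only
    return [
        base + "?" + urlencode({
            "verb": "GetRecord",
            "identifier": ident,
            "metadataPrefix": metadata_prefix,
        })
        for ident in parser.identifiers
    ]
```

Each returned URL is a plain GET that should come back with an XML metadata record, which is the "simple AJAX call" part when done from inside the browser.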
Basically, imagine if a tool (like, say, flock) could suck down not just a page link and its content or an image with a link but specific hand-picked objects from pages, replete with complete, complex metadata records for those objects. And not just from a few big name sites with robust, distinct APIs, but from any site, using one simple API.

With more greasemonkey userscripts we've tweaked Amazon, Flickr, Google Books, American Memory collections at LoC, arxiv.org, wordpress, and unalog to speak COinS-PMH and thus give up metadata for identified objects. Screenshots here:

http://flickr.com/photos/dchud/tags/coinspmh/

The Amazon, Flickr, and Google Books tweaks use a faux-OAI-PMH proxy that responds to a few OAI-PMH queries with metadata retrieved using the Amazon and Flickr APIs.

To sum:

- there are *lots* of arbitrary things people can do with items on web pages if those items are managed objects in their own right with their own identifiers.

- the identifiers need to be clearly demarcated in a way that distinguishes which identifier goes with which item on the page (think browse views or search result lists).

- if they're not clearly marked there's no way to know which URIs embedded in the page go with which objects.

- identifiers themselves can be in any format... URIs preferred.

- we have a way to do the above, but it's unmicroformatic. OpenURL ContextObjects, and by association, COinS, are unfortunately overcomplex and intended for machine, not human consumption.

- some of the benefits of using COinS could be accomplished more easily with a simpler convention for demarcating which items on a page have which symbolically meaningful content identifiers. Something as easy as "class='identifier'" might do it.

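As a rough illustration of how little a consumer would need for the "class='identifier'" idea, here is a hedged Python sketch. Nothing here is a proposed spec: the "item" class used to mark each object on the page is a hypothetical placeholder, the identifier is assumed to be the element's text content, and the markup is assumed to be well-formed with non-nested items.

```python
from html.parser import HTMLParser

ITEM_CLASS = "item"        # hypothetical marker for one object on the page
ID_CLASS = "identifier"    # the class name floated in the message above
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class IdentifierParser(HTMLParser):
    """Group class='identifier' text under the enclosing class='item' element."""

    def __init__(self):
        super().__init__()
        self.items = []    # one list of identifier strings per item
        self._stack = []   # role of each open element: "item", "id", or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never get a matching end tag
        classes = (dict(attrs).get("class") or "").split()
        role = None
        if ITEM_CLASS in classes:
            role = "item"
            self.items.append([])
        elif ID_CLASS in classes and "item" in self._stack:
            role = "id"
            self.items[-1].append("")
        self._stack.append(role)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only text inside an identifier element is collected.
        if "id" in self._stack:
            self.items[-1][-1] += data


def identifiers_by_item(html_text):
    """Return a list of identifier lists, one per item, in page order."""
    parser = IdentifierParser()
    parser.feed(html_text)
    parser.close()
    return [[ident.strip() for ident in ids] for ids in parser.items]
```

The point of the sketch is the pairing: identifiers outside any item are ignored, and each item keeps its own list, which is what browse views and search result lists need.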
For the two of you who haven't deleted this and moved on long ago, some links:

http://ocoins.info/
- more details about COinS, with links to OpenURL specs and COinS implementations

http://www.openarchives.org/OAI/openarchivesprotocol.html
- OAI-PMH

http://curtis.med.yale.edu/dchud/log/project/unalog/unalog-now-speaks-xfolk
- some samples of how an identifier microformat could look, and work, with COinS

http://microformats.org/wiki/cite-formats#OpenURL
- an example of how OpenURL profiles for articles, books, etc. might translate to microformats

Thank you for your consideration... really!

_______________________________________________
microformats-discuss mailing list
[email protected]
http://microformats.org/mailman/listinfo/microformats-discuss
