Houghton,Andrew wrote:
Let's separate your argument into two pieces: identification and
resolution. The DOI is the identifier, and it inherently doesn't
tie itself to any resolution mechanism. So creating an info URI
for it is meaningless; it's just another alias for the DOI. I
can create an HTTP resolution mechanism for DOIs by doing:
http://resolve.example.org/?doi=10.1111/j.1475-4983.2007.00728.x
or
http://resolve.example.org/?uri=info:doi/10.1111/j.1475-4983.2007.00728.x
since the info URI contains the "natural" DOI identifier, wrapping it
in a URI scheme has no value when I could have used the DOI identifier
directly, as in the first HTTP resolution example.
I disagree that wrapping it in a URI scheme has no value. We have a
great deal of software and many schemas built to store URIs. Even if
they don't know what the URI is or what can be done with it, we have
infrastructure in place for dealing with URIs.
So there is value in wrapping a 'natural' identifier in a URI, even if
that URI does not carry its own resolution mechanism with it. I have
run into this in several places in my own work.
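
To illustrate, here is a minimal sketch in Python of wrapping a
"natural" DOI in an info URI so that generic URI-handling code can
store and parse it without knowing anything about DOIs. The function
name doi_to_info_uri is made up for this example, and the
percent-encoding rule is a simplifying assumption rather than the full
info:doi normalization rules.

```python
from urllib.parse import quote, urlsplit


def doi_to_info_uri(doi: str) -> str:
    """Wrap a bare DOI in the info: URI scheme (illustrative helper).

    Assumption: characters other than "/" and the unreserved set are
    percent-encoded; real normalization rules may differ.
    """
    return "info:doi/" + quote(doi, safe="/")


uri = doi_to_info_uri("10.1111/j.1475-4983.2007.00728.x")

# Generic URI tooling can handle the result without understanding DOIs:
parts = urlsplit(uri)
scheme, path = parts.scheme, parts.path
```

Nothing here resolves anything. The point is only that the identifier
now fits anywhere a URI is expected.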
I share Mike's concerns about tying resolution to identification in one
mechanism. As a general design principle or 'pattern', trying to make
one mechanism do two jobs at once is a 'bad smell'. It's in fact (I
hope this isn't too far afield) how I'd sum up much of the failure of
AACR2/MARC with our 'controlled headings' (I expand on this on my
blog, e.g. at
http://bibwild.wordpress.com/2008/01/17/identifiers-and-display-labels-again/).
On the other hand, it is awfully _convenient_ to combine these two
functions in one mechanism. And convenience does matter too.
I can see both sides. So I think we just do what feels right, and when
we all disagree on what feels right, we pick one. I don't share the
opinion of those who think it's obvious that everything should be an
HTTP URI, nor do I share the opinion of those who think it's obvious
that this is a disaster.
DOI is definitely one good example of where One Canonical Resolution
fails. The DOI _resolution_ system fails for me -- it does not reliably
or predictably deliver the right document for my users. But a DOI as an
identifier is still useful for me. Even if that DOI were expressed in a
URI as http://dx.doi.org/resolve/10.1111/j.1475-4983.2007.00728.x, I
STILL wouldn't actually use the HTTP server at dx.doi.org to resolve
it. I'd extract the actual DOI out of it, and use a different
resolution mechanism.
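
A rough sketch of that "extract the DOI and resolve it elsewhere" step,
in Python. The regular expression is a loose heuristic for DOI-shaped
strings, not an official grammar, and the resolver base
http://resolve.example.org/ is the placeholder from the example above,
not a real service. The function names are hypothetical.

```python
import re
from urllib.parse import unquote, urlencode

# Heuristic assumption: a DOI looks like "10.<digits>/<non-space suffix>".
# A URI with extra query parameters after the DOI would need more care.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_doi(identifier: str):
    """Pull the bare DOI out of an info: URI, an HTTP URI, or a bare DOI."""
    match = DOI_PATTERN.search(unquote(identifier))
    return match.group(0) if match else None


def local_resolver_url(doi: str, base: str = "http://resolve.example.org/") -> str:
    """Build a lookup URL for a resolver of our own choosing (placeholder base)."""
    return base + "?" + urlencode({"doi": doi})


forms = [
    "10.1111/j.1475-4983.2007.00728.x",
    "info:doi/10.1111/j.1475-4983.2007.00728.x",
    "http://dx.doi.org/resolve/10.1111/j.1475-4983.2007.00728.x",
]
# All three spellings reduce to the same bare DOI.
dois = {extract_doi(f) for f in forms}
```

Whichever wrapper the DOI arrives in, the identifier survives and the
choice of resolver stays with the consumer.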
Another thing to think about: what happens when the protocol for
resolution changes? Even now, a resolution service could start to
offer, or even insist upon, resolution over HTTPS. But all those
existing identifiers expressed as HTTP URIs should not change; they
are meant to be persistent. So it's already possible for an identifier
originally intended to describe its own resolution to be slightly
wrong. Is this confusing? In the future, maybe we'll have something
other than HTTP entirely.
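
One way a consumer might cope, sketched in Python under the assumption
that the http and https spellings of a URI are meant to name the same
thing: compare identifiers by a key that ignores that difference.
identifier_key is a hypothetical helper, and treating the two schemes
as equivalent is a local policy choice, not something the URI
specifications guarantee.

```python
from urllib.parse import urlsplit


def identifier_key(uri: str):
    """Return a comparison key that treats http and https as one scheme."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme in ("http", "https"):
        scheme = "http"  # assumption: the protocol is not part of identity
    # Host names are case-insensitive; the path is kept exactly as given.
    return (scheme, parts.netloc.lower(), parts.path)


old = "http://dx.doi.org/10.1111/j.1475-4983.2007.00728.x"
new = "https://dx.doi.org/10.1111/j.1475-4983.2007.00728.x"
same_identifier = identifier_key(old) == identifier_key(new)
```

That this kind of workaround is needed at all is the point: the stored
identifier stays fixed while the way to dereference it moves.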
Jonathan