On 4 Apr 2008, at 02:41, Mark Diggory wrote:
But to take this to the point of describing an actual "file", if I
have a file (let's say a pdf) at /path/to/my.pdf and I'm using
content negotiation... I suppose I could have a unique rdf
representation for that pdf that describes it, then /path/to/my.pdf
would return that rdf to rdf browsers. But what if I'm asking the
browser to also render the pdf? Then the Accept header needs to
adjust to negotiate only the pdf.
Remember that the Web doesn't really have a concept of “files”, it has
resources which can have zero or more representations. Hence if you
start out with a bunch of files, you first have to make a decision on
how to model the files as resources and representations.
If you use Apache to serve static files for example, then Apache will
do that modelling for you automatically, in the very simple way where
you end up with one resource per file, the path of the file directly
corresponds to the resource's URI, and the resource has exactly one
representation. That's a sensible modelling, but it's not the only
thing possible and it's not set in stone!
So here's how you could treat this.
/path/to/my.pdf doesn't have content negotiation, it's just the PDF.
/path/to/my.rdf has the RDF version of the PDF.
/path/to/my is a “generic document” which content-negotiates to PDF or
RDF (or perhaps also HTML).
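
As a rough illustration of that three-resource model, here is a minimal Python sketch. The function name resolve, the VARIANTS table, and the /path/to/my.html variant are assumptions made for the example (the text above only says "perhaps also HTML"), and q-values in the Accept header are deliberately ignored.

```python
# Hypothetical layout: one generic resource plus fixed-format variants.
GENERIC = "/path/to/my"
VARIANTS = {
    "application/pdf": "/path/to/my.pdf",
    "application/rdf+xml": "/path/to/my.rdf",
    "text/html": "/path/to/my.html",  # assumed; only suggested in the text
}

def resolve(path, accept="*/*"):
    """Return (variant_path, media_type) for a request, or None if no match."""
    # Fixed-format URIs ignore the Accept header entirely.
    for media_type, variant in VARIANTS.items():
        if path == variant:
            return variant, media_type
    if path != GENERIC:
        return None
    # Generic URI: take the first listed type we can serve.
    # Simplification: q-values are ignored, order of listing decides.
    for item in accept.split(","):
        wanted = item.split(";")[0].strip()
        if wanted in VARIANTS:
            return VARIANTS[wanted], wanted
        if wanted == "*/*":
            return VARIANTS["text/html"], "text/html"
    return None
```

With this sketch, resolve("/path/to/my.pdf", "application/rdf+xml") still yields the PDF, while resolve("/path/to/my", "application/rdf+xml") yields the RDF variant.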
When you pass around links to this bunch of resources, you would
usually pass around /path/to/my, because it's generic and provides
access to whatever format is most appropriate for the client. But if
you definitely want the client to see the PDF, tell him about
/path/to/my.pdf instead.
See also http://www.w3.org/2001/tag/doc/alternatives-discovery.html
which describes this approach.
(Note that if the PDF and RDF contain very different information, e.g.
the PDF is a 100-page-document but the RDF is just a few triples with
name, author and date, then this is not really appropriate. In that case
the two should be treated as different resources and connected via
links, not content negotiation. Content negotiation is best suited
for the case where all the different variants, e.g. HTML and
RDF, have more or less the same information content, and it's just a
question of selecting the variant that the client can most easily
process. Content negotiation is about different formats or languages
of the same information.)
Richard
Serving different document formats from the same URI (content
negotiation) has been a feature of the basic Web protocols for
many, many years.
And with that and the Semantic Web effort, it seems that if OAI-ORE is
about being able to encode descriptions of complex composite digital
objects, the spec had best account for content negotiation.
Because I feel this is a description of that resource, not a
description of a description of the resource. I'd like to be able
to say...
<rdf:RDF ... >
<rdf:Description rdf:about="http://dspace-test.mit.edu/handle/1721.1/36383">
<dc:creator>Abelson, Harold</dc:creator>
<dc:creator>Zittrain, Jonathan</dc:creator>
<ore:describes rdf:resource="http://dspace-test.mit.edu/handle/1721.1/36383#aggregation"/>
</rdf:Description>
<rdf:Description rdf:about="http://dspace-test.mit.edu/handle/1721.1/36383#aggregation">
<ore:aggregates rdf:resource="...."/>
<ore:aggregates rdf:resource="...."/>
</rdf:Description>
</rdf:RDF>
This looks good to me.
But doing this requires that the resolving tool either follow 303
redirects or parse the HTML and extract the location of the RDF
from there; otherwise it will always resolve to the HTML rather than
the RDF whenever it attempts to follow the URI. Can anyone
recommend what a best practice would be in this case?
Not sure I understand the problem. RDF-aware tools need to send a
proper Accept header anyway or they won't get any RDF out of many
Semantic Web sites. And practically all Web tools follow redirects
transparently unless you explicitly tell them not to.
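
To make the client side concrete, here is a small sketch using Python's standard urllib.request. The helper name rdf_request and the q=0.5 weight on text/html are assumptions for illustration, not something prescribed in this thread. The request is only built, not sent, so the example runs without network access.

```python
import urllib.request

def rdf_request(uri):
    """Build a request that prefers RDF/XML but tolerates HTML as a fallback."""
    return urllib.request.Request(
        uri,
        # The q=0.5 fallback weight is an arbitrary choice for this sketch.
        headers={"Accept": "application/rdf+xml, text/html;q=0.5"},
    )

req = rdf_request("http://dspace-test.mit.edu/handle/1721.1/36383")
# Passing req to urllib.request.urlopen() would follow 301/302/303/307
# redirects by default; the call is omitted to keep the sketch offline.
```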
I would actually propose this slightly different setup:
/handle/1721.1/36383 serves either HTML or RDF/XML, based on the
Accept header (content negotiation), directly without a redirect.
Yes that can be done.
/handle/1721.1/36383.html serves only HTML.
/handle/1721.1/36383.rdf serves only RDF/XML.
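
A server-side sketch of the setup proposed above (negotiate directly on the handle URI, no redirect) might look like the following. The names parse_accept and negotiate are hypothetical, the "*/*" wildcard is mapped to HTML as an assumed default, and real Accept handling has more cases (type wildcards such as text/*, specificity rules) than this covers.

```python
SUFFIX = {"text/html": ".html", "application/rdf+xml": ".rdf"}

def parse_accept(header):
    """Parse an Accept header into a list of (media_type, q) pairs."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs.append((parts[0], q))
    return prefs

def negotiate(handle_path, accept="*/*"):
    """Pick the variant to serve directly; None means nothing acceptable."""
    best_type, best_q = None, 0.0
    for media_type, q in parse_accept(accept):
        # Assumption: a bare wildcard falls back to the HTML variant.
        candidate = "text/html" if media_type == "*/*" else media_type
        if candidate in SUFFIX and q > best_q:
            best_type, best_q = candidate, q
    if best_type is None:
        return None  # a server would typically answer 406 here
    return {
        "Content-Type": best_type,
        "Content-Location": handle_path + SUFFIX[best_type],
        "Vary": "Accept",
    }
```

The Content-Location header points the client at the format-specific URI, and Vary: Accept tells caches that the response depends on the request's Accept header.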
We currently have a special path for representations because the
/handle/ space is rather controlled in our application...but
/metadata/handle/1721.1/36383.html
/metadata/handle/1721.1/36383.rdf
/metadata/handle/1721.1/36383.xxx
should be workable for the moment. Eventually, I hope we get to
reuse the same namespace to serve out the different
representations...
This is the approach described here: http://www.w3.org/TR/cooluris/#hashuri
Thanks, it looks to be a good resource.
-Mark
~~~~~~~~~~~~~
Mark R. Diggory - DSpace Developer and Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology