On 27/01/12 13:44, Martynas Jusevicius wrote:
Hey list,
I am looking for an implementation doing what looks like a simple task
(but probably isn't): given a URI, try to extract an RDF Model from it in
all possible ways.
It should use content negotiation: ask for RDF/XML as first priority,
Turtle/N-Triples as the second, and try GRDDL on HTML as the last
option.
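
As a rough illustration of that priority order, the preferences could be expressed as q-values in an HTTP Accept header. This is only a sketch: the class name AcceptHeader, the specific q-values, and the choice of application/n-triples as the N-Triples media type are assumptions for the example, not anything taken from Jena.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds an Accept header value from ordered preferences.
class AcceptHeader {

    // Join media types with their q-values; q=1.0 is the default and is omitted.
    static String build(Map<String, String> preferences) {
        StringJoiner joiner = new StringJoiner(", ");
        for (Map.Entry<String, String> e : preferences.entrySet()) {
            String type = e.getKey();
            String q = e.getValue();
            joiner.add("1.0".equals(q) ? type : type + ";q=" + q);
        }
        return joiner.toString();
    }

    // Assumed weights: RDF/XML first, Turtle/N-Triples second, HTML (for GRDDL) last.
    static String rdfPreferences() {
        Map<String, String> prefs = new LinkedHashMap<>();
        prefs.put("application/rdf+xml", "1.0");
        prefs.put("text/turtle", "0.9");
        prefs.put("application/n-triples", "0.9");
        prefs.put("text/html", "0.5");
        return build(prefs);
    }
}
```

The resulting string would be sent as the Accept header of the GET request; the server remains free to ignore it, so the response Content-Type still has to be checked.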
I can see Jena's RDFReader, JenaReader, and GRDDLReader that all seem
to do a part of what is needed, but I wonder if there already is some
code that combines it all?
Martynas
http://graphity.org
Ah. This has been talked about several times, and fairly recently I went
as far as looking for old notes on it for a JIRA.
What we need (IMO) is a single reader that opens streams then decides
which parser to dispatch to.
FileManager + typed streams.
Add a locator to the FileManager to do conneg.
Streams are typed by whatever MIME info is available;
the decision on which type to believe is then based on
1/ MIME type
2/ file extension
3/ user hint
probably in the order 3-1-2. The exception is text/plain, where either
2 overrides 1 or we route it to Turtle regardless.
Given that, look in a registry and call the real parser.
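
The 3-1-2 decision could be sketched roughly as below. This is one possible reading, not existing Jena code: the class SyntaxChooser, its method names, and the extension table are made up for the example, and for text/plain it lets the extension win and falls back to Turtle only when the extension is unknown.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the "user hint, then MIME type, then extension" rule.
class SyntaxChooser {

    // Assumed extension table; a real one would be larger.
    private static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static {
        BY_EXTENSION.put("rdf", "application/rdf+xml");
        BY_EXTENSION.put("ttl", "text/turtle");
        BY_EXTENSION.put("nt", "application/n-triples");
    }

    // Returns the content type to believe, or null if nothing is known.
    static String choose(String userHint, String mimeType, String uri) {
        if (userHint != null) {
            return userHint;                       // 3/ user hint always wins
        }
        String fromExtension = typeFromExtension(uri);
        String mime = stripParameters(mimeType);
        if ("text/plain".equals(mime)) {
            // text/plain is untrustworthy: extension overrides it, else Turtle
            return fromExtension != null ? fromExtension : "text/turtle";
        }
        if (mime != null) {
            return mime;                           // 1/ MIME type
        }
        return fromExtension;                      // 2/ file extension
    }

    // Drop parameters such as "; charset=UTF-8" and normalise case.
    static String stripParameters(String mimeType) {
        if (mimeType == null) return null;
        int semi = mimeType.indexOf(';');
        String base = semi >= 0 ? mimeType.substring(0, semi) : mimeType;
        base = base.trim().toLowerCase(Locale.ROOT);
        return base.isEmpty() ? null : base;
    }

    // Look up the extension of the last path segment, ignoring query and fragment.
    static String typeFromExtension(String uri) {
        if (uri == null) return null;
        String path = uri;
        int q = path.indexOf('?');
        if (q >= 0) path = path.substring(0, q);
        int h = path.indexOf('#');
        if (h >= 0) path = path.substring(0, h);
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot <= slash || dot == path.length() - 1) return null;
        return BY_EXTENSION.get(path.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}
```

The returned type would then be the key for the registry lookup that selects the real parser; that registry is not shown here.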
I'm not completely sure it will work for RDFa and GRDDL - maybe if the
system is told to read one of those, the dispatching reader believes
that over any conneg and just does it.
What I think we should avoid unless really, really necessary is sniffing
the content.
See org.openjena.riot.web.HttpOp for some code that does HTTP GETs and
dispatches to a handler. I don't think this is the way to go; it's not
nice to pick the results out of the operation.
org.openjena.riot.WebContent has lots of content type constants.
Andy