Hey Andy,

Does this look like a step in the right direction?

https://github.com/Graphity/graphity-analytics/blob/rdf_editor/src/main/java/org/graphity/util/DataManager.java
https://github.com/Graphity/graphity-analytics/blob/rdf_editor/src/main/java/org/graphity/util/LocatorLinkedData.java

I'm using it like this:

        DataManager dm = new DataManager();
        dm.setModelCaching(true);
        model = dm.loadModel(getURI());
        log.debug("Number of Model stmts read: {}", model.size());

I am able to read triples from a remote Turtle file, but haven't tested further.
The classes use portions of Jena code, as indicated.
I'm using an older version of Jena, but classes like
org.openjena.riot.ContentType could simplify the code.
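
To make the content negotiation part concrete: it mostly comes down to sending an Accept header that ranks the formats (RDF/XML first, Turtle/N-Triples second, HTML as a last resort for GRDDL). The following is only a minimal sketch of that idea, not the actual LocatorLinkedData code. The class name AcceptHeaderSketch and the q-values are made up for illustration, and application/n-triples is assumed as the N-Triples media type.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper for illustration; not part of Jena or Graphity. */
final class AcceptHeaderSketch {

    /** Builds an Accept header value from media types mapped to q-values. */
    static String build(Map<String, Double> preferences) {
        StringJoiner joiner = new StringJoiner(", ");
        for (Map.Entry<String, Double> entry : preferences.entrySet()) {
            double q = entry.getValue();
            // q=1 is the HTTP default, so it is left implicit
            joiner.add(q >= 1.0
                    ? entry.getKey()
                    : entry.getKey() + ";q=" + String.format(Locale.ROOT, "%.1f", q));
        }
        return joiner.toString();
    }

    /** Preference order discussed in this thread; the q-values are assumptions. */
    static Map<String, Double> defaultPreferences() {
        // LinkedHashMap keeps insertion order, so the header lists types by priority
        Map<String, Double> prefs = new LinkedHashMap<>();
        prefs.put("application/rdf+xml", 1.0);   // first priority
        prefs.put("text/turtle", 0.9);           // second priority
        prefs.put("application/n-triples", 0.9); // second priority
        prefs.put("text/html", 0.5);             // last resort, for GRDDL
        return prefs;
    }
}
```

The resulting string could then be passed to something like HttpURLConnection.setRequestProperty("Accept", ...) before opening the stream.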

FileManager could be easier to extend but seems like an OK base otherwise.
I also tried implementing URIResolver, which I think makes perfect
sense here. And in general it would be nice to see Jena implementing
more Java XML interfaces since Model is accessible as XML at any point
via RDF/XML.
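
For reference, here is how I read the "3-1-2" decision order from your earlier mail (quoted below): a user hint wins, then the MIME type from the response, then the file extension, with text/plain treated as unreliable. This is only a sketch under my own assumptions. ContentTypeChooser and its extension table are hypothetical, and for text/plain I combined your two alternatives (the extension overrides, otherwise route to Turtle).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of the "3-1-2" decision order; not Jena API. */
final class ContentTypeChooser {

    // Illustrative extension table; a real registry would be larger
    private static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static {
        BY_EXTENSION.put("rdf", "application/rdf+xml");
        BY_EXTENSION.put("ttl", "text/turtle");
        BY_EXTENSION.put("nt", "application/n-triples");
    }

    /**
     * @param userHint   media type the caller insists on, or null
     * @param headerType raw Content-Type header value, or null
     * @param uri        the URI that was dereferenced
     * @return the media type to dispatch on, or null if nothing is known
     */
    static String choose(String userHint, String headerType, String uri) {
        if (userHint != null) {
            return userHint; // 3: the user hint is believed over everything else
        }
        String mime = stripParameters(headerType);
        String byExtension = BY_EXTENSION.get(extensionOf(uri));
        if ("text/plain".equals(mime)) {
            // Exception: the extension overrides text/plain, else fall back to Turtle
            return byExtension != null ? byExtension : "text/turtle";
        }
        if (mime != null) {
            return mime; // 1: MIME type from the response
        }
        return byExtension; // 2: file extension, possibly null
    }

    /** Drops parameters such as "; charset=utf-8" and normalizes case. */
    static String stripParameters(String contentType) {
        if (contentType == null) {
            return null;
        }
        int semicolon = contentType.indexOf(';');
        String base = semicolon >= 0 ? contentType.substring(0, semicolon) : contentType;
        base = base.trim().toLowerCase(Locale.ROOT);
        return base.isEmpty() ? null : base;
    }

    /** Returns the lower-cased extension of the last path segment, or "". */
    static String extensionOf(String uri) {
        String path = uri;
        int cut = path.indexOf('#');
        if (cut >= 0) {
            path = path.substring(0, cut); // drop the fragment
        }
        cut = path.indexOf('?');
        if (cut >= 0) {
            path = path.substring(0, cut); // drop the query string
        }
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot <= slash || dot == path.length() - 1) {
            return ""; // no dot in the last segment, or a trailing dot
        }
        return path.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

The chosen type would then be the key for the registry lookup that calls the real parser. No content sniffing is involved at any step.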

Martynas
graphity.org

On Fri, Jan 27, 2012 at 9:29 PM, Martynas Jusevicius
<[email protected]> wrote:
> Ok, that clears some things up.
>
> So is there a good class to extend, like JenaReader?
> Or should I start from scratch and implement RDFReader?
>
> I think most mainstream Linked Data publishing methods should be
> supported, at least these:
> http://linkeddatabook.com/editions/1.0/#htoc65
>
> Maybe the implementation could be broken into several levels that
> extend each other:
> a) content negotiation only
> b) heuristics (like using file extension) not involving content-sniffing
> c) GRDDL
> d) HTML-sniffing to find <link>s etc
>
> Martynas
>
> On Fri, Jan 27, 2012 at 8:17 PM, Andy Seaborne <[email protected]> wrote:
>> On 27/01/12 13:44, Martynas Jusevicius wrote:
>>>
>>> Hey list,
>>>
>>> I am looking for an implementation doing what looks like a simple task
>>> (but probably isn't): given a URI, try to extract RDF Model from it in
>>> all possible ways.
>>> It should use content negotiation: ask for RDF/XML as first priority,
>>> Turtle/N-Triples as the second, and try GRDDL on HTML as the last
>>> option.
>>>
>>> I can see Jena's RDFReader, JenaReader, and GRDDLReader that all seem
>>> to do a part of what is needed, but I wonder if there already is some
>>> code that combines it all?
>>>
>>> Martynas
>>> http://graphity.org
>>
>>
>> Ah. This is something that's been talked about several times and I went as
>> far as looking for old notes on this for a JIRA moderately recently.
>>
>> What we need (IMO) is a single reader that opens streams then decides which
>> parser to dispatch to.
>>
>> FileManager+typed streams.
>>
>>  Add a locator to the filemanager to do conneg.
>>  Streams are typed by any MIME info
>>
>> then the decision on MIME type to believe is based on
>> 1/ MIME type
>> 2/ file extension
>> 3/ user hint
>>
>> probably in the order 3-1-2.  Except for text/plain when 2 overrides 1 or we
>> route it to Turtle regardless.
>>
>> Given that, look in a registry and call the real parser.
>>
>> I'm not completely sure it will work for RDFa and GRDDL - maybe if the
>> system is told to read one of those, the dispatching reader believes that
>> over any conneg and just does it.
>>
>> What I think we should avoid unless really, really necessary is sniffing the
>> content.
>>
>> org.openjena.riot.web.HttpOp for some code that does HTTP GETs and
>> dispatches to a handler.  I don't think this is the way to go; it's not nice
>> to pick the results out of the operation.
>>
>> org.openjena.riot.WebContent has lots of constants.
>>
>>        Andy
