On 27/01/12 20:29, Martynas Jusevicius wrote:
Ok, that clears some things up.

So is there a good class to extend, like JenaReader?
Or should I start from scratch and implement RDFReader?

I think most mainstream Linked Data publishing methods should be
supported, at least these:
http://linkeddatabook.com/editions/1.0/#htoc65

Maybe the implementation could be broken into several levels that
extend each other:
a) content negotiation only
b) heuristics (like using file extension) not involving content-sniffing
c) GRDDL
d) HTML-sniffing to find <link>s etc

Martynas

That's a good breakdown. (a)+(b) is the area I've been wanting to sort out for some time, if only to make adding parser types a bit more straightforward. RDFa, microX, JSON-LD, native-to-triple generators, ... and definitely not a fixed set.

As a contribution to this discussion for (a)+(b), I've gathered together various bits and pieces into an experimental design:

http://s.apache.org/wbZ

I don't have a sense of how to incorporate (c)+(d) and hope you have ideas here.


The idea is that reading/parsing is orthogonal to a model. In Jena2, there is the possibility of per-model choice of reader implementation. I'm not sure if any use is made of this feature. RDFReaderFImpl is the only implementation active in Jena. Are there any others?

There is a need to configure the parsing process per-read; this is mainly for RDF/XML, as described at:

http://s.apache.org/BMB

which is all done with property settings.

We can separate reading from model. The FileManager already does this with readModel(model, ...).

We can have a factory-style design with a function (static method):

read(Model m, String uri, String hintLang, Context context)

or rather:

read(Sink<Triple> destination, String uri, String hintLang, Context context)

where:

Sink<Triple> destination
  where to send triples generated by the parser.  There is a standard
  wrapper for a graph that turns Sink.send into Graph.add.

String uri
  The place to read.

String hintLang
  A hint to the system of the syntax.

Context context
  A set of property-values to configure the parser.
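To make the shape of that signature a bit more concrete, here is a minimal sketch that compiles without Jena on the classpath. Everything in it is a simplified stand-in and not the real Jena classes: Triple, Sink, GraphSink, Context and ReadSketch are hypothetical names, the "graph" is just a list, and the body of read is a stub that emits one made-up triple where a real implementation would open the uri and run a parser.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for a triple; simplified for illustration only.
final class Triple {
    final String subject, predicate, object;
    Triple(String subject, String predicate, String object) {
        this.subject = subject; this.predicate = predicate; this.object = object;
    }
    @Override public String toString() {
        return "(" + subject + " " + predicate + " " + object + ")";
    }
}

// Stand-in for the destination of parser output.
interface Sink<T> {
    void send(T item);
    void close();
}

// Hypothetical wrapper that turns Sink.send into an "add" on a graph.
// The graph is modelled here as a plain list of triples.
final class GraphSink implements Sink<Triple> {
    private final List<Triple> graph;
    GraphSink(List<Triple> graph) { this.graph = graph; }
    @Override public void send(Triple t) { graph.add(t); }
    @Override public void close() { /* nothing to release in this sketch */ }
}

// A set of property-values to configure the parser.
final class Context {
    private final Map<String, Object> settings = new HashMap<>();
    Context set(String key, Object value) { settings.put(key, value); return this; }
    Object get(String key) { return settings.get(key); }
}

final class ReadSketch {
    // Stub: records which uri and hint it was given as a single triple.
    // The predicate URI is invented for the example.
    static void read(Sink<Triple> destination, String uri, String hintLang, Context context) {
        destination.send(new Triple(uri, "http://example/hintLang", String.valueOf(hintLang)));
        destination.close();
    }

    public static void main(String[] args) {
        List<Triple> graph = new ArrayList<>();
        read(new GraphSink(graph), "http://example/data.ttl", "TTL", new Context());
        System.out.println(graph);
    }
}
```

The point of the Sink version is that read never sees a Model or a Graph, only somewhere to send triples, so the same call could feed a graph, a counter or a stream writer.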

The process is:
       open -> TypeStream
       process -- choose parser, call parser
       ts.close
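Those three steps could be sketched roughly as below, covering only the (a)+(b) cases. This is an illustration under stated assumptions, not the design at the link above: the TypeStream class, the StreamParser interface, the ProcessSketch class, the language labels and the lookup tables are all made up for the example, and the order of checks (content type, then the hint, then the file extension) is my assumption, not something settled.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical: an input stream plus the content type "open" found (may be null).
final class TypeStream implements AutoCloseable {
    final InputStream in;
    final String contentType;
    boolean closed = false;
    TypeStream(InputStream in, String contentType) { this.in = in; this.contentType = contentType; }
    @Override public void close() {
        closed = true;
        try { in.close(); } catch (IOException e) { throw new UncheckedIOException(e); }
    }
}

// Hypothetical parser hook; returns a label so the sketch has something to show.
interface StreamParser {
    String parse(TypeStream ts);
}

final class ProcessSketch {
    static final Map<String, String> BY_CONTENT_TYPE = new HashMap<>();
    static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static final Map<String, StreamParser> PARSERS = new HashMap<>();
    static {
        BY_CONTENT_TYPE.put("text/turtle", "TTL");
        BY_CONTENT_TYPE.put("application/rdf+xml", "RDFXML");
        BY_EXTENSION.put("ttl", "TTL");
        BY_EXTENSION.put("rdf", "RDFXML");
        BY_EXTENSION.put("nt", "NT");
        // Stub parsers: a real one would read ts.in and send triples to a sink.
        PARSERS.put("TTL", ts -> "TTL");
        PARSERS.put("RDFXML", ts -> "RDFXML");
        PARSERS.put("NT", ts -> "NT");
    }

    // (a) content type first, then the caller's hint, then (b) the file extension.
    static String chooseLang(String contentType, String uri, String hintLang) {
        if (contentType != null) {
            int semi = contentType.indexOf(';');   // drop parameters such as charset
            String base = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                    .trim().toLowerCase(Locale.ROOT);
            String lang = BY_CONTENT_TYPE.get(base);
            if (lang != null) return lang;
        }
        if (hintLang != null) return hintLang;
        String name = uri.substring(uri.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? null : BY_EXTENSION.get(name.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    // process -- choose parser, call parser; the stream is closed on the way out.
    static String process(TypeStream ts, String uri, String hintLang) {
        try (TypeStream stream = ts) {
            String lang = chooseLang(stream.contentType, uri, hintLang);
            StreamParser parser = (lang == null) ? null : PARSERS.get(lang);
            if (parser == null) throw new IllegalArgumentException("No parser for " + uri);
            return parser.parse(stream);
        }
    }

    public static void main(String[] args) {
        TypeStream ts = new TypeStream(new ByteArrayInputStream(new byte[0]), "text/turtle; charset=utf-8");
        System.out.println(process(ts, "http://example/data", null));
    }
}
```

Because the parsers sit in a map, not a switch, adding a new syntax is a matter of registering one more entry, which is the "not a fixed set" part.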

"open" can use a FileManager to look in a list of places for the "uri" (actually, a general label - maybe in the filing system, a Java resource, a zip file, on the web, a servlet context, in a cache, ...). A nice feature of the file manager is that you can turn off locations - e.g. leave out the file system component and the local file system isn't accessible, which is good for servers.
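The "list of places" idea can be illustrated with a small stand-alone sketch. It does not use the real FileManager classes; SourceLocator, MapLocator and LocatorChain are hypothetical names, and in-memory maps stand in for the file system and the cache so that the example runs anywhere. What it shows is only the lookup order and the effect of leaving a location out.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical: one place to look for a label. Returns null if not found there.
interface SourceLocator {
    InputStream open(String label);
}

// Stand-in location backed by a map (pretend file system, cache, ...).
final class MapLocator implements SourceLocator {
    private final Map<String, String> contents = new HashMap<>();
    MapLocator put(String label, String body) { contents.put(label, body); return this; }
    @Override public InputStream open(String label) {
        String body = contents.get(label);
        return body == null ? null : new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
    }
}

// Tries each configured location in order; a location that was never added
// simply cannot be reached through this chain.
final class LocatorChain {
    private final List<SourceLocator> locators = new ArrayList<>();
    LocatorChain add(SourceLocator loc) { locators.add(loc); return this; }
    InputStream open(String label) {
        for (SourceLocator loc : locators) {
            InputStream in = loc.open(label);
            if (in != null) return in;
        }
        return null;
    }
}

final class LocatorDemo {
    public static void main(String[] args) {
        MapLocator files = new MapLocator().put("data.ttl", "# local file");
        MapLocator cache = new MapLocator().put("http://example/data.ttl", "# cached copy");

        LocatorChain desktop = new LocatorChain().add(files).add(cache);
        LocatorChain server = new LocatorChain().add(cache);   // no file system location

        System.out.println(desktop.open("data.ttl") != null);
        System.out.println(server.open("data.ttl") != null);
    }
}
```

The server-side chain has no way to reach "data.ttl" because the location that knows about it was never added, which is the property that makes this attractive for servers.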

        Andy



