Hi,

I maintain an RDF Haskell library, and I would like to look towards Jena
for inspiration on improving the API.

Currently, there are two RDF graph implementations in the library: 1)
one that stores the triples as a plain list of (subject, predicate, object)
tuples of nodes, and 2) one that stores them as a map from each subject to
a nested map from predicate to a list of objects. The instance names in the
API for the RDF type class are not very intuitive to the RDF domain expert.
Here are two use case examples:

Right (rdf :: TriplesGraph) <- parseFile NTriplesParser "my_file.nt"
Right (rdf :: MGraph) <- parseFile NTriplesParser "my_file.nt"

One might ask: what is the internal structure of `TriplesGraph` and
`MGraph`? It certainly isn't clear from their names. A better design would
be for the user to choose the graph structure in memory that reflects how
the triples are indexed, perhaps in line with some application specific
needs about how the RDF graph should be searched. For example, indexed on
SP keys mapping to O, or SO mapping to P, or OP mapping to S, or S mapping
to O, and so on.
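
To make the idea concrete, here is a minimal sketch of graph types named
after the index they maintain. Everything in it is hypothetical and not
part of the current library: `Node` and `Triple` are simplified stand-ins
for the real node types, and `SPtoO`, `OPtoS`, `indexSPtoO`, `indexOPtoS`,
`objectsOf` and `subjectsOf` are names made up for illustration.

```haskell
import qualified Data.Map.Strict as Map

-- Simplified stand-ins for the library's node and triple types (assumption).
type Node   = String
type Triple = (Node, Node, Node)   -- (subject, predicate, object)

-- Each graph type is named after the index it maintains (hypothetical names).
newtype SPtoO = SPtoO (Map.Map (Node, Node) [Node])
newtype OPtoS = OPtoS (Map.Map (Node, Node) [Node])

-- Build a (subject, predicate) -> [object] index from a list of triples.
indexSPtoO :: [Triple] -> SPtoO
indexSPtoO ts =
  SPtoO (Map.fromListWith (++) [ ((s, p), [o]) | (s, p, o) <- ts ])

-- Build an (object, predicate) -> [subject] index from a list of triples.
indexOPtoS :: [Triple] -> OPtoS
indexOPtoS ts =
  OPtoS (Map.fromListWith (++) [ ((o, p), [s]) | (s, p, o) <- ts ])

-- Look up all objects for a given subject and predicate.
objectsOf :: Node -> Node -> SPtoO -> [Node]
objectsOf s p (SPtoO m) = Map.findWithDefault [] (s, p) m

-- Look up all subjects for a given object and predicate.
subjectsOf :: Node -> Node -> OPtoS -> [Node]
subjectsOf o p (OPtoS m) = Map.findWithDefault [] (o, p) m
```

With types like these, the annotation at the parse site would state the
index directly, e.g. `rdf :: SPtoO` rather than `rdf :: MGraph`.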

Where should I be looking in the Jena API to find out how it lets Java
programmers A) index a graph while it is being populated with triples
during parsing, and B) index an already populated RDF graph? Does the Jena
API allow the
programmer to inspect the indexing that has been applied to an RDF graph in
memory? E.g. can I find out whether an in-memory RDF graph is indexed on SO
mapping to P? If so, is this reflected by the instantiated class holding
the data, e.g. (myGraph instanceof SOtoPGraph), or is it reflected by
method calls, e.g. bool indexedBySO(myGraph), or is it not possible to
inspect previous indexing routines on an in-memory RDF graph with Jena?
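
For comparison, this is roughly how the two options would look on the
Haskell side. It is only a sketch of the design question, not Jena's API
and not the current library: the `Index` type, the `Graph` class with its
`indexedBy` method, and the `ListGraph` and `SOtoPGraph` types are all
hypothetical names.

```haskell
import qualified Data.Map.Strict as Map

-- Simplified stand-ins for the library's node and triple types (assumption).
type Node   = String
type Triple = (Node, Node, Node)   -- (subject, predicate, object)

-- Hypothetical tag describing which index a graph maintains.
data Index = Unindexed | SPtoO | SOtoP | OPtoS
  deriving (Eq, Show)

-- Hypothetical class: every graph representation can report its index.
class Graph g where
  fromTriples :: [Triple] -> g
  indexedBy   :: g -> Index

-- Option 1: the index is visible in the type name itself.
newtype ListGraph  = ListGraph [Triple]
newtype SOtoPGraph = SOtoPGraph (Map.Map (Node, Node) [Node])

-- Option 2: the index is also reported by a method call.
instance Graph ListGraph where
  fromTriples = ListGraph
  indexedBy _ = Unindexed

instance Graph SOtoPGraph where
  fromTriples ts =
    SOtoPGraph (Map.fromListWith (++) [ ((s, o), [p]) | (s, p, o) <- ts ])
  indexedBy _ = SOtoP
```

Here `indexedBy (fromTriples ts :: SOtoPGraph)` plays the role of the
indexedBySO(myGraph) call above, while the type annotation plays the role
of the instanceof check.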

Thanks!

--
Rob Stewart
