Unrelated to the above, but still on the SPARQL Update implementation:
another possible change is to add a method to either the GraphStore or
DatasetGraph interface that creates and adds a named graph.  This would be
useful for the SPARQL Update create method so that a native graph could be
created.  The
current mechanism of creating a default in-memory Jena graph and adding that
to the GraphStore works, but seems a little ugly because of the extra work
to create an object that just gets iterated over and then thrown away if the
store creates its own graph object to replace it during the add call.
Another benefit would be for users of the graph store to have a standard way
of creating new graphs that are native to the GraphStore.
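
To make the difference concrete, here is a rough sketch. It does not use the Jena API: SketchGraph, MemGraph, NativeGraph, SketchStore and the createGraph name are all hypothetical stand-ins, and graph names and triples are plain strings. It only contrasts copying a throwaway graph during the add call with asking the store for a native graph directly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins, not the Jena API: names and triples are plain strings.
interface SketchGraph {
    void add(String triple);
    List<String> triples();
}

// Plays the role of a default in-memory graph.
class MemGraph implements SketchGraph {
    private final List<String> data = new ArrayList<>();
    public void add(String triple) { data.add(triple); }
    public List<String> triples() { return data; }
}

// Plays the role of a graph object owned by the store itself.
class NativeGraph extends MemGraph { }

class SketchStore {
    private final Map<String, SketchGraph> graphs = new LinkedHashMap<>();

    // Current mechanism: the supplied graph is iterated over and then
    // discarded, because the store replaces it with its own graph object.
    void addGraph(String name, SketchGraph supplied) {
        SketchGraph own = new NativeGraph();
        for (String t : supplied.triples()) {
            own.add(t);
        }
        graphs.put(name, own);
    }

    // Proposed addition (hypothetical name): create and register a native
    // graph in one step, with no temporary object.
    SketchGraph createGraph(String name) {
        SketchGraph own = new NativeGraph();
        graphs.put(name, own);
        return own;
    }

    SketchGraph getGraph(String name) { return graphs.get(name); }
}

class CreateGraphDemo {
    public static void main(String[] args) {
        SketchStore store = new SketchStore();
        SketchGraph temp = new MemGraph();      // extra object, thrown away
        store.addGraph("g1", temp);
        SketchGraph g2 = store.createGraph("g2"); // native from the start
        g2.add("s p o");
        System.out.println(store.getGraph("g2").triples());
    }
}
```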

-Stephen

Hi Stephen,

DatasetGraph has the methods addGraph(Node, Graph) and getGraph(Node graphNode).

For some storage layers all graphs "exist" in the sense that you can get a graph from the store and an object is returned. You can call getGraph() and add triples to the result, and getGraph() creates a graph if necessary. TDB does not know the difference between an empty graph and a non-existent graph. Its graph objects are views on the data storage.

The overridable UpdateEngineWorker.execInsert uses getGraph: it instantiates quads and separates the triples by which graph they go to. It then loops to insert them, using getGraph to get each graph and do a bulk insert (from an iterator). These two steps could and should be done together to make inserts more streaming, provided the store does not mind the possibility of lots of duplicates. The current implementation is more debuggable, which was a helpful feature at the time.
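
A rough sketch of the two shapes, not the actual UpdateEngineWorker code: SketchQuad and InsertSketch are hypothetical names, a plain map of lists stands in for the store and for getGraph, and duplicate handling is not modelled at all.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a quad: a graph name plus a triple, both strings.
final class SketchQuad {
    final String graph;
    final String triple;

    SketchQuad(String graph, String triple) {
        this.graph = graph;
        this.triple = triple;
    }
}

class InsertSketch {
    // Step 1 of the two-step shape: separate the triples by target graph.
    static Map<String, List<String>> groupByGraph(List<SketchQuad> quads) {
        Map<String, List<String>> byGraph = new LinkedHashMap<>();
        for (SketchQuad q : quads) {
            byGraph.computeIfAbsent(q.graph, k -> new ArrayList<>()).add(q.triple);
        }
        return byGraph;
    }

    // Step 2: one bulk insert per graph. The computeIfAbsent call stands in
    // for getGraph() creating the graph if necessary.
    static void twoStepInsert(Map<String, List<String>> store, List<SketchQuad> quads) {
        for (Map.Entry<String, List<String>> e : groupByGraph(quads).entrySet()) {
            store.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
        }
    }

    // Combined shape: each quad goes straight to its graph as it arrives,
    // so nothing is buffered between the two steps.
    static void streamingInsert(Map<String, List<String>> store, Iterable<SketchQuad> quads) {
        for (SketchQuad q : quads) {
            store.computeIfAbsent(q.graph, k -> new ArrayList<>()).add(q.triple);
        }
    }

    public static void main(String[] args) {
        List<SketchQuad> quads = new ArrayList<>();
        quads.add(new SketchQuad("g1", "a b c"));
        quads.add(new SketchQuad("g2", "d e f"));
        quads.add(new SketchQuad("g1", "g h i"));

        Map<String, List<String>> store = new LinkedHashMap<>();
        streamingInsert(store, quads);
        System.out.println(store);
    }
}
```

The two-step version keeps an intermediate map that is easy to inspect in a debugger; the streaming version trades that away for not holding the grouped triples in memory.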

DatasetGraphTriplesQuads tries to bridge from the quads/triples viewpoint to the graph-centric viewpoint by dispatching operations to the default or named graphs. DatasetGraphCaching refines that with a cache for creating graphs under getGraph. If there is no graph in the cache, it calls

 abstract protected Graph _createNamedGraph(Node graphNode) ;
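
As a rough illustration of that pattern only, not the real DatasetGraphCaching source: in the sketch below the class names, the String graph name, the generic graph type G and the created counter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the caching pattern. G stands in for the graph type
// and String for the graph-name node.
abstract class CachingDatasetSketch<G> {
    private final Map<String, G> cache = new HashMap<>();
    int created = 0; // only here to make the behaviour observable

    // Counterpart of _createNamedGraph: the subclass supplies the
    // store-specific graph object.
    protected abstract G createNamedGraph(String graphName);

    // Return the cached graph, creating and caching it on a miss.
    G getGraph(String graphName) {
        G g = cache.get(graphName);
        if (g == null) {
            g = createNamedGraph(graphName);
            created++;
            cache.put(graphName, g);
        }
        return g;
    }
}

// A trivial subclass whose "graphs" are lists of triple strings.
class ListBackedDatasetSketch extends CachingDatasetSketch<List<String>> {
    @Override
    protected List<String> createNamedGraph(String graphName) {
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        ListBackedDatasetSketch ds = new ListBackedDatasetSketch();
        ds.getGraph("g1").add("s p o");
        // The second call hits the cache, so the triple is still there.
        System.out.println(ds.getGraph("g1") + " created=" + ds.created);
    }
}
```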

I'm not convinced the class hierarchy is the best design. It's as if there are two hierarchies, one for triple/quad-centric storage with implicit graphs and one for storage using explicit graphs that store triples, and the two hierarchies seem to duplicate each other.


        Andy
