On 10/09/15 00:20, Zen 98052 wrote:
Hi,
I am new to Jena programming.
I am using Jena's SPARQL API to process queries and translate them
into accesses to our custom back-end storage. I am now familiar
with the basics, such as subclassing GraphBase and QueryIter with my
own implementations, but I need to learn more about extending query
optimization.
I found this doc
(http://jena.apache.org/documentation/query/arq-query-eval.html) to be a
good start, but it's not comprehensive enough for a beginner like me.
The Java API doc doesn't have much info either; for example, I was
looking at the OpExecutorFactory class at
https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/sparql/engine/main/OpExecutorFactory.html
One option is to read the Jena source code and try to understand what it
does, but it'd be nice if there were some documentation I could
read first before I have to look at the source code.
Thanks, Z
Hi,
Without knowing more about your custom layer, it's hard to be very
definite. You are probably better off looking at the code because when
you come to integrate your work, you'll want to debug it and that means
seeing how the internals work.
For SPARQL processing, you don't (normally) need to implement Graph.
Graph is for the RDF API.
SPARQL is based on DatasetGraph; there is a hierarchy of classes
covering many cases, so usually you only need to plug into one of
those: DatasetGraphTriplesQuads, for example, even if you have a
triples-only storage system.
SPARQL processing goes via OpExecutor. OpExecutor itself is a general
executor for the SPARQL algebra. See OpExecutorTDB1 for an example of an
OpExecutor extended for a specific storage layer.
OpExecutorTDB1 is the TDB specialization of OpExecutor to provide access
to TDB-stored RDF; it only needs to provide the points where a SPARQL
query actually touches the data: OpQuadPattern.
There are a few others there - OpBGP, OpGraph for the case of a TDB graph
in a general mixed dataset (i.e. a mix of storage layers).
OpFilter is implemented by TDB to place filters within basic patterns
because at that point, it is working with internal identifiers for RDF
terms.
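
To make the shape of that concrete, here is a rough, self-contained
sketch of the pattern: a general executor walks the whole algebra, and a
storage-specific subclass overrides only the point where the query
touches the data. The types below (Op, QuadPattern, Join,
GeneralExecutor, CustomStoreExecutor) are toy stand-ins invented for
illustration; they are not Jena's classes and do not reproduce Jena's
method signatures. In Jena the corresponding pieces would be OpExecutor
and a subclass of it, so treat this only as an outline of the idea.

```java
import java.util.ArrayList;
import java.util.List;

public class OpExecutorSketch {

    // Toy algebra operators: hypothetical stand-ins, NOT Jena's Op classes.
    interface Op {}

    static final class QuadPattern implements Op {
        final String pattern;
        QuadPattern(String pattern) { this.pattern = pattern; }
    }

    static final class Join implements Op {
        final Op left;
        final Op right;
        Join(Op left, Op right) { this.left = left; this.right = right; }
    }

    // General executor: knows how to walk every operator in the toy algebra.
    static class GeneralExecutor {
        // Records which code path handled each pattern, in execution order.
        final List<String> trace = new ArrayList<>();

        void execute(Op op) {
            if (op instanceof Join) {
                Join join = (Join) op;
                execute(join.left);
                execute(join.right);
            } else if (op instanceof QuadPattern) {
                executeQuadPattern((QuadPattern) op);
            } else {
                throw new IllegalArgumentException("Unknown operator: " + op);
            }
        }

        // Default data access: a generic, storage-agnostic match.
        protected void executeQuadPattern(QuadPattern qp) {
            trace.add("generic:" + qp.pattern);
        }
    }

    // Storage-specific executor: overrides only the data access point and
    // inherits the handling of everything else (here, Join).
    static class CustomStoreExecutor extends GeneralExecutor {
        @Override
        protected void executeQuadPattern(QuadPattern qp) {
            trace.add("custom-store:" + qp.pattern);
        }
    }

    // Runs the operator tree and returns the recorded trace.
    static List<String> run(GeneralExecutor executor, Op op) {
        executor.execute(op);
        return executor.trace;
    }

    public static void main(String[] args) {
        Op query = new Join(new QuadPattern("?s :p ?o"),
                            new QuadPattern("?o :q ?x"));
        System.out.println(run(new CustomStoreExecutor(), query));
    }
}
```

The join logic stays in the general executor; only the pattern-matching
step is replaced, which is why a storage layer usually needs to override
very little.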
Hope that helps,
Andy