Hi Shijie,
The jena.sourceforge.net site is out of date - the website is now
jena.apache.org. I'm not sure how you got there, as we have tried to put
redirects at every entry point to lead to the new site, while leaving the
old material in place.
SPARQL execution is driven by a class called OpExecutor. The default is a
general-purpose executor and, if you do nothing else, it will call
Graph.find to execute. This is what Claude is describing and it will work.
With deeper integration you get better performance, but it's more work.
Getting the graph route to work is well worth it before moving on
because both SPARQL and the API will work with just Graph.find. You
can then write test cases for the deeper integration code that compare
its results against the simple integration.
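
To illustrate the testing idea, here is a minimal standalone sketch. It does not use any Jena classes: FindSketch, findByScan, findByIndex and sameAnswers are all hypothetical names, triples are modelled as plain String[3] arrays, and null plays the role of a wildcard in a pattern. The scan stands in for the simple Graph.find route and the subject index stands in for a deeper, store-specific lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a store; this is NOT Jena's Graph interface.
// A triple is a String[3] of {subject, predicate, object}.
// In a pattern, null means "match anything".
final class FindSketch {

    // Simple integration: answer a pattern with a linear scan.
    static Set<List<String>> findByScan(List<String[]> triples,
                                        String s, String p, String o) {
        Set<List<String>> out = new HashSet<>();
        for (String[] t : triples) {
            if ((s == null || s.equals(t[0]))
                    && (p == null || p.equals(t[1]))
                    && (o == null || o.equals(t[2]))) {
                out.add(Arrays.asList(t));
            }
        }
        return out;
    }

    // Build a subject index, roughly "all triples of a subject in one row".
    static Map<String, List<String[]>> indexBySubject(List<String[]> triples) {
        Map<String, List<String[]>> index = new HashMap<>();
        for (String[] t : triples) {
            index.computeIfAbsent(t[0], k -> new ArrayList<>()).add(t);
        }
        return index;
    }

    // Deeper integration: use the index when the subject is known,
    // otherwise fall back to the scan.
    static Set<List<String>> findByIndex(Map<String, List<String[]>> index,
                                         List<String[]> triples,
                                         String s, String p, String o) {
        if (s == null) {
            return findByScan(triples, s, p, o);
        }
        List<String[]> row = index.get(s);
        if (row == null) {
            row = new ArrayList<>();
        }
        return findByScan(row, s, p, o);
    }

    // Test helper: both routes must give the same set of triples.
    static boolean sameAnswers(List<String[]> triples,
                               String s, String p, String o) {
        Map<String, List<String[]>> index = indexBySubject(triples);
        return findByScan(triples, s, p, o)
                .equals(findByIndex(index, triples, s, p, o));
    }
}
```

A test suite would call sameAnswers over many patterns (bound subject, unbound subject, no matches) so that any disagreement between the fast path and the simple path shows up as a failure.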
You may also wish to add your own algebra optimizations - a query engine
can rewrite the algebra before execution.
Example of a derived OpExecutor:
https://svn.apache.org/repos/asf/jena/trunk/jena-arq/src-examples/arq/examples/bgpmatching/
Example of a QueryEngine:
https://svn.apache.org/repos/asf/jena/trunk/jena-arq/src-examples/arq/examples/engine/
What storage layout of RDF are you using with HBase? One triple per row,
or all triples with the same subject in one row, or something else?
You might want to hook in to handle filters on triples. You do that by
extending OpExecutor, specifically execute(OpFilter,...), and checking
whether the expression being filtered on is one you can handle directly
(if not, just pass it on up to super.execute(OpFilter,...)).
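
The shape of that override, "handle what you recognise, otherwise call super", can be sketched without ARQ. Everything below is a hypothetical stand-in rather than the real OpExecutor API: BaseExecutorSketch plays the general-purpose executor, StoreExecutorSketch plays the derived one, and AtLeastSketch is an invented filter type that the store is assumed to answer directly from sorted data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.IntPredicate;

// Hypothetical filter the store knows how to answer directly: value >= min.
class AtLeastSketch implements IntPredicate {
    final int min;
    AtLeastSketch(int min) { this.min = min; }
    @Override public boolean test(int v) { return v >= min; }
}

// Stand-in for the general-purpose executor (NOT ARQ's OpExecutor).
class BaseExecutorSketch {
    protected final TreeSet<Integer> values;
    BaseExecutorSketch(TreeSet<Integer> values) { this.values = values; }

    // General path: test every value against the filter.
    List<Integer> executeFilter(IntPredicate filter) {
        List<Integer> out = new ArrayList<>();
        for (int v : values) {
            if (filter.test(v)) {
                out.add(v);
            }
        }
        return out;
    }
}

// Stand-in for a derived executor with deeper store integration.
class StoreExecutorSketch extends BaseExecutorSketch {
    int pushedDown = 0; // counts filters answered directly by the store

    StoreExecutorSketch(TreeSet<Integer> values) { super(values); }

    @Override
    List<Integer> executeFilter(IntPredicate filter) {
        if (filter instanceof AtLeastSketch) {
            // Recognised filter: answer it with a range read on sorted data.
            pushedDown++;
            int min = ((AtLeastSketch) filter).min;
            return new ArrayList<>(values.tailSet(min, true));
        }
        // Anything else: pass it up to the general-purpose path.
        return super.executeFilter(filter);
    }
}
```

Because the fallback is always available, the derived executor only needs to recognise the few filter shapes the store handles well; everything else keeps working through the general path.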
StageGenerators: you may come across these. They are a packaged up form
of hooking in at the level of basic graph pattern.
I've been rewriting query execution as part of another project (Lizard -
a cluster query engine for SPARQL). As part of that, I've cleaned up
the OpExecutor stack and got rid of StageGenerators, to reduce the
number of ways to integrate into ARQ. When that's stable and tested,
I'll move it into ARQ (it's in my github account at the moment under
"quack").
Andy
"TripleMatch" is redundant - it comes from a time long ago before
complex query languages like RDQL and SPARQL.
Triples are TripleMatches and there is nowadays no use of TripleMatch
except for Triple. TripleMatch can be removed.