On 15/10/15 11:39, Giorgio Stefanoni wrote:
Hello,

I recently read the paper ‘SPARQL Basic Graph Pattern Optimisation Using 
Selectivity Estimation’ by Markus Stocker et al. from WWW 2008. This paper 
proposes a new method based on graph statistics and heuristics for estimating 
the result size of a simple SPARQL query (simple = conjunction of BGPs). As far 
as I know, this approach was implemented and was part of ARQ.

Markus implemented his work on top of ARQ, but it is not part of the distribution.


I have a couple of questions regarding static query optimisation in JENA/ARQ:

Does the JENA/ARQ query optimiser still follow the approach by Markus Stocker et 
al.? If not, is there a place where I can read about query optimisation in 
JENA/ARQ?

Given a SPARQL query q, is it possible to obtain the estimated size of the result set 
of q? I tried following this example 
(https://jena.apache.org/documentation/query/explain.html); however, the 
result of explaining the query does not include the cardinality estimation.

Best regards,

Giorgio


Query optimization happens in two stages:

1/ Rewrite the algebra expression into a better one (the "high-level"
optimization), for example by filter placement.  See Algebra.optimize.

2/ Reorder basic graph patterns. This is done as the query executes, so you will see its effect during execution. See ReorderWeighted and ReorderFixed.

https://jena.apache.org/documentation/tdb/optimizer.html

The "fixed" style is applied in-memory as well. In practice, the fixed algorithm does a reliable and fairly good job in the majority of cases.
Statistics-based optimization is only really needed when the fixed algorithm fails.
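
To illustrate the general idea of reordering without statistics, here is a minimal, self-contained sketch. It is not ARQ's ReorderFixed code: the class name ReorderSketch, the TriplePattern type and the weight function (simply the number of still-unbound variable positions) are assumptions made up for this example. It greedily picks the pattern with the fewest unbound positions, then treats that pattern's variables as bound when weighing the rest.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReorderSketch {

    /** Hypothetical triple pattern; a term starting with '?' is a variable. */
    public static final class TriplePattern {
        final String s, p, o;

        public TriplePattern(String s, String p, String o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }

        @Override
        public String toString() {
            return s + " " + p + " " + o;
        }
    }

    /** True if the term is a variable that no earlier pattern has bound. */
    static boolean isUnbound(String term, Set<String> bound) {
        return term.startsWith("?") && !bound.contains(term);
    }

    /** Assumed weight: the number of positions still unbound (lower is better). */
    static int weight(TriplePattern t, Set<String> bound) {
        int w = 0;
        if (isUnbound(t.s, bound)) w++;
        if (isUnbound(t.p, bound)) w++;
        if (isUnbound(t.o, bound)) w++;
        return w;
    }

    /** Greedy reordering; ties keep the original relative order. */
    public static List<TriplePattern> reorder(List<TriplePattern> patterns) {
        List<TriplePattern> remaining = new ArrayList<>(patterns);
        List<TriplePattern> ordered = new ArrayList<>();
        Set<String> bound = new HashSet<>();
        while (!remaining.isEmpty()) {
            int best = 0;
            for (int i = 1; i < remaining.size(); i++) {
                if (weight(remaining.get(i), bound) < weight(remaining.get(best), bound)) {
                    best = i;
                }
            }
            TriplePattern chosen = remaining.remove(best);
            ordered.add(chosen);
            // Variables of the chosen pattern count as bound from now on.
            for (String term : new String[] {chosen.s, chosen.p, chosen.o}) {
                if (term.startsWith("?")) {
                    bound.add(term);
                }
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<TriplePattern> bgp = new ArrayList<>();
        bgp.add(new TriplePattern("?x", "?p", "?o"));
        bgp.add(new TriplePattern("?x", ":name", "\"Alice\""));
        bgp.add(new TriplePattern("?x", ":knows", "?y"));
        for (TriplePattern t : reorder(bgp)) {
            System.out.println(t);
        }
    }
}
```

With the three patterns in main, the most constrained pattern (?x :name "Alice") is moved to the front, followed by ?x :knows ?y and finally ?x ?p ?o. No result sizes are estimated anywhere; the sketch only changes the order of evaluation.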

Optimization does not try to guess the size of the result set. It tries to find a faster way to execute the query mainly by replacing, fairly conservatively, one way of executing with another.

        Andy
