Thank you for your exhaustive reply! Best regards,
Giorgio > On 17 Oct 2015, at 10:16, Andy Seaborne <[email protected]> wrote: > >> On 15/10/15 11:39, Giorgio Stefanoni wrote: >> Hello, >> >> I recently read the paper ‘SPARQL Basic Graph Pattern Optimisation Using >> Selectivity Estimation’ by Markus Stocker et al. from WWW 2008. This paper >> proposes a new method based on graph statistics and heuristics for >> estimating the result size of a simple SPARQL query (simple = conjunction of >> BGPs). As far as I know, this approach was implemented and was part of ARQ. > > Markus used and implemented his work based on ARQ; it is not in the > distribution. > >> >> I have a couple of questions regarding static query optimisation in JENA/ARQ: >> Does JENA/ARQ query optimiser still follow the approach by Markus Stocker et >> al? If no, is there a place where I can read about query optimisation in >> JENA/ARQ? >> Given a SPARQL query q, is it possible to obtain the estimated size of the >> result set of q? I tried following this example >> (https://jena.apache.org/documentation/query/explain.html >> <https://jena.apache.org/documentation/query/explain.html>), however, the >> result of explaining the query does not include the cardinality estimation. >> Best regards, >> >> Giorgio > > Query optimization happens in two stages: > > 1/ Rewrite the algebra to better algebra (called the "high level") such > as filter placement. See Algebra.optimize > > 2/ Reordering basic graph patterns is done as the query executes. You will > see the effect as the query executes. ReorderWeighted, ReorderFixed. > > https://jena.apache.org/documentation/tdb/optimizer.html > > The "fixed" style is applied in-memory as well. In practice, the fixed > algorithm does a reliable and fairly good job in the majority of cases. > Stats based optimization is only really needed when that fails. > > Optimization does not try to guess the size of the result set. It tries to > find a faster way to execute the query mainly by replacing, fairly > conservatively, one way of executing with another. > > Andy
