Re: Does TDB have command to see estimated query execution time and row count ?

Andy Seaborne Mon, 07 Sep 2015 08:11:29 -0700

On 07/09/15 15:37, Wei Zhang wrote:

Hi Andy,


I think there must be criteria for choosing a better plan.
I will look at the source code. Hope it will help...

Thank you very much for your time.

Wei Zhang


Great - I look forward to hearing from you.
        
        Andy

-----Original Message-----
From: Andy Seaborne [mailto:[email protected]]
Sent: Monday, 7 September 2015 11:56 PM
To: [email protected]
Subject: Re: Does TDB have command to see estimated query execution time and 
row count ?

Hi there,

The optimizer does not try to estimate the execution time.  It is not a fully 
fledged, top-to-bottom cost-based optimizer.  It does the best it can based on 
heuristics.  As has been discovered in Jena and elsewhere, SPARQL is also often 
used in a very simple fashion, where the cost of running the optimizer can be 
more than the cost of just doing the query.

QueryExecUtils will actually execute the algebra expression.  I mentioned the 
class because you'll probably want to execute algebra directly to get 
comparisons.  And of course, you can look at the source code!

        Andy

On 07/09/15 13:42, Wei Zhang wrote:

Hi Andy,

Thank you very much for your help!
I think, per your suggestion, I can compare the performance with and without 
the optimizer.
But how can I get the optimizer's estimated query time, which I plan to compare 
with the real execution time?

Do you mean when I execute algebra expressions directly using tools like 
QueryExecUtils, then the time I get can be considered as estimated time?
I am not sure if my understanding is correct...

Best Regards,
Wei Zhang

-----Original Message-----
From: Andy Seaborne [mailto:[email protected]]
Sent: Monday, 7 September 2015 9:26 PM
To: [email protected]
Subject: Re: Does TDB have command to see estimated query execution time and 
row count ?

On 07/09/15 05:30, Wei Zhang wrote:

Dear All,

   According to the documentation 
(https://jena.apache.org/documentation/tdb/optimizer.html), the TDB optimizer 
has both static and dynamic optimizations.
How can I get the estimated query time and row count, rather than the actual 
time and row count, after static/dynamic optimization?
My other question is about "tdbquery -explain": I think it gives the query plan 
after execution, but it does not provide the information I want either.

What I want is to measure the TDB optimizer's performance.

Could anyone help?

Thank you very much for your time.

Best Regards,
Wei


Wei,

I think you have a model of how the optimizer works but it's at odds with what 
it actually does.  It is not strongly based around cost estimation although TDB 
does a little of that.

The high level optimizations, done at the start of query execution, are a set 
of rule based rewrites that look for patterns in the algebra and produce better 
algebra.  In particular, these are not based on the data.
Rather, the rules are ways to turn standard SPARQL algebra (exactly as produced 
by the transformation in the spec) into better (nearly always!) algebra.  That 
includes introducing new operators (like "TopN") as well as rewriting using 
existing operators (like filter/equality into a pattern with that term and a 
BIND).

The optimized algebra is printed by "qparse --print=opt".
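
To make the filter/equality rewrite mentioned above a little more concrete, 
here is a toy sketch.  It is not Jena code: the nested-tuple algebra, the 
"bind" node and the function name rewrite_filter_equality are all made up for 
illustration, and it skips the safety checks a real rewrite would need.  It 
only shows the shape of the idea: the constant is pushed into the pattern and 
the variable is bound again afterwards, with no reference to the data.

```python
def rewrite_filter_equality(op):
    """Toy rule: filter(?v = const, bgp) -> bind(?v, const, bgp with const substituted).

    Algebra nodes are plain tuples (an assumption of this sketch, not Jena's classes):
      ("bgp", [triple, ...])
      ("filter", ("=", var, const), sub_op)
      ("bind", var, const, sub_op)
    Variables are strings starting with "?".
    """
    if op[0] != "filter":
        return op
    _, expr, sub = op
    if expr[0] != "=" or sub[0] != "bgp":
        return op
    _, var, const = expr
    # Only handle the simple "variable = constant" form.
    if not var.startswith("?") or const.startswith("?"):
        return op
    # Substitute the constant for the variable in every triple pattern.
    triples = [tuple(const if term == var else term for term in triple)
               for triple in sub[1]]
    # Re-introduce the variable so later operators can still see it.
    return ("bind", var, const, ("bgp", triples))


before = ("filter", ("=", "?s", "ex:alice"),
          ("bgp", [("?s", "ex:knows", "?o")]))
after = rewrite_filter_equality(before)
print(after)
```

The rule looks only at the structure of the algebra, which is why this kind of 
rewrite can be done once at the start of query execution.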

TDB adds reordering of basic graph patterns, either by the rule based method 
described at that link or a fixed way (roughly - choose the most grounded 
triple pattern, but avoid rdf:type).
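
A rough sketch of what "most grounded first, but avoid rdf:type" could look 
like is below.  This is a toy, not TDB's implementation: the tuple 
representation, the scoring, the tie-break, and the choice to count variables 
bound by earlier patterns as grounded are all assumptions made for the example.

```python
RDF_TYPE = "rdf:type"  # prefixed name used as a plain string in this sketch


def is_var(term):
    return term.startswith("?")


def score(pattern, bound):
    """Higher is better: (number of grounded terms, predicate is not rdf:type).

    A term counts as grounded if it is a constant or a variable that an
    earlier pattern has already bound (an assumption of this sketch).
    """
    grounded = sum(1 for t in pattern if not is_var(t) or t in bound)
    return (grounded, pattern[1] != RDF_TYPE)


def reorder(patterns):
    """Greedily pick the best-scoring remaining triple pattern."""
    remaining = list(patterns)
    bound = set()
    ordered = []
    while remaining:
        best = max(remaining, key=lambda p: score(p, bound))
        remaining.remove(best)
        ordered.append(best)
        # Variables in the chosen pattern are bound for the later patterns.
        bound.update(t for t in best if is_var(t))
    return ordered


bgp = [
    ("?s", "rdf:type", "ex:Person"),
    ("?s", "ex:knows", "?o"),
    ("?s", "ex:email", "mailto:a@example.org"),
]
for triple in reorder(bgp):
    print(triple)
```

In this example the ex:email pattern goes first: it is as grounded as the 
rdf:type pattern, and the tie is broken away from rdf:type.  No statistics 
about the data are consulted.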

There are tools (QueryExecUtils) to execute algebra expressions directly so 
combined with the optimizer switched off, you can try out different 
possibilities.

        Andy
