Hello Sebastian,

With
select * where { ?r a nfo:FileDataObject .
?r nfo:fileName ?f . ?f bif:contains 'breaking' . }

the optimizer decides for some reason that being an nfo:FileDataObject is a
much less frequent property of ?r than having the word "breaking" in a file
name. That's a weird decision, because in the EXPLAIN output the optimizer
says that "breaking" is rare by itself, so "breaking" in a file name should
be even rarer than "breaking" in anything.

The workaround is to tell the optimizer that you know the order of operations 
better than it does, and to reorder the triple patterns:

sparql define sql:select-option "ORDER" select * where {
  ?r <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#fileName> ?f .
  ?f bif:contains 'breaking' .
  ?r a <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject> . }

Without define sql:select-option "ORDER", reordering does not help
because the optimizer pays no attention to the original order of tables
if there are only a few variants. (If the query is complicated and no
define sql:select-option "ORDER" is specified, then writing patterns in
some specific order guarantees that the optimizer will consider this
order of execution among other possibilities, even if there's not enough
time to search through all possible permutations.)
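
To check which order the optimizer actually picks, the two variants can be
compared with explain(), as described in the links Hugh posted. The
following is only a sketch: it assumes the statements are typed into isql,
that a string literal may span several lines there, and that the inner
quotes are escaped with backslashes. The plan text depends on the server
version and the data, so no output is shown.

explain('sparql select * where {
  ?r a <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject> .
  ?r <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#fileName> ?f .
  ?f bif:contains \'breaking\' . }');

explain('sparql define sql:select-option "ORDER" select * where {
  ?r <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#fileName> ?f .
  ?f bif:contains \'breaking\' .
  ?r a <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject> . }');

If the second plan begins with the free-text lookup for "breaking" and only
then checks the type of ?r, the "ORDER" option is taking effect.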

Best Regards,

Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com


On Fri, 2010-04-30 at 15:10 +0200, Sebastian Trüg wrote:
> Hello Hugh,
> 
> Please find the output attached to prevent line breaks.
> 
> Cheers,
> Sebastian
> 
> On 04/30/2010 03:14 AM, Hugh Williams wrote:
> > Hi Sebastian,
> > 
> > Can you please provide the explain output detailing the execution plan for 
> > both queries for comparison, as detailed at:
> > 
> >     http://docs.openlinksw.com/virtuoso/perfdiag.html#perfdiagqueryplans
> >     http://docs.openlinksw.com/virtuoso/fn_explain.html
> > 
> > Note you must include the "sparql" keyword before the start of each query. 
> > Please also provide the output of running the status(); command to provide 
> > access statistics on the server.
> > 
> > Best Regards
> > Hugh Williams
> > Professional Services
> > OpenLink Software
> > Web: http://www.openlinksw.com
> > Support: http://support.openlinksw.com
> > Forums: http://boards.openlinksw.com/support
> > Twitter: http://twitter.com/OpenLink
> > 
> > On 29 Apr 2010, at 16:40, Sebastian Trüg wrote:
> > 
> >> Hello,
> >>
> >> using a V6 server with default indexes I would like to know why the
> >> following queries are different in performance. The store contains
> >> thousands of different graphs but adding a "graph ?g {}" around all
> >> patterns does not change the execution time.
> >>
> >> select * where { ?r a nfo:FileDataObject .
> >>                 ?r nfo:fileName ?f .
> >>                 ?f bif:contains 'breaking' . }
> >>
> >> takes a long time to finish - almost a minute.
> >>
> >>
> >> select * where { ?r nfo:fileName ?f .
> >>                 ?f bif:contains 'breaking' . }
> >>
> >> is finished in no time.
> >>
> >> I just would like to understand the reasons behind the difference. If
> >> possible also a solution. :)
> >>
> >> Cheers,
> >> Sebastian
> >>
> >> ------------------------------------------------------------------------------
> >> _______________________________________________
> >> Virtuoso-users mailing list
> >> [email protected]
> >> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
> > 
> > 