Hi Ivan,
while this is very good to know, it is hard to do generically. After
all, users and developers can create all sorts of queries.
Thanks for the explanation.
Can this be considered a bug in the query optimizer?
Cheers,
Sebastian
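
P.S. One way to apply the workaround without editing each query by hand
is to add the pragma in the client before the query is sent. The helper
below is only a sketch: the function name force_pattern_order is made up
for illustration, and it assumes the pragma can sit at the very start of
the query text, as in Ivan's example. It pins the written order, so the
caller still has to put the most selective pattern first.

```python
ORDER_PRAGMA = 'define sql:select-option "ORDER"'


def force_pattern_order(query: str) -> str:
    """Prefix a SPARQL query with the ORDER pragma (hypothetical helper).

    A leading "sparql" keyword, as typed in isql, is kept in front of
    the pragma. Queries that already start with the pragma are returned
    unchanged apart from whitespace trimming.
    """
    stripped = query.strip()
    parts = stripped.split(None, 1)
    # Detect the optional isql-style "sparql" keyword at the start.
    has_keyword = bool(parts) and parts[0].lower() == "sparql"
    body = parts[1].lstrip() if has_keyword and len(parts) > 1 else stripped
    if has_keyword and len(parts) == 1:
        body = ""
    # Add the pragma only once so the helper is safe to call repeatedly.
    if not body.startswith(ORDER_PRAGMA):
        body = f"{ORDER_PRAGMA} {body}".rstrip()
    return f"sparql {body}" if has_keyword else body


if __name__ == "__main__":
    q = ("select * where { ?r nfo:fileName ?f . "
         "?f bif:contains 'breaking' . ?r a nfo:FileDataObject . }")
    print(force_pattern_order(q))
```

This does not pick the order for you; it only tells the optimizer to keep
whatever order the query already has.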
On 05/03/2010 10:20 AM, Ivan Mikhailov wrote:
> Hello Sebastian,
>
> With
> select * where { ?r a nfo:FileDataObject .
> ?r nfo:fileName ?f . ?f bif:contains 'breaking' . }
>
> the optimizer decides for some reason that being an nfo:FileDataObject is a much
> less frequent property of ?r than containing the word "breaking" in a filename.
> That's a weird decision, because in the EXPLAIN output the optimizer says that
> "breaking" is rare by itself, so "breaking" in a filename should be even
> rarer than "breaking" in anything.
>
> The workaround is to tell the optimizer that you know the order of operations
> better than it and to reorder triple patterns:
>
> sparql define sql:select-option "ORDER" select * where {
> ?r
> <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#fileName> ?f .
> ?f bif:contains 'breaking' .
> ?r a
> <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject> . }
>
> Without define sql:select-option "ORDER", reordering does not help
> because the optimizer pays no attention to the original order of tables if
> there are only a few variants. (If the query is complicated and no define
> sql:select-option "ORDER" is specified, then writing patterns in some
> specific order guarantees that the optimizer will consider this order of
> execution among other possibilities, even if there is not enough time to
> search through all possible permutations.)
>
> Best Regards,
>
> Ivan Mikhailov
> OpenLink Software
> http://virtuoso.openlinksw.com
>
>
> On Fri, 2010-04-30 at 15:10 +0200, Sebastian Trüg wrote:
>> Hello Hugh,
>>
>> Please find the output attached to prevent line breaks.
>>
>> Cheers,
>> Sebastian
>>
>> On 04/30/2010 03:14 AM, Hugh Williams wrote:
>>> Hi Sebastian,
>>>
>>> Can you please provide the explain output detailing the execution plan for
>>> both queries for comparison, as detailed at:
>>>
>>> http://docs.openlinksw.com/virtuoso/perfdiag.html#perfdiagqueryplans
>>> http://docs.openlinksw.com/virtuoso/fn_explain.html
>>>
>>> Note you must include the "sparql" keyword before the start of each query.
>>> Please also provide the output of running the status(); command to provide
>>> access statistics on the server.
>>>
>>> Best Regards
>>> Hugh Williams
>>> Professional Services
>>> OpenLink Software
>>> Web: http://www.openlinksw.com
>>> Support: http://support.openlinksw.com
>>> Forums: http://boards.openlinksw.com/support
>>> Twitter: http://twitter.com/OpenLink
>>>
>>> On 29 Apr 2010, at 16:40, Sebastian Trüg wrote:
>>>
>>>> Hello,
>>>>
>>>> using a V6 server with default indexes I would like to know why the
>>>> following queries are different in performance. The store contains
>>>> thousands of different graphs but adding a "graph ?g {}" around all
>>>> patterns does not change the execution time.
>>>>
>>>> select * where { ?r a nfo:FileDataObject .
>>>> ?r nfo:fileName ?f .
>>>> ?f bif:contains 'breaking' . }
>>>>
>>>> takes a long time to finish - almost a minute.
>>>>
>>>>
>>>> select * where { ?r nfo:fileName ?f .
>>>> ?f bif:contains 'breaking' . }
>>>>
>>>> is finished in no time.
>>>>
>>>> I just would like to understand the reasons behind the difference. If
>>>> possible also a solution. :)
>>>>
>>>> Cheers,
>>>> Sebastian
>>>>
>>>> ------------------------------------------------------------------------------
>>>> _______________________________________________
>>>> Virtuoso-users mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>>>
>>>
>
>