Thanks, Andy. Please see the inline comments.

Best wishes,

June

2015-02-01 22:08 GMT+08:00 Andy Seaborne <[email protected]>:

> On 31/01/15 02:02, 朱曼 wrote:
>
>> select (sum(?subTotal) as ?sum) where {
>>   {select ((2*count(?inst)*?countec/?countc -
>> ?countec/?countc*?countec/?countc) as ?subTotal)
>>   where {
>>   ?inst <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/class/yago/PhysicalEntity100001930>.
>>   {select (count(?inst1) as ?countc) ?c ?countec
>>   where{
>>   ?inst1<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>  ?c.
>>   {select (count(?inst) as ?countec) ?c
>>   WHERE{
>>   ?inst <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/class/yago/PhysicalEntity100001930>;
>>   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>  ?c.
>>   filter (str(?c)!="http://dbpedia.org/class/yago/PhysicalEntity100001930")
>>   } GROUP BY ?c } }
>>   GROUP BY ?c ?countec }} group by ?countc ?countec }}
>>
>
> Formatted:
>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX  dbyago: <http://dbpedia.org/class/yago/>
>
> SELECT  (sum(?subTotal) AS ?sum)
> WHERE
> { { SELECT  ...
>     WHERE
>       { ?inst rdf:type dbyago:PhysicalEntity100001930
>         { SELECT  (count(?inst1) AS ?countc) ?c ?countec
>           WHERE
>             { ?inst1 rdf:type ?c
>               { SELECT  (count(?inst) AS ?countec) ?c
>                 WHERE
>                   { ?inst rdf:type dbyago:PhysicalEntity100001930 .
>                     ?inst rdf:type ?c
>                     FILTER ( ?c != dbyago:PhysicalEntity100001930 )
>                   }
>                 GROUP BY ?c
>               }
>             }
>           GROUP BY ?c ?countec
>         }
>       }
>     GROUP BY ?countc ?countec
>   }
> }
>
> ------------------------
>
> Looking at:
>
> 1:: You have a cross product (graph patterns without connections between
> them).
>
>   { { SELECT  ...
>       WHERE
>         { ?inst rdf:type dbyago:PhysicalEntity100001930
>           { SELECT  (count(?inst1) AS ?countc) ?c ?countec
>
> Here ?inst is used in the triple pattern but is not connected to the
> sub-SELECT/count.
>
> You'll get A x B results, where A is the number of matches for ?inst rdf:type
> dbyago:PhysicalEntity100001930 and B is the number of rows from the
> sub-SELECT (the number of groups).
>
> If the number of matches for
> ?inst rdf:type dbyago:PhysicalEntity100001930
>
> is large, that's a lot of work.
>
>
> That may be more an indication of a mistaken query structure, because ?inst
> isn't used anywhere else - it's a different variable from the inner ?inst.


?inst is only used in count(?inst), which is then renamed to ?countec.

>


> 2::
>    WHERE
>    { ?inst rdf:type dbyago:PhysicalEntity100001930 .
>      ?inst rdf:type ?c
>      FILTER ( ?c != dbyago:PhysicalEntity100001930 )
>    }
>
> looks likely to be very large, and so expensive, especially if
>
> number(?inst rdf:type dbyago:PhysicalEntity100001930) != 1
>
>
> 3:: No optimizer is perfect: each is tuned to expected usage, and which
> engine you use will affect how the query can be improved.  What is your setup?
> How much data is there?

I am using Virtuoso 6.1.8, and the dataset is DBpedia 3.9. In the dataset,
number(?inst rdf:type dbyago:PhysicalEntity100001930) is approx. 2
million, which is indeed very large.
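
With about 2 million instances, the cross product described in point 1 would pair every one of them with every row of the sub-SELECT. As a rough sketch only (not a verified equivalent of the original query), one way to avoid that is to compute the instance count once in its own sub-SELECT, so the join is 1 x B rows rather than A x B. The variable name ?total is hypothetical, and the sketch assumes the outer count(?inst) is only meant to be the total number of instances; it has not been tested on Virtuoso 6.1.8.

```sparql
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbyago: <http://dbpedia.org/class/yago/>

SELECT ?c ?countec ?total
WHERE {
  # Evaluated once: a single row holding the instance count
  # (?total is a hypothetical name, not in the original query).
  { SELECT (COUNT(?inst) AS ?total)
    WHERE { ?inst rdf:type dbyago:PhysicalEntity100001930 }
  }
  # One row per class ?c that co-occurs with the target class.
  { SELECT ?c (COUNT(?inst) AS ?countec)
    WHERE {
      ?inst rdf:type dbyago:PhysicalEntity100001930 ;
            rdf:type ?c .
      FILTER ( ?c != dbyago:PhysicalEntity100001930 )
    }
    GROUP BY ?c
  }
}
```

The two sub-SELECTs still share no variables, so this is still a cross product, but one side has exactly one row. The ?countc sub-SELECT and the ?subTotal expression would then need to be layered on top, with ?total used where count(?inst) appeared, if that matches the intended formula.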

>
>
>         Andy
>
