Thanks, Andy. Please see the inline comments.

Best wishes,
June

2015-02-01 22:08 GMT+08:00 Andy Seaborne <[email protected]>:

> On 31/01/15 02:02, 朱曼 wrote:
>
>> select (sum(?subTotal) as ?sum) where {
>> {select ((2*count(?inst)*?countec/?countc -
>> ?countec/?countc*?countec/?countc) as ?subTotal)
>> where {
>> ?inst <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/class/yago/PhysicalEntity100001930>.
>> {select (count(?inst1) as ?countc) ?c ?countec
>> where{
>> ?inst1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?c.
>> {select (count(?inst) as ?countec) ?c
>> WHERE{
>> ?inst <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/class/yago/PhysicalEntity100001930>;
>> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?c.
>> filter (str(?c)!="http://dbpedia.org/class/yago/PhysicalEntity100001930")
>> } GROUP BY ?c } }
>> GROUP BY ?c ?countec }} group by ?countc ?countec }}
>>
>
> Formatted:
>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX dbyago: <http://dbpedia.org/class/yago/>
>
> SELECT (sum(?subTotal) AS ?sum)
> WHERE
>   { { SELECT ...
>       WHERE
>         { ?inst rdf:type dbyago:PhysicalEntity100001930
>           { SELECT (count(?inst1) AS ?countc) ?c ?countec
>             WHERE
>               { ?inst1 rdf:type ?c
>                 { SELECT (count(?inst) AS ?countec) ?c
>                   WHERE
>                     { ?inst rdf:type dbyago:PhysicalEntity100001930 .
>                       ?inst rdf:type ?c
>                       FILTER ( ?c != dbyago:PhysicalEntity100001930 )
>                     }
>                   GROUP BY ?c
>                 }
>               }
>             GROUP BY ?c ?countec
>           }
>         }
>       GROUP BY ?countc ?countec
>     }
>   }
>
> ------------------------
>
> Looking at:
>
> 1:: You have a cross product (graph patterns without connections between
> them).
>
>   { { SELECT ...
>       WHERE
>         { ?inst rdf:type dbyago:PhysicalEntity100001930
>           { SELECT (count(?inst1) AS ?countc) ?c ?countec
>
> then ?inst is used in the triple pattern and is not connected to
> SELECT/count.
>
> You'll get A x B result where A is number of ?inst rdf:type
> dbyago:PhysicalEntity100001930 and B number from the SELECT (number of
> groups).
>
> if
>   ?inst rdf:type dbyago:PhysicalEntity100001930
>
> is large, that's a lot of work
>
>
> That may be more an indication of mistaken query structure because ?inst
> isn't used anywhere else - it's a different variable to the inner ?inst.

?inst is only used in count(?inst), which is then renamed to ?countec.

>
> 2::
>     WHERE
>       { ?inst rdf:type dbyago:PhysicalEntity100001930 .
>         ?inst rdf:type ?c
>         FILTER ( ?c != dbyago:PhysicalEntity100001930 )
>       }
>
> looks likely very large = expensive especially if
>
> number(?inst rdf:type dbyago:PhysicalEntity100001930) != 1
>
>
> 3:: No optimizer is perfect : they are tuned to expected usage and which
> engine it is will affect ways to improve the query. What is your setup?
> How much data is there?

I am using Virtuoso 6.1.8, and the dataset is DBpedia 3.9. In the dataset, number(?inst rdf:type dbyago:PhysicalEntity100001930) is approx. 2 million, which is indeed very large.

>
>
> Andy
>
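
For reference, a small sketch of the cross-product effect described in point 1. It is written in Python with made-up toy data, not the DBpedia dataset, and the `join` helper is a hypothetical stand-in for how a SPARQL engine combines two graph patterns; it is not how Virtuoso actually evaluates the query. When the two sides share no variable, every row on one side pairs with every row on the other, so the result has A x B rows.

```python
from itertools import product


def join(left, right):
    """Join two lists of solution mappings (dicts) on their shared variables."""
    out = []
    for a, b in product(left, right):
        shared = a.keys() & b.keys()
        # Rows are compatible when they agree on every shared variable.
        # With no shared variables, every pair is compatible (cross product).
        if all(a[v] == b[v] for v in shared):
            out.append({**a, **b})
    return out


# Toy stand-ins (hypothetical values): 4 matches for the outer ?inst pattern
# and 3 groups produced by the inner sub-select.
instances = [{"inst": f"e{i}"} for i in range(4)]
groups = [{"c": f"C{j}", "countc": 10 + j, "countec": j + 1} for j in range(3)]

rows = join(instances, groups)
print(len(rows))  # 12, i.e. 4 x 3
```

With roughly 2 million matches on the ?inst side, the same multiplication applies to however many groups the sub-select returns.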
