On 01/08/12 16:50, Olivier Rossel wrote:
4/ You can use a subselect to restrict the remote query part:


SERVICE <...> {
    SELECT * {
    ...
    } LIMIT 300
}

I tried this query:
SELECT DISTINCT ?comment WHERE {
  SERVICE <http://api.talis.com/stores/bbc-backstage/services/sparql> {
    ?thCenturyClassicalComposers0
        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
        <http://dbpedia.org/class/yago/20thCenturyClassicalComposers>
  }
  SERVICE <http://dbpedia.org/sparql> {
    SELECT ?thCenturyClassicalComposers0 ?comment WHERE {
      ?thCenturyClassicalComposers0
          <http://www.w3.org/2000/01/rdf-schema#comment> ?comment
    }
  }
}

It returns results in a reasonable time.

Then I remove ?thCenturyClassicalComposers0 from the sub-SELECT:


SELECT DISTINCT ?comment WHERE {
  SERVICE <http://api.talis.com/stores/bbc-backstage/services/sparql> {
    ?thCenturyClassicalComposers0
        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
        <http://dbpedia.org/class/yago/20thCenturyClassicalComposers>
  }
  SERVICE <http://dbpedia.org/sparql> {
    SELECT ?comment WHERE {
      ?thCenturyClassicalComposers0
          <http://www.w3.org/2000/01/rdf-schema#comment> ?comment
    }
  }
}

This query now takes MUCH longer and eventually fails with a 509
HttpException.

Any idea why the query plan goes so wrong when
?thCenturyClassicalComposers0 is absent from the sub-SELECT?


Because in the second query you are joining the intermediate results of

SERVICE 1:
?thCenturyClassicalComposers0

with

SERVICE 2:
?comment

i.e. an unconstrained join (a cross product), which happens to be done inefficiently.

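If you remove ?thCenturyClassicalComposers0 from the sub-select's projection, the variable inside SERVICE 2 is no longer the same variable as the one in SERVICE 1. It is local to the sub-select, so the two parts have no variable in common to join on.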
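
To make the difference concrete, here is a toy sketch in Python of the two joins. It is only an illustration of the semantics, not how any real engine evaluates the query: the data and the names join, project, service1 and inner are all made up, and ?composer stands in for ?thCenturyClassicalComposers0.

```python
# Toy model of SPARQL solution joins. All names and data are hypothetical.

def join(left, right):
    """Join two lists of solution mappings (dicts): keep the merged pairs
    that agree on every variable they share."""
    out = []
    for l in left:
        for r in right:
            if all(l[v] == r[v] for v in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

def project(solutions, variables):
    """Sub-SELECT projection: variables not listed are hidden from outside."""
    return [{v: s[v] for v in variables if v in s} for s in solutions]

# SERVICE 1: the composers (made-up data).
service1 = [{"composer": "A"}, {"composer": "B"}]

# Inner pattern of SERVICE 2: every comment in the remote store (made-up data).
inner = [
    {"composer": "A", "comment": "about A"},
    {"composer": "B", "comment": "about B"},
    {"composer": "X", "comment": "about X"},
    {"composer": "Y", "comment": "about Y"},
]

# First query: ?composer is projected, so it constrains the join.
with_var = join(service1, project(inner, ["composer", "comment"]))

# Second query: only ?comment is projected; no shared variable, so every
# pair is compatible and the result is a cross product.
without_var = join(service1, project(inner, ["comment"]))

print(len(with_var))     # 2
print(len(without_var))  # 8
```

In this model the second form pairs every composer with every comment in the store. After DISTINCT it would return all comments, not just those of the composers, which is probably not what was intended.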

Try looking at it with

http://www.sparql.org/query-validator.html

and select "SPARQL algebra (general optimizations)"; you will see
?/thCenturyClassicalComposers0 (note the ?/), which is a variable that has been renamed because it is hidden.

Any chance of readable queries?  A few prefixes perhaps?

        Andy
