On 01/08/12 16:50, Olivier Rossel wrote:
4/ You can use a subselect to restrict the remote query part:
SERVICE <...> {
SELECT * {
...
} LIMIT 300
}
I tried this query:
SELECT DISTINCT ?comment WHERE {
SERVICE
<http://api.talis.com/stores/bbc-backstage/services/sparql>
{ ?thCenturyClassicalComposers0
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://dbpedia.org/class/yago/20thCenturyClassicalComposers>
}
SERVICE <http://dbpedia.org/sparql> {SELECT
?thCenturyClassicalComposers0 ?comment WHERE {
?thCenturyClassicalComposers0
<http://www.w3.org/2000/01/rdf-schema#comment> ?comment } }
}
It returns results in a reasonable time.
Then I remove ?thCenturyClassicalComposers0 from the sub-SELECT:
SELECT DISTINCT ?comment WHERE {
SERVICE
<http://api.talis.com/stores/bbc-backstage/services/sparql>
{ ?thCenturyClassicalComposers0
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://dbpedia.org/class/yago/20thCenturyClassicalComposers>
}
SERVICE <http://dbpedia.org/sparql> {SELECT ?comment WHERE {
?thCenturyClassicalComposers0
<http://www.w3.org/2000/01/rdf-schema#comment> ?comment } }
}
This query now takes much, much longer and eventually fails with a 509
HttpException.
Any idea why the query plan goes so wrong when
?thCenturyClassicalComposers0 is absent from the sub-SELECT?
Because in the second query you are joining the intermediate results of
SERVICE 1:
?thCenturyClassicalComposers0
with
SERVICE 2:
?comment
i.e. an unconstrained join (a cross product, since the two sides share no
variable), which happens to be done inefficiently.
The inner SERVICE/2 ?thCenturyClassicalComposers0 is not the same as the
one in SERVICE/1 if you remove it from the sub-select.
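To make the scoping concrete, here is a sketch of what the second query
effectively means. The name ?hidden is made up for illustration (it is not
what ARQ prints); it stands for the inner variable that the sub-select does
not project:

```sparql
SELECT DISTINCT ?comment WHERE {
  SERVICE <http://api.talis.com/stores/bbc-backstage/services/sparql> {
    ?thCenturyClassicalComposers0
      <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
      <http://dbpedia.org/class/yago/20thCenturyClassicalComposers>
  }
  # ?hidden (hypothetical name) is not projected by the sub-select, so it
  # shares nothing with the pattern above: every ?comment row is paired
  # with every composer row.
  SERVICE <http://dbpedia.org/sparql> {
    SELECT ?comment WHERE {
      ?hidden <http://www.w3.org/2000/01/rdf-schema#comment> ?comment
    }
  }
}
```

Projecting ?thCenturyClassicalComposers0 from the sub-select, as in the
first query, puts the variable back in scope so the two SERVICE blocks join
on it.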
Try looking at it with
http://www.sparql.org/query-validator.html
and set "SPARQL algebra (general optimizations)" and you will see
?/thCenturyClassicalComposers0 (note the ?/), which is a
renamed-because-it's-hidden variable.
Any chance of readable queries? A few prefixes perhaps?
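For instance, the first query might be laid out like this. The prefix names
rdf:, rdfs: and yago: are arbitrary choices for this sketch; only the IRIs
come from the original query:

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX yago: <http://dbpedia.org/class/yago/>

SELECT DISTINCT ?comment WHERE {
  SERVICE <http://api.talis.com/stores/bbc-backstage/services/sparql> {
    ?thCenturyClassicalComposers0 rdf:type yago:20thCenturyClassicalComposers
  }
  SERVICE <http://dbpedia.org/sparql> {
    # The composer variable is projected, so it joins with the block above.
    SELECT ?thCenturyClassicalComposers0 ?comment WHERE {
      ?thCenturyClassicalComposers0 rdfs:comment ?comment
    }
  }
}
```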
Andy