SELECT vs CONSTRUCT will make little difference - a CONSTRUCT is a SELECT
with DISTINCT followed by making the RDF.
It's easier to work with SELECT * for analysis of the query pattern.
Having the optional first is a possible cause - your query asks for all of:
OPTIONAL {?xyz oboe-core:hasMeasurement ?object. }
but neither ?xyz nor ?object is used in the rest of the query.
That is an unbounded cross product of the data from the OPTIONAL with
that of the rest of the pattern. If you use the SELECT * form, you
should see a huge number of results. I suspect that your large query
has a similar effect, as well as the OPTIONAL being in a less than ideal
place. A pattern involving ?x ?p ?z is going to be slow unless the
optimizer can ground one of the terms, which it can sometimes do when there is a
FILTER(?p = <uri1> || ?p = <uri2>)
but it can't do that if the FILTER is outside the OPTIONAL and the
pattern inside.
{ ?x ?p ?z . FILTER(?p = <uri1> || ?p = <uri2>) }
is converted to what is effectively
{ ?x <uri1> ?z .
BIND(<uri1> AS ?p) }
UNION
{ ?x <uri2> ?z .
BIND(<uri2> AS ?p) }
Usually you want OPTIONALs at the end of the query.
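As a rough sketch of that advice, here is a cut-down version of the reduced query quoted further down, in SELECT * form. The required patterns come first and the OPTIONAL comes last. Moving it is not enough on its own: it also has to share a variable with the rest of the pattern, otherwise it is still a cross product. Joining it on ?tempObservation is only a guess at the intent, and the oboe-core: and testing: namespace URIs are placeholders, not the real ones.

```sparql
PREFIX rdf:       <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
# Placeholder namespaces (assumptions): substitute the real URIs.
PREFIX oboe-core: <http://example.org/oboe-core#>
PREFIX testing:   <http://example.org/testing#>

SELECT *
WHERE {
  # Required patterns first: these narrow the solutions.
  ?tempObservation oboe-core:ofEntity ?temp .
  ?temp rdf:type testing:Fir .

  # OPTIONAL last, sharing ?tempObservation with the patterns above
  # (assumed intent; the original used the unconnected ?xyz).
  OPTIONAL { ?tempObservation oboe-core:hasMeasurement ?object . }
}
```

With the shared variable the OPTIONAL only extends rows that already match, so the result count should stay close to the number of matching observations.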
CONSTRUCT {?tempObservation0 oboe-core:ofEntity ?temp0 }
neither ?tempObservation0 nor ?temp0 appears in the pattern, so they will not
be bound: the result is going to be the empty model, calculated very slowly.
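A minimal sketch of a template that can produce triples, assuming ?tempObservation0 and ?temp0 were meant to be the ?tempObservation and ?temp of the pattern (that is a guess). The oboe-core: and testing: namespace URIs are placeholders, not the real ones.

```sparql
PREFIX rdf:       <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
# Placeholder namespaces (assumptions): substitute the real URIs.
PREFIX oboe-core: <http://example.org/oboe-core#>
PREFIX testing:   <http://example.org/testing#>

# Every variable in the template is also bound by the WHERE pattern.
CONSTRUCT { ?tempObservation oboe-core:ofEntity ?temp }
WHERE {
  ?tempObservation oboe-core:ofEntity ?temp .
  ?temp rdf:type testing:Fir .
}
```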
Andy
This query is still not complete - no namespaces. One of the first
things I'm likely to do is feed it into various tools such as
http://www.sparql.org/query-validator.html
which needs a complete query, or the command line arq.qparse (use
--print=opt to see the optimized algebra). Complete, minimal examples are
appreciated.
On 10/07/12 23:16, Jewell, Paul wrote:
Sure thing.
1.)
I reduced the queries down so that they should (hopefully) be a bit more
readable.
The query with the OPTIONAL, which is sluggish, is as follows:
CONSTRUCT {?tempObservation0 oboe-core:ofEntity ?temp0 }
WHERE{
OPTIONAL {?xyz oboe-core:hasMeasurement ?object. }
?tempObservation oboe-core:ofEntity ?temp .
?temp rdf:type testing:Fir .
?tempObservation oboe-core:hasMeasurement ?tempMeasurement .
?tempMeasurement oboe-core:ofCharacteristic ?tempPlaceholder0 .
?tempPlaceholder0 rdf:type testing:Height .
?tempMeasurement oboe-core:hasValue ?tempPlaceholder1 .
?tempPlaceholder1 oboe-core:hasCode ?hCode .
?tempMeasurement oboe-core:usesStandard ?tempStandard .
?tempStandard rdf:type ?tempStandardType .
}
This query takes a very long time, but by simply removing the
OPTIONAL clause, it completes in less than a second. Also,
inside a SELECT query, the same WHERE statement, with the optional,
is nearly instant as well.
2.) The database is just over 100 MB, containing about 250,000 triples.
3.) Also, the FILTERS were a hacked-together workaround to avoid the
use of OPTIONALS, but the query above presents the same problem.
Paul,
A few questions:
1/ Do you have some readable versions of those queries?
2/ What size is the data?
3/ Why is it written with lots of FILTERs? Grounded patterns?
The presence of the OPTIONAL stops FILTER optimization - whether that's
because the optimizer is too dumb to know it can optimize through the
leftjoin (OPTIONAL) or it's the structure of the query, I can't tell.
FILTER (?predicate3 = oboe-core:hasCode || ?predicate3 = rdf:type)
?object2 ?predicate3 ?object3 .
OPTIONAL{ ?object3 rdf:type ?object4.
?object3 ?predicate3 ?object4 .}
Restricting on ?predicate3 outside and then
There look to be places where property paths might help.
Treating oboe-core:hasCode and rdf:type as equivalent seems odd modelling.
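As a sketch of the property path idea, a SPARQL 1.1 alternative path can stand in for the variable predicate plus FILTER. This assumes the binding of ?predicate3 itself is not needed afterwards, since a path does not bind a predicate variable. The oboe-core: namespace URI is a placeholder.

```sparql
PREFIX rdf:       <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
# Placeholder namespace (assumption): substitute the real URI.
PREFIX oboe-core: <http://example.org/oboe-core#>

SELECT *
WHERE {
  # Matches either predicate; both are grounded, so no FILTER is needed.
  ?object2 (oboe-core:hasCode|rdf:type) ?object3 .
}
```

If the query does need to know which predicate matched, the FILTER or an explicit UNION with BIND is still required.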
Andy