Andy,
I have simplified the scenario a little bit. So the following query
PREFIX p1:<t1>
PREFIX p2:<t2>
SELECT DISTINCT ?R1 ?optionalValue
WHERE
{
?R1 p1:pr1 <https://host:9443/jts/process/project-areas/p>
FILTER ( ?R1 = <https://host:9443/rm/resources/_r1>)
OPTIONAL
{ ?R1 p2:pr2 ?optionalValue }
}
leads to
(prefix ((p1: <file:///C:/Temp/t1>)
(p2: <file:///C:/Temp/t2>))
(distinct
(project (?R1 ?optionalValue)
(filter (= ?R1 <https://host:9443/rm/resources/_r1>)
(conditional
(bgp (triple ?R1 p1:pr1 <
https://host:9443/jts/process/project-areas/p>))
(bgp (triple ?R1 p2:pr2 ?optionalValue)))))))
whereas the following query
PREFIX p1:<t1>
PREFIX p2:<t2>
SELECT DISTINCT ?R1 ?optionalValue
WHERE
{
{
?R1 p1:pr1 <https://host:9443/jts/process/project-areas/p>
FILTER ( ?R1 = <https://host:9443/rm/resources/_r1>)
}
OPTIONAL
{ ?R1 p2:pr2 ?optionalValue }
}
leads to
(prefix ((p1: <file:///C:/Temp/t1>)
(p2: <file:///C:/Temp/t2>))
(distinct
(project (?R1 ?optionalValue)
(conditional
(assign ((?R1 <https://host:9443/rm/resources/_r1>))
(bgp (triple <https:/host:9443/rm/resources/_r1> p1:pr1 <
https://perfjts.ibm.com:9443/jts/process/project-areas/p>)))
(bgp (triple ?R1 p2:pr2 ?optionalValue))))))
So, looking at this in more detail, the former plan surprises me a bit
because it proposes to evaluate p1:pr1 as optional, even though I didn't
express that in the query. To answer the rest of the question, the intend
of the query is to bind ?R1 to a known fixed resource uri and also
retrieve any optional predicates with that resource if present (in the
example p2:pr2). The reason to use the filter expression is that in
general, there may be a number of resources, so the following kind of
query
PREFIX p1:<t1>
PREFIX p2:<t2>
SELECT DISTINCT ?R1 ?optionalValue
WHERE
{
?R1 p1:pr1 <https://perfjts.ibm.com:9443/jts/process/project-areas/p>
OPTIONAL { ?R1 p2:pr2 ?optionalValue }
FILTER ( ?R1 = <https://perfrrs.ibm.com:9443/rm/resources/_r1> || ?R1 = <
https://perfrrs.ibm.com:9443/rm/resources/_r2>)
}
would be a typical variation
Simon
From:
Andy Seaborne <[email protected]>
To:
[email protected]
Date:
07/18/2012 10:47 AM
Subject:
Re: optimization opportunity lost in 2.7.x which existed in 2.6.x?
Simon,
The work for this optimization is done in TransformFilterEquality.
Is there a simpler (= shorter) query that exhibits this behaviour? Does
it depend on the number of OPTIONALS?
Aside fro the report, is that structure intended with a top-level BGP at
the end of the query, a nested one after the FILTER and a BGp at the
start - or was a single BGP meant?
Andy
On 16/07/12 23:27, Simon Helsen wrote:
> Hi everyone,
>
> some of our clients were reporting a rather severe performance breakdown
> on top of 2.7.x. After further investigation, it turns out that they had
> queries whose optimized plan was suddenly very weak for the given
> repository layout and shape. Strangely enough, putting some curlies
around
> the right triple patterns was sufficient to push the optimizer to do the
> correct optimization. Below is the original query and its plan and the
> adjusted query and its plan.
>
> I have 2 questions:
>
> 1) it seems this behavior changed against 2.6.x. Is this a known issue,
> e.g. a change which was required to avert a bug?
> 2) it is not clear to me why the optimizer needs the curlies in order to
> do the right thing. I.e. why it cannot achieve the same in the original
> query
>
> thanks
>
> Simon
>
> Original query:
>
> PREFIX t1:<t1>
> PREFIX t2:<t2>
> PREFIX t3:<t3>
> PREFIX t4:<t4>
> PREFIX t5:<t5>
> SELECT DISTINCT ?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4
?R1_v5
> ?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10
> WHERE
> { ?R1 t1:p0 <https://host/jts/process/project-areas/p1>
> FILTER ( ?R1 = <https://host/rm/resources/r1> )
> { ?R1 t1:p1 ?R1_v6 }
> OPTIONAL
> { ?R1 t2:p2 ?R1_v9 }
> OPTIONAL
> { ?R1 t2:p3 ?R1_v7 }
> OPTIONAL
> { ?R1 t2:p4 ?R1_v10 }
> OPTIONAL
> { ?R1 t2:p5 ?R1_v8 }
> OPTIONAL
> { ?R1 t3:p6 ?R1_v1 .
> ?R1_v1 t2:p5 ?R1_uv2
> }
> OPTIONAL
> { ?R1 t3:p7 ?R1_v2 }
> OPTIONAL
> { ?R1 t3:p8 ?R1_v5 }
> OPTIONAL
> { ?R1 t3:p9 ?R1_v4 }
> OPTIONAL
> { ?R1 t4:p10 ?R1_v3 .
> ?R1_v3 t5:p11 ?R1_uv1
> }
> ?R1 t5:p12 t3:Artifact .
> ?R1 t1:p0 ?R1_resourceContext
> }
>
> Original plan:
>
> (prefix ((t4: <file:///C:/Temp/t4>)
> (t5: <file:///C:/Temp/t5>)
> (t1: <file:///C:/Temp/t1>)
> (t2: <file:///C:/Temp/t2>)
> (t3: <file:///C:/Temp/t3>))
> (distinct
> (project (?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4
?R1_v5
> ?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10)
> (filter (= ?R1 <https://host/rm/resources/r1>)
> (sequence
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (bgp
> (triple ?R1 t1:p0 <
> https://host/jts/process/project-areas/p1>)
> (triple ?R1 t1:p1 ?R1_v6)
> )
> (bgp (triple ?R1 t2:p2 ?R1_v9)))
> (bgp (triple ?R1 t2:p3 ?R1_v7)))
> (bgp (triple ?R1 t2:p4 ?R1_v10)))
> (bgp (triple ?R1 t2:p5 ?R1_v8)))
> (bgp
> (triple ?R1 t3:p6 ?R1_v1)
> (triple ?R1_v1 t2:p5 ?R1_uv2)
> ))
> (bgp (triple ?R1 t3:p7 ?R1_v2)))
> (bgp (triple ?R1 t3:p8 ?R1_v5)))
> (bgp (triple ?R1 t3:p9 ?R1_v4)))
> (bgp
> (triple ?R1 t4:p10 ?R1_v3)
> (triple ?R1_v3 t5:p11 ?R1_uv1)
> ))
> (bgp
> (triple ?R1 t5:p12 t3:Artifact)
> (triple ?R1 t1:p0 ?R1_resourceContext)
> ))))))
>
> Adjusted query (the first 2 constraints and its filter have been
> surrounded by curlies):
>
> PREFIX t1:<t1>
> PREFIX t2:<t2>
> PREFIX t3:<t3>
> PREFIX t4:<t4>
> PREFIX t5:<t5>
> SELECT DISTINCT ?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4
?R1_v5
> ?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10
> WHERE
> {
> {
> ?R1 t1:p0 <https://host/jts/process/project-areas/p1>
> FILTER ( ?R1 = <https://host/rm/resources/r1> )
> { ?R1 t1:p1 ?R1_v6 }
> }
> OPTIONAL
> { ?R1 t2:p2 ?R1_v9 }
> OPTIONAL
> { ?R1 t2:p3 ?R1_v7 }
> OPTIONAL
> { ?R1 t2:p4 ?R1_v10 }
> OPTIONAL
> { ?R1 t2:p5 ?R1_v8 }
> OPTIONAL
> { ?R1 t3:p6 ?R1_v1 .
> ?R1_v1 t2:p5 ?R1_uv2
> }
> OPTIONAL
> { ?R1 t3:p7 ?R1_v2 }
> OPTIONAL
> { ?R1 t3:p8 ?R1_v5 }
> OPTIONAL
> { ?R1 t3:p9 ?R1_v4 }
> OPTIONAL
> { ?R1 t4:p10 ?R1_v3 .
> ?R1_v3 t5:p11 ?R1_uv1
> }
> ?R1 t5:p12 t3:Artifact .
> ?R1 t1:p0 ?R1_resourceContext
> }
>
> Adjusted plan:
>
> (prefix ((t4: <file:///C:/Temp/t4>)
> (t5: <file:///C:/Temp/t5>)
> (t1: <file:///C:/Temp/t1>)
> (t2: <file:///C:/Temp/t2>)
> (t3: <file:///C:/Temp/t3>))
> (distinct
> (project (?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4
?R1_v5
> ?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10)
> (sequence
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (conditional
> (assign ((?R1 <https://host/rm/resources/r1
>))
> (bgp
> (triple <https://host/rm/resources/r1>
t1:p0
> <https://host/jts/process/project-areas/p1>)
> (triple <https://host/rm/resources/r1>
t1:p1
> ?R1_v6)
> ))
> (bgp (triple ?R1 t2:p2 ?R1_v9)))
> (bgp (triple ?R1 t2:p3 ?R1_v7)))
> (bgp (triple ?R1 t2:p4 ?R1_v10)))
> (bgp (triple ?R1 t2:p5 ?R1_v8)))
> (bgp
> (triple ?R1 t3:p6 ?R1_v1)
> (triple ?R1_v1 t2:p5 ?R1_uv2)
> ))
> (bgp (triple ?R1 t3:p7 ?R1_v2)))
> (bgp (triple ?R1 t3:p8 ?R1_v5)))
> (bgp (triple ?R1 t3:p9 ?R1_v4)))
> (bgp
> (triple ?R1 t4:p10 ?R1_v3)
> (triple ?R1_v3 t5:p11 ?R1_uv1)
> ))
> (bgp
> (triple ?R1 t5:p12 t3:Artifact)
> (triple ?R1 t1:p0 ?R1_resourceContext)
> )))))
>
> Note the assign ((?R1 <https://host/rm/resources/r1>), which makes the
> query scalable on a large repository
>