Hi everyone,
some of our clients were reporting a rather severe performance breakdown
on top of 2.7.x. After further investigation, it turns out that they had
queries whose optimized plan was suddenly very weak for the given
repository layout and shape. Strangely enough, putting some curlies around
the right triple patterns was sufficient to push the optimizer to do the
correct optimization. Below is the original query and its plan and the
adjusted query and its plan.
I have 2 questions:
1) it seems this behavior changed against 2.6.x. Is this a known issue,
e.g. a change which was required to avert a bug?
2) it is not clear to me why the optimizer needs the curlies in order to
do the right thing. I.e. why it cannot achieve the same in the original
query
thanks
Simon
Original query:
PREFIX t1:<t1>
PREFIX t2:<t2>
PREFIX t3:<t3>
PREFIX t4:<t4>
PREFIX t5:<t5>
SELECT DISTINCT ?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4 ?R1_v5
?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10
WHERE
{ ?R1 t1:p0 <https://host/jts/process/project-areas/p1>
FILTER ( ?R1 = <https://host/rm/resources/r1> )
{ ?R1 t1:p1 ?R1_v6 }
OPTIONAL
{ ?R1 t2:p2 ?R1_v9 }
OPTIONAL
{ ?R1 t2:p3 ?R1_v7 }
OPTIONAL
{ ?R1 t2:p4 ?R1_v10 }
OPTIONAL
{ ?R1 t2:p5 ?R1_v8 }
OPTIONAL
{ ?R1 t3:p6 ?R1_v1 .
?R1_v1 t2:p5 ?R1_uv2
}
OPTIONAL
{ ?R1 t3:p7 ?R1_v2 }
OPTIONAL
{ ?R1 t3:p8 ?R1_v5 }
OPTIONAL
{ ?R1 t3:p9 ?R1_v4 }
OPTIONAL
{ ?R1 t4:p10 ?R1_v3 .
?R1_v3 t5:p11 ?R1_uv1
}
?R1 t5:p12 t3:Artifact .
?R1 t1:p0 ?R1_resourceContext
}
Original plan:
(prefix ((t4: <file:///C:/Temp/t4>)
(t5: <file:///C:/Temp/t5>)
(t1: <file:///C:/Temp/t1>)
(t2: <file:///C:/Temp/t2>)
(t3: <file:///C:/Temp/t3>))
(distinct
(project (?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4 ?R1_v5
?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10)
(filter (= ?R1 <https://host/rm/resources/r1>)
(sequence
(conditional
(conditional
(conditional
(conditional
(conditional
(conditional
(conditional
(conditional
(conditional
(bgp
(triple ?R1 t1:p0 <
https://host/jts/process/project-areas/p1>)
(triple ?R1 t1:p1 ?R1_v6)
)
(bgp (triple ?R1 t2:p2 ?R1_v9)))
(bgp (triple ?R1 t2:p3 ?R1_v7)))
(bgp (triple ?R1 t2:p4 ?R1_v10)))
(bgp (triple ?R1 t2:p5 ?R1_v8)))
(bgp
(triple ?R1 t3:p6 ?R1_v1)
(triple ?R1_v1 t2:p5 ?R1_uv2)
))
(bgp (triple ?R1 t3:p7 ?R1_v2)))
(bgp (triple ?R1 t3:p8 ?R1_v5)))
(bgp (triple ?R1 t3:p9 ?R1_v4)))
(bgp
(triple ?R1 t4:p10 ?R1_v3)
(triple ?R1_v3 t5:p11 ?R1_uv1)
))
(bgp
(triple ?R1 t5:p12 t3:Artifact)
(triple ?R1 t1:p0 ?R1_resourceContext)
))))))
Adjusted query (the first 2 constraints and its filter have been
surrounded by curlies):
PREFIX t1:<t1>
PREFIX t2:<t2>
PREFIX t3:<t3>
PREFIX t4:<t4>
PREFIX t5:<t5>
SELECT DISTINCT ?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4 ?R1_v5
?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10
WHERE
{
{
?R1 t1:p0 <https://host/jts/process/project-areas/p1>
FILTER ( ?R1 = <https://host/rm/resources/r1> )
{ ?R1 t1:p1 ?R1_v6 }
}
OPTIONAL
{ ?R1 t2:p2 ?R1_v9 }
OPTIONAL
{ ?R1 t2:p3 ?R1_v7 }
OPTIONAL
{ ?R1 t2:p4 ?R1_v10 }
OPTIONAL
{ ?R1 t2:p5 ?R1_v8 }
OPTIONAL
{ ?R1 t3:p6 ?R1_v1 .
?R1_v1 t2:p5 ?R1_uv2
}
OPTIONAL
{ ?R1 t3:p7 ?R1_v2 }
OPTIONAL
{ ?R1 t3:p8 ?R1_v5 }
OPTIONAL
{ ?R1 t3:p9 ?R1_v4 }
OPTIONAL
{ ?R1 t4:p10 ?R1_v3 .
?R1_v3 t5:p11 ?R1_uv1
}
?R1 t5:p12 t3:Artifact .
?R1 t1:p0 ?R1_resourceContext
}
Adjusted plan:
(prefix ((t4: <file:///C:/Temp/t4>)
(t5: <file:///C:/Temp/t5>)
(t1: <file:///C:/Temp/t1>)
(t2: <file:///C:/Temp/t2>)
(t3: <file:///C:/Temp/t3>))
(distinct
(project (?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4 ?R1_v5
?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10)
(sequence
(conditional
(conditional
(conditional
(conditional
(conditional
(conditional
(conditional
(conditional
(conditional
(assign ((?R1 <https://host/rm/resources/r1>))
(bgp
(triple <https://host/rm/resources/r1> t1:p0
<https://host/jts/process/project-areas/p1>)
(triple <https://host/rm/resources/r1> t1:p1
?R1_v6)
))
(bgp (triple ?R1 t2:p2 ?R1_v9)))
(bgp (triple ?R1 t2:p3 ?R1_v7)))
(bgp (triple ?R1 t2:p4 ?R1_v10)))
(bgp (triple ?R1 t2:p5 ?R1_v8)))
(bgp
(triple ?R1 t3:p6 ?R1_v1)
(triple ?R1_v1 t2:p5 ?R1_uv2)
))
(bgp (triple ?R1 t3:p7 ?R1_v2)))
(bgp (triple ?R1 t3:p8 ?R1_v5)))
(bgp (triple ?R1 t3:p9 ?R1_v4)))
(bgp
(triple ?R1 t4:p10 ?R1_v3)
(triple ?R1_v3 t5:p11 ?R1_uv1)
))
(bgp
(triple ?R1 t5:p12 t3:Artifact)
(triple ?R1 t1:p0 ?R1_resourceContext)
)))))
Note the assign ((?R1 <https://host/rm/resources/r1>), which makes the
query scalable on a large repository