[
https://issues.apache.org/jira/browse/JENA-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rob Vesse updated JENA-780:
---------------------------
Description:
This RFE is a follow-on from JENA-779. The query with a sub-optimal plan there
uses a {{BIND}} to create a value which is then used only once in a subsequent
filter.
Actually that query uses the value twice, but I think the general approach I am
describing in this RFE bears consideration. When the bound value is used only
once, it seems it would be possible to substitute the extend expression for the
bound variable in the filter expression.
Simplified variant of the original query, in which the bound value is used only
once:
{noformat}
SELECT DISTINCT ?uri
{
{ ?uri ?p ?o }
UNION
{
?sub ?p ?uri
FILTER(isIRI(?uri))
}
BIND(str(?uri) as ?s)
FILTER(STRSTARTS(?s, "http://"))
}
{noformat}
Rewritten query:
{noformat}
SELECT DISTINCT ?uri
{
{ ?uri ?p ?o }
UNION
{
?sub ?p ?uri
FILTER(isIRI(?uri))
}
FILTER(STRSTARTS(str(?uri), "http://"))
}
{noformat}
This avoids an extend expression whose value is used only once and would
ultimately be thrown away.
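As a rough sanity check of why the two forms select the same results, the following Python sketch evaluates both over a handful of made-up solutions. It is not Jena code: the sample URIs and the function names bind_then_filter and filter_inlined are assumptions for illustration only.

```python
# Hypothetical sample solutions for ?uri (illustrative data, not from JENA-779).
SOLUTIONS = [
    {"uri": "http://example.org/a"},
    {"uri": "ftp://example.org/b"},
    {"uri": "https://example.org/c"},
    {"uri": "http://example.org/a"},
]


def bind_then_filter(solutions):
    """BIND(str(?uri) AS ?s) followed by FILTER(STRSTARTS(?s, "http://"))."""
    kept = []
    for row in solutions:
        extended = dict(row, s=str(row["uri"]))  # extra binding per solution
        if extended["s"].startswith("http://"):
            kept.append(extended["uri"])  # ?s itself is never projected
    # SELECT DISTINCT ?uri; sorted only to make the comparison deterministic
    return sorted(set(kept))


def filter_inlined(solutions):
    """FILTER(STRSTARTS(str(?uri), "http://")) with no intermediate ?s."""
    return sorted({row["uri"] for row in solutions
                   if str(row["uri"]).startswith("http://")})


print(bind_then_filter(SOLUTIONS) == filter_inlined(SOLUTIONS))
```

The only difference between the two functions is the per-solution extended dictionary, which stands in for the extend step the rewrite would remove.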
From a {{Transform}} standpoint this is likely awkward to implement as a pure
transform, since it requires knowledge of the query structure above the
{{FILTER}}, i.e. whether the bound variable is used elsewhere. It would
therefore need before and after visitors to track that additional state, but I
think this is a feasible optimisation.
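To make the state-tracking concern concrete, here is a minimal Python sketch over a toy algebra. It is an assumption-laden illustration, not Jena's {{Transform}} API: the tuple encoding and the names expr_vars, count_uses, substitute and inline_extends are all hypothetical. Instead of before and after visitors it simply counts uses of the bound variable across the whole tree before rewriting, and it ignores corner cases such as a filter that tests BOUND on the variable.

```python
# Toy algebra (hypothetical encoding, not Jena's):
#   ("bgp", [(s, p, o), ...])
#   ("union", left, right)
#   ("extend", var, expr, sub)     BIND(expr AS var) over sub
#   ("filter", expr, sub)
#   ("project", [var, ...], sub)
# Expressions are nested tuples ("FUNC", arg, ...); strings starting
# with "?" are variables, anything else is a constant.


def expr_vars(expr):
    """Yield every variable occurrence in an expression."""
    if isinstance(expr, tuple):
        for arg in expr[1:]:
            yield from expr_vars(arg)
    elif isinstance(expr, str) and expr.startswith("?"):
        yield expr


def count_uses(op, var):
    """Count reads of var in the tree (the extend that assigns it is not a read)."""
    kind = op[0]
    if kind == "bgp":
        return sum(term == var for triple in op[1] for term in triple)
    if kind == "union":
        return count_uses(op[1], var) + count_uses(op[2], var)
    if kind == "extend":
        return sum(v == var for v in expr_vars(op[2])) + count_uses(op[3], var)
    if kind == "filter":
        return sum(v == var for v in expr_vars(op[1])) + count_uses(op[2], var)
    if kind == "project":
        return sum(v == var for v in op[1]) + count_uses(op[2], var)
    raise ValueError(kind)


def substitute(expr, var, replacement):
    """Replace occurrences of var in expr with the replacement expression."""
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(substitute(a, var, replacement) for a in expr[1:])
    return replacement if expr == var else expr


def inline_extends(op, root=None):
    """Rewrite filter(extend(...)) when the bound variable is read only in that filter."""
    root = op if root is None else root  # whole query, needed for usage counts
    kind = op[0]
    if kind == "bgp":
        return op
    if kind == "union":
        return ("union", inline_extends(op[1], root), inline_extends(op[2], root))
    if kind == "extend":
        return ("extend", op[1], op[2], inline_extends(op[3], root))
    if kind == "project":
        return ("project", op[1], inline_extends(op[2], root))
    if kind == "filter":
        expr, sub = op[1], op[2]
        if sub[0] == "extend":
            var, bound_expr, inner = sub[1], sub[2], sub[3]
            used_here = sum(v == var for v in expr_vars(expr))
            if used_here == 1 and count_uses(root, var) == 1:
                return ("filter", substitute(expr, var, bound_expr),
                        inline_extends(inner, root))
        return ("filter", expr, inline_extends(sub, root))
    raise ValueError(kind)


# The simplified query from this issue, in the toy encoding.
PATTERN = ("union",
           ("bgp", [("?uri", "?p", "?o")]),
           ("filter", ("isIRI", "?uri"), ("bgp", [("?sub", "?p", "?uri")])))
QUERY = ("project", ["?uri"],
         ("filter", ("STRSTARTS", "?s", "http://"),
          ("extend", "?s", ("str", "?uri"), PATTERN)))

print(inline_extends(QUERY))
```

The whole-tree count is what stands in for the knowledge about the query structure above the {{FILTER}}: if ?s were also projected or read by a later {{BIND}}, the count would exceed one and the extend would be left in place.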
was:
This RFE is a follow on from JENA-779, the query with a sub-optimal plan there
uses a {{BIND}} to create a value which is then only used once in a subsequent
filter. In this case it seems like it would be possible to substitute the
extend expression for the bound variable in the filter expression.
Original query:
{noformat}
SELECT DISTINCT ?domainName
{
{ ?uri ?p ?o }
UNION
{
?sub ?p ?uri
FILTER(isIRI(?uri))
}
BIND(str(?uri) as ?s)
FILTER(STRSTARTS(?s, "http://"))
BIND(IRI(CONCAT("http://", STRBEFORE(SUBSTR(?s,8), "/"))) AS ?domainName)
}
{noformat}
Rewritten query:
{noformat}
SELECT DISTINCT ?domainName
{
{ ?uri ?p ?o }
UNION
{
?sub ?p ?uri
FILTER(isIRI(?uri))
}
FILTER(STRSTARTS(str(?uri), "http://"))
BIND(IRI(CONCAT("http://", STRBEFORE(SUBSTR(?s,8), "/"))) AS ?domainName)
}
{noformat}
Which avoids one extend expression whose value will ultimately be thrown away.
From a {{Transform}} standpoint this is likely awkward to implement in a pure
transform since it requires knowledge about the query structure above the
{{FILTER}} and so would need to use before and after visitors to track that
additional state but I think this is a feasible optimisation.
> Extend expression whose value is used only in a filter can be substituted
> directly into the filter
> --------------------------------------------------------------------------------------------------
>
> Key: JENA-780
> URL: https://issues.apache.org/jira/browse/JENA-780
> Project: Apache Jena
> Issue Type: Improvement
> Components: ARQ, Optimizer
> Affects Versions: Jena 2.12.0
> Reporter: Rob Vesse
> Priority: Minor