[
https://issues.apache.org/jira/browse/JENA-780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129838#comment-14129838
]
Rob Vesse commented on JENA-780:
--------------------------------
The performance impact for ARQ is likely negligible because of the streaming
nature of the engine.
However, for block-based engines (like Urika), where the expression is
calculated in parallel and stored in bulk with each row before proceeding,
leaving the extend in place can incur a large memory cost for values that are
ultimately thrown away. This is probably not something that would ever be on by
default in the ARQ standard optimiser, but it could be a useful transform for
users to apply in their own optimisers.
> Single use extend expressions could be substituted directly for their later
> usage
> ---------------------------------------------------------------------------------
>
> Key: JENA-780
> URL: https://issues.apache.org/jira/browse/JENA-780
> Project: Apache Jena
> Issue Type: Improvement
> Components: ARQ, Optimizer
> Affects Versions: Jena 2.12.0
> Reporter: Rob Vesse
> Priority: Minor
>
> This RFE is a follow-on from JENA-779. The query with a sub-optimal plan
> there uses a {{BIND}} to create a value which is then used in a subsequent
> filter.
> Strictly, that query uses the value twice, but I think the general approach
> I am trying to describe in this RFE bears consideration. When the value is
> used only once, it seems like it would be possible to substitute the extend
> expression for the bound variable in the filter expression.
> A simplified variant of the original query, in which the bound value is only
> used once:
> {noformat}
> SELECT DISTINCT ?uri
> {
>   { ?uri ?p ?o }
>   UNION
>   {
>     ?sub ?p ?uri
>     FILTER(isIRI(?uri))
>   }
>   BIND(str(?uri) as ?s)
>   FILTER(STRSTARTS(?s, "http://"))
> }
> {noformat}
> Rewritten query:
> {noformat}
> SELECT DISTINCT ?uri
> {
>   { ?uri ?p ?o }
>   UNION
>   {
>     ?sub ?p ?uri
>     FILTER(isIRI(?uri))
>   }
>   FILTER(STRSTARTS(str(?uri), "http://"))
> }
> {noformat}
> This avoids an extend expression whose value is only used once and will
> ultimately be thrown away.
> From a {{Transform}} standpoint this is likely awkward to implement as a
> pure transform, since it requires knowledge about the query structure above
> the {{FILTER}}, i.e. whether the bound variable is used elsewhere. It would
> need before and after visitors to track that additional state, but I think
> this is a feasible optimisation.
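
The bookkeeping described in the issue can be illustrated with a small,
self-contained sketch. It is written in Python against a toy algebra, not the
Jena ARQ API: the classes Leaf, Extend, Filter and Project, the tuple encoding
of expressions, and the function names are all hypothetical and exist only for
this example. A first pass counts every use of the bound variable across the
whole tree (the role the before/after visitors would play), and the rewrite
only fires for a filter sitting directly on an extend whose variable is used
exactly once. Scoping rules and the rest of the algebra are ignored here.

```python
from dataclasses import dataclass

# Toy algebra for illustration only (hypothetical, NOT Jena ARQ classes).
# Expressions are nested tuples: ("var", name), ("lit", value),
# or ("call", function_name, arg1, arg2, ...).

@dataclass(frozen=True)
class Leaf:            # any pattern we do not look inside
    label: str
    vars: tuple        # variables the pattern mentions

@dataclass(frozen=True)
class Extend:          # BIND(expr AS ?var) applied over sub
    var: str
    expr: tuple
    sub: object

@dataclass(frozen=True)
class Filter:          # FILTER(expr) applied over sub
    expr: tuple
    sub: object

@dataclass(frozen=True)
class Project:         # SELECT vars
    vars: tuple
    sub: object

def count_in_expr(expr, var):
    """Count occurrences of ?var inside one expression."""
    if expr[0] == "var":
        return 1 if expr[1] == var else 0
    if expr[0] == "call":
        return sum(count_in_expr(arg, var) for arg in expr[2:])
    return 0  # literal

def count_uses(op, var):
    """Count uses of ?var in the whole tree (its definition is not a use)."""
    if isinstance(op, Leaf):
        return 1 if var in op.vars else 0
    if isinstance(op, Project):
        return (1 if var in op.vars else 0) + count_uses(op.sub, var)
    if isinstance(op, (Filter, Extend)):
        return count_in_expr(op.expr, var) + count_uses(op.sub, var)
    raise TypeError(op)

def substitute(expr, var, replacement):
    """Replace every ?var in expr with the replacement expression."""
    if expr[0] == "var":
        return replacement if expr[1] == var else expr
    if expr[0] == "call":
        return expr[:2] + tuple(substitute(a, var, replacement) for a in expr[2:])
    return expr

def inline_single_use(op, root=None):
    """Fold Filter-over-Extend when the bound variable is used only there."""
    root = op if root is None else root
    if isinstance(op, Leaf):
        return op
    if isinstance(op, Project):
        return Project(op.vars, inline_single_use(op.sub, root))
    if isinstance(op, Extend):
        return Extend(op.var, op.expr, inline_single_use(op.sub, root))
    if isinstance(op, Filter):
        sub = op.sub
        if (isinstance(sub, Extend)
                and count_in_expr(op.expr, sub.var) == 1
                and count_uses(root, sub.var) == 1):
            new_expr = substitute(op.expr, sub.var, sub.expr)
            return Filter(new_expr, inline_single_use(sub.sub, root))
        return Filter(op.expr, inline_single_use(sub, root))
    raise TypeError(op)

# The simplified query from the issue, with the UNION kept opaque.
union = Leaf("union of the two ?uri patterns", ("uri", "p", "o", "sub"))
query = Project(
    ("uri",),
    Filter(("call", "STRSTARTS", ("var", "s"), ("lit", "http://")),
           Extend("s", ("call", "str", ("var", "uri")), union)))
rewritten = inline_single_use(query)
```

In this sketch, rewritten has the filter testing str(?uri) directly over the
union, with the extend gone. If ?s were also projected, the use count would be
2 and the tree would be returned unchanged, which is the "used elsewhere"
check the issue describes.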