[
https://issues.apache.org/jira/browse/JENA-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andy Seaborne updated JENA-653:
-------------------------------
Attachment: Q.rq
data.ttl
Example
* {{--engine=ref}} -- correct answers (1 row).
* {{--set arq:optFilterPlacement=false}} -- correct answers
* Default settings -- wrong answers (0 rows).
* Swap the {{UNION}} and the joined BGP -- correct answers.
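The four cases above can be mimicked with a toy evaluator. This is only a sketch under stated assumptions, not ARQ code: the tuple encoding of the algebra, the {{evaluate}} and {{match}} helpers, and the two-triple dataset are all hypothetical stand-ins (the real data.ttl is in the attachment), and the filter is reduced to "variable equals constant".

```python
# Toy model of algebra evaluation; NOT Jena/ARQ code.
# Hypothetical stand-in for data.ttl: one item of type_a with label "a".
DATA = [
    ("ex:x", "rdf:type", "ex:type_a"),
    ("ex:x", "ex:label", "a"),
]

def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    out = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check it against its existing value.
            if out.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return out

def evaluate(op, binding=None):
    """Evaluate an operator, starting from an input binding (may be empty)."""
    binding = binding or {}
    kind = op[0]
    if kind == "bgp":
        return [b for t in DATA if (b := match(op[1], t, binding)) is not None]
    if kind == "union":
        return evaluate(op[1], binding) + evaluate(op[2], binding)
    if kind == "sequence":
        # Feed each left-hand row into the right-hand side.
        return [r for left in evaluate(op[1], binding) for r in evaluate(op[2], left)]
    if kind == "filter":
        var, value, sub = op[1], op[2], op[3]
        # An unbound variable makes the test fail, so the row is dropped.
        return [b for b in evaluate(sub, binding) if b.get(var) == value]
    raise ValueError(kind)

TYPE_A = ("bgp", ("?item", "rdf:type", "ex:type_a"))
TYPE_B = ("bgp", ("?item", "rdf:type", "ex:type_b"))
LABEL = ("bgp", ("?item", "ex:label", "?label"))

# Filter left outside the sequence (what the reference engine evaluates).
original = ("filter", "?label", "a", ("sequence", ("union", TYPE_A, TYPE_B), LABEL))
# Filter pushed into both union arms (the faulty placement).
pushed = ("sequence",
          ("union", ("filter", "?label", "a", TYPE_A), ("filter", "?label", "a", TYPE_B)),
          LABEL)
# Join order reversed: BGP first, union second.
swapped = ("filter", "?label", "a", ("sequence", LABEL, ("union", TYPE_A, TYPE_B)))

print(len(evaluate(original)))  # 1
print(len(evaluate(pushed)))    # 0
print(len(evaluate(swapped)))   # 1
```

In this model the pushed-down filter runs before anything has bound {{?label}}, so every row is rejected inside the union and nothing reaches the label pattern.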
> Filter Placement into union pushes down whole filter but this fails in a
> sequence.
> ----------------------------------------------------------------------------------
>
> Key: JENA-653
> URL: https://issues.apache.org/jira/browse/JENA-653
> Project: Apache Jena
> Issue Type: Bug
> Components: ARQ
> Affects Versions: Jena 2.11.1
> Reporter: Andy Seaborne
> Assignee: Andy Seaborne
> Attachments: Q.rq, data.ttl
>
>
> The filter placement method of pushing into both arms only works when the
> filter is directly over the union. If the filter is further out, and for
> filter expressions that do not involve variables in the union arms, it should
> be left outside as it may be applied elsewhere later.
> This shows up in a {{sequence}} where the union comes before a pattern that
> does bind the variable.
> Example - the key feature is that the {{union}} comes first and is joined to
> a BGP, with the union on the LHS and the BGP on the RHS. If the join order is
> reversed, then a reasonable and correct optimization is performed.
> {noformat}
> PREFIX ex: <http://ex.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT *
> WHERE
>   { { ?item rdf:type ex:type_a }
>     UNION
>     { ?item rdf:type ex:type_b }
>     ?item ex:label ?label
>     FILTER ( str(?label) = "a" )
>   }
> {noformat}
> Algebra, after join strategy, before filter placement. The join style is a
> {{sequence}}:
> {noformat}
> (prefix ((ex: <http://ex.org/>)
>          (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>   (filter (= (str ?label) "a")
>     (sequence
>       (union
>         (bgp (triple ?item rdf:type ex:type_a))
>         (bgp (triple ?item rdf:type ex:type_b)))
>       (bgp (triple ?item ex:label ?label)))))
> {noformat}
> which is optimized (wrongly) as:
> {noformat}
> (prefix ((ex: <http://ex.org/>)
>          (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>   (sequence
>     (union
>       (filter (= (str ?label) "a")
>         (bgp (triple ?item rdf:type ex:type_a)))
>       (filter (= (str ?label) "a")
>         (bgp (triple ?item rdf:type ex:type_b))))
>     (bgp (triple ?item ex:label ?label))))
> {noformat}
> The {{(filter (= (str ?label) "a") ...)}} is applied inside the {{union}},
> where {{?label}} is not yet bound, not to the later
> {{(bgp (triple ?item ex:label ?label))}}.
> The problem is in the relationship of {{sequence}} and {{union}}. The
> {{union}} can't be treated in isolation with the current design. Either the
> {{union}} needs a better placement calculated, or failing that (less
> preferable), a flag to change the way filters are pushed down in union
> depending on nesting context.
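One possible shape for the "better placement" option, as a rough sketch only. It is not the fix applied in ARQ. It assumes a hypothetical tuple encoding of the algebra, reduces a filter to the set of variables it mentions, and uses two made-up helpers, {{bound_vars}} and {{place}}. The idea: push a filter into a union only when every arm is certain to bind all of the filter's variables; otherwise keep looking along the sequence, or leave the filter outside.

```python
# Sketch of context-aware filter placement; NOT Jena/ARQ code.
# Operators are tuples: ("bgp", triple), ("union", a, b),
# ("sequence", a, b, ...), ("filter", vars, sub).

def bound_vars(op):
    """Variables that op is certain to bind in every result row."""
    kind = op[0]
    if kind == "bgp":
        return {term for term in op[1] if term.startswith("?")}
    if kind == "filter":
        return bound_vars(op[2])
    subs = [bound_vars(sub) for sub in op[1:]]
    if kind == "union":
        # Only variables bound by every arm are certain.
        return set.intersection(*subs)
    if kind == "sequence":
        return set.union(*subs)
    raise ValueError(kind)

def place(filter_vars, op):
    """Return op with a filter over filter_vars placed as deep as is safe."""
    kind = op[0]
    if filter_vars <= bound_vars(op):
        if kind == "union":
            # Every arm binds the variables, so pushing into each arm is safe.
            return ("union",) + tuple(place(filter_vars, arm) for arm in op[1:])
        if kind == "sequence":
            for i, elt in enumerate(op[1:], start=1):
                if filter_vars <= bound_vars(elt):
                    # First element that binds everything the filter needs.
                    return op[:i] + (place(filter_vars, elt),) + op[i + 1:]
    # Not safe to go deeper: leave the filter outside this operator.
    return ("filter", filter_vars, op)

TYPE_A = ("bgp", ("?item", "rdf:type", "ex:type_a"))
TYPE_B = ("bgp", ("?item", "rdf:type", "ex:type_b"))
LABEL = ("bgp", ("?item", "ex:label", "?label"))

label_filter = frozenset({"?label"})
placed = place(label_filter, ("sequence", ("union", TYPE_A, TYPE_B), LABEL))
print(placed)
```

For the query in this issue, the sketch skips the union (neither arm binds {{?label}}) and wraps the filter around the label BGP instead. A filter on {{?item}} placed directly over the union would still be pushed into both arms.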