[ 
https://issues.apache.org/jira/browse/JENA-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne updated JENA-653:
-------------------------------

    Attachment: Q.rq
                data.ttl

Example 

* {{--engine=ref}} -- correct answers (1 row).
* {{--set arq:optFilterPlacement=false}} -- correct answers.
* Default settings -- wrong answers (0 rows).
* Swap the {{UNION}} and the joined BGP -- correct answers.


> Filter Placement into union pushes down whole filter but this fails in a 
> sequence.
> ----------------------------------------------------------------------------------
>
>                 Key: JENA-653
>                 URL: https://issues.apache.org/jira/browse/JENA-653
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ
>    Affects Versions: Jena 2.11.1
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>         Attachments: Q.rq, data.ttl
>
>
> The filter placement method of pushing into both arms only works when the 
> filter is directly over the union. If the filter is further out, and for 
> filter expressions that do not involve variables in the union arms, it should 
> be left outside as it may be applied elsewhere later.
> This shows up in a {{sequence}} where the union comes before a pattern that 
> binds the variable.
> Example - the key feature is that the {{union}} is first and is joined to a 
> BGP with the union on the LHS and BGP on the RHS. If the join order is 
> reversed, then a reasonable and correct optimization is performed.
> {noformat}
> PREFIX  ex:   <http://ex.org/>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT  *
> WHERE
>   {   { ?item rdf:type ex:type_a }
>     UNION
>       { ?item rdf:type ex:type_b }
>     ?item ex:label ?label
>     FILTER ( str(?label) = "a" )
>   }
> {noformat}
> Algebra, after join strategy, before filter placement. The join style is a 
> {{sequence}}:
> {noformat}
> (prefix ((ex: <http://ex.org/>)
>          (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>   (filter (= (str ?label) "a")
>     (sequence
>       (union
>         (bgp (triple ?item rdf:type ex:type_a))
>         (bgp (triple ?item rdf:type ex:type_b)))
>       (bgp (triple ?item ex:label ?label)))))
> {noformat}
> which is optimized (wrongly) as:
> {noformat}
> (prefix ((ex: <http://ex.org/>)
>          (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>   (sequence
>     (union
>       (filter (= (str ?label) "a")
>         (bgp (triple ?item rdf:type ex:type_a)))
>       (filter (= (str ?label) "a")
>         (bgp (triple ?item rdf:type ex:type_b))))
>     (bgp (triple ?item ex:label ?label))))
> {noformat}
> The {{(filter (= (str ?label) "a"))}} is applied inside the {{union}} arms, 
> not to the later {{(bgp (triple ?item ex:label ?label))}}, so {{?label}} is 
> still unbound when the filter is evaluated.
> The problem is in the relationship of {{sequence}} and {{union}}. The 
> {{union}} can't be treated in isolation with the current design. Either the 
> {{union}} needs a better placement calculated, or failing that (less 
> preferable), a flag to change the way filters are pushed down in union 
> depending on nesting context.
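
The difference between the two algebra forms above can be reproduced with a toy evaluator. This is only a sketch and not ARQ code: the four triples are hypothetical stand-ins (not the attached data.ttl), the helper names {{bgp}}, {{union_arms}}, {{sequence}} and {{label_is_a}} are invented for illustration, and labels are modelled as plain strings so that {{str(?label) = "a"}} reduces to a string comparison.

```python
# Toy evaluator for the two algebra forms in this issue.
# Hypothetical data; not the attached data.ttl.
TRIPLES = [
    ("ex:i1", "rdf:type", "ex:type_a"),
    ("ex:i1", "ex:label", "a"),
    ("ex:i2", "rdf:type", "ex:type_b"),
    ("ex:i2", "ex:label", "b"),
]

LABEL_PATTERN = ("?item", "ex:label", "?label")


def bgp(pattern, binding):
    """Match one triple pattern against TRIPLES, extending `binding`."""
    results = []
    for triple in TRIPLES:
        row = dict(binding)
        matched = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if row.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                matched = False
                break
        if matched:
            results.append(row)
    return results


def label_is_a(row):
    # In SPARQL an unbound variable makes the expression an error,
    # and FILTER drops the row. Modelled here as "not bound -> False".
    return "?label" in row and row["?label"] == "a"


def union_arms(filtered):
    """Evaluate the union; optionally with the filter pushed into both arms."""
    rows = (bgp(("?item", "rdf:type", "ex:type_a"), {})
            + bgp(("?item", "rdf:type", "ex:type_b"), {}))
    return [r for r in rows if label_is_a(r)] if filtered else rows


def sequence(left_rows, pattern):
    """Feed each left-hand row into the right-hand pattern."""
    return [out for row in left_rows for out in bgp(pattern, row)]


# Filter kept outside the sequence: ?label is bound when it is tested.
correct = [r for r in sequence(union_arms(False), LABEL_PATTERN) if label_is_a(r)]

# Filter pushed into the union arms: ?label is unbound, every row is dropped.
wrong = sequence(union_arms(True), LABEL_PATTERN)

print(len(correct), len(wrong))  # prints "1 0"
```

With this data the outer filter yields one row and the pushed-down filter yields none, which matches the 1 row versus 0 rows reported for the attached example.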



--
This message was sent by Atlassian JIRA
(v6.2#6252)
