[ 
https://issues.apache.org/jira/browse/JENA-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne closed JENA-779.
------------------------------

> Filter placement should be able to break up extend
> --------------------------------------------------
>
>                 Key: JENA-779
>                 URL: https://issues.apache.org/jira/browse/JENA-779
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ, Optimizer
>    Affects Versions: Jena 2.12.0
>            Reporter: Rob Vesse
>            Assignee: Andy Seaborne
>             Fix For: Jena 2.12.1
>
>         Attachments: JENA-779-filter-extend-extend, 
> JENA-779-filter-extend_distinct.patch, JENA-779-single-extend.patch, 
> JENA-779.patch
>
>
> The following query, seen internally, produces a query plan that is 
> considered sub-optimal:
> {noformat}
> SELECT DISTINCT ?domainName
> {
>   { ?uri ?p ?o }
>   UNION
>   {
>     ?sub ?p ?uri
>     FILTER(isIRI(?uri))
>   }
>   BIND(str(?uri) as ?s)
>   FILTER(STRSTARTS(?s, "http://"))
>   BIND(IRI(CONCAT("http://", STRBEFORE(SUBSTR(?s,8), "/"))) AS ?domainName)
> }
> {noformat}
> Which ARQ optimises as follows:
> {noformat}
> (distinct
>   (project (?domainName)
>     (filter (strstarts ?s "http://")
>       (extend ((?s (str ?uri)) (?domainName (iri (concat "http://" (strbefore (substr ?s 8) "/")))))
>         (union
>           (bgp (triple ?uri ?p ?o))
>           (filter (isIRI ?uri)
>             (bgp (triple ?sub ?p ?uri))))))))
> {noformat}
> This makes the query engine do a lot of unnecessary work: it computes both 
> {{BIND}} expressions for many solutions that the filter then rejects, when 
> for those solutions only the first, simple {{BIND}} expression needed to be 
> computed.
> It would be better if the query was planned as follows:
> {noformat}
> (distinct
>   (project (?domainName)
>     (extend (?domainName (iri (concat "http://" (strbefore (substr ?s 8) "/"))))
>       (filter (strstarts ?s "http://")
>         (extend (?s (str ?uri))
>           (union
>             (bgp (triple ?uri ?p ?o))
>             (filter (isIRI ?uri)
>               (bgp (triple ?sub ?p ?uri)))))))))
> {noformat}
> Essentially, when we try to push a filter through an {{extend}} and 
> determine that it cannot pass through the whole {{extend}}, we should check 
> whether the {{extend}} can be split instead, so that the filter is at least 
> partially pushed down.
> Note that a user can re-write the original query to yield this plan if they 
> make the second {{BIND}} a project expression like so:
> {noformat}
> SELECT DISTINCT (IRI(CONCAT("http://", STRBEFORE(SUBSTR(?s,8), "/"))) AS ?domainName)
> {
>   { ?uri ?p ?o }
>   UNION
>   {
>     ?sub ?p ?uri
>     FILTER(isIRI(?uri))
>   }
>   BIND(str(?uri) as ?s)
>   FILTER(STRSTARTS(?s, "http://"))
> }
> {noformat}
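
The split proposed in the description can be sketched roughly as follows. This is a minimal, self-contained illustration in Python, not ARQ's actual optimizer code: the function name `split_extend` and the modelling of an extend as an ordered list of (variable, variables read by its expression) pairs are assumptions made for the example. It also assumes that any filter variable not assigned by the extend is supplied by the sub-operation.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical model of one extend assignment:
# (assigned variable, variables read by its expression).
Assignment = Tuple[str, Set[str]]


def split_extend(
    assignments: List[Assignment], filter_vars: Set[str]
) -> Optional[Tuple[List[Assignment], List[Assignment]]]:
    """Split an extend so a filter can be placed between the two halves.

    Returns (inner, outer): `inner` stays below the filter and `outer`
    moves above it. Returns None when no useful split exists.
    """
    assigned = {var for var, _ in assignments}
    needed = filter_vars & assigned
    if not needed:
        # The filter does not depend on the extend at all, so it can be
        # pushed below the whole extend and no split is required.
        return None
    # Index of the last assignment the filter depends on.
    last = max(i for i, (var, _) in enumerate(assignments) if var in needed)
    inner, outer = assignments[: last + 1], assignments[last + 1 :]
    if not outer:
        # The filter needs every assignment; splitting gains nothing.
        return None
    return inner, outer


# The extend from the issue: ?s := str(?uri), ?domainName := f(?s)
extend = [("s", {"uri"}), ("domainName", {"s"})]
result = split_extend(extend, {"s"})
```

For the query in the issue, the filter on `?s` yields an inner extend containing only the `?s` assignment and an outer extend containing the `?domainName` assignment, which corresponds to the desired plan shown in the description.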



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
