[ 
https://issues.apache.org/jira/browse/JENA-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346892#comment-14346892
 ] 

Andy Seaborne commented on JENA-885:
------------------------------------

Partial analysis.  The queries may look similar, but they exhibit different 
execution possibilities, so I am picking one for discussion:
{noformat}
PREFIX ex: <http://example.com/>
SELECT  ?s ?valueA
WHERE
  { 
    OPTIONAL
      { ?s ex:propA ?a
        OPTIONAL
          { ?a ex:label ?labelA}
        BIND(if(bound(?labelA), ?labelA, ?a) AS ?valueA)
      }
  } LIMIT 1000
{noformat}
Optimized algebra:
{noformat}
(slice _ 1000
  (project (?s ?valueA)
    (leftjoin
      (bgp (triple ?s <http://example.com/propA> ?a))
      (extend ((?valueA (if (bound ?labelA) ?labelA ?a)))
        (conditional
          (bgp (triple ?s <http://example.com/propA> ?a))
          (bgp (triple ?a <http://example.com/label> ?labelA)))))))
{noformat}

# The query is being executed bottom up at the top level (i.e. the 
{{leftjoin}}). It does not need to be in this case, although in some of the 
other queries it might be necessary.
# The timeout is missed. This looks like a separate issue from efficient execution.
# The cost comes from the fact that the {{limit}} is not moved inwards, so the 
{{conditional}} is 1048576 rows of evaluation where it need only be 1000. This 
is coupled with Java's unhelpfully slow growth of {{ArrayList}}.
# Excessive execution is made worse by JENA-801 (this is an effect, not a 
cause).
# The top level {{leftjoin}} can be made a lot faster in this specific case. 
The general case is unclear within the current framework, though there is a 
separate eval engine, "quack", which is worth trying out for this. This is a 
non-issue if the limit is placed better in the algebra or in execution.
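
To illustrate the point about the limit not being moved inwards, here is a minimal plain-Java sketch. It is not Jena code: the class {{LimitPlacementSketch}}, its methods and the counter are hypothetical names made up for this example, and the "rows" are just integers standing in for solutions of the inner pattern. It only counts how many inner rows get evaluated when the limit is applied after full materialisation versus while pulling rows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; none of these names are Jena classes.
class LimitPlacementSketch {
    static final int TOTAL_ROWS = 1_048_576; // row count quoted in the analysis
    static final int LIMIT = 1_000;          // LIMIT from the query

    // Stand-in for the inner pattern: produces rows lazily and counts
    // each row that is actually evaluated.
    static Iterator<Integer> innerRows(AtomicLong evaluated) {
        return new Iterator<Integer>() {
            private int row = 0;
            @Override public boolean hasNext() { return row < TOTAL_ROWS; }
            @Override public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                evaluated.incrementAndGet();
                return row++;
            }
        };
    }

    // Limit outside: materialise every inner row, then cut down to LIMIT.
    static long evaluatedWithLimitOutside() {
        AtomicLong evaluated = new AtomicLong();
        Iterator<Integer> rows = innerRows(evaluated);
        List<Integer> all = new ArrayList<>(); // grows by repeated resizing
        while (rows.hasNext()) all.add(rows.next());
        List<Integer> result = all.subList(0, LIMIT);
        return evaluated.get();
    }

    // Limit inside: stop pulling rows as soon as LIMIT rows are collected.
    static long evaluatedWithLimitInside() {
        AtomicLong evaluated = new AtomicLong();
        Iterator<Integer> rows = innerRows(evaluated);
        List<Integer> result = new ArrayList<>(LIMIT);
        while (result.size() < LIMIT && rows.hasNext()) result.add(rows.next());
        return evaluated.get();
    }

    public static void main(String[] args) {
        System.out.println("limit outside: " + evaluatedWithLimitOutside() + " rows evaluated");
        System.out.println("limit inside:  " + evaluatedWithLimitInside() + " rows evaluated");
    }
}
```

Under these assumptions the first variant evaluates all 1048576 rows and the second only 1000, which is the gap described above.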

I was using a reduced timeout. On my machine, with an SSD, it executed in 18s 
in the worst (cold) case.

Missing timeout: the internal handling of bottom up execution is not all done 
with {{QueryIterator}}s, so the timeout mechanism isn't checked during some 
internal operations.  Fix: use {{QueryIterator}}s.
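
A minimal plain-Java sketch of why that matters, assuming the cancellation check lives in the iterator wrapper. This is not Jena's implementation: {{CancellableIteratorSketch}}, {{cancellable}}, {{drainChecked}} and {{drainUnchecked}} are hypothetical names, a plain {{IllegalStateException}} stands in for whatever the real engine throws, and a flag is used instead of a wall-clock timeout so the behaviour is deterministic.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of these names are Jena classes.
class CancellableIteratorSketch {

    // Wraps an iterator so that every step first checks a cancel flag.
    static <T> Iterator<T> cancellable(Iterator<T> inner, BooleanSupplier cancelled) {
        return new Iterator<T>() {
            @Override public boolean hasNext() { check(); return inner.hasNext(); }
            @Override public T next() { check(); return inner.next(); }
            private void check() {
                if (cancelled.getAsBoolean()) {
                    throw new IllegalStateException("query cancelled");
                }
            }
        };
    }

    // Work done through the wrapper: the cancel flag is seen on each step.
    static int drainChecked(Iterator<Integer> rows, AtomicBoolean cancelled) {
        Iterator<Integer> it = cancellable(rows, cancelled::get);
        int n = 0;
        while (it.hasNext()) { it.next(); n++; }
        return n;
    }

    // Work done in an internal loop over a materialised list: the flag is
    // never consulted, so the loop runs to completion regardless.
    static int drainUnchecked(List<Integer> rows, AtomicBoolean cancelled) {
        int n = 0;
        for (int i = 0; i < rows.size(); i++) n++;
        return n;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(1, 2, 3);
        AtomicBoolean cancelled = new AtomicBoolean(true); // "timeout" already fired
        System.out.println("unchecked loop processed " + drainUnchecked(rows, cancelled) + " rows");
        try {
            drainChecked(rows.iterator(), cancelled);
        } catch (IllegalStateException e) {
            System.out.println("checked iteration stopped: " + e.getMessage());
        }
    }
}
```

The internal loop finishes even though cancellation was requested, while the wrapped iteration stops on its first step.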

This algebra, with an additional {{slice}} placed inside the {{leftjoin}}, is fast:
{noformat}
(slice _ 1000
  (project (?s ?valueA)
    (leftjoin
      (table unit)
      (slice _ 1010
        (extend ((?valueA (if (bound ?labelA) ?labelA ?a)))
          (conditional
            (bgp (triple ?s <http://example.com/propA> ?a))
            (bgp (triple ?a <http://example.com/label> ?labelA))))))))
{noformat}

> Poor performance and timeout failure with BIND in nested OPTIONALs
> ------------------------------------------------------------------
>
>                 Key: JENA-885
>                 URL: https://issues.apache.org/jira/browse/JENA-885
>             Project: Apache Jena
>          Issue Type: Bug
>    Affects Versions: Jena 2.11.2
>            Reporter: Mark Buquor
>         Attachments: ExecuteTestQueries.java, GenerateTestDataset.java
>
>
> There appears to be a performance issue with BIND when used inside nested 
> OPTIONALs. Affected queries fail to time out.
> The following patterns appear to be affected:
> {noformat}
> OPTIONAL { ... OPTIONAL { ... BIND ( ... ) } }
> OPTIONAL { ... OPTIONAL { ... } BIND ( ... ) }
> {noformat}
> The following patterns appear to be unaffected:
> {noformat}
> OPTIONAL { ... OPTIONAL { ... } } BIND ( ... )
> OPTIONAL { ...  BIND ( ... ) }
> OPTIONAL { ... } BIND ( ... )
> {noformat}
> So far, users have been able to work around the performance issue by 
> rewriting their queries. However, the timeout failure is still a significant 
> reliability issue, since affected queries consume resources and can run 
> indefinitely. I've attached a testcase that exhibits the performance and 
> timeout problems. Reproduced with a recent 2.13.0-SNAPSHOT build.
> {noformat}
> Execution Timeout (ms): 30000
> Query: PREFIX ex: <http://example.com/> SELECT ?s ?valueA { OPTIONAL { ?s 
> ex:propA ?a . OPTIONAL { ?a ex:label ?labelA . } } BIND ( IF ( BOUND 
> (?labelA), ?labelA, ?a) as ?valueA) }
> Execution time (ms): 586
> Execution time (ms): 143
> Query: PREFIX ex: <http://example.com/> SELECT ?s ?valueA { OPTIONAL { ?s 
> ex:propA ?a . OPTIONAL { ?a ex:label ?labelA . } BIND ( IF ( BOUND (?labelA), 
> ?labelA, ?a) as ?valueA) } }
> Execution time (ms): 110922
> Execution time (ms): 41004
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
