[jira] [Commented] (JENA-289) Respect query timeouts in TDB implementation

Simon Helsen (JIRA) Thu, 18 Oct 2012 10:52:14 -0700

    [ https://issues.apache.org/jira/browse/JENA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479184#comment-13479184 ]


Simon Helsen commented on JENA-289:
-----------------------------------

We execute all our queries with defaultUnionGraph=true. The problematic query 
is quite ugly. I had to obfuscate it, but it has the following form:

PREFIX pr1: <http://host/context1>  
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX pr2: <http://host/context2> 
PREFIX pr3: <http://host/context3> 
PREFIX pr4: <http://host/context4> 
SELECT DISTINCT ?R1 ?R1_resourceContext ?R1_v1 ?R1_v2 ?R1_v3 ?R1_v4 ?R1_v5 
?R1_v6 ?R1_v7 ?R1_v8 ?R1_v9 ?R1_v10 ?R1_v11
WHERE
{ { ?R1 pr4:pred1 <https://host/context5/_SmApctudEeGqa4VgFDq7yw#Text>  }
OPTIONAL
{ ?R1 pr2:pred2 ?R1_v7 }
OPTIONAL
{ ?R1 pr1:pred3 ?R1_v10 }
OPTIONAL
{ ?R1 pr1:pred4 ?R1_v6 }
OPTIONAL
{ ?R1 pr1:pred5 ?R1_v11 }
OPTIONAL
{ ?R1 pr1:pred6 ?R1_v8 }
OPTIONAL
{ ?R1 pr3:pred7 ?R1_v12 }
FILTER ( ! bound(?R1_v12) )
OPTIONAL
{ ?R1 pr3:pred8 ?R1_v1 .
?R1_v1 pr1:pred6 ?R1_uv2
}
OPTIONAL
{ ?R1 pr3:pred9 ?R1_v2 }
OPTIONAL
{ { SELECT ?R1 (count(?b) AS ?R1_v9)
WHERE
{ ?R1 pr4:pred1 <https://host/context5/_SmApctudEeGqa4VgFDq7yw#Text> .
?b pr3:pred10 ?R1
}
GROUP BY ?R1
OFFSET 35640
LIMIT 20
}
}
OPTIONAL
{ ?R1 pr3:pred11 ?R1_v5 }
OPTIONAL
{ ?R1 pr3:pred12 ?R1_v4 }
OPTIONAL
{ ?R1 pr4:pred1 ?R1_v3 .
?R1_v3 rdf:value ?R1_uv1
}
?R1 rdf:type pr3:pred13 .
?R1 pr2:pred14 ?R1_resourceContext
}
OFFSET 35640
LIMIT 20

Note: this comes from our client and they acknowledge it is bad. But as we 
noted elsewhere in the issue, one bad query from one client can bring the 
server to its knees, and one of the main use cases for abort is to prevent a 
rogue query from continuing. Note that we also run into OOMEs, even with a very 
large heap. This is caused either by the reason you mention in an earlier 
comment (the union default graph needs to suppress duplicates) or, because this 
is executed in a transaction, by all subsequent write activity being kept in 
memory until the heap is eventually exhausted.
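
To make the abort use case concrete, here is a minimal sketch of cooperative 
cancellation in plain Java. It is not Jena's implementation: TimeoutSketch, 
DeadlineIterator, QueryTimedOutException and countUntilTimeout are hypothetical 
names invented for illustration, and the clock is injected so the behaviour can 
be reproduced deterministically. The wrapper checks a deadline each time the 
consumer asks for the next result, so it can only stop a query between steps. 
It cannot interrupt work that happens inside a single inner call.

```java
import java.util.Iterator;
import java.util.function.LongSupplier;
import java.util.stream.Stream;

// Hypothetical illustration only; none of these types exist in Jena.
class TimeoutSketch {

    /** Stand-in for a "query cancelled" signal (hypothetical name). */
    static class QueryTimedOutException extends RuntimeException {
        QueryTimedOutException(String message) {
            super(message);
        }
    }

    /** Wraps any iterator and checks a deadline before every step. */
    static class DeadlineIterator<T> implements Iterator<T> {
        private final Iterator<T> inner;
        private final LongSupplier clockNanos;
        private final long deadlineNanos;

        DeadlineIterator(Iterator<T> inner, long timeoutNanos, LongSupplier clockNanos) {
            this.inner = inner;
            this.clockNanos = clockNanos;
            this.deadlineNanos = clockNanos.getAsLong() + timeoutNanos;
        }

        private void checkDeadline() {
            // Subtract before comparing so the test stays correct if the
            // nanosecond counter wraps around.
            if (clockNanos.getAsLong() - deadlineNanos > 0) {
                throw new QueryTimedOutException("deadline exceeded");
            }
        }

        @Override
        public boolean hasNext() {
            checkDeadline();
            return inner.hasNext();
        }

        @Override
        public T next() {
            checkDeadline();
            return inner.next();
        }
    }

    /** Consumes the iterator until it ends or the deadline check fires. */
    static int countUntilTimeout(Iterator<?> results) {
        int count = 0;
        try {
            while (results.hasNext()) {
                results.next();
                count++;
            }
        } catch (QueryTimedOutException e) {
            // The rogue "query" was stopped; fall through and report progress.
        }
        return count;
    }

    public static void main(String[] args) {
        // An endless source standing in for a query that never finishes.
        Iterator<Integer> endless = Stream.iterate(0, i -> i + 1).iterator();
        // Real clock, 50 ms budget.
        Iterator<Integer> guarded =
            new DeadlineIterator<>(endless, 50_000_000L, System::nanoTime);
        System.out.println("results before timeout: " + countUntilTimeout(guarded));
    }
}
```

The limitation is the interesting part: if one call to the inner hasNext() 
spends minutes building a large in-memory set, the deadline check in the 
wrapper is not reached until that call returns, so the timeout is observed 
late or not at all.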
                
> Respect query timeouts in TDB implementation
> --------------------------------------------
>
>                 Key: JENA-289
>                 URL: https://issues.apache.org/jira/browse/JENA-289
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: TDB
>    Affects Versions: TDB 0.9.1
>            Reporter: Mark Buquor
>         Attachments: TestTimeout.java, TestTimeout.java
>
>
> In general use, we sometimes see queries throw QueryCancelledException 
> several seconds/minutes after the expected timeout. This is acceptable to a 
> degree, but it appears that there are cases where a rogue query could execute 
> unmitigated. The attached testcase is an example of a query that will execute 
> and consume CPU/heap until an OutOfMemoryError is thrown.
> Example: The following query with 10s timeouts executed for ~7 minutes before 
> throwing an OOME.
> Aug 3, 2012 10:18:19 AM Executing query [limit=1,000 timeout1=10s timeout2=10s]: SELECT * WHERE { ?a ?b ?c . ?c ?d ?e }
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.HashSet.<init>(HashSet.java:86)
>     at org.openjena.atlas.iterator.FilterUnique.<init>(FilterUnique.java:26)
>     at org.openjena.atlas.iterator.Iter.distinct(Iter.java:438)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:116)
>     at com.hp.hpl.jena.tdb.solver.StageMatchTuple.makeNextStage(StageMatchTuple.java:44)
>     at org.openjena.atlas.iterator.RepeatApplyIterator.hasNext(RepeatApplyIterator.java:49)
>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterPlainWrapper.hasNextBinding(QueryIterPlainWrapper.java:54)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:108)
>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>     at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:41)
>     at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:482)
>     at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:460)
>     at TestTimout.main(TestTimout.java:84)
> Aug 3, 2012 10:25:41 AM Finished

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
