Hi Holger, I believe you are correct that Query objects with aggregators cannot be reused by different threads. They *can* be reused by the same thread or by different threads that synchronize the compile step, but even then there is a problem with the Query object hanging onto references to a new aggregator for each query execution.
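To illustrate the "synchronize the compile step" option mentioned above, here is a minimal sketch in plain Java. The class name SynchronizedCompile and the method compileLocked are hypothetical and not part of ARQ. The compile step is passed in as a function because the sketch makes no claim about which ARQ call actually performs the compilation.

```java
import java.util.function.Function;

/** Hypothetical helper, not part of ARQ. */
public class SynchronizedCompile {

    /**
     * Runs the given compile step while holding the monitor of the shared
     * (cached) query object, so that two threads never compile the same
     * cached object at the same time.
     *
     * Assumption: every thread that compiles this cached object goes
     * through this method; a caller that bypasses it is not protected.
     */
    public static <Q, E> E compileLocked(Q sharedQuery, Function<Q, E> compileStep) {
        synchronized (sharedQuery) {
            return compileStep.apply(sharedQuery);
        }
    }
}
```

With ARQ on the classpath, the function argument would be whichever call triggers compilation of the cached Query, possibly a lambda wrapping QueryExecutionFactory.create(query, model). This only serializes the compile step; it does not stop the Query object from hanging onto aggregator references.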

The thing causing this appears to be in AlgebraGenerator.java line 562, where the aggregators added to a Query object are referenced directly by the compiled query plan. Instead, we should make a copy of the aggregators so that the original Query object remains immutable.

I've created a JIRA issue and submitted a patch, JENA-120:
https://issues.apache.org/jira/browse/JENA-120

As a work-around until the patch is applied, I think you can synchronize around the QueryExecutionFactory.create() method. Or, you can decide not to cache Group By queries (test for this with Query.hasGroupBy()).

I don't know if there are other issues that may prevent reusing Query objects, maybe Andy can chime in here.

-Stephen

P.S. Your strategy of caching Query objects does avoid having to reparse the query string, which can be quite beneficial. Along these same lines, a better enhancement to ARQ would be a mechanism to cache the query plans after the optimizer step. Query optimization itself can get quite expensive (n! for left-deep trees, and even worse for bushy trees).

> -----Original Message-----
> From: Holger Knublauch [mailto:[email protected]]
> Sent: Tuesday, September 20, 2011 1:14 AM
> To: [email protected]
> Subject: Aggregators and concurrent use of Query object
>
> Hi Andy,
>
> we have (unreliably) run into exceptions like the one below, and my
> suspicion is that the ARQ Query class is not meant to be re-used by
> multiple threads. Although each step in the Query is converted into a
> corresponding Algebra objects for execution, the Aggregators seem to be
> shared between multiple objects. Is this correct and do I need to
> create a new Query each time I want a QueryExecution? This would slow
> down things quite a lot, as we currently cache all Queries that were
> created from string representation. If this is the case, are there any
> ways to tell which particular queries are not thread-safe, e.g. all
> queries involving aggregations?
>
> If I am totally off the mark, do you know what else could cause the
> exception below, only sometimes in multi-threading conditions?
>
> Thank you,
> Holger
>
>
> com.hp.hpl.jena.sparql.ARQInternalErrorException: Null for accumulator
>     at com.hp.hpl.jena.sparql.expr.aggregate.AggregatorBase.getValue(AggregatorBase.java:61)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterGroup.calc(QueryIterGroup.java:121)
>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterGroup.<init>(QueryIterGroup.java:32)
>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:413)
>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:255)
>     at com.hp.hpl.jena.sparql.algebra.op.OpGroup.visit(OpGroup.java:37)
>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:33)
>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.executeOp(OpExecutor.java:107)
>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:441)
>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:241)
>     at com.hp.hpl.jena.sparql.algebra.op.OpExtend.visit(OpExtend.java:107)
>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:33)
>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.executeOp(OpExecutor.java:107)
>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:393)
>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:213)
>     at com.hp.hpl.jena.sparql.algebra.op.OpProject.visit(OpProject.java:34)
>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:33)
>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.executeOp(OpExecutor.java:107)
>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:80)
>     at com.hp.hpl.jena.sparql.engine.main.QC.execute(QC.java:40)
>     at com.hp.hpl.jena.sparql.engine.main.QueryEngineMain.eval(QueryEngineMain.java:52)
>     at com.hp.hpl.jena.sparql.engine.QueryEngineBase.evaluate(QueryEngineBase.java:138)
>     at com.hp.hpl.jena.sparql.engine.QueryEngineBase.createPlan(QueryEngineBase.java:109)
>     at com.hp.hpl.jena.sparql.engine.QueryEngineBase.getPlan(QueryEngineBase.java:97)
>     at com.hp.hpl.jena.sparql.engine.main.QueryEngineMain$1.create(QueryEngineMain.java:91)
>     at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.getPlan(QueryExecutionBase.java:266)
>     at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.startQueryIterator(QueryExecutionBase.java:243)
>     at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.execResultSet(QueryExecutionBase.java:248)
>     at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.execSelect(QueryExecutionBase.java:94)
>     at org.topbraid.spin.arq.SPINARQFunction.executeBody(SPINARQFunction.java:121)
