[
https://issues.apache.org/jira/browse/JENA-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090066#comment-17090066
]
Andy Seaborne commented on JENA-1861:
-------------------------------------
There are two different exceptions:
# {{Query.setResultVars}}, when concurrent updates within {{Query}} interfere.
# Concurrent calls of {{OptimizerStd.transformExprConstantFolding}} causing
concurrent calls of {{ExprDigest}}.
(1) can be fixed by making the actual work of {{setResultVars}} synchronized.
This works: the test case, together with the fix for (2) below, runs without
problems for many minutes.
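A minimal sketch of the synchronized approach, to illustrate the idea only. The class {{LazyResultVars}} and its fields are hypothetical stand-ins for the lazily initialised part of {{Query}}; this is not the Jena code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the lazily initialised result vars of a query.
class LazyResultVars {
    private final List<String> patternVars;
    private final List<String> resultVars = new ArrayList<>();
    private boolean resultVarsSet = false;

    LazyResultVars(List<String> patternVars) {
        this.patternVars = new ArrayList<>(patternVars);
    }

    // The whole check-then-fill step runs under the object's lock,
    // so two threads cannot both populate the list.
    synchronized void setResultVars() {
        if (resultVarsSet)
            return;
        resultVars.addAll(patternVars);
        resultVarsSet = true;
    }

    // Readers take the same lock (it is reentrant) and get a private copy.
    synchronized List<String> getResultVars() {
        setResultVars();
        return new ArrayList<>(resultVars);
    }
}
```

Without the lock, two threads could both see the flag unset and both append, leaving duplicated or partially built state.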
A better solution is to call {{setResultVars}} as parsing finishes (with the
equivalent build steps in jena-querybuilder). This also works, except that it
breaks some tests in {{TestSyntaxTransform}}: the internal query state differs
even when the transformed query is equivalent to the expected query, so the
two are not exactly {{.equals}}. This is either untidiness in
{{QueryTransformOps}} (it leaves different internal state in the transformed
query even if that state no longer matters) or excessive detail in the query
comparison.
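The "initialise when parsing finishes" alternative could look roughly like the sketch below. {{ParsedQuery}} and {{finishParsing}} are hypothetical names for illustration, not Jena classes or methods.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a query whose result vars are fixed at parse time.
final class ParsedQuery {
    // A final field set in the constructor is safely visible to other threads.
    private final List<String> resultVars;

    private ParsedQuery(List<String> resultVars) {
        this.resultVars = resultVars;
    }

    // Called once, as the last step of the (hypothetical) parser.
    static ParsedQuery finishParsing(List<String> patternVars) {
        List<String> vars = Collections.unmodifiableList(new ArrayList<>(patternVars));
        return new ParsedQuery(vars);
    }

    // Pure read: nothing is computed or mutated on request.
    List<String> getResultVars() {
        return resultVars;
    }
}
```

Because nothing is filled in on first use, concurrent readers need no locking.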
(2) is fixed by using {{CacheFactory.createOneSlotCache()}}, not two member
variables {{lastSeen}} and {{lastCalc}}, which aren't managed safely.
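To illustrate why a single slot is safer than two separate fields, here is a minimal sketch of a one-slot cache. The class {{OneSlot}} and its {{getOrFill}} method are assumptions made for this example; they are not the implementation behind {{CacheFactory.createOneSlotCache()}}.

```java
import java.util.function.Function;

// Hypothetical one-slot cache: remembers only the most recent key/value pair.
final class OneSlot<K, V> {
    // Key and value travel together in one immutable object.
    private static final class Entry<K, V> {
        final K key;
        final V value;
        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private volatile Entry<K, V> slot = null;

    V getOrFill(K key, Function<K, V> compute) {
        Entry<K, V> e = slot; // single read: a matching key always has its own value
        if (e != null && e.key.equals(key))
            return e.value;
        V value = compute.apply(key);
        slot = new Entry<>(key, value); // single write publishes the pair atomically
        return value;
    }
}
```

With two independent fields, one thread may observe the new key paired with the old or a not-yet-assigned value. Here, concurrent callers may recompute the same value, but they never see a mismatched pair.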
> Query not thread safe
> ---------------------
>
> Key: JENA-1861
> URL: https://issues.apache.org/jira/browse/JENA-1861
> Project: Apache Jena
> Issue Type: Question
> Components: ARQ
> Affects Versions: Jena 3.14.0
> Reporter: Claus Stadler
> Assignee: Andy Seaborne
> Priority: Major
>
> Executing the same query object on different RDFConnections is not thread
> safe:
> I ran into very misleading "NPE in NodeFactory.createLiteral" exceptions when
> computing SHA256 sums in parallel on different connections backed by
> different datasets/models using the SAME query object.
> I identified the cause as a race condition in the digestCache used in
> [ExprDigest|https://github.com/apache/jena/blob/d95b7d295cebaeb2ea41029f4ee7781be94e5e85/jena-arq/src/main/java/org/apache/jena/sparql/expr/ExprDigest.java#L33]
> My first question is: Are Query objects - or rather expressions - supposed to
> carry execution state or is this rather a bug?
> I know that some parts of the Query object, such as result vars, are only
> initialized on request which makes use of the same Query object in different
> threads fragile to begin with.
> So my other question is: Given a Query object, is Jena supposed to allow for
> 'fully initializing' it, such that its execution using Jena's provided
> facilities (models, datasets, etc) is guaranteed to not modify its state?