[ 
https://issues.apache.org/jira/browse/JENA-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090066#comment-17090066
 ] 

Andy Seaborne commented on JENA-1861:
-------------------------------------

There are two different exceptions:

# {{Query.setResultVars}}, when concurrent updates within {{Query}} interfere.
# Concurrent calls of {{OptimizerStd.transformExprConstantFolding}} causing 
concurrent calls of {{ExprDigest}}.

(1) can be fixed by making the actual work of {{setResultVars}} synchronized. 
This works: the test case, together with the fix for (2) below, runs without 
problems for many minutes.
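
To illustrate the synchronized approach, here is a minimal sketch using only 
the JDK. {{LazyVarsQuery}} and its fields are hypothetical stand-ins, not 
Jena's {{Query}} class, and the "work" is reduced to de-duplicating a list of 
variable names. The point is only that the lazy build and the read go through 
the same monitor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a query object; NOT Jena's Query class.
final class LazyVarsQuery {
    private final List<String> projectedVars;   // fixed at construction
    private List<String> resultVars = null;     // derived state, built on first request

    LazyVarsQuery(List<String> projectedVars) {
        this.projectedVars = new ArrayList<>(projectedVars);
    }

    // synchronized: only one thread at a time can run the lazy build, and every
    // caller sees the finished list rather than a half-built one.
    synchronized List<String> getResultVars() {
        if (resultVars == null) {
            List<String> vars = new ArrayList<>();
            for (String v : projectedVars) {
                if (!vars.contains(v)) {
                    vars.add(v);
                }
            }
            resultVars = Collections.unmodifiableList(vars);
        }
        return resultVars;
    }

    public static void main(String[] args) throws InterruptedException {
        LazyVarsQuery q = new LazyVarsQuery(Arrays.asList("s", "p", "s"));
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> q.getResultVars());
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(q.getResultVars());
    }
}
```

The cost is a lock acquisition on every call, which is one reason to prefer 
doing the work once, up front.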

A better solution is to call {{setResultVars}} as parsing finishes (with the 
equivalent build steps in jena-querybuilder). This also works, except that it 
breaks some tests in {{TestSyntaxTransform}}: the internal state of the 
transformed query differs from that of the expected query, so the two are 
equivalent but not {{.equals}}. This is either untidiness in 
{{QueryTransformOps}} (it leaves different internal state in the transformed 
query even where that state no longer matters) or excessive detail in the 
query comparison.
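
As a rough sketch of the "finish the work at the end of parsing" idea, 
assuming a hypothetical {{EagerQuery}} class and {{finishParse}} method 
(neither is Jena API): the derived state is computed before the object is 
handed out, so later reads need no locking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical query object whose derived state is fixed before it is shared.
// NOT Jena's Query class.
final class EagerQuery {
    // final field assigned in the constructor: safely visible to other threads
    private final List<String> resultVars;

    private EagerQuery(List<String> resultVars) {
        this.resultVars = resultVars;
    }

    // Read-only accessor; nothing is computed or mutated here.
    List<String> getResultVars() {
        return resultVars;
    }

    // Stand-in for the last step of parsing: build the derived state once,
    // then return an object that is never modified again.
    static EagerQuery finishParse(List<String> projectedVars) {
        List<String> vars = new ArrayList<>(new LinkedHashSet<>(projectedVars));
        return new EagerQuery(Collections.unmodifiableList(vars));
    }

    public static void main(String[] args) {
        EagerQuery q = EagerQuery.finishParse(Arrays.asList("s", "p", "s"));
        System.out.println(q.getResultVars());
    }
}
```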

(2) is fixed by using {{CacheFactory.createOneSlotCache()}} instead of the two 
member variables {{lastSeen}} and {{lastCalc}}, which aren't managed safely.
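
To show why a single slot is safer than two separate fields, here is a 
JDK-only sketch. It does not use or mirror Jena's {{CacheFactory}}; 
{{OneSlot}}, {{getOrFill}} and {{DigestSketch}} are hypothetical names. With 
two unsynchronized fields a thread can pair the key written by one thread 
with the value written by another (or with no value yet). Holding key and 
value in one immutable entry behind a single reference rules that out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical one-slot cache; NOT Jena's implementation.
final class OneSlot<K, V> {
    // Key and value travel together in one immutable object.
    private static final class Entry<K, V> {
        final K key;
        final V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final AtomicReference<Entry<K, V>> slot = new AtomicReference<>();

    V getOrFill(K key, Function<K, V> compute) {
        Entry<K, V> e = slot.get();          // one read yields a matched key/value pair
        if (e != null && e.key.equals(key)) {
            return e.value;
        }
        V value = compute.apply(key);        // may run in several threads; fine for a pure function
        slot.set(new Entry<>(key, value));   // one write replaces the whole pair
        return value;
    }
}

final class DigestSketch {
    // A fresh MessageDigest per call, since MessageDigest instances are not thread safe.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : hash) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        OneSlot<String, String> cache = new OneSlot<>();
        System.out.println(cache.getOrFill("abc", DigestSketch::sha256Hex));
    }
}
```

Two threads may still compute the same digest at the same moment, but each 
gets a correct result and the slot always holds a consistent pair.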

> Query not thread safe
> ---------------------
>
>                 Key: JENA-1861
>                 URL: https://issues.apache.org/jira/browse/JENA-1861
>             Project: Apache Jena
>          Issue Type: Question
>          Components: ARQ
>    Affects Versions: Jena 3.14.0
>            Reporter: Claus Stadler
>            Assignee: Andy Seaborne
>            Priority: Major
>
> Executing the same query object on different RDFConnections is not thread 
> safe:
> I ran into very misleading "NPE in NodeFactory.createLiteral" exceptions when 
> computing SHA256 sums in parallel on different connections backed by 
> different datasets/models using the SAME query object.
> I identified the cause as a race condition in the digestCache used in 
> [ExprDigest|https://github.com/apache/jena/blob/d95b7d295cebaeb2ea41029f4ee7781be94e5e85/jena-arq/src/main/java/org/apache/jena/sparql/expr/ExprDigest.java#L33]
> My first question is: Are Query objects - or rather expressions - supposed to 
> carry execution state or is this rather a bug?
> I know that some parts of the Query object, such as result vars, are only 
> initialized on request which makes use of the same Query object in different 
> threads fragile to begin with.
> So my other question is: Given a Query object, is Jena supposed to allow for 
> 'fully initializing' it, such that its execution using Jena's provided 
> facilities (models, datasets, etc) is guaranteed to not modify its state?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
