Dave Reynolds created JENA-2319:
-----------------------------------

             Summary: Concurrency errors in text search when using explicit Analyzers
                 Key: JENA-2319
                 URL: https://issues.apache.org/jira/browse/JENA-2319
             Project: Apache Jena
          Issue Type: Bug
            Reporter: Dave Reynolds


Seeing errors when multiple Jena text queries are in flight at the same time.
Precise traces vary, but all examples seen so far occur in the Lucene analyzer
phase of query parsing. Have only been able to reproduce this reliably when
using the ConfigurableAnalyzer, but that code itself looks clean, which suggests
that Lucene Analyzers in general are not thread safe.
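
A minimal sketch of the kind of harness that can provoke such failures, using only the JDK. The class name ConcurrentParseHarness and the `parse` callback are hypothetical stand-ins for the real text-index query-parse call; this is not the test case from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical stress harness; not part of Jena or Lucene. */
final class ConcurrentParseHarness {

    /**
     * Calls parse.apply(query) from `threads` threads, `iterations` times each,
     * and returns how many of those calls threw a RuntimeException.
     */
    static int countFailures(Function<String, ?> parse, String query,
                             int threads, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Integer>> results = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            results.add(pool.submit(() -> {
                start.await(); // release all workers together to maximise overlap
                int failures = 0;
                for (int i = 0; i < iterations; i++) {
                    try {
                        parse.apply(query);
                    } catch (RuntimeException e) {
                        failures++; // e.g. IllegalStateException from a shared TokenStream
                    }
                }
                return failures;
            }));
        }
        start.countDown();
        try {
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("worker failed unexpectedly", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a thread-safe `parse` the count should be zero. A non-zero count for a callback that wraps the real parse call would be consistent with the traces in this report, though the failure is timing dependent and may need many iterations to show up.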

Reproduced on Jena versions from 3.16.0 through 4.4.0.

Will submit a PR with a test case and a brute-force fix (synchronizing the query
parse step), though more subtle fixes may be possible.
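
For illustration only, the brute-force approach can be sketched with the JDK alone as below. SynchronizedParse and its `delegate` are hypothetical names; the delegate stands in for whatever call actually runs the Lucene query parser, and this is not the code from the PR.

```java
import java.util.function.Function;

/** Hypothetical wrapper: serialises every call to a parse step that is not thread safe. */
final class SynchronizedParse<Q> {
    private final Object lock = new Object();
    private final Function<String, Q> delegate;

    SynchronizedParse(Function<String, Q> delegate) {
        this.delegate = delegate;
    }

    /** Only one thread at a time runs the delegate, so its shared state is never interleaved. */
    Q parse(String queryString) {
        synchronized (lock) {
            return delegate.apply(queryString);
        }
    }
}
```

The cost of this design is that all query parsing is serialised behind one lock, which is why it counts as brute force: it trades parse throughput for correctness without identifying which shared state is actually being corrupted.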

Example partial stack traces:

Caused by: java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
    at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:109) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.CharacterUtils.readFully(CharacterUtils.java:184) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.CharacterUtils.fill(CharacterUtils.java:160) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.CharacterUtils.fill(CharacterUtils.java:178) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.util.CharTokenizer.incrementToken(CharTokenizer.java:174) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter.incrementToken(ASCIIFoldingFilter.java:102) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.LowerCaseFilter.incrementToken(LowerCaseFilter.java:41) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.CachingTokenFilter.fillCache(CachingTokenFilter.java:91) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.CachingTokenFilter.incrementToken(CachingTokenFilter.java:70) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:312) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:260) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParserBase.newFieldQuery(QueryParserBase.java:473) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParserBase.getFieldQuery(QueryParserBase.java:465) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParserBase.handleBareTokenQuery(QueryParserBase.java:828) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:469) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:355) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:244) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:215) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:109) ~[fuseki-server.jar:4.4.0]
    at org.apache.jena.query.text.TextIndexLucene.parseQuery(TextIndexLucene.java:441) ~[fuseki-server.jar:4.4.0]

...

and

Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 16 out of bounds for length 16
    at org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter.incrementToken(ASCIIFoldingFilter.java:109) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.LowerCaseFilter.incrementToken(LowerCaseFilter.java:41) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.analysis.Analyzer.normalize(Analyzer.java:247) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParserBase.getRegexpQuery(QueryParserBase.java:756) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParserBase.handleBareTokenQuery(QueryParserBase.java:824) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:469) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:355) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:244) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:215) ~[fuseki-server.jar:4.4.0]
    at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:109) ~[fuseki-server.jar:4.4.0]
    at org.apache.jena.query.text.TextIndexLucene.parseQuery(TextIndexLucene.java:441) ~[fuseki-server.jar:4.4.0]

...



