Stephen Allen created JENA-848:
----------------------------------
Summary: jena-text Lucene concurrency issues
Key: JENA-848
URL: https://issues.apache.org/jira/browse/JENA-848
Project: Apache Jena
Issue Type: Bug
Components: Text
Reporter: Stephen Allen
Assignee: Stephen Allen
When using jena-text with an in-process Lucene index, there are concurrency
issues when multiple requests access the Dataset using transactions.
It appears the problem is that a new Lucene IndexWriter is created at every
transaction start with no concurrency control. The fix should be to create a
single IndexWriter when the DatasetGraphText is created and use it for all
requests. This works because the Lucene IndexWriter is thread-safe and
designed for concurrent access.
This should also increase performance by not continually opening and closing
the IndexWriter. Also we can use Near Real-Time (NRT) IndexReaders that don't
have to wait until changes are pushed to disk.
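A minimal sketch of the proposed lifecycle, using only the JDK so it runs without Lucene on the classpath. StubWriter and StubDataset are hypothetical stand-ins for the Lucene IndexWriter and for DatasetGraphText; they are not the real classes, and the real change would hold an actual IndexWriter in the same position. The point is only the ownership: one writer created with the dataset, shared by every transaction, and closed when the dataset is closed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedWriterSketch {

    /** Hypothetical stand-in for a thread-safe index writer (not a Lucene class). */
    static final class StubWriter implements AutoCloseable {
        private final AtomicInteger commits = new AtomicInteger();
        private volatile boolean closed = false;

        /** Fails if the writer was already closed, similar in spirit to the reported error. */
        void commit() {
            if (closed) {
                throw new IllegalStateException("this writer is closed");
            }
            commits.incrementAndGet();
        }

        int commitCount() {
            return commits.get();
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Hypothetical stand-in for the dataset: owns exactly one writer for its whole lifetime. */
    static final class StubDataset implements AutoCloseable {
        private final StubWriter writer = new StubWriter(); // created once, with the dataset

        /** Every transaction commits through the shared writer; none of them closes it. */
        void commitTransaction() {
            writer.commit();
        }

        int commitCount() {
            return writer.commitCount();
        }

        @Override
        public void close() {
            writer.close(); // the writer is closed only when the dataset itself is closed
        }
    }

    /** Runs concurrent commits against one dataset and returns how many succeeded. */
    public static int runConcurrentCommits(int threads, int commitsPerThread)
            throws InterruptedException {
        try (StubDataset dataset = new StubDataset()) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            for (int t = 0; t < threads; t++) {
                pool.submit(() -> {
                    for (int i = 0; i < commitsPerThread; i++) {
                        dataset.commitTransaction();
                    }
                });
            }
            pool.shutdown();
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for commits");
            }
            return dataset.commitCount();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runConcurrentCommits(4, 1000));
    }
}
```

Because no transaction ever closes the writer, a commit on one thread cannot find it closed by another; with a writer opened and closed per transaction, that is exactly the race that can occur.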
If concurrent access is not controlled then you can end up with IndexWriter
objects being closed while they are still in use by other threads:
{code}
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:645)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2974)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2954)
    at org.apache.jena.query.text.TextIndexLucene.finishIndexing(TextIndexLucene.java:122)
    at org.apache.jena.query.text.TextDocProducerTriples.finish(TextDocProducerTriples.java:46)
    at org.apache.jena.query.text.DatasetGraphText.commit(DatasetGraphText.java:122)
    at org.apache.jena.query.text.TestLuceneWithMultipleThreads$2.run(TestLuceneWithMultipleThreads.java:156)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)
{code}