Stephen Allen created JENA-848:
----------------------------------

             Summary: jena-text Lucene concurrency issues
                 Key: JENA-848
                 URL: https://issues.apache.org/jira/browse/JENA-848
             Project: Apache Jena
          Issue Type: Bug
          Components: Text
            Reporter: Stephen Allen
            Assignee: Stephen Allen


When using jena-text with an in-process Lucene index, there are concurrency
issues when multiple requests access the Dataset using transactions.

It appears the problem is that a new Lucene IndexWriter is created at every
transaction start with no concurrency control.  Instead, the solution should be
to create a single IndexWriter when the DatasetGraphText is created and use
that for all requests.  This works because the Lucene IndexWriter is
thread-safe and designed for concurrent access.
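
As a rough sketch of the proposed lifecycle (not the actual jena-text change), the idea is one writer per dataset, committed per transaction and closed only when the dataset itself is closed. The code below deliberately does not use the Lucene API: StandInWriter and SharedWriterSketch are hypothetical stand-ins, and the only behaviour assumed of the real IndexWriter is what is stated above, namely that it is thread-safe and rejects use after close.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for DatasetGraphText: it owns ONE writer for its whole lifetime.
final class SharedWriterSketch {
    private final StandInWriter writer = new StandInWriter();

    // One "transaction": add some documents, then commit. The writer is never closed here.
    void runTransaction(int docs) {
        for (int i = 0; i < docs; i++) {
            writer.addDocument("doc-" + i);
        }
        writer.commit();
    }

    // The writer is closed only when the dataset itself goes away.
    void close() {
        writer.close();
    }

    // Runs `threads` concurrent transactions against one dataset and
    // returns how many documents ended up committed.
    static int runConcurrentCommits(int threads, int docsPerThread) throws Exception {
        SharedWriterSketch dataset = new SharedWriterSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(() -> dataset.runTransaction(docsPerThread)));
            }
            for (Future<?> f : results) {
                f.get(); // rethrows any exception raised inside a worker
            }
            return dataset.writer.committedCount();
        } finally {
            pool.shutdown();
            dataset.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runConcurrentCommits(4, 100));
    }
}

// Hypothetical stand-in for a thread-safe index writer; NOT the Lucene API.
final class StandInWriter {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final AtomicInteger pending = new AtomicInteger();
    private final AtomicInteger committed = new AtomicInteger();

    void addDocument(String doc) {
        ensureOpen();
        pending.incrementAndGet();
    }

    void commit() {
        ensureOpen();
        // Move everything pending at this instant into the committed count.
        committed.addAndGet(pending.getAndSet(0));
    }

    void close() {
        closed.set(true);
    }

    int committedCount() {
        return committed.get();
    }

    // Mirrors the failure in this report: any use after close is rejected.
    private void ensureOpen() {
        if (closed.get()) {
            throw new IllegalStateException("this writer is closed");
        }
    }
}
```

Because no transaction ever closes the shared writer, no thread can find it closed mid-commit. With 4 threads adding 100 documents each, the sketch reports 400 committed documents.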

This should also increase performance by not continually opening and closing
the IndexWriter.  We can also use Near Real-Time (NRT) IndexReaders, which
don't have to wait until changes are pushed to disk.

If concurrent access is not controlled, you can end up with IndexWriter
objects being closed while they are still in use by other threads:
{code}
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
        at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:645)
        at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2974)
        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2954)
        at org.apache.jena.query.text.TextIndexLucene.finishIndexing(TextIndexLucene.java:122)
        at org.apache.jena.query.text.TextDocProducerTriples.finish(TextDocProducerTriples.java:46)
        at org.apache.jena.query.text.DatasetGraphText.commit(DatasetGraphText.java:122)
        at org.apache.jena.query.text.TestLuceneWithMultipleThreads$2.run(TestLuceneWithMultipleThreads.java:156)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:744)
{code}



