Mark,

>I am not a Lucene expert but I would like to understand the threading issues
>also, and I'm wondering if the following is true when using Lucene in a
>multithreaded application.
>
>I understand there are three modes for using IndexReader and IndexWriter:
>
>A- IndexReader for reading only, not deleting
>B- IndexReader for deleting (and reading)
>C- IndexWriter (for adding and optimizing)
>
>Any number of readers may be used concurrently in mode A.  But for B and C
>the reader or writer may not be kept open for long periods.  Write
>operations create a lock, and closing the reader or writer is the only way
>to release the lock.  In theory a single writer could be kept open, but its
>lock will prevent deletions (which are performed with a separate reader).
>
>Therefore for B and C each set of changes should be made inside a
>synchronized block where the reader or writer is opened and closed.  This
>prevents multiple writers (or readers used for deleting) from being open at
>once.  The synchronization should be done on an object that identifies a
>particular index, e.g., on a global object if there is only one index.  For
>example:

This is exactly how I understand things. I have written a LuceneDb
class that works mostly in this way, except that once an index reader
and an index searcher have been opened it keeps them open. There is
also a method to close them if they are open.
Opening an index reader is fairly slow, so it pays to keep it open.

Lucene itself should provide some or most of the locking needed for this.
However, I managed to get inconsistencies (i.e. exceptions about wrong
doc order during optimize()), probably due to 'unusual' semantics of
the underlying file system. To solve this I provided my own locking
in the LuceneDb class.

I wrote methods for case A, the combined B/C case and case C, each
visiting an index reader and/or an index writer as appropriate.
Cases B and C need to be under the same lock when you are
deleting old docs and inserting new ones.
The B/C and C methods always end with both the reader/searcher
and the writer closed.
Searching also needs an index searcher, so there is another method
that visits both a reader and a searcher: case D.
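
A rough sketch of the visiting methods described above, in plain Python.
This is not the actual LuceneDb code: the class name IndexDb, the
open_reader/open_writer factories and the method names are all made up
for illustration, and a single mutex stands in for the real locking.
It only shows the shape: case A reuses a cached reader, while the B/C
case does its deletes and adds under one lock and leaves everything
closed.

```python
import threading


class IndexDb:
    """Hypothetical stand-in for a LuceneDb-like wrapper (names are assumptions)."""

    def __init__(self, open_reader, open_writer):
        # open_reader / open_writer are caller-supplied factories returning
        # objects with a close() method; they are not Lucene API calls.
        self._open_reader = open_reader
        self._open_writer = open_writer
        self._lock = threading.Lock()  # simplification: one mutex for everything
        self._reader = None            # cached reader, kept open between visits

    def _close_cached(self):
        if self._reader is not None:
            self._reader.close()
            self._reader = None

    def close_reader(self):
        """Close the cached reader if it is open."""
        with self._lock:
            self._close_cached()

    def visit_reader(self, visit):
        """Case A: open the reader on first use, then keep it open."""
        with self._lock:
            if self._reader is None:
                self._reader = self._open_reader()
            return visit(self._reader)

    def replace_docs(self, delete_old, add_new):
        """Case B/C: delete with a reader, then add with a writer, under one lock."""
        with self._lock:
            self._close_cached()            # drop the cached read-only reader
            reader = self._open_reader()    # reader used for deleting
            try:
                delete_old(reader)
            finally:
                reader.close()              # releases the write lock it held
            writer = self._open_writer()
            try:
                add_new(writer)
            finally:
                writer.close()              # ends with reader and writer closed
```

Because the mutex here also serializes plain readers, a real version
would use a read/write lock so that case A visits can run concurrently.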

I use the read/write lock from Doug Lea's util.concurrent: the read
lock for cases A and D, the write lock for cases B/C and C.
Opening the reader and the searcher is guarded by a separate lock to
prevent races between readers/searchers.
Finally I put a semaphore around the readers to limit the number of
concurrent readers to a maximum.
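
The locking layout above can be sketched roughly as follows. This is
an illustration only, not the util.concurrent API: Python's standard
library has no read/write lock, so ReadWriteLock here is a minimal
hand-rolled one without fairness guarantees, and IndexGuard,
max_readers, reading() and writing() are names invented for the
example.

```python
import threading
from contextlib import contextmanager


class ReadWriteLock:
    """Minimal read/write lock: many readers or one writer, no fairness."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    @contextmanager
    def read(self):
        with self._cond:
            while self._writing:           # wait until no writer is active
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:     # last reader out wakes waiting writers
                    self._cond.notify_all()

    @contextmanager
    def write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()


class IndexGuard:
    """Hypothetical per-index guard: read lock for A/D, write lock for B/C and C."""

    def __init__(self, max_readers=4):
        self._rw = ReadWriteLock()
        # The semaphore caps how many readers may be active at once.
        self._slots = threading.BoundedSemaphore(max_readers)

    @contextmanager
    def reading(self):
        with self._slots:
            with self._rw.read():
                yield

    @contextmanager
    def writing(self):
        with self._rw.write():
            yield
```

A search would then run inside "with guard.reading():" and an update
inside "with guard.writing():", with one guard per index.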

The whole thing works fine now, despite the underlying file system.
It keeps a dozen Lucene databases (half a gigabyte) open for searching
with very acceptable speed. And it's written in Jython, so I need only
half the number of lines...

<snip rest>

Regards,
Ype
