Also, the commit lock is there to allow the merge process to remove unused segments. Without it, a reader might get halfway through reading the segments, only to find some missing, and then have to restart reading from the beginning. In a highly interactive environment this would be too inefficient.

On Aug 18, 2006, at 3:52 PM, Michael McCandless wrote:


I don't think these changes are going to work. With multiple writers and/or readers doing deletes, without serializing the writes you will have inconsistencies - and the del files will need to be unioned.
That is:
station A opens the index
station B opens the index
station A deletes some documents, creating segment.del1
station B deletes some documents, creating segment.del2
When station C opens the index (or when the segment is merged), del1 and del2 need to be merged. The locking enforces that writers are serialized - you cannot remove this restriction unless you merge the writes when reading.
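
To make the "unioned" step in the scenario above concrete, here is a minimal sketch, assuming each del file can be modeled as a bit set with one bit per deleted document. DeletionUnion and its union method are hypothetical names for illustration; this is not Lucene's actual .del file handling.

```java
import java.util.BitSet;

// Hypothetical illustration only; not Lucene's actual .del file format or API.
final class DeletionUnion {
    private DeletionUnion() {}

    /**
     * Returns a new bit set marking every document deleted by either station.
     * Neither input is modified.
     */
    static BitSet union(BitSet del1, BitSet del2) {
        BitSet merged = (BitSet) del1.clone(); // copy so station A's set stays intact
        merged.or(del2);                       // add station B's deletions
        return merged;
    }
}
```

A reader (station C) would have to apply something like this to every pair of del files it finds, which is the extra work that serializing writers avoids.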

Sorry, I should be very clear: I am not proposing we remove the write
lock.  The write lock must definitely remain (for the reasons /
examples you list above).  Only one writer can be open at a time
against the index.

The commit lock, which is used to ensure that when an IndexReader
opens the index, no writer is changing it at that moment (and vice
versa), is I think the more problematic of the two.

The reason is, the write lock is really a safety net: it's up to you
to use Lucene in such a way that you never try to create two writers
at the same time.  You can use IndexModifier.  Or you can do your own
switching between IndexReader/IndexWriter.  Or you can use the patch
in LUCENE-565 so that IndexWriter is able to delete documents.  But in
all these cases, the write lock is really just a safety net: it
catches you if you accidentally violate this constraint and then you
go and fix your code accordingly.  You would typically catch this in
development / testing because it's a coding / design error.
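
As a rough illustration of the "safety net" idea, here is a minimal sketch of a lock file that rejects a second writer. It assumes a local filesystem where file creation is atomic; SimpleWriteLock and the lock file name are hypothetical and this is not Lucene's own lock implementation.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not Lucene's lock implementation.
final class SimpleWriteLock implements AutoCloseable {
    private final Path lockFile;

    private SimpleWriteLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Obtains the lock, or fails fast if another writer already holds it. */
    static SimpleWriteLock obtain(Path indexDir) throws IOException {
        Path lockFile = indexDir.resolve("write.lock"); // name chosen for illustration
        try {
            Files.createFile(lockFile); // check-and-create in one step
        } catch (FileAlreadyExistsException e) {
            // The safety net: a second writer is a coding / design error.
            throw new IllegalStateException("another writer already holds " + lockFile, e);
        }
        return new SimpleWriteLock(lockFile);
    }

    /** Releases the lock so a later writer can obtain it. */
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(lockFile);
    }
}
```

The second obtain fails immediately rather than waiting, which is the behavior you want for catching a design error during development.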

The commit lock is more troublesome because it really serves an active
purpose in typical Lucene apps when there's otherwise no app level
logic to synchronize opening an IndexReader vs when a writer is
committing.  The writers can commit whenever they want to (well,
IndexWriter at least).  And an IndexReader initialization is often
unpredictable (whenever you restart your app server instance, etc.).
So the timing of these events does require active serialization as
things stand now.

Because of this, an index stored on a remote store (e.g., NFS, Samba),
where our current locking implementation is known [silently] not to
work, will eventually cause an errant FileNotFound or an Access Denied
exception.  And this is insidious because it may work fine during
initial development and testing only to strike after some time in
production.  This is why I'd like to change commits to not require
locking at all (by never re-using the same file name), while keeping
the write locking.
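
One way "never re-using the same file name" could look is sketched below, assuming each commit writes a new file whose name carries an increasing generation number and readers simply open the highest generation they can find. GenerationalCommits, the segments_N naming, and the retry count are assumptions for illustration, not the actual proposed patch. The sketch also assumes a single writer (the write lock stays) and does not deal with a reader seeing a partially written file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.stream.Stream;

// Hypothetical sketch of write-once commit files; not the actual patch.
final class GenerationalCommits {
    private static final String PREFIX = "segments_";

    private GenerationalCommits() {}

    /** Returns the highest generation present in dir, or 0 if there is none. */
    static long latestGeneration(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files.map(p -> p.getFileName().toString())
                        .filter(n -> n.matches(PREFIX + "\\d+"))
                        .mapToLong(n -> Long.parseLong(n.substring(PREFIX.length())))
                        .max()
                        .orElse(0L);
        }
    }

    /** Writes a new commit file; an existing name is never overwritten. */
    static long commit(Path dir, String segmentList) throws IOException {
        long next = latestGeneration(dir) + 1;
        Files.write(dir.resolve(PREFIX + next),
                    segmentList.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE_NEW); // fails if the name already exists
        return next;
    }

    /** Reads the newest commit, retrying if it is deleted between list and read. */
    static String readLatest(Path dir) throws IOException {
        for (int attempt = 0; attempt < 5; attempt++) {
            long gen = latestGeneration(dir);
            if (gen == 0) {
                throw new NoSuchFileException(dir.toString(), null, "no commit found");
            }
            try {
                byte[] bytes = Files.readAllBytes(dir.resolve(PREFIX + gen));
                return new String(bytes, StandardCharsets.UTF_8);
            } catch (NoSuchFileException e) {
                // An older generation was removed under us; list again and retry.
            }
        }
        throw new IOException("could not read a stable commit in " + dir);
    }
}
```

Because a reader never sees an existing file change underneath it, the worst case is a retry against a newer generation instead of an errant exception.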

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


