Re: Lock-less commits

Michael McCandless Fri, 25 Aug 2006 04:17:50 -0700

If I'm understanding this suggestion correctly, the main change in
observable behavior will be that actions performed by a "reader" will
never block or invalidate actions performed by a "writer" -- writers on
the other hand can still block each other.


Yes this is true: here readers do not block writers (nor readers), a writer
blocks readers, and a writer blocks other writers.

This seems like it might be the opposite of what most people would want:
that opening "reader" threads for doing searches needs to be fast, and if a
writer thread has to wait a half second that's okay.


Right... this is an important point that I missed - in the numbered-files
approach a reader never has to wait, while in this suggestion readers may
need to wait for a writer that commits just now.


Yes ideally a reader should never have to wait.

In my local changes (using numbered files) for lock-less commits, I've
implemented Yonik's suggestion of opening segments in reverse order,
and this has definitely reduced the number of "retries" that the
searchers hit on opening the index. Even in highly interactive
searching (open searcher, do one search, close searcher, repeat) the
retry rate is low.
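To make the open-with-retry idea concrete, here is a minimal Python sketch. It is not Lucene's code: the `segments_N` file naming, the whitespace-separated file format, and the names `latest_generation` and `open_index` are all assumptions made for illustration. The reader takes no lock; it finds the newest numbered file, opens the segments it lists in reverse order, and starts over if a file vanishes underneath it.

```python
import os
import re

# Hypothetical naming scheme: each commit writes a new file "segments_N".
SEGMENTS_RE = re.compile(r"^segments_(\d+)$")


def latest_generation(directory):
    """Return the highest N among files named segments_N, or None."""
    gens = []
    for name in os.listdir(directory):
        m = SEGMENTS_RE.match(name)
        if m:
            gens.append(int(m.group(1)))
    return max(gens) if gens else None


def open_index(directory, max_retries=10):
    """Open every segment listed by the newest segments_N file, lock-free.

    Returns (generation, {segment_name: contents}, retries_needed).
    """
    for retries in range(max_retries + 1):
        gen = latest_generation(directory)
        if gen is None:
            raise FileNotFoundError("no segments_N file found")
        try:
            # Assumed format: segment file names separated by whitespace.
            with open(os.path.join(directory, "segments_%d" % gen)) as f:
                names = f.read().split()
            opened = {}
            # Open the listed segments in reverse order, as described above.
            for name in reversed(names):
                with open(os.path.join(directory, name)) as seg:
                    opened[name] = seg.read()
            return gen, opened, retries
        except FileNotFoundError:
            # A writer committed and removed files while we were opening:
            # start over from a fresh directory listing.
            continue
    raise RuntimeError("index kept changing; gave up after retries")
```

The only cost a reader pays for a concurrent commit is one more pass through the loop; it never waits on a lock held by a writer or by another reader.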

And if necessary we could further reduce retries by adding some small
[settable] pause into IndexWriter and/or not removing old segment files
until some time has passed (at the expense of increased temporary disk
usage). I'm currently not planning on doing either of these unless in
benchmarking I see performance regressions.
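The second option (keeping old segment files around for a while) could look roughly like the Python sketch below. This only illustrates the idea and is not something implemented in the changes described here; the `DelayedDeleter` class and its `grace_seconds` parameter are hypothetical names.

```python
import os
import time


class DelayedDeleter:
    """Remove obsolete files only after a grace period has passed."""

    def __init__(self, grace_seconds, clock=time.monotonic):
        self.grace_seconds = grace_seconds
        self.clock = clock      # injectable so the behavior can be tested
        self.pending = []       # list of (time marked obsolete, path)

    def mark_obsolete(self, path):
        """Record that a commit no longer references this file."""
        self.pending.append((self.clock(), path))

    def purge(self):
        """Delete files whose grace period expired; return their paths."""
        now = self.clock()
        keep, deleted = [], []
        for marked_at, path in self.pending:
            if now - marked_at >= self.grace_seconds:
                try:
                    os.remove(path)
                except FileNotFoundError:
                    pass        # already gone; nothing left to do
                deleted.append(path)
            else:
                keep.append((marked_at, path))
        self.pending = keep
        return deleted
```

A reader that started opening the previous commit still finds its files for up to `grace_seconds`, which is where the extra temporary disk usage comes from.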

Still it is interesting to notice that the way Lucene works today, reader
initializations also block one another, so readers initialize serially - each
reader needs to obtain a commit lock, initialize, and release the lock. In
this suggestion all readers initialize in parallel, and perhaps
re-initialize if a writer happens to commit just now.

I think this is one of the big improvements of switching to the
lock-less approach: readers will never wait on other readers, as they do
now.

Also, the way that writers do their work - most work is done out of the
"commit-window" - so the commit-window is both short and "relatively rare".

Agreed. This is nice because it already reduces the chance of retry (in
numbered files approach) or pause (in current Lucene or this proposal).

I also don't believe this would "solve" the NFS issues with regards to the
commit lock -- as I recall, the problem stems from NFS not being able to
guarantee transactional order of file operations (i.e., I open the commit
lock file, I modify and close segments, I close/delete the commit file --
a remote NFS client might still see the original segments file after the
commit file is deleted). Your version file might suffer the same fate
(with reader clients seeing V1==V2 because the whole file is a second
stale).


I thought that the (cooperative) lock-file related problems with NFS stem
from deleteFile(), which may return a failure code due to a timeout although
it actually succeeded, possibly causing the lock-releasing party to retry
deleting, but now erroneously deleting a lock file just obtained by another
process.
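The sequence described above can be replayed with a small in-memory Python simulation. This is a sketch of the suspected failure mode only, not of real NFS behavior; `FlakyLockDir`, `fail_next_delete`, and `replay_race` are hypothetical names, and the "timeout" is forced by hand.

```python
class FlakyLockDir:
    """In-memory stand-in for a directory whose delete may misreport failure."""

    def __init__(self):
        self.files = {}                 # lock file name -> owner
        self.fail_next_delete = False   # force one misreported delete

    def create(self, name, owner):
        """Create the lock file; fail if it already exists."""
        if name in self.files:
            return False
        self.files[name] = owner
        return True

    def delete(self, name):
        """Remove the file, but optionally report failure anyway."""
        existed = self.files.pop(name, None) is not None
        if self.fail_next_delete:
            self.fail_next_delete = False
            return False    # the file is gone, yet the caller is told "failed"
        return existed


def replay_race():
    """Replay: A releases, B acquires, A retries and removes B's lock."""
    d = FlakyLockDir()
    events = []
    d.create("commit.lock", "A")
    d.fail_next_delete = True
    if not d.delete("commit.lock"):          # succeeded, reported as failure
        events.append("A: delete reported failure")
        if d.create("commit.lock", "B"):     # B obtains the lock in the gap
            events.append("B: acquired lock")
        d.delete("commit.lock")              # A's retry deletes B's lock file
        events.append("A: retried delete")
    events.append("lock held by: %s" % d.files.get("commit.lock"))
    return events
```

At the end of the replay B still believes it holds the lock, but the lock file no longer exists, so a third process could acquire it as well.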

The RFC for NFS version 2 (http://tools.ietf.org/html/rfc1094) says: "All
of the procedures in the NFS protocol are assumed to be synchronous.  When
a procedure returns to the client, the client can assume that the operation
has completed and any data associated with the request is now on stable
storage."

So if a writer did actions { a1 , a2 } in this order and they completed, it
seems that a reader "seeing" the result of action a2 must also "feel" the
result of action a1. (This would prevent errors with the proposed version
number.) But I am no expert in NFS and may be wrong here.

Operations are indeed synchronous to the server, though NFS V3 does add
some support for asynchronous writes, eg see http://nfs.sourceforge.net.

The big problem is the client's caching. I've seen cases in my own
testing where the NFS cache on one machine remains stale for quite some
time (seconds) before "seeing" changes to a file on a server. I think
instead relying on a newly created file with the numbered approach (ie
never before used file name) will avoid the risk that a client-side
cache is presenting stale (or delayed) contents of a file.


Mike
