Chris Hostetter <[EMAIL PROTECTED]> wrote on 24/08/2006 23:46:39:

>
> If I'm understanding this suggestion correctly, the main change in
> observable behavior will be that actions performed by a "reader" will
> never block or invalidate actions performed by a "writer" -- writers on
> the other hand can still block each other.
>

Yes, this is true: here readers block neither writers nor other readers, a
writer blocks readers, and a writer blocks other writers.

> This seems like it might be the opposite of what most people would want:
> that opening "reader" threads for doing searches need to be fast, and if a
> writer thread has to wait a half second that's okay.

Right, this is an important point that I missed: in the numbered-files
approach a reader never has to wait, while in this suggestion a reader may
need to wait for a writer that is committing at that moment. Still, it is
worth noting that in Lucene today reader initializations also block one
another, so readers initialize serially: each reader must obtain the commit
lock, initialize, and release the lock. In this suggestion all readers
initialize in parallel, and re-initialize only if a writer happens to commit
during their initialization. Also, writers do most of their work outside the
"commit window", so that window is both short and relatively rare.

>
> I also don't believe this would "solve" the NFS issues with regards to the
> commit lock -- as I recall, the problem stems from NFS not being able to
> guarantee transactional order of file operations (ie: I open the commit
> lock file, I modify and close segments, I close/delete the commit file --
> a remote NFS client might still see the original segments file after the
> commit file is deleted).  Your version file might suffer the same fate
> (with reader clients seeing V1==V2 because the whole file is a second
> stale)

I thought that the (cooperative) lock-file problems with NFS stem from
deleteFile(), which may return a failure code because of a timeout even
though the delete actually succeeded. The party releasing the lock may then
retry the delete and erroneously remove a lock file that another process has
just obtained. The RFC for NFS version 2 (http://tools.ietf.org/html/rfc1094)
says: "All of the procedures in the NFS protocol are assumed to be synchronous.
When a procedure returns to the client, the client can assume that the
operation has completed and any data associated with the request is now on
stable storage." So if a writer performed actions { a1, a2 } in this order
and both completed, it seems that a reader that sees the result of a2 must
also see the result of a1. (This would prevent errors with the proposed
version number.) But I am no expert in NFS and may be wrong here.

>
>
> : Date: Thu, 24 Aug 2006 23:22:56 -0700
> : From: Doron Cohen <[EMAIL PROTECTED]>
> : Reply-To: java-dev@lucene.apache.org
> : To: java-dev@lucene.apache.org
> : Subject: Re: Lock-less commits
> :
> : I would like to discuss an additional approach, that requires small changes
> : to current Lucene implementation. Here, the index version (currently in
> : segments file) is maintained in a separate file, and is used to synchronize
> : between readers and writers, without requiring readers to create/obtain any
> : lock files, and without requiring readers to write anything to disk.
> :
> : - Index version would be maintained in a separate, dedicated Version file -
> :   (say .vsn) - one per index.
> : - Version file contains two occurrences of the version number - V1 and V2.
> : - In steady state, V1 == V2, but during update V1 == V2+1.
> : - Every commit would:
> :   - obtain a write lock (as today), to guarantee single writer at a time.
> :   - increment V1 in that file, using RandomAccessFile API (RAF).
> :     notice: now V1 != V2.
> :   - do the commit work (merge, delete, whatever).
> :   - increment V2 in that file, using RAF.
> :     notice: now, again, V1 == V2.
> :   - release the write lock (as today)
> : - Every reader would read the version data in opposite order:
> :   (1) read V2 from the version file, using RAF.
> :   (2) read V1 from the version file, using RAF.
> :   (3) if not V1==V2 wait some time, and try again (from step 1), until
> :       V1==V2, or timeout and fail.
> :   (4) initialize reader data (read segment infos, open files).
> :   (5) read again V2 then V1 using RAF.
> :   (6) if not V2==V1 or they changed from steps 1 and 2, try again (from
> :       step 1), or timeout and fail.
> :
> :
> : A few points to notice:
> : - Using RAF protects from errors due to IO buffering.
> : - Only tiny amount of version data is being read/written using RAF, so
> :   performance should not degrade.
> : - Readers are not writing any data, so they are faster (A reader that does
> :   deleteDoc is a writer in this regard).
> : - The opposite read/write order of RAF operations, i.e. writing V1 and then
> :   V2 by writer but reading V2 and then V1 by reader, protects from race
> :   conditions between readers and writers that otherwise might have caused
> :   reading corrupt data and concluding wrongly that the data is consistent
> :   while in fact it is not.
> : - By using RAF and by the order of operations above, this scheme would work
> :   also for NFS (excluding the write lock mechanism which remains an issue in
> :   NFS).
> : - For backward compatibility with current index structure the code can fall
> :   back to obtain a commit lock file if the .vsn file does not exist, and then
> :   create that .vsn file if it still does not exist.
> :
> : Regards,
> : Doron
> :
> :
> :
> : ---------------------------------------------------------------------
> : To unsubscribe, e-mail: [EMAIL PROTECTED]
> : For additional commands, e-mail: [EMAIL PROTECTED]
> :
>
>
>
> -Hoss
>
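
For illustration only, the commit and reader steps quoted above could be
sketched in Java roughly as follows. This is not Lucene code. The class name
VersionFile, its method names, the file layout (V1 as a long at offset 0, V2
as a long at offset 8), the "rwd" open mode, and the timeout and wait
parameters are all assumptions made for this sketch; only
java.io.RandomAccessFile is an existing API. The write lock is assumed to be
held by whoever calls commit().

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper, not part of Lucene. Assumed layout: V1 at offset 0, V2 at offset 8. */
final class VersionFile {
    private static final long V1_OFFSET = 0L;
    private static final long V2_OFFSET = 8L;
    private final File file;

    VersionFile(File file) throws IOException {
        this.file = file;
        if (file.length() < 16) { // missing or empty file: start in steady state, V1 == V2 == 0
            try (RandomAccessFile raf = new RandomAccessFile(file, "rwd")) {
                raf.seek(V1_OFFSET);
                raf.writeLong(0L);
                raf.seek(V2_OFFSET);
                raf.writeLong(0L);
            }
        }
    }

    /** Writer side. The caller is assumed to hold the write lock already. */
    void commit(Runnable commitWork) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rwd")) {
            increment(raf, V1_OFFSET); // now V1 == V2 + 1
            commitWork.run();          // merge, delete, whatever
            increment(raf, V2_OFFSET); // now V1 == V2 again
        }
    }

    private static void increment(RandomAccessFile raf, long offset) throws IOException {
        raf.seek(offset);
        long value = raf.readLong();
        raf.seek(offset);
        raf.writeLong(value + 1);
    }

    /** Reads V2 first and V1 second, the opposite of the write order. Returns {V2, V1}. */
    private long[] readV2ThenV1() throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(V2_OFFSET);
            long v2 = raf.readLong();
            raf.seek(V1_OFFSET);
            long v1 = raf.readLong();
            return new long[] { v2, v1 };
        }
    }

    /** Reader side, steps (1)-(6). Returns the version the reader initialized against. */
    long openReader(Runnable initReader, long timeoutMillis, long waitMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            long[] before = readV2ThenV1();            // steps (1) and (2)
            if (before[0] == before[1]) {              // step (3): no commit in progress
                initReader.run();                      // step (4)
                long[] after = readV2ThenV1();         // step (5)
                if (after[0] == after[1] && after[0] == before[0]) {
                    return after[0];                   // step (6): nothing changed meanwhile
                }
            }
            if (System.currentTimeMillis() >= deadline) {
                throw new IOException("timed out waiting for a stable index version");
            }
            Thread.sleep(waitMillis);                  // wait some time, then retry from step (1)
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("index", ".vsn");
        f.deleteOnExit();
        VersionFile vf = new VersionFile(f);
        vf.commit(() -> { /* commit work would go here */ });
        long version = vf.openReader(() -> { /* read segment infos here */ }, 1000, 50);
        System.out.println("reader opened at version " + version);
    }
}
```

Note that in this sketch a commit that fails between the two increments
leaves V1 != V2, so readers would time out until something repairs the file;
the quoted proposal does not say how that case should be handled.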