I'll pull together a patch to allow FSDirectory/FSIndexOutput to
optionally do flush() & sync() before every close().

Lucene is already "atomic" (the "A" in ACID): it's very careful to do
the IO operations in order such that the index is never in an
inconsistent state ASSUMING the IO system completes or
fails-to-complete the operations *in order*.

The problem is that on a hard shutdown (kill -9 or JVM/machine
crashes), apparently later operations may have completed while some
earlier operations have not.  For example, the new segments_N file was
successfully written while, say, the _X.fdx file of the just-flushed
segment was not, even though Lucene had written & closed _X.fdx
before segments_N.

It's this out-of-order completion of the IO operations on a hard
shutdown that leads to an inconsistent index.
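
To make the ordering concrete, here is a rough sketch of the write
sequence with a sync barrier between the segment file and the commit
point.  This is not Lucene code: the class and method names are made up
for illustration, and the file names are just the ones from the example
above.  Whether sync() really reaches the physical device still depends
on the OS and hardware.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical illustration only; not part of Lucene.
class OrderedCommitSketch {

    // Write the bytes and do not return until sync() has completed.
    static void writeDurably(File file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data);
            // FileOutputStream has no buffer of its own, so flush() matters
            // only if a buffering wrapper sits on top; kept to mirror the
            // flush-then-sync order discussed in this thread.
            out.flush();
            // Ask the OS to push the bytes to the device.
            // May throw SyncFailedException (an IOException).
            out.getFD().sync();
        }
    }

    // The segment file must be durable before the commit point that
    // references it is written.
    static void commit(File dir, byte[] segmentData, byte[] commitData)
            throws IOException {
        writeDurably(new File(dir, "_X.fdx"), segmentData);
        writeDurably(new File(dir, "segments_N"), commitData);
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("ordered-commit").toFile();
        commit(dir, new byte[] {1, 2, 3}, new byte[] {4});
        System.out.println("wrote " + dir);
    }
}
```

Without the sync() calls the two files are still written and closed in
the same order, but nothing stops the IO system from persisting
segments_N first.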

Using autoCommit=false just means your commits are less frequent: the
"atomic" transaction is all changes done during the lifetime of that
one writer, versus all changes done since the last flush() when
autoCommit=true.  However, autoCommit=false cannot fully eliminate the
chance of corruption due to out-of-order completion of IO operations.

It sounds like inserting a flush() then sync() call before every
close() *might* in fact force the IO system to attain in-order
completion of the operations Lucene sends it, at the cost of some
performance loss.  I say *might* because there are so many layers to
an IO system that it's not clear that the fsync() that the JVM is in
fact calling (and relying on) will really always do the right thing.

The performance hit could in practice be low, especially if you are
using a large RAM buffer in your writer.
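
Roughly, the optional behavior could look like the sketch below.  The
class name and the syncOnClose flag are hypothetical (this is not the
actual patch, and not an existing Lucene class); the only real APIs
used are FileOutputStream and FileDescriptor.sync().

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper: an output stream that can optionally
// flush() and sync() before it closes the underlying file.
class SyncingFileOutput extends OutputStream {
    private final FileOutputStream out;
    private final boolean syncOnClose;
    private boolean closed;

    SyncingFileOutput(File file, boolean syncOnClose) throws IOException {
        this.out = new FileOutputStream(file);
        this.syncOnClose = syncOnClose;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
    }

    @Override
    public void flush() throws IOException {
        out.flush();
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // closing twice is harmless
        }
        closed = true;
        try {
            if (syncOnClose) {
                out.flush();
                // May throw SyncFailedException if the system cannot
                // guarantee the buffers reached the physical media.
                out.getFD().sync();
            }
        } finally {
            out.close(); // always release the file handle
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("sync-sketch", ".bin");
        try (SyncingFileOutput o = new SyncingFileOutput(f, true)) {
            o.write(new byte[] {1, 2, 3, 4}, 0, 4);
        }
        System.out.println("wrote " + f);
    }
}
```

With syncOnClose=false this behaves like a plain FileOutputStream, so
people on NFS or people who don't want the performance cost could
leave it off.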

Mike

"Mark Miller" <[EMAIL PROTECTED]> wrote:
> I think it would be great if it were an option. Perhaps even a sandbox 
> implementation that could wrap or replace a few classes. Maybe that 
> complicates things too much, but that way you could just not use the 
> transaction system if you were on NFS (if that ends up being a problem) 
> or you didn't want to pay a performance cost.
> 
> Maybe a Derby guy would pitch in.
> 
> Chris Hostetter wrote:
> > : This is simply not true. See FileDescriptor.sync().
> > : 
> > : There are several options, but normally it is used so that when close
> > : completes, all data must be on disk. This is a much slower way to write
> > : data.
> > : It is very common in database systems when committing the log file.
> >
> > Ok.  I'll certainly take your word for it ... I've been trusting the docs 
> > for [File]OutputStream.flush()...
> >
> >   
> >>> If the intended destination of this stream is an abstraction provided 
> >>> by the underlying operating system, for example a file, then flushing 
> >>> the stream guarantees only that bytes previously written to the stream 
> >>> are passed to the operating system for writing; it does not guarantee 
> >>> that they are actually written to a physical device such as a disk drive.
> >>>       
> >
> > I haven't looked at the internals of FileOutputStream or FileDescriptor on 
> > any particular platforms to see how exactly they work, but if dealing with 
> > the FD directly and using FD.sync() is the magic bullet then I'd love to see 
> > a patch that uses it in FSDirectory.
> >
> > I assume the SyncFailedException it throws is rare?  If it is always 
> > thrown when using things like NFS that may be a show stopper for using 
> > sync() in Lucene ... many people have jumped through a lot of hoops this 
> > past year to get Lucene working on NFS; I'd hate to see all that work go 
> > out the window in an effort to make Lucene ACID.  (I suspect there are 
> > more users interested in using Lucene on NFS than on using it as a 
> > transactional data store)
> >
> >   
> >>> Throws:
> >>>    SyncFailedException - Thrown when the buffers cannot be flushed, or 
> >>> because the system cannot guarantee that all the buffers have been 
> >>> synchronized with physical media.
> >>>       
> >
> > -Hoss
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
> >   
> 
