Michael Busch wrote:

On 2/26/09 1:50 PM, Michael McCandless wrote:

Michael Busch wrote:

On 2/24/09 4:05 AM, Michael McCandless wrote:

I believe we still need this, for remote filesystems (like NFS) that have inconsistent client-side caching.

The fsync() ensures the local IO system has moved the bytes/file metadata to stable storage, but I'd expect remote caches would still potentially be stale.
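To make the distinction above concrete, here is a minimal sketch of what "fsync" looks like from Java, using FileChannel.force(true). The class name FsyncSketch and the method writeAndSync are hypothetical names for illustration, not Lucene code. The call only asks the local OS to push bytes and metadata to its storage device; it gives no guarantee about what another NFS client's cache will show afterwards.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; not part of Lucene.
class FsyncSketch {

    /** Writes data to a new file, then forces content and metadata to storage. */
    static void writeAndSync(Path file, byte[] data) {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Java's counterpart to fsync(): flush file content and metadata
            // from the local OS buffers to the underlying device.
            ch.force(true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fsync-sketch");
        Path file = dir.resolve("example.bin");
        writeAndSync(file, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("bytes on disk: " + Files.size(file));
    }
}
```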


We could have an expert API to turn using the .gen file on/off? And then default it to off in 3.0.

We could do that, though I'd default to leaving it on?

I said off, because I think most people don't use NFS (and we don't even recommend using it for performance reasons). But defaulting to on would be ok too, to be on the safe side.

OK. I'm not certain other filesystems aren't affected, so being defensive seems best...

I think the .gen file and the CompoundFileWriter are the only places left where we overwrite (parts of) files? To change the latter we could move the cfs header to the segments file.


TermInfosWriter also seeks & writes a header; see here (enabling Lucene to write directly to HDFS):

   http://issues.apache.org/jira/browse/LUCENE-532
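As a rough illustration of why a seek-and-rewrite header conflicts with "write once", the sketch below contrasts two ways of recording an entry count. All names here (HeaderVsTrailerSketch and its methods) are hypothetical, and the trailer layout is just one generic append-only alternative, not necessarily what LUCENE-532 or this thread proposes.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not Lucene's actual file formats.
class HeaderVsTrailerSketch {

    /** Not write-once: reserves a header slot, then seeks back to overwrite it. */
    static void writeWithBackPatchedHeader(Path file, int[] values) {
        try (RandomAccessFile out = new RandomAccessFile(file.toFile(), "rw")) {
            out.setLength(0);
            out.writeInt(0);          // placeholder for the count
            int count = 0;
            for (int v : values) {
                out.writeInt(v);
                count++;
            }
            out.seek(0);              // this seek + overwrite is what an
            out.writeInt(count);      // append-only filesystem cannot do
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Write-once: strictly sequential output, with the count appended last. */
    static void writeWithTrailer(Path file, int[] values) {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            int count = 0;
            for (int v : values) {
                out.writeInt(v);
                count++;
            }
            out.writeInt(count);      // trailer instead of header
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the count from the first four bytes. */
    static int readCountFromHeader(Path file) {
        try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the count from the last four bytes; seeking on read is fine. */
    static int readCountFromTrailer(Path file) {
        try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
            in.seek(in.length() - 4);
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both writers record the same information; only the second never revisits bytes it has already written.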

What situation are you seeing that requires absolute "write once"?

Actually, I personally don't need Lucene to be "write once". I started this thread about the segments.gen file because in our project (using Lucene 2.4.0) we sometimes need to roll back to a previous commit point that we keep around with the SnapshotDeletionPolicy. To get rid of the newest commit point we simply delete the most recent segments file. But then we also have to delete the segments.gen file; otherwise Lucene will read the generation from it and try to find the segments file we deleted. Lucene will then recreate the segments.gen file. This made me think that deleting and recreating segments.gen is not very clean, especially because we use a local FS and don't even need the .gen file.
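The manual rollback described above could be sketched roughly as follows, using plain java.nio.file rather than any Lucene API. The class and method names are hypothetical. The code assumes commit files are named segments_N with N encoded in base 36 (Character.MAX_RADIX), and it ignores any legacy file named just "segments". It should only be run while no IndexWriter or reader has the index open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the manual procedure; not Lucene code.
class RollbackSketch {

    private static final String PREFIX = "segments_";

    /** Returns the generation encoded in a segments_N name, or -1 otherwise. */
    static long generationOf(String fileName) {
        if (!fileName.startsWith(PREFIX)) {
            return -1;   // e.g. "segments.gen" or an unrelated index file
        }
        try {
            // Assumption: N is written in base 36.
            return Long.parseLong(fileName.substring(PREFIX.length()),
                                  Character.MAX_RADIX);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /**
     * Deletes the newest segments_N file and segments.gen, so that the
     * previous commit point becomes the latest one on disk.
     * Returns the name of the deleted segments file, or null if none exists.
     */
    static String dropNewestCommit(Path indexDir) {
        try {
            String newest = null;
            long newestGen = -1;
            try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
                for (Path p : files) {
                    String name = p.getFileName().toString();
                    long gen = generationOf(name);
                    if (gen > newestGen) {
                        newestGen = gen;
                        newest = name;
                    }
                }
            }
            if (newest == null) {
                return null;
            }
            Files.delete(indexDir.resolve(newest));
            // Without this, the stale generation in segments.gen would point
            // at the segments file that was just removed.
            Files.deleteIfExists(indexDir.resolve("segments.gen"));
            return newest;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```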

Ahh... OK. In trunk/2.9, you can explicitly open an IndexWriter on a prior commit point, which will take care of the gen file for you.

I mentioned the "write once" issue because I think that with disabling the .gen file and all your recent changes (especially lockless commits) we've almost accomplished "write once", so we might as well change the last two places that don't comply (CFSWriter and TermInfosWriter) to resolve LUCENE-532. Not sure how many users there are out there who really need it though.

OK. Writing an index to HDFS is the only "real" case I know about so far... would be curious if there are others.

Mike

