Michael Busch wrote:
On 2/26/09 1:50 PM, Michael McCandless wrote:
Michael Busch wrote:
On 2/24/09 4:05 AM, Michael McCandless wrote:
I believe we still need this, for remote filesystems (like NFS)
that have inconsistent client-side caching.
The fsync() ensures the local IO system has moved the bytes/file
metadata to stable storage, but I'd expect remote caches would
still potentially be stale.
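For readers unfamiliar with the fsync() step mentioned above, here is a minimal sketch of it in plain Java. It is not Lucene's code: the class and method names (`FsyncSketch`, `writeDurably`) are hypothetical, and it uses `FileChannel.force(true)` from the JDK as the fsync equivalent. The JDK only guarantees the flush for files on a local storage device, which fits the concern about remote client-side caches.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Lucene.
final class FsyncSketch {

    /** Writes data to file, then asks the OS to flush content and metadata to the device. */
    static void writeDurably(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);          // a single write() may be partial, so loop
            }
            // force(true) flushes file content and metadata, like fsync().
            // It says nothing about what another NFS client's cache will show.
            ch.force(true);
        }
    }
}
```

A reader on a different machine could still see a stale directory listing after this call returns, which is the gap the .gen file is meant to cover.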
We could have an expert API to turn using the .gen file on/off?
And then default it to off in 3.0.
We could do that, though I'd default to leaving it on?
I said off, because I think most people don't use NFS (and we don't
even recommend using it for performance reasons). But defaulting to
on would be ok too, to be on the safe side.
OK. I'm not certain other filesystems aren't affected, so being
defensive seems best...
I think the .gen file and the CompoundFileWriter are the only
places left where we overwrite (parts of) files? To change the
latter we could move the cfs header to the segments file.
TermInfosWriter also seeks & writes a header; see here (enabling
Lucene to write directly to HDFS):
http://issues.apache.org/jira/browse/LUCENE-532
What situation are you seeing that requires absolute "write once"?
Actually I personally don't need Lucene to be "write once". The
reason why I started this thread about the segments.gen file was
that in our project we sometimes need to roll back to a previous
commit-point (using Lucene 2.4.0) that we keep around with the
SnapshotDeletionPolicy. To get rid of the newest commit-point we
simply delete the most recent segments file. But then we also have
to delete the segments.gen file, otherwise Lucene will read the
generation from it and try to find the segments file we deleted.
Then Lucene will recreate the segments.gen file. This just made me
think that this is not very clean (deleting and recreating the
segments.gen) especially because we use a local FS and don't even
need the .gen file.
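The manual rollback described above could be sketched roughly as follows, using only the JDK. This is an illustration, not Lucene code: the names `CommitRollbackSketch` and `dropNewestCommit` are hypothetical, and it assumes that commit files are named `segments_N` with N being the generation in base 36, that no IndexWriter or IndexReader has the index open, and that an older commit point has been kept around.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Lucene.
final class CommitRollbackSketch {
    private static final String PREFIX = "segments_";

    /** Returns the generation encoded in a segments_N name, or -1 if the name does not match. */
    static long generationOf(String fileName) {
        if (!fileName.startsWith(PREFIX) || fileName.length() == PREFIX.length()) {
            return -1;
        }
        try {
            // Assumption: the suffix is the generation in radix 36.
            return Long.parseLong(fileName.substring(PREFIX.length()), Character.MAX_RADIX);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Deletes the newest segments_N file and segments.gen; returns the dropped generation. */
    static long dropNewestCommit(Path indexDir) throws IOException {
        Path newest = null;
        long newestGen = -1;
        int commits = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, PREFIX + "*")) {
            for (Path f : files) {
                long gen = generationOf(f.getFileName().toString());
                if (gen < 0) {
                    continue;           // not a commit file
                }
                commits++;
                if (gen > newestGen) {
                    newestGen = gen;
                    newest = f;
                }
            }
        }
        if (commits < 2) {
            // Refuse to delete the only commit point; nothing would be left to roll back to.
            throw new IllegalStateException("need at least two commit points in " + indexDir);
        }
        Files.delete(newest);
        // Remove the stale pointer too, otherwise the deleted generation is looked up again.
        Files.deleteIfExists(indexDir.resolve("segments.gen"));
        return newestGen;
    }
}
```

The sketch leaves the segment data files of the dropped commit in place, on the assumption that the next IndexWriter opened on the directory cleans up unreferenced files.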
Ahh... OK. In trunk/2.9, you can explicitly open an IndexWriter on a
prior commit point, which will take care of the gen file for you.
I mentioned the "write once" issue because I think with disabling
the .gen file and all your recent changes (especially lockless
commits) we've almost accomplished "write once", so we might as well
change the last two places that don't comply (CFSWriter and
TermInfosWriter), to resolve LUCENE-532. Not sure how many users
there are out there who really need it though.
OK. Writing an index to HDFS is the only "real" case I know about so
far... would be curious if there are others.
Mike