I am not happy with complicating the readers like this: conceptually
adding back commit locks (for deletion), this time with a keep-alive
thread, and again making readers not read-only.

To my understanding, the only remaining issue with NFS is that a reader
might get an IO exception if the writer removes an old file that
the reader is still using.

We are not trying to solve a possible corruption here, right?

For that alone, I don't think it is worth adding that stuff back.

A writer's "two steps" policy - delete only files that
"would not have been in use unless a reader failed to refresh for X minutes" -
is "fair enough" I think.

By "two steps" I mean: start measuring time not from when the segment to
be deleted was created, but rather from when its "next generation" was
created.
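
To make the timing rule concrete, here is a minimal sketch in Python. It is
not Lucene code: Commit, deletable_commits and grace_seconds are names made
up for illustration, and it assumes each commit records the time it was
written. A commit becomes deletable only once its successor has existed for
longer than the grace period.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Commit:
    generation: int    # hypothetical: the N in segments_N
    created_at: float  # hypothetical: time this commit was written, in seconds


def deletable_commits(commits: List[Commit], now: float,
                      grace_seconds: float) -> List[Commit]:
    """Return commits whose next generation is older than grace_seconds."""
    ordered = sorted(commits, key=lambda c: c.generation)
    result = []
    # Pair each commit with its successor. The newest commit has no
    # successor, so it is never considered for deletion.
    for old, successor in zip(ordered, ordered[1:]):
        # The clock starts when the successor was created, i.e. when
        # 'old' became obsolete, not when 'old' itself was created.
        obsolete_for = now - successor.created_at
        if obsolete_for > grace_seconds:
            result.append(old)
    return result
```

For example, with commits written at t=0, t=1000 and t=1500, a grace period
of 600 seconds and now=1700, only the first commit is deletable. The second
one was created 700 seconds ago but was obsoleted only 200 seconds ago, so a
reader that refreshes at least every 600 seconds can still be using it.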

Michael McCandless <[EMAIL PROTECTED]> wrote on 18/01/2007 14:24:16:

> Marvin Humphrey wrote:
> >
> > On Jan 17, 2007, at 1:16 PM, Michael McCandless wrote:
> >
> >> This is the solution I have in mind for LUCENE-710: change the
> >> IndexFileDeleter so that instead of always immediately deleting the
> >> last commit when a new commit happens, allow some time before doing
> >> so.  This way readers have a chance to refresh.  The actual time would
> >> be settable by the developer.  So if you set it to 6 hours, then, a
> >> commit would remain usable for at least 6 hours after it had been
> >> obsoleted by a new commit.  This means if you can ensure your readers
> >> refresh within 6 hours of a new commit happening, then the writer will
> >> never delete an "in-use" commit.
> >
> > I've been mulling this over.  If you set the interval to 6 hours, and
> > there's a lot of churn (e.g. if you optimize frequently), you'll end up
> > with a lot of wasted disk space.  On the flip side, the user has to set
> > up some sort of trigger for refreshing the IndexReaders anyway.  It's
> > still not user-friendly by default, and we'd be polluting the API with a
> > hateful workaround.
>
> Well, 6 hours would be a long time for such a high turnover site.
> They would presumably set the time to something like 10 minutes
> instead.
>
> I think we should decouple the deletion policy from commits.  This way
> developers could subclass and make their own deletion policy that
> suits their application.  The IndexFileDeleter base class would do all
> the legwork to keep ref counts to all specific index files based on
> all segments_N commits that are still "live".  Then the deletion
> policy just decides which commits should be deleted, when.  (This is
> roughly what's outlined in LUCENE-710).
>
> The current policy is to delete all prior commits after a new commit
> and that would remain the default.
>
> Chuck's idea (reference counting via filesystem) would be another
> policy.  My proposal (delete by time after being obsoleted) would be
> another policy, etc.
>
> > The real problem is NFS.  For background, see
> > <http://nfs.sourceforge.net/#section_d>, item D2, which deals with NFS
> > and "delete on last close".
> >
> > Now I wonder.  Version 4 of the NFS protocol introduces state, so it's
> > possible to implement file locking.  Can we lock a segments file, then
> > have IndexFileDeleter detect which segments are locked that way?  And if
> > that's the case, can we detect whether the locking mechanism is failing
> > and throw an exception if someone tries to use an earlier version of NFS?
>
> Locking and NFS makes me very nervous :)
>
> > I'd be cool with making it impossible to put an index on an NFS volume
> > prior to version 4.  That puts the blame where it belongs.
>
> Well, most times users have no control over which NFS server and/or
> client version is in use, so I think taking this approach of "pinning
> the blame" can only hurt our users.  I would rather find a solution
> that's more portable, if we can (like the ref counting idea Chuck
> brought up).
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>

