The basic idea is to change all commits (from SegmentReader or
IndexWriter) so that we never write to an existing file that a reader
could be reading from. Instead, always write to a new file name using
sequentially numbered files. For example, for "segments", on every
commit, write to the sequence: segments.1, segments.2, segments.3,
etc. Likewise for the *.del and *.fN (norms) files that
SegmentReaders write to.
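The generation-numbered naming could be sketched like this (a minimal illustration; the helper names are hypothetical, not Lucene's actual API):

```java
class SegmentsFileNames {
    // Hypothetical helper: each commit writes a new, sequentially
    // numbered segments file, so no existing file is ever overwritten.
    static String segmentsFileName(int generation) {
        return "segments." + generation;
    }

    static String fileNameAfterCommit(int currentGeneration) {
        // A commit never rewrites segments.N; it writes segments.(N+1).
        return segmentsFileName(currentGeneration + 1);
    }
}
```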
Interesting idea...
How do you get around races between opening and deleting?
I assume for the writer, you would
1) write the new segment files
2) write new 'segments.3'
3) delete unused segments (those referenced by 'segments.2')
But what happens when a reader comes along at point 1.5, say, opens
the latest 'segments.2' file, and then at point 3.5 tries to open some
of the segment files it references?
I guess the reader could retry... checking for a new segments file.
This could happen more than once (hopefully it wouldn't lead to
starvation... that would be unlikely).
Yes, exactly.
And specifically, the reader only retries if, on hitting a FileNotFound
exception, it then checks & sees that a newer segments file is
available. This way if there is a "true" FileNotFound exception due to
some sort of index corruption or something, we will [correctly] throw it.
It could in theory lead to starvation but this should be rare in
practice unless you have an IndexWriter that's constantly committing.
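That retry rule could be sketched roughly like this (illustrative only: tryOpen and the file names are made up to simulate opening a commit against a directory listing):

```java
import java.io.FileNotFoundException;
import java.util.Set;

class RetryOnStaleCommit {
    // Simulated open: succeeds only if the segments file and the one
    // segment file it "references" are both still present.
    static int tryOpen(Set<String> dir, int gen) throws FileNotFoundException {
        if (!dir.contains("segments." + gen) || !dir.contains("_seg" + gen + ".cfs")) {
            throw new FileNotFoundException("generation " + gen);
        }
        return gen;
    }

    static int openNewestCommit(Set<String> dir, int startGen)
            throws FileNotFoundException {
        int gen = startGen;
        while (true) {
            try {
                return tryOpen(dir, gen);
            } catch (FileNotFoundException e) {
                // Retry only when a newer segments file exists; otherwise the
                // missing file indicates real corruption, so rethrow.
                if (dir.contains("segments." + (gen + 1))) {
                    gen++;
                } else {
                    throw e;
                }
            }
        }
    }
}
```

So a reader that opened a stale commit quietly advances to the newer one, while a genuinely missing file still surfaces as an exception.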
Also note that this should be no worse than what we have today, where
you would also likely hit starvation and get a "Lock obtain timed out"
thrown (eg see http://issues.apache.org/jira/browse/LUCENE-307).
In my stress test (a shared index with the writer accessing it over NFS
and three reader threads repeatedly doing "open IndexSearcher; search"
over a Samba share) the IndexSearchers do retry, but so far never more
than once. Of course this will depend heavily on the details of the use
case ...
We can also get rid of the "deletable" file (and the associated errors
renaming deletable.new -> deletable) because we can compute what's
deletable as "whatever is not referenced by the current segments
file."
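That computation could be sketched as simple set difference (hypothetical names; in reality the referenced set would be read out of the segments file itself):

```java
import java.util.HashSet;
import java.util.Set;

class DeletableFiles {
    // Hypothetical helper: anything the current segments file does not
    // reference (and that is not the segments file itself) is deletable,
    // so no separate "deletable" file needs to be maintained on disk.
    static Set<String> deletable(Set<String> allFiles,
                                 Set<String> referenced,
                                 String currentSegmentsFile) {
        Set<String> dead = new HashSet<>(allFiles);
        dead.removeAll(referenced);
        dead.remove(currentSegmentsFile);
        return dead;
    }
}
```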
If the segments file is written last, how does an asynchronous deleter
tell what will be part of a future index? I guess it's doable if all
file types have sequence numbers...
Well, in my current implementation I don't have a truly asynchronous
deleter. If I did have that then you're right I'd need to not delete
the "new and in progress" files. We could consider something like that
in the future ...
Instead, I still do all deletes [synchronously] in the same places as
the current code, with the write lock held. For example, during a
commit, we delete old segments immediately after writing the new
segments file, and then again after creating a compound file (if index
is using compound files). Likewise when a SegmentReader commits new
deletes/norms.
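That ordering could be sketched with a toy that only records the sequence of steps (none of these names are Lucene's real API):

```java
import java.util.ArrayList;
import java.util.List;

class CommitOrderSketch {
    // Records the order of commit-time operations, for illustration only.
    final List<String> log = new ArrayList<>();
    int generation;
    final boolean useCompoundFile;

    CommitOrderSketch(int generation, boolean useCompoundFile) {
        this.generation = generation;
        this.useCompoundFile = useCompoundFile;
    }

    void commit() {
        generation++;
        log.add("write segments." + generation);  // new commit point first
        log.add("delete old files");              // synchronous, write lock held
        if (useCompoundFile) {
            log.add("create compound file");
            log.add("delete old files");          // per-segment files now unreferenced
        }
    }
}
```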
Also, one neat possibility this could enable in the future is
explicitly keeping "virtual snapshots" at points in time, but within a
single index (vs. e.g. the hard-link snapshots that Solr does).
For example, if you want to index a bunch of docs but not yet make them
visible for searching, with the current code you have to make sure
never to restart an IndexSearcher. If your app server goes down (say),
all IndexSearchers will come back up and make your indexed docs
visible.
But with this new approach (plus some additional code that I'm not
planning on doing for starters), it would be possible for an
IndexSearcher to explicitly say "I'd like to re-open the snapshot of the
index as of 3 days ago", for example. This would require more smarts in
the reclaiming of old files ... but at least this could be a first step
towards that.
Mike