Team, I've been struggling to find a clean solution for LUCENE-710 when I thought of a simple addition to Lucene ("explicit commits") that I think would resolve LUCENE-710 and also fix a few other outstanding issues when readers are using a "live" index (one being updated by a writer).
The basic idea is to add an explicit "commit" operation to Lucene. This is the same nice feature Solr has, but with a different implementation (in Lucene core, within a single index). The commit makes a "point in time" snapshot (term borrowed from Solr!) available for searching. The implementation is surprisingly simple (see below) and completely backwards compatible. I'd like to get some feedback on the idea and implementation.

Details: right now, Lucene writes a new segments_N file at various times: when a writer (or a reader that's writing deletes/norms) needs to flush its pending changes to disk; when a writer merges segments; when a writer is closed; multiple times during optimize/addIndexes; etc. These times are neither controllable nor predictable for the developer using Lucene. A new reader always opens the last segments_N written, and when a reader uses isCurrent() to check whether it should re-open (the suggested way), that method always returns false (meaning you should re-open) if there are any new segments_N files. So the developer has little control over what state the index is in when a reader is [re-]opened. People work around this today by adding logic above Lucene so that the writer separately tells the readers when it's a good time to refresh. With "explicit commits", readers could instead look directly at the index and pick the right segments_N to refresh to.

I'm proposing that we separate the writing of a new segments_N file into writes that are done automatically by Lucene (I'll call these "checkpoints") and commits that are meaningful to the application, done explicitly by the developer at known times (I'll call this "committing a snapshot").
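To make the checkpoint/snapshot distinction concrete, here is a tiny in-memory model, plain Java with no Lucene classes at all, and all names are illustrative, not the proposed API. It just shows the visibility rule I'm after: the writer's intermediate state (checkpoints) stays invisible to readers until an explicit commit publishes a new snapshot.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model of explicit commits: "committed" is the snapshot readers
// search against; "pending" is the writer's checkpointed state.  Only
// commit() makes pending changes visible to readers.
public class ExplicitCommitModel {

    private Set<String> committed;                        // the snapshot readers see
    private final Set<String> pending = new HashSet<>();  // the writer's working state

    ExplicitCommitModel(Set<String> initial) {
        committed = new HashSet<>(initial);
        pending.addAll(initial);
    }

    void deleteDocument(String id) { pending.remove(id); }  // checkpoint only
    void addDocument(String id)    { pending.add(id); }     // checkpoint only
    void commit()                  { committed = new HashSet<>(pending); } // publish snapshot
    Set<String> readerView()       { return committed; }    // what a refreshed reader opens

    public static void main(String[] args) {
        ExplicitCommitModel index = new ExplicitCommitModel(
                new HashSet<>(Arrays.asList("doc1", "doc2")));
        index.deleteDocument("doc1");   // batch delete...
        index.addDocument("doc1-v2");   // ...then batch add
        // Readers still see the old snapshot; no half-updated state is visible.
        System.out.println(index.readerView().contains("doc1"));    // true
        System.out.println(index.readerView().contains("doc1-v2")); // false
        index.commit();
        System.out.println(index.readerView().contains("doc1"));    // false
        System.out.println(index.readerView().contains("doc1-v2")); // true
    }
}
```

With autoCommit=true (today's behavior), every flush would effectively call commit() itself; with autoCommit=false, the application decides when readers get to see a new snapshot.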
I would add a new boolean mode to IndexWriter called "autoCommit", and a new public method commit() to IndexWriter and IndexReader (we'd have to rename the current protected commit() in IndexReader).

When autoCommit is true, every write of a segments_N file will "commit a snapshot", meaning readers will then use it for searching. This will be the default, and it is exactly how Lucene behaves today, so this change is completely backwards compatible. When autoCommit is false, a segments_N file that Lucene chooses to save is just a "checkpoint": a reader would not open or re-open to the checkpoint. The developer must then call IndexWriter.commit() or IndexReader.commit() in order to "commit a snapshot" at the right time, thereby telling readers that this segments_N file is a valid one to switch to for searching.

The implementation is very simple (I have an initial coarse prototype working with all but the last bullet):

* If a segments_N file is just a checkpoint, it's named "segmentsx_N" (note the added 'x'); if it's a snapshot, it's named "segments_N". No other changes to the index format.

* A reader by default opens the latest snapshot, but can optionally open a specific snapshot N (segments_N).

* A writer by default starts from the most recent checkpoint, but may also take a specific checkpoint or snapshot N (segments_N) to start from (to allow rollback).

* Change IndexReader.isCurrent() to check whether there are any newer snapshots, but disregard newer checkpoints.

* When a writer is in autoCommit=false mode, it always writes to the next segmentsx_N; otherwise it writes to segments_N.

* The commit() method would just write to the next segments_N file and return the N it had written (in case the application needs to re-use it later).
* IndexFileDeleter would need a slightly smarter policy when autoCommit=false, i.e., "don't delete anything that is referenced by one of the past N snapshots, or whose snapshot was obsoleted less than X minutes ago".

I think there are some compelling things this could solve:

* The "delete then add" problem (really a special but very common case of general transactions): right now, when you want to update a bunch of documents in a Lucene index, it's best to open a reader, do a batch delete, close the reader, open a writer, do a batch add, and close the writer. This is the suggested way. The open risk is that a reader could refresh at any time during these operations and find that a bunch of documents have been deleted but not yet added again. With autoCommit=false you could do this entire operation (batch delete, then batch add), call commit() at the end, and readers would know not to re-open the index until that final commit() succeeded.

* The "using too much disk space during optimize" problem: this came up on the users list recently. If you aggressively refresh readers while optimize() is running, you can tie up much more disk space than you'd expect, because your readers hold open all the [possibly very large] intermediate segments. If instead autoCommit is false and the developer calls optimize() and then commit(), readers would know not to re-open until the optimize was complete.

* More general transactions: it has come up a fair number of times how to make Lucene transactional, either by itself ("do the following complex series of index operations, but if there is any failure, roll back to the start, and don't expose the result to searchers until all operations are done") or as part of a larger transaction, e.g. one involving a relational database.
For example, if you want to add a big set of documents to Lucene but not make them searchable until they are all added, or until a specific time (e.g. Monday @ 9 AM), you can't do that easily today, but it would be simple with explicit commits. I believe this change would make transactions work correctly with Lucene.

* LUCENE-710 ("implement point in time searching without relying on filesystem semantics"), also known as "getting Lucene to work correctly over NFS". I think this issue is nearly solved when autoCommit=false, as long as we can adopt a shared policy on when readers refresh to match the new deletion policy (described above). Basically, as long as the deleter and the readers play by the same "refresh rules", and the writer gives the readers enough time to switch/warm, the deleter should never delete something in use by a reader.

There are also some neat future things made possible:

* The "support deleteDocuments in IndexWriter" feature (LUCENE-565) could have a more efficient implementation (just like Solr) when autoCommit is false, because deletes don't need to be flushed until commit() is called, whereas now they must be aggressively flushed on each checkpoint.

* More generally, because checkpoints do not need to be usable by a reader/searcher, other neat optimizations might be possible. E.g., maybe the merge policy could be improved if it knows that certain segments are just checkpoints and are not involved in searching.

* I could simplify the approach for my recent addIndexes changes (LUCENE-702) to use this, instead of its current approach (wish I had thought of this sooner: ugh!).

* A single index could hold many snapshots, and we could enable a reader to explicitly open an older snapshot. E.g., maybe you take weekly and monthly snapshots because you sometimes want to go back and "run a search on last week's catalog".

Feedback?
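To make the proposed naming rule above concrete, here's a minimal sketch (plain Java, no Lucene code; the class and method names are mine, not part of the proposal) of how a reader would pick the latest snapshot while ignoring checkpoints, and how isCurrent() would change:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the proposed naming rule: "segments_N" is a committed
// snapshot, "segmentsx_N" is only a checkpoint.  A reader picks the
// snapshot with the largest N and ignores all checkpoints.
public class SnapshotPicker {

    // Returns the generation N of the latest snapshot, or -1 if none.
    static long latestSnapshot(List<String> files) {
        long best = -1;
        for (String f : files) {
            if (f.startsWith("segments_")) {  // "segmentsx_N" fails this prefix test
                long n = Long.parseLong(f.substring("segments_".length()));
                if (n > best) best = n;
            }
        }
        return best;
    }

    // The proposed isCurrent(): only a newer *snapshot* makes a reader
    // stale; newer checkpoints are disregarded.
    static boolean isCurrent(long openedSnapshot, List<String> files) {
        return latestSnapshot(files) <= openedSnapshot;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
                "segments_4", "segments_5", "segmentsx_6", "segmentsx_7");
        System.out.println(latestSnapshot(files));  // 5
        System.out.println(isCurrent(5, files));    // true: 6 and 7 are only checkpoints
        System.out.println(isCurrent(4, files));    // false: snapshot 5 was committed
    }
}
```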
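Likewise, the smarter IndexFileDeleter policy mentioned above ("keep the past N snapshots, or anything obsoleted less than X minutes ago") could be sketched as follows. This is an illustrative model only, with hypothetical names; real Lucene would apply the decision to the files each snapshot references:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the deletion policy for autoCommit=false: retain a snapshot
// (and hence everything it references) if it is one of the past N
// snapshots, or if it was obsoleted less than X minutes ago, so that
// slow readers (e.g. over NFS) are never left holding deleted files.
public class DeletionPolicySketch {

    static final class Snapshot {
        final long gen;            // the N in segments_N
        final long obsoletedAtMs;  // when a newer snapshot replaced it
        Snapshot(long gen, long obsoletedAtMs) {
            this.gen = gen;
            this.obsoletedAtMs = obsoletedAtMs;
        }
    }

    // snapshots are ordered oldest to newest; returns the generations to keep
    static List<Long> retainedGens(List<Snapshot> snapshots, int keepLastN,
                                   long keepObsoleteMs, long nowMs) {
        List<Long> keep = new ArrayList<>();
        for (int i = 0; i < snapshots.size(); i++) {
            Snapshot s = snapshots.get(i);
            boolean inLastN = i >= snapshots.size() - keepLastN;
            boolean recentlyObsoleted = nowMs - s.obsoletedAtMs < keepObsoleteMs;
            if (inLastN || recentlyObsoleted) keep.add(s.gen);
        }
        return keep;
    }

    public static void main(String[] args) {
        long now = 1_000_000_000L;
        List<Snapshot> snaps = Arrays.asList(
                new Snapshot(1, now - 3_600_000),  // obsoleted an hour ago
                new Snapshot(2, now - 300_000),    // obsoleted 5 minutes ago
                new Snapshot(3, now - 1_200_000),  // obsoleted 20 minutes ago
                new Snapshot(4, now));             // the live snapshot
        // keep the last 2 snapshots, plus anything obsoleted < 10 minutes ago
        System.out.println(retainedGens(snaps, 2, 600_000, now));  // [2, 3, 4]
    }
}
```

Snapshot 1 is deletable because it is neither among the last two snapshots nor recently obsoleted; snapshot 2 survives only because it was obsoleted within the X-minute window, which is exactly the grace period a refreshing reader needs.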
Mike