"Simon Wistow" <[EMAIL PROTECTED]> wrote: > I'm looking at doing a system which is looks something like this - I > have an IndexSearcher open with a on-disk index but all writes go to a > RAM based IndexWriter. Periodically I do > > 1. Close IndexSearcher > 2. Open new IndexWriter in same location > 3. Use addIndexes with old RAM based IndexWriter > 4. Close new IndexWriter > 4a. Inform something that outstanding writes have been committed > 5. Close and then open afresh old RAM based IndexWriter > 6. Open new IndexSearcher > > A few questions which may showcase my ignorance. > > 1. Is this transactional? Is there at any point where the Index can be > left in an undefined state? One of the reasons for doing this is that if > a crash happens I can know exactly what documents have been 'committed' > and which are still in memory and need to be reindexed. Also, since > addIndexes is transactional there's should be no chance of the index > getting corrupted which was a problem we'd had before.
This should be transactional. The one small caveat is that on hitting
an exception during addIndexes you don't actually know whether all
indices were in fact added or not: it's possible they were all
successfully added, but that an exception was then hit during the
follow-on optimize(). Though, if you use addIndexesNoOptimize, I
believe hitting an exception will always mean no indices were added.

> 2. This seems an awful lot like what IndexWriter does anyway - am I
> just reimplementing the wheel?

I think you could simply open IndexWriter with "autoCommit=false",
instead of using a RAM-based IndexWriter? Then on closing this
IndexWriter, all changes are committed. In the meantime, readers will
always see the index at its starting point, no matter what you are
doing with the writer. You can also call abort() to close the
IndexWriter and revert to the starting state of the index.

But one difference with this approach is that you'd have to use the
writer to do deletes on the index, and these deletes would not be
visible until the writer commits (is closed). So if that's a problem
you should stick with your current approach.

> 3. Because the IndexSearcher (or, more appropriately, its underlying
> IndexReader) can have deletes, I need to close() it before merging
> and reopen it. I also need to reopen it to get all the new documents
> in the new merged index. Currently I'm doing something roughly like
> this:
>
> private void reopenIndex() {
>     index_closed = true;
>     index.close();
>     index = new IndexSearcher(path);
>     index_closed = false;
> }
>
> and
>
> public Results search(String query) {
>     // make sure the index is open
>     while (index_closed) {
>         sleep(1);
>     }
>     // now do the search
>     ...
> }
>
> But I'm convinced there must be a better way than this. A quick first
> attempt at this:
>
> private void reopenIndex() {
>     IndexSearcher old_index = index;
>     index = new IndexSearcher(path);
>     old_index.close();
> }
>
> throws StaleIndex exceptions.

Is that a StaleReaderException? The problem here is that your newly
opened reader is stale: when you close your old index, it flushes its
deletes and advances the index. If you can do transactional deletes
with IndexWriter (above) then this problem should go away. If you
cannot, then I think you'd have to re-open your new IndexReader once
the old IndexReader has been closed (and has flushed its deletes),
before trying to do any deletes with the new IndexReader.

Mike
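P.S. Two rough sketches in case they help; neither is tested, and the
class and field names are mine.

For the autoCommit=false approach (question 2), assuming a Lucene
release that has the autoCommit constructors and abort() (2.2 or
later), usage is roughly:

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class BatchedWrites {

    public static void applyBatch(String path) throws IOException {
        Directory dir = FSDirectory.getDirectory(path);
        // autoCommit=false: readers keep seeing the index as of the
        // writer's starting point until close() commits everything
        IndexWriter writer =
            new IndexWriter(dir, false, new StandardAnalyzer());
        boolean success = false;
        try {
            // adds and deletes are buffered, invisible to readers...
            writer.deleteDocuments(new Term("id", "42"));
            // ...until this close(), which commits them all at once
            writer.close();
            success = true;
        } finally {
            if (!success)
                writer.abort(); // revert to the index's starting state
        }
    }
}

And for question 3, one way to do the swap without the sleep loop:
close the old searcher first (so its buffered deletes are flushed),
then open the new one, both under a lock that search() also takes:

import java.io.IOException;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;

public class SearcherHolder {

    private final Object lock = new Object();
    private IndexSearcher searcher;

    public SearcherHolder(String path) throws IOException {
        searcher = new IndexSearcher(path);
    }

    public void reopenIndex(String path) throws IOException {
        synchronized (lock) {
            searcher.close();                   // flushes buffered deletes
            searcher = new IndexSearcher(path); // sees the flushed state
        }
    }

    public TopDocs search(Query query, int n) throws IOException {
        synchronized (lock) {
            // run the whole search while holding the lock so the
            // searcher can't be closed out from under us mid-search
            return searcher.search(query, null, n);
        }
    }
}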