|
Hi George,

actually I don't know how it works in Java; I'm not a Java developer and I couldn't easily start developing in Java. From what I see, this might not even be related to the FAQ entry I sent you (although it looks like it is). I just realized that if I optimize an index while an IndexSearcher is open on it, the index ends up "less optimized" than it would be with no searchers open on it: another file of the same size as the main index file is kept in the index directory, so I guess it is a duplicate. I just tested that if I don't keep a searcher open on the index during optimization, the optimization process works just as expected. So it looks like that issue is not solved in Lucene.Net.

I don't have test code, but you can find a complete application here (http://code.google.com/p/cs2project/). I guess you don't have a lot of time to test it, but I'm pretty sure I'm right about this, since simply not opening an IndexSearcher is enough for the index to be optimized correctly.

Here are the exact steps of my application to reproduce the issue:

1 - Open an IndexSearcher (which, in turn, opens an IndexReader) to use for searches
2 - Open an IndexReader to delete old documents from the index
3 - Close the IndexReader opened in the previous step
4 - Open an IndexWriter to add new documents
5 - Call Optimize() and then close the IndexWriter opened above (notice that the IndexSearcher opened at step 1 is still open here)
6 - Close the IndexSearcher opened at step 1
7 - Create a new IndexSearcher to be able to search through the newly added documents and to exclude the deleted ones from searches

If you remove steps 1, 6 and 7 from the process (i.e., you never open a searcher), the optimization triggered at step 5 works as expected; otherwise the issue I reported occurs and the main index file is duplicated. I can send you more details if you care.

Simone

George Aroush wrote:

Hi Simone,

Lucene.Net 2.1 is supposed to work just like its Java version. If you are seeing a difference in this behavior, then something is obviously wrong.

My question to you is this: do you have C# test code that shows this problem with Lucene.Net? Can you port it to Java and verify? If you can't do all this verification, at least give us the C# test code and then I might be able to take it from here. This will also have the additional benefit of verifying that your code is not the issue.

Regards,

-- George

_____

From: Simone Busoli [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, September 12, 2007 8:48 PM
To: [email protected]
Subject: Re: Lucene.Net 2.1 status

Hi George,

thanks for the update. I wanted to ask you something about 2.1. One of the entries in the Java Lucene FAQ says:

Why do I have a deletable file (and old segment files remain) after running optimize?

This is normal behavior on Windows whenever you also have readers (IndexReaders <http://wiki.apache.org/lucene-java/IndexReaders> or IndexSearchers <http://wiki.apache.org/lucene-java/IndexSearchers>) open against the index you are optimizing. Lucene tries to remove old segment files once they have been merged (optimized). However, because Windows does not allow removing files that are open for reading, Lucene catches an IOException when deleting these files and then records these pending deletable files in the "deletable" file. On the next segments merge, which happens with explicit optimize() or close() calls and also whenever the IndexWriter <http://wiki.apache.org/lucene-java/IndexWriter> flushes its internal RAMDirectory to disk (every IndexWriter <http://wiki.apache.org/lucene-java/IndexWriter>.DEFAULT_MAX_BUFFERED_DOCS (default 10) addDocuments), Lucene will try again to delete these files (and additional ones), and any that still fail will be rewritten to the deletable file. Note that as of 2.1 the deletable file is no longer used.
Instead, Lucene computes which files are no longer referenced by the index and removes them whenever a writer is created.

I'm working on Lucene.Net trunk but I still get the deletable-files-not-deleted behavior under Windows. Is this supposed to be working?

Simone

George Aroush wrote:

Hi folks,

Lucene.Net 2.1 is stabilizing very well. Thanks to DIGY, who flushed out the last remaining failed NUnit tests, we are now down to only one failing test: Lucene.Net.Index.TestNorms._TestNorms().

Since Monday, I have been using this version in production with success. I would like to get feedback from others if you are using it, and to hear how it's working for you. If results are good, and pending the elimination of the Lucene.Net.Index.TestNorms._TestNorms() failure, I think we are ready to vote on this release and close it.

As for the next step, I'm going to take a look at Lucene Java 2.2 and see how big a job porting it will be. I will post on it in a few days.

Regards,

-- George |
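
The cleanup rule quoted from the FAQ ("Lucene computes which files are no longer referenced by the index and removes them whenever a writer is created") can be illustrated with a small toy model. This is only a sketch, not Lucene or Lucene.Net code: the class name, the method names, and the file names (`segments_3`, `_1.cfs`, `_2.cfs`) are all hypothetical, and it assumes that deleting a file still held open by a searcher simply fails, as the FAQ describes for Windows.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: not Lucene or Lucene.Net code. All names are hypothetical.
public class UnreferencedFileSketch {

    // Files that exist on disk but are not referenced by the current index.
    static Set<String> unreferenced(Set<String> onDisk, Set<String> referenced) {
        Set<String> result = new TreeSet<>(onDisk);
        result.removeAll(referenced);
        return result;
    }

    // One cleanup pass, modeled as happening when a writer is created.
    // Assumption: deleting a file that a reader still holds open fails,
    // so that file stays on disk. Returns the files left after the pass.
    static Set<String> cleanup(Set<String> onDisk, Set<String> referenced, Set<String> heldOpen) {
        Set<String> remaining = new TreeSet<>(onDisk);
        for (String file : unreferenced(onDisk, referenced)) {
            if (!heldOpen.contains(file)) {
                remaining.remove(file); // delete succeeds
            }
            // otherwise the delete fails and the file is retried on a later pass
        }
        return remaining;
    }

    public static void main(String[] args) {
        // Hypothetical state after optimizing: _1.cfs is the old segment,
        // _2.cfs is the merged one, and a searcher still has _1.cfs open.
        Set<String> onDisk = new TreeSet<>(Arrays.asList("segments_3", "_1.cfs", "_2.cfs"));
        Set<String> referenced = new TreeSet<>(Arrays.asList("segments_3", "_2.cfs"));
        Set<String> heldBySearcher = new TreeSet<>(Arrays.asList("_1.cfs"));

        // Pass while the searcher is open: the old segment cannot be removed.
        Set<String> afterFirstPass = cleanup(onDisk, referenced, heldBySearcher);
        System.out.println(afterFirstPass);  // [_1.cfs, _2.cfs, segments_3]

        // Pass after the searcher is closed: only referenced files remain.
        Set<String> afterSecondPass = cleanup(afterFirstPass, referenced, new TreeSet<String>());
        System.out.println(afterSecondPass); // [_2.cfs, segments_3]
    }
}
```

Under this model, an old segment file that is held open during optimization would stay on disk until a later writer is created after the searcher has been closed. Whether that is what happens in Lucene.Net in the scenario reported above is an assumption and is not confirmed anywhere in this thread.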
