I think the workaround is to subclass IW and insert your own flush...

Mike

http://blog.mikemccandless.com

On Sun, Mar 27, 2011 at 9:41 AM, Grant Ingersoll <[email protected]> wrote:
> Is there a workaround?
>
>
> On Mar 27, 2011, at 9:30 AM, Michael McCandless wrote:
>
>> Indeed I think this is a real bug -- addIndexes(IR[]) should call
>> flush(false, true), just like addIndexes(Dir[]) does.
>>
>> Mike
>>
>> http://blog.mikemccandless.com
>>
>> On Sun, Mar 27, 2011 at 9:07 AM, Shai Erera <[email protected]> wrote:
>>> Hi
>>>
>>> One of our users stumbled upon what seems to be a bug in trunk (didn't
>>> verify yet against 3x but I have a feeling it exists there as well). The
>>> scenario is: you want to add an index into an existing index. Beforehand,
>>> you want to delete all new docs from the existing index. These are the
>>> operations that are performed:
>>> 1) deleteDocuments(Term) for all the new documents
>>> 2) addIndexes(IndexReader)
>>> 3) commit
>>>
>>> Strangely, it looks like the deleteDocs happens *after* addIndexes. Even
>>> more strangely, if addIndexes(Directory) is called, the deletes are applied
>>> *before* addIndexes. This user needs to use addIndexes(IndexReader) in order
>>> to rewrite payloads using PayloadProcessorProvider. He reported this error
>>> using a "3x" checkout which is before the RC branch (as he intends to use
>>> 3.1).
>>> I wrote a short unit test that demonstrates this bug on trunk:
>>>
>>> {code}
>>> private static IndexWriter createIndex(Directory dir) throws Exception {
>>>   IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_40,
>>>       new MockAnalyzer());
>>>   IndexWriter writer = new IndexWriter(dir, conf);
>>>   Document doc = new Document();
>>>   doc.add(new Field("id", "myid", Store.NO,
>>>       Index.NOT_ANALYZED_NO_NORMS));
>>>   writer.addDocument(doc);
>>>   writer.commit();
>>>   return writer;
>>> }
>>>
>>> public static void main(String[] args) throws Exception {
>>>   // Create the first index
>>>   Directory dir = new RAMDirectory();
>>>   IndexWriter writer = createIndex(dir);
>>>
>>>   // Create the second index
>>>   Directory dir1 = new RAMDirectory();
>>>   createIndex(dir1);
>>>
>>>   // Now delete the document
>>>   writer.deleteDocuments(new Term("id", "myid"));
>>>   writer.addIndexes(IndexReader.open(dir1));
>>>   // writer.addIndexes(dir1);
>>>   writer.commit();
>>>   System.out.println("numDocs=" + writer.numDocs());
>>>   writer.close();
>>> }
>>> {code}
>>>
>>> The test as it is prints "numDocs=0", while if you switch the addIndexes
>>> calls, it prints 1 (which should be the correct answer).
>>>
>>> Before I open an issue for this, I wanted to verify that it's indeed a bug
>>> and I haven't missed anything in the expected behavior of these two
>>> addIndexes. If indeed it's a bug, I think it should be a blocker for 3.1?
>>> I'll also make a worthy junit test out of it.
>>>
>>> BTW, the user, as an intermediary solution, extends IndexWriter and calls
>>> flush() before the delete and addIndexes calls. It would be preferable if
>>> this solution can be avoided.
>>>
>>> Shai
>>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
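
For readers who want to see the ordering problem from this thread in isolation, here is a minimal, self-contained Java sketch. It does not use Lucene at all. ToyWriter, FlushingToyWriter, addIndexesWithoutFlush and addIndexesFlushingFirst are hypothetical names invented for illustration, and the model simply assumes what the thread describes: deletes are buffered until a flush, the Directory variant of addIndexes flushes first, and the IndexReader variant does not. The subclass mirrors the suggested workaround of inserting your own flush; it is not the actual IndexWriter code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AddIndexesOrderingSketch {

    /** Toy stand-in for an index writer; a document is just its id string. */
    static class ToyWriter {
        private final List<String> docs = new ArrayList<>();
        private final Set<String> bufferedDeletes = new HashSet<>();

        void addDocument(String id) {
            docs.add(id);
        }

        /** Deletes are only buffered; they take effect at the next flush. */
        void deleteDocuments(String id) {
            bufferedDeletes.add(id);
        }

        /** Applies buffered deletes to every document present right now. */
        protected void flush(boolean triggerMerge, boolean applyAllDeletes) {
            if (applyAllDeletes) {
                docs.removeAll(bufferedDeletes);
                bufferedDeletes.clear();
            }
        }

        /** Models the Directory variant: flush first, then add. */
        void addIndexesFlushingFirst(List<String> incoming) {
            flush(false, true);
            docs.addAll(incoming);
        }

        /** Models the reported IndexReader variant: add without flushing. */
        void addIndexesWithoutFlush(List<String> incoming) {
            docs.addAll(incoming);
        }

        void commit() {
            flush(true, true);
        }

        int numDocs() {
            return docs.size();
        }
    }

    /** Workaround from the thread: a subclass that inserts its own flush. */
    static class FlushingToyWriter extends ToyWriter {
        @Override
        void addIndexesWithoutFlush(List<String> incoming) {
            flush(false, true); // apply pending deletes before adding
            super.addIndexesWithoutFlush(incoming);
        }
    }

    /** Replays the scenario: delete "myid", add an index containing "myid", commit. */
    static int run(ToyWriter writer, boolean useFlushingVariant) {
        writer.addDocument("myid");
        writer.commit();
        writer.deleteDocuments("myid");
        List<String> secondIndex = Collections.singletonList("myid");
        if (useFlushingVariant) {
            writer.addIndexesFlushingFirst(secondIndex);
        } else {
            writer.addIndexesWithoutFlush(secondIndex);
        }
        writer.commit();
        return writer.numDocs();
    }

    public static void main(String[] args) {
        System.out.println("no flush:        numDocs=" + run(new ToyWriter(), false));
        System.out.println("flushing first:  numDocs=" + run(new ToyWriter(), true));
        System.out.println("subclass + flush: numDocs=" + run(new FlushingToyWriter(), false));
    }
}
```

In this toy model the first line reports numDocs=0, because the buffered delete is applied at commit and also removes the newly added document, while the other two lines report numDocs=1. That matches the behaviour reported above only to the extent that the stated assumptions hold for the real IndexWriter.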
