I think the workaround is to subclass IndexWriter and insert your own flush...
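
To make the ordering issue concrete, here is a toy model in plain Java with no Lucene dependency. It is only a sketch under stated assumptions: ToyWriter, FlushingToyWriter and run are hypothetical names, documents are reduced to id strings, and the model assumes buffered deletes are applied to whatever the writer holds at flush time. It does not reproduce IndexWriter internals. It only shows why adding an index before the buffered delete is applied loses the new document, and how a subclass that flushes first avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ToyIndexDemo {

    /** Toy stand-in for a writer that buffers deletes by id (hypothetical, not Lucene code). */
    static class ToyWriter {
        protected final List<String> docs = new ArrayList<>();          // ids of live docs
        protected final Set<String> bufferedDeletes = new HashSet<>();  // deletes not yet applied

        void addDocument(String id) { docs.add(id); }

        /** The delete is only buffered here; nothing is removed yet. */
        void deleteDocuments(String id) { bufferedDeletes.add(id); }

        /** Applies every buffered delete to all docs currently held, then clears the buffer. */
        void flush() {
            docs.removeIf(bufferedDeletes::contains);
            bufferedDeletes.clear();
        }

        /** Models the reported behavior: incoming docs are appended without flushing first. */
        void addIndexes(List<String> otherDocs) { docs.addAll(otherDocs); }

        void commit() { flush(); }

        int numDocs() { return docs.size(); }
    }

    /** Models the workaround: a subclass that flushes before adding the other index. */
    static class FlushingToyWriter extends ToyWriter {
        @Override
        void addIndexes(List<String> otherDocs) {
            flush();                      // apply pending deletes to the existing docs only
            super.addIndexes(otherDocs);
        }
    }

    /** Replays the scenario from the report: index "myid", delete it, add an index holding "myid". */
    static int run(ToyWriter writer) {
        writer.addDocument("myid");
        writer.commit();
        writer.deleteDocuments("myid");
        writer.addIndexes(Arrays.asList("myid"));
        writer.commit();
        return writer.numDocs();
    }

    public static void main(String[] args) {
        System.out.println("plain numDocs=" + run(new ToyWriter()));            // prints plain numDocs=0
        System.out.println("flushing numDocs=" + run(new FlushingToyWriter())); // prints flushing numDocs=1
    }
}
```

In the real workaround the subclass would call IndexWriter's own flush rather than a toy one; which flush overload is visible to subclasses depends on the Lucene version, so check the IndexWriter you build against.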

Mike

http://blog.mikemccandless.com

On Sun, Mar 27, 2011 at 9:41 AM, Grant Ingersoll <[email protected]> wrote:
> Is there a workaround?
>
>
> On Mar 27, 2011, at 9:30 AM, Michael McCandless wrote:
>
>> Indeed I think this is a real bug -- addIndexes(IR[]) should call
>> flush(false, true), just like addIndexes(Dir[]) does.
>>
>> Mike
>>
>> http://blog.mikemccandless.com
>>
>> On Sun, Mar 27, 2011 at 9:07 AM, Shai Erera <[email protected]> wrote:
>>> Hi
>>>
>>> One of our users stumbled upon what seems to be a bug in trunk (I haven't
>>> verified it against 3x yet, but I have a feeling it exists there as well). The
>>> scenario is: you want to add an index into an existing index. Beforehand,
>>> you want to delete all new docs from the existing index. These are the
>>> operations that are performed:
>>> 1) deleteDocuments(Term) for all the new documents
>>> 2) addIndexes(IndexReader)
>>> 3) commit
>>>
>>> Strangely, it looks like the deletes are applied *after* addIndexes. Even
>>> more strangely, if addIndexes(Directory) is called, the deletes are applied
>>> *before* addIndexes. This user needs to use addIndexes(IndexReader) in order
>>> to rewrite payloads using PayloadProcessorProvider. He reported this error
>>> using a "3x" checkout which is before the RC branch (as he intends to use
>>> 3.1). I wrote a short unit test that demonstrates this bug on trunk:
>>>
>>> {code}
>>>     private static IndexWriter createIndex(Directory dir) throws Exception {
>>>         IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_40,
>>>                 new MockAnalyzer());
>>>         IndexWriter writer = new IndexWriter(dir, conf);
>>>         Document doc = new Document();
>>>         doc.add(new Field("id", "myid", Store.NO,
>>>                 Index.NOT_ANALYZED_NO_NORMS));
>>>         writer.addDocument(doc);
>>>         writer.commit();
>>>         return writer;
>>>     }
>>>
>>>     public static void main(String[] args) throws Exception {
>>>         // Create the first index
>>>         Directory dir = new RAMDirectory();
>>>         IndexWriter writer = createIndex(dir);
>>>
>>>         // Create the second index
>>>         Directory dir1 = new RAMDirectory();
>>>         createIndex(dir1);
>>>
>>>         // Now delete the document
>>>         writer.deleteDocuments(new Term("id", "myid"));
>>>         writer.addIndexes(IndexReader.open(dir1));
>>>         // writer.addIndexes(dir1);
>>>         writer.commit();
>>>         System.out.println("numDocs=" + writer.numDocs());
>>>         writer.close();
>>>     }
>>> {code}
>>>
>>> The test as it is prints "numDocs=0", while if you switch to the
>>> commented-out addIndexes(Directory) call, it prints "numDocs=1" (which
>>> should be the correct answer).
>>>
>>> Before I open an issue for this, I wanted to verify that it's indeed a bug
>>> and I haven't missed anything in the expected behavior of these two
>>> addIndexes. If indeed it's a bug, I think it should be a blocker for 3.1?
>>> I'll also make a worthy junit test out of it.
>>>
>>> BTW, the user, as an interim solution, extends IndexWriter and calls
>>> flush() before the delete and addIndexes calls. It would be preferable if
>>> this workaround could be avoided.
>>>
>>> Shai
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>
