IndexWriter can in fact delete documents, by Term or by Query, via deleteDocuments. It also has updateDocument, which under the hood simply calls deleteDocuments and then addDocument.
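
To make the delete-then-add behaviour concrete, here is a rough sketch. It is a toy in-memory model, not Lucene code: the class ToyIndexWriter and its methods are hypothetical names chosen to mirror the IndexWriter calls mentioned above, and a document is assumed to be a plain map of field names to values. The comment at the top shows roughly what the corresponding real calls look like.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical names, not Lucene's implementation).
// With a real IndexWriter the equivalent calls would look roughly like:
//   writer.deleteDocuments(new Term("id", "1"));
//   writer.updateDocument(new Term("id", "1"), doc);
class ToyIndexWriter {
    // Each "document" is a map of field name -> value.
    private final List<Map<String, String>> docs = new ArrayList<>();

    void addDocument(Map<String, String> doc) {
        docs.add(doc);
    }

    // Delete every document whose field holds the given value (the "term").
    // Returns how many documents were removed.
    int deleteDocuments(String field, String value) {
        int before = docs.size();
        docs.removeIf(d -> value.equals(d.get(field)));
        return before - docs.size();
    }

    // Update = delete by term, then add the new version.
    void updateDocument(String field, String value, Map<String, String> doc) {
        deleteDocuments(field, value);
        addDocument(doc);
    }

    int numDocs() {
        return docs.size();
    }

    // Look up one field of the first document matching keyField == keyValue.
    String get(String keyField, String keyValue, String field) {
        for (Map<String, String> d : docs) {
            if (keyValue.equals(d.get(keyField))) {
                return d.get(field);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyIndexWriter w = new ToyIndexWriter();
        w.addDocument(Map.of("id", "1", "body", "first draft"));
        w.addDocument(Map.of("id", "2", "body", "other doc"));
        // Document 1 changed mid-batch: replace it without opening a reader.
        w.updateDocument("id", "1", Map.of("id", "1", "body", "revised"));
        System.out.println(w.numDocs());              // prints 2
        System.out.println(w.get("id", "1", "body")); // prints revised
    }
}
```

The point of the sketch is that the same writer object handles both the delete and the re-add, so a batch job that keys each document on a unique field (here "id") never needs a separate reader just to remove the stale copy.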

Mike

Vinubalaji Gopal wrote:

Hi all,

I am a new Lucene user and got started really quickly!
It's been really nice and I love it :) - I am still trying to do a few
things the right way and digging through documentation/archives.
Here is my question:

What is the best/fastest way to batch index a large set of documents?
I open an IndexWriter at the start and index the documents one by
one. But here is my problem:
Some of the documents I have may be updated after they have been indexed
(even before the batch process completes) and thus I need to reindex
them. The only way to do that is using (I may be wrong - but this is
what I understood from reading the FAQs and documentation)
IndexReader.deleteDocument(term) and indexing it again. But every time
I have to delete a document, I need to open a new reader, since the
writer would have already written more documents and I can't reuse the
reader for the whole session :( - which makes the operation a bit
slow.

I thought the better way would be to set autocommit to false. In
that case, even if I reopen the reader every time to delete a
document (using IndexReader.deleteDocument(term)), I wouldn't find it,
since the commit has not happened. Is there any other way of deleting
the documents before the commit has taken place?

This wouldn't have been a problem if deleteDocument was in IndexWriter
:( - but I believe there must have been a strong reason to have
deleteDocument in IndexReader, and I would be interested to hear about
it (or read some pointers).


--
Vinu

In a world without fences who needs Gates?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


