It seems to me that updating a document in Lucene is rather tedious and slow, 
especially when a large number of documents must be updated.  Before opening an 
IndexWriter to add documents, one has to open an IndexReader/IndexSearcher to 
search for the document with a particular id and, upon finding its docnum, 
delete it.  One then closes the reader and opens the writer to add the updated 
content, moves on to the next item, and repeats the process.  For a large 
number of updates, the performance is bad.  

One way to speed this up is to avoid repeating the whole process for every 
single update.  If one has a large number of updates, open the reader/searcher 
first.  But instead of searching for and deleting one document and then closing 
it, search for and delete all of them, and only then close the reader/searcher.  
Then one can open a writer and just do a batch add.
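
To make the difference concrete, here is a small toy model in plain Java.  It 
does not use Lucene's API at all: ToyIndex, updateOneByOne and updateBatched 
are hypothetical names, and the counters only stand in for the cost of opening 
and closing a reader or a writer.  It is a sketch of the two update patterns 
described above, nothing more.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model, NOT Lucene's API: it only counts how often a reader or a
// writer would have to be opened under each update pattern.
class ToyIndex {
    final Map<String, String> docs = new LinkedHashMap<>(); // id -> content
    int readerOpens = 0;
    int writerOpens = 0;

    // Per-document pattern: one reader open and one writer open per update.
    void updateOneByOne(Map<String, String> updates) {
        for (Map.Entry<String, String> e : updates.entrySet()) {
            readerOpens++;                       // open reader, find id, delete, close
            docs.remove(e.getKey());
            writerOpens++;                       // open writer, add new content, close
            docs.put(e.getKey(), e.getValue());
        }
    }

    // Batched pattern: delete everything first, then add everything.
    void updateBatched(Map<String, String> updates) {
        readerOpens++;                           // one reader session for all deletes
        for (String id : updates.keySet()) {
            docs.remove(id);
        }
        writerOpens++;                           // one writer session for all adds
        docs.putAll(updates);
    }
}
```

In this model, N updates cost N reader opens and N writer opens one by one, 
but only one of each when batched, and both end with the same documents.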

However, I wonder whether we could change the writer to do updates 
automatically.  An update would consist of marking the old document as invalid, 
without removing it, and adding the new document.  We can clean things up later 
on.  With this method, there is no need to switch between reader and writer 
during an update.  
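
Roughly, the idea could look like the following sketch.  Again this is plain 
Java and not Lucene code: TombstoneIndex and its methods are hypothetical 
names, and the in-memory lists are only an assumption about how "mark invalid 
now, clean up later" might be modelled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the proposal, NOT Lucene's API: an append-only document list
// where an update marks the old entry invalid and appends the new one.
class TombstoneIndex {
    private final List<String> contents = new ArrayList<>();         // docnum -> content
    private final List<Boolean> valid = new ArrayList<>();           // docnum -> still live?
    private final Map<String, Integer> liveDocnum = new HashMap<>(); // id -> live docnum

    // Update without a reader/writer switch: mark old invalid, append new.
    void update(String id, String content) {
        Integer old = liveDocnum.get(id);
        if (old != null) {
            valid.set(old, false);               // mark only, nothing is removed yet
        }
        contents.add(content);
        valid.add(true);
        liveDocnum.put(id, contents.size() - 1);
    }

    // Returns the live content for an id, or null if the id is unknown.
    String get(String id) {
        Integer n = liveDocnum.get(id);
        return n == null ? null : contents.get(n);
    }

    // Number of stored entries, including invalid ones not yet cleaned up.
    int storedCount() {
        return contents.size();
    }

    // Deferred cleanup: drop invalid entries and renumber the survivors.
    void cleanUp() {
        List<String> kept = new ArrayList<>();
        Map<Integer, Integer> renumber = new HashMap<>();
        for (int i = 0; i < contents.size(); i++) {
            if (valid.get(i)) {
                renumber.put(i, kept.size());
                kept.add(contents.get(i));
            }
        }
        contents.clear();
        contents.addAll(kept);
        valid.clear();
        for (int i = 0; i < kept.size(); i++) {
            valid.add(true);
        }
        liveDocnum.replaceAll((id, n) -> renumber.get(n));
    }
}
```

The point of the sketch is that update() never needs a separate search/delete 
pass; the invalid entries just take up space until cleanUp() is run.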


john


