Dear developers, is there any architectural reason why an IndexWriter could not delete a document?
I understand that IndexReader (odd as that name is for this feature) is the right class to use to delete a document, but this raises a huge problem for me.

We add almost 50,000 documents a day and delete a similar number of old documents over the same period. We index new documents in batches every 5 minutes while deleting the old ones, and we optimize the index twice a day to keep query performance good and the number of index files under control.

In this situation, I try to keep the same IndexWriter open as long as possible, to avoid any unnecessary fragmentation of the index. Before indexing a document, I can check whether it has already been inserted, but I cannot delete it without closing the IndexWriter, opening an IndexReader, deleting the document, closing the IndexReader, and opening the IndexWriter again. This arrangement seems reasonable if updated documents are scarce, but it does not seem feasible with a high rate of updated documents.

I would prefer to avoid deleting all updated documents from the index before opening the IndexWriter, because the updating and indexing procedure would get much more complex, and because it would introduce a significant time gap during which a previously available document is no longer available in the index.

Can you confirm my idea that keeping an IndexWriter open as long as possible while indexing batches of documents is a "good thing"? Is there any chance of ever seeing a deleteDocument method in the IndexWriter class, or should I start planning how to handle document updates in another way?

Thank you very much for your attention.

Regards,

Giulio Cesare Solaroli
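
P.S. For illustration, the two update strategies discussed in this message can be modelled with a toy, in-memory index. This is only a sketch and does not use Lucene: `UpdateCycleSketch`, `updateOneByOne`, `updateDeleteFirst` and every other name below are hypothetical, and the only thing the model captures is the constraint that additions happen in "writer" mode, deletions happen in "reader" mode, and each change of mode stands for closing one object and opening the other.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT the Lucene API: all names here are made up for illustration.
class UpdateCycleSketch {
    enum Mode { WRITER, READER }

    private final List<String> ids = new ArrayList<>();
    private Mode mode = Mode.WRITER;   // assume we start with the writer open
    private int modeSwitches = 0;      // each switch = one close + one open

    // Builds an index that already contains the given document ids.
    static UpdateCycleSketch seeded(String... initialIds) {
        UpdateCycleSketch index = new UpdateCycleSketch();
        for (String id : initialIds) {
            index.add(id);
        }
        return index;
    }

    private void ensure(Mode wanted) {
        if (mode != wanted) {
            mode = wanted;
            modeSwitches++;
        }
    }

    // Adding requires writer mode; it appends and never replaces.
    void add(String id) { ensure(Mode.WRITER); ids.add(id); }

    // Deleting requires reader mode; it removes every copy of the id.
    void delete(String id) { ensure(Mode.READER); ids.removeIf(id::equals); }

    // The existence check is assumed to be possible in either mode.
    boolean contains(String id) { return ids.contains(id); }

    int size() { return ids.size(); }

    int modeSwitches() { return modeSwitches; }

    // Strategy A: delete the old copy of each document right before re-adding it.
    static int updateOneByOne(UpdateCycleSketch index, List<String> batch) {
        for (String id : batch) {
            if (index.contains(id)) {
                index.delete(id);
            }
            index.add(id);
        }
        return index.modeSwitches();
    }

    // Strategy B: delete all old copies first, then add the whole batch.
    static int updateDeleteFirst(UpdateCycleSketch index, List<String> batch) {
        for (String id : batch) {
            if (index.contains(id)) {
                index.delete(id);
            }
        }
        for (String id : batch) {
            index.add(id);
        }
        return index.modeSwitches();
    }

    public static void main(String[] args) {
        List<String> batch = new ArrayList<>();
        batch.add("a");   // update of an existing document
        batch.add("b");   // update of an existing document
        batch.add("d");   // brand-new document

        System.out.println("one by one:   "
                + updateOneByOne(seeded("a", "b", "c"), batch) + " mode switches");
        System.out.println("delete first: "
                + updateDeleteFirst(seeded("a", "b", "c"), batch) + " mode switches");
    }
}
```

In this model the one-by-one strategy pays two mode switches per updated document, while the delete-first strategy pays two per batch but leaves every updated document missing from the index between its deletion and its re-insertion, which is the time gap mentioned above.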