On 9/14/06, Neville Burnell <[EMAIL PROTECTED]> wrote:
> Hi David,
>
> > Deleted documents don't get deleted until commit is called
>
> Ok, but FYI, my experiments show that #commit doesn't affect #doc_count,
> even across ruby sessions.

Sorry, I guess I wasn't very clear on that point. The deletes don't get
committed until commit is called, which is why I don't have a num_docs
method in IndexWriter: there is no way to reliably tell until commit is
called. IndexWriter#doc_count is like IndexReader#max_doc. It tells you
how many documents there are in the index, deleted or not.

> On a different note, I'd like to request a variation of #add_document
> which returns the doc_id of the document added, as opposed to self.
>
> I'm trying to track down an issue with a large test index [600MB, 500k
> docs] in which I need to update a document. The old document is deleted
> then added again, but doesn't show up in my searches.
>
> A #doc_count on the writer before and after #add_document shows that the
> index is 1 document larger, but I still cant #search for the updated
> doc.
>
> What do you think about having #add_document "yield" the doc_id if
> block_given?
>
> Neville

How about just using the doc_count method? Call it after you add the
document and subtract one and you'll have the document ID of the last
document added. Don't call it before you add the document, as a merge
might happen when you add the document, possibly changing all document
IDs when deletes are completely removed.

Cheers,
Dave
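
To make the ID arithmetic concrete, here is a rough sketch in plain Ruby. It is a toy model, not Ferret's code: ToyIndex, its slot array, and the merge: keyword are all made up for illustration, and the only thing assumed about the real index is what is described above (doc_count counts deleted documents too, and a merge during an add can renumber everything).

```ruby
# Toy model only: NOT Ferret's implementation. ToyIndex and its methods are
# hypothetical names used to illustrate the doc_count - 1 arithmetic.
class ToyIndex
  def initialize
    @slots = [] # one entry per document ID; nil marks a deleted document
  end

  # Counts every slot, deleted or not (the analogue of doc_count / max_doc).
  def doc_count
    @slots.size
  end

  # Counts only live documents (the analogue of a reader-side num_docs).
  def num_docs
    @slots.compact.size
  end

  # Marks a document as deleted but keeps its slot until the next merge.
  def delete(doc_id)
    @slots[doc_id] = nil
  end

  # Adding may trigger a merge, which drops deleted slots and renumbers.
  # The merge: flag is an assumption that lets the example force a merge.
  def add_document(doc, merge: false)
    @slots.compact! if merge
    @slots << doc
    self
  end

  def [](doc_id)
    @slots[doc_id]
  end
end

index = ToyIndex.new
index.add_document("a").add_document("b").add_document("c")
index.delete(1)                  # "b" is marked deleted, its slot remains
count_before = index.doc_count   # deleted slot is still counted here

index.add_document("b2", merge: true) # a merge happens during the add
last_id = index.doc_count - 1         # ID of the document just added

puts index[last_id]                # the newly added document
puts index[count_before].inspect   # the ID guessed before the add is stale
```

In this toy the count taken before the add points past the end of the index once the merge has removed the deleted slot, while doc_count - 1 taken after the add lands on the new document.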

