I *guess* it's because an update is implemented as a delete followed by a 
reinsert of the document. Deletes in Lucene are lazy: the deleted document is just 
flagged as deleted in a bitmap and is only removed from the index when 
segments are merged. Did you check the IndexSearcher.collectionStatistics 
documentation? It should mention something about that.
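
To make the guess above concrete, here is a minimal toy model in plain Java. It is not Lucene code and does not use the Lucene API: the class LazyDeleteSketch, its methods, and the per-document term counts (2, 3 and 4) are all hypothetical, chosen only so that the totals line up with the numbers reported in the original post. It assumes that statistics are summed over every stored document, including the ones flagged as deleted, until a merge rewrites the segments.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model of lazy deletes. This is NOT Lucene code; every name is hypothetical.
class LazyDeleteSketch {

    // A write-once "segment": document ids, per-document term counts, and a
    // bitmap marking which documents are still live.
    static final class Segment {
        final List<String> ids = new ArrayList<>();
        final List<Integer> termCounts = new ArrayList<>();
        final BitSet live = new BitSet();

        void add(String id, int termCount) {
            live.set(ids.size());
            ids.add(id);
            termCounts.add(termCount);
        }
    }

    private final List<Segment> segments = new ArrayList<>();

    // Writes the given documents into a new segment.
    void addDocuments(String[] ids, int[] termCounts) {
        Segment segment = new Segment();
        for (int i = 0; i < ids.length; i++) {
            segment.add(ids[i], termCounts[i]);
        }
        segments.add(segment);
    }

    // Update = flag the old copy as deleted, then append the new version.
    void updateDocument(String id, int termCount) {
        for (Segment segment : segments) {
            for (int i = 0; i < segment.ids.size(); i++) {
                if (segment.ids.get(i).equals(id)) {
                    segment.live.clear(i);
                }
            }
        }
        addDocuments(new String[] {id}, new int[] {termCount});
    }

    // Counts every stored document, including those flagged as deleted.
    int maxDoc() {
        int total = 0;
        for (Segment segment : segments) total += segment.ids.size();
        return total;
    }

    // Counts only live documents.
    int numDocs() {
        int total = 0;
        for (Segment segment : segments) total += segment.live.cardinality();
        return total;
    }

    // Sums term counts over every stored document, deleted or not.
    int sumTotalTermFreq() {
        int total = 0;
        for (Segment segment : segments) {
            for (int count : segment.termCounts) total += count;
        }
        return total;
    }

    // Rewrites all segments into one, copying only the live documents.
    void merge() {
        Segment merged = new Segment();
        for (Segment segment : segments) {
            for (int i = segment.live.nextSetBit(0); i >= 0; i = segment.live.nextSetBit(i + 1)) {
                merged.add(segment.ids.get(i), segment.termCounts.get(i));
            }
        }
        segments.clear();
        segments.add(merged);
    }

    public static void main(String[] args) {
        LazyDeleteSketch index = new LazyDeleteSketch();
        index.addDocuments(new String[] {"A", "B"}, new int[] {2, 3});
        System.out.println("maxDoc=" + index.maxDoc() + " sumTotalTermFreq=" + index.sumTotalTermFreq());
        // prints "maxDoc=2 sumTotalTermFreq=5"

        index.updateDocument("B", 4);
        System.out.println("maxDoc=" + index.maxDoc() + " sumTotalTermFreq=" + index.sumTotalTermFreq());
        // prints "maxDoc=3 sumTotalTermFreq=9" (the old copy of B is still stored)

        index.merge();
        System.out.println("maxDoc=" + index.maxDoc() + " sumTotalTermFreq=" + index.sumTotalTermFreq());
        // prints "maxDoc=2 sumTotalTermFreq=6"
    }
}
```

In this sketch numDocs() stays at 2 throughout, while maxDoc() and sumTotalTermFreq() only drop back once merge() has physically discarded the old copy. If the real index behaves the same way, that would match the 5, 9, 6 sequence from the original post.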

Cheers,
diego


From: java-user@lucene.apache.org At: 02/28/21 11:09:52
To: java-user@lucene.apache.org
Subject: Incorrect CollectionStatistics if IndexWriter.close is not called

Hi,

I don't understand if I'm doing something wrong or if it is the
expected behaviour.

My problem is that when a document is updated, collectionStatistics
returns counts as if a new document had been added to the index, even
after a call to IndexWriter.commit and to
SearcherManager.maybeRefreshBlocking.
If I call IndexWriter.close, the counts are correct again, but the
documentation of IndexWriter.close says to try to reuse the
IndexWriter, so I'm a bit confused.

Ex:
If I add two documents to an empty index

IndexSearcher.collectionStatistics("TEXT") returns
"field="TEXT",maxDoc=2,docCount=2,sumTotalTermFreq=5,sumDocFreq=5" ->
OK

then I update one of the documents and call commit()

IndexSearcher.collectionStatistics("TEXT") returns
"field="TEXT",maxDoc=3,docCount=3,sumTotalTermFreq=9,sumDocFreq=9" ->
NOK

If I call close() now

IndexSearcher.collectionStatistics("TEXT") returns
"field="TEXT",maxDoc=2,docCount=2,sumTotalTermFreq=6,sumDocFreq=6" ->
OK

Note that the counts are correct if the index contains only one document.


I attached a test case.

Am I doing something wrong somewhere?


Julien



