[ 
https://issues.apache.org/jira/browse/LUCENE-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651758#comment-14651758
 ] 

Ahmet Arslan commented on LUCENE-6711:
--------------------------------------

bq. We should fix TFIDFSimilarity and BM25Similarity too.

For TFIDF and BM25, do we simply replace {{collectionStats.maxDoc()}} 
with {{collectionStats.docCount() == -1 ? collectionStats.maxDoc() : 
collectionStats.docCount()}}?
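
To make the proposed replacement concrete, here is a minimal, self-contained sketch of the fallback. It is for illustration only and is not the patch itself: {{FieldStats}}, {{stats()}} and {{numberOfDocuments()}} are hypothetical stand-ins rather than Lucene API, and the only behaviour assumed from Lucene is that {{docCount()}} returns -1 when the statistic is not available.

```java
// Illustrative sketch only; none of these names are Lucene classes.
class DocCountFallback {

    /** Hypothetical stand-in for the two CollectionStatistics getters discussed here. */
    interface FieldStats {
        long maxDoc();    // all documents in the index, with or without the field
        long docCount();  // documents with at least one term for the field, or -1 if unknown
    }

    /** Hypothetical helper that builds fixed statistics for the example. */
    static FieldStats stats(final long maxDoc, final long docCount) {
        return new FieldStats() {
            public long maxDoc() { return maxDoc; }
            public long docCount() { return docCount; }
        };
    }

    /** Prefer docCount(); fall back to maxDoc() when docCount() is unavailable (-1). */
    static long numberOfDocuments(FieldStats s) {
        return s.docCount() == -1 ? s.maxDoc() : s.docCount();
    }

    public static void main(String[] args) {
        // 10 documents in the index, 7 of them have the field.
        System.out.println(numberOfDocuments(stats(10, 7)));   // prints 7
        // docCount() not available, so maxDoc() is used instead.
        System.out.println(numberOfDocuments(stats(10, -1)));  // prints 10
    }
}
```

With this fallback, the value only differs from {{maxDoc()}} when the field statistic is actually available.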

> Instead of docCount(), maxDoc() is used for numberOfDocuments in 
> SimilarityBase
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-6711
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6711
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 5.2.1
>            Reporter: Ahmet Arslan
>            Priority: Minor
>             Fix For: 5.3
>
>         Attachments: LUCENE-6711.patch, LUCENE-6711.patch
>
>
> {{SimilarityBase.java}} has the following line:
> {code}
>  long numberOfDocuments = collectionStats.maxDoc();
> {code}
> It seems like {{collectionStats.docCount()}}, which returns the total number 
> of documents that have at least one term for this field, is the more 
> appropriate statistic here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

