[
https://issues.apache.org/jira/browse/LUCENE-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651758#comment-14651758
]
Ahmet Arslan commented on LUCENE-6711:
--------------------------------------
bq. We should fix TFIDFSimilarity and BM25Similarity too.
For TFIDF and BM25, do we simply replace {{collectionStats.maxDoc()}} with
{{collectionStats.docCount() == -1 ? collectionStats.maxDoc() : collectionStats.docCount()}}?
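
To make the proposed fallback concrete, here is a minimal, self-contained sketch. It is not the attached patch and not Lucene's actual code: {{FieldStats}} and {{numberOfDocuments}} are hypothetical names, with {{FieldStats}} standing in for only the two statistics accessors mentioned in this comment. It assumes, as the expression above implies, that {{docCount()}} reports -1 when the statistic is unavailable.

```java
// Sketch only: FieldStats is a hypothetical stand-in for the two accessors
// discussed in this comment; it is not Lucene's CollectionStatistics class.
class DocCountFallback {

    /** Holds just the two statistics the comment talks about. */
    static final class FieldStats {
        private final long maxDoc;
        private final long docCount;

        FieldStats(long maxDoc, long docCount) {
            this.maxDoc = maxDoc;
            this.docCount = docCount;
        }

        long maxDoc() { return maxDoc; }

        // Assumption: -1 means "statistic not available".
        long docCount() { return docCount; }
    }

    /** Prefer docCount(); fall back to maxDoc() only when docCount() is -1. */
    static long numberOfDocuments(FieldStats stats) {
        return stats.docCount() == -1 ? stats.maxDoc() : stats.docCount();
    }

    public static void main(String[] args) {
        // Field present in 250 of 1000 documents: docCount wins.
        System.out.println(numberOfDocuments(new FieldStats(1000, 250))); // prints 250
        // docCount unavailable: fall back to maxDoc.
        System.out.println(numberOfDocuments(new FieldStats(1000, -1)));  // prints 1000
    }
}
```

Keeping the check in one helper would let the similarity implementations share the same fallback rather than repeating the ternary in each place.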
> Instead of docCount(), maxDoc() is used for numberOfDocuments in
> SimilarityBase
> -------------------------------------------------------------------------------
>
> Key: LUCENE-6711
> URL: https://issues.apache.org/jira/browse/LUCENE-6711
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/search
> Affects Versions: 5.2.1
> Reporter: Ahmet Arslan
> Priority: Minor
> Fix For: 5.3
>
> Attachments: LUCENE-6711.patch, LUCENE-6711.patch
>
>
> {{SimilarityBase.java}} has the following line :
> {code}
> long numberOfDocuments = collectionStats.maxDoc();
> {code}
> It seems like {{collectionStats.docCount()}}, which returns the total number
> of documents that have at least one term for this field, is a more appropriate
> statistic here.