[
https://issues.apache.org/jira/browse/LUCENE-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070654#comment-13070654
]
David Mark Nemeskey commented on LUCENE-3220:
---------------------------------------------
I think I remember now what I wanted with numberOfFieldTokens. I was afraid
that sumTotalTermFreq is affected by norms, index-time boosts, etc., and I
wanted numberOfFieldTokens to be unaffected by those (I no longer know how);
I simply forgot to do so.
But if sumTotalTermFreq is really just the number of tokens in the field, I
will delete one of the two. I am not sure which: to me numberOfFieldTokens
seems a more descriptive name than sumTotalTermFreq, but the latter is used
everywhere in Lucene. May I ask your opinion on this question?
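
To make the question concrete, here is a minimal sketch in plain Java (it
calls no Lucene API). It assumes that sumTotalTermFreq is the sum of the
per-term occurrence counts over all terms of a field, as the name suggests.
The class FieldTokenStats, its methods, and the representation of a field as
lists of tokens are hypothetical and exist only for this illustration. Under
that assumption the two statistics coincide, since every token is counted
exactly once under its term and no boost or norm enters the computation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a "field" is a list of documents, each a list of tokens.
// None of these names come from Lucene; they are assumptions for illustration.
final class FieldTokenStats {

    // Occurrences of each term across all documents
    // (the toy analogue of a per-term totalTermFreq).
    static Map<String, Long> totalTermFreqs(List<List<String>> docs) {
        Map<String, Long> freqs = new HashMap<>();
        for (List<String> doc : docs) {
            for (String token : doc) {
                freqs.merge(token, 1L, Long::sum);
            }
        }
        return freqs;
    }

    // Sum of the per-term totals (the toy analogue of sumTotalTermFreq).
    static long sumTotalTermFreq(List<List<String>> docs) {
        long sum = 0;
        for (long freq : totalTermFreqs(docs).values()) {
            sum += freq;
        }
        return sum;
    }

    // Direct count of tokens in the field (the intended numberOfFieldTokens).
    static long numberOfFieldTokens(List<List<String>> docs) {
        long count = 0;
        for (List<String> doc : docs) {
            count += doc.size();
        }
        return count;
    }
}
```

If the real statistic behaves like this toy version, keeping both fields is
redundant, and the choice between them is purely one of naming.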
> Implement various ranking models as Similarities
> ------------------------------------------------
>
> Key: LUCENE-3220
> URL: https://issues.apache.org/jira/browse/LUCENE-3220
> Project: Lucene - Java
> Issue Type: Sub-task
> Components: core/search
> Affects Versions: flexscoring branch
> Reporter: David Mark Nemeskey
> Assignee: David Mark Nemeskey
> Labels: gsoc
> Attachments: LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch,
> LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch,
> LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch,
> LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch,
> LUCENE-3220.patch
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> With [LUCENE-3174|https://issues.apache.org/jira/browse/LUCENE-3174] done, we
> can finally work on implementing the standard ranking models. Currently DFR,
> BM25 and LM are on the menu.
> Done:
> * {{EasyStats}}: contains all statistics that might be relevant for a
> ranking algorithm
> * {{EasySimilarity}}: the ancestor of all the other similarities. Hides the
> DocScorers and as much implementation detail as possible
> * _BM25_: the current "mock" implementation might be OK
> * _LM_
> * _DFR_
> * The so-called _Information-Based Models_