[
https://issues.apache.org/jira/browse/LUCENE-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Mark Nemeskey updated LUCENE-3220:
----------------------------------------
Attachment: LUCENE-3220.patch
* Fixed #1
* Added a totalBoost to EasySimilarity, and a getter method -- no one uses it
yet
* Added basic implementations for the Jelinek-Mercer and the Dirichlet LM
methods.
As for the last item: the implementation is very basic for now. I want to
factor a few things out (e.g. move p(w|C) to LMStats, possibly in a pluggable
way so that people can implement it however they want). It also doesn't seem
right to have the same LM method implemented twice (both as MockLMSimilarity
and here), so I'll see whether I can merge the two. Finally, I am wondering
whether I should implement the absolute discounting method, which, according to
the paper, seems inferior to the Jelinek-Mercer and Dirichlet methods. Right
now I am leaning towards "no".
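For readers unfamiliar with the two smoothing methods mentioned above, here is
a minimal sketch of the standard textbook formulas for the smoothed term
probability p(w|d). It is not the code in the patch: the class name
LmSmoothingSketch, the method names and the parameter names are all
hypothetical, and p(w|C) is simply passed in as a number rather than taken from
LMStats.

```java
/**
 * Hypothetical helper, NOT part of Lucene or of the attached patch.
 * Computes smoothed document language model probabilities p(w|d).
 */
final class LmSmoothingSketch {
    private LmSmoothingSketch() {}

    /**
     * Jelinek-Mercer smoothing: linear interpolation between the
     * maximum-likelihood estimate tf/docLen and the collection model p(w|C).
     * lambda is the weight given to the collection model, in (0, 1).
     */
    static double jelinekMercer(double tf, double docLen,
                                double collectionProb, double lambda) {
        // Guard against empty documents; the estimate then falls back to p(w|C).
        double mle = docLen > 0 ? tf / docLen : 0.0;
        return (1.0 - lambda) * mle + lambda * collectionProb;
    }

    /**
     * Dirichlet smoothing: the collection model acts as mu pseudo-counts
     * added to the document, so longer documents are smoothed less.
     */
    static double dirichlet(double tf, double docLen,
                            double collectionProb, double mu) {
        return (tf + mu * collectionProb) / (docLen + mu);
    }
}
```

In both cases a term that does not occur in the document (tf = 0) still gets a
non-zero probability proportional to p(w|C), which is the point of smoothing.
An actual Similarity would typically work with the logarithm of these values
and would get p(w|C) from the collection statistics.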
> Implement various ranking models as Similarities
> ------------------------------------------------
>
> Key: LUCENE-3220
> URL: https://issues.apache.org/jira/browse/LUCENE-3220
> Project: Lucene - Java
> Issue Type: Sub-task
> Components: core/search
> Affects Versions: flexscoring branch
> Reporter: David Mark Nemeskey
> Assignee: David Mark Nemeskey
> Labels: gsoc
> Attachments: LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch,
> LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch,
> LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch,
> LUCENE-3220.patch
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> With [LUCENE-3174|https://issues.apache.org/jira/browse/LUCENE-3174] done, we
> can finally work on implementing the standard ranking models. Currently DFR,
> BM25 and LM are on the menu.
> TODO:
> * {{EasyStats}}: contains all statistics that might be relevant for a
> ranking algorithm
> * {{EasySimilarity}}: the ancestor of all the other similarities. Hides the
> DocScorers and as much implementation detail as possible
> * _BM25_: the current "mock" implementation might be OK
> * _LM_
> * _DFR_
> Done: