[ https://issues.apache.org/jira/browse/LUCENE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090685#comment-13090685 ]
Robert Muir commented on LUCENE-2959:
-------------------------------------
I rearranged the BM25 in the branch a little bit; it's now as fast as Lucene's
ranking formula:
{noformat}
Task            QPS tfidf  StdDev tfidf  QPS bm25  StdDev bm25  Pct diff
SpanNear             4.29          0.52      4.14         0.49  -24% - 22%
Phrase               3.97          0.25      3.89         0.25  -13% - 11%
Term                82.18          4.78     81.00         2.56   -9% -  7%
TermBGroup1M1P      83.30          2.41     82.12         2.20   -6% -  4%
SloppyPhrase         8.03          0.31      7.93         0.43  -10% -  8%
AndHighHigh         19.38          0.59     19.16         0.71   -7% -  5%
PKLookup           175.49          4.33    173.67         4.20   -5% -  3%
AndHighMed          40.99          1.12     40.71         1.07   -5% -  4%
TermGroup1M         25.69          0.39     25.69         0.44   -3% -  3%
Fuzzy2              42.62          1.83     42.65         1.80   -8% -  8%
Fuzzy1              91.74          3.48     91.86         3.44   -7% -  7%
Respell             73.96          3.30     74.18         3.29   -8% -  9%
Wildcard            56.33          0.97     56.60         1.08   -3% -  4%
Prefix3             33.36          0.83     33.59         0.97   -4% -  6%
TermBGroup1M        55.58          1.03     56.17         0.88   -2% -  4%
IntNRQ              13.38          0.74     13.58         0.94  -10% - 14%
OrHighMed           11.71          1.18     11.94         0.97  -14% - 22%
OrHighHigh           8.91          0.74      9.13         0.63  -11% - 19%
{noformat}
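For readers wondering how to read the Pct diff column: the ranges above appear consistent with a worst-case/best-case comparison that shifts each QPS by one standard deviation. The sketch below is an assumption inferred from the numbers in the table, not taken from the benchmark tool's source; the function name `pct_diff_range` is hypothetical.

```python
def pct_diff_range(qps_base, sd_base, qps_cmp, sd_cmp):
    """Return (worst, best) percent change of cmp vs. base.

    Assumed interpretation: the worst case pits the slowest plausible
    cmp run (mean - stddev) against the fastest plausible base run
    (mean + stddev); the best case does the opposite. Results are
    truncated toward zero, which matches the table rows checked below.
    """
    base_hi = qps_base + sd_base
    base_lo = qps_base - sd_base
    worst = 100.0 * ((qps_cmp - sd_cmp) - base_hi) / base_hi
    best = 100.0 * ((qps_cmp + sd_cmp) - base_lo) / base_lo
    return int(worst), int(best)  # int() truncates toward zero


if __name__ == "__main__":
    # SpanNear row: tfidf 4.29 +/- 0.52, bm25 4.14 +/- 0.49
    print(pct_diff_range(4.29, 0.52, 4.14, 0.49))
    # Respell row: tfidf 73.96 +/- 3.30, bm25 74.18 +/- 3.29
    print(pct_diff_range(73.96, 3.30, 74.18, 3.29))
```

Under this reading, every range in the table straddles zero, so none of the differences between tfidf and bm25 stands out from run-to-run noise.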
> [GSoC] Implementing State of the Art Ranking for Lucene
> -------------------------------------------------------
>
> Key: LUCENE-2959
> URL: https://issues.apache.org/jira/browse/LUCENE-2959
> Project: Lucene - Java
> Issue Type: New Feature
> Components: core/query/scoring, general/javadocs, modules/examples
> Reporter: David Mark Nemeskey
> Assignee: Robert Muir
> Labels: gsoc2011, lucene-gsoc-11, mentor
> Fix For: flexscoring branch
>
> Attachments: LUCENE-2959_mockdfr.patch, implementation_plan.pdf,
> proposal.pdf
>
>
> Lucene employs the Vector Space Model (VSM) to rank documents, which compares
> unfavorably to state-of-the-art algorithms such as BM25. Moreover, the
> architecture is tailored specifically to VSM, which makes the addition of new
> ranking functions a non-trivial task.
> This project aims to bring state-of-the-art ranking methods to Lucene and to
> implement a query architecture with pluggable ranking functions.
> The wiki page for the project can be found at
> http://wiki.apache.org/lucene-java/SummerOfCode2011ProjectRanking.