On Fri, Jan 28, 2011 at 11:32 AM, David Nemeskey
<[email protected]> wrote:
> Hi all,
>
> I have already sent this mail to Simon Willnauer, and he suggested that I post
> it here for discussion.
>
> I am David Nemeskey, a PhD student at the Eotvos Lorand University, Budapest,
> Hungary. I am doing IR-related research, and we have considered using
> Lucene as our search engine. We were quite satisfied with the speed and ease
> of use. However, we would like to experiment with different ranking algorithms,
> and this is where problems arise. Lucene only supports the VSM, and
> unfortunately the ranking architecture seems to be tailored specifically to
> its needs.
>
> I would be very much interested in revamping the ranking component as a GSoC
> project. The following modifications should be doable in the allocated time
> frame:
> - a new ranking class hierarchy, which is generic enough to allow easy
> implementation of new weighting schemes (at least bag-of-words ones),
> - addition of state-of-the-art ranking methods, such as Okapi BM25, proximity
> and DFR models,
> - configuration for ranking selection, with the old method as default.
>
> I believe all users of Lucene would profit from such a project. It would
> provide the scientific community with an even more useful research aid, while
> regular users could benefit from superior ranking results.
>
> Please let me know your opinion about this proposal.
>

Hi David, honestly this sounds fantastic.

It would be great to have someone to work with us on this issue!

To date, progress is pretty slow-going (minor improvements, cleanups,
additional stats here and there)... but we really need all the help we
can get, especially from people who have a really good understanding
of the various models.

In case you are interested, here are some references to discussions
about adding more flexibility (with some prototypes etc):
http://www.lucidimagination.com/search/document/72787e0e54f798e4/baby_steps_towards_making_lucene_s_scoring_more_flexible
https://issues.apache.org/jira/browse/LUCENE-2392

