Hey Joaquin,
Your work here looks very interesting. The Lucene community has shown a
strong interest in this area before (see LUCENE-965).
I see you went with an LGPL license, though. This might be a bit of a
barrier to getting feedback from a community built around Apache-licensed
software. Obviously there may still be interest, learning, and an
exchange of ideas, but none of your code can be distributed with Lucene,
so what you have done loses some of its appeal in that sense. Is
there any chance you would be willing to relax the license, possibly
gaining more feedback, more contributors, and possible inclusion in Lucene?
That is certainly not necessary to receive feedback, but I think it would
help -- I'd certainly be looking closer anyway.
- Mark
Joaquin Perez Iglesias wrote:
Hi all,
I finally got some time to finish the BM25/BM25F implementation for
Lucene. You can find more details at
http://nlp.uned.es/~jperezi/Lucene-BM25/. It has been tested, but I
cannot guarantee that it is bug-free.
It would be great to receive some feedback about it.
There are some details of the implementation that I think will
be of interest, such as how to calculate the average_length or the idf
at document level.
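
[Editor's illustration, not part of the original message.] To show where average_length and idf enter a BM25 score, here is a minimal sketch in plain Python. It is not the code from the linked Lucene-BM25 package and does not use any Lucene API. The function names, the toy collection, the defaults k1=1.2 and b=0.75, and the non-negative idf variant log(1 + (N - df + 0.5) / (df + 0.5)) are all assumptions made for illustration.

```python
import math
from collections import Counter

def average_length(docs):
    """Mean number of tokens per document (the average_length statistic)."""
    return sum(len(d) for d in docs) / len(docs)

def document_frequencies(docs):
    """For each term, the number of documents that contain it."""
    df = Counter()
    for d in docs:
        df.update(set(d))
    return df

def bm25_score(query_terms, doc, df, num_docs, avg_len, k1=1.2, b=0.75):
    """BM25 score of one tokenized document for a list of query terms.

    k1 and b are commonly used defaults, assumed here for illustration.
    """
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        if tf[term] == 0:
            continue  # a term absent from the document contributes nothing
        # Non-negative idf variant (an assumption; other variants exist).
        idf = math.log(1.0 + (num_docs - df[term] + 0.5) / (df[term] + 0.5))
        # Term-frequency saturation with document-length normalization.
        length_norm = 1.0 - b + b * len(doc) / avg_len
        score += idf * tf[term] * (k1 + 1.0) / (tf[term] + k1 * length_norm)
    return score

# Toy collection (hypothetical data).
docs = [
    ["lucene", "search"],
    ["lucene", "bm25", "bm25", "ranking"],
    ["solr"],
]
avg_len = average_length(docs)
df = document_frequencies(docs)
scores = [bm25_score(["bm25"], d, df, len(docs), avg_len) for d in docs]
```

In this sketch both statistics are computed over the whole collection before scoring, which is why an implementation needs access to them at query time.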
If you find any bug or mistake in the supplied implementation, please
let me know and I will try to fix it. The same goes for questions.
I hope that some of you will find it useful.
Thanks in advance.
[EMAIL PROTECTED] wrote:
Hi Otis,
as my colleague said, we have a first implementation of BM25 on top of
Lucene. This development is part of an (almost finished) thesis
project that compares different IR models on a standard
collection. At the same time, we are trying to extend this first
implementation to support BM25F for multifield queries.
Unfortunately, at the moment we are too busy to prepare a final version
of this code, so we will have to finish it over the summer
(hopefully we will have more time :-))) and make it public at that
time.
We will let this list know when we have finished preparing a
final version.
Thanks to everybody for the interest!!!
Bye
Joaquin
-----------------------------------------------------------
Joaquín Pérez Iglesias
Dpto. Lenguajes y Sistemas Informáticos
E.T.S.I. Informática (UNED)
Ciudad Universitaria
C/ Juan del Rosal nº 16
28040 Madrid - Spain
Phone. +34 91 398 87 25
Fax +34 91 398 65 35
Office 2.07
Email: [EMAIL PROTECTED]
-----------------------------------------------------------
Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
Hi Jose,
I was wondering if you ever got to this. I would love to see and
try BM25 for Lucene!
I'm looking at http://code.google.com/soc/2008/asf/about.html
and it looks like this didn't make it into GSoC, but this would
still be great to have.
Thanks,
Otis
--
Sematext -- http://sematext.com/ --
Lucene - Solr - Nutch
----- Original Message ----
From: José Ramón Pérez Agüera <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org;
Joaquin Perez-Iglesias <[EMAIL PROTECTED]>
Sent: Saturday, March 15, 2008 4:54:08 AM
Subject: Re: Summer of Code idea for lucene
We have almost implemented BM25 using Lucene's structure, but we need
help to finish the query parser and other details. If you or somebody
else would like, we can send you the code, and you can help us implement
the query parser and prepare the code for the sandbox.
If there are people interested, I can make a web page for the project
and put our implementation up for download.
Is anybody interested?
jose
--
José Ramón Pérez Agüera
Dept. de Ingeniería del Software e Inteligencia Artificial
Despacho 411 tlf. 913947599
Facultad de Informática
Universidad Complutense de Madrid
On Sat, Mar 15, 2008 at 5:32 AM, Ian Holsman wrote:
If no one objects (I don't think it's too late)
would you mind a GSOC project to implement BM25
relevancy/scoring?
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]