Hey Joaquin,

Your work here looks very interesting. The Lucene community has shown a strong interest in this area before (see LUCENE-965).

I see you went with an LGPL license though. This might be a bit of a barrier to getting feedback from a community built around Apache-licensed software. Obviously, there may still be interest, learning, and an exchange of ideas, but none of your code can be distributed with Lucene, so what you have done loses some of its appeal in that sense. Is there any chance you would be willing to relax the license, possibly gaining more feedback, contributors, and possible inclusion in Lucene? That is certainly not necessary to receive feedback, but I think it would help -- I'd certainly be looking closer anyway.

- Mark

Joaquin Perez Iglesias wrote:
Hi all,

I finally got some time to finish the BM25/BM25F implementation for Lucene. You can find more details at http://nlp.uned.es/~jperezi/Lucene-BM25/. It has been tested, but I cannot guarantee that it is bug-free.
It would be great to receive some feedback about it.

There are some details about the implementation that I think will be of interest, such as how to calculate the average_length or the idf at document level. If you find any bug or mistake in the supplied implementation, please let me know and I will try to fix it; the same goes for questions.

I hope some of you will find it useful.

Thanks in advance.



[EMAIL PROTECTED] wrote:
Hi Otis,

As my colleague said, we have a first implementation of BM25 on top of Lucene. This development is part of an (almost finished) thesis project that compares different IR models over a standard collection. At the same time, we are trying to extend this first implementation to support BM25F for multifield queries. Unfortunately, we are too busy right now to prepare a final version of this code, so we will have to finish it over the summer (hopefully we will have more time :-))) and make it public then.

We will let this list know when we finish preparing a final version.

Thanks to everybody for the interest!!!

Bye
Joaquin

-----------------------------------------------------------
Joaquín Pérez Iglesias
Dpto. Lenguajes y Sistemas Informáticos
E.T.S.I. Informática (UNED)
Ciudad Universitaria
C/ Juan del Rosal nº 16
28040 Madrid - Spain
Phone. +34 91 398 87 25
Fax    +34 91 398 65 35
Office  2.07
Email: [EMAIL PROTECTED]
-----------------------------------------------------------
Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

Hi Jose,

I was wondering if you ever got to this. I would love to see and try BM25 for
Lucene!


I'm looking at http://code.google.com/soc/2008/asf/about.html
and it looks like this didn't make it into GSoC, but this would still be great
to have.

Thanks,
Otis
--
Sematext -- http://sematext.com/ --
Lucene - Solr - Nutch


----- Original Message ----
From: José Ramón Pérez Agüera <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org;
Joaquin Perez-Iglesias <[EMAIL PROTECTED]>
Sent: Saturday, March 15, 2008 4:54:08 AM
Subject: Re: Summer of Code idea for lucene

We have almost finished implementing BM25 on top of Lucene's structures, but we need
help to finish the query parser and other details. If you or somebody else wants,
we can send you the code, and you can help us implement the query
parser and prepare the code for the sandbox.

If people are interested, I can make a web page for the project
and put our implementation up for download.

Is anybody interested?

jose

--
José Ramón Pérez Agüera

Dept. de Ingeniería del Software e Inteligencia Artificial
Despacho 411 tlf. 913947599
Facultad de Informática
Universidad Complutense de Madrid

On Sat, Mar 15, 2008 at 5:32 AM, Ian Holsman wrote:
If no one objects (I don't think it's too late), would you mind a GSoC
project to implement BM25 relevancy/scoring?
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
