Re: ApacheCon next week

Grant Ingersoll Mon, 12 Dec 2005 17:30:18 -0800

We use boosts that are calculated based on the frequencies and the standard alpha, beta, gamma multipliers from Rocchio. Non-relevant terms decrement the frequency. If a term's adjusted frequency is <= 0, we remove the term (someone has posted a contribution for dealing with negative weights; we just haven't adopted it yet). I am sure there are more things you could do; we just haven't investigated too much. We also give different weights to things we think are more important based on our NLP analysis.
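
To make the scheme above concrete, here is a rough sketch in Python. It is an illustration only, not the code from the talk. The function names, the default alpha/beta/gamma values (common textbook choices) and the per-document averaging are all assumptions. The output uses the `term^boost` form from Lucene's query syntax, and the NLP-based weighting is left out.

```python
from collections import Counter


def rocchio_boosts(query_terms, relevant_docs, nonrelevant_docs,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Return {term: boost} from a Rocchio-style update over raw term frequencies.

    Each document is a list of already-analyzed terms. The default
    alpha/beta/gamma values are assumed, not taken from the original post.
    """
    weights = Counter()
    # Original query terms contribute alpha each.
    for term in query_terms:
        weights[term] += alpha
    # Terms from relevant documents add beta * average frequency.
    for doc in relevant_docs:
        for term, freq in Counter(doc).items():
            weights[term] += beta * freq / len(relevant_docs)
    # Terms from non-relevant documents subtract gamma * average frequency.
    for doc in nonrelevant_docs:
        for term, freq in Counter(doc).items():
            weights[term] -= gamma * freq / len(nonrelevant_docs)
    # Drop any term whose weight ended up <= 0, as described above.
    return {term: w for term, w in weights.items() if w > 0}


def to_query_string(weights):
    """Format the weights as 'term^boost' clauses, highest boost first."""
    ordered = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(f"{term}^{w:.2f}" for term, w in ordered)


weights = rocchio_boosts(
    query_terms=["lucene"],
    relevant_docs=[["lucene", "feedback", "feedback"]],
    nonrelevant_docs=[["spam", "lucene"]],
)
print(to_query_string(weights))
```

In this toy run "spam" occurs only in the non-relevant document, so its weight goes negative and it is dropped; the remaining terms become boosted clauses in the expanded query.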


Ian Soboroff wrote:

Grant Ingersoll <[EMAIL PROTECTED]> writes:

You stole my thunder!  :-)  Was going to post the URL after doing the
actual talk, but that's all right.  I will post a few changes I have
made on the plane tonight or tomorrow to the website below.

Let me know if you have any questions...


I have one.  I've been thinking about the problem with doing relevance
feedback in Lucene, and I appreciate seeing your code on getting the
top terms from a single document.

However, the real problem for RF and pseudo-RF techniques is forming
the query.  You can obviously add terms to a query, but how are you
handling the weighting?  With boosts, or something more sophisticated?

Ian


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

--

-------------------------------------------------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
337 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886


