Re: Jensen–Shannon divergence

2015-12-14 Thread will martin
Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -Original Message-
>> From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
>> Sent: Monday, December 14, 2015 11:21 PM
>> To: java-user@lucene.apache.o

Re: Jensen–Shannon divergence

2015-12-14 Thread Jack Krupansky
Is there any particular reason that you find Lucene's built-in TF/IDF and BM25 similarity models insufficient for your needs? In any case, examining their source code should get you started if you wish to write your own:

RE: Jensen–Shannon divergence

2015-12-14 Thread Uwe Schindler
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -Original Message-
> From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
> Sent: Monday, December 14, 2015 11:21 PM
> To: java-user@lucene.apache.org
> Subject: Re: Jensen–

Re: Jensen–Shannon divergence

2015-12-13 Thread will martin
expand your due diligence beyond wikipedia: i.e. http://ciir.cs.umass.edu/pubfiles/ir-464.pdf

> On Dec 13, 2015, at 8:30 AM, Shay Hummel wrote:
>
> LMDiricletbut its feasibilit

Re: Jensen–Shannon divergence

2015-12-13 Thread will martin
Sorry, it was early. If you go looking on the web, you can find, as I did, reputable work on implementing Dirichlet language models. However, at this hour you might get answers here. Extrapolating others' work into a Lucene implementation is only slightly different from getting answers here. imo

Jensen–Shannon divergence

2015-12-13 Thread Shay Hummel
Hi, I need help implementing a similarity between a query model and a document model. I would like to use the JS-Divergence for ranking documents. The documents and the query will be represented according to the language models approach -
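
For readers following the thread, here is a minimal sketch of the quantity being asked about: the Jensen–Shannon divergence between a query model and a document model, each treated as a unigram probability distribution. It is standalone Java and does not use any Lucene API. The class name `JsDivergenceSketch`, the helper `unigramModel`, and the use of unsmoothed maximum-likelihood estimates are illustrative assumptions, not something stated in the original message.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class for illustration; not part of Lucene.
final class JsDivergenceSketch {

    // Maximum-likelihood unigram model: P(term) = count(term) / number of tokens.
    // Assumption: no smoothing is applied here.
    static Map<String, Double> unigramModel(String[] tokens) {
        Map<String, Double> model = new HashMap<>();
        for (String token : tokens) {
            model.merge(token, 1.0, Double::sum);
        }
        model.replaceAll((term, count) -> count / tokens.length);
        return model;
    }

    // Kullback-Leibler divergence KL(p || m) in nats, summed over vocab.
    // Terms with p(term) = 0 contribute nothing (0 * log 0 is taken as 0).
    static double kl(Map<String, Double> p, Map<String, Double> m, Set<String> vocab) {
        double sum = 0.0;
        for (String term : vocab) {
            double pTerm = p.getOrDefault(term, 0.0);
            if (pTerm > 0.0) {
                sum += pTerm * Math.log(pTerm / m.get(term));
            }
        }
        return sum;
    }

    // JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2.
    static double jsDivergence(Map<String, Double> p, Map<String, Double> q) {
        Set<String> vocab = new HashSet<>(p.keySet());
        vocab.addAll(q.keySet());
        Map<String, Double> m = new HashMap<>();
        for (String term : vocab) {
            m.put(term, 0.5 * (p.getOrDefault(term, 0.0) + q.getOrDefault(term, 0.0)));
        }
        return 0.5 * kl(p, m, vocab) + 0.5 * kl(q, m, vocab);
    }

    public static void main(String[] args) {
        Map<String, Double> query = unigramModel(new String[] {"jensen", "shannon", "divergence"});
        Map<String, Double> doc = unigramModel(new String[] {"shannon", "entropy", "and", "divergence"});
        System.out.println("JSD(query, doc) = " + jsDivergence(query, doc));
    }
}
```

A lower divergence means the two models are closer, so one way to rank would be to sort documents by ascending divergence (or by its negation as a score). Note that the sum runs over the union of both vocabularies, not only over the query terms.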

Re: Jensen–Shannon divergence

2015-12-13 Thread Shay Hummel
Hi I am sorry but I didn't understand your answer. Can you please elaborate?

Shay

On Sun, Dec 13, 2015 at 3:41 PM will martin wrote:
> expand your due diligence beyond wikipedia:
> i.e.
>
> http://ciir.cs.umass.edu/pubfiles/ir-464.pdf
>
> > On Dec 13, 2015, at 8:30

Re: Jensen–Shannon divergence

2015-12-13 Thread Ahmet Arslan
Hi Shay, I suggest you extend o.a.l.search.similarities.SimilarityBase. All you need to do is implement a score() method. Fancy names aside (language models, etc.), a similarity is a function of seven salient statistics. It is actually six: avgFieldLength can be derived from the other two
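
To make the "function of a few statistics" idea concrete, here is a standalone sketch of a Dirichlet-smoothed language-model score for one term. It does not extend `SimilarityBase` or touch any Lucene class. The class `TermStatsSketch`, its field names, the add-one collection probability, and the clamping at zero are assumptions chosen for illustration; they are modeled loosely on what Lucene's language-model similarities are understood to do and should be checked against the Lucene version in use.

```java
// Hypothetical container for collection-level statistics; not a Lucene class.
final class TermStatsSketch {
    final long numberOfDocuments;   // documents that have the field
    final long numberOfFieldTokens; // total tokens in the field, over all documents
    final long docFreq;             // documents containing the term (unused by this formula)
    final long totalTermFreq;       // occurrences of the term over all documents

    TermStatsSketch(long numberOfDocuments, long numberOfFieldTokens,
                    long docFreq, long totalTermFreq) {
        this.numberOfDocuments = numberOfDocuments;
        this.numberOfFieldTokens = numberOfFieldTokens;
        this.docFreq = docFreq;
        this.totalTermFreq = totalTermFreq;
    }

    // Derived statistic: average field length follows from two of the others.
    double avgFieldLength() {
        return (double) numberOfFieldTokens / numberOfDocuments;
    }

    // Assumed collection model: add-one estimate of P(term | collection).
    double collectionProbability() {
        return (totalTermFreq + 1.0) / (numberOfFieldTokens + 1.0);
    }

    // Dirichlet-smoothed per-term score, given the term frequency in the
    // document (freq), the document length (docLen), and the smoothing
    // parameter mu. Negative values are clamped to zero (an assumption).
    static double dirichletScore(TermStatsSketch stats, double freq, double docLen, double mu) {
        double score = Math.log(1.0 + freq / (mu * stats.collectionProbability()))
                + Math.log(mu / (docLen + mu));
        return Math.max(score, 0.0);
    }

    public static void main(String[] args) {
        TermStatsSketch stats = new TermStatsSketch(10_000L, 999_999L, 80L, 99L);
        System.out.println("avgFieldLength = " + stats.avgFieldLength());
        System.out.println("score = " + dirichletScore(stats, 5.0, 100.0, 2000.0));
    }
}
```

In a real Lucene similarity the body of `dirichletScore` would sit inside the `score()` method of a `SimilarityBase` subclass. The exact signature of that method differs between Lucene versions, so it is not reproduced here.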