Someone can check my facts here, but under the null hypothesis of
independence the log-likelihood ratio statistic asymptotically follows
a chi-square distribution (one degree of freedom for a 2x2 contingency
table). You can figure an actual probability from that in the usual
way, from its CDF. You would need to tweak the code you see in the
project to compute an actual LLR by normalizing the input.

You could then use 1-p, where p is the p-value of the test, as a
similarity metric.
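
Here is a rough sketch of the 1-p idea in Python, to make it concrete.
It is not the project's code. The function names are made up for
illustration, and it assumes the input is the raw 2x2 co-occurrence
table (k11 = both items, k12 and k21 = one item only, k22 = neither).
For one degree of freedom the chi-square CDF reduces to
erf(sqrt(x/2)), so no stats library is needed.

```python
import math


def x_log_x(x):
    # Convention: 0 * log(0) is taken to be 0.
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized entropy of a set of counts: N*log(N) - sum(k*log(k)).
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    # G^2 statistic for a 2x2 contingency table of raw counts.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


def chi2_cdf_1df(x):
    # CDF of the chi-square distribution with one degree of freedom.
    return math.erf(math.sqrt(x / 2.0))


def llr_similarity(k11, k12, k21, k22):
    # 1 - p, where p = 1 - CDF(LLR) is the p-value, so this is the CDF.
    return chi2_cdf_1df(log_likelihood_ratio(k11, k12, k21, k22))
```

One caveat: the statistic measures departure from independence in
either direction, so two items that co-occur much less than expected
also score close to 1 with this approach.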

Note that this isn't how the test statistic is turned into a
similarity metric in the project now. But 1-p sounds nicer. Maybe the
historical reason was speed, or ignorance.

On Thu, Jun 20, 2013 at 8:53 AM, Dan Filimon
<dangeorge.fili...@gmail.com> wrote:
> When computing item-item similarity using the log-likelihood similarity
> [1], can I simply apply a sigmoid to the resulting values to get the
> probability that two items are similar?
>
> Is there any other processing I need to do?
>
> Thanks!
>
> [1] http://tdunning.blogspot.ro/2008/03/surprise-and-coincidence.html
