Hello,

Can anyone help me understand the scoring function in the
LMJelinekMercerSimilarity class?

The scoring function in LMJelinekMercerSimilarity is shown below:
--------------------------------------------------------
float score = stats.getTotalBoost() *
    (float) Math.log(1 + ((1 - lambda) * freq / docLen)
        / (lambda * ((LMStats) stats).getCollectionProbability()));
--------------------------------------------------------

Can anyone help explain the equation? I understand the document-level part
of the score, i.e. (1 - lambda) * freq / docLen.

I assume getCollectionProbability() returns col_freq(t) / col_size, i.e. the
collection frequency of term t divided by the total number of tokens in the
collection. Am I right?

Also, the boost factor, stats.getTotalBoost(), is not clear to me.

I want to reproduce the LM-JM scores independently, which is why I need
these details.
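
To make the question concrete, here is a minimal standalone sketch of how I
currently read the expression. It is not Lucene code: the class
LmJmScoreSketch and its methods are hypothetical names, the collection
probability is assumed to be col_freq(t) / col_size (the very assumption I am
asking about), and the boost is passed in as a plain number because I do not
know how getTotalBoost() is derived.

```java
/** Hypothetical standalone sketch of the LM-JM expression; not part of Lucene. */
final class LmJmScoreSketch {

    /**
     * ASSUMPTION: the collection probability is the plain ratio
     * col_freq(t) / col_size. The real collection model may differ.
     */
    static float collectionProbability(long colFreq, long colSize) {
        return (float) colFreq / (float) colSize;
    }

    /**
     * Mirrors the quoted expression term by term.
     * boost stands in for stats.getTotalBoost(); how that value is built
     * is not modelled here.
     */
    static float score(float boost, float lambda, float freq, float docLen,
                       float collectionProb) {
        return boost * (float) Math.log(
            1 + ((1 - lambda) * freq / docLen) / (lambda * collectionProb));
    }

    public static void main(String[] args) {
        // Illustrative numbers only: term occurs 10 times in 1000 tokens overall,
        // and twice in a document of length 10.
        float p = collectionProbability(10, 1000);
        float s = score(1.0f, 0.5f, 2.0f, 10.0f, p);
        System.out.println("score = " + s);
    }
}
```

With lambda = 0.5, freq = 2, docLen = 10 and a collection probability of 0.01,
the expression reduces to boost * ln(1 + 0.1 / 0.005) = boost * ln(21). A term
that does not occur in the document (freq = 0) gets ln(1) = 0.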

Thanks.
Dwaipayan Roy
