Hello,

Can anyone help me understand the scoring function in the LMJelinekMercerSimilarity class?
The scoring function in LMJelinekMercerSimilarity is shown below:

--------------------------------------------------------
float score = stats.getTotalBoost() *
    (float)Math.log(1 +
        ((1 - lambda) * freq / docLen) /
        (lambda * ((LMStats)stats).getCollectionProbability()));
--------------------------------------------------------

Can anyone help explain the equation? I understand the document-level part of the score, i.e. (1 - lambda) * freq / docLen. I assume getCollectionProbability() returns col_freq(t) / col_size. Am I right?

The boosting part, stats.getTotalBoost(), is also not clear to me.

I want to reproduce the LM-JM scores myself, hence I need the details.

Thanks,
Dwaipayan Roy
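
P.S. To make the question concrete, here is a minimal standalone sketch of how I currently read the formula above. It does not call the library at all: the class name JelinekMercerSketch and the parameters boost and collectionProb are placeholders of my own. collectionProb stands in for whatever getCollectionProbability() returns (my guess being col_freq(t) / col_size), and boost stands in for stats.getTotalBoost(), which I simply set to 1 here because I do not know how it is computed.

```java
// Hypothetical standalone re-implementation of the LM-JM formula quoted above.
// Nothing here is taken from the library beyond the arithmetic of that one line.
final class JelinekMercerSketch {

    private JelinekMercerSketch() {}

    /**
     * @param boost          placeholder for stats.getTotalBoost() (assumed, not verified)
     * @param lambda         smoothing parameter, 0 < lambda <= 1
     * @param freq           frequency of the term in the document
     * @param docLen         length of the document in tokens
     * @param collectionProb placeholder for getCollectionProbability(),
     *                       assumed here to be col_freq(t) / col_size
     */
    static float score(float boost, float lambda, float freq,
                       float docLen, float collectionProb) {
        // Document-level part: (1 - lambda) * P(t | d)
        float docPart = (1 - lambda) * freq / docLen;
        // Collection-level part: lambda * P(t | C)
        float collectionPart = lambda * collectionProb;
        return boost * (float) Math.log(1 + docPart / collectionPart);
    }

    public static void main(String[] args) {
        // Made-up numbers: term occurs once in a 10-token document,
        // assumed collection probability 0.1, lambda 0.5, boost 1.
        // docPart and collectionPart are both 0.05, so the score is ln(2), about 0.693.
        System.out.println(score(1f, 0.5f, 1f, 10f, 0.1f));
    }
}
```

If this reading is right, a term that does not occur in the document (freq = 0) scores exactly 0, and the boost only scales the result linearly. What I cannot tell from the formula alone is what values the library actually feeds in for the two placeholders.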