Waiting for an explanation for my query. Thank you very much.

On Tue, Dec 20, 2016 at 10:51 PM, Dwaipayan Roy <dwaipayan....@gmail.com> wrote:
> Hello,
>
> Can anyone help me understand the scoring function in the
> LMJelinekMercerSimilarity class?
>
> The scoring function in LMJelinekMercerSimilarity is shown below:
> --------------------------------------------------------
> float score = stats.getTotalBoost() *
>     (float)Math.log(1 + ((1 - lambda) * freq / docLen) /
>     (lambda * ((LMStats)stats).getCollectionProbability()));
> --------------------------------------------------------
>
> Can anyone help explain the equation? I can understand the scoring effect
> when calculating the stat in the document, i.e.:
> (1 - lambda) * freq / docLen.
>
> I hope getCollectionProbability() returns col_freq(t) / col_size. Am I
> right?
>
> Also the boosting part is not clear to me (stats.getTotalBoost()).
>
> I want to reproduce the result of the scoring using LM-JM. Hence I want
> the details.
>
> Thanks.
> Dwaipayan Roy.
>

--
Dwaipayan Roy.
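
For anyone trying to reproduce the numbers by hand, here is a minimal standalone sketch of the expression quoted above. It is not Lucene code. The class and method names (JmScoreSketch, collectionProbability, score) are hypothetical, the collection probability is assumed to be the plain ratio col_freq(t) / col_size that the question hopes for, and the boost is treated as a plain multiplier (assumed to be 1.0 when no boost is set). Note that log(1 + a/b) = log((a + b)/b), so with a = (1 - lambda) * freq / docLen and b = lambda * p(t|C) the numerator is the usual Jelinek-Mercer smoothed probability, and a term that does not occur in the document (freq = 0) contributes exactly 0.

```java
// Standalone sketch of the quoted expression; this is NOT Lucene code.
// All names here (JmScoreSketch, collectionProbability, score) are hypothetical.
final class JmScoreSketch {

    // ASSUMPTION: p(t|C) = col_freq(t) / col_size, as the question hopes.
    // The value Lucene's getCollectionProbability() actually returns may differ.
    static float collectionProbability(long collectionFreq, long collectionSize) {
        return (float) collectionFreq / (float) collectionSize;
    }

    // Same arithmetic as the quoted line, with the Lucene stats object
    // replaced by plain parameters.
    static float score(float boost, float freq, float docLen,
                       float lambda, float collectionProb) {
        float docPart = (1 - lambda) * freq / docLen;   // document model share
        float collectionPart = lambda * collectionProb; // collection model share
        return boost * (float) Math.log(1 + docPart / collectionPart);
    }

    public static void main(String[] args) {
        // Made-up illustrative numbers, not taken from any real index.
        float pC = collectionProbability(50, 100_000);
        float lambda = 0.1f;
        System.out.println(score(1.0f, 3, 120, lambda, pC)); // term occurs 3 times
        System.out.println(score(1.0f, 0, 120, lambda, pC)); // term absent from the document
    }
}
```

Comparing the output of such a sketch against the explanation Lucene produces for a single term and document is one way to check whether the assumed collection probability and boost match what the index actually uses.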