[ https://issues.apache.org/jira/browse/LUCENE-8011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266230#comment-16266230 ]
Mayya Sharipova commented on LUCENE-8011:
-----------------------------------------

Hello! Which other specific similarity classes would we like to tackle here? For example, would {{AfterEffect}}, {{AfterEffectB}}, and {{Normalization}} be good candidates?

> Improve similarity explanations
> -------------------------------
>
> Key: LUCENE-8011
> URL: https://issues.apache.org/jira/browse/LUCENE-8011
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Robert Muir
> Labels: newdev
>
> LUCENE-7997 improves BM25 and Classic explains to better explain:
> {noformat}
> product of:
>   2.2 = scaling factor, k1 + 1
>   9.388654 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
>     1.0 = n, number of documents containing term
>     17927.0 = N, total number of documents with field
>   0.9987758 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
>     979.0 = freq, occurrences of term within document
>     1.2 = k1, term saturation parameter
>     0.75 = b, length normalization parameter
>     1.0 = dl, length of field
>     1.0 = avgdl, average length of field
> {noformat}
> Previously it was pretty cryptic and used confusing terminology like docCount/docFreq without explanation:
> {noformat}
> product of:
>   0.016547536 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:
>     449.0 = docFreq
>     456.0 = docCount
>   2.1920826 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
>     113659.0 = freq=113658
>     1.2 = parameter k1
>     0.75 = parameter b
>     2300.5593 = avgFieldLength
>     1048600.0 = fieldLength
> {noformat}
> We should fix other similarities too in the same way, they should be more practical.
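
For reference, the improved BM25 explanation quoted in the issue can be checked by hand, since it spells out both formulas and every input. Below is a minimal Java sketch that recomputes the idf and tf values from that output. It is not Lucene's code: the class and method names (`Bm25ExplainSketch`, `idf`, `tf`) are hypothetical, and it uses double arithmetic, whereas the explain output shows float values, so the trailing digits may differ slightly.

```java
/**
 * Hypothetical helper (not part of Lucene) that recomputes the idf and tf
 * factors using the formulas printed in the improved BM25 explanation.
 */
final class Bm25ExplainSketch {

    /** idf = log(1 + (N - n + 0.5) / (n + 0.5)), natural logarithm. */
    static double idf(double n, double N) {
        return Math.log(1 + (N - n + 0.5) / (n + 0.5));
    }

    /** tf = freq / (freq + k1 * (1 - b + b * dl / avgdl)). */
    static double tf(double freq, double k1, double b, double dl, double avgdl) {
        return freq / (freq + k1 * (1 - b + b * dl / avgdl));
    }

    public static void main(String[] args) {
        // Inputs copied from the improved explanation in the issue description.
        double k1 = 1.2;
        double b = 0.75;

        double scaling = k1 + 1;                   // "scaling factor, k1 + 1"
        double idfValue = idf(1.0, 17927.0);       // n = 1, N = 17927
        double tfValue = tf(979.0, k1, b, 1.0, 1.0); // freq = 979, dl = avgdl = 1

        // The explanation is a "product of" these three factors.
        double score = scaling * idfValue * tfValue;

        System.out.printf("scaling=%.1f idf=%.6f tf=%.7f score=%.4f%n",
                scaling, idfValue, tfValue, score);
    }
}
```

The computed idf and tf should agree with the 9.388654 and 0.9987758 shown in the explanation up to float rounding. Being able to do this kind of check from the explain text alone is the property the other similarities would need as well.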