Hi all,
We are trying to reproduce the scoring behaviour of Solr 3.6 in Solr 8.0, and
we have hit a problem that we cannot solve.
When the query contains duplicated tokens:
- Solr 8.0 scores the token only once, but applies a boost to it (2.0 in the example below)
- Solr 3.6 scores each occurrence of the token individually, and the final score is lower
We are using the ClassicSimilarity algorithm, but we cannot prevent that boost.
Example query: table 60 cm 50 cm
Solr 8.0
11.096966 = sum of:
  4.3195267 = sum of:
    4.3195267 = weight(name:table in 138556) [ClassicSimilarity], result of:
      4.3195267 = score(freq=1.0), product of:
        8.639053 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
          62381 = docFreq, number of documents containing term
          129615816 = docCount, total number of documents with field
        1.0 = tf(freq=1.0), with freq of:
          1.0 = freq, occurrences of term within document
        0.5 = fieldNorm
  2.7624812 = weight(name:60 in 138556) [ClassicSimilarity], result of:
    2.7624812 = score(freq=1.0), product of:
      5.5249624 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
        1404402 = docFreq, number of documents containing term
        129615816 = docCount, total number of documents with field
      1.0 = tf(freq=1.0), with freq of:
        1.0 = freq, occurrences of term within document
      0.5 = fieldNorm
  4.0149584 = weight(name:cm in 138556) [ClassicSimilarity], result of:
    4.0149584 = score(freq=1.0), product of:
      2.0 = boost
      4.0149584 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
        6357381 = docFreq, number of documents containing term
        129615816 = docCount, total number of documents with field
      1.0 = tf(freq=1.0), with freq of:
        1.0 = freq, occurrences of term within document
      0.5 = fieldNorm
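For reference, the Solr 8.0 numbers can be reproduced by hand. The Python sketch below is only an illustration: the function and variable names are made up for this post, the idf formula is the one printed in the explain output, and tf is assumed to be sqrt(freq), which is what ClassicSimilarity uses.

```python
import math

DOC_COUNT = 129615816  # docCount from the Solr 8.0 explain output


def classic_idf(doc_freq, doc_count):
    # idf as printed in the explain: log((docCount+1)/(docFreq+1)) + 1
    return math.log((doc_count + 1) / (doc_freq + 1)) + 1


def term_score(doc_freq, freq=1.0, field_norm=0.5, boost=1.0):
    # score = boost * idf * tf * fieldNorm, with tf assumed to be sqrt(freq)
    tf = math.sqrt(freq)
    return boost * classic_idf(doc_freq, DOC_COUNT) * tf * field_norm


table = term_score(62381)
sixty = term_score(1404402)
cm = term_score(6357381, boost=2.0)  # duplicated token: one clause, boost 2.0

total = table + sixty + cm
print(round(table, 4), round(sixty, 4), round(cm, 4), round(total, 4))
```

With boost=2.0 the single cm clause contributes about 4.015; calling term_score(6357381) without the boost gives about 2.007, which is half of that.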
Solr 3.6
3.098446 = (MATCH) product of:
  3.8730574 = (MATCH) sum of:
    2.120801 = (MATCH) sum of:
      2.120801 = (MATCH) weight(name:table in 101441), product of:
        0.4913325 = queryWeight(name:table), product of:
          8.632854 = idf(docFreq=135231, maxDocs=279245306)
          0.05691426 = queryNorm
        4.316427 = (MATCH) fieldWeight(name:table in 101441), product of:
          1.0 = tf(termFreq(name:table)=1)
          8.632854 = idf(docFreq=135231, maxDocs=279245306)
          0.5 = fieldNorm(field=name, doc=101441)
    0.8427305 = (MATCH) weight(name:60 in 101441), product of:
      0.30972046 = queryWeight(name:60), product of:
        5.4418783 = idf(docFreq=3287778, maxDocs=279245306)
        0.05691426 = queryNorm
      2.7209392 = (MATCH) fieldWeight(name:60 in 101441), product of:
        1.0 = tf(termFreq(name:60)=1)
        5.4418783 = idf(docFreq=3287778, maxDocs=279245306)
        0.5 = fieldNorm(field=name, doc=101441)
    0.45476305 = (MATCH) weight(name:cm in 101441), product of:
      0.22751924 = queryWeight(name:cm), product of:
        3.9975789 = idf(docFreq=13936507, maxDocs=279245306)
        0.05691426 = queryNorm
      1.9987894 = (MATCH) fieldWeight(name:cm in 101441), product of:
        1.0 = tf(termFreq(name:cm)=1)
        3.9975789 = idf(docFreq=13936507, maxDocs=279245306)
        0.5 = fieldNorm(field=name, doc=101441)
    0.45476305 = (MATCH) weight(name:cm in 101441), product of:
      0.22751924 = queryWeight(name:cm), product of:
        3.9975789 = idf(docFreq=13936507, maxDocs=279245306)
        0.05691426 = queryNorm
      1.9987894 = (MATCH) fieldWeight(name:cm in 101441), product of:
        1.0 = tf(termFreq(name:cm)=1)
        3.9975789 = idf(docFreq=13936507, maxDocs=279245306)
        0.5 = fieldNorm(field=name, doc=101441)
  0.8 = coord(4/5)
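The Solr 3.6 score can be recomputed the same way. Again this is only a rough Python sketch with made-up names: queryNorm and coord are copied straight from the explain output, and the idf formula log(maxDocs/(docFreq+1)) + 1 is an assumption about the old DefaultSimilarity that happens to match the printed idf values.

```python
import math

MAX_DOCS = 279245306     # maxDocs from the Solr 3.6 explain output
QUERY_NORM = 0.05691426  # queryNorm, copied from the explain output
COORD = 4 / 5            # coord(4/5): 4 of the 5 query clauses matched


def old_idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed Lucene 3.x DefaultSimilarity idf: log(maxDocs/(docFreq+1)) + 1
    return math.log(max_docs / (doc_freq + 1)) + 1


def term_weight(doc_freq, term_freq=1, field_norm=0.5):
    idf = old_idf(doc_freq)
    query_weight = idf * QUERY_NORM                          # queryWeight
    field_weight = math.sqrt(term_freq) * idf * field_norm  # fieldWeight
    return query_weight * field_weight


# One entry per matching clause; cm appears twice because it is scored twice.
weights = [
    term_weight(135231),    # table
    term_weight(3287778),   # 60
    term_weight(13936507),  # cm
    term_weight(13936507),  # cm (duplicate)
]

total = sum(weights) * COORD
print([round(w, 4) for w in weights], round(total, 4))
```

Here each cm clause contributes about 0.455 before coord is applied, so the two together add about 0.91 to the sum.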
Is it possible to configure this?
Thanks in advance!
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html