Many thanks.

On Fri, Aug 19, 2016 at 4:22 PM, Anshum Gupta <ans...@anshumgupta.net>
wrote:

> The default similarity changed from TF-IDF to BM25 in 6.0.
>
> On Fri, Aug 19, 2016 at 3:00 PM John Bickerstaff <j...@johnbickerstaff.com>
> wrote:
>
> > Bump!
> >
> > TL;DR Question: Are scores (and debug output) *expected* to be different
> > between 5.4 and 6.1?
> >
> > On Thu, Aug 18, 2016 at 2:44 PM, John Bickerstaff <j...@johnbickerstaff.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > TL;DR:
> > > Is it expected that the /select endpoint would produce different
> > > scores/result order between versions 5.4 and 6.1?
> > >
> > >
> > > (I'm aware that it's certainly possible I've done something different to
> > > these environments, although at this point I can't see any difference in
> > > configs etc... and I used a very simple search against /select to test
> > > this)
> > >
> > > ====== Detail ==========
> > >
> > > I'm currently seeing different scoring and different result order when I
> > > compare Solr results in the Admin console for a 5.4 and 6.1 environment.
> > >
> > > I'm using the /select endpoint to try to avoid any difference in
> > > configuration.  To the best of my knowledge (and reading) I haven't ever
> > > modified the XML for that endpoint.
> > >
> > > As I was looking into it, I saw that the debug output looks quite
> > > different in 6.1...
> > >
> > > Any advice, including "You must have broken it yourself, that's
> > > impossible" is much appreciated.
> > >
> > >
> > >
> > > Here's debug from the "old" 5.4 SolrCloud environment.  The IDs are a
> > > pain to read, but not only am I getting different scores, I'm getting
> > > different docs (or docs in a clearly different order).
> > >
> > > "debug": {
> > >   "rawquerystring": "chiari",
> > >   "querystring": "chiari",
> > >   "parsedquery": "text:chiari",
> > >   "parsedquery_toString": "text:chiari",
> > >   "explain": {
> > >     "d9644f86-5fe2-4a9f-8517-545e2cde0b64": "\n4.3581347 = weight(text:chiari in 26783) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 26783, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=26783)\n",
> > >     "1347f707-6fdd-4864-b9dd-6d3e7cc32bf5": "\n4.3581347 = weight(text:chiari in 26792) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 26792, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=26792)\n",
> > >     "d01c32ad-e29d-4b65-9930-f8a6844a2613": "\n4.3581347 = weight(text:chiari in 27028) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 27028, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=27028)\n",
> > >     "0c5a4be7-1162-4b1a-ab83-4b48a690fc3a": "\n4.3581347 = weight(text:chiari in 27029) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 27029, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=27029)\n",
> > >     "e1cb441d-9d60-482d-956b-3fbc964a17c1": "\n4.3581347 = weight(text:chiari in 27042) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 27042, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=27042)\n",
> > >     "f87951f1-e163-4f17-a628-904b9df0c609": "\n4.3581347 = weight(text:chiari in 27043) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 27043, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=27043)\n",
> > >     "caaa7ca1-34cb-44a8-8dd9-12c909db8c2d": "\n4.3581347 = weight(text:chiari in 27044) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 27044, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=27044)\n",
> > >     "ada7a87e-725a-4533-b72e-3817af4c7179": "\n4.3581347 = weight(text:chiari in 27055) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 27055, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=27055)\n",
> > >     "ac6d47fd-9a59-47d6-8cfb-11b34c7ded54": "\n4.3581347 = weight(text:chiari in 27056) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 27056, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=27056)\n",
> > >     "4aaa7697-b26a-4bea-ba4e-70d18ea649f0": "\n4.3581347 = weight(text:chiari in 62240) [ClassicSimilarity], result of:\n 4.3581347 = fieldWeight in 62240, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 = fieldNorm(doc=62240)\n"
> > >   },
> > >   "QParser": "LuceneQParser",
> > >   "timing": { "time": 2, "prepare": { "time": 0, "query": { "time": 0 },
> > >
> > > ... and here's the same from the SolrCloud 6.0 environment:
> > >
> > > "debug":{
> > >   "rawquerystring":"chiari",
> > >   "querystring":"chiari",
> > >   "parsedquery":"text:chiari",
> > >   "parsedquery_toString":"text:chiari",
> > >   "explain":{
> > >     "85249c23-ef68-4276-9ef7-48c290033993":"\n9.735645 = weight(text:chiari in 106960) [], result of:\n 9.735645 = score(doc=106960,freq=50.0 = termFreq=50.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 2.0289173 = tfNorm, computed from:\n 50.0 = termFreq=50.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 4096.0 = fieldLength\n",
> > >     "495b660d-8e8f-4b75-a523-106440468818":"\n9.655164 = weight(text:chiari in 106215) [], result of:\n 9.655164 = score(doc=106215,freq=58.0 = termFreq=58.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 2.0121448 = tfNorm, computed from:\n 58.0 = termFreq=58.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 5349.8774 = fieldLength\n",
> > >     "841df60a-b83e-4e74-9ad5-463971d5220a":"\n9.613188 = weight(text:chiari in 106214) [], result of:\n 9.613188 = score(doc=106214,freq=74.0 = termFreq=74.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 2.003397 = tfNorm, computed from:\n 74.0 = termFreq=74.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 7281.778 = fieldLength\n",
> > >     "0a8ab59f-95e3-4fca-adea-5a62d97b4369":"\n9.594478 = weight(text:chiari in 106440) [], result of:\n 9.594478 = score(doc=106440,freq=54.0 = termFreq=54.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 1.9994978 = tfNorm, computed from:\n 54.0 = termFreq=54.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 5349.8774 = fieldLength\n",
> > >     "15595a34-88c4-42e0-a6b2-9ee8eafdd9e8":"\n9.502294 = weight(text:chiari in 106958) [], result of:\n 9.502294 = score(doc=106958,freq=38.0 = termFreq=38.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 1.9802866 = tfNorm, computed from:\n 38.0 = termFreq=38.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 4096.0 = fieldLength\n",
> > >     "0acd1f88-395c-434d-9cba-919e7073080c":"\n9.449741 = weight(text:chiari in 106439) [], result of:\n 9.449741 = score(doc=106439,freq=62.0 = termFreq=62.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 1.9693346 = tfNorm, computed from:\n 62.0 = termFreq=62.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 7281.778 = fieldLength\n",
> > >     "66516297-cf1d-4ee8-847b-a5193420491a":"\n9.284438 = weight(text:chiari in 108786) [], result of:\n 9.284438 = score(doc=108786,freq=53.0 = termFreq=53.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 1.9348853 = tfNorm, computed from:\n 53.0 = termFreq=53.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 7281.778 = fieldLength\n",
> > >     "0c5a4be7-1162-4b1a-ab83-4b48a690fc3a":"\n9.164393 = weight(text:chiari in 6100) [], result of:\n 9.164393 = score(doc=6100,freq=2.0 = termFreq=2.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 1.9098678 = tfNorm, computed from:\n 2.0 = termFreq=2.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 4.0 = fieldLength\n",
> > >     "e1cb441d-9d60-482d-956b-3fbc964a17c1":"\n9.164393 = weight(text:chiari in 6113) [], result of:\n 9.164393 = score(doc=6113,freq=2.0 = termFreq=2.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 1.9098678 = tfNorm, computed from:\n 2.0 = termFreq=2.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 4.0 = fieldLength\n",
> > >     "f87951f1-e163-4f17-a628-904b9df0c609":"\n9.164393 = weight(text:chiari in 6114) [], result of:\n 9.164393 = score(doc=6114,freq=2.0 = termFreq=2.0\n), product of:\n 4.798444 = idf(docFreq=281, docCount=34151)\n 1.9098678 = tfNorm, computed from:\n 2.0 = termFreq=2.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 941.3421 = avgFieldLength\n 4.0 = fieldLength\n"
> > >   },
> > >   "QParser":"LuceneQParser",
> > >   "timing":{ "time":1.0,
> > >
> >
>
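
For anyone comparing the two explain outputs quoted in this thread, the
TF-IDF to BM25 change can be checked by hand. The following is a minimal
sketch, not Solr or Lucene code: the helper names classic_score and
bm25_score are made up for illustration, and the formulas are the ones
implied by the explain lines (tf * idf * fieldNorm for ClassicSimilarity,
idf * tfNorm for BM25 with k1=1.2 and b=0.75). All input numbers are copied
from the debug output; nothing else is assumed about the index.

```python
import math


def classic_score(freq, doc_freq, max_docs, field_norm):
    """TF-IDF score as broken down in the 5.4 explain output (hypothetical helper)."""
    tf = math.sqrt(freq)                                # tf(freq=1.0) -> 1.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))   # idf(docFreq, maxDocs)
    return tf * idf * field_norm


def bm25_score(freq, doc_freq, doc_count, field_length, avg_field_length,
               k1=1.2, b=0.75):
    """BM25 score as broken down in the 6.x explain output (hypothetical helper)."""
    idf = math.log(1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    length_ratio = field_length / avg_field_length
    tf_norm = (freq * (k1 + 1.0)) / (freq + k1 * (1.0 - b + b * length_ratio))
    return idf * tf_norm


# Inputs taken from the 5.4 explain entries (all ten share these values).
classic = classic_score(freq=1.0, doc_freq=281, max_docs=110738, field_norm=0.625)

# Inputs taken from the 6.x explain entry for doc 6100.
bm25 = bm25_score(freq=2.0, doc_freq=281, doc_count=34151,
                  field_length=4.0, avg_field_length=941.3421)

print("ClassicSimilarity:", classic)
print("BM25:", bm25)
```

The two printed values should land close to the 4.3581347 and 9.164393
reported in the explain output; small differences in the last digits are
expected because Lucene works in single-precision floats. Since the two
formulas weight term frequency and field length differently, a different
result order between the two versions is consistent with the similarity
change alone.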
