I don't really know what I'm talking about here, and I'm not sure whether this is related, but your particular query:

"The Beatles as musicians : Revolver through the Anthology"

with its lone "word" that is just a ':', reminds me of a dismax stopwords-type problem I ran into. I ran into it on 1.4, though, and I don't know why it would behave differently on 1.4 and 3.x. I also see you aren't even using a multi-field dismax in your sample query, so it probably can't be what I ran into. But I'll write this up anyway in case it gives someone some ideas.

The problem I ran into is caused by different analysis in two fields that are both used in a dismax query: one field's analysis ends up keeping ":" as a token, and the other's doesn't. That ends up having the same effect as the famous 'dismax stopwords problem'.
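To make that concrete, here is a toy model in Python. It is not Solr's actual code, just a sketch of the behaviour as I understand it. The field names (title_exact, title_text), both analyzers, and the way 'mm' is rounded are all made up for illustration.

```python
# Toy model only: NOT Solr's implementation. Field names, analyzers and
# the mm rounding below are assumptions made up for illustration.

def analyze_keep_punct(text):
    """Hypothetical analyzer: whitespace split + lowercase; ':' survives as a token."""
    return [t.lower() for t in text.split()]

def analyze_drop_punct(text):
    """Hypothetical analyzer: same, but drops tokens with no letters or digits."""
    return [t.lower() for t in text.split() if any(c.isalnum() for c in t)]

FIELDS = {"title_exact": analyze_keep_punct, "title_text": analyze_drop_punct}

def dismax_match(query, doc, mm_percent=100):
    """One clause per whitespace-separated chunk of the query.

    A clause matches if at least one field that still produces tokens
    for the chunk contains all of those tokens in the document.
    """
    chunks = query.split()
    matched = 0
    for chunk in chunks:
        hit = False
        for field, analyze in FIELDS.items():
            tokens = analyze(chunk)
            doc_tokens = analyze(doc.get(field, ""))
            # A field whose analysis drops the chunk contributes no clause at all.
            if tokens and all(t in doc_tokens for t in tokens):
                hit = True
        matched += hit
    required = len(chunks) * mm_percent // 100
    return matched >= required

DOC = {"title_exact": "Beatles Revolver", "title_text": "Beatles Revolver"}

print(dismax_match("Beatles Revolver", DOC))                    # both clauses match
print(dismax_match("Beatles : Revolver", DOC))                  # ':' clause can never match
print(dismax_match("Beatles : Revolver", DOC, mm_percent=50))   # a looser mm hides the problem
```

In this model the ':' chunk still counts as a clause for 'mm', because one field keeps it, but no document can satisfy it unless that field literally contains a ':' token. With mm at 100% the whole query then fails.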

Maybe your schema somehow changed in such a way as to produce this problem in 3.x but not in 1.4? Although, again, the fact that you are only using a single field in your demo dismax query suggests it's not this problem. If you try the query without the ":" and the problem goes away, that might be a hint. Or maybe someone more skilled than I am at understanding those Solr debug statements (they're kind of all Greek to me) will be able to take this hint and confirm or rule out that it has something to do with your problem.
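If it helps, here is a small Python sketch that builds the two debug URLs to compare, one with the ":" and one without. The base URL is a placeholder I made up, so substitute your own host and request handler. The qf and pf values come from your sample query, and I'm assuming you select dismax with defType rather than through a handler default. The optional mm argument is there in case you want to vary that too.

```python
from urllib.parse import urlencode

# Hypothetical base URL: replace with your own Solr host and handler.
BASE = "http://localhost:8983/solr/select"

def debug_url(q, mm=None):
    """Build a dismax debugQuery URL against the all_search field."""
    params = {
        "defType": "dismax",
        "qf": "all_search",
        "pf": "all_search",
        "q": q,
        "debugQuery": "true",
    }
    if mm is not None:
        params["mm"] = mm  # only sent when given explicitly
    return BASE + "?" + urlencode(params)

WITH_COLON = '"The Beatles as musicians : Revolver through the Anthology"'
WITHOUT_COLON = WITH_COLON.replace(" : ", " ")

print(debug_url(WITH_COLON))
print(debug_url(WITHOUT_COLON))
```

If the second URL finds the document on 3.5 and the first doesn't, the ":" is at least involved.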

Here is my write-up of the issue I ran into (which may or may not have anything to do with what you ran into):

http://bibwild.wordpress.com/2011/06/15/more-dismax-gotchas-varying-field-analysis-and-mm/


Also, you don't say what your 'mm' is in your dismax queries. That could be relevant if your problem is at all similar to the issue I'm talking about.

Hmm, I wonder whether Solr 3.x changes the way dismax calculates the number of tokens for 'mm', such that the 'varying field analysis dismax gotcha' can manifest with only one field. That could happen if the number of tokens dismax counts for 'mm' differs from the number of tokens the single field's analysis actually produces.

Jonathan

On 2/22/2012 2:55 PM, Naomi Dushay wrote:
I am working on upgrading Solr from 1.4 to 3.5, and I have hit a problem. I
have a test checking for a search result in Solr, and the test passes in Solr
1.4, but fails in Solr 3.5. Dismax is the desired QueryParser -- I just
included output from lucene QueryParser to prove the document exists and is
found.

I am completely stumped.


Here are the debugQuery details:

***Solr 3.5***

lucene QueryParser:

URL:   q=all_search:"The Beatles as musicians : Revolver through the Anthology"
final query:  all_search:"the beatl as musician revolv through the antholog"

6.0562754 = (MATCH) weight(all_search:"the beatl as musician revolv through the 
antholog" in 1064395), product of:
   1.0 = queryWeight(all_search:"the beatl as musician revolv through the 
antholog"), product of:
     48.450203 = idf(all_search: the=3531140 beatl=398 as=645923 musician=11805 
revolv=872 through=81366 the=3531140 antholog=11611)
     0.02063975 = queryNorm
   6.0562754 = fieldWeight(all_search:"the beatl as musician revolv through the 
antholog" in 1064395), product of:
     1.0 = tf(phraseFreq=1.0)
     48.450203 = idf(all_search: the=3531140 beatl=398 as=645923 musician=11805 
revolv=872 through=81366 the=3531140 antholog=11611)
     0.125 = fieldNorm(field=all_search, doc=1064395)

dismax QueryParser:
URL:  qf=all_search&pf=all_search&q="The Beatles as musicians : Revolver through the 
Anthology"
final query:   +(all_search:"the beatl as musician revolv through the antholog"~1)~0.01 
(all_search:"the beatl as musician revolv through the antholog"~3)~0.01

(no matches)


***Solr 1.4***

lucene QueryParser:

URL:  q=all_search:"The Beatles as musicians : Revolver through the Anthology"
final query:  all_search:"the beatl as musician revolv through the antholog"

5.2676983 = fieldWeight(all_search:"the beatl as musician revolv through the 
antholog" in 3469163), product of:
   1.0 = tf(phraseFreq=1.0)
   48.16181 = idf(all_search: the=3542123 beatl=391 as=749890 musician=11955 
revolv=820 through=88238 the=3542123 antholog=11205)
   0.109375 = fieldNorm(field=all_search, doc=3469163)

dismax QueryParser:
URL:  qf=all_search&pf=all_search&q="The Beatles as musicians : Revolver through the 
Anthology"
final query:  +(all_search:"the beatl as musician revolv through the antholog"~1)~0.01 
(all_search:"the beatl as musician revolv through the antholog"~3)~0.01

score:

7.449651 = (MATCH) sum of:
   3.7248254 = weight(all_search:"the beatl as musician revolv through the 
antholog"~1 in 3469163), product of:
     0.7071068 = queryWeight(all_search:"the beatl as musician revolv through the 
antholog"~1), product of:
       48.16181 = idf(all_search: the=3542123 beatl=391 as=749890 
musician=11955 revolv=820 through=88238 the=3542123 antholog=11205)
       0.014681898 = queryNorm
     5.2676983 = fieldWeight(all_search:"the beatl as musician revolv through the 
antholog" in 3469163), product of:
       1.0 = tf(phraseFreq=1.0)
       48.16181 = idf(all_search: the=3542123 beatl=391 as=749890 
musician=11955 revolv=820 through=88238 the=3542123 antholog=11205)
       0.109375 = fieldNorm(field=all_search, doc=3469163)
   3.7248254 = weight(all_search:"the beatl as musician revolv through the 
antholog"~3 in 3469163), product of:
     0.7071068 = queryWeight(all_search:"the beatl as musician revolv through the 
antholog"~3), product of:
       48.16181 = idf(all_search: the=3542123 beatl=391 as=749890 
musician=11955 revolv=820 through=88238 the=3542123 antholog=11205)
       0.014681898 = queryNorm
     5.2676983 = fieldWeight(all_search:"the beatl as musician revolv through the 
antholog" in 3469163), product of:
       1.0 = tf(phraseFreq=1.0)
       48.16181 = idf(all_search: the=3542123 beatl=391 as=749890 
musician=11955 revolv=820 through=88238 the=3542123 antholog=11205)
       0.109375 = fieldNorm(field=all_search, doc=3469163)


