The original query is fine, and has the boost as expected:

((+language:eng +(
  CutoffQueryWrapper((+value_0:bunker~0.8332333 +value_0:hill)^0.6666667)
  CutoffQueryWrapper((+othervalue_0:bunker~0.8332333 +value_0:hill)^0.5714286)
  CutoffQueryWrapper((+value_0:bunker~0.8332333 +othervalue_0:hill)^0.5714286)
  CutoffQueryWrapper((+value_1:bunker~0.8332333 +value_0:hill)^0.6666667)
  CutoffQueryWrapper((+othervalue_1:bunker~0.8332333 +value_0:hill)^0.5714286)
  CutoffQueryWrapper((+value_1:bunker~0.8332333 +othervalue_0:hill)^0.5714286)
  ...
  CutoffQueryWrapper((+othervalue_7:bunker~0.8332333 +value_7:hillmonument~0.8332333)^0.85714287)
  CutoffQueryWrapper((+value_7:bunker~0.8332333 +othervalue_7:hillmonument~0.8332333)^0.85714287)))^3.0)
(
  CutoffQueryWrapper((+value_0:bunker~0.8332333 +value_0:hill)^0.6666667)
  CutoffQueryWrapper((+othervalue_0:bunker~0.8332333 +value_0:hill)^0.5714286)
  CutoffQueryWrapper((+value_0:bunker~0.8332333 +othervalue_0:hill)^0.5714286)
  CutoffQueryWrapper((+value_1:bunker~0.8332333 +value_0:hill)^0.6666667)
  ...
))

The rewritten query is odd. Here's a sample:

((+language:eng +(
  CutoffQueryWrapper((+() +value_0:hill)^0.6666667)
  CutoffQueryWrapper((+() +value_0:hill)^0.5714286)
  CutoffQueryWrapper((+() +othervalue_0:hill)^0.5714286)
  CutoffQueryWrapper((+() +value_0:hill)^0.6666667)
  CutoffQueryWrapper((+() +value_0:hill)^0.5714286)
  CutoffQueryWrapper((+() +othervalue_0:hill)^0.5714286)
  CutoffQueryWrapper((+(value_2:bunker value_2:burker^5.997396E-4) +value_0:hill)^0.6666667)
  CutoffQueryWrapper((+() +value_0:hill)^0.5714286)
  ...
  CutoffQueryWrapper((+() +(()^0.5555556))^0.85714287)))^3.0)
(
  CutoffQueryWrapper((+() +value_0:hill)^0.6666667)
  CutoffQueryWrapper((+() +value_0:hill)^0.5714286)
  CutoffQueryWrapper((+() +othervalue_0:hill)^0.5714286)
  CutoffQueryWrapper((+() +value_0:hill)^0.6666667)
  CutoffQueryWrapper((+() +value_0:hill)^0.5714286)
  CutoffQueryWrapper((+() +othervalue_0:hill)^0.5714286)
  CutoffQueryWrapper((+(value_2:bunker value_2:burker^5.997396E-4) +value_0:hill)^0.6666667)
  CutoffQueryWrapper((+() +value_0:hill)^0.5714286)
  ...
  CutoffQueryWrapper((+() +(()^0.5555556))^0.85714287)
  CutoffQueryWrapper(+() +(()^0.6666667))
  CutoffQueryWrapper((+() +(()^0.6666667))^0.85714287)
  CutoffQueryWrapper((+() +(()^0.5555556))^0.85714287)
)

As you can see, there are a lot of repeats, a lot of blank matches, but the original boost *is* still there. I really can't interpret this any further - the many blank and repeated matches seem wrong to me, but the scorer explanation seems even more wrong. Any ideas?

Karl

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of ext Yonik Seeley
Sent: Thursday, January 20, 2011 3:34 PM
To: dev@lucene.apache.org
Subject: Re: Odd Boolean scoring behavior?

On Thu, Jan 20, 2011 at 3:06 PM, <karl.wri...@nokia.com> wrote:
> I tried commenting out the final OR term, and that excluded all records that
> were out-of-language as expected. It's just the boost that doesn't seem to
> work.

I see a lot of unexpected zeros - queryNorm has factors of idf and the boost in it - the fact that it's 0 suggests that you used a 0 boost. Why don't you do a toString() on your query and see if it's what you expect.

-Yonik
http://www.lucidimagination.com

> Exploring the explain is challenging because of its size, but there are NO
> boosts recorded of the size I am using (10.0). Here's the basic structure of
> the first result.
>
> 0.0 = (MATCH) sum of:
> 0.0 = (MATCH) sum of:
> 0.0 = (MATCH) weight(language:eng in 52867945), product of:
> 0.0 = queryWeight(language:eng), product of:
> 1.0 = idf(docFreq=23889670, maxDocs=59327671)
> 0.0 = queryNorm
> 1.0 = (MATCH) fieldWeight(language:eng in 52867945), product of:
> 1.0 = tf(termFreq(language:eng)=0)
> 1.0 = idf(docFreq=23889670, maxDocs=59327671)
> 1.0 = fieldNorm(field=language, doc=52867945)
> 0.0 = (MATCH) product of:
> 0.0 = (MATCH) sum of:
> 0.0 = (MATCH) CutoffQueryWrapper((+(othervalue_5:banker^5.997396E-4
> othervalue_5:bucker^5.997396E-4 othervalue_5:bunder^5.997396E-4
> othervalue_5:bunker othervalue_5:bunner^5.997396E-4
> othervalue_5:burker^5.997396E-4) +value_5:hill)^0.5714286), product of:
> 1.0 = boost
> 0.0 = queryNorm
> 0.0 = (MATCH) CutoffQueryWrapper((+(value_5:banker^5.997396E-4
> value_5:baunker^5.997396E-4 value_5:benker^5.997396E-4
> value_5:beunker^5.997396E-4 value_5:binker^5.997396E-4
> value_5:bonker^5.997396E-4 value_5:brunker^5.997396E-4
> value_5:bucker^5.997396E-4 value_5:bueker^5.997396E-4
> value_5:bunder^5.997396E-4 value_5:bunger^5.997396E-4
> value_5:bunkek^5.997396E-4 value_5:bunken^5.997396E-4 value_5:bunker
> value_5:bunkers^5.997396E-4 value_5:bunkeru^5.997396E-4
> value_5:bunner^5.997396E-4 value_5:bunter^5.997396E-4
> value_5:bunzer^5.997396E-4 value_5:burker^5.997396E-4
> value_5:busker^5.997396E-4) +othervalue_5:hill)^0.5714286), product of:
> 1.0 = boost
> 0.0 = queryNorm
>
> ...
>
> 0.0069078947 = coord(21/3040)
> 0.0 = (MATCH) product of:
> 0.0 = (MATCH) sum of:
> 0.0 = (MATCH) CutoffQueryWrapper((+(othervalue_5:banker^5.997396E-4
> othervalue_5:bucker^5.997396E-4 othervalue_5:bunder^5.997396E-4
> othervalue_5:bunker othervalue_5:bunner^5.997396E-4
> othervalue_5:burker^5.997396E-4) +value_5:hill)^0.5714286), product of:
> 1.0 = boost
> 0.0 = queryNorm
> 0.0 = (MATCH) CutoffQueryWrapper((+(value_5:banker^5.997396E-4
> value_5:baunker^5.997396E-4 value_5:benker^5.997396E-4
> value_5:beunker^5.997396E-4 value_5:binker^5.997396E-4
> value_5:bonker^5.997396E-4 value_5:brunker^5.997396E-4
> value_5:bucker^5.997396E-4 value_5:bueker^5.997396E-4
> value_5:bunder^5.997396E-4 value_5:bunger^5.997396E-4
> value_5:bunkek^5.997396E-4 value_5:bunken^5.997396E-4 value_5:bunker
> value_5:bunkers^5.997396E-4 value_5:bunkeru^5.997396E-4
> value_5:bunner^5.997396E-4 value_5:bunter^5.997396E-4
> value_5:bunzer^5.997396E-4 value_5:burker^5.997396E-4
> value_5:busker^5.997396E-4) +othervalue_5:hill)^0.5714286), product of:
> 1.0 = boost
> 0.0 = queryNorm
>
> ...
>
> 0.0069078947 = coord(21/3040)
>
> It looks like the PRODUCT_OF and SUM_OF, which represent the Boolean logic,
> do not actually apply boost?
>
> Karl
>
>
>
> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of ext Yonik
> Seeley
> Sent: Thursday, January 20, 2011 2:36 PM
> To: dev@lucene.apache.org
> Subject: Re: Odd Boolean scoring behavior?
>
> On Thu, Jan 20, 2011 at 2:17 PM, <karl.wri...@nokia.com> wrote:
>> The problem is that the LANGUAGE_BOOST boost doesn't seem to be having any
>> effect. I can change it all over the place, and nothing much changes.
>
> Then perhaps your language term doesn't actually match anything in the
> index? (i.e. how is it analyzed?)
> Next step would be to get score explanations (just add debugQuery=true
> if you're using Solr, or see IndexSearcher.explain() if not).
>
> -Yonik
> http://www.lucidimagination.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
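
A note on the queryNorm remark in the reply above: the sketch below shows how idf and boost feed into queryNorm, assuming the classic DefaultSimilarity definition (queryNorm = 1/sqrt(sumOfSquaredWeights), with each term clause contributing (idf * boost)^2 and the enclosing Boolean query multiplying the sum by its own boost squared). The class name, method names, and sample values are hypothetical; this is not the code from this thread and it does not use Lucene itself.

```java
public class QueryNormSketch {

    // Assumed classic formula: 1 / sqrt(sumOfSquaredWeights).
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    // Each row of clauses is {idf, boost} for one term clause (hypothetical layout).
    // The sum of (idf * boost)^2 is scaled by the squared boost of the enclosing query.
    static float sumOfSquaredWeights(float[][] clauses, float topBoost) {
        float sum = 0f;
        for (float[] c : clauses) {
            float w = c[0] * c[1];
            sum += w * w;
        }
        return sum * topBoost * topBoost;
    }

    public static void main(String[] args) {
        // Illustrative values only: two clauses with idf 1.0 and different boosts.
        float[][] clauses = { {1.0f, 1.0f}, {1.0f, 0.5714286f} };
        float sum = sumOfSquaredWeights(clauses, 3.0f);
        System.out.println("sumOfSquaredWeights = " + sum);
        System.out.println("queryNorm = " + queryNorm(sum));
        // Under this formula the norm shrinks as the sum grows,
        // and it is exactly 0.0 when the float sum is infinite.
        System.out.println("queryNorm(+Infinity) = " + queryNorm(Float.POSITIVE_INFINITY));
    }
}
```

Under these assumptions, comparing the printed sum against the boosts shown by toString() is one way to see which factor drives queryNorm toward the value reported by explain().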