incomplete proximity boost for fielded searches

2014-08-28 Thread Burgmans, Tom
Consider query:
http://10.208.152.231:8080/solr/wkustaldocsphc_A/search?q=title:(Michigan 
Corporate Income Tax)&debugQuery=true&pf=title&ps=255&defType=edismax

The intention is to perform a search in field title and to apply a proximity 
boost within a window of 255 words. If I look at the debug information, I see:

<str name="parsedquery">
BoostedQuery(boost(+((title:michigan title:corporate title:income title:tax)~4) 
(title:"corporate income tax"~255)~1.0))
</str>

Note that the first search term (michigan) is missing in the proximity boost 
clause. I can't believe this is intended behavior. 
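
A quick way to confirm the observation above is to compare the query terms with the words in the phrase-boost clause of the parsed query. The following is only a minimal sketch: the helper name missing_from_phrase_boost and the regular expression are illustrative assumptions, not part of Solr, and the pattern only covers the simple clause shape shown in the debug output.

```python
import re

# Parsed query as reported in the debug output above. The quotes around
# the phrase are treated as optional by the pattern below.
PARSED = ("BoostedQuery(boost(+((title:michigan title:corporate title:income "
          "title:tax)~4) (title:corporate income tax~255)~1.0))")

# Matches a phrase clause with a slop value, e.g. (title:corporate income tax~255).
# Hypothetical pattern, written for this one output shape only.
PHRASE = re.compile(r'\((\w+):"?([^()~"]+)"?~(\d+)\)')

def missing_from_phrase_boost(parsed, terms):
    """Return the query terms that do not occur in any phrase-boost clause."""
    boosted = set()
    for _field, words, _slop in PHRASE.findall(parsed):
        boosted.update(words.lower().split())
    return [t for t in terms if t.lower() not in boosted]

missing = missing_from_phrase_boost(
    PARSED, ["Michigan", "Corporate", "Income", "Tax"])
print(missing)
```

For the parsed query quoted in this message, the only term reported as missing is Michigan.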

Why does edismax split the query into (title:Michigan) and (Corporate Income 
Tax) when determining what to use for the proximity boost?

Thanks, Tom


Re: incomplete proximity boost for fielded searches

2014-08-28 Thread Erick Erickson
Feels like a JIRA to me.

This _does_ seem weird.

If I omit the field qualification, i.e. my query is:
q=Michigan Corporate Income Tax&debugQuery=true&pf=title&ps=255&defType=edismax
it works fine.

I can get the results I think you expect by omitting the field qualifier
and defining my default search field as:
q=Michigan Corporate Income Tax&debugQuery=true&pf=title&ps=255&defType=edismax&df=title
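
For comparing the two variants side by side, the requests can be built with properly encoded parameters. This is a minimal sketch using only the Python standard library; the helper name build_url is hypothetical, and the host, core and handler are simply the ones quoted in this thread.

```python
from urllib.parse import urlencode

# Endpoint taken from the thread; adjust for your own installation.
BASE = "http://10.208.152.231:8080/solr/wkustaldocsphc_A/search"

def build_url(q, **extra):
    """Build an edismax request with the pf/ps settings used in the thread."""
    params = {
        "q": q,
        "debugQuery": "true",
        "pf": "title",        # phrase-boost field
        "ps": "255",          # phrase slop for the boost
        "defType": "edismax",
    }
    params.update(extra)      # e.g. df="title" for the unfielded variant
    return BASE + "?" + urlencode(params)

# Fielded query from the original post.
fielded = build_url("title:(Michigan Corporate Income Tax)")
# Variant without the field qualifier, with title as the default field.
unfielded = build_url("Michigan Corporate Income Tax", df="title")

print(fielded)
print(unfielded)
```

Fetching both URLs and comparing the parsedquery entries in the debug output should show whether all four terms reach the phrase-boost clause in each case.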

But the fact that you get the results feels like a bug. Or at least
something that I don't understand.

Feels like a bug to me, do others agree?

Can you raise a JIRA on this?

Best,
Erick


On Thu, Aug 28, 2014 at 7:41 AM, Burgmans, Tom 
<tom.burgm...@wolterskluwer.com> wrote:

> Consider query:
> http://10.208.152.231:8080/solr/wkustaldocsphc_A/search?q=title:(Michigan
> Corporate Income Tax)&debugQuery=true&pf=title&ps=255&defType=edismax
>
> The intention is to perform a search in field title and to apply a
> proximity boost within a window of 255 words. If I look at the debug
> information, I see:
>
> <str name="parsedquery">
> BoostedQuery(boost(+((title:michigan title:corporate title:income
> title:tax)~4) (title:"corporate income tax"~255)~1.0))
> </str>
>
> Note that the first search term (michigan) is missing in the proximity
> boost clause. I can't believe this is intended behavior.
>
> Why is edismax splitting (title:Michigan) and (Corporate Income Tax)
> while determining what to use for proximity boost?
>
> Thanks, Tom