I agree with Alessandro: the behavior you're describing is _not_ correct at all, given your description. So either

1> There's something "interesting" about your configuration that doesn't seem important and that you haven't told us, although what it could be is a mystery to me too ;)

2> It's matching on something else. Note that the phrase has been stemmed, so something in there besides "management" might stem to "manag" and/or something other than "changes" might stem to "chang", and the two of _them_ happen to be next to each other. "are managers changing?" for instance. Or even something less likely. Perhaps turn on highlighting and see if it pops out?

3> You've uncovered a bug, although I suspect others would have reported it and the unit tests would have barfed all over the place.

One other thing you can do: go to the admin/analysis page and turn on the "verbose" check box. Put

management is undergoing many changes

in both the query and index boxes. The result (it's kind of hard to read, I'll admit) will include the position of each token after all the analysis is done. Phrase queries (without slop) should only be matching adjacent positions. So the question is whether the position info "looks correct"....

Best,
Erick

On Tue, Jun 16, 2015 at 4:40 AM, Alessandro Benedetti <benedetti.ale...@gmail.com> wrote:
> According to your debug you are using a default Lucene Query Parser.
> This surprise me as i would expect with that query a match with distance 0
> between the 2 terms .
>
> Are you sure nothing else is that field that matches the phrase query ?
>
> From the documentation
>
> "Lucene supports finding words are a within a specific distance away. To do
> a proximity search use the tilde, "~", symbol at the end of a Phrase. For
> example to search for a "apache" and "jakarta" within 10 words of each
> other in a document use the search:
>
> "jakarta apache"~10 "
>
>
> Cheers
>
>
> 2015-06-16 11:33 GMT+01:00 Alistair Young <alistair.yo...@uhi.ac.uk>:
>
>> it's a useful behaviour. I'd just like to understand where it's deciding
>> the document is relevant.
>> debug output is:
>>
>> <lst name="debug">
>>   <str name="rawquerystring">dc.description:"manage change"</str>
>>   <str name="querystring">dc.description:"manage change"</str>
>>   <str name="parsedquery">PhraseQuery(dc.description:"manag chang")</str>
>>   <str name="parsedquery_toString">dc.description:"manag chang"</str>
>>   <lst name="explain">
>>     <str name="tst:test">
>> 1.2008798 = (MATCH) weight(dc.description:"manag chang" in 221)
>> [DefaultSimilarity], result of:
>>   1.2008798 = fieldWeight in 221, product of:
>>     1.0 = tf(freq=1.0), with freq of:
>>       1.0 = phraseFreq=1.0
>>     9.6070385 = idf(), sum of:
>>       4.0365543 = idf(docFreq=101, maxDocs=2125)
>>       5.5704846 = idf(docFreq=21, maxDocs=2125)
>>     0.125 = fieldNorm(doc=221)
>>     </str>
>>   </lst>
>>   <str name="QParser">LuceneQParser</str>
>>   <lst name="timing">
>>     <double name="time">41.0</double>
>>     <lst name="prepare">
>>       <double name="time">3.0</double>
>>       <lst name="query">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="facet">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="mlt">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="highlight">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="stats">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="debug">
>>         <double name="time">0.0</double>
>>       </lst>
>>     </lst>
>>     <lst name="process">
>>       <double name="time">35.0</double>
>>       <lst name="query">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="facet">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="mlt">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="highlight">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="stats">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="debug">
>>         <double name="time">35.0</double>
>>       </lst>
>>     </lst>
>>   </lst>
>> </lst>
>>
>>
>> thanks,
>>
>> Alistair
>>
>> --
>> mov eax,1
>> mov ebx,0
>> int 80h
>>
>>
>> On 16/06/2015 11:26, "Alessandro Benedetti" <benedetti.ale...@gmail.com>
>> wrote:
>>
>> >Can you show us how the query is parsed ?
>> >You didn't tell us nothing about the query parser you are using.
>> >Enable the debugQuery=true will show you how the query is parsed and this
>> >will be quite useful for us.
>> >
>> >
>> >Cheers
>> >
>> >2015-06-16 11:22 GMT+01:00 Alistair Young <alistair.yo...@uhi.ac.uk>:
>> >
>> >> Hiya,
>> >>
>> >> I've been looking for documentation that would point to where I could
>> >> modify or explain why 'near neighbours' are returned from a phrase
>> >> search.
>> >> If I search for:
>> >>
>> >> "manage change"
>> >>
>> >> I get back a document that contains "this will help in your management
>> >> of <lots more words...> changes". It's relevant but I'd like to understand
>> >> why solr is returning it. Is it a combination of fuzzy/slop? The distance
>> >> between the two variations of the two words in the document is quite
>> >> large.
>> >>
>> >> thanks,
>> >>
>> >> Alistair
>> >>
>> >> --
>> >> mov eax,1
>> >> mov ebx,0
>> >> int 80h
>> >>
>> >
>> >
>> >
>> >--
>> >--------------------------
>> >
>> >Benedetti Alessandro
>> >Visiting card : http://about.me/alessandro_benedetti
>> >
>> >"Tyger, tyger burning bright
>> >In the forests of the night,
>> >What immortal hand or eye
>> >Could frame thy fearful symmetry?"
>> >
>> >William Blake - Songs of Experience -1794 England
>>
>
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
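
P.S. To make the stemming and adjacent-positions point from the top of this reply concrete, here is a rough Python sketch. It is not Solr or Lucene code: toy_stem is a made-up suffix stripper standing in for whatever stemmer the field really uses, and analyze and phrase_positions are hypothetical helper names. It only illustrates that a phrase query with no slop should match stemmed tokens at consecutive positions and nothing else.

```python
import re


def toy_stem(token):
    """Crude suffix stripper; an assumption for illustration, NOT a real stemmer."""
    for suffix in ("ement", "ers", "ing", "es", "e", "s"):
        # Only strip when at least 4 characters of stem would remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Lowercase, split on non-letters, stem; list index is the token position."""
    return [toy_stem(t) for t in re.findall(r"[a-z]+", text.lower())]


def phrase_positions(query, doc):
    """Start positions where the stemmed query tokens sit at adjacent positions."""
    q = analyze(query)
    d = analyze(doc)
    if not q:
        return []
    return [i for i in range(len(d) - len(q) + 1) if d[i:i + len(q)] == q]


# Both terms stem to the query's stems and are adjacent, so this matches.
print(phrase_positions("manage change", "are managers changing?"))

# Same stems, but separated by other tokens, so an exact phrase should not match.
print(phrase_positions("manage change", "management is undergoing many changes"))
```

If the real analysis page shows "manag" and "chang" at non-adjacent positions for the document in question, the match must be coming from somewhere else in the field.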