I agree with Alessandro: the behavior you're seeing
is _not_ correct at all given your description. So either

1> There's something "interesting" about your configuration
    that doesn't seem important and that you haven't told us,
    although what it could be is a mystery to me too ;)

2> It's matching on something else. Note that the
    phrase has been stemmed, so something in there
    besides "management" might stem to "manag" and/or
    something other than "changes" might stem to "chang"
    and the two of _them_ happen to be next to each
    other. "are managers changing?" for instance. Or
    even something less likely. Perhaps turn on
    highlighting and see if it pops out?

3> you've uncovered a bug. Although I suspect others
    would have reported it and the unit tests would have
    barfed all over the place.
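
To make point 2 concrete, here is a minimal Python sketch. The toy_stem
function is a crude suffix stripper made up for illustration; it is NOT the
stemmer your field type actually uses, and its output for other words will
differ from the real thing. It only shows how different surface forms can
collapse to the same two terms as the parsed query "manag chang".

```python
# Hypothetical toy stemmer: NOT the real Solr/Lucene stemmer, illustration only.
SUFFIXES = ("ement", "ing", "ers", "es", "e", "s")

def toy_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

def analyze(text):
    """Split on whitespace, strip simple punctuation, stem each token."""
    return [toy_stem(tok.strip('?.,!"')) for tok in text.split()]

query_terms = analyze("manage change")
doc_terms = analyze("are managers changing?")

print(query_terms)  # stems of the query phrase
print(doc_terms)    # stems of a document fragment that never says "manage change"
```

If the two query stems show up side by side in the document's stems, an exact
phrase query would match even though the original words are different.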

One other thing you can do: go to the admin/analysis
page and turn on the "verbose" check box. Put
    management is undergoing many changes
in both the query and index boxes. The result (it's
kind of hard to read, I'll admit) will include the position
of each token after all the analysis is done. Phrase
queries (without slop) should only match adjacent
positions. So the question is whether the position info
"looks correct".

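As a rough sketch of what "adjacent positions" means, the Python below checks
a list of (term, position) pairs for an exact phrase match. The positions and
terms are hand-written for illustration, assuming every token keeps its own
position (nothing removed, no gaps); what your analysis chain really produces
is exactly the thing to check on the analysis page.

```python
def exact_phrase_match(doc_tokens, phrase_terms):
    """Return True if phrase_terms occur at consecutive positions (slop 0).

    doc_tokens is a list of (term, position) pairs, as you might copy them
    off the verbose analysis output.
    """
    positions = {}
    for term, pos in doc_tokens:
        positions.setdefault(term, set()).add(pos)
    starts = positions.get(phrase_terms[0], set())
    return any(
        all(start + i in positions.get(term, set())
            for i, term in enumerate(phrase_terms))
        for start in starts
    )

# ASSUMED positions for "management is undergoing many changes".
spread_out = [("manag", 0), ("is", 1), ("undergo", 2), ("many", 3), ("chang", 4)]
# ASSUMED positions for "are managers changing?".
adjacent = [("are", 0), ("manag", 1), ("chang", 2)]

print(exact_phrase_match(spread_out, ["manag", "chang"]))
print(exact_phrase_match(adjacent, ["manag", "chang"]))
```

With positions like the first list, "manag" and "chang" are four apart and a
phrase query without slop should not match. If your real output shows them at
adjacent positions, that would explain the hit.
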
Best,
Erick

On Tue, Jun 16, 2015 at 4:40 AM, Alessandro Benedetti
<benedetti.ale...@gmail.com> wrote:
> According to your debug output, you are using the default Lucene query parser.
> This surprises me, as I would expect that query to match only with distance 0
> between the 2 terms.
>
> Are you sure nothing else in that field matches the phrase query?
>
> From the documentation
>
> "Lucene supports finding words are a within a specific distance away. To do
> a proximity search use the tilde, "~", symbol at the end of a Phrase. For
> example to search for a "apache" and "jakarta" within 10 words of each
> other in a document use the search:
>
> "jakarta apache"~10 "
>
>
> Cheers
>
>
> 2015-06-16 11:33 GMT+01:00 Alistair Young <alistair.yo...@uhi.ac.uk>:
>
>> it's a useful behaviour. I'd just like to understand where it's deciding
>> the document is relevant. debug output is:
>>
>> <lst name="debug">
>>   <str name="rawquerystring">dc.description:"manage change"</str>
>>   <str name="querystring">dc.description:"manage change"</str>
>>   <str name="parsedquery">PhraseQuery(dc.description:"manag chang")</str>
>>   <str name="parsedquery_toString">dc.description:"manag chang"</str>
>>   <lst name="explain">
>>     <str name="tst:test">
>> 1.2008798 = (MATCH) weight(dc.description:"manag chang" in 221)
>> [DefaultSimilarity], result of:
>>   1.2008798 = fieldWeight in 221, product of:
>>     1.0 = tf(freq=1.0), with freq of:
>>       1.0 = phraseFreq=1.0
>>     9.6070385 = idf(), sum of:
>>       4.0365543 = idf(docFreq=101, maxDocs=2125)
>>       5.5704846 = idf(docFreq=21, maxDocs=2125)
>>     0.125 = fieldNorm(doc=221)
>> </str>
>>   </lst>
>>   <str name="QParser">LuceneQParser</str>
>>   <lst name="timing">
>>     <double name="time">41.0</double>
>>     <lst name="prepare">
>>       <double name="time">3.0</double>
>>       <lst name="query">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="facet">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="mlt">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="highlight">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="stats">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="debug">
>>         <double name="time">0.0</double>
>>       </lst>
>>     </lst>
>>     <lst name="process">
>>       <double name="time">35.0</double>
>>       <lst name="query">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="facet">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="mlt">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="highlight">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="stats">
>>         <double name="time">0.0</double>
>>       </lst>
>>       <lst name="debug">
>>         <double name="time">35.0</double>
>>       </lst>
>>     </lst>
>>   </lst>
>> </lst>
>>
>>
>> thanks,
>>
>> Alistair
>>
>> --
>> mov eax,1
>> mov ebx,0
>> int 80h
>>
>>
>>
>>
>> On 16/06/2015 11:26, "Alessandro Benedetti" <benedetti.ale...@gmail.com>
>> wrote:
>>
>> >Can you show us how the query is parsed?
>> >You didn't tell us anything about the query parser you are using.
>> >Enabling debugQuery=true will show you how the query is parsed, and this
>> >will be quite useful for us.
>> >
>> >
>> >Cheers
>> >
>> >2015-06-16 11:22 GMT+01:00 Alistair Young <alistair.yo...@uhi.ac.uk>:
>> >
>> >> Hiya,
>> >>
>> >> I've been looking for documentation that would point to where I could
>> >> modify or explain why 'near neighbours' are returned from a phrase
>> >> search. If I search for:
>> >>
>> >> "manage change"
>> >>
>> >> I get back a document that contains "this will help in your management
>> >> of <lots more words...> changes". It's relevant but I'd like to
>> >> understand why solr is returning it. Is it a combination of fuzzy/slop?
>> >> The distance between the two variations of the two words in the
>> >> document is quite large.
>> >>
>> >> thanks,
>> >>
>> >> Alistair
>> >>
>> >>
>> >
>> >
>> >
>> >--
>> >--------------------------
>> >
>> >Benedetti Alessandro
>> >Visiting card : http://about.me/alessandro_benedetti
>> >
>> >"Tyger, tyger burning bright
>> >In the forests of the night,
>> >What immortal hand or eye
>> >Could frame thy fearful symmetry?"
>> >
>> >William Blake - Songs of Experience -1794 England
>>
>>
>
>
