Sorry about the previous message, I had some formatting issues. Below is the
actual message!

oleg_gnatovskiy wrote:
> 
> Hello everyone.
> 
> I've run into a weird problem with Solr's ranking engine. In a nutshell,
> the problem involves certain results getting EXTREMELY high rank scores.
> Here is an example:
> 
> locRvwText:"Pizza Pizza"^10 OR locName:"Pizza Pizza"^30
> 
> The way I understand it, the locName part of the query should be
> boosted 3x more than the locRvwText part.
> However, when running this query the first result is:
> 
> <float name="score">10.8226</float>
> <str name="locName">Johnnie's New York Pizzeria</str>
> <arr name="locRvwText">
> <str>
> Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
> Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
> Pizza... Pizza... Pizza... Pizza... Pizza...
> </str>
> </arr>
> <lst name="explain">
> 
>       <str name="id=157789,internal_docid=3792465">
> 
> 10.8226 = (MATCH) product of:
>   21.6452 = (MATCH) sum of:
>     21.6452 = weight(locRvwText:"pizza pizza"^10.0 in 3792465), product of:
>       0.3354544 = queryWeight(locRvwText:"pizza pizza"^10.0), product of:
>         10.0 = boost
>         14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
>         0.0023249863 = queryNorm
>       64.52502 = fieldWeight(locRvwText:"pizza pizza" in 3792465), product of:
>         4.472136 = tf(phraseFreq=20.0)
>         14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
>         1.0 = fieldNorm(field=locRvwText, doc=3792465)
>   0.5 = coord(1/2)
> </str>
> </lst>
> 
> 
> How come the phrase frequency for locRvwText comes back as 20? The field
> locRvwText is defined in the following way:
> 
> <field name="locRvwText" type="text" index="false" stored="true"
> required="false" multiValued="true"  omitNorms="true"/>
> 
> And my text fields are defined in the following way:
> 
> <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       <!-- in this example, we will only use synonyms at query time -->
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>        
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt"/>
>         <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt"/>
>         <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
>     </fieldtype>
> 
> Forgive me if I am wrong, but shouldn't the
> RemoveDuplicatesTokenFilterFactory have the string "Pizza... Pizza...
> Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
> Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
> Pizza... Pizza... Pizza..." count as simply one Pizza?
> I'd appreciate any help I can get! 
> 
> Thanks!
> 
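
For reference, the numbers in the explain output quoted above can be
reproduced by hand. The sketch below is only an illustration: the variable
names are made up, the values are copied from the explain output, and it
assumes the classic Lucene similarity where tf is the square root of the
phrase frequency (which agrees with the reported tf of 4.472136 for
phraseFreq=20.0).

```python
import math

# Values copied from the explain output quoted above.
boost = 10.0
idf = 14.428232            # idf reported for the phrase "pizza pizza"
query_norm = 0.0023249863
field_norm = 1.0           # as reported by fieldNorm in the explain output
phrase_freq = 20.0
coord = 1 / 2              # only one of the two OR clauses matched

# Assumption: tf = sqrt(freq), as in the classic Lucene similarity.
tf = math.sqrt(phrase_freq)

query_weight = boost * idf * query_norm   # queryWeight line
field_weight = tf * idf * field_norm      # fieldWeight line
score = query_weight * field_weight * coord

print(f"tf={tf:.6f}")
print(f"queryWeight={query_weight:.7f}")
print(f"fieldWeight={field_weight:.5f}")
print(f"score={score:.4f}")
```

Under these assumptions, the high score comes almost entirely from
fieldWeight, where the idf of the phrase is multiplied by the square root of
the phrase frequency; the ^10 boost only enters through queryWeight.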

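A possible reading of phraseFreq=20, sketched as a toy model rather than
actual Lucene code: as far as I know, RemoveDuplicatesTokenFilter only drops
a token when it has the same text AND the same position as an earlier token
(for example a stemmed form stacked on top of the original), so repeated
words at different positions are all kept. The helper names below are
hypothetical, and the model assumes the analysis chain turns each "Pizza..."
into a single "pizza" token at its own position.

```python
def remove_duplicates(tokens):
    """Toy model: tokens is a list of (term, position_increment) pairs.

    A token is dropped only if the same term was already seen at the
    same position (position_increment == 0 means "same position").
    """
    kept = []
    seen_at_position = set()
    for term, pos_inc in tokens:
        if pos_inc > 0:
            # Moved to a new position, so start tracking from scratch.
            seen_at_position = set()
        if term in seen_at_position:
            continue
        seen_at_position.add(term)
        kept.append((term, pos_inc))
    return kept


def phrase_freq(terms, phrase):
    """Count exact (no slop) occurrences of phrase, overlaps included."""
    n = len(phrase)
    return sum(1 for i in range(len(terms) - n + 1) if terms[i:i + n] == phrase)


# The review text quoted above contains "Pizza..." 21 times.
tokens = [("pizza", 1)] * 21
kept = remove_duplicates(tokens)
terms = [term for term, _ in kept]
freq = phrase_freq(terms, ["pizza", "pizza"])

# A true duplicate: same term stacked at the same position.
stacked = remove_duplicates([("pizza", 1), ("pizza", 0)])

print(len(kept), freq, stacked)
```

In this model all 21 tokens survive the filter, and 21 consecutive "pizza"
tokens contain 20 overlapping "pizza pizza" pairs, which would be consistent
with the phraseFreq of 20 in the explain output.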
-- 
View this message in context: 
http://www.nabble.com/Question-regarding-Solr-ranking-tp15719752p15719834.html
Sent from the Solr - User mailing list archive at Nabble.com.
