Hi Mikhail,

Thanks. I looked at the explain output, and this is what I see for the two documents in question: they have identical scores even though document 2 has a shorter productName field. I do not see any lengthNorm-related information in the explain output.
Also, I am not exactly clear on what I need to look at in the API.

*Search Query*: q=iphone+4s+16gb&qf=productName&mm=1&pf=productName&ps=1&pf2=productName&pf3=productName&stopwords=true&lowercaseOperators=true

*productName Details about Apple iPhone 4s 16GB Smartphone AT&T Factory Unlocked*

- *100%* 10.649221 sum of the following:
  - *10.58%* 1.1270299 sum of the following:
    - *2.1%* 0.22383358 productName:iphon
    - *3.47%* 0.36922288 productName:"4 s"
    - *5.01%* 0.53397346 productName:"16 gb"
  - *30.81%* 3.2814684 productName:"iphon 4 s 16 gb"~1
  - *27.79%* 2.959255 sum of the following:
    - *10.97%* 1.1680154 productName:"iphon 4 s"~1
    - *16.82%* 1.7912396 productName:"4 s 16 gb"~1
  - *30.81%* 3.2814684 productName:"iphon 4 s 16 gb"~1

*productName Apple iPhone 4S 16GB for Net10, No Contract, White*

- *100%* 10.649221 sum of the following:
  - *10.58%* 1.1270299 sum of the following:
    - *2.1%* 0.22383358 productName:iphon
    - *3.47%* 0.36922288 productName:"4 s"
    - *5.01%* 0.53397346 productName:"16 gb"
  - *30.81%* 3.2814684 productName:"iphon 4 s 16 gb"~1
  - *27.79%* 2.959255 sum of the following:
    - *10.97%* 1.1680154 productName:"iphon 4 s"~1
    - *16.82%* 1.7912396 productName:"4 s 16 gb"~1
  - *30.81%* 3.2814684 productName:"iphon 4 s 16 gb"~1

On Mon, Dec 8, 2014 at 10:25 AM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:

> It's worth looking into <explain> to check the particular scoring values. But
> the most likely suspect is the reduced precision when float norms are stored
> in byte vals. See the javadoc for DefaultSimilarity.encodeNormValue(float)
>
> On Mon, Dec 8, 2014 at 5:49 PM, S.L <simpleliving...@gmail.com> wrote:
>
> > I have two documents doc1 and doc2 and each one of those has a field called
> > phoneName.
> >
> > doc1:phoneName:"Details about Apple iPhone 4s - 16GB - White (Verizon)
> > Smartphone Factory Unlocked"
> >
> > doc2:phoneName:"Apple iPhone 4S 16GB for Net10, No Contract, White"
> >
> > Here if I search for
> >
> > q=iphone+4s+16gb&qf=phoneName&mm=1&pf=phoneName&ps=1&pf2=phoneName&pf3=phoneName&stopwords=true&lowercaseOperators=true
> >
> > Doc1 and Doc2 both have the same score, but since the field
> > phoneName in doc2 is shorter I would expect it to have a higher
> > score. Both have an identical score of 9.961212.
> >
> > The phoneName field is defined as follows. As we can see, nowhere am I
> > specifying omitNorms=true, yet the behavior seems to be that the length
> > norm is not functioning at all. Can someone let me know what the issue is
> > here?
> >
> > <field name="phoneName" type="text_en_splitting" indexed="true"
> >        stored="true" required="true" />
> > <fieldType name="text_en_splitting" class="solr.TextField"
> >            positionIncrementGap="100" autoGeneratePhraseQueries="true">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory" />
> >     <!-- in this example, we will only use synonyms at query time
> >     <filter class="solr.SynonymFilterFactory"
> >             synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
> >     -->
> >     <!-- Case insensitive stop word removal. add enablePositionIncrements=true
> >          in both the index and query analyzers to leave a 'gap' for more accurate
> >          phrase queries.
> >     -->
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="lang/stopwords_en.txt" enablePositionIncrements="true" />
> >     <filter class="solr.WordDelimiterFilterFactory"
> >             generateWordParts="1" generateNumberParts="1"
> >             catenateWords="1" catenateNumbers="1" catenateAll="0"
> >             splitOnCaseChange="1" />
> >     <filter class="solr.LowerCaseFilterFactory" />
> >     <filter class="solr.KeywordMarkerFilterFactory"
> >             protected="protwords.txt" />
> >     <filter class="solr.PorterStemFilterFactory" />
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory" />
> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >             ignoreCase="true" expand="true" />
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="lang/stopwords_en.txt" enablePositionIncrements="true" />
> >     <filter class="solr.WordDelimiterFilterFactory"
> >             generateWordParts="1" generateNumberParts="1"
> >             catenateWords="0" catenateNumbers="0" catenateAll="0"
> >             splitOnCaseChange="1" />
> >     <filter class="solr.LowerCaseFilterFactory" />
> >     <filter class="solr.KeywordMarkerFilterFactory"
> >             protected="protwords.txt" />
> >     <filter class="solr.PorterStemFilterFactory" />
> >   </analyzer>
> > </fieldType>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
> <mkhlud...@griddynamics.com>
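
For reference, the precision loss mentioned in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the length norm is 1/sqrt(number of terms) and that the one-byte encoding follows the bit layout of Lucene's SmallFloat.floatToByte315, which DefaultSimilarity.encodeNormValue is assumed to use in Lucene 4.x. The function names and the term counts are hypothetical and are not taken from the index discussed in this thread.

```python
import math
import struct


def float_to_byte315(f: float) -> int:
    """Assumed Python port of the bit layout of Lucene's SmallFloat.floatToByte315."""
    # Raw IEEE-754 float32 bits, read as a signed 32-bit integer.
    bits = struct.unpack("<i", struct.pack("<f", f))[0]
    smallfloat = bits >> (24 - 3)
    zero_point = (63 - 15) << 3
    if smallfloat <= zero_point:
        # Underflow: zero or negative maps to 0, tiny positive values map to 1.
        return 0 if bits <= 0 else 1
    if smallfloat >= zero_point + 0x100:
        # Overflow: clamp to the largest byte value.
        return 0xFF
    return smallfloat - zero_point


def byte315_to_float(b: int) -> float:
    """Decode the single byte back into the float that scoring would see."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack("<f", struct.pack("<i", bits))[0]


def stored_length_norm(num_terms: int) -> float:
    """Length norm after a round trip through the one-byte encoding (hypothetical helper)."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


# Hypothetical term counts: exact norm versus the value that survives encoding.
for n in range(7, 12):
    print(n, round(1.0 / math.sqrt(n), 4), stored_length_norm(n))
```

In this sketch, fields of 8, 9, and 10 terms all decode to 0.3125, while 7 terms gives 0.375 and 11 terms gives 0.25. If the real encoding behaves this way, two product names whose analyzed lengths differ by only a term or two could end up with the same stored norm, and therefore the same score.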