Re: how scoring computed in wildcard and prefix query

Dan Tuffery Thu, 10 Apr 2014 22:26:22 -0700

The score is computed using Lucene's scoring formula:

https://lucene.apache.org/core/3_6_2/api/all/org/apache/lucene/search/Similarity.html


The *idf* is the inverse document frequency, which gives a higher score to 
terms that are rarer in the index. docFreq is the number of documents that 
contain the term. The term '*Hap*penings' appears in four documents in your 
index (docFreq=4), while 'Happier' appears in only two (docFreq=2), so 
'Happier' has the higher idf and therefore the higher score.
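
For reference, the numbers in your explain output can be reproduced by hand. 
Below is a small Python sketch, assuming the field uses Lucene's classic 
TF-IDF similarity, where idf = 1 + ln(maxDocs / (docFreq + 1)) and 
tf = sqrt(freq); that assumption is consistent with the idf values explain 
reports. The function names are made up for illustration and are not part of 
Lucene or Elasticsearch. The constants are copied from your explain output.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf under the assumed classic TF-IDF similarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(doc_freq, max_docs, boost, query_norm,
                  term_freq=1.0, field_norm=1.0, coord=1.0):
    """Illustrative helper: score = queryWeight * fieldWeight * coord."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm                 # "queryWeight" in explain
    field_weight = math.sqrt(term_freq) * idf * field_norm  # "fieldWeight" in explain
    return query_weight * field_weight * coord


# Values taken from the explain output in the original message.
MAX_DOCS = 6566786
QUERY_NORM = 0.0074249236

happier = classic_score(doc_freq=2, max_docs=MAX_DOCS, boost=3.0,
                        query_norm=QUERY_NORM, coord=0.5)
happenings = classic_score(doc_freq=4, max_docs=MAX_DOCS, boost=3.0,
                           query_norm=QUERY_NORM, coord=0.5)

print("idf(docFreq=2):", classic_idf(2, MAX_DOCS))
print("idf(docFreq=4):", classic_idf(4, MAX_DOCS))
print("score for 'happier':", happier)
print("score for 'happenings':", happenings)
```

Lucene does this arithmetic in 32-bit floats, so the last digits may differ 
slightly from what explain prints. Both documents share the same boost, 
queryNorm, and coord; only docFreq differs, which is why the scores differ.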

Dan

On Friday, April 11, 2014 4:45:58 AM UTC+1, cyrilforce wrote:
>
> Hi,
>
> I have a question about how the score is computed for the following query:
>
> {
>   "from" : 0,
>   "size" : 60,
>   "explain" : true,
>   "track_scores" : true,
>   "query" : {
>         "bool" : {
>                 "should" : [ 
>                   { "prefix": { "DISPLAY_NAME" : { "value" : "*hap*", "rewrite" : "top_terms_10", "boost" : "3.0" }}},
>                   { "prefix": {"PERFORMER" :{ "value" : "*hap*" }}}
>                   
>                 ]
>         }
>    }
>  }
>
> and it produces these results:
>
> 1)
>   "DISPLAY_NAME": "Happier?",
>   "_explanation": {
>     "value": 2.7100196,
>     "description": "product of:",
>     "details": [
>       {
>         "value": 5.420039,
>         "description": "sum of:",
>         "details": [
>           {
>             "value": 5.420039,
>             "description": "sum of:",
>             "details": [
>               {
>                 "value": 5.420039,
>                 "description": "weight(DISPLAY_NAME:happier^3.0 in 32661) [PerFieldSimilarity], result of:",
>                 "details": [
>                   {
>                     "value": 5.420039,
>                     "description": "score(doc=32661,freq=1.0 = termFreq=1.0\n), product of:",
>                     "details": [
>                       {
>                         "value": 0.34746242,
>                         "description": "queryWeight, product of:",
>                         "details": [
>                           {
>                             "value": 3,
>                             "description": "boost"
>                           },
>                           {
>                             "value": 15.598923,
>                             "description": "idf(docFreq=2, maxDocs=6566786)"
>                           },
>                           {
>                             "value": 0.0074249236,
>                             "description": "queryNorm"
>                           }
>                         ]
>                       },
>                       {
>                         "value": 15.598923,
>                         "description": "fieldWeight in 32661, product of:",
>                         "details": [
>                           {
>                             "value": 1,
>                             "description": "tf(freq=1.0), with freq of:",
>                             "details": [
>                               {
>                                 "value": 1,
>                                 "description": "termFreq=1.0"
>                               }
>                             ]
>                           },
>                           {
>                             "value": 15.598923,
>                             "description": "idf(docFreq=2, maxDocs=6566786)"
>                           },
>                           {
>                             "value": 1,
>                             "description": "fieldNorm(doc=32661)"
>                           }
>                         ]
>                       }
>                     ]
>                   }
>                 ]
>               }
>             ]
>           }
>         ]
>       },
>       {
>         "value": 0.5,
>         "description": "coord(1/2)"
>       }
>
>
> 2)
>   "DISPLAY_NAME": "Happenings",
>   "_explanation": {
>     "value": 2.5354335,
>     "description": "product of:",
>     "details": [
>       {
>         "value": 5.070867,
>         "description": "sum of:",
>         "details": [
>           {
>             "value": 5.070867,
>             "description": "sum of:",
>             "details": [
>               {
>                 "value": 5.070867,
>                 "description": "weight(DISPLAY_NAME:happenings^3.0 in 23093) [PerFieldSimilarity], result of:",
>                 "details": [
>                   {
>                     "value": 5.070867,
>                     "description": "score(doc=23093,freq=1.0 = termFreq=1.0\n), product of:",
>                     "details": [
>                       {
>                         "value": 0.33608392,
>                         "description": "queryWeight, product of:",
>                         "details": [
>                           {
>                             "value": 3,
>                             "description": "boost"
>                           },
>                           {
>                             "value": 15.088098,
>                             "description": "idf(docFreq=4, maxDocs=6566786)"
>                           },
>                           {
>                             "value": 0.0074249236,
>                             "description": "queryNorm"
>                           }
>                         ]
>                       },
>                       {
>                         "value": 15.088098,
>                         "description": "fieldWeight in 23093, product of:",
>                         "details": [
>                           {
>                             "value": 1,
>                             "description": "tf(freq=1.0), with freq of:",
>                             "details": [
>                               {
>                                 "value": 1,
>                                 "description": "termFreq=1.0"
>                               }
>                             ]
>                           },
>                           {
>                             "value": 15.088098,
>                             "description": "idf(docFreq=4, maxDocs=6566786)"
>                           },
>                           {
>                             "value": 1,
>                             "description": "fieldNorm(doc=23093)"
>                           }
>                         ]
>                       }
>                     ]
>                   }
>                 ]
>               }
>             ]
>           }
>         ]
>       },
>       {
>         "value": 0.5,
>         "description": "coord(1/2)"
>       }
>     ]
>   }
> }
>
>
> Since the display names of both documents match "*Hap*", I expected them 
> to get the same score, but the scores differ, as shown above. Looking more 
> closely at the explanation, I found that the difference is in the 
> queryWeight->idf and fieldWeight->idf fields:
>
> 1) * "value": 15.598923,*
> *     "description": "idf(docFreq=2, maxDocs=6566786)"*
>
> *2)*    *"value": 15.088098,*
> *         "description": "idf(docFreq=4, maxDocs=6566786)"*
>
>
> I would like to know why the values differ, how they are computed, and 
> what docFreq is. I would also like to know what queryWeight is: it only 
> appears in the score when I use a wildcard or prefix query; otherwise only 
> fieldWeight appears.
>
> I am using *&search_type=dfs_query_then_fetch&preference=_primary* in the 
> query.
>
> Here is the gist with the full result:
> https://gist.github.com/cheehoo/10439849
>
>
> Thanks.
>
>

