Hello!

I am using the multi_match query with type cross_fields 
<http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#type-cross-fields>.
It works very well and is exactly what I need. However, in some rare 
circumstances the order of the results doesn't match my expectations. It 
turns out that the first results score much higher than the rest of the 
documents. I had a closer look at the explain output and was surprised by 
the way the scores were calculated:

For the first doc:
{
   "value": 8.252264,
   "description": "fieldWeight in 998806, product of:",
   "details": [
      {
         "value": 1,
         "description": "tf(freq=1.0), with freq of:",
         "details": [
            {
               "value": 1,
               "description": "termFreq=1.0"
            }
         ]
      },
      {
         "value": 8.252264,
         "description": "idf(docFreq=13182, maxDocs=18605118)"
      },
      {
         "value": 1,
         "description": "fieldNorm(doc=998806)"
      }
   ]
}

And for the doc that I expected to be first:
{
   "value": 3.8485851,
   "description": "score(doc=700068,freq=1.0 = termFreq=1.0\n), product of:",
   "details": [
      {
         "value": 0.46578622,
         "description": "queryWeight, product of:",
         "details": [
            {
               "value": 18,
               "description": "boost"
            },
            {
               "value": 8.262557,
               "description": "idf(docFreq=13047, maxDocs=18605118)"
            },
            {
               "value": 0.0031318406,
               "description": "queryNorm"
            }
         ]
      },
      {
         "value": 8.262557,
         "description": "fieldWeight in 700068, product of:",
         "details": [
            {
               "value": 1,
               "description": "tf(freq=1.0), with freq of:",
               "details": [
                  {
                     "value": 1,
                     "description": "termFreq=1.0"
                  }
               ]
            },
            {
               "value": 8.262557,
               "description": "idf(docFreq=13047, maxDocs=18605118)"
            },
            {
               "value": 1,
               "description": "fieldNorm(doc=700068)"
            }
         ]
      }
   ]
}

You can see that the queryWeight factor is missing from the calculation for 
the first doc, which leads to a much higher total score. I am no expert 
here, but this looks like a bug to me. Or did I misunderstand something?
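
To make the difference concrete, here is a minimal Python sketch that recomputes both scores from the explain values above. It assumes the idf follows Lucene's classic TF-IDF formula, 1 + ln(maxDocs / (docFreq + 1)), which reproduces the idf values shown. The function and variable names are my own for illustration and are not Elasticsearch or Lucene APIs.

```python
import math

MAX_DOCS = 18605118  # maxDocs from the explain output


def classic_idf(doc_freq, max_docs):
    # Assumption: Lucene classic TF-IDF idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(tf, idf, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as listed in the explain details
    return tf * idf * field_norm


def query_weight(boost, idf, query_norm):
    # queryWeight = boost * idf * queryNorm, as listed in the explain details
    return boost * idf * query_norm


# Doc 998806: the explain output contains only the fieldWeight factor.
idf_a = classic_idf(13182, MAX_DOCS)
score_a = field_weight(1.0, idf_a, 1.0)

# Doc 700068: the explain output contains queryWeight * fieldWeight.
idf_b = classic_idf(13047, MAX_DOCS)
qw_b = query_weight(18, idf_b, 0.0031318406)
score_b = qw_b * field_weight(1.0, idf_b, 1.0)

print("doc 998806: idf=%.6f score=%.6f" % (idf_a, score_a))
print("doc 700068: idf=%.6f queryWeight=%.6f score=%.6f" % (idf_b, qw_b, score_b))
```

The two fieldWeight values are nearly identical, so the gap between the scores comes entirely from the queryWeight factor (about 0.47) that is applied to the second doc only.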

You can find the query and the result list here: 
https://gist.github.com/christophlingg/0014ba3d6d334a27cccd

Thanks for your help!
