Re: Relevancy sorting of result returned

chee hoo lum Wed, 02 Apr 2014 11:02:21 -0700

Hi Ivan,

No, I didn't disable norms. Here's the mapping:


{
    "media": {
        "properties": {
            "AUDIO": {
                "type": "string"
            },
            "BILLINGTYPE_ID": {
                "type": "long"
            },
            "CATMEDIA_CDATE": {
                "type": "date",
                "format": "dateOptionalTime"
            },
            "CATMEDIA_NAME": {
                "type": "string"
            },
            "CATMEDIA_RANK": {
                "type": "long"
            },
            "CAT_ID": {
                "type": "long"
            },
            "CAT_NAME": {
                "type": "string",
                "analyzer": "string_lowercase",
                "include_in_all": true
            },
            "CAT_PARENT": {
                "type": "long"
            },
            "CHANNEL_ID": {
                "type": "long"
            },
            "CKEY": {
                "type": "long"
            },
            "DISPLAY_NAME": {
                "type": "string"
            },
            "FTID": {
                "type": "string"
            },
            "GENRE": {
                "type": "string"
            },
            "ITEMCODE": {
                "type": "string"
            },
            "KEYWORDS": {
                "type": "string"
            },
            "LANG_ID": {
                "type": "long"
            },
            "LONG_DESCRIPTION": {
                "type": "string"
            },
            "MAPPINGS": {
                "type": "string",
                "analyzer": "string_lowercase",
                "include_in_all": true
            },
            "MEDIA_ID": {
                "type": "long"
            },
            "MEDIA_PKEY": {
                "type": "string"
            },
            "PERFORMER": {
                "type": "string"
            },
            "PLAYER": {
                "type": "string"
            },
            "POSITION": {
                "type": "long"
            },
            "PRICE": {
                "type": "double"
            },
            "PRIORITY": {
                "type": "long"
            },
            "SHORTCODE": {
                "type": "string"
            },
            "SHORT_DESCRIPTION": {
                "type": "string"
            },
            "TYPE_ID": {
                "type": "long"
            },
            "VIEW_ID": {
                "type": "long"
            }
        }
    }
}


My client is complaining about the relevancy of the results returned. Business
users always compare against Google search results, lol. For now I am
scratching my head trying to sort this out. My use case is to search
by DISPLAY_NAME and PERFORMER and show the closest possible matches at
the top of the list.

e.g.:

1)Happy
2)Happy
3)Be Happy

It would be deeply appreciated if you could shed some light on this. Thanks.






On Thu, Apr 3, 2014 at 1:51 AM, Ivan Brusic <[email protected]> wrote:

> All the documents have the same score since they have the same field
> weight, idf (always the same when you only have one search term) and term
> frequency (each document has the term once).
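
For reference, the numbers in the explain output quoted further down can be reproduced by hand. This is only a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any Elasticsearch API:

```python
import math


def classic_tf(freq: float) -> float:
    """Classic TF-IDF term-frequency factor: sqrt(freq)."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Classic TF-IDF inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight as listed in the explain output: tf * idf * fieldNorm."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Statistics copied from the quoted explain output.
top = field_weight(1.0, 58, 1249243, 1.0)   # hit #1
tied = field_weight(1.0, 80, 1321663, 1.0)  # hits #2 and #3

print(f"hit #1: {top:.6f}")
print(f"hits #2 and #3: {tied:.6f}")
```

With tf and fieldNorm both equal to 1, the score reduces to the idf alone. Hits #2 and #3 report identical docFreq and maxDocs, so they tie exactly; hit #1 reports different statistics (possibly because it lives on another shard, which is an assumption) and therefore scores slightly higher.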
>
> It appears that you disabled norms on the DISPLAY_NAME field since the
> field norm is 1. Is this correct? Can you provide the mapping? If you
> disable norms, you will no longer get length normalization, which would
> provide the ordering you desire since the field norms will penalize the
> longer field, but it might not be ideal for every search. Relevancy
> ultimately depends on you and your use cases. Another option is to enable
> term vectors [1] (or index the number of terms yourself) and see if the
> resulting field has the same number of tokens returned.  Very kludgy.
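
To illustrate the length normalization and the "index the number of terms yourself" idea, here is a rough Python sketch. It assumes the classic length norm 1/sqrt(number of terms), ignores the lossy single-byte encoding Lucene applies to stored norms, and counts tokens with a plain whitespace split instead of the real analyzer. The helper names (length_norm, rerank) are hypothetical, and the re-ranking is done client-side, not by Elasticsearch:

```python
import math


def length_norm(num_terms: int) -> float:
    """Unquantized classic length norm: 1 / sqrt(number of terms in the field)."""
    return 1.0 / math.sqrt(num_terms)


def rerank(hits):
    """Sort by score (descending), breaking ties by token count (ascending).

    hits is a list of (display_name, score) tuples. Token count is
    approximated with a whitespace split, which is an assumption.
    """
    return sorted(hits, key=lambda hit: (-hit[1], len(hit[0].split())))


# Display names and scores copied from the quoted results.
hits = [("Happy", 10.960511), ("Be Happy", 10.699952), ("Happy", 10.699952)]
ranked = rerank(hits)

for name, score in ranked:
    print(f"{score:.6f}  norm={length_norm(len(name.split())):.4f}  {name}")
```

Under these assumptions the one-term title gets a norm of 1.0 and the two-term title about 0.7071, so shorter titles would be favoured whenever the other factors are equal.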
>
> [1]
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-termvectors.html
>
> Cheers,
>
> Ivan
>
> On Wed, Apr 2, 2014 at 4:02 AM, chee hoo lum <[email protected]> wrote:
>
>> Hi Binh,
>>
>> The same problem again. I have the following query:
>>
>> 1)
>>
>> {
>>   "from" : 0,
>>   "size" : 100,
>>   "explain" : true,
>>   "query" : {
>>     "filtered" : {
>>       "query" : {
>>         "multi_match" : {
>>           "query" : "happy",
>>           "fields" : [ "DISPLAY_NAME^6", "PERFORMER" ]
>>         }
>>       },
>>       "filter" : {
>>         "query" : {
>>           "bool" : {
>>             "must" : {
>>               "term" : {
>>                 "CHANNEL_ID" : "1"
>>               }
>>             }
>>           }
>>         }
>>       }
>>     }
>>   }
>> }
>>
>> However, results #2 and #3 are displayed in reverse order. I have added
>> a boost to DISPLAY_NAME, but it still yields the same behaviour:
>>
>> 1)
>> * "_score": 10.960511,*
>>                 "_source": {
>>                     "DISPLAY_NAME": "Happy",
>>                     "PRICE": 5,
>>                     "CHANNEL_ID": 1,
>>                     "CAT_PARENT": 981,
>>                     "MEDIA_ID": 390933,
>>                     "GENRE": "Happy",
>>                     "MEDIA_PKEY": "838644",
>>                     "COMPOSER": null,
>>                     "PLAYER": null,
>>                     "CATMEDIA_NAME": "*Happy*",
>>                     "FTID": null,
>>                     "VIEW_ID": 43,
>>                     "POSITION": 51399,
>>                     "ITEMCODE": null,
>>                     "CAT_ID": 982,
>>                     "PRIORITY": 80,
>>                     "CKEY": 757447,
>>                     "CATMEDIA_RANK": 3,
>>                     "BILLINGTYPE_ID": 1,
>>                     "CAT_NAME": "POP",
>>                     "KEYWORDS": null,
>>                     "LONG_DESCRIPTION": null,
>>                     "SHORT_DESCRIPTION": null,
>>                     "TYPE_ID": 74,
>>                     "ARTIST_GENDER": null,
>>                    * "PERFORMER": "Mario Pacchioli",*
>>                     "MAPPINGS": "1_43_982_POP_981_51399_5",
>>                     "SHORTCODE": null,
>>                     "CATMEDIA_CDATE": "2014-01-12T15:12:27.000Z",
>>                     "LANG_ID": 1
>>                 },
>>                 "_explanation": {
>>                     "value": 10.960511,
>>                     "description": "max of:",
>>                     "details": [
>>                         {
>>                             "value": 10.960511,
>>                             "description": "weight(DISPLAY_NAME:happy^6.0 in 23025) [PerFieldSimilarity], result of:",
>>                             "details": [
>>                                 {
>>                                     "value": 10.960511,
>>                                     "description": "fieldWeight in 23025, product of:",
>>                                     "details": [
>>                                         {
>>                                             "value": 1,
>>                                             "description": "tf(freq=1.0), with freq of:",
>>                                             "details": [
>>                                                 {
>>                                                     "value": 1,
>>                                                     "description": "termFreq=1.0"
>>                                                 }
>>                                             ]
>>                                         },
>>                                         {
>>                                             "value": 10.960511,
>>                                             "description": "idf(docFreq=58, maxDocs=1249243)"
>>                                         },
>>                                         {
>>                                             "value": 1,
>>                                             "description": "fieldNorm(doc=23025)"
>>                                         }
>>                                     ]
>>                                 }
>>                             ]
>>                         }
>>                      ]
>>                 }
>>             }
>>
>>
>> 2)
>> "_id": "10194",
>>               *  "_score": 10.699952,*
>>                 "_source": {
>>                     "DISPLAY_NAME": "Be *Happy*",
>>                     "PRICE": 1.5,
>>                     "CHANNEL_ID": 1,
>>                     "CAT_PARENT": 557,
>>                     "MEDIA_ID": 10194,
>>                     "GENRE": "Be Happy",
>>                     "MEDIA_PKEY": "534570",
>>                     "COMPOSER": null,
>>                     "PLAYER": null,
>>                     "CATMEDIA_NAME": "Be Happy",
>>                     "FTID": null,
>>                     "VIEW_ID": 241,
>>                     "POSITION": 6733,
>>                     "ITEMCODE": "33271",
>>                     "CAT_ID": 558,
>>                     "PRIORITY": 100,
>>                     "CKEY": 528380,
>>                     "CATMEDIA_RANK": 3,
>>                     "BILLINGTYPE_ID": 1,
>>                     "CAT_NAME": "POP",
>>                     "KEYWORDS": null,
>>                     "LONG_DESCRIPTION": null,
>>                     "SHORT_DESCRIPTION": null,
>>                     "TYPE_ID": 76,
>>                     "ARTIST_GENDER": null,
>>                    * "PERFORMER": "Mary J. Blige",*
>>                     "MAPPINGS": "1_241_558_POP_557_6733_1.5",
>>                     "SHORTCODE": "0012139471",
>>                     "CATMEDIA_CDATE": "2014-01-26T20:04:46.000Z",
>>                     "LANG_ID": 1
>>                 },
>>                 "_explanation": {
>>                     "value": 10.699952,
>>                     "description": "max of:",
>>                     "details": [
>>                         {
>>                             "value": 10.699952,
>>                             "description": "weight(DISPLAY_NAME:happy^6.0 in 9092) [PerFieldSimilarity], result of:",
>>                             "details": [
>>                                 {
>>                                     "value": 10.699952,
>>                                     "description": "fieldWeight in 9092, product of:",
>>                                     "details": [
>>                                         {
>>                                             "value": 1,
>>                                             "description": "tf(freq=1.0), with freq of:",
>>                                             "details": [
>>                                                 {
>>                                                     "value": 1,
>>                                                     "description": "termFreq=1.0"
>>                                                 }
>>                                             ]
>>                                         },
>>                                         {
>>                                             "value": 10.699952,
>>                                             "description": "idf(docFreq=80, maxDocs=1321663)"
>>                                         },
>>                                         {
>>                                             "value": 1,
>>                                             "description": "fieldNorm(doc=9092)"
>>                                         }
>>                                     ]
>>                                 }
>>                             ]
>>                         }
>>                      ]
>>                 }
>>             },
>>
>>
>>
>> 3)
>> * "_score": 10.699952,*
>>                 "_source": {
>>                     "DISPLAY_NAME": "*Happy*",
>>                     "PRICE": 1.5,
>>                     "CHANNEL_ID": 1,
>>                     "CAT_PARENT": 557,
>>                     "MEDIA_ID": 8615,
>>                     "GENRE": "Happy",
>>                     "MEDIA_PKEY": "533022",
>>                     "COMPOSER": null,
>>                     "PLAYER": null,
>>                     "CATMEDIA_NAME": "Happy",
>>                     "FTID": null,
>>                     "VIEW_ID": 241,
>>                     "POSITION": 5685,
>>                     "ITEMCODE": "11927",
>>                     "CAT_ID": 558,
>>                     "PRIORITY": 100,
>>                     "CKEY": 526838,
>>                     "CATMEDIA_RANK": 3,
>>                     "BILLINGTYPE_ID": 1,
>>                     "CAT_NAME": "POP",
>>                     "KEYWORDS": null,
>>                     "LONG_DESCRIPTION": null,
>>                     "SHORT_DESCRIPTION": null,
>>                     "TYPE_ID": 76,
>>                     "ARTIST_GENDER": null,
>>                    * "PERFORMER": "Ashanti",*
>>                     "MAPPINGS": "1_241_558_POP_557_5685_1.5",
>>                     "SHORTCODE": "0012139036",
>>                     "CATMEDIA_CDATE": "2014-01-26T20:03:44.000Z",
>>                     "LANG_ID": 1
>>                 },
>>                 "_explanation": {
>>                     "value": 10.699952,
>>                     "description": "max of:",
>>                     "details": [
>>                         {
>>                             "value": 10.699952,
>>                             "description": "weight(DISPLAY_NAME:happy^6.0 in 11167) [PerFieldSimilarity], result of:",
>>                             "details": [
>>                                 {
>>                                     "value": 10.699952,
>>                                     "description": "fieldWeight in 11167, product of:",
>>                                     "details": [
>>                                         {
>>                                             "value": 1,
>>                                             "description": "tf(freq=1.0), with freq of:",
>>                                             "details": [
>>                                                 {
>>                                                     "value": 1,
>>                                                     "description": "termFreq=1.0"
>>                                                 }
>>                                             ]
>>                                         },
>>                                         {
>>                                             "value": 10.699952,
>>                                             "description": "idf(docFreq=80, maxDocs=1321663)"
>>                                         },
>>                                         {
>>                                             "value": 1,
>>                                             "description": "fieldNorm(doc=11167)"
>>                                         }
>>                                     ]
>>                                 }
>>                             ]
>>                         }
>>                      ]
>>                 }
>>             },
>>
>>
>> May I know how #2 and #3 can yield the same score even though they
>> have different text values? Also, how can I reverse #2 and #3?
>> I want the results returned by relevancy, so I assume
>> they should be returned in this order:
>>
>> 1)Happy
>> 2)Happy
>> 3)Be Happy
>>
>>
>> Thanks.
>>



-- 
Regards,

Chee Hoo

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGS0%2Bg__Ghng%3Dzgb_FTrEzUvyB2cejzLrqnB1iTgvuegEDK4-g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
