Hello Everyone,

I would like to implement a popularity-based boost in my Elasticsearch 
engine. I periodically calculate custom popularity boost factors for 
documents, but I store these float values in a child document, because 
I want to avoid a full reindex of the main article documents.

The mapping of the child document is the following:

{
  "document_boost": {
    "_parent": {
      "type": "document"
    },
    "popular_boost_total": {
      "type": "float"
    },
    "popular_boost_recent": {
      "type": "float"
    },
    "last_updated": {
      "type": "date"
    }
  }
}

I would like to create a query that:

   - executes the main query provided by the end user
   - attaches the child document (1-1 relation to the parent)
   - boosts the score of the main query by multiplying it by the custom boost 
   factors read from the child document (popular_boost_total, 
   popular_boost_recent)
   
I have been struggling with this for a while and could not find a really 
nice solution. The best one I could find is the following 
(simplified):

GET index/document/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "basketball"
          }
        }
      ],
      "should": [
        {
          "has_child": {
            "type": "document_boost",
            "query": {
              "function_score": {
                "script_score": {
                  "script": "doc['document_boost.popular_boost_total'].value"
                }
              }
            }
          }
        }
      ]
    }
  }
}

However, this is not a real boost, because the should clause contributes an 
additional score rather than multiplying the primary query score. In this 
case, the amount of boost cannot be expressed as a clean percentage; it is a 
noisy additive score, and the effective boosting factor depends on the 
absolute score of the particular query. So I think it is wrong.
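
To make the difference concrete, here is a small sketch in plain Python (not 
Elasticsearch code). It assumes the bool query simply adds the should score to 
the must score and ignores coord and query normalization; the function name 
and the sample scores are made up for illustration only.

```python
# Hypothetical illustration: compare an additive boost (bool/should)
# with a multiplicative boost (function_score-style multiplication).

def effective_factor(query_score, boost, combine):
    """Return final_score / query_score for the given combination mode."""
    if combine == "add":
        final_score = query_score + boost
    elif combine == "multiply":
        final_score = query_score * boost
    else:
        raise ValueError("combine must be 'add' or 'multiply'")
    return final_score / query_score

boost = 1.5  # assumed popularity factor read from the child document

for query_score in (0.5, 2.0, 8.0):  # assumed scores of the main query
    print(
        query_score,
        effective_factor(query_score, boost, "add"),
        effective_factor(query_score, boost, "multiply"),
    )
```

With the additive mode the effective factor shrinks as the main query score 
grows, while with multiplication it stays at the stored boost value for every 
query, which is the behaviour I am after.
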
I would be able to solve it if the custom boost factors were stored not in 
child documents but in fields of the parent document:

GET index/document/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {
          "title": "basketball"
        }
      },
      "script_score": {
        "script": "doc['popular_boost_recent'].value"
      }
    }
  }
}

Obviously, in the above case we do not need the has_child query at all.
I also tried it without the bool query:

GET index/document/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {
          "title": "basketball"
        }
      },
      "functions": [
        {
          "filter" : {
            "has_child": {
              "type": "document_boost",
              "query": {"match_all": {}}
            }
          },
          "script_score": {
            "script": "doc['document_boost.popular_boost_recent'].value"
          }
        }
      ]
    }
  }
}

In the above case, the script reads the value from the parent document, not 
from the child! This looks like a bug to me, since I explicitly specify the 
fully qualified field name.

I think, considering the possibilities of the query API syntax, the last 
query above would be the solution for real multiplicative boosting, but 
it simply does not work.
Another solution would be if I could define the score mode for the 
bool query, i.e. tell Elasticsearch to multiply the scores of the parts 
instead of adding them.

Are there others facing the same issue? I think popularity-based and other 
kinds of custom boosts are a common requirement nowadays.
Can somebody give me a hint? I hope I have just misunderstood something...

Thanks!

Regards,
Csaba
