In my Elasticsearch index I have documents that have multiple tokens at the 
same position.

I want to get a document back only when the query matches at least one token 
at every position of the field.
The order of the tokens is not important. How can I accomplish that?
I use Elasticsearch 0.90.5.

*Example:*

I index a document like this.
    
    {
        "field":"red car"
    }


I use a synonym token filter that adds synonyms at the same positions as 
the original token.
So now in the field, there are 2 positions:


   - Position 1: "red"
   - Position 2: "car", "automobile"
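
To make the requirement concrete, here is a minimal Python sketch of the matching rule I am after. It is only a toy model of the analyzed field, not Elasticsearch code: `analyze_query` and `matches_every_position` are hypothetical names, and the query is simply split on whitespace.

```python
# Toy model of the analyzed field: one set of tokens per position.
# All names here are hypothetical; this is not an Elasticsearch API.

def analyze_query(text):
    # Assumption: the query is lowercased and split on whitespace.
    return set(text.lower().split())


def matches_every_position(doc_positions, query_tokens):
    # True only if every position shares at least one token with the query.
    return all(tokens & query_tokens for tokens in doc_positions)


# "red car" after the synonym filter: two positions.
doc_positions = [{"red"}, {"car", "automobile"}]

# No "red" in the query, so position 1 is not covered.
print(matches_every_position(doc_positions, analyze_query("a car is an automobile")))  # False

# Order does not matter, and a synonym is enough to cover position 2.
print(matches_every_position(doc_positions, analyze_query("automobile red")))  # True
```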
   

*My solution for now:*

To be able to ensure that all positions match, I index the maximum position 
as well.

    {
        "field":"red car",
        "max_position": 2
    }


I have a custom similarity that extends DefaultSimilarity and returns 1 
from tf(), idf() and lengthNorm(). The resulting score is therefore the 
number of matching terms in the field.

Query:

    {
        "query": {
            "custom_score": {
                "query": {
                    "match": {
                        "field": "a car is an automobile"
                    }
                },
                "script": "_score*100/doc[\"max_position\"].value+_score"
            }
        },
        "min_score": 100
    }



*Problem with my solution:*
The above search should not match the document, because the query string 
does not contain the token "red". But it does match: Elasticsearch counts the 
matches for "car" and "automobile" as two matching terms, which gives a score 
of 2 and therefore a script score of 102, which satisfies the "min_score".
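
The arithmetic can be reproduced with a small Python sketch. It is only a model of the scoring, under the assumption that the score equals the number of matching terms. `script_score` is a hypothetical helper that mirrors the script expression, and the position-based count is shown only for contrast, to illustrate what the score would have to be for the threshold to work.

```python
# Toy reproduction of the scoring problem; not Elasticsearch code.

def script_score(score, max_position):
    # Mirrors the script expression: _score * 100 / max_position + _score
    return score * 100 / max_position + score


# "red car" after the synonym filter: one set of tokens per position.
doc_positions = [{"red"}, {"car", "automobile"}]
query_tokens = set("a car is an automobile".split())
max_position = len(doc_positions)

# What the custom similarity effectively counts: matching terms.
matching_terms = sum(len(tokens & query_tokens) for tokens in doc_positions)

# What the requirement actually needs: matching positions.
matching_positions = sum(1 for tokens in doc_positions if tokens & query_tokens)

print(matching_terms, script_score(matching_terms, max_position))          # 2 102.0
print(matching_positions, script_score(matching_positions, max_position))  # 1 51.0
```

With term counting the document reaches 102 and passes the threshold of 100. If the score counted positions instead, it would be 51 and the document would be filtered out.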
