I am also very keen on an answer! If you find a solution, please let me know.
Sebastian
On Thursday, 16 January 2014 15:12:23 UTC+1, Dany Gielow wrote:
>
> In my Elasticsearch index I have documents that have multiple tokens at
> the same position.
>
> I want to get a document back when I match at least one token at every
> position.
> The order of the tokens is not important. How can I accomplish that?
> I use Elasticsearch 0.90.5.
>
> *Example:*
>
> I index a document like this.
>
> {
> "field":"red car"
> }
>
>
> I use a synonym token filter that adds synonyms at the same positions as
> the original token.
> So now in the field, there are 2 positions:
>
>
> - Position 1: "red"
> - Position 2: "car", "automobile"
>
>
> *My solution for now:*
>
> To be able to ensure that all positions match, I index the maximum
> position as well.
>
> {
> "field":"red car",
> "max_position": 2
> }
>
>
> I have a custom similarity that extends DefaultSimilarity and returns 1
> for tf(), idf() and lengthNorm(). The resulting score is the number of
> matching terms in the field.
>
> Query:
>
> {
> "query": {
> "custom_score": {
> "query": {
> "match": {
> "field": "a car is an automobile"
> }
> },
> "script": "_score*100/doc[\"max_position\"].value+_score"
> }
> },
> "min_score": 100
> }
>
>
>
> *Problem with my solution:*
> The above search should not match the document, because there is no token
> "red" in the query string. But it matches, because Elasticsearch counts the
> matches for car and automobile as two matches and that gives a score of 2
> which leads to a script score of 102, which satisfies the "min_score".
>
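
For anyone following the thread, here is a minimal Python sketch of the problem described above. It does not use Elasticsearch at all and is not a solution for 0.90.5. It only models the example document as a map from position to tokens, to show why counting matching terms gives 102 and what a per-position check would return instead. All names in it (doc_positions, term_match_score, script_score, all_positions_covered) are made up for illustration.

```python
# Hypothetical model of the indexed field after the synonym filter:
# position -> set of tokens stored at that position.
doc_positions = {
    1: {"red"},
    2: {"car", "automobile"},
}
max_position = 2


def term_match_score(query_tokens, positions):
    """Count distinct query terms found in the field.

    Mimics the similarity from the post, where tf(), idf() and
    lengthNorm() all return 1, so every matching term adds 1.
    """
    query = set(query_tokens)
    return sum(len(tokens & query) for tokens in positions.values())


def script_score(score, max_pos):
    """Same arithmetic as the script in the post."""
    return score * 100 / max_pos + score


def all_positions_covered(query_tokens, positions):
    """Desired behaviour: at least one query token at every position."""
    query = set(query_tokens)
    return all(tokens & query for tokens in positions.values())


query = "a car is an automobile".split()

# "car" and "automobile" both match, but they share position 2.
score = term_match_score(query, doc_positions)
print(score)                                        # 2
print(script_score(score, max_position))            # 102.0, passes min_score 100
print(all_positions_covered(query, doc_positions))  # False, "red" is missing

# Token order does not matter for the per-position check.
print(all_positions_covered("automobile red".split(), doc_positions))  # True
```

The sketch suggests that the score would have to count matched positions rather than matched terms, so that two synonyms at the same position contribute only once. How to express that in an Elasticsearch query is still the open question here.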