Problem: 

Recently I wanted to run a proximity search on an Elasticsearch index. I
wanted to find all docs where ‘measles’ and ‘vaccin*’ occur within 25
words of each other (slop is counted in token positions, not characters),
and in that order.

Elasticsearch's built-in proximity search (the ~N operator) wasn’t an
option for two reasons.

1. Proximity search doesn’t support wildcards, e.g. (“measles vaccine”)~25
is supported, but (“measles vacci*”)~25 and (“measle* vacci*”) are not.

2. Proximity search doesn’t respect the order of the words in the phrase,
e.g. (“measles vaccine”)~25 and (“vaccine measles”)~25 give the same
results.

Solution: 
A few examples that work around both limitations using span_near:

1. (“measles vacci*”)~25 
{ 
  "query": { 
    "span_near": { 
      "clauses": [ 
        { 
          "span_or": { 
            "clauses": [ 
              { 
                "span_term": { 
                  "text": "measles" 
                } 
              } 
            ] 
          } 
        }, 
        { 
          "span_or": { 
            "clauses": [ 
              { 
                "span_multi": { 
                  "match": { 
                    "prefix": { 
                      "text": { 
                        "value": "vacci" 
                      } 
                    } 
                  } 
                } 
              } 
            ] 
          } 
        } 
      ], 
      "slop": 25, 
      "in_order": "true", 
      "collect_payloads": "true" 
    } 
  } 
} 

// in_order toggles between ordered ("true") and unordered ("false") matching.
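
The same query can also be assembled programmatically instead of written out by hand. A minimal Python sketch, assuming the field is called text as in the example; the helper names span_term, span_prefix and span_near are made up for illustration and are not part of any client library. It leaves out the single-clause span_or wrappers and collect_payloads, since span_near accepts span_term and span_multi clauses directly.

```python
import json


def span_term(field, value):
    """Exact-term span clause (hypothetical helper)."""
    return {"span_term": {field: value}}


def span_prefix(field, prefix):
    """Prefix clause wrapped in span_multi so it can sit inside span_near."""
    return {"span_multi": {"match": {"prefix": {field: {"value": prefix}}}}}


def span_near(clauses, slop, in_order=True):
    """Combine span clauses; slop is the allowed distance in positions."""
    return {"span_near": {"clauses": list(clauses),
                          "slop": slop,
                          "in_order": in_order}}


# Equivalent of ("measles vacci*")~25 with the word order enforced.
query = {
    "query": span_near(
        [span_term("text", "measles"), span_prefix("text", "vacci")],
        slop=25,
        in_order=True,
    )
}

print(json.dumps(query, indent=2))
```

Passing in_order=False to the same helper gives the unordered variant.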

2. “measle* vacci*” 
{ 
  "query": { 
    "span_near": { 
      "clauses": [ 
        { 
          "span_or": { 
            "clauses": [ 
              { 
                "span_multi": { 
                  "match": { 
                    "prefix": { 
                      "text": { 
                        "value": "measle" 
                      } 
                    } 
                  } 
                } 
              } 
            ] 
          } 
        }, 
        { 
          "span_or": { 
            "clauses": [ 
              { 
                "span_multi": { 
                  "match": { 
                    "prefix": { 
                      "text": { 
                        "value": "vacci" 
                      } 
                    } 
                  } 
                } 
              } 
            ] 
          } 
        } 
      ], 
      "slop": 0, 
      "in_order": "true", 
      "collect_payloads": "true" 
    } 
  } 
} 

3. Grouping. Now let's assume you want to find all docs where (canada OR
toronto OR “North york”) NEAR (measles OR vaccin*), with the two groups
within 30 positions of each other, in either order.
{ 
  "query": { 
    "span_near": { 
      "clauses": [ 
        { 
          "span_or": { 
            "clauses": [ 
              { 
                "span_near": { 
                  "clauses": [ 
                    { 
                      "span_term": { 
                        "text": "north" 
                      } 
                    }, 
                    { 
                      "span_term": { 
                        "text": "york" 
                      } 
                    } 
                  ], 
                  "slop": 0, 
                  "in_order": "true", 
                  "collect_payloads": "true" 
                } 
              }, 
              { 
                "span_term": { 
                  "text": "toronto" 
                } 
              }, 
              { 
                "span_term": { 
                  "text": "canada" 
                } 
              } 
            ] 
          } 
        }, 
        { 
          "span_or": { 
            "clauses": [ 
              { 
                "span_term": { 
                  "text": "measles" 
                } 
              }, 
              { 
                "span_multi": { 
                  "match": { 
                    "prefix": { 
                      "text": { 
                        "value": "vaccin" 
                      } 
                    } 
                  } 
                } 
              } 
            ] 
          } 
        } 
      ], 
      "slop": 30, 
      "in_order": "false", 
      "collect_payloads": "true" 
    } 
  } 
} 
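
The grouping pattern is regular enough to generate from lists of terms. A rough Python sketch, under a few assumptions: the field is named text, items are non-empty strings, a trailing * marks a prefix, a multi-word item is an exact phrase, and the field's analyzer lowercases tokens (hence the lower() call, since span_term is not analyzed). The function names to_span, span_or and near_groups are hypothetical.

```python
def to_span(field, item):
    """Turn one user-level item into a span clause."""
    words = item.lower().split()
    if len(words) > 1:
        # Multi-word item: treat as an exact phrase (adjacent and ordered).
        return {"span_near": {"clauses": [to_span(field, w) for w in words],
                              "slop": 0,
                              "in_order": True}}
    word = words[0]
    if word.endswith("*"):
        # Trailing wildcard: prefix query wrapped in span_multi.
        return {"span_multi": {"match": {"prefix": {field: {"value": word[:-1]}}}}}
    return {"span_term": {field: word}}


def span_or(field, items):
    """Any one of the items may match."""
    return {"span_or": {"clauses": [to_span(field, i) for i in items]}}


def near_groups(field, groups, slop, in_order=False):
    """Each group is an OR list; the groups must occur within slop positions."""
    return {"query": {"span_near": {"clauses": [span_or(field, g) for g in groups],
                                    "slop": slop,
                                    "in_order": in_order}}}


query = near_groups(
    "text",
    [["canada", "toronto", "North york"], ["measles", "vaccin*"]],
    slop=30,
)
```

The order of items inside a group does not matter to span_or, so the list can be passed in whatever order the user typed it.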



If anyone knows a better solution than this one, please comment. I would
also welcome suggestions on how to build a parser that takes a query from
the user, e.g. (quick AND near(foxes OR rats, toronto OR ontario, 30)), and
converts it to an Elasticsearch span_near query using the above workaround.
Boolean operators and parenthesis precedence are what I am finding hard to
handle. Is there any open source PHP library which can help me translate
user-written queries with parentheses and boolean operators into ES filters?
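
One possible direction for the precedence problem is a small recursive-descent parser, sketched here in Python rather than PHP. Everything in it is an assumption made for illustration: the grammar (AND binds tighter than OR, near(group, group, N) takes two OR lists and a slop), the field name text, and the Parser class itself. Plain words become span_term clauses, and near() is always unordered.

```python
import re

FIELD = "text"  # hypothetical field name
RESERVED = {"(", ")", ",", "AND", "OR"}


def span(word):
    """Single word -> span_term, or span_multi/prefix if it ends with *."""
    if word.endswith("*"):
        return {"span_multi": {"match": {"prefix": {FIELD: {"value": word[:-1]}}}}}
    return {"span_term": {FIELD: word}}


class Parser:
    def __init__(self, text):
        # Split into parentheses, commas, and runs of other characters.
        self.toks = re.findall(r"[(),]|[^\s(),]+", text)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"unexpected token {tok!r} at position {self.pos}")
        self.pos += 1
        return tok

    def parse(self):
        node = self.or_expr()
        if self.peek() is not None:
            raise ValueError(f"trailing input starting at {self.peek()!r}")
        return {"query": node}

    def or_expr(self):  # lowest precedence
        parts = [self.and_expr()]
        while self.peek() == "OR":
            self.take()
            parts.append(self.and_expr())
        return parts[0] if len(parts) == 1 else {"bool": {"should": parts}}

    def and_expr(self):  # binds tighter than OR
        parts = [self.atom()]
        while self.peek() == "AND":
            self.take()
            parts.append(self.atom())
        return parts[0] if len(parts) == 1 else {"bool": {"must": parts}}

    def atom(self):
        tok = self.take()
        if tok == "(":
            node = self.or_expr()
            self.take(")")
            return node
        if tok == "near" and self.peek() == "(":
            self.take("(")
            left = self.group()
            self.take(",")
            right = self.group()
            self.take(",")
            slop = int(self.take())
            self.take(")")
            return {"span_near": {"clauses": [left, right],
                                  "slop": slop, "in_order": False}}
        if tok in RESERVED:
            raise ValueError(f"unexpected token {tok!r}")
        return span(tok)

    def group(self):  # word (OR word)* inside near(...)
        clauses = [span(self.word())]
        while self.peek() == "OR":
            self.take()
            clauses.append(span(self.word()))
        return {"span_or": {"clauses": clauses}}

    def word(self):
        tok = self.take()
        if tok in RESERVED:
            raise ValueError(f"expected a word, got {tok!r}")
        return tok


result = Parser("(quick AND near(foxes OR rats, toronto OR ontario, 30))").parse()
```

Here result holds a bool query whose must list contains a span_term for quick and a span_near with slop 30 over the two span_or groups. Quoted phrases and NOT are not handled by this sketch.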

@shay banon, @steven, @uri: are there any plans to support such an operator
in query_string?

References: 

http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-span-near-query.html

http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-span-multi-term-query.html

http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-span-or-query.html
