Your analysis of what is going on sounds correct. However, Elasticsearch's results are also correct. When it analyzes the search string, your query effectively becomes a query for "foo" AND "bar", which matches any document containing both of those terms, in any order. Most queries against analyzed fields do not respect the original ordering of the terms.
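To make that concrete, here is a rough Python sketch of the behaviour, not Elasticsearch itself. It assumes the analyzer lowercases and splits on non-alphanumeric characters; the regex `[\W_]+` is only an approximation of the `[^\p{L}\p{N}]+` pattern from the settings, since Python's `re` module has no `\p{...}` classes, and the function names are made up for illustration.

```python
import re

# Approximation of the "letterordigit" pattern analyzer: split on runs of
# characters that are neither letters nor digits (assumption, see above).
SPLIT = re.compile(r"[\W_]+")


def analyze(text):
    """Lowercase the text and break it into tokens, dropping empty pieces."""
    return [token for token in SPLIT.split(text.lower()) if token]


def matches_all_terms(query, document):
    """AND semantics: every query token must occur somewhere in the document.

    Token positions are ignored entirely, which is why term order
    does not matter here.
    """
    return set(analyze(query)) <= set(analyze(document))


if __name__ == "__main__":
    print(analyze("foo-bar"))
    for doc in ("foo xyz bar", "bar xyz foo", "foo only"):
        print(doc, "->", matches_all_terms("foo-bar", doc))
```

Both "foo xyz bar" and "bar xyz foo" pass this check because only the presence of each token is tested, never its position.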
One thing you could try is the match_phrase query (http://www.elastic.co/guide/en/elasticsearch/guide/master/phrase-matching.html), which is aware of the ordering of the terms. A plain match_phrase query for "foo bar" will not match either "foo xyz bar" or "bar xyz foo". If you still need to match things like "foo xyz bar", you may be able to do that with the slop parameter, depending on what exactly the use case is.

James

On Tue, Apr 14, 2015 at 2:03 PM, Dave Reed <[email protected]> wrote:
> I have the following search:
>
> {
>   "query": {
>     "filtered": {
>       "query": {
>         "query_string": {
>           "default_operator": "AND",
>           "query": "details:foo\\-bar"
>         }
>       },
>       "filter": {
>         "term": {
>           "deleted": false
>         }
>       }
>     }
>   }
> }
>
> The details field is analyzed using pattern tokenizer, as so:
>
> settings: {
>   index.analysis.analyzer.letterordigit.pattern: "[^\\p{L}\\p{N}]+",
>   index.analysis.analyzer.letterordigit.type: "pattern"
> }
>
> This breaks the field into tokens separated by any non-letter or
> non-numeric character.
>
> But the user is searching for "foo-bar" which contains a non alphanumeric
> character. I assume, but correct me if I'm wrong, that ES will apply the
> same analyzer to that string. So it is broken into two tokens: ["foo",
> "bar"], and then the default_operator kicks in and essentially turns the
> query into "details:foo AND detail:bar".
>
> My problem is that it will match documents containing "foo xyz bar" and
> "bar xyz foo" -- in the latter case, the tokens are in the reverse order
> from the user's search. I'm fine with it matching the former, but it's a
> stretch to convince the user that the latter is intended.
>
> The search string is provided by the user, so I can't really build a
> complex query with different query types, hence the basic querystring
> search.
>
> Any advice or corrections to my assumptions is appreciated!
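
P.S. For reference, a rough sketch of how the quoted search might look with match_phrase and slop swapped in for query_string. The helper name `phrase_search_body` and the slop value are illustrative assumptions; the `details` and `deleted` fields and the filtered/term structure are taken from the quoted search. Treat it as a starting point to test, not a drop-in fix.

```python
import json


def phrase_search_body(field, user_text, slop=0):
    """Build a filtered search body that uses match_phrase on one field.

    The user's raw text is passed straight through; match_phrase runs it
    through the field's analyzer, so "foo-bar" should end up as the
    adjacent tokens "foo" and "bar".
    """
    return {
        "query": {
            "filtered": {
                "query": {
                    "match_phrase": {
                        field: {
                            "query": user_text,
                            # Number of position moves allowed between terms.
                            "slop": slop,
                        }
                    }
                },
                "filter": {"term": {"deleted": False}},
            }
        }
    }


if __name__ == "__main__":
    # slop=1 is an example value, not a recommendation.
    print(json.dumps(phrase_search_body("details", "foo-bar", slop=1), indent=2))
```

A slop of 1 should be enough to let "foo xyz bar" match. Larger slop values can start to admit reordered terms again, so it is worth checking the value against real documents.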
