Thanks, though unless I am misunderstanding them, the docs imply otherwise. For example, from http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html:

> The query string is parsed into a series of *terms* and *operators*. A term
> can be a single word — quick or brown — or a phrase, surrounded by double
> quotes — "quick brown" — which searches for all the words in the phrase,
> in the same order.

So what gives? :)

On Tuesday, April 14, 2015 at 1:15:24 PM UTC-7, James Macdonald wrote:
>
> Your analysis of what is going on sounds correct. However, Elasticsearch's
> results are also correct. When it analyzes the search string, your query
> becomes a match query on "foo" AND "bar", which matches any document
> containing both of those terms. Most queries against analyzed fields do not
> respect the original ordering of the terms.
>
> One thing you could try is looking into the match_phrase query
> (http://www.elastic.co/guide/en/elasticsearch/guide/master/phrase-matching.html),
> which is aware of the ordering of the terms. Using the base match_phrase
> query for "foo bar" will not match either "foo xyz bar" or "bar xyz foo".
> If you still need to match things like "foo xyz bar" you may be able to do
> that using the slop parameter, depending on what exactly the use case is.
>
> James
>
> On Tue, Apr 14, 2015 at 2:03 PM, Dave Reed wrote:
>
>> I have the following search:
>>
>> {
>>   "query": {
>>     "filtered": {
>>       "query": {
>>         "query_string": {
>>           "default_operator": "AND",
>>           "query": "details:foo\\-bar"
>>         }
>>       },
>>       "filter": {
>>         "term": {
>>           "deleted": false
>>         }
>>       }
>>     }
>>   }
>> }
>>
>> The details field is analyzed using a pattern tokenizer, like so:
>>
>> settings: {
>>   index.analysis.analyzer.letterordigit.pattern: "[^\\p{L}\\p{N}]+",
>>   index.analysis.analyzer.letterordigit.type: "pattern"
>> }
>>
>> This breaks the field into tokens separated by any non-letter or
>> non-numeric character.
>>
>> But the user is searching for "foo-bar", which contains a non-alphanumeric
>> character. I assume, but correct me if I'm wrong, that ES will apply the
>> same analyzer to that string. So it is broken into two tokens: ["foo",
>> "bar"], and then the default_operator kicks in and essentially turns the
>> query into "details:foo AND details:bar".
>>
>> My problem is that it will match documents containing "foo xyz bar" and
>> "bar xyz foo" -- in the latter case, the tokens are in the reverse order
>> from the user's search. I'm fine with it matching the former, but it's a
>> stretch to convince the user that the latter is intended.
>>
>> The search string is provided by the user, so I can't really build a
>> complex query with different query types, hence the basic query_string
>> search.
>>
>> Any advice or corrections to my assumptions are appreciated!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7a355b94-358f-4c5a-ac16-31ac7a0c0abe%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
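
To make the difference discussed in this thread concrete, here is a rough Python sketch. It does not talk to Elasticsearch at all. `analyze` is only an approximation of the letterordigit pattern analyzer (Python's `re` has no `\p{L}`/`\p{N}`, so `[\W_]+` stands in for it), and `matches_all_terms`, `matches_phrase`, and `build_phrase_search` are hypothetical helper names invented for illustration. The last helper assumes that wrapping the user's input in double quotes inside query_string gives the phrase behavior the docs describe, which I have not verified against this exact mapping.

```python
import re


def analyze(text):
    """Approximate the pattern analyzer: split on non-letter/non-digit runs.

    Assumption: lowercasing is applied, as the pattern analyzer does by default.
    """
    return [tok for tok in re.split(r"[\W_]+", text.lower()) if tok]


def matches_all_terms(query, doc):
    """AND of independent terms: every query token must occur, in any order."""
    return set(analyze(query)) <= set(analyze(doc))


def matches_phrase(query, doc):
    """Exact phrase: query tokens must occur consecutively and in order."""
    q, d = analyze(query), analyze(doc)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


def build_phrase_search(user_input, field="details"):
    """Build the search body from the thread, with the input quoted as a phrase."""
    # Escape backslashes and double quotes so the input stays inside the quotes.
    escaped = user_input.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "query": {
            "filtered": {
                "query": {
                    "query_string": {
                        "default_operator": "AND",
                        "query": f'{field}:"{escaped}"',
                    }
                },
                "filter": {"term": {"deleted": False}},
            }
        }
    }


if __name__ == "__main__":
    for doc in ("foo bar", "foo xyz bar", "bar xyz foo"):
        print(doc, matches_all_terms("foo-bar", doc), matches_phrase("foo-bar", doc))
    print(build_phrase_search("foo-bar")["query"]["filtered"]["query"]["query_string"]["query"])
```

In this sketch the AND-of-terms check accepts "bar xyz foo" while the phrase check rejects it, which is the behavior being complained about. The phrase check also rejects "foo xyz bar", so matching that document would need slop, as James suggests. A sloppy phrase can admit reordered terms once the slop is large enough, so the value would need checking against real data.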
