Re: Querystring search: Tokens are out of order

Dave Reed Tue, 14 Apr 2015 13:35:30 -0700

Thanks, though unless I am misunderstanding it, the docs imply otherwise:

For example, from:
http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html


> The query string is parsed into a series of *terms* and *operators*. A term 
> can be a single word — quick or brown — or a phrase, surrounded by double 
> quotes — "quick brown" — which searches for all the words in the phrase, 
> in the same order.


So what gives? :)
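
For reference, here is the quoted form I take that passage to describe. It is 
only a sketch: the same search as before with the user's input wrapped in 
double quotes, on the assumption that the analyzer still splits it into foo 
and bar. I have not verified it against my index.

{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "default_operator": "AND",
          "query": "details:\"foo-bar\""
        }
      },
      "filter": {
        "term": {
          "deleted": false
        }
      }
    }
  }
}

My original search sends foo\-bar without quotes, so it may simply not count 
as a phrase in the sense the docs mean.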

On Tuesday, April 14, 2015 at 1:15:24 PM UTC-7, James Macdonald wrote:
>
> Your analysis of what is going on sounds correct. However, Elasticsearch's 
> results are also correct. When it analyzes the search string, your query 
> becomes a match query on "foo" AND "bar", which matches any document 
> containing both of those terms. Most queries against analyzed fields do not 
> respect the original ordering of the terms. 
>
> One thing you could try is looking into the match_phrase query 
> (http://www.elastic.co/guide/en/elasticsearch/guide/master/phrase-matching.html), 
> which is aware of the ordering of the terms. Using the base match_phrase 
> query for "foo bar" will not match either "foo xyz bar" or "bar xyz foo". 
> If you still need to match things like "foo xyz bar" you may be able to do 
> that using the slop parameter, depending on what exactly the use case is. 
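>
> A minimal sketch of what that could look like, assuming the same details 
> field and leaving out your deleted filter for brevity (I have not run this 
> against your mapping):
>
> {
>   "query": {
>     "match_phrase": {
>       "details": {
>         "query": "foo bar",
>         "slop": 1
>       }
>     }
>   }
> }
>
> With a slop of 1 I would expect "foo xyz bar" to match. "bar xyz foo" 
> should need a larger slop, since reversing the terms costs extra position 
> moves, so a small value should keep the out-of-order case out.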
>
> James
>
> On Tue, Apr 14, 2015 at 2:03 PM, Dave Reed <[email protected]> wrote:
>
>> I have the following search:
>>
>> {
>>   "query": {
>>     "filtered": {
>>       "query": {
>>         "query_string": {
>>           "default_operator": "AND",
>>           "query": "details:foo\\-bar"
>>         }
>>       },
>>       "filter": {
>>         "term": {
>>           "deleted": false
>>         }
>>       }
>>     }
>>   }
>> }
>>
>>
>>
>> The details field is analyzed using a pattern analyzer, like so:
>>
>> settings: {
>>   index.analysis.analyzer.letterordigit.pattern: "[^\\p{L}\\p{N}]+",
>>   index.analysis.analyzer.letterordigit.type: "pattern"
>> }
>>
>>
>> This breaks the field into tokens separated by any non-letter or 
>> non-numeric character. 
>>
>> But the user is searching for "foo-bar" which contains a non-alphanumeric 
>> character. I assume, but correct me if I'm wrong, that ES will apply the 
>> same analyzer to that string. So it is broken into two tokens: ["foo", 
>> "bar"], and then the default_operator kicks in and essentially turns the 
>> query into "details:foo AND details:bar".
>>
>> My problem is that it will match documents containing "foo xyz bar" and 
>> "bar xyz foo" -- in the latter case, the tokens are in the reverse order 
>> from the user's search. I'm fine with it matching the former, but it's a 
>> stretch to convince the user that the latter is intended.
>>
>> The search string is provided by the user, so I can't really build a 
>> complex query with different query types, hence the basic querystring 
>> search. 
>>
>> Any advice or corrections to my assumptions are appreciated!
>>
>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/4a204214-f209-48dd-a13a-96463609ad7d%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

