The reason these percolator queries don't match has nothing to do with
the percolator itself, but with text analysis. The default analyzer for
string fields breaks the id up at each dash, while the terms filter and
the terms query require exact term matches, so nothing matches. The match
query, on the other hand, checks whether a field is analyzed and applies
the analyzer configured in the mapping to the query text, which is why the
percolator query does match when you use the match query.
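
To make the difference concrete, here is a rough shell simulation. It is
only a sketch: it assumes the analyzer does nothing more than split on '-',
which is a simplification, and all variable names are made up for
illustration. It also shows why a match query can give a false positive for
a UUID that differs only in its last part, since a match query is satisfied
by any one shared token by default.

```shell
# Rough simulation only: split on '-' as described above. The real
# analyzer does more than this; all names here are hypothetical.
query_id='1aa808dc-48f0-4de3-8978-a0293d54b852'
doc_id='1aa808dc-48f0-4de3-8978-a0293d54b852'
near_doc_id='1aa808dc-48f0-4de3-8978-00293d54b852'

# Terms held for an analyzed field: one per dash-separated piece.
analyzed_terms=$(printf '%s\n' "$doc_id" | tr '-' '\n')

# A terms query needs one whole term equal to the query value
# (grep -x: whole line, -F: fixed string).
if printf '%s\n' "$analyzed_terms" | grep -qxF -- "$query_id"; then
  analyzed_match=yes
else
  analyzed_match=no
fi

# A not_analyzed field holds the value as a single, untouched term.
if printf '%s\n' "$doc_id" | grep -qxF -- "$query_id"; then
  not_analyzed_match=yes
else
  not_analyzed_match=no
fi

# A match query analyzes the query text too, so one shared token is
# enough. grep treats each line of $query_terms as a separate pattern.
query_terms=$(printf '%s\n' "$query_id" | tr '-' '\n')
near_terms=$(printf '%s\n' "$near_doc_id" | tr '-' '\n')
if printf '%s\n' "$near_terms" | grep -qxF -- "$query_terms"; then
  near_match=yes
else
  near_match=no
fi

echo "terms on analyzed field:     $analyzed_match"
echo "terms on not_analyzed field: $not_analyzed_match"
echo "match query, near-miss UUID: $near_match"
```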

In your case, if you want the terms filter/query to match the document
being percolated, do the following:

1) Create the index with the id field mapped as not_analyzed:
curl -XPUT 'localhost:9200/my-index' -d '
{
  "mappings": {
    "my-type" : {
      "properties": {
        "id" : {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'

2) Register the percolator query and specify the type:
curl -XPUT 'localhost:9200/my-index/.percolator/1' -d '{
  "query": {
    "terms": {
      "id": [
        "1aa808dc-48f0-4de3-8978-a0293d54b852",
        "6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"
      ]
    }
  },
  "type" : "my-type"
}'

By specifying that the type is 'my-type', the percolator tells query
parsing not to analyze the id field values and to take them as is.

3) Percolate the document:
curl -XGET 'localhost:9200/my-index/my-type/_percolate' -d '{
  "doc": {
    "id": "1aa808dc-48f0-4de3-8978-a0293d54b852"
  }
}'




On 16 May 2014 01:09, JGL <[email protected]> wrote:

>
> Hi Martijn,
>
> Thanks for the reply. The analyzer breaking up the UUID explains why the
> UUIDs are not matched as a whole.
>
> I am still wondering whether we can register query types other than the
> match query with the percolator. We would like to put a list of values
> into a query for the "id" field, which is meant as a device ID, so that
> when we percolate a document with a device ID, all percolator queries
> whose ID list contains that device ID are considered a match.
>
> But according to our experiments, queries like the following do not work
> with the percolator, which seems happy only with match queries:
>
>
> {
>       "_index" : "my_idx",
>       "_type" : ".percolator",
>       "_id" : "inQuery",
>       "_score" : 1.0, "_source" : 
> {"query":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d54b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}
> },
>
>
> {
>       "_index" : "my_idx",
>       "_type" : ".percolator",
>       "_id" : "inFilterQ",
>       "_score" : 1.0, "_source" : 
> {"query":{"filtered":{"query":{"match_all":{}},"filter":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d50b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}}}
>     },
>
>
> I could not find any resource clearly stating that the percolator can
> only work with match queries. Is that actually the case?
>
> Thanks,
> Jason
>
>
> On Friday, May 9, 2014 10:04:51 PM UTC+12, Martijn v Groningen wrote:
>
>> I think the issue here is that the 'id' field is analyzed and your UUIDS
>> are broken up into separate tokens. The standard analyzer is responsible
>> for breaking up by '-'. If you use the analyze api you can see what happens
>> with your uuids:
>> curl -XGET 'localhost:9200/_analyze?text=1aa808dc-48f0-4de3-8978-a0293d54b852 6b256fd1-cd04-4e3c-8f38-aaa87ac2220d 1234fd1a-cd04-4e3c-8f38-aaa87142380d&tokenizer=standard'
>>
>> The 'id' field in ES is not used as the id field. In ES the _id field is
>> used to store the unique identifier and that field is not analyzed.
>> Assuming that the 'id' field has the same value as the id of a document
>> then you can use the `ids` query instead in your percolator queries:
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html#query-dsl-ids-query
>>
>> Martijn
>>
>>
>> On 9 May 2014 09:20, JGL <[email protected]> wrote:
>>
>>> Can anybody help plz?
>>>
>>>
>>> On Wednesday, May 7, 2014 6:29:35 PM UTC+12, JGL wrote:
>>>>
>>>> Can anybody help plz?
>>>>
>>>> On Tuesday, May 6, 2014 11:53:32 AM UTC+12, JGL wrote:
>>>>>
>>>>>
>>>>> Can anybody help plz?
>>>>>
>>>>> On Monday, May 5, 2014 10:24:09 AM UTC+12, JGL wrote:
>>>>>>
>>>>>>
>>>>>> Hi Martijn,
>>>>>>
>>>>>> The percolator query in the 1st post above is what we registered to
>>>>>> the percolator, and it kind of works. It consolidates all IDs in one
>>>>>> query string for a match query, which does not seem an elegant
>>>>>> solution to us.
>>>>>>
>>>>>> {
>>>>>>       "_index" : "my_idx",
>>>>>>       "_type" : ".percolator",
>>>>>>       "_id" : "my_query_id",
>>>>>>       "_score" : 1.0,
>>>>>>       "_source" : {
>>>>>>                 "query":{
>>>>>>                        "match":{
>>>>>>                               "id":{
>>>>>>                                   "query":"id1 id2 id3",
>>>>>>                                   "type":"boolean"
>>>>>>                                    }
>>>>>>                                }
>>>>>>                         }
>>>>>>                   }
>>>>>> }
>>>>>>
>>>>>>
>>>>>> Another issue is that the above solution is not quite accurate when
>>>>>> the IDs are UUIDs. For example, if the query we register is the
>>>>>> following
>>>>>>
>>>>>> {
>>>>>>       "_index" : "my_idx",
>>>>>>       "_type" : ".percolator",
>>>>>>       "_id" : "my_query_id",
>>>>>>       "_score" : 1.0,
>>>>>>       "_source" : {
>>>>>>                 "query":{
>>>>>>                        "match":{
>>>>>>                               "id":{
>>>>>>                                   "query":"1aa808dc-48f0-4de3-8978-*a0293d54b852* 6b256fd1-cd04-4e3c-8f38-aaa87ac2220d 1234fd1a-cd04-4e3c-8f38-aaa87142380d",
>>>>>>                                   "type":"boolean"
>>>>>>                                    }
>>>>>>                                }
>>>>>>                         }
>>>>>>                   }
>>>>>> }
>>>>>>
>>>>>>
>>>>>> , the percolator returns the above query as a match if the document
>>>>>> we try to percolate is
>>>>>> {"doc" : {"id":"1aa808dc-48f0-4de3-8978-*00293d54b852*"}}, though we
>>>>>> expect a no-match response here, as the ID in the document does not
>>>>>> match any ID in the query string.
>>>>>>
>>>>>> According to our experiments, such false positives happen when the
>>>>>> doc UUID is the same as one of the IDs in the query except for the
>>>>>> last part of the ID. Is there an explanation for this behavior of
>>>>>> elasticsearch?
>>>>>>
>>>>>> Our other question is whether there is any way to put the UUID list,
>>>>>> as a list, into a query that works with the percolator, like what we
>>>>>> can do for inQuery or inFilter. We tried registering an inQuery and a
>>>>>> query wrapping an inFilter. Neither works with the percolator; it
>>>>>> seems the percolator only works with the MatchQuery, in which we
>>>>>> cannot put the UUID list as a list.
>>>>>>
>>>>>> For example the following two queries we tried are not working with
>>>>>> percolator:
>>>>>>
>>>>>> {
>>>>>>       "_index" : "my_idx",
>>>>>>       "_type" : ".percolator",
>>>>>>       "_id" : "inQuery",
>>>>>>       "_score" : 1.0, "_source" : 
>>>>>> {"query":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d54b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}
>>>>>> },
>>>>>>
>>>>>>
>>>>>> {
>>>>>>       "_index" : "my_idx",
>>>>>>       "_type" : ".percolator",
>>>>>>       "_id" : "inFilterQ",
>>>>>>       "_score" : 1.0, "_source" : 
>>>>>> {"query":{"filtered":{"query":{"match_all":{}},"filter":{"terms":{"id":["1aa808dc-48f0-4de3-8978-a0293d50b852","6b256fd1-cd04-4e3c-8f38-aaa87ac2220d"]}}}}}
>>>>>>     },
>>>>>>
>>>>>> Thanks for your help!
>>>>>>
>>>>>> Jason
>>>>>>
>>>>>>
>>>>>> On Friday, May 2, 2014 7:34:47 PM UTC+12, Martijn v Groningen wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Can you share the stored percolator queries and the percolate
>>>>>>> request that you were initially trying with, but didn't work?
>>>>>>>
>>>>>>> Martijn
>>>>>>>
>>>>>>>
>>>>>>> On 2 May 2014 11:14, JGL <[email protected]> wrote:
>>>>>>>
>>>>>>>> Can anybody help plz?
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "elasticsearch" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/elasticsearch/4ee60836-1922-43e0-8d9b-64ef9bb0b00a%40googlegroups.com.
>>>>>>>>
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Met vriendelijke groet,
>>>>>>>
>>>>>>> Martijn van Groningen
>>>>>>>
>>>>>>
>>
>>
>> --
>> Met vriendelijke groet,
>>
>> Martijn van Groningen
>>
>


-- 
Met vriendelijke groet,

Martijn van Groningen
