You should be able to use an aggregation to get the first email from each 
thread, assuming the lowest email_id in a thread belongs to its first message.
Something like this:

{
 "aggs": {
   "threads": {
     "terms": {
       "field": "thread_id"
     }, 
     "aggs": {
       "first_email": {
         "min": {
           "field": "email_id"
         }
       }
     }
   }
 }
}
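
As a rough sketch of how this could be used from client code (Python here, only building and reading plain dicts, no client library). The "subject" and "body" field names, the helper names build_request and first_email_ids, and the sample response are assumptions for illustration, not taken from your mapping:

```python
import json

def build_request(text, max_threads=10):
    """Build a search body: match on subject or body, bucket hits by thread."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["subject", "body"],  # assumed field names
            }
        },
        "aggs": {
            "threads": {
                "terms": {"field": "thread_id", "size": max_threads},
                "aggs": {"first_email": {"min": {"field": "email_id"}}},
            }
        },
    }

def first_email_ids(response):
    """Map thread_id -> smallest matching email_id from the aggregation buckets."""
    buckets = response["aggregations"]["threads"]["buckets"]
    # The min aggregation reports its value as a float, so cast back to int.
    return {b["key"]: int(b["first_email"]["value"]) for b in buckets}

# Hand-written response in the shape a terms + min aggregation returns.
sample_response = {
    "aggregations": {
        "threads": {
            "buckets": [
                {"key": 7, "doc_count": 3, "first_email": {"value": 101.0}},
                {"key": 9, "doc_count": 1, "first_email": {"value": 250.0}},
            ]
        }
    }
}

print(json.dumps(build_request("release notes"), indent=2))
print(first_email_ids(sample_response))
```

Note that the min is computed only over the documents that match the query, so this gives the first matching email in each thread, which is not necessarily the first email of the thread. The terms aggregation also returns only a limited number of buckets (10 by default), hence the explicit "size".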

On Wednesday, August 6, 2014, 17:02:21 UTC+3, Mark Fletcher wrote:
>
> Each thread has a unique integer id (so, every message in a given thread 
> has a particular thread id). And each email has a unique integer id as 
> well. 
>
> On Wednesday, August 6, 2014 6:59:36 AM UTC-7, Tihomir Lichev wrote:
>>
>> So how can you distinguish the first email in a thread?
>> Do you have some additional parameter for that?
>>
>> On Wednesday, August 6, 2014, 16:56:10 UTC+3, Mark Fletcher wrote:
>>>
>>> Thanks for your response. If I do as you suggested, a subject match will 
>>> return all the messages in that thread (because they all match). I want the 
>>> search results to only contain one result if there's a thread match. 
>>>
>>> I suppose I could just grab all the results and then 'collapse' the 
>>> thread matches, but I was hoping to be able to do something better.
>>>
>>> Thanks,
>>> Mark
>>>
>>> On Tuesday, August 5, 2014 10:12:32 PM UTC-7, Tihomir Lichev wrote:
>>>>
>>>> Isn't it better to create a single document for each mail, with fields 
>>>> "subject" and "body" (and whatever else you need from the mail)?
>>>> This way you can search by any or all of the fields, and you can also 
>>>> define a boost for each field. For instance, when your search matches the 
>>>> subject, the mail will be scored higher in the results than if it matches 
>>>> the body, and you will get a single set of results.
>>>>
>>>> On Wednesday, August 6, 2014, 02:12:52 UTC+3, Mark Fletcher wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> We're using ES to index email, specifically mailing list messages. 
>>>>> We'd like search to work similar to Gmail in that we'd like to match on 
>>>>> either the subject or body of the email, and if it matches on the 
>>>>> subject, we only want to display one result for that match (say the first 
>>>>> message in that thread). In our naive implementation, we have an ES index 
>>>>> for subjects and another for message bodies. But that gets us two sets of 
>>>>> results, not combined. Is there a better way to structure the data, or a 
>>>>> query that we're missing so that we get one set of combined results?
>>>>>
>>>>> Thanks,
>>>>> Mark
>>>>>
>>>>
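
On the two ideas quoted above (one document per mail with a boosted subject, and collapsing thread matches on the client), here is a minimal sketch in Python. It assumes each document carries "subject", "body", "thread_id" and "email_id" fields; the helper names build_boosted_query and collapse_by_thread and the sample hits are made up for illustration:

```python
def build_boosted_query(text, subject_boost=2):
    """One document per mail; a subject match scores higher than a body match."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                # "subject^2" is the query DSL's per-field boost syntax
                "fields": ["subject^%d" % subject_boost, "body"],
            }
        }
    }

def collapse_by_thread(hits):
    """Keep one hit per thread: the one with the smallest email_id.

    Threads stay in the order in which they first appear in the ranked hits.
    """
    best = {}   # thread_id -> hit currently kept for that thread
    order = []  # thread_ids in order of first appearance
    for hit in hits:
        src = hit["_source"]
        tid = src["thread_id"]
        if tid not in best:
            order.append(tid)
            best[tid] = hit
        elif src["email_id"] < best[tid]["_source"]["email_id"]:
            best[tid] = hit
    return [best[tid] for tid in order]

# Hand-written hits in the shape of a search response's hits.hits list.
hits = [
    {"_score": 2.1, "_source": {"thread_id": 7, "email_id": 103}},
    {"_score": 1.9, "_source": {"thread_id": 9, "email_id": 250}},
    {"_score": 1.5, "_source": {"thread_id": 7, "email_id": 101}},
]
collapsed = collapse_by_thread(hits)
print([h["_source"]["email_id"] for h in collapsed])
```

The client-side collapse only sees the page of hits you fetched, so a thread with many matching messages can still crowd other threads out of that page; the aggregation approach does not have that problem.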
