You should be able to use an aggregation to get the first email from each
thread, something like this:
{
  "aggs": {
    "threads": {
      "terms": {
        "field": "thread_id"
      },
      "aggs": {
        "first_email": {
          "min": {
            "field": "email_id"
          }
        }
      }
    }
  }
}
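
To show how this could fit together with the single-document-per-mail idea from earlier in the thread, here is a rough Python sketch that builds the request body and reads the buckets back. It is only an illustration under some assumptions: the "subject" and "body" field names come from this discussion, the boost of 2 on the subject is an arbitrary choice, the helper names are made up, and the sample response is hand-written in the shape Elasticsearch returns for a terms aggregation with a min sub-aggregation. Note also that the min is taken over the matching documents only, so it gives the earliest matching email in each thread, and it assumes email ids grow over time.

```python
def build_thread_search(text, max_threads=10):
    """Build a search body that matches subject or body and groups hits by thread."""
    return {
        "size": 0,  # skip individual hits; only the aggregation buckets are read
        "query": {
            "multi_match": {
                "query": text,
                # "^2" boosts subject matches; the factor is an assumption
                "fields": ["subject^2", "body"],
            }
        },
        "aggs": {
            "threads": {
                "terms": {"field": "thread_id", "size": max_threads},
                "aggs": {"first_email": {"min": {"field": "email_id"}}},
            }
        },
    }


def first_email_ids(response):
    """Map each thread id to the smallest matching email id in that thread."""
    buckets = response["aggregations"]["threads"]["buckets"]
    # The min aggregation reports its value as a float, so cast back to int.
    return {b["key"]: int(b["first_email"]["value"]) for b in buckets}


# Hand-written sample response (not real output), for illustration only.
sample_response = {
    "aggregations": {
        "threads": {
            "buckets": [
                {"key": 7, "doc_count": 3, "first_email": {"value": 101.0}},
                {"key": 9, "doc_count": 1, "first_email": {"value": 250.0}},
            ]
        }
    }
}

body = build_thread_search("release notes")
first_ids = first_email_ids(sample_response)
```

The body would be sent as the JSON payload of a search request, and the ids in first_ids could then be fetched in a second request to display one result per thread.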
On Wednesday, August 6, 2014, 17:02:21 UTC+3, Mark Fletcher wrote:
>
> Each thread has a unique integer id (so, every message in a given thread
> has a particular thread id). And each email has a unique integer id as
> well.
>
> On Wednesday, August 6, 2014 6:59:36 AM UTC-7, Tihomir Lichev wrote:
>>
>> So how can you distinguish the first email in a thread?
>> Do you have some additional parameter?
>>
>> On Wednesday, August 6, 2014, 16:56:10 UTC+3, Mark Fletcher wrote:
>>>
>>> Thanks for your response. If I do as you suggested, a subject match will
>>> return all the messages in that thread (because they all match). I want the
>>> search results to only contain one result if there's a thread match.
>>>
>>> I suppose I could just grab all the results and then 'collapse' the
>>> thread matches, but I was hoping to be able to do something better.
>>>
>>> Thanks,
>>> Mark
>>>
>>> On Tuesday, August 5, 2014 10:12:32 PM UTC-7, Tihomir Lichev wrote:
>>>>
>>>> Isn't it better to create a single document for each mail with fields
>>>> "subject" and "body" (and whatever else you need from the mail)?
>>>> This way you can search by any or all of the fields, and you can also
>>>> define boosting for each field. For instance, when your search matches the
>>>> subject, the mail will be scored higher in the results than if it matches
>>>> the body, and you will get a single set of results.
>>>>
>>>> On Wednesday, August 6, 2014, 02:12:52 UTC+3, Mark Fletcher wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> We're using ES to index email, specifically mailing list messages.
>>>>> We'd like search to work similar to Gmail in that we'd like to match on
>>>>> either the subject or body of the email, and if it matches on the
>>>>> subject, we only want to display one result for that match (say the
>>>>> first message in that thread). In our naive implementation, we have an
>>>>> ES index for subjects and another for message bodies. But that gets us
>>>>> two sets of results, not combined. Is there a better way to structure
>>>>> the data, or a query that we're missing so that we get one set of
>>>>> combined results?
>>>>>
>>>>> Thanks,
>>>>> Mark
>>>>>
>>>>