Thanks again for your response. I don't have much experience with aggregations, but wouldn't that just give me a set of thread IDs ordered by how many messages are in each thread? In my results, it's possible for a match on a message body to rank higher than a match on a subject. Using this aggregation, wouldn't I just end up showing all subject matches first?
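To make the concern concrete, here is a rough sketch in plain Python (not Elasticsearch itself) of the two orderings. The hits, scores, and the helper names `threads_by_message_count` and `collapse_by_best_score` are all made up for illustration; the only assumption about ES is that, as I understand it, a terms aggregation sorts its buckets by document count by default.

```python
from collections import Counter

# Hypothetical scored hits; a subject match makes every message in the
# thread match, so thread 3 contributes three hits.
hits = [
    {"email_id": 31, "thread_id": 7, "score": 4.2},  # strong body match
    {"email_id": 10, "thread_id": 3, "score": 2.5},  # subject match
    {"email_id": 11, "thread_id": 3, "score": 2.5},  # same thread, same subject
    {"email_id": 12, "thread_id": 3, "score": 2.5},
    {"email_id": 40, "thread_id": 9, "score": 1.1},  # weak body match
]


def threads_by_message_count(hits):
    """Thread ids ordered by how many hits fall in each thread."""
    counts = Counter(h["thread_id"] for h in hits)
    return [thread_id for thread_id, _ in counts.most_common()]


def collapse_by_best_score(hits):
    """One hit per thread (lowest email_id), threads ordered by best score."""
    best_score = {}
    first_email = {}
    for h in hits:
        t = h["thread_id"]
        best_score[t] = max(best_score.get(t, h["score"]), h["score"])
        if t not in first_email or h["email_id"] < first_email[t]["email_id"]:
            first_email[t] = h
    ordered = sorted(best_score, key=lambda t: best_score[t], reverse=True)
    return [first_email[t] for t in ordered]
```

With this data, `threads_by_message_count(hits)` puts thread 3 first because it has the most matching messages, while `collapse_by_best_score(hits)` returns emails 31, 10, 40, which is the relevance order I'm after.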
Thanks,
Mark

On Wed, Aug 6, 2014 at 7:08 AM, Tihomir Lichev wrote:

> So you should be able to use an aggregation to get the first email from
> each thread.
> Something like:
>
> {
>   "aggs": {
>     "threads": {
>       "terms": {
>         "field": "thread_id"
>       },
>       "aggs": {
>         "first_email": {
>           "min": {
>             "field": "email_id"
>           }
>         }
>       }
>     }
>   }
> }
>
> On Wednesday, August 6, 2014, 17:02:21 UTC+3, Mark Fletcher wrote:
>
>> Each thread has a unique integer id (so, every message in a given thread
>> has a particular thread id). And each email has a unique integer id as
>> well.
>>
>> On Wednesday, August 6, 2014 6:59:36 AM UTC-7, Tihomir Lichev wrote:
>>>
>>> So how can you distinguish the first email of a thread?
>>> Do you have some additional parameter?
>>>
>>> On Wednesday, August 6, 2014, 16:56:10 UTC+3, Mark Fletcher wrote:
>>>>
>>>> Thanks for your response. If I do as you suggested, a subject match
>>>> will return all the messages in that thread (because they all match).
>>>> I want the search results to only contain one result if there's a
>>>> thread match.
>>>>
>>>> I suppose I could just grab all the results and then 'collapse' the
>>>> thread matches, but I was hoping to be able to do something better.
>>>>
>>>> Thanks,
>>>> Mark
>>>>
>>>> On Tuesday, August 5, 2014 10:12:32 PM UTC-7, Tihomir Lichev wrote:
>>>>>
>>>>> Isn't it better to create a single document for each mail, with
>>>>> fields "subject" and "body" (and whatever else you need from the
>>>>> mail)? This way you can search by any or all of the fields, and you
>>>>> can also define boosting for each field. For instance, when your
>>>>> search matches the subject, the mail will be scored higher in the
>>>>> results than if it matches the body, and you will get a single set
>>>>> of results.
>>>>>
>>>>> On Wednesday, August 6, 2014, 02:12:52 UTC+3, Mark Fletcher wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We're using ES to index email, specifically mailing list messages.
>>>>>> We'd like search to work similar to Gmail in that we'd like to match
>>>>>> on either the subject or body of the email, and if it matches on the
>>>>>> subject, we only want to display one result for that match (say the
>>>>>> first message in that thread). In our naive implementation, we have
>>>>>> an ES index for subjects and another for message bodies. But that
>>>>>> gets us two sets of results, not combined. Is there a better way to
>>>>>> structure the data, or a query that we're missing so that we get one
>>>>>> set of combined results?
>>>>>>
>>>>>> Thanks,
>>>>>> Mark

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CADOuSDHt5Qd8Q%3DXO3S7Ppj4vhBgzZ%2BV9c%3DrjOrXO7k%2BF_m70rA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
