Is this request only about getting aggregations? If so, you would probably get better response times by putting the filter in the query part (under a filtered query) and keeping only the date histogram in the aggregation. The reason is that aggregations are computed on the documents that match the query, and when no query is specified, that means all documents in your index.
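To make the suggestion concrete, here is a rough sketch of how the request body could be restructured, written as a small Python script that only builds and prints the JSON. It assumes the 1.x query DSL (the `filtered` query and the `and`/`or`/`has_parent` filters used in the quoted request below); the helper names `parent_filter` and `build_body` are made up for illustration, and I have not run this against a cluster.

```python
import json


def parent_filter(extra=None):
    """has_parent filter on the request type (hypothetical helper)."""
    filters = [
        {"term": {"country": "US"}},
        {"term": {"city": "NY"}},
        {"term": {"code": 12}},
    ]
    if extra is not None:
        filters.append(extra)
    return {
        "has_parent": {
            "type": "request",
            "filter": {"and": {"filters": filters}},
        }
    }


def build_body(start, end):
    """Move the filter under a filtered query; keep only the histogram in aggs."""
    # Events inside the time window whose parent matches the criteria.
    in_window = {
        "and": [
            parent_filter(),
            {"range": {"event_time": {"gte": start, "lt": end}}},
        ]
    }
    # Events before the window whose parent request falls inside it.
    before_window = {
        "and": [
            parent_filter({"range": {"request_time": {"gte": start, "lt": end}}}),
            {"range": {"event_time": {"lt": start}}},
        ]
    }
    return {
        "query": {"filtered": {"filter": {"or": [in_window, before_window]}}},
        "aggs": {
            "per_interval": {
                "date_histogram": {"field": "event_time", "interval": "minute"},
                "aggs": {"metrics": {"terms": {"field": "event", "size": 10}}},
            }
        },
    }


body = build_body("2014-06-13T10:00:00", "2014-06-13T11:00:00")
print(json.dumps(body, indent=2))
```

The printed JSON would be sent to the same `_search?search_type=count` endpoint as before. The filter logic is unchanged; the only difference is that it now restricts the set of matching documents before the date histogram runs, instead of living in a `filter` aggregation.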
On Fri, Jun 13, 2014 at 9:41 AM, Thomas <[email protected]> wrote:

> Below is an example aggregation I perform. Are there any optimizations I
> can make? Maybe disabling some features I do not need, etc.
>
> curl -XPOST "http://localhost:9200/logs-idx.20140613/event/_search?search_type=count" -d'
> {
>   "aggs": {
>     "f1": {
>       "filter": {
>         "or": [
>           {
>             "and": [
>               {
>                 "has_parent": {
>                   "type": "request",
>                   "filter": {
>                     "and": {
>                       "filters": [
>                         { "term": { "country": "US" } },
>                         { "term": { "city": "NY" } },
>                         { "term": { "code": 12 } }
>                       ]
>                     }
>                   }
>                 }
>               },
>               {
>                 "range": {
>                   "event_time": {
>                     "gte": "2014-06-13T10:00:00",
>                     "lt": "2014-06-13T11:00:00"
>                   }
>                 }
>               }
>             ]
>           },
>           {
>             "and": [
>               {
>                 "has_parent": {
>                   "type": "request",
>                   "filter": {
>                     "and": {
>                       "filters": [
>                         { "term": { "country": "US" } },
>                         { "term": { "city": "NY" } },
>                         { "term": { "code": 12 } },
>                         {
>                           "range": {
>                             "request_time": {
>                               "gte": "2014-06-13T10:00:00",
>                               "lt": "2014-06-13T11:00:00"
>                             }
>                           }
>                         }
>                       ]
>                     }
>                   }
>                 }
>               },
>               {
>                 "range": {
>                   "event_time": {
>                     "lt": "2014-06-13T10:00:00"
>                   }
>                 }
>               }
>             ]
>           }
>         ]
>       },
>       "aggs": {
>         "per_interval": {
>           "date_histogram": {
>             "field": "event_time",
>             "interval": "minute"
>           },
>           "aggs": {
>             "metrics": {
>               "terms": {
>                 "field": "event",
>                 "size": 10
>               }
>             }
>           }
>         }
>       }
>     }
>   }
> }'
>
>
> On Friday, 13 June 2014 10:09:46 UTC+3, Thomas wrote:
>>
>> Hi,
>>
>> I'm facing a performance issue with some aggregations I perform, and I
>> need your help if possible:
>>
>> I have two document types, the *request* and the *event*. The request is
>> the parent of the event.
>> Below is a (sample) mapping:
>>
>> "event" : {
>>   "dynamic" : "strict",
>>   "_parent" : {
>>     "type" : "request"
>>   },
>>   "properties" : {
>>     "event_time" : {
>>       "format" : "dateOptionalTime",
>>       "type" : "date"
>>     },
>>     "count" : {
>>       "type" : "integer"
>>     },
>>     "event" : {
>>       "index" : "not_analyzed",
>>       "type" : "string"
>>     }
>>   }
>> }
>>
>> "request" : {
>>   "dynamic" : "strict",
>>   "_id" : {
>>     "path" : "uniqueId"
>>   },
>>   "properties" : {
>>     "uniqueId" : {
>>       "index" : "not_analyzed",
>>       "type" : "string"
>>     },
>>     "user" : {
>>       "index" : "not_analyzed",
>>       "type" : "string"
>>     },
>>     "code" : {
>>       "type" : "integer"
>>     },
>>     "country" : {
>>       "index" : "not_analyzed",
>>       "type" : "string"
>>     },
>>     "city" : {
>>       "index" : "not_analyzed",
>>       "type" : "string"
>>     }
>>     ....
>>   }
>> }
>>
>> My cluster is becoming really big (almost 2 TB of data with billions of
>> documents). I maintain one index per day and occasionally delete old
>> indices. My daily index is about 20 GB. The version of elasticsearch
>> that I use is 1.1.1.
>>
>> My problems start when I want to get some aggregations of events with
>> criteria that are applied to the parent request document. For example,
>> count the events of type *click* for country = US and code = 12. What I
>> was initially doing was to generate a scriptFilter for the request document
>> (in Groovy) and add multiple aggregations in one search request.
>> This ended up being very slow, so I removed the scripting logic and
>> implemented my logic in Java code.
>>
>> This seemed to solve the problem on my local machine, but when I got back
>> to the cluster, nothing had changed. My app still performs really poorly.
>> A search with ~10 sub-aggregations takes more than 10 seconds.
>>
>> What seems strange is that the cluster looks pretty OK with regard to
>> load average, CPU, etc.
>>
>> Any hints on where to look for solving this out?
>> to be able to identify the bottleneck
>>
>> *Ask for any additional information to provide*, I didn't want to make
>> this post too long to read
>>
>> Thank you
>>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/a4cf00b0-9786-4327-80f9-34941eaf3ca8%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

--
Adrien Grand
