You should look into using an index template that defines an optimal mapping
for your data. For logstash, it really helps to enable doc_values on all
fields you aggregate on, and to turn off norms on those fields as well. With
doc_values, elasticsearch reads the field values from memory-mapped files
instead of loading them into heap memory. With huge aggregations this means
the system will get slower, but it is less likely to run out of memory if you
get a lot of requests. Without doc_values, you will want to configure the
field data circuit breakers properly to ensure you don't run out of memory.
This typically means that searches that would have run out of memory abort
with an error instead, which is preferable to your cluster crashing but not
great from an end user perspective.
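
As a rough sketch of such a template, assuming the 1.x mapping syntax and
that host, source and version (the fields from your example) are strings you
only need as exact values: the template name "aggregation_fields" is made up,
and the index pattern is the placeholder from your request. In 1.x, doc_values
on strings only works when the field is not_analyzed, so the sketch sets that
too.

```
PUT /_template/aggregation_fields
{
  "template": "INDEX_PATTERN-*",
  "mappings": {
    "_default_": {
      "properties": {
        "host":    { "type": "string", "index": "not_analyzed", "doc_values": true, "norms": { "enabled": false } },
        "source":  { "type": "string", "index": "not_analyzed", "doc_values": true, "norms": { "enabled": false } },
        "version": { "type": "string", "index": "not_analyzed", "doc_values": true, "norms": { "enabled": false } }
      }
    }
  }
}
```

A template only applies to indices created after it is installed, so with
daily indices the new mapping takes effect from the next day's index onwards.

If you stay on heap-based field data, the circuit breaker limit can be
lowered through the cluster settings API. The setting name below is the one
used in 1.4 (earlier 1.x releases named it differently), and the 40% value is
only an illustration, not a recommendation:

```
PUT /_cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": "40%"
  }
}
```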

Jilles

On Wednesday, February 25, 2015 at 9:09:43 AM UTC+1, Seungjin Lee wrote:
>
> We are running a PaaS built with elasticsearch and we want to provide 
> a multi-column count aggregation feature through ES aggregations.
>
> Let's take below as an example
>
> POST /INDEX_PATTERN-*/_search
> {
>     "query":{"match":{"project":"dummyProject"}},
>     "size":0,
>    "aggs": {
>       "col1": {
>          "terms": {
>             "field": "host",
>             "size":5
>          },
>          "aggs": {
>             "col2": {
>                "terms": {
>                   "field": "source",
>                   "size":5
>                },
>                "aggs":{
>                    "col3":{
>                        "terms":{
>                            "field":"version",
>                            "size":5
>                        }
>                    }
>                }
>             }
>          }
>       }
>    }
> }
>
>
> We use daily indices and store 30 days of data, approximately 500GB per 
> daily index.
>
> So the example aggregation will scan a huge amount of data.
>
> But we found out that it's blazingly fast, we use 20 data nodes together 
> with several search/master nodes, and it responds within 10 minutes.
>
>
>
>
> OK, but what if there are many requests at the same time? What can happen?
>
> Will those requests just make other requests slow down (in which case 
> adding more machines would be a solution?) or could they cause an OOM or 
> some other critical error in the ES daemon?
>
