Re: aggregations

Colin Goodheart-Smithe Wed, 17 Sep 2014 01:07:12 -0700

Field data does indeed load all the values for a field into memory, 
irrespective of the query and filter. This is how aggregations achieve 
fast lookups on the values of a field for a particular document. Field 
data is loaded the first time it is needed and then kept in a cache.
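
To make the heap point concrete, here is a rough sketch (not Elasticsearch 
code) of the comparison the field data circuit breaker makes, as I 
understand it: the estimated size of the field data to be loaded is checked 
against a percentage of the heap. The function names and all sizes are 
hypothetical, and the sketch ignores any overhead factor the real breaker 
may apply.

```python
GIB = 1024 ** 3


def breaker_limit_bytes(heap_bytes: int, limit_fraction: float) -> int:
    """Heap bytes the breaker allows for field data (overhead factor ignored)."""
    return int(heap_bytes * limit_fraction)


def would_trip(estimated_fielddata_bytes: int, heap_bytes: int,
               limit_fraction: float) -> bool:
    """True if loading the whole field would exceed the breaker limit."""
    return estimated_fielddata_bytes > breaker_limit_bytes(heap_bytes, limit_fraction)


# Hypothetical sizes: the field needs about 1.5 GiB once fully loaded.
field_bytes = int(1.5 * GIB)

print(would_trip(field_bytes, 1 * GIB, 0.60))  # 1 GiB heap, 60% limit
print(would_trip(field_bytes, 1 * GIB, 0.80))  # same heap, limit raised to 80%
print(would_trip(field_bytes, 4 * GIB, 0.60))  # larger heap, 60% limit
```

With these made-up numbers the first two checks trip and only the third 
passes: raising the limit does not help when the field simply does not fit 
in the heap, and the query or filter plays no part in the check.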


Heap size is almost certainly your problem here. There are 2 options I can 
see for you:

1) Increase your heap size to allow enough space to load the field cache 
into memory
2) Try setting the field data format to 'doc_values' (described here 
<http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/fielddata-formats.html>).
 
Note that doc_values uses less memory but consumes more disk space and may 
be slightly slower, so it may or may not suit your needs.
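
As a minimal sketch of option 2, the snippet below builds a mapping body 
that sets the field data format to doc_values for one field, following the 
fielddata formats page linked above. The type name "file" and the field 
type "long" are assumptions for illustration (only the field name portalid 
comes from your queries), and the helper function is hypothetical. As far 
as I know, doc values are written at index time, so existing documents 
would need to be reindexed for the change to take effect.

```python
import json


def doc_values_mapping(type_name: str, field_name: str, field_type: str) -> dict:
    """Build a mapping body that stores field data for one field as doc values."""
    return {
        type_name: {
            "properties": {
                field_name: {
                    "type": field_type,
                    # Keep this field's values on disk instead of on the heap.
                    "fielddata": {"format": "doc_values"},
                }
            }
        }
    }


# "file" and "long" are assumed; substitute your own type name and field type.
body = doc_values_mapping("file", "portalid", "long")
print(json.dumps(body, indent=2))
```

The printed JSON is what would go in the body of a put mapping request (or 
in the mappings when creating a new index to reindex into).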

Regards,

Colin

On Wednesday, 17 September 2014 07:47:27 UTC+1, navdeep agarwal wrote:
>
> Sorry for the delayed response.
> I am using version 1.3. I was able to change the field data circuit 
> breaker limit; I changed it to 80. This is a nice setting to know about.
> But it doesn't work. Maybe heap size is my problem, but I have very 
> limited heap space.
>
> Thank you.
>
> On Friday, September 5, 2014 2:19:25 PM UTC+5:30, Thomas wrote:
>>
>> What version of ES have you been using? AFAIK, in later versions you can 
>> control the percentage of heap space to use with the update settings API. 
>> Try increasing it a bit and see what happens. The default is 60%; increase 
>> it, for example, to 70%:
>>
>>
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html#fielddata-circuit-breaker
>>
>> T.
>>
>> On Wednesday, 3 September 2014 19:58:02 UTC+3, navdeep agarwal wrote:
>>>
>>> Hi,
>>>
>>> I am a bit new to Elasticsearch. While testing Elasticsearch's 
>>> aggregation feature, I always hit "data too large". I understand that 
>>> aggregations are very memory intensive, so is there any way to query ES 
>>> so that one query's output is fed into the aggregation, limiting the 
>>> number of inputs to the aggregation? I have tried filtering and querying 
>>> before the aggregations.
>>>
>>> I have an index of around 60 GB on 5 shards.
>>>
>>> Queries I tried:
>>>
>>> GET **********/_search
>>> {
>>>   "query": {
>>>     "term": {
>>>       "file_sha2": {
>>>         "value": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
>>>       }
>>>     }
>>>   },
>>>   "aggs": {
>>>     "top_filename": {
>>>       "max": {
>>>         "field": "portalid"
>>>       }
>>>     }
>>>   }
>>> }
>>>
>>> -------------------------------------------------------
>>>
>>> GET ************/_search
>>> {
>>>   "aggs": {
>>>     "top filename": {
>>>       "filter": {
>>>         "term": {
>>>           "file_sha2": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
>>>         }
>>>       },
>>>       "aggs": {
>>>         "top_filename": {
>>>           "max": {
>>>             "field": "portalid"
>>>           }
>>>         }
>>>       }
>>>     }
>>>   }
>>> }
>>>
>>>
>>> Thanks in advance.
>>>  
>>
