You might turn off the bootstrap.mlockall flag just for now - it'll make ES swap a ton, but your error message looks like an OS-level issue. Make sure you have plenty of swap available, and grab some coffee.
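If you want to confirm it really is the OS refusing the allocation, a few generic Linux checks are worth running first. This is just my own sketch, nothing ES-specific, and the /proc paths assume Linux:

```shell
# Generic Linux-side checks for the JVM dying with errno=12 ("Cannot allocate
# memory") even though the heap fits. To turn off memory locking, set
# "bootstrap.mlockall: false" in elasticsearch.yml and restart the node.
grep -E 'MemFree|SwapTotal|SwapFree' /proc/meminfo  # real free RAM and swap right now
cat /proc/sys/vm/overcommit_memory   # 2 = strict accounting; kernel may refuse mallocs
ulimit -l                            # memlock limit; mlockall wants 'unlimited'
ulimit -v                            # a virtual-memory cap will also kill a big JVM
```

If overcommit is in strict mode (2) with little swap, the kernel can deny small native mallocs even while gigabytes look "free".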
What I'd also try if turning off bootstrap.mlockall doesn't work:

- Tarball the entire data directory and save the tarball somewhere (unless you don't care about the data)
- Set your ES heap to 31g. There are plenty of docs out there saying not to go to 32g or beyond, because past that the JVM can no longer use compressed object pointers - every pointer becomes a full 64-bit one and eats up much of the extra heap.
- Copy the entire data dir to node2
- Go into the data dir on node1 and delete half of the indexes
- Go into the data dir on node2 and delete the *other* half of the indexes
- Fire up both nodes and make sure they both have the same cluster name

I have no idea if this'll work, I'm by no means an ES expert. :)

On Mon, May 5, 2014 at 12:32 PM, Nish <[email protected]> wrote:

> Currently I have 279 indexes on a single node, and elasticsearch starts,
> runs for a few minutes, and dies. I only have 60G of RAM, and as far as I
> know 60% is the max that one should allocate to elasticsearch; I tried
> allocating 38G and it lasted a few more minutes before dying.
>
> *(I think there's some state files that tell ES/Lucene which indexes are
> on disk)* => Where is this? How do I fix it so that it doesn't move all
> indexes to all nodes? I want to split the ~280 indexes into two nodes of
> 140 each. So far I am not able to achieve this, as the master keeps moving
> indexes to itself!
>
> On Monday, May 5, 2014 3:25:05 PM UTC-4, Nate Fox wrote:
>>
>> How many indexes do you have? It almost looks like the system itself
>> can't allocate the RAM needed?
>> You might try jacking up nofile to something like 999999 as well? I'd
>> definitely go with a 31g heap size.
>>
>> As for moving indexes, you might be able to copy the entire data store,
>> then remove some (I think there's some state files that tell ES/Lucene
>> which indexes are on disk), so it might recover if it's missing some and
>> sees the others on another node?
>>
>> As for your other questions, how many nodes you need depends on usage -
>> especially search activity while indexing.
>> We have 230 indexes (1740 shards) on 8 data nodes (5.7Tb / 6.1B docs), so
>> it can definitely handle a lot more than what you're throwing at it. We
>> don't search often, nor do we load a ton of data at once.
>>
>>
>> On Sunday, May 4, 2014 7:13:09 AM UTC-7, Nish wrote:
>>>
>>> elasticsearch is set up as a single-node instance on a machine with 60G
>>> of RAM and 32 x 2.6GHz cores. I am actively indexing historic data with
>>> logstash. It worked well with ~300 million documents (search and
>>> indexing were doing ok), but all of a sudden ES fails to start and stay
>>> up. It runs for a few minutes and I can query it, but then it fails with
>>> an out-of-memory error. I monitor the memory and at least 12G is still
>>> available when it fails. I had set es_heap_size to 31G and then reduced
>>> it to 28, 24 and 18, with the same error every time (see dump below).
>>>
>>> *My security limits are as under (this is a test/POC server, thus the
>>> "root" user):*
>>>
>>> root soft nofile 65536
>>> root hard nofile 65536
>>> root - memlock unlimited
>>>
>>> *ES settings:*
>>> config]# grep -v "^#" elasticsearch.yml | grep -v "^$"
>>> bootstrap.mlockall: true
>>>
>>> *echo $ES_HEAP_SIZE*
>>> 18432m
>>>
>>> ---DUMP----
>>>
>>> # bin/elasticsearch
>>> [2014-05-04 13:30:12,653][INFO ][node ] [Sabretooth] version[1.1.1], pid[19309], build[f1585f0/2014-04-16T14:27:12Z]
>>> [2014-05-04 13:30:12,653][INFO ][node ] [Sabretooth] initializing ...
>>> [2014-05-04 13:30:12,669][INFO ][plugins ] [Sabretooth] loaded [], sites []
>>> [2014-05-04 13:30:15,390][INFO ][node ] [Sabretooth] initialized
>>> [2014-05-04 13:30:15,390][INFO ][node ] [Sabretooth] starting ...
>>> [2014-05-04 13:30:15,531][INFO ][transport ] [Sabretooth] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.109.136.59:9300]}
>>> [2014-05-04 13:30:18,553][INFO ][cluster.service ] [Sabretooth] new_master [Sabretooth][eocFkTYMQnSTUar94A2vHw][ip-10-109-136-59][inet[/10.109.136.59:9300]], reason: zen-disco-join (elected_as_master)
>>> [2014-05-04 13:30:18,579][INFO ][discovery ] [Sabretooth] elasticsearch/eocFkTYMQnSTUar94A2vHw
>>> [2014-05-04 13:30:18,790][INFO ][http ] [Sabretooth] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.109.136.59:9200]}
>>> [2014-05-04 13:30:19,976][INFO ][gateway ] [Sabretooth] recovered [278] indices into cluster_state
>>> [2014-05-04 13:30:19,984][INFO ][node ] [Sabretooth] started
>>> OpenJDK 64-Bit Server VM warning: Attempt to protect stack guard pages failed.
>>> OpenJDK 64-Bit Server VM warning: Attempt to deallocate stack guard pages failed.
>>> OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000007f7c70000, 196608, 0) failed; error='Cannot allocate memory' (errno=12)
>>> #
>>> # There is insufficient memory for the Java Runtime Environment to continue.
>>> # Native memory allocation (malloc) failed to allocate 196608 bytes for committing reserved memory.
>>> # An error report file with more information is saved as:
>>> # /tmp/jvm-19309/hs_error.log
>>>
>>> ----
>>> *user untergeek on #logstash told me that I have reached the max number
>>> of indices on a single node. Here are my questions:*
>>>
>>> 1. Can I move half of my indexes to a new node? If yes, how do I do
>>> that without compromising the indexes?
>>> 2. Logstash makes 1 index per day and I want to have 2 years of data
>>> indexable. Can I combine multiple indexes into one, like one index per
>>> month? That way I would never have more than 24 indexes.
>>> 3.
>>> How many nodes are ideal for 24 months of data at ~1.5G a day?
>>>
>>> --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/cEimyMnhSv0/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/564e2951-ed54-4f34-97a9-4de88f187a7a%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
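PS - on your question 2: rather than merging the daily indexes after the fact, logstash can simply write monthly indexes from now on via the `index` option of its elasticsearch output (the default pattern is the daily "logstash-%{+YYYY.MM.dd}"; the host value below is a placeholder). Existing daily indexes would still need to be reindexed if you want them merged too:

```conf
output {
  elasticsearch {
    host  => "localhost"              # placeholder - point at your node
    index => "logstash-%{+YYYY.MM}"   # one index per month instead of per day
  }
}
```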

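PPS - here is roughly what I meant by the tarball/copy/delete steps, as a hypothetical shell sketch. DATA_DIR, NODE2_DIR and the nodes/0/indices layout are my assumptions (check your real elasticsearch data path), the "demo setup" lines only exist so the sketch runs end-to-end, and ES must be stopped before touching any of this:

```shell
set -eu

# Placeholder paths -- point these at the real node1 data dir and the
# staging area for node2 before trying anything like this for real.
DATA_DIR="${DATA_DIR:-/tmp/es-split-demo/node1-data}"
NODE2_DIR="${NODE2_DIR:-/tmp/es-split-demo/node2-data}"

# Demo setup so the sketch is self-contained; skip this on a real box.
for n in 01 02 03 04; do
    mkdir -p "$DATA_DIR/nodes/0/indices/logstash-2014.05.$n"
done

# 1. Tarball the entire data directory first.
tar czf /tmp/es-data-backup.tar.gz \
    -C "$(dirname "$DATA_DIR")" "$(basename "$DATA_DIR")"

# 2. Copy the whole data dir to what will become node2.
rm -rf "$NODE2_DIR"
cp -a "$DATA_DIR" "$NODE2_DIR"

# 3. Delete half the indexes on node1 and the *other* half on node2,
#    so each node ends up with a disjoint set.
total=$(ls "$DATA_DIR/nodes/0/indices" | wc -l)
half=$((total / 2))
i=0
for d in "$DATA_DIR"/nodes/0/indices/*; do
    name=$(basename "$d")
    if [ "$i" -lt "$half" ]; then
        rm -rf "$NODE2_DIR/nodes/0/indices/$name"   # node1 keeps the first half
    else
        rm -rf "$d"                                 # node2 keeps the rest
    fi
    i=$((i + 1))
done
```

Then start both nodes with the same cluster.name and see whether the cluster picks up the shards from each side.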