*.."- Fire up both nodes, make sure they both have the same cluster name"*<= This is exactly what I wrote in my second message is where Elasticsearch is messing up. When I move the index to a new node and delete that index from master and then start master node and other data node, it (master) throws a message: "auto importing dangled indices" This means master is now copying the "deleted" index that exists only on other node to itself !
Basically, this is what happens (the shell commands I use for step 2 are sketched just below this list):

1. Node1 (master) has: rock, paper, scissors
2. I move rock from Node1 to Node2 (I verify by starting ONLY Node1 and seeing that the data that was originally in the "rock" index is missing, as expected - all good)
3. So Node1 now has: paper, scissors
4. I start Node2 with ONLY the "rock" index (verified independently, it works)
5. Then I start Node1 (master) and Node2 (data)
6. Node1 says "hey, I don't have rock, but Node2 has it, let me copy it to myself"
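For reference, this is roughly how I do the move in step 2, by hand on the filesystem with both nodes stopped. The data path and cluster name below are just examples from my setup, so treat this as a sketch of my manual procedure, not an official way to relocate an index:

    # example data path for a cluster named "mycluster" (adjust to yours)
    DATA=/var/lib/elasticsearch/mycluster/nodes/0/indices

    # copy the "rock" index directory to node2
    rsync -a $DATA/rock/ node2:$DATA/rock/

    # remove it from node1 (the intended master)
    rm -rf $DATA/rock

    # start node1, then node2 - this is the point where the master
    # "auto imports" rock straight back from node2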
On Monday, May 5, 2014 3:44:17 PM UTC-4, Nate Fox wrote:
>
> You might turn off the bootstrap.mlockall flag just for now - it'll make
> ES swap a ton, but your error message looks like an OS-level issue. Make
> sure you have lots of swap available and grab some coffee.
>
> What I'd also try if turning off bootstrap.mlockall doesn't work:
> - Tarball the entire data directory and save the tarball somewhere (unless
>   you don't care about the data)
> - Set 31Gb for your ES HEAP. There's plenty of docs out there that say not
>   to go over 32Gb of RAM because it'll cause Java to go into 64-bit mode.
> - Copy the entire data dir to node2
> - Go into the data dir on node1 and delete half of the indexes
> - Go into the data dir on node2 and delete the *other* half of the indexes
> - Fire up both nodes, make sure they both have the same cluster name
>
> I have no idea if this'll work, I'm by no means an ES expert. :)
>
> On Mon, May 5, 2014 at 12:32 PM, Nish <[email protected]> wrote:
>
>> Currently I have 279 indexes on a single node and elasticsearch starts
>> for a few minutes and dies; I only have 60G of RAM on the machine and as
>> far as I know 60% is the max that one should allocate to elasticsearch;
>> I tried allocating 38G and it lasted for a few more minutes and then died.
>>
>> *(I think there's some state files that tell ES/Lucene which indexes are
>> on disk)* => Where is this? How do I fix it so that it doesn't move all
>> indexes to all nodes? I want to split the ~280 indexes into two nodes of
>> 140 each. So far I am not able to achieve this, as the master keeps
>> moving indexes to itself!
>>
>> On Monday, May 5, 2014 3:25:05 PM UTC-4, Nate Fox wrote:
>>>
>>> How many indexes do you have? It almost looks like the system itself
>>> can't allocate the RAM needed.
>>> You might try jacking up the nofile limit to something like 999999 as
>>> well. I'd definitely go with a 31g heap size.
>>>
>>> As for moving indexes, you might be able to copy the entire data store,
>>> then remove some (I think there's some state files that tell ES/Lucene
>>> which indexes are on disk), so it might recover if it's missing some and
>>> sees the others on another node?
>>>
>>> As for your other questions, it depends on usage as to how many nodes -
>>> especially search activity while indexing. We have 230 indexes (1740
>>> shards) on 8 data nodes (5.7Tb / 6.1B docs). So it can definitely handle
>>> a lot more than what you're throwing at it. We don't search often nor do
>>> we load a ton of data at once.
>>>
>>> On Sunday, May 4, 2014 7:13:09 AM UTC-7, Nish wrote:
>>>>
>>>> elasticsearch is set up as a single-node instance on a 60G RAM and
>>>> 32*2.6GHz machine. I am actively indexing historic data with logstash.
>>>> It worked well with ~300 million documents (search and indexing were
>>>> doing OK), but all of a sudden ES fails to start and keep itself up.
>>>> It starts for a few minutes and I can query, but then it fails with an
>>>> out-of-memory error. I monitor the memory and at least 12G of memory
>>>> is available when it fails. I had set the es_heap_size to 31G and then
>>>> reduced it to 28, 24 and 18, and got the same error every time (see
>>>> dump below).
>>>>
>>>> *My security limits are as under (this is a test/POC server, thus the
>>>> "root" user):*
>>>>
>>>> root soft nofile 65536
>>>> root hard nofile 65536
>>>> root - memlock unlimited
>>>>
>>>> *ES settings*
>>>> config]# grep -v "^#" elasticsearch.yml | grep -v "^$"
>>>> bootstrap.mlockall: true
>>>>
>>>> *echo $ES_HEAP_SIZE*
>>>> 18432m
>>>>
>>>> ---DUMP----
>>>>
>>>> # bin/elasticsearch
>>>> [2014-05-04 13:30:12,653][INFO ][node ] [Sabretooth] version[1.1.1], pid[19309], build[f1585f0/2014-04-16T14:27:12Z]
>>>> [2014-05-04 13:30:12,653][INFO ][node ] [Sabretooth] initializing ...
>>>> [2014-05-04 13:30:12,669][INFO ][plugins ] [Sabretooth] loaded [], sites []
>>>> [2014-05-04 13:30:15,390][INFO ][node ] [Sabretooth] initialized
>>>> [2014-05-04 13:30:15,390][INFO ][node ] [Sabretooth] starting ...
>>>> [2014-05-04 13:30:15,531][INFO ][transport ] [Sabretooth] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.109.136.59:9300]}
>>>> [2014-05-04 13:30:18,553][INFO ][cluster.service ] [Sabretooth] new_master [Sabretooth][eocFkTYMQnSTUar94A2vHw][ip-10-109-136-59][inet[/10.109.136.59:9300]], reason: zen-disco-join (elected_as_master)
>>>> [2014-05-04 13:30:18,579][INFO ][discovery ] [Sabretooth] elasticsearch/eocFkTYMQnSTUar94A2vHw
>>>> [2014-05-04 13:30:18,790][INFO ][http ] [Sabretooth] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.109.136.59:9200]}
>>>> [2014-05-04 13:30:19,976][INFO ][gateway ] [Sabretooth] recovered [278] indices into cluster_state
>>>> [2014-05-04 13:30:19,984][INFO ][node ] [Sabretooth] started
>>>> OpenJDK 64-Bit Server VM warning: Attempt to protect stack guard pages failed.
>>>> OpenJDK 64-Bit Server VM warning: Attempt to deallocate stack guard pages failed.
>>>> OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000007f7c70000, 196608, 0) failed; error='Cannot allocate memory' (errno=12)
>>>> #
>>>> # There is insufficient memory for the Java Runtime Environment to continue.
>>>> # Native memory allocation (malloc) failed to allocate 196608 bytes for committing reserved memory.
>>>> # An error report file with more information is saved as:
>>>> # /tmp/jvm-19309/hs_error.log
>>>>
>>>> ----
>>>> *User untergeek on #logstash told me that I have reached the max number
>>>> of indices for a single node. Here are my questions:*
>>>>
>>>> 1. Can I move half of my indexes to a new node? If yes, how do I do
>>>>    that without compromising the indexes?
>>>> 2. Logstash makes one index per day and I want to have 2 years of data
>>>>    indexable; can I combine multiple indexes into one, like one index
>>>>    per month? That would mean I never have more than 24 indexes.
>>>> 3. How many nodes are ideal for 24 months of data at ~1.5G a day?
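P.S. On question 2 in my original post above: one route I'm considering is re-feeding each month's daily indices through Logstash into a single monthly index. A rough, untested sketch - the host and index names are placeholders from my setup, and the exact option names depend on the Logstash version:

    # reindex all daily indices for April 2014 into one monthly index
    input {
      elasticsearch {
        host  => "localhost"
        index => "logstash-2014.04.*"   # every daily index for the month
      }
    }
    output {
      elasticsearch {
        host  => "localhost"
        index => "logstash-2014.04"     # the combined monthly index
      }
    }

Once the monthly index is verified, the daily indices for that month could be deleted to bring the index count down.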
