FYI settings:

*Master*:
[root@ip-10-169-36-251 logstash-2013.12.05]# grep -vE "^$|^#" /xx/elasticsearch-1.1.1/config/elasticsearch.yml
cluster.name: elasticsearchtest
node.name: "node1"
node.master: true
node.data: true
index.number_of_replicas: 0
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.169.36.251", "10.186.152.19"]

*Non Master*:
[root@ip-10-186-152-19 logstash-2013.12.05]# grep -vE "^$|^#" /elasticsearch/es/elasticsearch-1.1.1/config/elasticsearch.yml
cluster.name: elasticsearchtest
node.name: "node2"
node.master: false
node.data: true
index.number_of_replicas: 0
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.169.36.251","10.186.152.19"]
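For what it's worth, here is a rough way to double-check that the two files agree on the settings both nodes need to share. This is only a sketch: `parse_flat_settings`, `host_list` and `mismatches` are hypothetical helper names, the choice of which keys must match is my own assumption, and the parser only understands flat `key: value` lines like the grep output above, not general YAML.

```python
# Hypothetical helpers for illustration; they only understand flat
# "key: value" lines like the grep output above, not general YAML.

NODE1 = """\
cluster.name: elasticsearchtest
node.name: "node1"
node.master: true
node.data: true
index.number_of_replicas: 0
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.169.36.251", "10.186.152.19"]
"""

NODE2 = """\
cluster.name: elasticsearchtest
node.name: "node2"
node.master: false
node.data: true
index.number_of_replicas: 0
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.169.36.251","10.186.152.19"]
"""

# Assumption: these are the settings that should be identical on both nodes.
MUST_MATCH = ("cluster.name", "discovery.zen.ping.multicast.enabled")
HOSTS_KEY = "discovery.zen.ping.unicast.hosts"


def parse_flat_settings(text):
    """Parse flat `key: value` lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def host_list(raw):
    """Turn '["a", "b"]' into a sorted list so spacing differences are ignored."""
    parts = raw.strip().strip("[]").split(",")
    return sorted(p.strip().strip('"') for p in parts if p.strip())


def mismatches(a, b):
    """Return the shared-setting keys whose values differ between two configs."""
    diff = [k for k in MUST_MATCH if a.get(k) != b.get(k)]
    if host_list(a.get(HOSTS_KEY, "")) != host_list(b.get(HOSTS_KEY, "")):
        diff.append(HOSTS_KEY)
    return diff


print(mismatches(parse_flat_settings(NODE1), parse_flat_settings(NODE2)))
```

Differences in `node.name` and `node.master` are expected between the two nodes, so the sketch deliberately ignores them.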
On Mon, May 5, 2014 at 11:01 PM, Nishchay Shah <[email protected]> wrote:

> Probably not.
>
> I deleted all data from slave and restarted both servers and I see this:
>
> *Master:*
> [root@ip-10-169-36-251 logstash-2013.12.22]# du -h --max-depth=1
> 16M ./0
> 16M ./1
> 8.0K ./_state
> 15M ./4
> 15M ./3
> 15M ./2
> 75M .
>
> *Data:*
> [root@ip-10-186-152-19 logstash-2013.12.22]# du -h --max-depth=1
> 16M ./0
> 16M ./1
> 15M ./4
> 15M ./3
> 15M ./2
> 75M .
>
> On Mon, May 5, 2014 at 10:53 PM, Mark Walkom <[email protected]> wrote:
>
>> Don't copy indexes on the OS level!
>>
>> Is your new cluster balancing the shards?
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: [email protected]
>> web: www.campaignmonitor.com
>>
>> On 6 May 2014 12:46, Nishchay Shah <[email protected]> wrote:
>>
>>> Hey Mark,
>>> Thanks for the response. I have currently created two new medium test
>>> instances (1 master, 1 data only) because I didn't want to mess with the
>>> main dataset. In my test setup, I have about 600MB of data; 7 indexes.
>>>
>>> After looking around a lot I saw that the directory organization is
>>> /elasticsearch/es/elasticsearch-1.1.1/data/elasticsearchtest/nodes/*<node number>*/
>>> and the master node has only 1 directory:
>>>
>>> (master)
>>> # ls /elasticsearch/es/elasticsearch-1.1.1/data/elasticsearchtest/nodes
>>> 0
>>>
>>> So on node2 I created a "1" directory and moved 1 index from master to
>>> data; so master now has six indexes in 0 and data has one in 1.
>>> When I started elasticsearch after that I got to a point where the
>>> master is NOT copying the data back to itself, but now node2 is
>>> copying master's data and making a "0" directory. Also, I am unable to
>>> query node2's data!
>>>
>>> On Mon, May 5, 2014 at 9:34 PM, Mark Walkom <[email protected]> wrote:
>>>
>>>> Moving data on the OS level without making ES aware can cause
>>>> difficulties as you are seeing.
>>>>
>>>> A few suggestions on how to resolve this and improve things in
>>>> general;
>>>>
>>>> 1. Set your heap size to 31GB.
>>>> 2. Use Oracle's java, not OpenJDK.
>>>> 3. Set bootstrap.mlockall to true, you don't want to swap, ever.
>>>>
>>>> Given the large number of indexes you have on node1, and to get to a
>>>> point where you can move some of these to a new node and stop the root
>>>> problem, it's going to be worth closing some of the older indexes. So try
>>>> these steps;
>>>>
>>>> 1. Stop node2.
>>>> 2. Delete any data from the second node, to prevent things being
>>>>    auto imported again.
>>>> 3. Start node1, or restart it if it's running.
>>>> 4. Close all your indexes older than a month -
>>>>    http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-open-close.html.
>>>>    You can use wildcards in index names to make the update easier. What
>>>>    this will do is tell ES to not load the index metadata into memory,
>>>>    which will help with your OOM issue.
>>>> 5. Start node2 and let it join the cluster.
>>>> 6. Make sure the cluster is in a green state. If you're not
>>>>    already, use something like ElasticHQ, kopf or Marvel to monitor things.
>>>> 7. Let the cluster rebalance the current open indexes.
>>>> 8. Once that is ok and things are stable, reopen your closed
>>>>    indexes a month at a time, and let them rebalance.
>>>>
>>>> That should get you back up and running. Once you're there we can go
>>>> back to your original post :)
>>>>
>>>> Regards,
>>>> Mark Walkom
>>>>
>>>> Infrastructure Engineer
>>>> Campaign Monitor
>>>> email: [email protected]
>>>> web: www.campaignmonitor.com
>>>>
>>>> On 6 May 2014 11:15, Nishchay Shah <[email protected]> wrote:
>>>>
>>>>> Thanks Nate, but this doesn't work. node2 is not the master.
>>>>> So starting it first didn't make sense; anyway, I tried it and I couldn't
>>>>> execute anything on a non-master node (node2) unless master was started.
>>>>>
>>>>> I started node2 (non master) and ran this:
>>>>> curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":true}}'
>>>>> After 30s I got this:
>>>>> {"error":"MasterNotDiscoveredException[waited for [30s]]","status":503}
>>>>>
>>>>> I started node1 and as bloody expected elasticsearch copied all the
>>>>> indexes :( ..
>>>>> *"auto importing dangled indices"*
>>>>>
>>>>> I cannot believe I am unable to get this fundamental elasticsearch
>>>>> feature working!
>>>>>
>>>>> On Mon, May 5, 2014 at 4:25 PM, Nate Fox <[email protected]> wrote:
>>>>>
>>>>>> Get node2 running with rock. Then issue a disable_allocation and then
>>>>>> bring up node1.
>>>>>> curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.disable_allocation":true}}'
>>>>>>
>>>>>> From there, adjust the replica settings on the indexes down to 0 so
>>>>>> they don't copy. Once that's set, change disable_allocation to false.
>>>>>>
>>>>>> On Mon, May 5, 2014 at 1:19 PM, Nish <[email protected]> wrote:
>>>>>>
>>>>>>> *.."- Fire up both nodes, make sure they both have the same cluster
>>>>>>> name"* <= This is exactly where, as I wrote in my second message,
>>>>>>> Elasticsearch is messing up. When I move the index to a new node,
>>>>>>> delete that index from master, and then start the master node and the
>>>>>>> other data node, it (master) throws a message:
>>>>>>> "auto importing dangled indices"
>>>>>>> This means master is now copying the "deleted" index that exists
>>>>>>> only on the other node to itself!
>>>>>>>
>>>>>>> Basically this is what happens:
>>>>>>>
>>>>>>> 1. Node1 Master: rock, paper, scissors
>>>>>>> 2. I move rock from Node 1 to Node 2 (I verify by starting ONLY
>>>>>>>    node1 and I can see that I am missing data that was originally in
>>>>>>>    the "rock" index, as expected, all good)
>>>>>>> 3. So node1 now has paper, scissors
>>>>>>> 4. I start Node2 with ONLY the "rock" index (verify independently,
>>>>>>>    it works)
>>>>>>> 5. Then I start node 1 (master) and node 2 (data)
>>>>>>> 6. Node1 says "hey I don't have rock, but node2 has it, let
>>>>>>>    me copy it to myself"
>>>>>>>
>>>>>>> On Monday, May 5, 2014 3:44:17 PM UTC-4, Nate Fox wrote:
>>>>>>>
>>>>>>>> You might turn off the bootstrap.mlockall flag just for now - it'll
>>>>>>>> make ES swap a ton, but your error message looks like an OS level
>>>>>>>> issue.
>>>>>>>> Make sure you have lots of swap available and grab some coffee.
>>>>>>>>
>>>>>>>> What I'd also try if turning off bootstrap.mlockall doesn't work:
>>>>>>>> - Tarball the entire data directory and save the tarball somewhere
>>>>>>>>   (unless you don't care about the data)
>>>>>>>> - Set 31Gb for your ES HEAP. There's plenty of docs out there that
>>>>>>>>   say not to go over 32Gb of ram because above that Java can no longer
>>>>>>>>   use compressed object pointers.
>>>>>>>> - Copy the entire data dir to node2
>>>>>>>> - Go into the data dir on node1 and delete half of the indexes
>>>>>>>> - Go into the data dir on node2 and delete the *other* half of the
>>>>>>>>   indexes
>>>>>>>> - Fire up both nodes, make sure they both have the same cluster name
>>>>>>>>
>>>>>>>> I have no idea if this'll work, I'm by no means an ES expert.
>>>>>>>> :)
>>>>>>>>
>>>>>>>> On Mon, May 5, 2014 at 12:32 PM, Nish <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Currently I have 279 indexes on a single node and elasticsearch
>>>>>>>>> starts for a few minutes and dies. I only have 60G of RAM on this
>>>>>>>>> machine and as far as I know 60% is the max that one should allocate
>>>>>>>>> to elasticsearch; I tried allocating 38G and it lasted for a few more
>>>>>>>>> minutes and then died.
>>>>>>>>>
>>>>>>>>> *(I think there's some state files that tell ES/Lucene which
>>>>>>>>> indexes are on disk)* => Where is this? How do I fix it so that
>>>>>>>>> it doesn't move all indexes to all nodes? I want to split the ~280
>>>>>>>>> indexes into two nodes of 140 each. So far I am not able to achieve
>>>>>>>>> this as the master keeps moving indexes to itself!
>>>>>>>>>
>>>>>>>>> On Monday, May 5, 2014 3:25:05 PM UTC-4, Nate Fox wrote:
>>>>>>>>>>
>>>>>>>>>> How many indexes do you have? It almost looks like the system
>>>>>>>>>> itself can't allocate the ram needed?
>>>>>>>>>> You might try jacking up the nofile to something like 999999 as
>>>>>>>>>> well? I'd definitely go with 31g heapsize.
>>>>>>>>>>
>>>>>>>>>> As for moving indexes, you might be able to copy the entire data
>>>>>>>>>> store, then remove some (I think there's some state files that tell
>>>>>>>>>> ES/Lucene which indexes are on disk), so it might recover if it's
>>>>>>>>>> missing some and sees the others on another node?
>>>>>>>>>>
>>>>>>>>>> As for your other questions, it depends on usage as to how many
>>>>>>>>>> nodes - especially search activity while indexing. We have 230 indexes
>>>>>>>>>> (1740 shards) on 8 data nodes (5.7Tb / 6.1B docs). So it can definitely
>>>>>>>>>> handle a lot more than what you're throwing at it. We don't search
>>>>>>>>>> often nor do we load a ton of data at once.
>>>>>>>>>>
>>>>>>>>>> On Sunday, May 4, 2014 7:13:09 AM UTC-7, Nish wrote:
>>>>>>>>>>>
>>>>>>>>>>> elasticsearch is set up as a single node instance on a 60G RAM and
>>>>>>>>>>> 32*2.6GHz machine. I am actively indexing historic data with
>>>>>>>>>>> logstash. It worked well with ~300 million documents (search and
>>>>>>>>>>> indexing were doing ok), but all of a sudden es fails to start and
>>>>>>>>>>> keep itself up. It starts for a few minutes and I can query, but then
>>>>>>>>>>> it fails with an out of memory error. I monitor the memory and at
>>>>>>>>>>> least 12G of memory is available when it fails. I had set the
>>>>>>>>>>> es_heap_size to 31G and then reduced it to 28, 24 and 18 and got the
>>>>>>>>>>> same error every time (see dump below).
>>>>>>>>>>>
>>>>>>>>>>> *My security limits are as under (this is a test/POC server
>>>>>>>>>>> thus "root" user)*
>>>>>>>>>>>
>>>>>>>>>>> root soft nofile 65536
>>>>>>>>>>> root hard nofile 65536
>>>>>>>>>>> root - memlock unlimited
>>>>>>>>>>>
>>>>>>>>>>> *ES settings*
>>>>>>>>>>> config]# grep -v "^#" elasticsearch.yml | grep -v "^$"
>>>>>>>>>>> bootstrap.mlockall: true
>>>>>>>>>>>
>>>>>>>>>>> *echo $ES_HEAP_SIZE*
>>>>>>>>>>> 18432m
>>>>>>>>>>>
>>>>>>>>>>> ---DUMP----
>>>>>>>>>>>
>>>>>>>>>>> # bin/elasticsearch
>>>>>>>>>>> [2014-05-04 13:30:12,653][INFO ][node ] [Sabretooth] version[1.1.1], pid[19309], build[f1585f0/2014-04-16T14:27:12Z]
>>>>>>>>>>> [2014-05-04 13:30:12,653][INFO ][node ] [Sabretooth] initializing ...
>>>>>>>>>>> [2014-05-04 13:30:12,669][INFO ][plugins ] [Sabretooth] loaded [], sites []
>>>>>>>>>>> [2014-05-04 13:30:15,390][INFO ][node ] [Sabretooth] initialized
>>>>>>>>>>> [2014-05-04 13:30:15,390][INFO ][node ] [Sabretooth] starting ...
>>>>>>>>>>> [2014-05-04 13:30:15,531][INFO ][transport ] [Sabretooth] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.109.136.59:9300]}
>>>>>>>>>>> [2014-05-04 13:30:18,553][INFO ][cluster.service ] [Sabretooth] new_master [Sabretooth][eocFkTYMQnSTUar94A2vHw][ip-10-109-136-59][inet[/10.109.136.59:9300]], reason: zen-disco-join (elected_as_master)
>>>>>>>>>>> [2014-05-04 13:30:18,579][INFO ][discovery ] [Sabretooth] elasticsearch/eocFkTYMQnSTUar94A2vHw
>>>>>>>>>>> [2014-05-04 13:30:18,790][INFO ][http ] [Sabretooth] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.109.136.59:9200]}
>>>>>>>>>>> [2014-05-04 13:30:19,976][INFO ][gateway ] [Sabretooth] recovered [278] indices into cluster_state
>>>>>>>>>>> [2014-05-04 13:30:19,984][INFO ][node ] [Sabretooth] started
>>>>>>>>>>> OpenJDK 64-Bit Server VM warning: Attempt to protect stack guard pages failed.
>>>>>>>>>>> OpenJDK 64-Bit Server VM warning: Attempt to deallocate stack guard pages failed.
>>>>>>>>>>> OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000007f7c70000, 196608, 0) failed; error='Cannot allocate memory' (errno=12)
>>>>>>>>>>> #
>>>>>>>>>>> # There is insufficient memory for the Java Runtime Environment to continue.
>>>>>>>>>>> # Native memory allocation (malloc) failed to allocate 196608 bytes for committing reserved memory.
>>>>>>>>>>> # An error report file with more information is saved as:
>>>>>>>>>>> # /tmp/jvm-19309/hs_error.log
>>>>>>>>>>>
>>>>>>>>>>> ----
>>>>>>>>>>> *user untergeek on #logstash told me that I have reached a max
>>>>>>>>>>> number of indices on a single node. Here are my questions:*
>>>>>>>>>>>
>>>>>>>>>>> 1. Can I move half of my indexes to a new node? If yes, how do I
>>>>>>>>>>>    do that without compromising the indexes?
>>>>>>>>>>> 2. Logstash makes 1 index per day and I want to have 2 years
>>>>>>>>>>>    of data indexable. Can I combine multiple indexes into one,
>>>>>>>>>>>    like one index per month? This will mean I will not have more
>>>>>>>>>>>    than 24 indexes.
>>>>>>>>>>> 3. How many nodes are ideal for 24 months of data at ~1.5G a day?
>>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to a topic in
>>>>>>>>> the Google Groups "elasticsearch" group.
>>>>>>>>> To unsubscribe from this topic, visit
>>>>>>>>> https://groups.google.com/d/topic/elasticsearch/cEimyMnhSv0/unsubscribe.
>>>>>>>>> To unsubscribe from this group and all its topics, send an email
>>>>>>>>> to [email protected].
>>>>>>>>> To view this discussion on the web visit
>>>>>>>>> https://groups.google.com/d/msgid/elasticsearch/564e2951-ed54-4f34-97a9-4de88f187a7a%40googlegroups.com.
>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CANma5K4MP3QwGAgv%2BDpWLuYceMpQUamQs3boaxFGR6xXwhE9Jg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
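Regarding Mark's step 4 in the quoted thread (close all indexes older than a month): a rough sketch of how one might pick the daily logstash indexes to close is below. It assumes index names follow the `logstash-YYYY.MM.DD` pattern seen in the directory listings above, and the function names (`logstash_index_date`, `indices_to_close`, `close_path`) are hypothetical. It only builds the request paths for the close-index API (`POST /<index>/_close`) and does not contact a cluster, so the cutoff date and the actual HTTP call are left to the reader.

```python
from datetime import date, datetime


def logstash_index_date(name, prefix="logstash-"):
    """Return the date encoded in a daily index name, or None if it does not match."""
    if not name.startswith(prefix):
        return None
    try:
        return datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
    except ValueError:
        return None


def indices_to_close(names, cutoff):
    """Pick daily indices dated strictly before `cutoff`, oldest first."""
    dated = ((logstash_index_date(n), n) for n in names)
    return [n for d, n in sorted((d, n) for d, n in dated if d is not None and d < cutoff)]


def close_path(index_name):
    """Request path for the close-index API: POST /<index>/_close."""
    return "/" + index_name + "/_close"


if __name__ == "__main__":
    # Example names only; the cutoff is an arbitrary illustrative date.
    names = ["logstash-2013.12.05", "logstash-2013.12.22", "logstash-2014.04.20"]
    for index in indices_to_close(names, date(2014, 4, 5)):
        print("POST", close_path(index))
```

Building the list explicitly like this, rather than relying on a wildcard, makes it easy to review exactly which indexes would be closed before sending anything to the cluster.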
