Hey, maybe I misread your mail, but I am not sure which part exactly is taking so much time. Is it the pinging of nodes before part of the cluster is finally removed? Or is it the recovery and copying of data needed to recreate a working cluster? Also, if you have different racks, you could use rack-aware shard allocation (I guess you already do) to make sure all your data stays available if a rack fails. And what exactly is an unhealthy state?
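To make the "unhealthy state" question a bit more concrete, here is a rough Python sketch of how I read the topology in your mail. It splits the nine nodes by rack and checks which side of a network split still has enough master-eligible nodes to satisfy minimum_master_nodes = 2. The Node class, the rack labels and the function names are made up for illustration, and the model only counts master-eligible nodes per side; it says nothing about timeouts or shard recovery.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    rack: str
    master_eligible: bool  # node.master in the mail
    data: bool             # node.data in the mail


MINIMUM_MASTER_NODES = 2  # value given in the mail

# Topology as described in the mail (rack labels are illustrative).
CLUSTER = [
    Node("esm1", "rack1", True, False),
    Node("esm2", "rack1", True, False),
    Node("esd1", "rack1", False, True),
    Node("esd2", "rack1", False, True),
    Node("esc1", "rack1", False, False),
    Node("esm3", "rack2", True, False),
    Node("esd3", "rack2", False, True),
    Node("esd4", "rack2", False, True),
    Node("esc2", "rack2", False, False),
]


def can_elect_master(nodes, minimum_master_nodes=MINIMUM_MASTER_NODES):
    """True if this group of nodes sees enough master-eligible nodes."""
    eligible = sum(1 for n in nodes if n.master_eligible)
    return eligible >= minimum_master_nodes


def partition_report(cluster, minimum_master_nodes=MINIMUM_MASTER_NODES):
    """Group nodes by rack and report which side could elect a master
    if the racks lost contact with each other."""
    sides = {}
    for node in cluster:
        sides.setdefault(node.rack, []).append(node)
    return {
        rack: can_elect_master(nodes, minimum_master_nodes)
        for rack, nodes in sides.items()
    }


if __name__ == "__main__":
    for rack, ok in partition_report(CLUSTER).items():
        status = "can elect a master" if ok else "has no master quorum"
        print(f"{rack}: {status}")
```

Under these assumptions rack 1 (esm1, esm2) keeps a master quorum during a split, while rack 2 (only esm3) does not, so the nodes in rack 2 would be without a master until the link is back.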
I would be interested in getting a bit more information here.

--Alex

On Fri, Jan 17, 2014 at 2:54 PM, Marco Schirrmeister <[email protected]> wrote:

> Hello,
>
> I have a question about cluster recovery after the cluster goes into an
> unhealthy state.
> Let's assume the following.
>
> We have a cluster with 9 nodes.
> 3 master nodes (esmX) (master=true, data=false)
> 4 data nodes (esdX) (master=false, data=true)
> 2 client nodes (escX) (master=false, data=false)
> minimum_master_nodes is set to 2.
>
> The cluster is deployed across multiple racks.
> rack 1
> esm1, esm2, esd1, esd2 and esc1
>
> rack 2
> esm3, esd3, esd4 and esc2
>
> With this configuration I can lose rack 2 and the cluster still fulfills
> the requirements to form a proper cluster.
> If I were to lose rack 1 forever or for a long time, I would manually spin
> up a second master node in rack 2 to fulfill the minimum of 2 masters.
>
> If the network connection between the 2 racks now fails, the cluster goes
> into an unhealthy state.
> After a while rack 1 will be back online and everything is working again.
> I noticed that this takes up to many minutes. Even after playing with the
> timeout settings for failure detection it takes relatively long until it
> thinks that the other nodes are gone and before it's back to normal.
>
> My question is, is that normal? Do I have to live with a few minutes of
> downtime if parts of the cluster become unreachable?
> Or are there any options I could still try to tune?
>
>
> Thanks
> Marco
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/8f9233b5-bc4c-47f0-8a42-7d38db8dc7fb%40googlegroups.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
