The cluster goes to yellow fairly quickly but never reaches a green state. If I knew that new replicas would be generated from the primaries when I add fresh disks, I would just go ahead and replace the failing disks at that point.
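As a rough illustration of the yellow/green behaviour discussed here, the sketch below is a toy model, not Elasticsearch code. It assumes the usual meaning of the health colours (red: a primary is unassigned; yellow: all primaries are assigned but a replica is not; green: every copy is assigned). The names `ShardCopy`, `status`, `remove_node` and `reallocate`, and the node names, are hypothetical and exist only for this example. The real allocator is far more involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ShardCopy:
    shard: int             # shard number within the index
    primary: bool          # True for the primary copy
    node: Optional[str]    # node holding this copy, None if unassigned


def status(copies: List[ShardCopy]) -> str:
    """Return the health colour for a set of shard copies."""
    if any(c.primary and c.node is None for c in copies):
        return "red"       # at least one primary is unassigned
    if any(c.node is None for c in copies):
        return "yellow"    # primaries are fine, some replica is unassigned
    return "green"         # every copy is assigned


def remove_node(copies: List[ShardCopy], node: str) -> None:
    """Unassign every copy on `node`; promote a surviving replica if a primary was lost."""
    for lost in [c for c in copies if c.node == node]:
        lost.node = None
        if lost.primary:
            for other in copies:
                if other.shard == lost.shard and other.node is not None:
                    other.primary, lost.primary = True, False
                    break


def reallocate(copies: List[ShardCopy], nodes: List[str]) -> None:
    """Place each unassigned replica on a node that does not already hold that shard."""
    for c in copies:
        if c.node is None and not c.primary:
            used = {o.node for o in copies if o.shard == c.shard}
            free = [n for n in nodes if n not in used]
            if free:
                c.node = free[0]


# Hypothetical layout: 5 primaries with 1 replica each, spread over 4 nodes.
nodes = ["node1", "node2", "node3", "node4"]
copies: List[ShardCopy] = []
for s in range(5):
    copies.append(ShardCopy(s, True, nodes[s % 4]))
    copies.append(ShardCopy(s, False, nodes[(s + 1) % 4]))

print(status(copies))           # green: all copies assigned
remove_node(copies, "node4")
print(status(copies))           # yellow: copies that lived on node4 are unassigned
reallocate(copies, nodes[:3])
print(status(copies))           # green: replicas rebuilt on the remaining nodes
```

In this model the cluster only stays yellow if the missing replicas cannot be placed anywhere, for example because allocation has been disabled or the remaining nodes have no room.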
When I say "failing" disks, I mean that the indicator lights on the disks in the system chassis show that they are exhibiting errors. I can see that this affects the ingestion rate of the cluster, so I want to replace them before they fail completely. I have had this happen before with another system: when disks start to go bad, Elasticsearch has trouble getting the cluster status of the node with the failing disk and slows to a crawl. Where Elasticsearch is involved, it is best to replace disks before they fail completely.

Anyhow, I think the Elasticsearch dev folks should think about this failure scenario. It would be great if they added the capability to snapshot a single node after disabling shard reallocation - http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/setup-upgrade.html. As it stands now, replacing a failing or failed disk in a node is a troublesome prospect.

On Friday, September 26, 2014 8:16:50 PM UTC-7, David Pilato wrote:
>
> Is your cluster still yellow?
> It should be Green at some point unless you change some settings
> explicitly.
>
> If your cluster does not index anymore, you could copy manually files in
> data dir and copy them on your new disk. But I wonder how you can copy from
> a failing disk?
>
> I'd probably let elasticsearch do it over the wire.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> On 27 Sept. 2014 at 02:30, vic hargrave <[email protected]> wrote:
>
> I have a situation where I need to replace disks that are failing on a
> single node in my 4 node Elasticsearch cluster. As a result I'd like to
> backup the Elasticsearch data on that node only, replace the disks and then
> restore the data to the new (empty) disks. I've tried shutting down the
> node in question, but the remaining 3 nodes can only get to a "yellow"
> state. I'm using 5 primary shards and 1 replica shard per index. I
> considered using snapshot for the single node, but it seems Elasticsearch
> does not support snapshot and restore for a single node, it must be done on
> the whole cluster.
>
> Is it possible to just manually copy the data from the failing disk to
> another disk, replace the failing disk then copy the data back to the new
> disk (starting and stopping Elasticsearch before and after this whole
> process, of course)?
>
> -- vic

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/dbe5d1a2-d377-4982-a2e5-e55024f2c4b4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
