Earlier this week we discovered that our three-node Elasticsearch cluster needed to be expanded because it was getting dangerously close to maximum capacity. I was nervous about this and read up as best I could on how to do it properly. The only advice I could find was to make sure the new nodes cannot be elected as masters when they join, to avoid a split-brain scenario. Fair enough.
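
For what it's worth, my understanding of that advice boils down to a couple of lines in elasticsearch.yml, something like this (I'm assuming zen discovery here, and a quorum of 2 based on our three original master-eligible nodes):

    # on the two new nodes: hold data but never become master
    node.master: false
    node.data: true

    # on the master-eligible nodes: require a quorum of 2 out of 3
    discovery.zen.minimum_master_nodes: 2

If I've misread that advice, corrections are very welcome.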
I launched two new EC2 instances to join the cluster and watched. Some shards began relocating, no big deal. Six hours later I checked in and some shards were still relocating and one shard was recovering. Weird, but whatever... the cluster health was still green and searches were working fine.

Then I got an alert at 2:30am that the cluster state was now yellow, and I found that we had 3 shards marked as recovering and 2 shards unassigned. The cluster still technically works, but 24 hours after the new nodes were added I feel like my only choice to get a green cluster again will be to simply launch 5 fresh nodes and replay all the data from backups into them. Ugggggh. SERIOUSLY!

What can I do to prevent this? I feel like I am missing something, because I always heard that the strength of Elasticsearch is its ease of scaling out, but it feels like every time I try it, it falls to the floor. :-(

Thanks!
James
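
P.S. In case it's useful, these are roughly the checks I've been running to watch the shards (assuming the stock REST API on localhost:9200):

    # overall state: green / yellow / red, plus shard counts
    curl -XGET 'http://localhost:9200/_cluster/health?pretty'

    # per-shard view: which shards are STARTED, RELOCATING, INITIALIZING, UNASSIGNED
    curl -XGET 'http://localhost:9200/_cat/shards?v'

    # progress of ongoing shard recoveries
    curl -XGET 'http://localhost:9200/_cat/recovery?v'

The _cat output is where I see the shards stuck in recovery; if there is a better way to figure out why a shard stays unassigned, I'd love to hear it.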
