Mark, thank you for responding. It is quite odd, but what happened this morning is stranger.
I dropped and recreated the replica shard on the unused index. Now the in-use index shows Green. FYI, we're running ES version 0.90, and my in-use index is 717 GB with 135M+ documents.

On Friday I ran status reports on each index and compared the two. Nothing showed as "failed", "red", or plainly "wrong", so I left it over the weekend. When I came in today the cluster was still Yellow.

Any idea whether recreating the other index's replica shard caused the cluster's status to go Green? It feels like a fluke, but I'm new to ES. If this is indeed expected ES behavior, I'll add it to my restoral procedures.

On Friday, March 21, 2014 9:48:27 PM UTC-4, Mark Walkom wrote:
>
> What version are you running?
>
> It's odd this would happen if, when you set replicas to zero, the cluster
> state is green and your index is ok.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
>
>
> On 22 March 2014 06:15, Glenn Snead <[email protected]> wrote:
>
>> I have a six node cluster: 2 master nodes and 4 client / data nodes. I
>> have two indices. One with data and one that is set aside for future
>> use. I'm having trouble with the index that is in use.
>> After making some limits.conf configuration changes and restarting the
>> impacted nodes, one of my indices' replica shards will not complete
>> initialization.
>> I wasn't in charge of the node restarts and here is the sequence of
>> events:
>> Shut down the client and data nodes on each of the four servers.
>> Start the client and data node on each server.
>> I don't believe time was allowed for the cluster to reallocate or to
>> move shards.
>>
>> limits.conf changes:
>> - memlock unlimited
>> hard nofiles 32000
>> soft nofiles 32000
>>
>> Here's what I have tried thus far:
>>
>> Drop the replica shard, which brings the cluster status to Green.
>> Verify the cluster's status - no replication, no reallocating, etc.
>> Re-add the replica shard.
>>
>> Drop the replica shard and the data nodes that were to carry the replica
>> shard.
>> Verify the cluster's status.
>> Start the data nodes and allow the cluster to reallocate primary shards.
>> - The cluster's status is Green.
>> Add the replica shard to the index. The replica shard never completes
>> initialization, even over a 24 hour period.
>>
>> I've checked the transaction log files on each node and they are all zero
>> length files.
>> The nodes holding the replica shard also hold primary shards for the
>> unused index.
>> These nodes copied their matching primary node's index Size (as seen in
>> Paramedic), but now Paramedic shows an index Size of only a few bytes. The
>> index folder on the replica shard servers still has the data.
>>
>> Unknown to me, my target system was put online and my leadership doesn't
>> want to schedule an outage window. Most of my research suggests that I drop
>> the impacted index and re-initialize. I can replace the data, but this
>> would impact the user interface while the index re-ingests the
>> documents. This issue has occurred before on my test system and the fix was
>> to rebuild the index. However, I never learned why the replica shard had
>> the issue in the first place.
>>
>> My questions are:
>> - Does the replica shard hosting server's index Size (shown in Paramedic)
>> indicate a course of action?
>> - Is it possible to resolve this without dropping the index and
>> rebuilding? I'd hate to resort to this each time we attempt ES server
>> maintenance or configuration changes.
>>

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ed2501e5-b504-46e2-ae04-69097e6d46ed%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
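
For anyone repeating the "drop the replica, verify, re-add the replica" steps described in this thread, here is a minimal sketch of how they might be scripted against the REST API. It is not taken from the thread. It assumes the index settings endpoint (PUT /<index>/_settings with number_of_replicas) and the cluster health endpoint (GET /_cluster/health). The node address, the index name "my_index", and all function names are hypothetical placeholders.

```python
import json
import urllib.request

# Assumption: HTTP address of any node in the cluster.
ES_URL = "http://localhost:9200"


def replica_settings_request(index, replicas, base_url=ES_URL):
    """Build a PUT /<index>/_settings request that sets number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def health_request(base_url=ES_URL):
    """Build a GET /_cluster/health request."""
    return urllib.request.Request(f"{base_url}/_cluster/health", method="GET")


def is_settled(health):
    """Return True when the health document reports green and no shard
    is initializing, relocating, or unassigned."""
    return (
        health.get("status") == "green"
        and health.get("initializing_shards", 0) == 0
        and health.get("relocating_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )


def send(request):
    """Send a request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def drop_and_readd_replica(index, base_url=ES_URL):
    """Drop the replica, confirm the cluster is settled, then re-add it.

    Returns the health document seen after the drop. The caller still has to
    poll health afterwards to see whether the new replica finishes initializing.
    """
    send(replica_settings_request(index, 0, base_url))
    health = send(health_request(base_url))
    if not is_settled(health):
        raise RuntimeError(f"cluster not settled after dropping replica: {health}")
    send(replica_settings_request(index, 1, base_url))
    return health
```

Calling drop_and_readd_replica("my_index") would run the sequence once. The check in is_settled mirrors the "no replication, no reallocating" verification step and says nothing about why a replica might stay in the initializing state.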
