Very interesting. The default 'write consistency level' with Elasticsearch is QUORUM, i.e. verify a quorum of replicas for a shard are available before processing a write for it. In this case you were just left with 1 replica, C, and a write happened. So you would think that it should not go through since 2 replicas would be required for quorum. However: https://github.com/elasticsearch/elasticsearch/issues/6482. I think this goes to show this is a real, not a hypothetical problem!
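To make the arithmetic concrete, here is a minimal sketch of the textbook majority rule applied to this scenario. It is not Elasticsearch's actual code: the function names are made up for illustration, and the linked issue covers how the real behaviour differs from this expectation.

```python
def majority_quorum(total_copies: int) -> int:
    """Textbook majority: strictly more than half of all shard copies.

    total_copies counts the primary plus its replicas (hypothetical helper,
    not an Elasticsearch API).
    """
    return total_copies // 2 + 1


def write_allowed(active_copies: int, total_copies: int) -> bool:
    """Return True if enough copies are active to satisfy a majority quorum."""
    return active_copies >= majority_quorum(total_copies)


# Scenario from the thread: replica count of 1, so 2 copies per shard,
# and only the copy on node C is still alive.
total = 2
print(majority_quorum(total))   # copies required under the textbook rule
print(write_allowed(1, total))  # would the write to C be accepted?
```

Under this rule, a shard with 2 copies needs both of them active, so a write that reaches only C would be rejected. That is the expectation described above.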
But guess what? *Even if this were fixed, and a write to C never happened:* it is still possible that once A & B were back, C could be picked as primary and clobber data. See:
https://github.com/elasticsearch/elasticsearch/issues/7572#issuecomment-59983759

On Thu, Oct 23, 2014 at 7:48 PM, Evan Tahler <[email protected]> wrote:

> Bump? I would love to hear some thoughts on this flow, and if there are
> any suggestions on how to mitigate it (other than replicating all data to
> all nodes).
>
> Thanks!
>
> On Tuesday, October 14, 2014 3:52:31 PM UTC-7, Evan Tahler wrote:
>>
>> Hi Mailing List! I'm a first-time poster, and a long time reader.
>>
>> We recently had a crash in our ES (1.3.1 on Ubuntu) cluster which caused
>> us to lose a significant volume of data. I have a "theory" on what
>> happened to cause this, and I would love to hear your opinions on this, and
>> if you have any suggestions to mitigate it.
>>
>> Here is a simplified play-by-play:
>>
>>    1. Cluster has 3 data nodes, A, B, and C. The index has 10 shards.
>>       The index has a replica count of 1, so A is the master and B is a
>>       replica. C is doing nothing. Re-allocation of indexes/shards is
>>       enabled.
>>    2. A crashes. B takes over as master, and then starts transferring
>>       data to C as a new replica.
>>    3. B crashes. C is now master with a partial dataset.
>>    4. There is a write to the index.
>>    5. A and B finally reboot, and they are told that they are now stale
>>       (as C had a write while they were away). Both A and B delete their
>>       local data. A is chosen to be the new replica and re-syncs from C.
>>    6. ... all the data A and B had which C never got is lost forever.
>>
>> Is the above scenario possible? If it is, it seems like the
>> default behavior of ES might be better to not reallocate in this scenario?
>> This would have caused the write in step #4 to fail, but in our use case,
>> that is preferable to data loss.
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/58e98223-c036-41e2-b53c-265343fa3173%40googlegroups.com
> For more options, visit https://groups.google.com/d/optout.
