On 31 Jan 2018 17:18, "Jeff Jirsa" <jji...@gmail.com> wrote:
> I don’t know why this is a surprise (maybe because people like to talk about multiple rings, but the fact that replication strategy is set per keyspace and that you could use SimpleStrategy in a multiple dc cluster demonstrates this), but we can chat about that another time

The reason I find it surprising is that it makes very little *sense* to put a token belonging to a node from one DC between tokens of nodes from another one. Having token ranges like that, with ends from nodes in different DCs, doesn't convey any *meaning* and has no correspondence to what is being modelled here. It also makes it nearly impossible to reason about range ownership (unless you're a machine, in which case you probably don't care). I understand that it works in the end, but it doesn't help to know that. It is an implementation detail sticking out of the code's guts, and it sure *is* surprising in all its ugliness. It also opens up the possibility of problems just like the one which started this discussion.

I don't find the argument of using SimpleStrategy for multi-DC particularly interesting, nor can I predict what to expect from such an attempt.

>> If this is deemed invalid config why does the new node *silently* steal the existing token, badly affecting the ownership of the rest of the nodes? It should just refuse to start!
>
> Philosophically, with multiple DCs, it may start up and not see the other DC for minutes/hours/days before it realizes there’s a token conflict - what should it do then?

This was not the case for us - the new node had seen all of the ring and could detect that there was a conflict. Still, it decided to claim the token ownership, removing it from a longer-lived node. This should be fairly easy to reproduce; however, Kurt mentioned that there is supposed to be some sort of protection against that. I'll try again tomorrow.
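
To make the two points above concrete, here is a minimal sketch in plain Python. It is not Cassandra code: `ring_ranges`, `find_token_conflicts`, the node names and the token values are all made up for illustration. It assumes a single ring shared by all DCs, where each node owns the range ending at its token, and shows (a) how ranges end up with endpoints in different DCs and (b) the kind of check a joining node could run against the ring it has already learned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical stand-in for a cluster member; not a Cassandra type.
    name: str
    dc: str


def ring_ranges(ring):
    """List (start, end, owner) for every range (start, end] in the ring.

    `ring` maps token -> Node. The node holding `end` owns the range;
    the first entry wraps around from the highest token.
    """
    tokens = sorted(ring)
    return [(tokens[i - 1], tokens[i], ring[tokens[i]]) for i in range(len(tokens))]


def find_token_conflicts(ring, joining, wanted_tokens):
    """Return {token: current_owner} for wanted tokens held by another node."""
    return {
        t: ring[t]
        for t in wanted_tokens
        if t in ring and ring[t] != joining
    }


# Two DCs sharing one ring, tokens interleaved.
a, b = Node("a", "dc1"), Node("b", "dc1")
c, d = Node("c", "dc2"), Node("d", "dc2")
ring = {0: a, 100: b, 50: c, 150: d}

# Ranges whose start and end tokens belong to nodes in different DCs.
mixed = [
    (start, end)
    for start, end, owner in ring_ranges(ring)
    if ring[start].dc != owner.dc
]

# A joining node that has learned the ring and wants an already-owned token.
e = Node("e", "dc2")
conflicts = find_token_conflicts(ring, e, [50, 175])
if conflicts:
    print("would refuse to start, tokens already owned:", conflicts)
```

With this toy ring every range has ends in different DCs, and the check flags token 50 as already owned by node `c`. Whether a real node should refuse to start in that case, rather than take the token over, is exactly the question under discussion.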
> If your suggestion to resolve that is to make sure we see the whole ring before starting up, we end up in a situation where we try not to startup unless we can see all nodes, and create outages during DC separations.

I don't really see a problem here. A newly started node learns topology from the seed nodes - it doesn't need to *see* all nodes, just learn that they *exist* and which tokens are assigned to them. A node which is restarting doesn't even need to do that, because it doesn't need to reconsider its token ownership.

Cheers,
--
Alex