Hi,
Recently we discovered that Elasticsearch is not able to resolve a previous split-brain situation in an existing cluster. The problem (split brain and its subsequent resolution) can be split into two main parts:

1. Reorganization of the whole cluster, and logging
2. Resolution of data conflicts

The first part should be fairly "easy" to solve: discovery should take place regularly and update the cluster organization if necessary. The second part is more complex and depends on what users are doing. For our application it is not that important that conflicts caused by the split brain are resolved by Elasticsearch - we can easily handle this ourselves (re-import the data modified during the split-brain situation). IMHO it is much better to let ES resolve the split brain than to let it run "forever" in the split-brain state. Two small configuration/API sketches for the points discussed below are appended after the quoted thread.

From the original issue https://github.com/elasticsearch/elasticsearch/issues/5144 :

-------------------------
We have a 4 node ES cluster running ("plain" Zen discovery - no cloud stuff). Two nodes are in one DC - two nodes in another DC. When the network connection between both DCs fails, ES forms two two-node ES clusters - a split brain. When the network is operative again, the split-brain situation remains persistent.

I've set up a small local test with a 4 node ES cluster:

+--------+                         +--------+
| Node A | ----\             /---- | Node C |
+--------+      \.........../      +--------+
+--------+      /           \      +--------+
| Node B | ----/             \---- | Node D |
+--------+                         +--------+
              Single ES cluster

When the network connection fails, two two-node clusters exist (split brain). I've simulated that with "iptables -A INPUT/OUTPUT -s/d -j DROP" statements.

+--------+                         +--------+
| Node A | ----\             /---- | Node C |
+--------+      \           /      +--------+
+--------+      /           \      +--------+
| Node B | ----/             \---- | Node D |
+--------+                         +--------+
ES cluster                         ES cluster

When the network between nodes A/B and C/D is operative again, the single-cluster state is not restored (the split brain is persistent). It did not make a difference whether unicast or multicast Zen discovery is used.

Another issue is that operating system keepalive settings affect the time after which ES detects a node failure. Keepalive timeout settings (e.g. net.ipv4.tcp_keepalive_time/probes/intvl) directly influence the node failure detection. There should be some task that regularly polls the "alive" status of all other known nodes.

Tested with ES 1.0.0 (and an older 0.90.3).
-----------------------
David Pilato: "Did you try to set minimum_master_nodes to 3? See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#master-election "
-----------------------
Me: "Setting minimum_master_nodes to 3 is not an option. If I understand correctly, it would force all 4 nodes to stop working during a 2/2 split - meaning: no service at all. It also wouldn't cover the case where two nodes are taken down for maintenance work. And what if there are three DCs (each with 2 nodes) - a setting of minimum_master_nodes=5 would only allow one node to fail before ES stops working. IMHO there should be a regular job inside ES that checks for the existence of other nodes (either via unicast or via multicast) and triggers (re-)discovery if necessary - the split-brain situation must be resolved."
-----------------------
David Pilato: "Exactly. Cluster will stop working until network connection is up again. What do you expect? Which part of the cluster should hold the master in case of network outage?
Cross Data center replication is not supported yet and you should consider:
- use the great snapshot and restore feature to snapshot from one DC and restore in the other one
- index in both DCs (so two distinct clusters) at the client level
- use the Tribe node feature to search or index on multiple clusters
I think we should move this conversation to the mailing list."
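For reference, this is roughly what David's minimum_master_nodes suggestion would look like in elasticsearch.yml for our 4-node setup (the usual quorum rule being N/2 + 1, i.e. 3 for 4 master-eligible nodes), together with the Zen fault-detection settings that - as far as I can tell from the docs - control the regular "alive" pings I was asking about. The fault-detection values shown are just the documented defaults; treat this as a sketch, not a recommendation (as said above, 3 out of 4 is not an option for our two-DC layout):

  # elasticsearch.yml - sketch only
  discovery.zen.minimum_master_nodes: 3   # quorum of 4 master-eligible nodes (N/2 + 1)
  discovery.zen.fd.ping_interval: 1s      # how often a node pings the others (default)
  discovery.zen.fd.ping_timeout: 30s      # how long to wait for each ping (default)
  discovery.zen.fd.ping_retries: 3        # failed pings before a node is considered gone (default)

If I read the fault-detection docs correctly, tightening ping_timeout/ping_retries should reduce the node failure detection time independently of the OS keepalive settings mentioned in the issue.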

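Regarding the snapshot-and-restore suggestion for cross-DC: for anyone who wants to try it, the basic flow with the 1.0 snapshot API would be something like the following (repository name and path are made up; the "fs" location has to be reachable from both clusters, e.g. a shared or replicated filesystem):

  # DC 1: register a filesystem repository and take a snapshot
  curl -XPUT 'http://localhost:9200/_snapshot/dc_backup' -d '{
    "type": "fs",
    "settings": { "location": "/mnt/es-backups/dc_backup" }
  }'
  curl -XPUT 'http://localhost:9200/_snapshot/dc_backup/snapshot_1?wait_for_completion=true'

  # DC 2: register the same repository (pointing at the shared/copied location) and restore
  curl -XPOST 'http://localhost:9200/_snapshot/dc_backup/snapshot_1/_restore'

Note that indices which already exist and are open in the target cluster have to be closed or deleted before the restore.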