Hi,

Recently we discovered that Elasticsearch is not able to recover from a 
split brain situation in an existing cluster. The problem (split brain and 
its subsequent resolution) can be split into two main parts:

   1. Reorganization of the whole cluster and logging
   2. Resolution of data conflicts

The first thing should be fairly "easy" to solve. Discovery should take 
place regularly and update the cluster organization if necessary.

The second thing would be more complex and depends on what users are 
doing. In our application it is not that important that conflicts caused by 
split brain are resolved by Elasticsearch - we can easily handle this 
ourselves (by re-importing the data modified during the split brain situation).

IMHO it is much better to let ES solve the split brain than to let it run 
"forever" in the split brain situation.



From the original issue 
https://github.com/elasticsearch/elasticsearch/issues/5144 :

-------------------------

we have a 4 node ES cluster running ("plain" Zen discovery - no cloud 
stuff). Two nodes are in one DC - two nodes in another DC.

When the network connection between both DCs fails, ES forms two two-node 
ES clusters - a split brain. When the network is operative again, the split 
brain situation persists.

I've set up a small local test with a 4 node ES cluster:

+--------+                         +--------+
| Node A | ----\             /---- | Node C |
+--------+      \.........../      +--------+
+--------+      /           \      +--------+
| Node B | ----/             \---- | Node D |
+--------+                         +--------+
               Single ES cluster

When the network connection fails, two two-node clusters exist (split 
brain). I've simulated that with "iptables -A INPUT/OUTPUT -s/d -j DROP" 
statements.
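
The shorthand above expands to one INPUT rule (matching the source address) 
and one OUTPUT rule (matching the destination address) per remote node. A 
minimal sketch that only builds the command lines, without running them; the 
IP addresses and the function names are hypothetical and not taken from the 
original test setup:

```python
from typing import List


def partition_rules(remote_ips: List[str]) -> List[List[str]]:
    """Build iptables commands that drop all traffic to and from remote_ips.

    To be applied on each node of one side, listing the nodes of the other side.
    """
    rules = []
    for ip in remote_ips:
        # Drop packets arriving from the remote node.
        rules.append(["iptables", "-A", "INPUT", "-s", ip, "-j", "DROP"])
        # Drop packets sent to the remote node.
        rules.append(["iptables", "-A", "OUTPUT", "-d", ip, "-j", "DROP"])
    return rules


def heal_rules(remote_ips: List[str]) -> List[List[str]]:
    """Build the matching delete commands (-D instead of -A) to restore the network."""
    return [
        ["-D" if arg == "-A" else arg for arg in rule]
        for rule in partition_rules(remote_ips)
    ]


if __name__ == "__main__":
    # Hypothetical addresses of nodes C and D, as seen from node A or B.
    for rule in partition_rules(["10.0.2.1", "10.0.2.2"]):
        print(" ".join(rule))
```

The printed commands would have to be run as root on nodes A and B (and the 
mirrored set on C and D) to cut the link in both directions.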

+--------+                         +--------+
| Node A | ----\             /---- | Node C |
+--------+      \           /      +--------+
+--------+      /           \      +--------+
| Node B | ----/             \---- | Node D |
+--------+                         +--------+
  ES cluster                      ES cluster

When the network between nodes AB and CD is operative again, the single 
cluster status is not restored (split brain is persistent).

It did not make a difference whether unicast or multicast Zen discovery is 
used.

Another issue is that operating system keepalive settings affect the time 
after which ES detects a node failure. Keepalive timeout settings (e.g. 
net.ipv4.tcp_keepalive_time/probes/intvl) directly influence the node 
failure detection.
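
To give a feeling for the numbers: on Linux a dead peer on an idle TCP 
connection is only noticed after the idle time plus all unanswered probes. 
A rough sketch of that arithmetic, assuming detection relies on TCP keepalive 
alone; the function name and the "tuned" values are only illustrative:

```python
def keepalive_detection_seconds(time_s: int, probes: int, intvl_s: int) -> int:
    """Approximate worst-case time until the kernel drops a dead idle connection.

    time_s  -> net.ipv4.tcp_keepalive_time   (idle seconds before the first probe)
    probes  -> net.ipv4.tcp_keepalive_probes (unanswered probes before giving up)
    intvl_s -> net.ipv4.tcp_keepalive_intvl  (seconds between probes)
    """
    return time_s + probes * intvl_s


if __name__ == "__main__":
    # Usual Linux defaults: 7200 s idle, 9 probes, 75 s apart.
    print(keepalive_detection_seconds(7200, 9, 75))
    # Illustrative tuned values, not a recommendation.
    print(keepalive_detection_seconds(60, 3, 10))
```

With the usual Linux defaults this comes out at more than two hours, which 
is why these settings matter so much here.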

There should be some task that regularly polls the "alive" status of all 
known other nodes.
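
What I have in mind is roughly the following. This is only a sketch of the 
idea in Python, not how ES does (or would do) it internally; all names are 
made up, the addresses are hypothetical, and a plain TCP connect to the 
transport port (9300 by default) stands in for a real ping:

```python
import socket
from typing import Callable, Dict, Iterable, List, Tuple

Node = Tuple[str, int]  # (host, transport port)


def tcp_probe(node: Node, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the node can be opened within timeout."""
    try:
        with socket.create_connection(node, timeout=timeout):
            return True
    except OSError:
        return False


def poll_nodes(
    nodes: Iterable[Node], probe: Callable[[Node], bool] = tcp_probe
) -> Dict[Node, bool]:
    """Probe every known node once and return its reachability."""
    return {node: probe(node) for node in nodes}


def rejoined(previous: Dict[Node, bool], current: Dict[Node, bool]) -> List[Node]:
    """Nodes that were marked unreachable in the previous round and are reachable now.

    A non-empty result is the point where (re-)discovery would be triggered.
    """
    return sorted(
        node
        for node, alive in current.items()
        if alive and previous.get(node) is False
    )


if __name__ == "__main__":
    known = [("10.0.1.1", 9300), ("10.0.2.1", 9300)]  # hypothetical addresses
    print(poll_nodes(known))
```

A scheduler would call poll_nodes() at a fixed interval, keep the previous 
result, and start a new discovery round whenever rejoined() returns 
something.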

Tested with ES 1.0.0 (and an older 0.90.3).

-----------------------

David Pilato: "Did you try to set minimum_master_nodes to 3? See 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#master-election"
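
For context, the value 3 follows from the usual majority rule (N / 2 + 1 
master-eligible nodes). A small sketch of that arithmetic for the 2 + 2 
split described above, assuming all 4 nodes are master-eligible; the function 
names are made up for illustration:

```python
def quorum(master_eligible_nodes: int) -> int:
    """Majority of the master-eligible nodes: N / 2 + 1 (integer division)."""
    return master_eligible_nodes // 2 + 1


def can_elect_master(partition_size: int, minimum_master_nodes: int) -> bool:
    """A partition may elect a master only if it sees enough master-eligible nodes."""
    return partition_size >= minimum_master_nodes


if __name__ == "__main__":
    required = quorum(4)
    print("minimum_master_nodes for 4 nodes:", required)
    # After the DC link fails, each side sees only its own 2 nodes.
    for side in ("A+B", "C+D"):
        print(side, "can elect a master:", can_elect_master(2, required))
```

With a lower value such as 2, both halves pass the check and each elects its 
own master, which is exactly the split brain; with 3, neither half does.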

-----------------------

Me: "Setting minimum_master_nodes to 3 is not an option. If I understand 
correctly, it would force all 4 nodes to stop working, meaning no 
service at all. It also wouldn't cover the case where two nodes are taken 
down for maintenance work. And what if there are three DCs (each with 2 
nodes) - a setting of minimum_master_nodes=5 would only allow one node to 
fail before ES stops working. IMHO there should be a regular job inside ES 
that checks for the existence of other nodes (either via unicast or via 
multicast) and triggers (re-)discovery if necessary - the split brain 
situation must be resolved."

-----------------------

David Pilato: "Exactly. Cluster will stop working until network connection 
is up again.
What do you expect? Which part of the cluster should hold the master in 
case of network outage?

Cross Data center replication is not supported yet and you should consider:

   - use the great snapshot and restore feature to snapshot from a DC and 
   restore in the other one
   - index in both DC (so two distinct clusters) from a client level
   - use Tribe node feature to search or index on multiple clusters

I think we should move this conversation to the mailing list."
