Hi Anders/Hans,
On 20/01/18 00:56, Anders Widell wrote:
Ack from me also, with comments:
* I think my major comment is that I had originally envisioned that
you would use the "etcdctl lock" command (in the V3 API) and that the
active SC would hold the lock for as long as it is active. The lock
would not be needed for reading. Your approach of only creating the
lock when you wish to change the active controller could be fine, though.
However, you shouldn't need the lock for reading - only when you wish
to update the active controller. Regarding the Watch: I think you
should have the watch on the lock instead of (or in addition to) the
data you are protecting. At a fail-over, the old standby would acquire
the lock and wait a while to give the old active enough time to detect
that a fail-over is pending (it notices that the lock has been
created). The old active would then be able to remove the lock and
prevent the fail-over from happening. We can look into this in the
next iteration (next release) and keep it as it is for now.
[Gary] I will remove the opensaf_active_controller key and just have a
key for the lock. The node name is stored in the corresponding value.
It's a lot simpler that way, so I will do it for this release. The lock
will not have a TTL and will be persistent (until removed by another
controller).
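The single-key scheme described above could be sketched roughly as below. This is a hypothetical plugin fragment, not the actual patch: the key name /opensaf/lock is an assumption, and it relies on etcdctl's v2 "mk" command, which fails if the key already exists, to get atomic create semantics.

```shell
#!/bin/sh
# Hypothetical lock key; the real plugin may use a different name/prefix.
LOCK_KEY="/opensaf/lock"

lock() {
    node="$1"
    # "etcdctl mk" only succeeds when the key does not exist yet, so
    # creation doubles as the atomic lock acquisition. No TTL is set,
    # so the lock persists until another controller removes it.
    if etcdctl mk "$LOCK_KEY" "$node" >/dev/null 2>&1; then
        echo "lock acquired by $node"
        return 0
    fi
    echo "lock held by $(etcdctl get "$LOCK_KEY")"
    return 1
}
```

A watch on this single key would then cover both the lock state and the identity of the active controller.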
* You ought to utilise the test-and-set functionality in the etcd v2
protocol in the cases where you are changing the value of a key and
know (or think you know) the previous value. Unlock is an example of
this, and probably fail-over as well. We could add this later, but I
think you should at least extend the plugin API already now, so that
it takes a "previous value" parameter where applicable.
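For illustration, the etcd v2 HTTP API exposes this test-and-set via the prevValue query parameter (compare-and-swap on PUT, compare-and-delete on DELETE). A minimal sketch of what the extended plugin calls might look like, with the endpoint and key path as assumptions:

```shell
#!/bin/sh
# Assumed etcd v2 endpoint and key; the real plugin would take these
# from its configuration.
ETCD="http://127.0.0.1:2379/v2/keys"
LOCK_KEY="opensaf/lock"

failover() {
    old_node="$1"
    new_node="$2"
    # The PUT succeeds only if the stored value still equals old_node
    # (etcd v2 compare-and-swap), so a stale controller cannot take over.
    curl -fsS -X PUT "$ETCD/$LOCK_KEY?prevValue=$old_node" -d value="$new_node"
}

unlock() {
    node="$1"
    # Compare-and-delete: the lock is removed only if this node still
    # owns it.
    curl -fsS -X DELETE "$ETCD/$LOCK_KEY?prevValue=$node"
}
```

If the previous value no longer matches, etcd returns an error and the plugin can report the failed precondition to the caller instead of silently overwriting the key.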
* You have a try-again loop when you acquire the lock, but if the
maximum number of retries has been reached, you continue as if the
lock had been acquired successfully. That doesn't seem correct.
[Gary] Yes, will fix.
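The corrected retry loop should look roughly like this sketch, where try_lock is a hypothetical helper standing in for the plugin's single lock attempt:

```shell
#!/bin/sh
MAX_RETRIES=3

acquire_with_retry() {
    attempt=0
    while [ "$attempt" -lt "$MAX_RETRIES" ]; do
        if try_lock "$1"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep 1
    done
    # All retries exhausted: report failure instead of proceeding as
    # if the lock had been acquired.
    return 1
}
```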
* It is not obvious (to me) that no more than one watch thread can be
created at a time. Could you add a flag that keeps track of whether
there is an existing thread, and add assert statements checking that
there is no existing thread when you call MonitorActive() to create a
new one?
[Gary] OK, will add a conditional statement.
* As Hans points out below, it also seems possible that the watch
thread could disappear silently in some error cases.
[Gary] Will make it assert in that case.
* As already pointed out by Hans, we should store our keys in some
directory in the etcd database, so that the same database can be used
for other purposes as well. I think the plugin (shell script) should
add a directory prefix to the key.
[Gary] Yes, good idea. The directory prefix will be handled by the
plugin, in case the underlying key-value store doesn't support
directories.
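The prefix handling could be as small as the sketch below: the plugin maps every plugin-level key name into one namespaced subtree so the same etcd database can be shared. The /opensaf prefix is an assumption for illustration.

```shell
#!/bin/sh
# Assumed directory prefix under which all OpenSAF keys live.
KEY_PREFIX="/opensaf"

full_key() {
    # Map a plugin-level key name to its namespaced etcd key.
    echo "$KEY_PREFIX/$1"
}
```

Keeping this mapping inside the plugin means callers never see the prefix, and a backend without directory support can still treat the result as a flat key with a common name prefix.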
AndersW> Split-brain should not be possible; however, the current
algorithm will not guarantee that the active SC is in the largest
partition if TIPC connectivity is broken (partitioned). So it could
happen that a single isolated node (from a TIPC point of view) is the
active SC, even though a larger TIPC partition exists. I think this
could be solved by writing the size of the cluster into the lock: an
existing active SC shall reject a fail-over if it is initiated from a
node in a smaller partition.
[Gary] Can we postpone this for the next release?
Thanks
Gary
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel