Summary: split-brain: select active SC from largest network partition V3 [#2795]
Review request for Ticket(s): 2795
Peer Reviewer(s): Anders, Ravi, Hans
Pull request to: *** LIST THE PERSON WITH PUSH ACCESS HERE ***
Affected branch(es): develop
Development branch: ticket-2795
Base revision: 1c302a300e449e8a8527671fbd6c7f4e2b41e95d
Personal repository: git://git.code.sf.net/u/userid-2226215/review

--------------------------------
Impacted area       Impact y/n
--------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            n
 OpenSAF services        y
 Core libraries          y
 Samples                 n
 Tests                   n
 Other                   n

Comments (indicate scope for each "y" above):
---------------------------------------------
*** Changes from V2: ***

fmd: made cluster_size atomic
fmd: wait 3 seconds before promoting to active, to allow topology events
     to be processed first
osaf: add check for existing takeover request, before trying to lock
etcdv3 plugin: reliability improvements

revision c7bc78656d5de11f6147727bd8612274fb6e438f
Author: Gary Lee <[email protected]>
Date:   Wed, 11 Apr 2018 17:16:46 +1000

rded: adapt to new Consensus API [#2795]

- add 3 new internal messages: RDE_MSG_NODE_UP, RDE_MSG_NODE_DOWN,
  RDE_MSG_TAKEOVER_REQUEST_CALLBACK
- subscribe to AMFND service up events to keep track of the number of
  cluster members
- listen for takeover requests in KV store

revision 4899e5d0f5abdff8f15eca8ad17d3b13b6a00393
Author: Gary Lee <[email protected]>
Date:   Wed, 11 Apr 2018 17:16:18 +1000

fmd: adapt to new Consensus API [#2795]

revision 812a315af21df06b2f9fdcc3d8fd5b7bbad3e550
Author: Gary Lee <[email protected]>
Date:   Wed, 11 Apr 2018 17:15:41 +1000

amfd: adapt to new Consensus API [#2795]

revision b8a37c1b8965826e5faffbfebc44a84bdb6433a1
Author: Gary Lee <[email protected]>
Date:   Wed, 11 Apr 2018 17:14:39 +1000

osaf: add lock takeover request function [#2795]

- add create and set (if previous value matches) functions to KeyValue class
- add Consensus::MonitorTakeoverRequest() function for use by RDE to answer
  takeover requests
- add Consensus::CreateTakeoverRequest()
- before an SC is promoted to active, it will create a takeover request in
  the KV store.
  An existing SC can reject the lock takeover

revision 955be872ba5887b1b521eac9f7732dd3f6afc593
Author: Gary Lee <[email protected]>
Date:   Wed, 11 Apr 2018 17:13:45 +1000

osaf: extend API to include a create key and an enhanced set key function [#2795]

- add create_key function (fails if key already exists)
- add setkey_match_prev function (set value if previous value matches)
- add missing quotes
- add etcd3.plugin

Added Files:
------------
 src/osaf/consensus/plugins/etcd3.plugin

Complete diffstat:
------------------
 src/amf/amfd/role.cc                     |   2 +-
 src/fm/fmd/fm_cb.h                       |   2 +-
 src/fm/fmd/fm_main.cc                    |  26 +-
 src/fm/fmd/fm_mds.cc                     |   2 +
 src/fm/fmd/fm_rda.cc                     |  27 +-
 src/osaf/consensus/consensus.cc          | 435 ++++++++++++++++++++++++++-----
 src/osaf/consensus/consensus.h           |  55 +++-
 src/osaf/consensus/key_value.cc          | 105 +++++---
 src/osaf/consensus/key_value.h           |  19 +-
 src/osaf/consensus/plugins/etcd.plugin   |  86 +++++-
 src/osaf/consensus/plugins/etcd3.plugin  | 366 ++++++++++++++++++++++++++
 src/osaf/consensus/plugins/sample.plugin |  67 ++++-
 src/rde/rded/rde_cb.h                    |  12 +-
 src/rde/rded/rde_main.cc                 |  75 ++++--
 src/rde/rded/rde_mds.cc                  |  39 ++-
 src/rde/rded/rde_rda.cc                  |   2 +-
 src/rde/rded/role.cc                     |  46 +++-
 src/rde/rded/role.h                      |   2 +-
 18 files changed, 1180 insertions(+), 188 deletions(-)

Testing Commands:
-----------------
1) SI swap of safSi=SC-2N,safApp=OpenSAF
2) Isolate standby cluster (e.g. use iptables to block port 6700 on a TCP system)
3) Isolate active cluster

Testing, Expected Results:
--------------------------
1) No error
2) Standby will fail to be promoted as active as the takeover request is rejected
3) Standby will be promoted

Conditions of Submission:
-------------------------
Ack from any reviewer

Arch       Built     Started    Linux distro
-------------------------------------------
mips         n          n
mips64       n          n
x86          n          n
x86_64       y          y
powerpc      n          n
powerpc64    n          n

Reviewer Checklist:
-------------------
[Submitters: make sure that your review doesn't trigger any checkmarks!]

Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
    that need proper data filled in.
___ You have failed to nominate the proper persons for review and push.
___ Your patches do not have proper short+long header
___ You have grammar/spelling in your header that is unacceptable.
___ You have exceeded a sensible line length in your headers/comments/text.
___ You have failed to put in a proper Trac Ticket # into your commits.
___ You have incorrectly put/left internal data in your comments/files
    (i.e. internal bug tracking tool IDs, product names etc)
___ You have not given any evidence of testing beyond basic build tests.
    Demonstrate some level of runtime or other sanity testing.
___ You have ^M present in some of your files. These have to be removed.
___ You have needlessly changed whitespace or added whitespace crimes
    like trailing spaces, or spaces before tabs.
___ You have mixed real technical changes with whitespace and other
    cosmetic code cleanup changes. These have to be separate commits.
___ You need to refactor your submission into logical chunks; there is
    too much content in a single commit.
___ You have extraneous garbage in your review (merge commits etc)
___ You have giant attachments which should never have been sent;
    Instead you should place your content in a public tree to be pulled.
___ You have too many commits attached to an e-mail; resend as threaded
    commits, or place in a public tree for a pull.
___ You have resent this content multiple times without a clear indication
    of what has changed between each re-send.
___ You have failed to adequately and individually address all of the
    comments and change requests that were proposed in the initial review.
___ You have a misconfigured ~/.gitconfig file (i.e. user.name, user.email etc)
___ Your computer has a badly configured date and time, confusing the
    threaded patch review.
___ Your changes affect IPC mechanism, and you don't present any results
    for in-service upgradability test.
___ Your changes affect user manual and documentation, your patch series
    do not contain the patch that updates the Doxygen manual.

_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
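
Illustration for reviewers: the commit log above describes a create_key function that fails if the key already exists and a setkey_match_prev function that sets a value only if the previous value matches. The sketch below shows those two semantics with an in-memory stand-in. It is an assumption-laden illustration only: ToyKeyValueStore, its method names, and the key and value strings are hypothetical and are not the KeyValue class or the plugin interface from this patch series.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the KV store. The real plugins talk
// to an external store; nothing here is taken from the patch itself.
class ToyKeyValueStore {
 public:
  // Create a key. Fails (returns false) if the key already exists.
  bool Create(const std::string& key, const std::string& value) {
    return data_.emplace(key, value).second;
  }

  // Set a key only if its current value equals prev_value.
  // Fails if the key is missing or another writer changed it first.
  bool SetIfPrevMatches(const std::string& key, const std::string& value,
                        const std::string& prev_value) {
    auto it = data_.find(key);
    if (it == data_.end() || it->second != prev_value) return false;
    it->second = value;
    return true;
  }

  // Read a key. Returns false if the key does not exist.
  bool Get(const std::string& key, std::string* value) const {
    auto it = data_.find(key);
    if (it == data_.end()) return false;
    *value = it->second;
    return true;
  }

 private:
  std::map<std::string, std::string> data_;
};

// Walks through one takeover attempt that an existing SC rejects.
// The "takeover_request" key and the value format are made up.
inline void TakeoverDemo() {
  ToyKeyValueStore kv;
  // The candidate SC files a request; a second request cannot be created.
  assert(kv.Create("takeover_request", "SC-2 NEW"));
  assert(!kv.Create("takeover_request", "SC-1 NEW"));
  // The existing SC answers, but only if the request is still the one it read.
  assert(kv.SetIfPrevMatches("takeover_request", "SC-2 REJECTED", "SC-2 NEW"));
}
```

Making the reply conditional on the previous value means an answer written against a stale request is dropped instead of overwriting a newer one, which is the usual reason for pairing a create-if-absent call with a compare-and-set call.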
