Summary: amf: enhance to work in roaming SC and headless [#2936]
Review request for Ticket(s): 2936
Peer Reviewer(s): Minh, Thang, Vu
Pull request to: *** LIST THE PERSON WITH PUSH ACCESS HERE ***
Affected branch(es): develop
Development branch: ticket-2936
Base revision: b0086e3c5da87fad844e76c8c648f6dc6e7ae73a
Personal repository: git://git.code.sf.net/u/thuantr/review

--------------------------------
Impacted area       Impact y/n
--------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            n
 OpenSAF services        y
 Core libraries          n
 Samples                 n
 Tests                   n
 Other                   n

NOTE: Patch(es) contain lines longer than 80 characters

Comments (indicate scope for each "y" above):
---------------------------------------------
N/A

revision a3ac6a80f584bb6f3cfa8e30b8475916a65c7e4e
Author: thuan.tran <thuan.t...@dektech.com.au>
Date:   Fri, 10 Jul 2020 12:55:00 +0700

imm: define macro for values of canBeCoord [#2936]

revision 975986be3d8d0a6ef2e54c47532aebbebeafc4da
Author: thuan.tran <thuan.t...@dektech.com.au>
Date:   Fri, 10 Jul 2020 12:55:00 +0700

imm: reboot nodes used to be different partition with coord [#2936]

- immnd send re-introduce refresh=3 with ex-immd (active) node id.
- immd set very high priority for re-introduce msg of local immnd
  and choose coord if re-introduce refresh=3 from local immnd.
- immd reply re-intro to reboot if ex-immd is not same as ex-immd
  of selected coord.
- immd use new INTRO_RSP_2 to checkpoint ex-immd to standby.
- immnd use MDS_RED_SUBSCRIBE for immd to know active/standby immd
  and help detect headless in multi partition clusters rejoin.
- immnd discard FEVS from unknown immd or during re-introduce to
  avoid immnd OUT OF ORDER restart and lost ex-immd info.
- Update README.SC_ABSENCE for this new feature.
- Allow to configure disable/enable this new feature.
- immd standby will reboot if see two actives immd to avoid sync
  with wrong active.

revision 1754b0bdb1237441c77de2c8454dd6604c3bae60
Author: thuan.tran <thuan.t...@dektech.com.au>
Date:   Fri, 10 Jul 2020 12:55:00 +0700

amf: enhance to work in roaming SC and headless [#2936]

- amfd reset msg id counter for node that ignore amfnd down event
  to avoid nodes reboot once more due to mismatch msg id after
  reboot up from reboot order for sending node_up after sync window.
- amfd active order reboot its standby if it detect another active
  amfd (multi partition cluster rejoin). Two actives will be handled
  by RDE detect split-brain.
- amfd standby should reboot itself if see two active peers to avoid
  standby do cold-sync or be updated with wrong active. Two actives
  will be handled by RDE detect split-brain.
- amfd just become standby (out of sync) but see active down should
  reboot itself.

Complete diffstat:
------------------
 scripts/opensaf_reboot     |   1 +
 src/amf/amfd/dmsg.cc       |   8 +++
 src/amf/amfd/evt.h         |   1 +
 src/amf/amfd/main.cc       |   3 +
 src/amf/amfd/mds.cc        |  36 +++++++++++-
 src/amf/amfd/msg.h         |   1 +
 src/amf/amfd/ndfsm.cc      |   2 +
 src/amf/amfd/proc.h        |   1 +
 src/amf/amfd/role.cc       |  27 +++++++++
 src/amf/amfd/util.cc       |   2 +-
 src/amf/amfnd/amfnd.cc     |   2 +-
 src/imm/README.SC_ABSENCE  |  22 +++++++
 src/imm/common/immsv_evt.c |  17 +++++-
 src/imm/common/immsv_evt.h |  15 ++++-
 src/imm/immd/immd.conf     |   7 +++
 src/imm/immd/immd.h        |  10 ++++
 src/imm/immd/immd_cb.h     |   7 +++
 src/imm/immd/immd_evt.c    | 141 +++++++++++++++++++++++++++------------------
 src/imm/immd/immd_main.c   |   9 +++
 src/imm/immd/immd_mbcsv.c  |  24 ++++++--
 src/imm/immd/immd_mds.c    |  17 ++++--
 src/imm/immd/immd_proc.c   |  26 ++++-----
 src/imm/immd/immd_red.h    |   1 +
 src/imm/immd/immd_sbevt.c  |  24 +++++---
 src/imm/immnd/immnd_cb.h   |   4 ++
 src/imm/immnd/immnd_evt.c  |  88 ++++++++++++++++++++++------
 src/imm/immnd/immnd_main.c |   2 +
 src/imm/immnd/immnd_mds.c  |  35 ++++++++---
 src/imm/immnd/immnd_proc.c |  19 +++---
 29 files changed, 421 insertions(+), 131 deletions(-)

Testing Commands:
-----------------
N/A

Testing, Expected Results:
--------------------------
N/A

Conditions of Submission:
-------------------------
ACK by reviewers

Arch        Built     Started    Linux distro
-------------------------------------------
mips          n          n
mips64        n          n
x86           n          n
x86_64        y          y
powerpc       n          n
powerpc64     n          n

Reviewer Checklist:
-------------------
[Submitters: make sure that your review doesn't trigger any checkmarks!]

Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
    that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
    (i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
    Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
    like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
    cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
    too much content in a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
    Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
    commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
    of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
    comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.gitconfig file (i.e. user.name, user.email etc)

___ Your computer has a badly configured date and time; confusing the
    threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
    for in-service upgradability test.

___ Your changes affect user manual and documentation, but your patch
    series does not contain the patch that updates the Doxygen manual.

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel