Hi,

Please help review this ticket.

Thanks,
Minh

On 05/09/16 11:43, Minh Hon Chau wrote:
> Summary: AMF: Fix SG unstable from admin continuation of nodegroup after headless [#1987]
> Review request for Trac Ticket(s): 1987
> Peer Reviewer(s): AMF devs
> Pull request to: <<LIST THE PERSON WITH PUSH ACCESS HERE>>
> Affected branch(es): 5.1, default
> Development branch: default
>
> --------------------------------
> Impacted area       Impact y/n
> --------------------------------
>   Docs                    n
>   Build system            n
>   RPM/packaging           n
>   Configuration files     n
>   Startup scripts         n
>   SAF services            y
>   OpenSAF services        n
>   Core libraries          n
>   Samples                 n
>   Tests                   n
>   Other                   n
>
>
> Comments (indicate scope for each "y" above):
> ---------------------------------------------
>   <<EXPLAIN/COMMENT THE PATCH SERIES HERE>>
>
> changeset 3a6bc0645a7d6ce8ce57f51271e0c129f6ecf0f4
> Author:       minh-chau <[email protected]>
> Date: Mon, 05 Sep 2016 09:19:14 +1000
>
>       AMF: Fix SG unstable from admin continuation of nodegroup after headless
>       [#1987]
>
>       The SG becomes unstable because some variables used in the nodegroup
>       operation are not restored after headless if the admin operation on
>       the nodegroup was interrupted just before the cluster went headless.
>
>       To restore the nodegroup operation, AMF needs to know exactly whether
>       a nodegroup operation was in progress when the cluster went headless,
>       based on the susi assignment. If a susi is QUIESCED, QUIESCING or being
>       removed while its related entities su, si and sg are neither LOCKED
>       nor SHUTTING_DOWN, then either the node or the nodegroup MUST be LOCKED
>       or SHUTTING_DOWN. If saAmfNGAdminState is SHUTTING_DOWN, that is enough
>       to know a nodegroup operation was running. However, if
>       saAmfNGAdminState is LOCKED, the case is ambiguous with locking a node.
>       Locking a node must be differentiated from locking a nodegroup because
>       2N SG uses both AVD_SG_FSM_SG_ADMIN and AVD_SG_FSM_SU_OPER for a
>       nodegroup operation, while only AVD_SG_FSM_SU_OPER is used for a node
>       operation. When 2N SG uses AVD_SG_FSM_SG_ADMIN for a nodegroup,
>       saAmfSGAdminState is borrowed (but not updated to IMM) to run the admin
>       operation sequence. Therefore, after headless, if AVD_SG_FSM_SG_ADMIN
>       was being used for the nodegroup, saAmfSGAdminState also needs to be
>       set.
>
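
For reviewers, the check described above can be sketched roughly as below.
This is only a minimal illustration under assumptions, not the code in the
patch: the enums, the SusiContext struct and infer_operation() are
hypothetical names, and resolving the LOCKED case through the SG FSM state
is my reading of the paragraph above.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for AMF states; not the amfd types.
enum class AdminState { UNLOCKED, LOCKED, SHUTTING_DOWN };
enum class SusiState { ASSIGNED, QUIESCED, QUIESCING, BEING_REMOVED };
enum class SgFsm { STABLE, SU_OPER, SG_ADMIN };
enum class Inference { NOT_NODEGROUP, NODEGROUP_RUNNING, AMBIGUOUS_LOCK };

struct SusiContext {
  SusiState susi;         // state of the susi assignment after headless
  AdminState su, si, sg;  // admin states of the related entities
  AdminState ng;          // saAmfNGAdminState of the hosting nodegroup
  SgFsm fsm;              // restored SG FSM state (osaAmfSGFsmState)
};

static bool locked_or_shutting_down(AdminState s) {
  return s == AdminState::LOCKED || s == AdminState::SHUTTING_DOWN;
}

// Decide whether a nodegroup admin operation was in progress.
Inference infer_operation(const SusiContext& c) {
  const bool susi_in_transition = c.susi == SusiState::QUIESCED ||
                                  c.susi == SusiState::QUIESCING ||
                                  c.susi == SusiState::BEING_REMOVED;
  if (!susi_in_transition) return Inference::NOT_NODEGROUP;

  // If su, si or sg explains the transition, no node/nodegroup inference.
  if (locked_or_shutting_down(c.su) || locked_or_shutting_down(c.si) ||
      locked_or_shutting_down(c.sg)) {
    return Inference::NOT_NODEGROUP;
  }

  // SHUTTING_DOWN on the nodegroup is sufficient on its own.
  if (c.ng == AdminState::SHUTTING_DOWN) return Inference::NODEGROUP_RUNNING;

  if (c.ng == AdminState::LOCKED) {
    // Assumption: only a nodegroup operation uses SG_ADMIN here, because
    // a node operation uses SU_OPER only.
    if (c.fsm == SgFsm::SG_ADMIN) return Inference::NODEGROUP_RUNNING;
    return Inference::AMBIGUOUS_LOCK;  // could be a node lock instead
  }
  return Inference::NOT_NODEGROUP;
}
```

In this sketch the AMBIGUOUS_LOCK result is the case that cannot be settled
from the admin states alone, which is why the SG FSM state has to be
available before the susi assignments are processed.
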
>       Because the SG FSM state is used to restore the nodegroup operation
>       while restoring susi assignments, osaAmfSGFsmState (RTA) needs to be
>       read before the susi assignments are read. This requires the active
>       AMFD to become implementer before reading the sg objects. As reported
>       in ticket 1720, making only the active AMFD an early implementer
>       causes the standby AMFD to miss the ccb apply callback. This patch
>       therefore makes both the active and standby AMFD become implementer
>       and applier earlier, so that AMFD can read osaAmfSGFsmState without
>       causing a regression of 1720.
>
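
The ordering constraint above can be pictured with the small sketch below.
It is a model only: the step names are hypothetical and are not amfd
function names, and I am assuming that the active AMFD registers as
implementer while the standby registers as applier.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Role { ACTIVE, STANDBY };

// Returns the modelled startup steps in order. Step names are hypothetical.
std::vector<std::string> startup_steps(Role role) {
  std::vector<std::string> steps;
  // Both roles register with IMM early, so the standby does not miss the
  // ccb apply callback (the regression described in ticket 1720).
  steps.push_back(role == Role::ACTIVE ? "become_implementer"
                                       : "become_applier");
  if (role == Role::ACTIVE) {
    // The sg objects, including osaAmfSGFsmState (RTA), are read first...
    steps.push_back("read_sg_objects");
    // ...so the SG FSM state is known when susi assignments are restored.
    steps.push_back("read_susi_assignments");
  }
  return steps;
}
```

The only point of the model is the relative order: registration with IMM
comes before reading the sg objects, which comes before reading the susi
assignments.
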
>
> Complete diffstat:
> ------------------
>   osaf/services/saf/amf/amfd/include/node.h |   3 +
>   osaf/services/saf/amf/amfd/include/sg.h   |   1 +
>   osaf/services/saf/amf/amfd/nodegroup.cc   |  83 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>   osaf/services/saf/amf/amfd/role.cc        |  20 ++++++------
>   osaf/services/saf/amf/amfd/sg.cc          |  15 +++++----
>   osaf/services/saf/amf/amfd/sgproc.cc      |   2 +-
>   osaf/services/saf/amf/amfd/siass.cc       |   4 +-
>   7 files changed, 109 insertions(+), 19 deletions(-)
>
>
> Testing Commands:
> -----------------
>   Execute test cases in ticket
>
>
> Testing, Expected Results:
> --------------------------
>   Tests pass
>
>
> Conditions of Submission:
> -------------------------
>   ack from reviewers
>
>
> Arch      Built     Started    Linux distro
> -------------------------------------------
> mips        n          n
> mips64      n          n
> x86         n          n
> x86_64      y          y
> powerpc     n          n
> powerpc64   n          n
>
>
> Reviewer Checklist:
> -------------------
> [Submitters: make sure that your review doesn't trigger any checkmarks!]
>
>
> Your checkin has not passed review because (see checked entries):
>
> ___ Your RR template is generally incomplete; it has too many blank entries
>      that need proper data filled in.
>
> ___ You have failed to nominate the proper persons for review and push.
>
> ___ Your patches do not have proper short+long header
>
> ___ You have grammar/spelling in your header that is unacceptable.
>
> ___ You have exceeded a sensible line length in your headers/comments/text.
>
> ___ You have failed to put in a proper Trac Ticket # into your commits.
>
> ___ You have incorrectly put/left internal data in your comments/files
>      (i.e. internal bug tracking tool IDs, product names etc)
>
> ___ You have not given any evidence of testing beyond basic build tests.
>      Demonstrate some level of runtime or other sanity testing.
>
> ___ You have ^M present in some of your files. These have to be removed.
>
> ___ You have needlessly changed whitespace or added whitespace crimes
>      like trailing spaces, or spaces before tabs.
>
> ___ You have mixed real technical changes with whitespace and other
>      cosmetic code cleanup changes. These have to be separate commits.
>
> ___ You need to refactor your submission into logical chunks; there is
>      too much content in a single commit.
>
> ___ You have extraneous garbage in your review (merge commits etc)
>
> ___ You have giant attachments which should never have been sent;
>      Instead you should place your content in a public tree to be pulled.
>
> ___ You have too many commits attached to an e-mail; resend as threaded
>      commits, or place in a public tree for a pull.
>
> ___ You have resent this content multiple times without a clear indication
>      of what has changed between each re-send.
>
> ___ You have failed to adequately and individually address all of the
>      comments and change requests that were proposed in the initial review.
>
> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
>
> ___ Your computer has a badly configured date and time, confusing the
>      threaded patch review.
>
> ___ Your changes affect IPC mechanism, and you don't present any results
>      for in-service upgradability test.
>
> ___ Your changes affect the user manual and documentation, but your patch
>      series does not contain the patch that updates the Doxygen manual.
>
>


------------------------------------------------------------------------------
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
