Can I have some feedback on this series?
Thanks,
Hans

On 02/28/2014 08:54 AM, Hans Feldt wrote:
> Summary: Correct AMF support for TERM-FAILED
> Review request for Trac Ticket(s): 538
> Peer Reviewer(s): Praveen, Nags, Hans N
> Pull request to: <<LIST THE PERSON WITH PUSH ACCESS HERE>>
> Affected branch(es): All
> Development branch: default
>
> --------------------------------
> Impacted area       Impact y/n
> --------------------------------
>   Docs                    n
>   Build system            n
>   RPM/packaging           n
>   Configuration files     n
>   Startup scripts         n
>   SAF services            y
>   OpenSAF services        n
>   Core libraries          n
>   Samples                 n
>   Tests                   n
>   Other                   n
>
>
> Comments (indicate scope for each "y" above):
> ---------------------------------------------
>
> It is very important to get this into the pending releases!
>
>
> changeset 7f72f8d9cbd64fa017a71aa73337b8d74128ade8
> Author:       Hans Feldt <[email protected]>
> Date: Fri, 28 Feb 2014 08:12:11 +0100
>
>       amfd: allow modification of node repair attributes [#538]
>
>       To prepare for correct handling of TERMINATION-FAILED it is important
>       that all the repair related attributes of the AMF system model can be
>       changed.
>
>       This patch allows changing saAmfNodeAutoRepair and
>       saAmfNodeFailfastOnTerminationFailure and also logs such change to SAF
>       LOG.
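
To illustrate the logging described in this changeset, here is a minimal C++ sketch that builds a message of the shape seen in the SAF LOG output quoted in this review ("<DN> <attribute> changed to <value>"). The helper name format_attr_change is hypothetical and is not taken from the patch; the actual amfd code may assemble and emit the message differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (an assumption, not from the patch): builds a message
// shaped like "<DN> <attribute> changed to <value>".
std::string format_attr_change(const std::string& dn,
                               const std::string& attr_name,
                               unsigned int new_value) {
  return dn + " " + attr_name + " changed to " + std::to_string(new_value);
}

// Small self-check that doubles as a usage example.
inline void check_format_attr_change() {
  assert(format_attr_change("safAmfNode=SC-1,safAmfCluster=myAmfCluster",
                            "saAmfNodeAutoRepair", 1) ==
         "safAmfNode=SC-1,safAmfCluster=myAmfCluster saAmfNodeAutoRepair "
         "changed to 1");
}
```

Keeping the DN first makes it easy to filter the log for every change to a given object.
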
>
> changeset 5069ae52df6a857f374c93dcee4dc364f9f4fd0a
> Author:       Hans Feldt <[email protected]>
> Date: Fri, 28 Feb 2014 08:20:51 +0100
>
>       amfd: reboot node when term-failed SU [#538]
>
>       When a component enters the TERM-FAILED presence state and all the
>       repair conditions on the SG and node are true, a node reboot request is
>       ordered. The comp presence state is also logged to SAF LOG.
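
The decision described in this changeset can be sketched roughly as follows. This is a simplified C++ illustration, not the amfd implementation: the struct, enum and function names are hypothetical, and it assumes that "all the repair conditions" means the three attributes set in test case 1 (saAmfSGAutoRepair, saAmfNodeAutoRepair and saAmfNodeFailfastOnTerminationFailure).

```cpp
#include <cassert>

// Hypothetical, simplified view of the repair attributes (assumption: these
// three attributes are the "repair conditions" on the SG and the node).
struct RepairConfig {
  bool sg_auto_repair;                 // saAmfSGAutoRepair
  bool node_auto_repair;               // saAmfNodeAutoRepair
  bool node_failfast_on_term_failure;  // saAmfNodeFailfastOnTerminationFailure
};

enum class CompPresence { kInstantiated, kRestarting, kTerminationFailed };

enum class TermFailedAction { kNone, kOrderNodeReboot, kAwaitManualRepair };

// Returns the action to take when a component reports a presence state.
TermFailedAction decide_repair(CompPresence state, const RepairConfig& cfg) {
  if (state != CompPresence::kTerminationFailed) {
    return TermFailedAction::kNone;
  }
  if (cfg.sg_auto_repair && cfg.node_auto_repair &&
      cfg.node_failfast_on_term_failure) {
    return TermFailedAction::kOrderNodeReboot;
  }
  // Any condition false: no reboot; the SU waits for an admin repair.
  return TermFailedAction::kAwaitManualRepair;
}

// Small self-check mirroring test cases 1 and 2 of this review.
inline void check_decide_repair() {
  assert(decide_repair(CompPresence::kTerminationFailed, {true, true, true}) ==
         TermFailedAction::kOrderNodeReboot);
  assert(decide_repair(CompPresence::kTerminationFailed, {false, true, true}) ==
         TermFailedAction::kAwaitManualRepair);
}
```
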
>
> changeset 785f74ff482ef8e6f644f95cd1064b2d22a86ab1
> Author:       Hans Feldt <[email protected]>
> Date: Fri, 28 Feb 2014 08:24:08 +0100
>
>       amfnd: correct term-failed behaviour [#538]
>
>       Problem: possible split brain on application level and spec violation.
>
>       Analysis: The AMF node director requests a comp/SU failover from the AMF
>       director even though a comp is in the TERM-FAILED presence state.
>
>       Change: Correct this behavior and just disable the SU and let the AMF
>       director handle possible node reboot or manual repair.
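
A minimal C++ sketch of the corrected node director behaviour described in this changeset, under stated assumptions: the SuRecord type and handle_comp_term_failed function are hypothetical and only model the outcome (SU disabled, no failover requested). They do not reflect the real amfnd data structures.

```cpp
#include <cassert>

enum class OperState { kEnabled, kDisabled };

// Hypothetical, simplified SU record as seen by the node director.
struct SuRecord {
  OperState oper_state = OperState::kEnabled;
  // True if a comp/SU failover was requested from the AMF director.
  bool failover_requested = false;
};

// Corrected behaviour: only disable the SU. A component that failed to
// terminate may still be running, so requesting a failover here could lead
// to the application-level split brain named in the problem statement.
void handle_comp_term_failed(SuRecord& su) {
  su.oper_state = OperState::kDisabled;
  // Deliberately no failover request; the AMF director decides between a
  // node reboot and waiting for manual repair.
}

// Small self-check that doubles as a usage example.
inline void check_handle_comp_term_failed() {
  SuRecord su;
  handle_comp_term_failed(su);
  assert(su.oper_state == OperState::kDisabled);
  assert(!su.failover_requested);
}
```
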
>
> changeset f56cac35542db8d592e48c758269bb5418aced38
> Author:       Hans Feldt <[email protected]>
> Date: Fri, 28 Feb 2014 08:35:29 +0100
>
>       amfd: auto clear comp cleanup failed alarm [#538]
>
>
> Complete diffstat:
> ------------------
>   osaf/services/saf/amf/amfd/comp.cc        |  44 +++++++++++++++++++++++++++++++++++++-------
>   osaf/services/saf/amf/amfd/include/util.h |   2 ++
>   osaf/services/saf/amf/amfd/node.cc        |  27 +++++++++++++++++++++++++++
>   osaf/services/saf/amf/amfd/sg.cc          |   4 ++++
>   osaf/services/saf/amf/amfd/sgproc.cc      |  38 --------------------------------------
>   osaf/services/saf/amf/amfd/sgtype.cc      |   6 ++++++
>   osaf/services/saf/amf/amfd/util.cc        |  38 ++++++++++++++++++++++++++++++++++++++
>   osaf/services/saf/amf/amfnd/clc.cc        |   3 +--
>   osaf/services/saf/amf/amfnd/su.cc         |   1 -
>   osaf/services/saf/amf/amfnd/susm.cc       |  45 +++++++--------------------------------------
>   10 files changed, 122 insertions(+), 86 deletions(-)
>
>
> Testing Commands:
> -----------------
>
> Case 1:
> ============
>   2-node cluster running the AMF demo; the following script is run on SC1
>   (active SC and active demo):
>
> immcfg -f AppConfig-2N.xml
> amf-adm unlock-in safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
> amf-adm unlock-in safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
> amf-adm unlock safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
> amf-adm unlock safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
> sleep 2
>
> immcfg -a saAmfSGAutoRepair=1 safSg=AmfDemo,safApp=AmfDemo1
> immcfg -a saAmfNodeAutoRepair=1 safAmfNode=SC-1,safAmfCluster=myAmfCluster
> immcfg -a saAmfNodeFailfastOnTerminationFailure=1 safAmfNode=SC-1,safAmfCluster=myAmfCluster
> immcfg -a saAmfNodeAutoRepair=1 safAmfNode=SC-2,safAmfCluster=myAmfCluster
> immcfg -a saAmfNodeFailfastOnTerminationFailure=1 safAmfNode=SC-2,safAmfCluster=myAmfCluster
>
> pkill demo
>
> Case 2:
> ===========
> Same as case 1, but with saAmfSGAutoRepair=0 and an admin repair of the SU.
>
>
> Testing, Expected Results:
> --------------------------
>
> Case 1:
> ===============
> SC1 rebooted
> demo failed over to SC2
> "component cleanup failed" alarm raised and cleared
> New SAF LOGs to visualize important changes:
>
>          80 08:29:56 02/28/2014 NO safApp=safAmfService "CCB 3 Modified safSg=AmfDemo,safApp=AmfDemo1
>          81 08:29:56 02/28/2014 NO safApp=safAmfService "safSg=AmfDemo,safApp=AmfDemo1 saAmfSGAutoRepair changed to 1
>          82 08:29:56 02/28/2014 NO safApp=safAmfService "CCB 4 Modified safAmfNode=SC-1,safAmfCluster=myAmfCluster
>          83 08:29:56 02/28/2014 NO safApp=safAmfService "safAmfNode=SC-1,safAmfCluster=myAmfCluster saAmfNodeAutoRepair changed to 1
>          84 08:29:56 02/28/2014 NO safApp=safAmfService "CCB 5 Modified safAmfNode=SC-1,safAmfCluster=myAmfCluster
>          85 08:29:56 02/28/2014 NO safApp=safAmfService "safAmfNode=SC-1,safAmfCluster=myAmfCluster saAmfNodeFailfastOnTerminationFailure changed to 1
>          86 08:29:57 02/28/2014 NO safApp=safAmfService "CCB 6 Modified safAmfNode=SC-2,safAmfCluster=myAmfCluster
>          87 08:29:57 02/28/2014 NO safApp=safAmfService "safAmfNode=SC-2,safAmfCluster=myAmfCluster saAmfNodeAutoRepair changed to 1
>          88 08:29:57 02/28/2014 NO safApp=safAmfService "CCB 7 Modified safAmfNode=SC-2,safAmfCluster=myAmfCluster
>          89 08:29:57 02/28/2014 NO safApp=safAmfService "safAmfNode=SC-2,safAmfCluster=myAmfCluster saAmfNodeFailfastOnTerminationFailure changed to 1
>          90 08:29:57 02/28/2014 NO safApp=safAmfService "safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1 PresenceState RESTARTING => TERMINATION_FAILED
>          91 08:29:57 02/28/2014 NO safApp=safAmfService "Ordering reboot of 'safAmfNode=SC-1,safAmfCluster=myAmfCluster' as repair action
>
>
> Case 2:
> =================
>
> Node not rebooted (as expected); repair does not fully work (yet):
>
> Feb 28 08:45:41 SC-1 local0.notice osafimmnd[382]: NO Ccb 6 COMMITTED (immcfg_SC-1_663)
> Feb 28 08:45:41 SC-1 user.notice amf_demo[638]: exiting (caught term signal)
> Feb 28 08:45:41 SC-1 local0.notice osafamfnd[447]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'avaDown' : Recovery is 'componentRestart'
> Feb 28 08:45:41 SC-1 local0.notice osafamfnd[447]: NO Cleanup of 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' failed
> Feb 28 08:45:41 SC-1 local0.notice osafamfnd[447]: NO Reason:'Exec of script success, but script exits with non-zero status'
> Feb 28 08:45:41 SC-1 local0.notice osafamfnd[447]: NO Exit code: 1
> Feb 28 08:45:41 SC-1 local0.warn osafamfnd[447]: WA 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State RESTARTING => TERMINATION_FAILED
> Feb 28 08:45:41 SC-1 local0.notice osafamfnd[447]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATION_FAILED
> Feb 28 08:45:43 SC-1 local0.notice osafamfnd[447]: NO Repair request for 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> Feb 28 08:45:43 SC-1 local0.notice osafamfnd[447]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State TERMINATION_FAILED => UNINSTANTIATED
>
> The SU stays uninstantiated, yet enabled, even though the repair succeeds:
>
>          88 08:45:41 02/28/2014 NO safApp=safAmfService "safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1 PresenceState RESTARTING => TERMINATION_FAILED
>          89 08:45:41 02/28/2014 NO safApp=safAmfService "safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1 OperState ENABLED => DISABLED
>          90 08:45:43 02/28/2014 NO safApp=safAmfService "Admin op "REPAIRED" initiated for 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', invocation: 73014444033
>          91 08:45:43 02/28/2014 NO safApp=safAmfService "safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1 PresenceState TERMINATION_FAILED => UNINSTANTIATED
>          92 08:45:43 02/28/2014 NO safApp=safAmfService "Admin op done for invocation: 73014444033, result 1
>          93 08:45:43 02/28/2014 NO safApp=safAmfService "safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1 OperState DISABLED => ENABLED
>
>
> Conditions of Submission:
> -------------------------
>   Ack from reviewers
>
>
> Arch      Built     Started    Linux distro
> -------------------------------------------
> mips        n          n
> mips64      n          n
> x86         n          n
> x86_64      y          y
> powerpc     n          n
> powerpc64   n          n
>
>
> Reviewer Checklist:
> -------------------
> [Submitters: make sure that your review doesn't trigger any checkmarks!]
>
>
> Your checkin has not passed review because (see checked entries):
>
> ___ Your RR template is generally incomplete; it has too many blank entries
>      that need proper data filled in.
>
> ___ You have failed to nominate the proper persons for review and push.
>
> ___ Your patches do not have proper short+long header
>
> ___ You have grammar/spelling in your header that is unacceptable.
>
> ___ You have exceeded a sensible line length in your headers/comments/text.
>
> ___ You have failed to put in a proper Trac Ticket # into your commits.
>
> ___ You have incorrectly put/left internal data in your comments/files
>      (i.e. internal bug tracking tool IDs, product names etc)
>
> ___ You have not given any evidence of testing beyond basic build tests.
>      Demonstrate some level of runtime or other sanity testing.
>
> ___ You have ^M present in some of your files. These have to be removed.
>
> ___ You have needlessly changed whitespace or added whitespace crimes
>      like trailing spaces, or spaces before tabs.
>
> ___ You have mixed real technical changes with whitespace and other
>      cosmetic code cleanup changes. These have to be separate commits.
>
> ___ You need to refactor your submission into logical chunks; there is
>      too much content in a single commit.
>
> ___ You have extraneous garbage in your review (merge commits etc)
>
> ___ You have giant attachments which should never have been sent;
>      Instead you should place your content in a public tree to be pulled.
>
> ___ You have too many commits attached to an e-mail; resend as threaded
>      commits, or place in a public tree for a pull.
>
> ___ You have resent this content multiple times without a clear indication
>      of what has changed between each re-send.
>
> ___ You have failed to adequately and individually address all of the
>      comments and change requests that were proposed in the initial review.
>
> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
>
> ___ Your computer has a badly configured date and time, confusing the
>      threaded patch review.
>
> ___ Your changes affect the IPC mechanism, and you don't present any results
>      for an in-service upgradability test.
>
> ___ Your changes affect the user manual and documentation, but your patch
>      series does not contain the patch that updates the Doxygen manual.
>
>
> _______________________________________________
> Opensaf-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
>
