Re: [devel] [PATCH 0 of 1] Review Request for amfd: update RT objects before node-failover of active controller [#494].

Hans Feldt Wed, 07 May 2014 01:59:17 -0700

Agree. amfd/imm.cc contains avd_imm_update_runtime_attrs(), which should be
called on the new active. It will sync the IMM attributes with AMF's view.
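
A minimal sketch of the idea, not the actual avd_imm_update_runtime_attrs()
code: the new active walks AMF's own state and rewrites any cached IMM
attribute that disagrees with it. SuRuntimeState, RuntimeView and
resync_runtime_attrs are hypothetical names used only for illustration, and
the integer values mirror the state codes in the test output quoted below.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for AMF's in-memory view of one SU's runtime state.
struct SuRuntimeState {
  int oper_state;       // 1 = ENABLED, 2 = DISABLED
  int readiness_state;  // 1 = OUT-OF-SERVICE, 2 = IN-SERVICE

  bool operator==(const SuRuntimeState& other) const {
    return oper_state == other.oper_state &&
           readiness_state == other.readiness_state;
  }
};

// Hypothetical map from object DN to runtime state. It is used both for
// AMF's authoritative view and for the cached copy held in IMM.
using RuntimeView = std::map<std::string, SuRuntimeState>;

// Rewrite every cached entry that is missing or differs from AMF's view.
// Returns the number of entries that had to be rewritten.
int resync_runtime_attrs(const RuntimeView& amf_view, RuntimeView& imm_cache) {
  int updated = 0;
  for (const auto& entry : amf_view) {
    auto it = imm_cache.find(entry.first);
    if (it == imm_cache.end() || !(it->second == entry.second)) {
      imm_cache[entry.first] = entry.second;
      ++updated;
    }
    // Postcondition: the cache now agrees with AMF for this object.
    assert(imm_cache.at(entry.first) == entry.second);
  }
  return updated;
}
```

Because the resync starts from AMF's state rather than from the old active's
job queue, it also covers the failover cases where the old active never got
a chance to flush anything.
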
/Hans


> -----Original Message-----
> From: Anders Björnerstedt
> Sent: 7 May 2014 10:44
> To: [email protected]; Hans Feldt; [email protected]
> Cc: [email protected]
> Subject: RE: [devel] [PATCH 0 of 1] Review Request for amfd: update RT 
> objects before node-failover of active controller [#494].
> 
> Hi Praveen
> 
> I normally don't get involved in AMF patch reviews, but this ticket and the
> fix caught my attention.
> There is a general issue that bothers me about the approach, if I have not
> misunderstood it.
> 
> I understand this is a node failover of active controller.
> That is inherently an event that is not fully under control.
> It is also an event that really is time critical.
> A failover may occur in several ways.
> 
> Here it seems that one kind of failover is "semi-controllable", and the old
> active is in essence trying to "clean up" its backlog in a job queue before
> it triggers the failover.
> 
> There will be other failover cases, such as a crash of the IMMD, where it
> will not be able to do this. So any cleanup (if necessary) must anyway be
> covered by the new active.
> 
> In addition, updates to cached runtime data are a secondary duty of the AMF.
> Cached runtime data is CACHED and is not absolutely obligated to reflect the
> original state (which is in the AMF) in real time. So updates of cached
> runtime data should not really be a reason for delaying a failover.
> 
> /AndersBj
> 
> 
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: 7 May 2014 10:26
> To: Hans Feldt; [email protected]
> Cc: [email protected]
> Subject: [devel] [PATCH 0 of 1] Review Request for amfd: update RT objects 
> before node-failover of active controller [#494].
> 
> Summary: amfd: update RT objects before node-failover of active controller
> [#494].
> Review request for Trac Ticket(s): #494 (its duplicates #853 and #858)
> Peer Reviewer(s): Hans F., Nagendra.
> Pull request to: <<LIST THE PERSON WITH PUSH ACCESS HERE>>
> Affected branch(es): All
> Development branch: <<IF ANY GIVE THE REPO URL>>
> 
> --------------------------------
> Impacted area       Impact y/n
> --------------------------------
>  Docs                    n
>  Build system            n
>  RPM/packaging           n
>  Configuration files     n
>  Startup scripts         n
>  SAF services            n
>  OpenSAF services        y
>  Core libraries          n
>  Samples                 n
>  Tests                   n
>  Other                   n
> 
> 
> Comments (indicate scope for each "y" above):
> ---------------------------------------------
> Please see the analysis of the tickets and the commit log below.
> 
> changeset bcf6eda79102f83c6940d75dd13073a9130026d0
> Author:       [email protected]
> Date: Wed, 07 May 2014 13:43:33 +0530
> 
>       amfd: update RT objects before node-failover of active controller
>       [#494].
> 
>       Problem: Runtime objects and attributes are not updated when
>       node-failover gets escalated for the active controller and the
>       standby controller takes the active role.
> 
>       Reason: Activities related to updating runtime objects and certain
>       attributes in IMM are given low priority and are pushed onto a job
>       queue by AMF. These jobs are completed when AMF is not busy with any
>       other high-priority activity. When node-failover is escalated, AMFD
>       sends a reboot message to AMFND to reboot the node. If node-failover
>       is escalated for the active controller, it sends the reboot message to
>       AMFND, which reboots the controller. In that case, some IMM-related
>       activities in the job queue remain uncompleted. All such activities
>       should be completed before rebooting the active controller when
>       node-failover is escalated for it.
> 
>       Fix: Finish all IMM-related jobs before sending the reboot message to
>       AMFND when node-failover is escalated for the active controller.
> 
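
The fix described in the commit log could be sketched roughly as follows.
This is an illustration under assumptions, not the sgproc.cc change itself:
ImmJobQueue and escalate_node_failover are hypothetical names, and the reboot
message to AMFND is modelled as a plain log entry.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical low-priority queue of pending IMM updates.
class ImmJobQueue {
 public:
  void push(std::function<void()> job) { jobs_.push_back(std::move(job)); }

  // Run every pending job in FIFO order; returns the number executed.
  std::size_t drain() {
    std::size_t done = 0;
    while (!jobs_.empty()) {
      std::function<void()> job = std::move(jobs_.front());
      jobs_.pop_front();
      job();
      ++done;
    }
    return done;
  }

  bool empty() const { return jobs_.empty(); }

 private:
  std::deque<std::function<void()>> jobs_;
};

// Hypothetical failover step for the active controller: flush the pending
// IMM updates first, then record the reboot request in `log`.
void escalate_node_failover(ImmJobQueue& queue, std::vector<std::string>& log) {
  queue.drain();
  assert(queue.empty());  // nothing may be left behind before the reboot
  log.push_back("REBOOT");
}
```

An unbounded drain like this is what delays the reboot message, which is the
concern raised in this thread. It also does nothing for failovers where the
old active cannot run the drain at all.
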
> 
> Complete diffstat:
> ------------------
>  osaf/services/saf/amf/amfd/sgproc.cc |  6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> 
> Testing Commands:
> -----------------
> Tested the duplicate bug #858.
> This is easy to reproduce.
> After reproducing, observed the states:
> safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
>         saAmfSUAdminState=UNLOCKED(1)
>         saAmfSUOperState=ENABLED(1)
>         saAmfSUPresenceState=UNINSTANTIATED(1)
>         saAmfSUReadinessState=IN-SERVICE(2)
> 
> 
> Testing, Expected Results:
> --------------------------
> Pass. Observed the states:
> safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
>         saAmfSUAdminState=UNLOCKED(1)
>         saAmfSUOperState=DISABLED(2)
>         saAmfSUPresenceState=UNINSTANTIATED(1)
>         saAmfSUReadinessState=OUT-OF-SERVICE(1)
> AMFD logs:
> May  7 12:05:47.624746 osafamfd [26472:imm.cc:0143] >> exec: Update 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' saAmfSUReadinessState
> May  7 12:05:47.624799 osafamfd [26472:imma_oi_api.c:2270] >> saImmOiRtObjectUpdate_2
> May  7 12:05:47.626863 osafamfd [26472:mds_dt_trans.c:0671] >> mdtm_process_poll_recv_data_tcp
> May  7 12:05:47.627392 osafamfd [26472:imma_oi_api.c:2554] << saImmOiRtObjectUpdate_2
> May  7 12:05:47.627419 osafamfd [26472:imm.cc:0172] << exec
> 
> May  7 12:05:47.634134 osafamfd [26472:util.cc:1681] TR Sending REBOOT MSG to 2010f
> May  7 12:05:47.634372 osafamfd [26472:sgproc.cc:0715] << avd_su_oper_state_evh
> 
> 
> 
> Conditions of Submission:
> -------------------------
> Ack from one of the reviewers.
> 
> Arch      Built     Started    Linux distro
> -------------------------------------------
> mips        n          n
> mips64      n          n
> x86         n          n
> x86_64      y          y
> powerpc     n          n
> powerpc64   n          n
> 
> 
> Reviewer Checklist:
> -------------------
> [Submitters: make sure that your review doesn't trigger any checkmarks!]
> 
> 
> Your checkin has not passed review because (see checked entries):
> 
> ___ Your RR template is generally incomplete; it has too many blank entries
>     that need proper data filled in.
> 
> ___ You have failed to nominate the proper persons for review and push.
> 
> ___ Your patches do not have proper short+long header
> 
> ___ You have grammar/spelling in your header that is unacceptable.
> 
> ___ You have exceeded a sensible line length in your headers/comments/text.
> 
> ___ You have failed to put in a proper Trac Ticket # into your commits.
> 
> ___ You have incorrectly put/left internal data in your comments/files
>     (i.e. internal bug tracking tool IDs, product names etc)
> 
> ___ You have not given any evidence of testing beyond basic build tests.
>     Demonstrate some level of runtime or other sanity testing.
> 
> ___ You have ^M present in some of your files. These have to be removed.
> 
> ___ You have needlessly changed whitespace or added whitespace crimes
>     like trailing spaces, or spaces before tabs.
> 
> ___ You have mixed real technical changes with whitespace and other
>     cosmetic code cleanup changes. These have to be separate commits.
> 
> ___ You need to refactor your submission into logical chunks; there is
>     too much content into a single commit.
> 
> ___ You have extraneous garbage in your review (merge commits etc)
> 
> ___ You have giant attachments which should never have been sent;
>     Instead you should place your content in a public tree to be pulled.
> 
> ___ You have too many commits attached to an e-mail; resend as threaded
>     commits, or place in a public tree for a pull.
> 
> ___ You have resent this content multiple times without a clear indication
>     of what has changed between each re-send.
> 
> ___ You have failed to adequately and individually address all of the
>     comments and change requests that were proposed in the initial review.
> 
> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
> 
> ___ Your computer has a badly configured date and time, confusing
>     the threaded patch review.
> 
> ___ Your changes affect IPC mechanism, and you don't present any results
>     for in-service upgradability test.
> 
> ___ Your changes affect user manual and documentation, your patch series
>     do not contain the patch that updates the Doxygen manual.
> 
> 
> ------------------------------------------------------------------------------
> Is your legacy SCM system holding you back? Join Perforce May 7 to find out:
> * 3 signs your SCM is hindering your productivity
> * Requirements for releasing software faster
> * Expert tips and advice for migrating your SCM now
> http://p.sf.net/sfu/perforce
> _______________________________________________
> Opensaf-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
