Summary: amfnd: do not repair su without AMFD request in su-failover escalation [#1863]
Review request for Trac Ticket(s): #1863
Peer Reviewer(s): AMF devs
Pull request to: <<LIST THE PERSON WITH PUSH ACCESS HERE>>
Affected branch(es): ALL
Development branch: <<IF ANY GIVE THE REPO URL>>

--------------------------------
Impacted area       Impact y/n
--------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            y
 OpenSAF services        n
 Core libraries          n
 Samples                 n
 Tests                   n
 Other                   n

Comments (indicate scope for each "y" above):
---------------------------------------------

changeset 7ad74af914afb1d60a3dba84f5c5260b8d8103ab
Author: [email protected]
Date:   Tue, 14 Jun 2016 14:23:01 +0530

	amfnd: do not repair su without AMFD request in su-failover escalation [#1863]

	In the reported problem, a lock operation on an SU timed out when a
	quiesced comp faulted with su-failover recovery.

	AMFND calls avnd_err_su_repair() to repair the SU while su-failover
	recovery is in progress. When a quiesced comp faults with su-failover
	recovery, AMFND launches cleanup of the components. In the meantime,
	AMFND receives removal of assignments and, as part of oper done, it
	deletes the SUSI and calls avnd_err_su_repair(). Inside this function
	AMFND tries to instantiate UNINSTANTIATED comps. No component is
	instantiated, as they are in TERMINATING state, but the SU_FAILOVER
	flag is reset inside this function. Since AMFND clears the flag, it
	loses the context of su-failover escalation. When the first comp is
	cleaned up, AMFND instantiates it, and thus the condition for informing
	AMFD about su-failover escalation (all components terminated) is not
	met. Because of this, AMFD never responds to the lock operation and it
	times out.

	As part of the fix, AMFND does not call avnd_err_su_repair() during
	su-failover escalation and also does not reset the SU_FAILOVER flag for
	comp-failover recovery inside this function.


Complete diffstat:
------------------
 osaf/services/saf/amf/amfnd/clc.cc  |  5 ++++-
 osaf/services/saf/amf/amfnd/err.cc  |  2 --
 osaf/services/saf/amf/amfnd/susm.cc |  5 ++++-
 3 files changed, 8 insertions(+), 4 deletions(-)


Testing Commands:
-----------------
Tested as per ticket description for recovery policy comp-failover,
su-failover and node-switchover.

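
Illustrative sketch (not part of the patch):
--------------------------------------------
The sequence described in the changeset comment can be reproduced with a
small toy model. Everything here is an assumption made for illustration: the
types `Su` and `CompState` and the functions `repair`,
`on_assignments_removed`, `on_comp_cleaned` and `run_scenario` are
hypothetical stand-ins, not the actual amfnd code, and the `fixed` switch
only approximates the effect of the change.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: hypothetical names, not OpenSAF code.
enum class CompState { Instantiated, Terminating, Uninstantiated };

struct Su {
    bool su_failover = false;      // stands in for the SU_FAILOVER flag
    std::vector<CompState> comps;
};

// Stand-in for the repair step: instantiates UNINSTANTIATED comps.
// clear_flag models the pre-fix behaviour of resetting the flag here.
void repair(Su& su, bool clear_flag) {
    for (auto& c : su.comps) {
        if (c == CompState::Uninstantiated) c = CompState::Instantiated;
    }
    if (clear_flag) su.su_failover = false;
}

// Removal of assignments completes ("oper done").
void on_assignments_removed(Su& su, bool fixed) {
    if (fixed && su.su_failover) return;  // fix: skip repair during escalation
    repair(su, /*clear_flag=*/!fixed);
}

// Cleanup of component i finishes.
void on_comp_cleaned(Su& su, std::size_t i) {
    assert(i < su.comps.size());
    su.comps[i] = CompState::Uninstantiated;
    // Without the escalation context the comp is simply repaired again.
    if (!su.su_failover) repair(su, /*clear_flag=*/false);
}

bool all_terminated(const Su& su) {
    for (auto c : su.comps) {
        if (c != CompState::Uninstantiated) return false;
    }
    return true;
}

// Returns true if the "inform AMFD about su-failover" condition is met.
bool run_scenario(bool fixed) {
    Su su;
    su.su_failover = true;  // a quiesced comp faulted with su-failover
    su.comps = {CompState::Terminating, CompState::Terminating};
    on_assignments_removed(su, fixed);  // arrives while cleanup is running
    for (std::size_t i = 0; i < su.comps.size(); ++i) on_comp_cleaned(su, i);
    return su.su_failover && all_terminated(su);
}
```

In this model `run_scenario(false)` returns false, because the flag is lost
and the cleaned-up comps are re-instantiated, while `run_scenario(true)`
returns true, because the comps stay UNINSTANTIATED and the escalation
context is kept.
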
Testing, Expected Results:
--------------------------
The lock operation is responded to and all comps remain in UNINSTANTIATED
state for su-failover recovery.


Conditions of Submission:
-------------------------
Ack from any reviewer.


Arch      Built     Started    Linux distro
-------------------------------------------
mips        n          n
mips64      n          n
x86         n          n
x86_64      y          y
powerpc     n          n
powerpc64   n          n


Reviewer Checklist:
-------------------
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
    that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
    (i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
    Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
    like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
    cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
    too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
    Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
    commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
    of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
    comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)

___ Your computer has a badly configured date and time; confusing the
    threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
    for in-service upgradability test.

___ Your changes affect user manual and documentation, your patch series
    do not contain the patch that updates the Doxygen manual.

_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
