Hi Anders, On 1/29/2016 1:02 PM, Nhat Pham wrote: > Hi Mahesh, > > As described in the README, the CKPT service returns SA_AIS_ERR_TRY_AGAIN > fault code in this case. > I guess it's same for other services. > > @Anders: Could you please confirm this? Please clarify on the SAF API`s behavior , that doesn't dependent on Director to provide the services like Initialize handle (saCkptInitialize()) ect ..
-AVM > > Best regards, > Nhat Pham > > -----Original Message----- > From: A V Mahesh [mailto:[email protected]] > Sent: Friday, January 29, 2016 2:11 PM > To: Nhat Pham <[email protected]>; [email protected] > Cc: [email protected] > Subject: Re: [PATCH 0 of 1] Review Request for cpsv: Support preserving and > recovering checkpoint replicas during headless state V2 [#1621] > > Hi, > > On 1/29/2016 11:45 AM, Nhat Pham wrote: >> - The behavior of application will be consistent with other saf >> services like imm/amf behavior during headless state. >> [Nhat] I'm not clear what you mean about "consistent"? > In the obscene of Director (SC's) , what is expected return values of SAF API > should ( all services ) , > which are not in aposition to provide service at that moment. > > I think all services should return same SAF ERRS., I thinks currently we > don't have it , may be Anders Widel will help us. > > -AVM > > > On 1/29/2016 11:45 AM, Nhat Pham wrote: >> Hi Mahesh, >> >> Please see the attachment for the README. Let me know if there is any >> more information required. >> >> Regarding your comments: >> - during headless state applications may behave like during CPND >> restart case [Nhat] Headless state and CPND restart are different >> events. Thus, the behavior is different. >> Headless state is a case where both SCs go down. >> >> - The behavior of application will be consistent with other saf >> services like imm/amf behavior during headless state. >> [Nhat] I'm not clear what you mean about "consistent"? >> >> Best regards, >> Nhat Pham >> >> -----Original Message----- >> From: A V Mahesh [mailto:[email protected]] >> Sent: Friday, January 29, 2016 11:12 AM >> To: Nhat Pham <[email protected]>; [email protected] >> Cc: [email protected] >> Subject: Re: [PATCH 0 of 1] Review Request for cpsv: Support >> preserving and recovering checkpoint replicas during headless state V2 >> [#1621] >> >> Hi Nhat Pham, >> >> I stared reviewing this patch , so can please provide README file >> with scope and limitations , that will help to define testing/reviewing >> scope . >> >> Following are minimum things we can keep in mind while >> reviewing/accepting patch , >> >> - Not effecting existing functionality >> - during headless state applications may behave like during CPND >> restart case >> - The minimum functionally of application works >> - The behavior of application will be consistent with >> other saf services like imm/amf behavior during headless state. >> >> So please do provide any additional detailed in README if any of the >> above is deviated , that allow users to know about the >> limitations/deviation. >> >> -AVM >> >> On 1/4/2016 3:15 PM, Nhat Pham wrote: >>> Summary: cpsv: Support preserving and recovering checkpoint replicas >>> during headless state [#1621] Review request for Trac Ticket(s): >>> #1621 Peer Reviewer(s): [email protected]; >>> [email protected] Pull request to: [email protected] >>> Affected branch(es): default Development branch: default >>> >>> -------------------------------- >>> Impacted area Impact y/n >>> -------------------------------- >>> Docs n >>> Build system n >>> RPM/packaging n >>> Configuration files n >>> Startup scripts n >>> SAF services y >>> OpenSAF services n >>> Core libraries n >>> Samples n >>> Tests n >>> Other n >>> >>> >>> Comments (indicate scope for each "y" above): >>> --------------------------------------------- >>> >>> changeset faec4a4445a4c23e8f630857b19aabb43b5af18d >>> Author: Nhat Pham <[email protected]> >>> Date: Mon, 04 Jan 2016 16:34:33 +0700 >>> >>> cpsv: Support preserving and recovering checkpoint replicas during >>> headless state [#1621] >>> >>> Background: >>> ---------- This enhancement supports to preserve checkpoint replicas >> in case >>> both SCs down (headless state) and recover replicas in case one of >> SCs up >>> again. If both SCs goes down, checkpoint replicas on surviving nodes >> still >>> remain. When a SC is available again, surviving replicas are >> automatically >>> registered to the SC checkpoint database. Content in surviving >> replicas are >>> intacted and synchronized to new replicas. >>> >>> When no SC is available, client API calls changing checkpoint >> configuration >>> which requires SC communication, are rejected. Client API calls >> reading and >>> writing existing checkpoint replicas still work. >>> >>> Limitation: The CKPT service does not support recovering checkpoints >> in >>> following cases: >>> - The checkpoint which is unlinked before headless. >>> - The non-collocated checkpoint has active replica locating on SC. >>> - The non-collocated checkpoint has active replica locating on a PL >> and this >>> PL restarts during headless state. In this cases, the checkpoint >> replica is >>> destroyed. The fault code SA_AIS_ERR_BAD_HANDLE is returned when the >> client >>> accesses the checkpoint in these cases. The client must re-open the >>> checkpoint. >>> >>> While in headless state, accessing checkpoint replicas does not work >> if the >>> node which hosts the active replica goes down. It will back working >> when a >>> SC available again. >>> >>> Solution: >>> --------- The solution for this enhancement includes 2 parts: >>> >>> 1. To destroy un-recoverable checkpoint described above when both >> SCs are >>> down: When both SCs are down, the CPND deletes un-recoverable >> checkpoint >>> nodes and replicas on PLs. Then it requests CPA to destroy >> corresponding >>> checkpoint node by using new message CPA_EVT_ND2A_CKPT_DESTROY >>> >>> 2. To update CPD with checkpoint information When an active SC is up >> after >>> headless, CPND will update CPD with checkpoint information by using >> new >>> message CPD_EVT_ND2D_CKPT_INFO_UPDATE instead of using >>> CPD_EVT_ND2D_CKPT_CREATE. This is because the CPND will create new >> ckpt_id >>> for the checkpoint which might be different with the current ckpt id >> if the >>> CPD_EVT_ND2D_CKPT_CREATE is used. The CPD collects checkpoint >> information >>> within 6s. During this updating time, following requests is rejected >> with >>> fault code SA_AIS_ERR_TRY_AGAIN: >>> - CPD_EVT_ND2D_CKPT_CREATE >>> - CPD_EVT_ND2D_CKPT_UNLINK >>> - CPD_EVT_ND2D_ACTIVE_SET >>> - CPD_EVT_ND2D_CKPT_RDSET >>> >>> >>> Complete diffstat: >>> ------------------ >>> osaf/libs/agents/saf/cpa/cpa_proc.c | 52 >> +++++++++++++++++++++++++++++++++++ >>> osaf/libs/common/cpsv/cpsv_edu.c | 43 >> +++++++++++++++++++++++++++++ >>> osaf/libs/common/cpsv/include/cpd_cb.h | 3 ++ >>> osaf/libs/common/cpsv/include/cpd_imm.h | 1 + >>> osaf/libs/common/cpsv/include/cpd_proc.h | 7 ++++ >>> osaf/libs/common/cpsv/include/cpd_tmr.h | 3 +- >>> osaf/libs/common/cpsv/include/cpnd_cb.h | 1 + >>> osaf/libs/common/cpsv/include/cpnd_init.h | 2 + >>> osaf/libs/common/cpsv/include/cpsv_evt.h | 20 +++++++++++++ >>> osaf/services/saf/cpsv/cpd/Makefile.am | 3 +- >>> osaf/services/saf/cpsv/cpd/cpd_evt.c | 229 >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> ++++ >>> osaf/services/saf/cpsv/cpd/cpd_imm.c | 112 >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>> osaf/services/saf/cpsv/cpd/cpd_init.c | 20 ++++++++++++- >>> osaf/services/saf/cpsv/cpd/cpd_proc.c | 309 >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>> osaf/services/saf/cpsv/cpd/cpd_tmr.c | 7 ++++ >>> osaf/services/saf/cpsv/cpnd/cpnd_db.c | 16 ++++++++++ >>> osaf/services/saf/cpsv/cpnd/cpnd_evt.c | 22 +++++++++++++++ >>> osaf/services/saf/cpsv/cpnd/cpnd_init.c | 23 ++++++++++++++- >>> osaf/services/saf/cpsv/cpnd/cpnd_mds.c | 13 ++++++++ >>> osaf/services/saf/cpsv/cpnd/cpnd_proc.c | 314 >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--- >>> 20 files changed, 1189 insertions(+), 11 deletions(-) >>> >>> >>> Testing Commands: >>> ----------------- >>> - >>> >>> Testing, Expected Results: >>> -------------------------- >>> - >>> >>> >>> Conditions of Submission: >>> ------------------------- >>> <<HOW MANY DAYS BEFORE PUSHING, CONSENSUS ETC>> >>> >>> >>> Arch Built Started Linux distro >>> ------------------------------------------- >>> mips n n >>> mips64 n n >>> x86 n n >>> x86_64 n n >>> powerpc n n >>> powerpc64 n n >>> >>> >>> Reviewer Checklist: >>> ------------------- >>> [Submitters: make sure that your review doesn't trigger any >>> checkmarks!] >>> >>> >>> Your checkin has not passed review because (see checked entries): >>> >>> ___ Your RR template is generally incomplete; it has too many blank >> entries >>> that need proper data filled in. >>> >>> ___ You have failed to nominate the proper persons for review and push. >>> >>> ___ Your patches do not have proper short+long header >>> >>> ___ You have grammar/spelling in your header that is unacceptable. >>> >>> ___ You have exceeded a sensible line length in your >> headers/comments/text. >>> ___ You have failed to put in a proper Trac Ticket # into your commits. >>> >>> ___ You have incorrectly put/left internal data in your comments/files >>> (i.e. internal bug tracking tool IDs, product names etc) >>> >>> ___ You have not given any evidence of testing beyond basic build tests. >>> Demonstrate some level of runtime or other sanity testing. >>> >>> ___ You have ^M present in some of your files. These have to be removed. >>> >>> ___ You have needlessly changed whitespace or added whitespace crimes >>> like trailing spaces, or spaces before tabs. >>> >>> ___ You have mixed real technical changes with whitespace and other >>> cosmetic code cleanup changes. These have to be separate commits. >>> >>> ___ You need to refactor your submission into logical chunks; there is >>> too much content into a single commit. >>> >>> ___ You have extraneous garbage in your review (merge commits etc) >>> >>> ___ You have giant attachments which should never have been sent; >>> Instead you should place your content in a public tree to be pulled. >>> >>> ___ You have too many commits attached to an e-mail; resend as threaded >>> commits, or place in a public tree for a pull. >>> >>> ___ You have resent this content multiple times without a clear indication >>> of what has changed between each re-send. >>> >>> ___ You have failed to adequately and individually address all of the >>> comments and change requests that were proposed in the initial >> review. >>> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc) >>> >>> ___ Your computer have a badly configured date and time; confusing the >>> the threaded patch review. >>> >>> ___ Your changes affect IPC mechanism, and you don't present any results >>> for in-service upgradability test. >>> >>> ___ Your changes affect user manual and documentation, your patch series >>> do not contain the patch that updates the Doxygen manual. >>> > ------------------------------------------------------------------------------ Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140 _______________________________________________ Opensaf-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/opensaf-devel
