Hi Nhat, Tested ACK from me.
-AVM On 12/8/2015 7:31 AM, Nhat Pham wrote: > Hi Mahesh, > > The patch was updated with your comment. > Could you please help to run CPSV regression test? Thanks. > > Best regards, > Nhat Pham > > -----Original Message----- > From: A V Mahesh [mailto:[email protected]] > Sent: Monday, December 7, 2015 3:48 PM > To: Nhat Pham <[email protected]>; [email protected] > Cc: [email protected] > Subject: Re: [devel] [PATCH 1 of 1] cpsv: cpd broadcasts > CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615] > > Hi Nhat , > > I did code review following are my comment : > Optional : Not required to have function for this > `cpd_proc_broadcast_RDSET_STOP(() > you can do write the event construction . > > If you feel it will be useful in another context in future, just make all > lower case character. > > If you are not immediate requirement , I will complete CPSV regeneration > testing and provide you the result/observation , other wise ACK form me not > tested. > > -AVM > > On 12/7/2015 2:03 PM, Nhat Pham wrote: >> Hi Mahesh, >> >> Have you had chance to look into the patch? Thanks. >> >> Best regards, >> Nhat Pham >> >> -----Original Message----- >> From: Nhat Pham [mailto:[email protected]] >> Sent: Wednesday, December 2, 2015 4:48 PM >> To: 'A V Mahesh' <[email protected]>; [email protected] >> Cc: [email protected] >> Subject: Re: [devel] [PATCH 1 of 1] cpsv: cpd broadcasts >> CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615] >> >> Hi Mahesh, >> >> >> >> [AVM] If no process has the checkpoint open (S2 & S5 closed checkpoint ) >> when saCkptCheckpointUnlink() is >> invoked, the checkpoint is immediately deleted. >> >> I hope in your test case case after the S5 & before S6 no process >> has the checkpoint opened , so we need to fix this issue fist. this >> fix will fix IMM objects are not created issue. >> >> >> >> [Nhat Pham] Yes, I agree that the fix should solve the problem after >> S5. I'm working on that. >> >> The S6 was mentioned as a phenomenon because the user doesn't see any >> problem after S5. >> >> >> >> [AVM] I mean Open same checkpoint ( S5. Open checkpoint on PL-3 ) >> , while retention timer running we can reopen the checkpoint >> then it will stop retention timer. >> >> >> >> [Nhat Pham] I got that. The step description for #1615 is correct. >> >> >> >> BTW, could you please have a look at the patch for #1615 again? Thanks. >> >> >> >> Best regards, >> >> Nhat Pham >> >> >> >> From: A V Mahesh [mailto:[email protected]] >> Sent: Wednesday, December 2, 2015 4:08 PM >> To: Nhat Pham <[email protected]>; [email protected] >> Cc: [email protected] >> Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts >> CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615] >> >> >> >> Hi Nhat, >> >> Please see below for my comments tagged with [AVM] >> >> -AVM >> >> On 12/2/2015 12:33 PM, Nhat Pham wrote: >> >> Hi Mahesh, >> >> >> >> The ticket #1615 and 1616 report 2 different problems although the >> steps to reproduce the problem are quite similar. >> >> The problem scenarios are quite tricky I think. J >> >> >> >> Please see below for my comments. >> >> >> >> Best regards, >> >> Nhat Pham >> >> >> >> From: A V Mahesh [mailto:[email protected]] >> Sent: Wednesday, December 2, 2015 12:04 PM >> To: Nhat Pham <mailto:[email protected]> >> <[email protected]>; [email protected] >> <mailto:[email protected]> >> Cc: [email protected] >> <mailto:[email protected]> >> Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts >> CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615] >> >> >> >> Hi Nhat, >> >> On 12/2/2015 7:42 AM, Nhat Pham wrote: >> >> Problem: >>>>> -------- >>>>> A non-collocated checkpoint is firstly created on SC-2. Then the >>>>> checkpoint is closed on SC-2. >>>>> The CPD broadcasts CPND_EVT_D2ND_CKPT_RDSET with START to start >>>>> retention duration timer on CPND because there is no user. During >>>>> that time the checkpoint is opened again and using on PL-3. >>>>> After retention duration, the checkpoint is destroyed on both SC-1 >>>>> and SC-2. >> I interpreted `again using on PL-3` as the PL-3 was also opened the >> non-collocated checkpoint at fist time it self both SC-2 & PL-3 >> (opened/closed) in sequence >> so I thought PL3 CPND already has the database of checkpoint , and >> then we are trying to re-open the ckpt again on PL-3. >> >> In ticket #1616 reproducible step, it was mentioned PL-3 was also >> opened ckpt , so I carried out the same test >> and interpreted the #1615 & #1616 test cases are same except the >> UNLINK >> is called , so I was suggesting >> to address both together in one go with and with out Unlink called. >> >> Now I got it they are NOT related testcases. my current >> understanding test cases of of is Ticket #1616 & Ticket #1615 is as >> follows and you addressed >> #1615 in this patch , please confirm : >> >> ========================================== >> Ticket #1616 - Non-collocated ckpt >> >> S1. Create a checkpoint on SC-2 ( OpenFlags with >> SA_CKPT_CHECKPOINT_CREATE ) success >> S2. Close the checkpoint on SC-2 ( retention timer starts because of >> simple >> Close ) success >> S3. Open checkpoint on PL-3 success >> S4. Unlink the checkpoint on PL-3 success >> S5. Close the checkpoint on PL-3 success >> >> After this replicas will be deleted immediately on SC-1 & SC-2 ( no >> retention timer starts because of Unlink called ) , the subsequent >> checkpoint Create with same name on any node (PL-3/SC-2/SC-1) >> >> [Nhat Pham] Actually, the replicas on SC-1 and SC-2 are not deleted >> immediately after this step. This leads the problem after S6. >> >> [AVM] If no process has the checkpoint open (S2 & S5 closed checkpoint ) >> when saCkptCheckpointUnlink() is >> invoked, the checkpoint is immediately deleted. >> >> I hope in your test case case after the S5 & before S6 no process >> has the checkpoint opened , so we need to fix this issue fist. this >> fix will fix IMM objects are not created issue. >> >> >> >> [Nhat Pham] Yes, I agree that the fix should solve the problem after >> S5. I'm working on that. >> >> The S6 was mentioned as a phenomenon because the user doesn't see any >> problem after S5. >> >> >> >> >> >> >> S6. Create the checkpoint on PL-3 ( OpenFlags with >> SA_CKPT_CHECKPOINT_CREATE ) >> >> checkpoint is created successfully with replicas BUT the IMM objects >> is not created imm database >> >> [Nhat Pham] It should be "Create the checkpoint on SC-2" (not PL-3). >> >> Actually, the checkpoint is not created in this case because the CPND >> on SC-2 finds that checkpoint exists. >> >> It just informs the CPD with CPD_EVT_ND2D_CKPT_USR_INFO. Thus the IMM >> objects are not created. >> >> >> >> [AVM] This shouldn't happen if no process has the checkpoint open >> when >> saCkptCheckpointUnlink() is invoked >> >> ========================================== >> >> >> ========================================== >> Ticket #1615 - Non-collocated ckpt >> >> S1. Create a checkpoint on SC-1 ( OpenFlags with >> SA_CKPT_CHECKPOINT_CREATE ) success >> S2. Open a checkpoint on SC-2 success >> S3. Close the checkpoint on SC-1 success >> S4. Close the checkpoint on SC-2 success ( After this replicas Still >> exist >> on SC-1 & SC-2 and the retention timer started ) >> >> [Nhat Pham] S2 and S4 might not be necessary to trigger the retention >> timer started. >> >> S5. Open checkpoint on PL-3 success ( retention timer still running on >> SC-1 & SC-2 it was not stopped ) , >> S6. Create Section Failed with SA_AIS_ERR_NOT_EXIST >> >> The reason for the Section Create SA_AIS_ERR_NOT_EXIST is the >> retention timer SC-1 & SC-2 was not stopped even after step 5 (S5), >> the subsequent checkpoint Create with same name on any node >> (PL-3/SC-2/SC-1) >> >> [Nhat Pham] regarding "the subsequent checkpoint Create with same >> name on any node (PL-3/SC-2/SC-1) ", do you mean that: >> >> The checkpoint with same name can be re-created on any node after this step. >> >> [AVM] I mean Open same checkpoint ( S5. Open checkpoint on PL-3 ) >> , while retention timer running we can reopen the checkpoint >> then it will stop retention timer. >> >> [Nhat Pham] I got that. The step description above is correct. >> >> >> >> ========================================== >> >> -AVM >> >> On 12/2/2015 7:42 AM, Nhat Pham wrote: >> >> Hi Mahesh, >> >> I'm not clear about the your proposal below. Could you please help to >> make it clearer? Thanks. >> >> My understanding about the existing implementation: >> In case the non-collocated checkpoint exist on controllers, when the >> checkpoint is opened on PL first time (i.e the PL doesn't know that if >> the checkpoint exist and the cp_node doesn't exist in CPND database) >> the cpnd on PL sends CPD_EVT_ND2D_CKPT_CREATE to CPD to create the >> checkpoint. >> The CPD finds the checkpoint existing so it returns the message >> CPND_EVT_D2ND_CKPT_INFO with create_replica == false. >> The CPND updates its database with new checkpoint node without >> creating a replica on PL. >> >> Dec 1 9:03:35.780616 osafckptnd [468:cpsv_evt.c:2199] TR cpnd <<== >> CPND_EVT_A2ND_CKPT_OPEN(hdl=1, safCkpt=test3) from node 0x2030F Dec 1 >> 9:03:35.780874 osafckptnd [468:cpsv_evt.c:2195] TR cpnd ==>> >> CPD_EVT_ND2D_CKPT_CREATE(safCkpt=test3, creationFlags=0x2) to CPD Dec >> 1 >> 9:03:35.782806 osafckptnd [468:cpsv_evt.c:2201] TR cpnd <<== [3] >> CPND_EVT_D2ND_CKPT_INFO(err=1, active=0x2020F, create_rep=false) from >> CPD >> >> So, the flow in this case is: >> cpnd_evt_proc_ckpt_open() --> CPD_EVT_ND2D_CKPT_CREATE --> >> cpd_evt_proc_ckpt_create() >> >> Best regards, >> Nhat Pham >> >> -----Original Message----- >> From: A V Mahesh [mailto:[email protected]] >> Sent: Tuesday, December 1, 2015 5:51 PM >> To: Nhat Pham <mailto:[email protected]> >> <[email protected]>; [email protected] >> <mailto:[email protected]> >> Cc: [email protected] >> <mailto:[email protected]> >> Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts >> CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615] >> >> Hi, >> we need to check wha cpnd_ckpt_node_find_by_name() is returning on >> PL-3 if a no-collocated ckpt replicas exist on controller with >> unlinked , >> >> If it returns null we also need to find any non-collated replica >> exist on Controller nodes , while opening a checkpoint from PL-3, We >> are not suppose to create new Replica on PL-3 if replica exist on >> controllers ( sc-1 & sc-2 ) >> >> -AVM >> >> On 12/1/2015 3:47 PM, A V Mahesh wrote: >> >> Hi , >> >> We may need to handle else condition of below with >> `cp_node->is_unlink == true` case in function >> cpnd_evt_proc_ckpt_open() >> >> `if(((cp_node = cpnd_ckpt_node_find_by_name(cb, ckpt_name)) != NULL) >> && cp_node->is_unlink == false) {` >> >> -AVM >> >> On 12/1/2015 3:25 PM, A V Mahesh wrote: >> >> Hi , >> >> The approach of stopping existing ckpt is different , it should be >> through >> >> cpnd_evt_proc_ckpt_open() --> cpnd_send_ckpt_usr_info_to_cpd --> >> CPD_EVT_ND2D_CKPT_USR_INFO --> cpd_evt_proc_ckpt_usr_info() So please >> do change based on this flow in >> cpd_evt_proc_ckpt_usr_info() and republish the patch . >> >> >> -AVM >> >> >> On 12/1/2015 12:25 PM, Nhat Pham wrote: >> >> osaf/libs/common/cpsv/include/cpd_proc.h | 2 ++ >> osaf/services/saf/cpsv/cpd/cpd_evt.c | 8 +++++++- >> osaf/services/saf/cpsv/cpd/cpd_proc.c | 22 ++++++++++++++++++++++ >> 3 files changed, 31 insertions(+), 1 deletions(-) >> >> >> Problem: >> -------- >> A non-collocated checkpoint is firstly created on SC-2. Then the >> checkpoint is closed on SC-2. >> The CPD broadcasts CPND_EVT_D2ND_CKPT_RDSET with START to start >> retention duration timer on CPND because there is no user. During that >> time the checkpoint is opened again and using on PL-3. >> After retention duration, the checkpoint is destroyed on both SC-1 and SC-2. >> >> Solution: >> --------- >> The problem happens because the CPD doesn't broadcasts >> CPND_EVT_D2ND_CKPT_RDSET with STOP when the checkpoint is opened again >> on PL-3. The CPD is updated to broadcasts CPND_EVT_D2ND_CKPT_RDSET >> with STOP when the checkpoint is opened again. >> >> diff --git a/osaf/libs/common/cpsv/include/cpd_proc.h >> b/osaf/libs/common/cpsv/include/cpd_proc.h >> --- a/osaf/libs/common/cpsv/include/cpd_proc.h >> +++ b/osaf/libs/common/cpsv/include/cpd_proc.h >> @@ -71,6 +71,8 @@ uint32_t cpd_proc_retention_set(CPD_CB * >> uint32_t cpd_proc_unlink_set(CPD_CB *cb, CPD_CKPT_INFO_NODE **ckpt_node, >> CPD_CKPT_MAP_INFO *map_info, SaNameT *ckpt_name); >> +void cpd_proc_broadcast_RDSET_STOP(SaCkptCheckpointHandleT >> ckpt_id, CPD_CB *cb); >> + >> void cpd_cb_dump(void); >> uint32_t cpd_mbcsv_chgrole(CPD_CB *cb); diff --git >> a/osaf/services/saf/cpsv/cpd/cpd_evt.c >> b/osaf/services/saf/cpsv/cpd/cpd_evt.c >> --- a/osaf/services/saf/cpsv/cpd/cpd_evt.c >> +++ b/osaf/services/saf/cpsv/cpd/cpd_evt.c >> @@ -355,8 +355,14 @@ static uint32_t cpd_evt_proc_ckpt_create >> } >> if (is_first_rep) >> TRACE_2("cpd ckpt create success for first replica >> ckpt_id:%llx,dest :%"PRIu64,map_info->ckpt_id,sinfo->dest); >> - else >> + else >> TRACE_2("cpd ckpt create success ckpt_id:%llx,dest >> :%"PRIu64,map_info->ckpt_id,sinfo->dest); >> + >> + >> + /* In case the first user re-creates the existing >> non-collocated checkpoint. All CPND should stop RD timer */ >> + if ((is_first_rep == false) && >> (!(map_info->attributes.creationFlags & >> SA_CKPT_CHECKPOINT_COLLOCATED))) >> + if (ckpt_node->num_users == 1) >> + cpd_proc_broadcast_RDSET_STOP(ckpt_node->ckpt_id, cb); >> TRACE_LEAVE(); >> return proc_rc; >> diff --git a/osaf/services/saf/cpsv/cpd/cpd_proc.c >> b/osaf/services/saf/cpsv/cpd/cpd_proc.c >> --- a/osaf/services/saf/cpsv/cpd/cpd_proc.c >> +++ b/osaf/services/saf/cpsv/cpd/cpd_proc.c >> @@ -1251,3 +1251,25 @@ uint32_t cpd_ckpt_reploc_imm_object_dele >> } >> return NCSCC_RC_SUCCESS; >> } >> + >> +/****************************************************************** >> +************************ >> >> + * Name : cpd_proc_broadcast_RDSET_STOP >> + * >> + * Description : This routine broadcast message >> CPND_EVT_D2ND_CKPT_RDSET with STOP >> + * >> + * Return Values : None >> + * >> + * Notes : None >> +******************************************************************* >> +***********************/ >> >> + >> +void cpd_proc_broadcast_RDSET_STOP(SaCkptCheckpointHandleT ckpt_id, >> CPD_CB *cb) >> +{ >> + CPSV_EVT send_evt; >> + >> + memset(&send_evt, 0, sizeof(CPSV_EVT)); >> + send_evt.type = CPSV_EVT_TYPE_CPND; >> + send_evt.info.cpnd.type = CPND_EVT_D2ND_CKPT_RDSET; >> + send_evt.info.cpnd.info.rdset.ckpt_id = ckpt_id; >> + send_evt.info.cpnd.info.rdset.type = CPSV_CKPT_RDSET_STOP; >> + cpd_mds_bcast_send(cb, &send_evt, NCSMDS_SVC_ID_CPND); } >> >> >> >> >> >> >> >> >> >> >> >> >> ---------------------------------------------------------------------- >> -------- Go from Idea to Many App Stores Faster with Intel(R) XDK Give >> your users amazing mobile app experiences with Intel(R) XDK. >> Use one codebase in this all-in-one HTML5 development environment. >> Design, debug & build mobile apps & 2D/3D high-impact games for multiple >> OSs. >> http://pubads.g.doubleclick.net/gampad/clk?id=254741911&iu=/4140 >> _______________________________________________ >> Opensaf-devel mailing list >> [email protected] >> https://lists.sourceforge.net/lists/listinfo/opensaf-devel >> >> > ------------------------------------------------------------------------------ _______________________________________________ Opensaf-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/opensaf-devel
