Hi Nhat,

On 12/2/2015 7:42 AM, Nhat Pham wrote:
> Problem:
> --------
> A non-collocated checkpoint is firstly created on SC-2. Then the
> checkpoint is closed on SC-2.
> The CPD broadcasts CPND_EVT_D2ND_CKPT_RDSET with START to start
> retention duration timer on CPND because there is no user. During
> that time the checkpoint is opened again and using on PL-3.
> After retention duration, the checkpoint is destroyed on both SC-1
> and SC-2.

I interpreted `again using on PL-3` to mean that PL-3 had also opened the
non-collocated checkpoint the first time, i.e. that both SC-2 and PL-3 had
opened and closed it in sequence. So I assumed the CPND on PL-3 already had
the checkpoint in its database and that we were trying to re-open the
checkpoint on PL-3.

The reproduction steps in ticket #1616 mention that PL-3 also opened the
checkpoint, so I carried out the same test and concluded that the #1615 and
#1616 test cases are the same except that UNLINK is called. That is why I
suggested addressing both together in one go, with and without Unlink being
called.

Now I understand that they are NOT related test cases. My current
understanding of the test cases for ticket #1616 and ticket #1615 is as
follows, and you addressed #1615 in this patch. Please confirm:

==========================================
Ticket #1616 - Non-collocated ckpt

S1. Create a checkpoint on SC-2 (OpenFlags with SA_CKPT_CHECKPOINT_CREATE) /success/
S2. Close the checkpoint on SC-2 (retention timer starts because of a simple Close) /success/
S3. Open the checkpoint on PL-3 /success/
S4. Unlink the checkpoint on PL-3 /success/
S5. Close the checkpoint on PL-3 /success/

After this, the replicas are deleted immediately on SC-1 & SC-2 (no
retention timer starts because Unlink was called). Then, on a subsequent
checkpoint Create with the same name on any node (PL-3/SC-2/SC-1):

S6. Create the checkpoint on PL-3 (OpenFlags with SA_CKPT_CHECKPOINT_CREATE)
    /the checkpoint is created successfully with replicas, BUT the IMM
    object is not created in the IMM database/
==========================================

==========================================
Ticket #1615 - Non-collocated ckpt

S1. Create a checkpoint on SC-1 (OpenFlags with SA_CKPT_CHECKPOINT_CREATE) /success/
S2. Open the checkpoint on SC-2 /success/
S3. Close the checkpoint on SC-1 /success/
S4. Close the checkpoint on SC-2 /success/
    (after this the replicas still exist on SC-1 & SC-2 and the retention
    timer has started)
S5. Open the checkpoint on PL-3 /success/
    (the retention timer is still running on SC-1 & SC-2; it was not stopped)
S6. Create Section fails with SA_AIS_ERR_NOT_EXIST

Section Create fails with SA_AIS_ERR_NOT_EXIST because the retention timer
on SC-1 & SC-2 was not stopped even after step 5 (S5), so the replicas are
destroyed when the retention duration expires.
==========================================

-AVM

On 12/2/2015 7:42 AM, Nhat Pham wrote:
> Hi Mahesh,
>
> I'm not clear about the your proposal below. Could you please help to make
> it clearer? Thanks.
>
> My understanding about the existing implementation:
> In case the non-collocated checkpoint exist on controllers, when the
> checkpoint is opened on PL first time
> (i.e the PL doesn't know that if the checkpoint exist and the cp_node
> doesn't exist in CPND database)
> the cpnd on PL sends CPD_EVT_ND2D_CKPT_CREATE to CPD to create the
> checkpoint.
> The CPD finds the checkpoint existing so it returns the message
> CPND_EVT_D2ND_CKPT_INFO with create_replica == false.
> The CPND updates its database with new checkpoint node without creating a
> replica on PL.
>
> Dec 1 9:03:35.780616 osafckptnd [468:cpsv_evt.c:2199] TR cpnd <<== CPND_EVT_A2ND_CKPT_OPEN(hdl=1, safCkpt=test3) from node 0x2030F
> Dec 1 9:03:35.780874 osafckptnd [468:cpsv_evt.c:2195] TR cpnd ==>> CPD_EVT_ND2D_CKPT_CREATE(safCkpt=test3, creationFlags=0x2) to CPD
> Dec 1 9:03:35.782806 osafckptnd [468:cpsv_evt.c:2201] TR cpnd <<== [3] CPND_EVT_D2ND_CKPT_INFO(err=1, active=0x2020F, create_rep=false) from CPD
>
> So, the flow in this case is:
> cpnd_evt_proc_ckpt_open() --> CPD_EVT_ND2D_CKPT_CREATE -->
> cpd_evt_proc_ckpt_create()
>
> Best regards,
> Nhat Pham
>
> -----Original Message-----
> From: A V Mahesh [mailto:[email protected]]
> Sent: Tuesday, December 1, 2015 5:51 PM
> To: Nhat Pham <[email protected]>; [email protected]
> Cc: [email protected]
> Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts CPND_EVT_D2ND_CKPT_RDSET
> with STOP [#1615]
>
> Hi,
> we need to check wha cpnd_ckpt_node_find_by_name() is returning on PL-3 if
> a no-collocated ckpt replicas exist on controller with unlinked ,
>
> If it returns null we also need to find any non-collated replica exist on
> Controller nodes , while opening a checkpoint from PL-3, We are not
> suppose to create new Replica on PL-3 if replica exist on controllers ( sc-1
> & sc-2 )
>
> -AVM
>
> On 12/1/2015 3:47 PM, A V Mahesh wrote:
>> Hi ,
>>
>> We may need to handle else condition of below with
>> `cp_node->is_unlink == true` case in function
>> cpnd_evt_proc_ckpt_open()
>>
>> `if(((cp_node = cpnd_ckpt_node_find_by_name(cb, ckpt_name)) != NULL)
>> && cp_node->is_unlink == false) {`
>>
>> -AVM
>>
>> On 12/1/2015 3:25 PM, A V Mahesh wrote:
>>> Hi ,
>>>
>>> The approach of stopping existing ckpt is different , it should be
>>> through
>>>
>>> cpnd_evt_proc_ckpt_open() --> cpnd_send_ckpt_usr_info_to_cpd -->
>>> CPD_EVT_ND2D_CKPT_USR_INFO --> cpd_evt_proc_ckpt_usr_info() So please
>>> do change based on this flow in
>>> cpd_evt_proc_ckpt_usr_info() and republish the patch .
>>>
>>>
>>> -AVM
>>>
>>>
>>> On 12/1/2015 12:25 PM, Nhat Pham wrote:
>>>>  osaf/libs/common/cpsv/include/cpd_proc.h |   2 ++
>>>>  osaf/services/saf/cpsv/cpd/cpd_evt.c     |   8 +++++++-
>>>>  osaf/services/saf/cpsv/cpd/cpd_proc.c    |  22 ++++++++++++++++++++++
>>>>  3 files changed, 31 insertions(+), 1 deletions(-)
>>>>
>>>>
>>>> Problem:
>>>> --------
>>>> A non-collocated checkpoint is firstly created on SC-2. Then the
>>>> checkpoint is closed on SC-2.
>>>> The CPD broadcasts CPND_EVT_D2ND_CKPT_RDSET with START to start
>>>> retention duration timer on CPND because there is no user. During
>>>> that time the checkpoint is opened again and using on PL-3.
>>>> After retention duration, the checkpoint is destroyed on both SC-1
>>>> and SC-2.
>>>>
>>>> Solution:
>>>> ---------
>>>> The problem happens because the CPD doesn't broadcasts
>>>> CPND_EVT_D2ND_CKPT_RDSET with STOP when the checkpoint is opened
>>>> again on PL-3. The CPD is updated to broadcasts
>>>> CPND_EVT_D2ND_CKPT_RDSET with STOP when the checkpoint is opened
>>>> again.
>>>>
>>>> diff --git a/osaf/libs/common/cpsv/include/cpd_proc.h b/osaf/libs/common/cpsv/include/cpd_proc.h
>>>> --- a/osaf/libs/common/cpsv/include/cpd_proc.h
>>>> +++ b/osaf/libs/common/cpsv/include/cpd_proc.h
>>>> @@ -71,6 +71,8 @@ uint32_t cpd_proc_retention_set(CPD_CB *
>>>>  uint32_t cpd_proc_unlink_set(CPD_CB *cb, CPD_CKPT_INFO_NODE **ckpt_node,
>>>>                               CPD_CKPT_MAP_INFO *map_info, SaNameT *ckpt_name);
>>>> +void cpd_proc_broadcast_RDSET_STOP(SaCkptCheckpointHandleT ckpt_id, CPD_CB *cb);
>>>> +
>>>>  void cpd_cb_dump(void);
>>>>  uint32_t cpd_mbcsv_chgrole(CPD_CB *cb);
>>>> diff --git a/osaf/services/saf/cpsv/cpd/cpd_evt.c b/osaf/services/saf/cpsv/cpd/cpd_evt.c
>>>> --- a/osaf/services/saf/cpsv/cpd/cpd_evt.c
>>>> +++ b/osaf/services/saf/cpsv/cpd/cpd_evt.c
>>>> @@ -355,8 +355,14 @@ static uint32_t cpd_evt_proc_ckpt_create
>>>>      }
>>>>      if (is_first_rep)
>>>>          TRACE_2("cpd ckpt create success for first replica ckpt_id:%llx,dest :%"PRIu64,map_info->ckpt_id,sinfo->dest);
>>>> -    else
>>>> +    else
>>>>          TRACE_2("cpd ckpt create success ckpt_id:%llx,dest :%"PRIu64,map_info->ckpt_id,sinfo->dest);
>>>> +
>>>> +
>>>> +    /* In case the first user re-creates the existing non-collocated checkpoint. All CPND should stop RD timer */
>>>> +    if ((is_first_rep == false) && (!(map_info->attributes.creationFlags & SA_CKPT_CHECKPOINT_COLLOCATED)))
>>>> +        if (ckpt_node->num_users == 1)
>>>> +            cpd_proc_broadcast_RDSET_STOP(ckpt_node->ckpt_id, cb);
>>>>      TRACE_LEAVE();
>>>>      return proc_rc;
>>>> diff --git a/osaf/services/saf/cpsv/cpd/cpd_proc.c b/osaf/services/saf/cpsv/cpd/cpd_proc.c
>>>> --- a/osaf/services/saf/cpsv/cpd/cpd_proc.c
>>>> +++ b/osaf/services/saf/cpsv/cpd/cpd_proc.c
>>>> @@ -1251,3 +1251,25 @@ uint32_t cpd_ckpt_reploc_imm_object_dele
>>>>      }
>>>>      return NCSCC_RC_SUCCESS;
>>>>  }
>>>> +
>>>> +/******************************************************************************************
>>>> + * Name          : cpd_proc_broadcast_RDSET_STOP
>>>> + *
>>>> + * Description   : This routine broadcast message CPND_EVT_D2ND_CKPT_RDSET with STOP
>>>> + *
>>>> + * Return Values : None
>>>> + *
>>>> + * Notes         : None
>>>> +******************************************************************************************/
>>>> +
>>>> +void cpd_proc_broadcast_RDSET_STOP(SaCkptCheckpointHandleT ckpt_id, CPD_CB *cb)
>>>> +{
>>>> +    CPSV_EVT send_evt;
>>>> +
>>>> +    memset(&send_evt, 0, sizeof(CPSV_EVT));
>>>> +    send_evt.type = CPSV_EVT_TYPE_CPND;
>>>> +    send_evt.info.cpnd.type = CPND_EVT_D2ND_CKPT_RDSET;
>>>> +    send_evt.info.cpnd.info.rdset.ckpt_id = ckpt_id;
>>>> +    send_evt.info.cpnd.info.rdset.type = CPSV_CKPT_RDSET_STOP;
>>>> +    cpd_mds_bcast_send(cb, &send_evt, NCSMDS_SVC_ID_CPND);
>>>> +}
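
To make the ticket #1615 sequence easier to follow, the retention timer handling can be sketched as a small C model. This is only an illustrative toy under stated assumptions: the `ckpt_model_t` struct and all `model_*` functions are hypothetical names that do not exist in OpenSAF, and the model collapses CPD and CPND into a single state. It mirrors only what the thread describes: START when the last user closes, STOP (with the proposed patch) when the first user re-opens an existing non-collocated checkpoint, and destruction of the replicas when the retention duration expires with the timer still running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified state of one non-collocated checkpoint.
 * None of these names are OpenSAF identifiers. */
typedef struct {
	bool exists;           /* replicas are present on the controllers */
	unsigned num_users;    /* clients that currently have it open */
	bool rd_timer_running; /* retention duration timer state */
} ckpt_model_t;

/* Open (or create) the checkpoint. stop_on_reopen selects whether the
 * retention timer is stopped when the first user re-opens an existing
 * checkpoint, which is the behaviour the patch proposes. */
static void model_open(ckpt_model_t *c, bool stop_on_reopen)
{
	bool is_first_rep = !c->exists;

	c->exists = true;
	c->num_users++;
	if (!is_first_rep && c->num_users == 1 && stop_on_reopen)
		c->rd_timer_running = false; /* stands in for RDSET with STOP */
}

/* Close the checkpoint; the last close starts the retention timer. */
static void model_close(ckpt_model_t *c)
{
	assert(c->num_users > 0);
	if (--c->num_users == 0)
		c->rd_timer_running = true; /* stands in for RDSET with START */
}

/* The retention duration elapses: replicas go away only if the timer
 * was still running. */
static void model_rd_expiry(ckpt_model_t *c)
{
	if (c->rd_timer_running) {
		c->exists = false;
		c->rd_timer_running = false;
	}
}

/* Replays steps S1..S5 of ticket #1615, lets the retention duration
 * elapse, and reports whether the checkpoint still exists for S6. */
static bool model_scenario_1615(bool stop_on_reopen)
{
	ckpt_model_t c = { false, 0, false };

	model_open(&c, stop_on_reopen);  /* S1: create on SC-1 */
	model_open(&c, stop_on_reopen);  /* S2: open on SC-2 */
	model_close(&c);                 /* S3: close on SC-1 */
	model_close(&c);                 /* S4: close on SC-2, timer starts */
	model_open(&c, stop_on_reopen);  /* S5: open on PL-3 */
	model_rd_expiry(&c);             /* retention duration elapses */
	return c.exists;                 /* S6 needs the checkpoint to exist */
}
```

In this model, `model_scenario_1615(false)` corresponds to the behaviour before the patch (the checkpoint is gone by the time the section is created), while `model_scenario_1615(true)` corresponds to the proposed STOP broadcast.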
