Hi Nhat,

Tested ACK from me.

-AVM

On 12/8/2015 7:31 AM, Nhat Pham wrote:
> Hi Mahesh,
>
> The patch was updated with your comment.
> Could you please help to run CPSV regression test? Thanks.
>
> Best regards,
> Nhat Pham
>
> -----Original Message-----
> From: A V Mahesh [mailto:[email protected]]
> Sent: Monday, December 7, 2015 3:48 PM
> To: Nhat Pham <[email protected]>; [email protected]
> Cc: [email protected]
> Subject: Re: [devel] [PATCH 1 of 1] cpsv: cpd broadcasts
> CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615]
>
> Hi Nhat ,
>
> I did code review following are my comment :
> Optional : Not required to have function for this
> `cpd_proc_broadcast_RDSET_STOP(()
>    you can do write the event construction .
>
> If you feel it will be useful in another context in future, just make all
> lower case character.
>
> If you are not immediate requirement , I will complete CPSV regeneration
> testing and provide you the result/observation , other wise ACK form me not
> tested.
>
> -AVM
>
> On 12/7/2015 2:03 PM, Nhat Pham wrote:
>> Hi Mahesh,
>>
>> Have you had chance to look into the patch? Thanks.
>>
>> Best regards,
>> Nhat Pham
>>
>> -----Original Message-----
>> From: Nhat Pham [mailto:[email protected]]
>> Sent: Wednesday, December 2, 2015 4:48 PM
>> To: 'A V Mahesh' <[email protected]>; [email protected]
>> Cc: [email protected]
>> Subject: Re: [devel] [PATCH 1 of 1] cpsv: cpd broadcasts
>> CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615]
>>
>> Hi Mahesh,
>>
>>
>>
>> [AVM]   If no process has the checkpoint open (S2 & S5  closed checkpoint )
>> when saCkptCheckpointUnlink() is
>> invoked, the checkpoint is immediately deleted.
>>
>> I hope in your test case case  after the S5  &  before  S6 no  process
>> has the checkpoint opened  , so we need to fix this  issue fist.  this
>> fix will fix IMM objects are not created issue.
>>
>>
>>
>> [Nhat Pham] Yes, I agree that the fix should solve the problem after
>> S5. I'm working on that.
>>
>> The S6 was mentioned as a phenomenon because the user doesn't see any
>> problem after S5.
>>
>>
>>
>> [AVM]  I mean  Open same  checkpoint  ( S5. Open  checkpoint on PL-3 )
>> , while retention timer running we can reopen the checkpoint
>>                then it will stop retention timer.
>>
>>
>>
>> [Nhat Pham] I got that. The step description for #1615 is correct.
>>
>>
>>
>> BTW, could you please have a look at the patch for #1615 again? Thanks.
>>
>>
>>
>> Best regards,
>>
>> Nhat Pham
>>
>>
>>
>> From: A V Mahesh [mailto:[email protected]]
>> Sent: Wednesday, December 2, 2015 4:08 PM
>> To: Nhat Pham <[email protected]>; [email protected]
>> Cc: [email protected]
>> Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts
>> CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615]
>>
>>
>>
>> Hi Nhat,
>>
>> Please see below for my comments  tagged with [AVM]
>>
>> -AVM
>>
>> On 12/2/2015 12:33 PM, Nhat Pham wrote:
>>
>> Hi Mahesh,
>>
>>
>>
>> The ticket #1615 and 1616 report 2 different problems although the
>> steps to reproduce the problem are quite similar.
>>
>> The problem scenarios are quite tricky I think. J
>>
>>
>>
>> Please see below for my comments.
>>
>>
>>
>> Best regards,
>>
>> Nhat Pham
>>
>>
>>
>> From: A V Mahesh [mailto:[email protected]]
>> Sent: Wednesday, December 2, 2015 12:04 PM
>> To: Nhat Pham  <mailto:[email protected]>
>> <[email protected]>; [email protected]
>> <mailto:[email protected]>
>> Cc: [email protected]
>> <mailto:[email protected]>
>> Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts
>> CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615]
>>
>>
>>
>> Hi Nhat,
>>
>> On 12/2/2015 7:42 AM, Nhat Pham wrote:
>>
>> Problem:
>>>>> --------
>>>>> A non-collocated checkpoint is firstly created on SC-2. Then the
>>>>> checkpoint is closed on SC-2.
>>>>> The CPD broadcasts CPND_EVT_D2ND_CKPT_RDSET with START to start
>>>>> retention duration timer on CPND because there is no user. During
>>>>> that time the checkpoint is opened again and using on PL-3.
>>>>> After retention duration, the checkpoint is destroyed on both SC-1
>>>>> and SC-2.
>> I interpreted  `again  using on PL-3`  as the PL-3 was also opened the
>> non-collocated checkpoint at fist time it self  both  SC-2  & PL-3
>> (opened/closed)   in sequence
>> so I thought  PL3 CPND already has the database of checkpoint , and
>> then  we are trying to re-open the ckpt again on PL-3.
>>
>> In  ticket #1616  reproducible step,  it was mentioned PL-3 was also
>> opened ckpt , so I carried out the same test
>> and interpreted   the #1615   & #1616  test cases are same   except the
>> UNLINK
>> is called ,  so  I was suggesting
>> to address both together  in one go with and with out  Unlink  called.
>>
>> Now I got it they are NOT related  testcases.  my current
>> understanding test cases of of  is Ticket #1616  & Ticket #1615 is as
>> follows  and you addressed
>> #1615 in this patch , please confirm :
>>
>> ==========================================
>> Ticket #1616  - Non-collocated ckpt
>>
>> S1. Create a checkpoint on SC-2     ( OpenFlags with
>> SA_CKPT_CHECKPOINT_CREATE )  success
>> S2. Close the checkpoint on SC-2   ( retention timer starts because of
>> simple
>> Close )  success
>> S3. Open  checkpoint on PL-3         success
>> S4. Unlink the checkpoint on PL-3  success
>> S5. Close the checkpoint on PL-3   success
>>
>> After this replicas will be deleted immediately on SC-1 & SC-2 ( no
>> retention timer starts because of  Unlink called ) , the subsequent
>> checkpoint Create with same name on  any node (PL-3/SC-2/SC-1)
>>
>> [Nhat Pham] Actually, the replicas on SC-1 and SC-2 are not deleted
>> immediately after this step. This leads the problem after S6.
>>
>> [AVM]   If no process has the checkpoint open (S2 & S5  closed checkpoint )
>> when saCkptCheckpointUnlink() is
>> invoked, the checkpoint is immediately deleted.
>>
>> I hope in your test case case  after the S5  &  before  S6 no  process
>> has the checkpoint opened  , so we need to fix this  issue fist.  this
>> fix will fix IMM objects are not created issue.
>>
>>
>>
>> [Nhat Pham] Yes, I agree that the fix should solve the problem after
>> S5. I'm working on that.
>>
>> The S6 was mentioned as a phenomenon because the user doesn't see any
>> problem after S5.
>>
>>
>>
>>
>>
>>
>> S6. Create the checkpoint on PL-3  ( OpenFlags with
>>    SA_CKPT_CHECKPOINT_CREATE )
>>
>> checkpoint is created  successfully with replicas BUT the IMM objects
>> is not created imm database
>>
>> [Nhat Pham] It should be "Create the checkpoint on SC-2" (not PL-3).
>>
>> Actually, the checkpoint is not created in this case because the CPND
>> on SC-2 finds that checkpoint exists.
>>
>> It just informs the CPD with CPD_EVT_ND2D_CKPT_USR_INFO. Thus the IMM
>> objects are not created.
>>
>>
>>
>> [AVM]  This shouldn't  happen if no process has the checkpoint open
>> when
>> saCkptCheckpointUnlink() is invoked
>>
>> ==========================================
>>
>>
>> ==========================================
>> Ticket #1615  - Non-collocated ckpt
>>
>> S1. Create a checkpoint on SC-1     ( OpenFlags with
>> SA_CKPT_CHECKPOINT_CREATE )  success
>> S2. Open a checkpoint on SC-2   success
>> S3. Close the checkpoint on SC-1  success
>> S4. Close the checkpoint on SC-2   success   ( After this replicas Still
>> exist
>> on SC-1 & SC-2 and the  retention timer started )
>>
>> [Nhat Pham] S2 and S4 might not be necessary to trigger the retention
>> timer started.
>>
>> S5. Open  checkpoint on PL-3   success    ( retention timer still running on
>> SC-1 & SC-2 it was not stopped ) ,
>> S6. Create Section   Failed    with  SA_AIS_ERR_NOT_EXIST
>>
>> The reason for the  Section  Create SA_AIS_ERR_NOT_EXIST  is the
>> retention timer SC-1 & SC-2  was not stopped even after  step 5 (S5),
>> the subsequent checkpoint Create with same name on  any node
>> (PL-3/SC-2/SC-1)
>>
>> [Nhat Pham] regarding "the subsequent  checkpoint Create with same
>> name on any node (PL-3/SC-2/SC-1) ", do you mean that:
>>
>> The checkpoint with same name can be re-created on any node after this step.
>>
>> [AVM]  I mean  Open same  checkpoint  ( S5. Open  checkpoint on PL-3 )
>> , while retention timer running we can reopen the checkpoint
>>                then it will stop retention timer.
>>
>> [Nhat Pham] I got that. The step description above is correct.
>>
>>
>>
>> ==========================================
>>
>> -AVM
>>
>> On 12/2/2015 7:42 AM, Nhat Pham wrote:
>>
>> Hi Mahesh,
>>
>> I'm not clear about the your proposal below. Could you please help to
>> make it clearer? Thanks.
>>
>> My understanding about the existing implementation:
>> In case the non-collocated checkpoint exist on controllers, when the
>> checkpoint is opened on PL first time (i.e the PL doesn't know that if
>> the checkpoint exist and the cp_node doesn't exist in CPND database)
>> the cpnd on PL sends CPD_EVT_ND2D_CKPT_CREATE to CPD to create the
>> checkpoint.
>> The CPD finds the checkpoint existing so it returns the message
>> CPND_EVT_D2ND_CKPT_INFO with create_replica == false.
>> The CPND updates its database with new checkpoint node without
>> creating a replica on PL.
>>
>> Dec  1  9:03:35.780616 osafckptnd [468:cpsv_evt.c:2199] TR cpnd <<==
>> CPND_EVT_A2ND_CKPT_OPEN(hdl=1, safCkpt=test3) from node 0x2030F Dec  1
>> 9:03:35.780874 osafckptnd [468:cpsv_evt.c:2195] TR cpnd ==>>
>> CPD_EVT_ND2D_CKPT_CREATE(safCkpt=test3, creationFlags=0x2) to CPD Dec
>> 1
>> 9:03:35.782806 osafckptnd [468:cpsv_evt.c:2201] TR cpnd <<== [3]
>> CPND_EVT_D2ND_CKPT_INFO(err=1, active=0x2020F, create_rep=false) from
>> CPD
>>
>> So, the flow in this case is:
>> cpnd_evt_proc_ckpt_open() --> CPD_EVT_ND2D_CKPT_CREATE  -->
>> cpd_evt_proc_ckpt_create()
>>
>> Best regards,
>> Nhat Pham
>>
>> -----Original Message-----
>> From: A V Mahesh [mailto:[email protected]]
>> Sent: Tuesday, December 1, 2015 5:51 PM
>> To: Nhat Pham  <mailto:[email protected]>
>> <[email protected]>; [email protected]
>> <mailto:[email protected]>
>> Cc: [email protected]
>> <mailto:[email protected]>
>> Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts
>> CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615]
>>
>> Hi,
>> we need to check wha  cpnd_ckpt_node_find_by_name() is returning on
>> PL-3 if a no-collocated ckpt replicas exist  on controller with
>> unlinked ,
>>
>> If it returns null we also need to find any  non-collated replica
>> exist on Controller nodes , while opening  a  checkpoint from PL-3, We
>> are not suppose to create new Replica on PL-3 if replica exist on
>> controllers ( sc-1 & sc-2  )
>>
>> -AVM
>>
>> On 12/1/2015 3:47 PM, A V Mahesh wrote:
>>
>> Hi ,
>>
>> We may need to handle  else condition of  below with
>> `cp_node->is_unlink == true` case  in function
>> cpnd_evt_proc_ckpt_open()
>>
>> `if(((cp_node = cpnd_ckpt_node_find_by_name(cb, ckpt_name)) != NULL)
>> && cp_node->is_unlink == false) {`
>>
>> -AVM
>>
>> On 12/1/2015 3:25 PM, A V Mahesh wrote:
>>
>> Hi ,
>>
>> The approach of  stopping  existing ckpt is different , it should be
>> through
>>
>> cpnd_evt_proc_ckpt_open() --> cpnd_send_ckpt_usr_info_to_cpd -->
>> CPD_EVT_ND2D_CKPT_USR_INFO --> cpd_evt_proc_ckpt_usr_info() So please
>> do change based on this flow  in
>> cpd_evt_proc_ckpt_usr_info() and republish the patch .
>>
>>
>> -AVM
>>
>>
>> On 12/1/2015 12:25 PM, Nhat Pham wrote:
>>
>> osaf/libs/common/cpsv/include/cpd_proc.h |   2 ++
>>     osaf/services/saf/cpsv/cpd/cpd_evt.c     |   8 +++++++-
>>     osaf/services/saf/cpsv/cpd/cpd_proc.c    |  22 ++++++++++++++++++++++
>>     3 files changed, 31 insertions(+), 1 deletions(-)
>>
>>
>> Problem:
>> --------
>> A non-collocated checkpoint is firstly created on SC-2. Then the
>> checkpoint is closed on SC-2.
>> The CPD broadcasts CPND_EVT_D2ND_CKPT_RDSET with START to start
>> retention duration timer on CPND because there is no user. During that
>> time the checkpoint is opened again and using on PL-3.
>> After retention duration, the checkpoint is destroyed on both SC-1 and SC-2.
>>
>> Solution:
>> ---------
>> The problem happens because the CPD doesn't broadcasts
>> CPND_EVT_D2ND_CKPT_RDSET with STOP when the checkpoint is opened again
>> on PL-3. The CPD is updated to broadcasts CPND_EVT_D2ND_CKPT_RDSET
>> with STOP when the checkpoint is opened again.
>>
>> diff --git a/osaf/libs/common/cpsv/include/cpd_proc.h
>> b/osaf/libs/common/cpsv/include/cpd_proc.h
>> --- a/osaf/libs/common/cpsv/include/cpd_proc.h
>> +++ b/osaf/libs/common/cpsv/include/cpd_proc.h
>> @@ -71,6 +71,8 @@ uint32_t cpd_proc_retention_set(CPD_CB *
>>     uint32_t cpd_proc_unlink_set(CPD_CB *cb, CPD_CKPT_INFO_NODE **ckpt_node,
>>                        CPD_CKPT_MAP_INFO *map_info, SaNameT *ckpt_name);
>>     +void cpd_proc_broadcast_RDSET_STOP(SaCkptCheckpointHandleT
>> ckpt_id, CPD_CB *cb);
>> +
>>     void cpd_cb_dump(void);
>>       uint32_t cpd_mbcsv_chgrole(CPD_CB *cb); diff --git
>> a/osaf/services/saf/cpsv/cpd/cpd_evt.c
>> b/osaf/services/saf/cpsv/cpd/cpd_evt.c
>> --- a/osaf/services/saf/cpsv/cpd/cpd_evt.c
>> +++ b/osaf/services/saf/cpsv/cpd/cpd_evt.c
>> @@ -355,8 +355,14 @@ static uint32_t cpd_evt_proc_ckpt_create
>>         }
>>         if (is_first_rep)
>>             TRACE_2("cpd ckpt create success for first replica
>> ckpt_id:%llx,dest :%"PRIu64,map_info->ckpt_id,sinfo->dest);
>> -    else
>> +    else
>>             TRACE_2("cpd ckpt create success ckpt_id:%llx,dest
>> :%"PRIu64,map_info->ckpt_id,sinfo->dest);
>> +
>> +
>> +    /* In case the first user re-creates the existing
>> non-collocated checkpoint. All CPND should stop RD timer */
>> +    if ((is_first_rep == false) &&
>> (!(map_info->attributes.creationFlags &
>> SA_CKPT_CHECKPOINT_COLLOCATED)))
>> +        if (ckpt_node->num_users == 1)
>> + cpd_proc_broadcast_RDSET_STOP(ckpt_node->ckpt_id, cb);
>>           TRACE_LEAVE();
>>         return proc_rc;
>> diff --git a/osaf/services/saf/cpsv/cpd/cpd_proc.c
>> b/osaf/services/saf/cpsv/cpd/cpd_proc.c
>> --- a/osaf/services/saf/cpsv/cpd/cpd_proc.c
>> +++ b/osaf/services/saf/cpsv/cpd/cpd_proc.c
>> @@ -1251,3 +1251,25 @@ uint32_t cpd_ckpt_reploc_imm_object_dele
>>         }
>>         return NCSCC_RC_SUCCESS;
>>     }
>> +
>> +/******************************************************************
>> +************************
>>
>> + * Name          : cpd_proc_broadcast_RDSET_STOP
>> + *
>> + * Description   : This routine broadcast message
>> CPND_EVT_D2ND_CKPT_RDSET with STOP
>> + *
>> + * Return Values : None
>> + *
>> + * Notes         : None
>> +*******************************************************************
>> +***********************/
>>
>> +
>> +void cpd_proc_broadcast_RDSET_STOP(SaCkptCheckpointHandleT ckpt_id,
>> CPD_CB *cb)
>> +{
>> +    CPSV_EVT send_evt;
>> +
>> +    memset(&send_evt, 0, sizeof(CPSV_EVT));
>> +    send_evt.type = CPSV_EVT_TYPE_CPND;
>> +    send_evt.info.cpnd.type = CPND_EVT_D2ND_CKPT_RDSET;
>> +    send_evt.info.cpnd.info.rdset.ckpt_id = ckpt_id;
>> +    send_evt.info.cpnd.info.rdset.type = CPSV_CKPT_RDSET_STOP;
>> +    cpd_mds_bcast_send(cb, &send_evt, NCSMDS_SVC_ID_CPND); }
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> ----------------------------------------------------------------------
>> -------- Go from Idea to Many App Stores Faster with Intel(R) XDK Give
>> your users amazing mobile app experiences with Intel(R) XDK.
>> Use one codebase in this all-in-one HTML5 development environment.
>> Design, debug & build mobile apps & 2D/3D high-impact games for multiple
>> OSs.
>> http://pubads.g.doubleclick.net/gampad/clk?id=254741911&iu=/4140
>> _______________________________________________
>> Opensaf-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
>>
>>
>


------------------------------------------------------------------------------
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to