Hi Mahesh,

Have you had chance to look into the patch? Thanks.

Best regards,
Nhat Pham

-----Original Message-----
From: Nhat Pham [mailto:[email protected]]
Sent: Wednesday, December 2, 2015 4:48 PM
To: 'A V Mahesh' <[email protected]>; [email protected]
Cc: [email protected]
Subject: Re: [devel] [PATCH 1 of 1] cpsv: cpd broadcasts 
CPND_EVT_D2ND_CKPT_RDSET with STOP [#1615]

Hi Mahesh,



[AVM]   If no process has the checkpoint open (S2 & S5  closed checkpoint ) 
when saCkptCheckpointUnlink() is
invoked, the checkpoint is immediately deleted.

I hope in your test case case  after the S5  &  before  S6 no  process has the 
checkpoint opened  , so we need to fix this  issue fist.  this fix will fix 
IMM objects are not created issue.



[Nhat Pham] Yes, I agree that the fix should solve the problem after S5. I'm 
working on that.

The S6 was mentioned as a phenomenon because the user doesn't see any problem 
after S5.



[AVM]  I mean  Open same  checkpoint  ( S5. Open  checkpoint on PL-3 ) , while 
retention timer running we can reopen the checkpoint
             then it will stop retention timer.



[Nhat Pham] I got that. The step description for #1615 is correct.



BTW, could you please have a look at the patch for #1615 again? Thanks.



Best regards,

Nhat Pham



From: A V Mahesh [mailto:[email protected]]
Sent: Wednesday, December 2, 2015 4:08 PM
To: Nhat Pham <[email protected]>; [email protected]
Cc: [email protected]
Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts CPND_EVT_D2ND_CKPT_RDSET with 
STOP [#1615]



Hi Nhat,

Please see below for my comments  tagged with [AVM]

-AVM

On 12/2/2015 12:33 PM, Nhat Pham wrote:

Hi Mahesh,



The ticket #1615 and 1616 report 2 different problems although the steps to 
reproduce the problem are quite similar.

The problem scenarios are quite tricky I think. J



Please see below for my comments.



Best regards,

Nhat Pham



From: A V Mahesh [mailto:[email protected]]
Sent: Wednesday, December 2, 2015 12:04 PM
To: Nhat Pham  <mailto:[email protected]> <[email protected]>; 
[email protected] <mailto:[email protected]>
Cc: [email protected] 
<mailto:[email protected]>
Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts CPND_EVT_D2ND_CKPT_RDSET with 
STOP [#1615]



Hi Nhat,

On 12/2/2015 7:42 AM, Nhat Pham wrote:

Problem:
>>> --------
>>> A non-collocated checkpoint is firstly created on SC-2. Then the
>>> checkpoint is closed on SC-2.
>>> The CPD broadcasts CPND_EVT_D2ND_CKPT_RDSET with START to start
>>> retention duration timer on CPND because there is no user. During
>>> that time the checkpoint is opened again and using on PL-3.
>>> After retention duration, the checkpoint is destroyed on both SC-1
>>> and SC-2.


I interpreted  `again  using on PL-3`  as the PL-3 was also opened the 
non-collocated checkpoint at fist time it self  both  SC-2  & PL-3 
(opened/closed)   in sequence
so I thought  PL3 CPND already has the database of checkpoint , and then  we 
are trying to re-open the ckpt again on PL-3.

In  ticket #1616  reproducible step,  it was mentioned PL-3 was also opened 
ckpt , so I carried out the same test
and interpreted   the #1615   & #1616  test cases are same   except the UNLINK 
is called ,  so  I was suggesting
to address both together  in one go with and with out  Unlink  called.

Now I got it they are NOT related  testcases.  my current understanding test 
cases of of  is Ticket #1616  & Ticket #1615 is as follows  and you addressed 
#1615 in this patch , please confirm :

==========================================
Ticket #1616  - Non-collocated ckpt

S1. Create a checkpoint on SC-2     ( OpenFlags with 
SA_CKPT_CHECKPOINT_CREATE )  success
S2. Close the checkpoint on SC-2   ( retention timer starts because of  simple 
Close )  success
S3. Open  checkpoint on PL-3         success
S4. Unlink the checkpoint on PL-3  success
S5. Close the checkpoint on PL-3   success

After this replicas will be deleted immediately on SC-1 & SC-2 ( no retention 
timer starts because of  Unlink called ) , the subsequent  checkpoint Create 
with same name on  any node (PL-3/SC-2/SC-1)

[Nhat Pham] Actually, the replicas on SC-1 and SC-2 are not deleted 
immediately after this step. This leads the problem after S6.

[AVM]   If no process has the checkpoint open (S2 & S5  closed checkpoint ) 
when saCkptCheckpointUnlink() is
invoked, the checkpoint is immediately deleted.

I hope in your test case case  after the S5  &  before  S6 no  process has the 
checkpoint opened  , so we need to fix this  issue fist.  this fix will fix 
IMM objects are not created issue.



[Nhat Pham] Yes, I agree that the fix should solve the problem after S5. I'm 
working on that.

The S6 was mentioned as a phenomenon because the user doesn't see any problem 
after S5.






S6. Create the checkpoint on PL-3  ( OpenFlags with 
 SA_CKPT_CHECKPOINT_CREATE )

checkpoint is created  successfully with replicas BUT the IMM objects is not 
created imm database

[Nhat Pham] It should be "Create the checkpoint on SC-2" (not PL-3).

Actually, the checkpoint is not created in this case because the CPND on SC-2 
finds that checkpoint exists.

It just informs the CPD with CPD_EVT_ND2D_CKPT_USR_INFO. Thus the IMM objects 
are not created.



[AVM]  This shouldn't  happen if no process has the checkpoint open  when 
saCkptCheckpointUnlink() is invoked

==========================================


==========================================
Ticket #1615  - Non-collocated ckpt

S1. Create a checkpoint on SC-1     ( OpenFlags with 
SA_CKPT_CHECKPOINT_CREATE )  success
S2. Open a checkpoint on SC-2   success
S3. Close the checkpoint on SC-1  success
S4. Close the checkpoint on SC-2   success   ( After this replicas Still exist 
on SC-1 & SC-2 and the  retention timer started )

[Nhat Pham] S2 and S4 might not be necessary to trigger the retention timer 
started.

S5. Open  checkpoint on PL-3   success    ( retention timer still running on 
SC-1 & SC-2 it was not stopped ) ,
S6. Create Section   Failed    with  SA_AIS_ERR_NOT_EXIST

The reason for the  Section  Create SA_AIS_ERR_NOT_EXIST  is the  retention 
timer SC-1 & SC-2  was not stopped even after  step 5 (S5), the subsequent 
checkpoint Create with same name on  any node (PL-3/SC-2/SC-1)

[Nhat Pham] regarding "the subsequent  checkpoint Create with same name on 
any node (PL-3/SC-2/SC-1) ", do you mean that:

The checkpoint with same name can be re-created on any node after this step.

[AVM]  I mean  Open same  checkpoint  ( S5. Open  checkpoint on PL-3 ) , while 
retention timer running we can reopen the checkpoint
             then it will stop retention timer.

[Nhat Pham] I got that. The step description above is correct.



==========================================

-AVM

On 12/2/2015 7:42 AM, Nhat Pham wrote:

Hi Mahesh,

I'm not clear about the your proposal below. Could you please help to make it 
clearer? Thanks.

My understanding about the existing implementation:
In case the non-collocated checkpoint exist on controllers, when the 
checkpoint is opened on PL first time (i.e the PL doesn't know that if the 
checkpoint exist and the cp_node doesn't exist in CPND database) the cpnd on 
PL sends CPD_EVT_ND2D_CKPT_CREATE to CPD to create the checkpoint.
The CPD finds the checkpoint existing so it returns the message 
CPND_EVT_D2ND_CKPT_INFO with create_replica == false.
The CPND updates its database with new checkpoint node without creating a 
replica on PL.

Dec  1  9:03:35.780616 osafckptnd [468:cpsv_evt.c:2199] TR cpnd <<== 
CPND_EVT_A2ND_CKPT_OPEN(hdl=1, safCkpt=test3) from node 0x2030F Dec  1 
9:03:35.780874 osafckptnd [468:cpsv_evt.c:2195] TR cpnd ==>> 
CPD_EVT_ND2D_CKPT_CREATE(safCkpt=test3, creationFlags=0x2) to CPD Dec  1 
9:03:35.782806 osafckptnd [468:cpsv_evt.c:2201] TR cpnd <<== [3] 
CPND_EVT_D2ND_CKPT_INFO(err=1, active=0x2020F, create_rep=false) from CPD

So, the flow in this case is:
cpnd_evt_proc_ckpt_open() --> CPD_EVT_ND2D_CKPT_CREATE  -->
cpd_evt_proc_ckpt_create()

Best regards,
Nhat Pham

-----Original Message-----
From: A V Mahesh [mailto:[email protected]]
Sent: Tuesday, December 1, 2015 5:51 PM
To: Nhat Pham  <mailto:[email protected]> <[email protected]>; 
[email protected] <mailto:[email protected]>
Cc: [email protected] 
<mailto:[email protected]>
Subject: Re: [PATCH 1 of 1] cpsv: cpd broadcasts CPND_EVT_D2ND_CKPT_RDSET with 
STOP [#1615]

Hi,
we need to check wha  cpnd_ckpt_node_find_by_name() is returning on PL-3 if a 
no-collocated ckpt replicas exist  on controller with unlinked ,

If it returns null we also need to find any  non-collated replica exist on 
Controller nodes , while opening  a  checkpoint from PL-3, We are not suppose 
to create new Replica on PL-3 if replica exist on controllers ( sc-1 & sc-2  )

-AVM

On 12/1/2015 3:47 PM, A V Mahesh wrote:

Hi ,

We may need to handle  else condition of  below with `cp_node->is_unlink == 
true` case  in function
cpnd_evt_proc_ckpt_open()

`if(((cp_node = cpnd_ckpt_node_find_by_name(cb, ckpt_name)) != NULL) && 
cp_node->is_unlink == false) {`

-AVM

On 12/1/2015 3:25 PM, A V Mahesh wrote:

Hi ,

The approach of  stopping  existing ckpt is different , it should be through

cpnd_evt_proc_ckpt_open() --> cpnd_send_ckpt_usr_info_to_cpd --> 
CPD_EVT_ND2D_CKPT_USR_INFO --> cpd_evt_proc_ckpt_usr_info() So please do 
change based on this flow  in
cpd_evt_proc_ckpt_usr_info() and republish the patch .


-AVM


On 12/1/2015 12:25 PM, Nhat Pham wrote:

osaf/libs/common/cpsv/include/cpd_proc.h |   2 ++
  osaf/services/saf/cpsv/cpd/cpd_evt.c     |   8 +++++++-
  osaf/services/saf/cpsv/cpd/cpd_proc.c    |  22 ++++++++++++++++++++++
  3 files changed, 31 insertions(+), 1 deletions(-)


Problem:
--------
A non-collocated checkpoint is firstly created on SC-2. Then the checkpoint is 
closed on SC-2.
The CPD broadcasts CPND_EVT_D2ND_CKPT_RDSET with START to start retention 
duration timer on CPND because there is no user. During that time the 
checkpoint is opened again and using on PL-3.
After retention duration, the checkpoint is destroyed on both SC-1 and SC-2.

Solution:
---------
The problem happens because the CPD doesn't broadcasts 
CPND_EVT_D2ND_CKPT_RDSET with STOP when the checkpoint is opened again on 
PL-3. The CPD is updated to broadcasts CPND_EVT_D2ND_CKPT_RDSET with STOP when 
the checkpoint is opened again.

diff --git a/osaf/libs/common/cpsv/include/cpd_proc.h
b/osaf/libs/common/cpsv/include/cpd_proc.h
--- a/osaf/libs/common/cpsv/include/cpd_proc.h
+++ b/osaf/libs/common/cpsv/include/cpd_proc.h
@@ -71,6 +71,8 @@ uint32_t cpd_proc_retention_set(CPD_CB *
  uint32_t cpd_proc_unlink_set(CPD_CB *cb, CPD_CKPT_INFO_NODE **ckpt_node,
                     CPD_CKPT_MAP_INFO *map_info, SaNameT *ckpt_name);
  +void cpd_proc_broadcast_RDSET_STOP(SaCkptCheckpointHandleT
ckpt_id, CPD_CB *cb);
+
  void cpd_cb_dump(void);
    uint32_t cpd_mbcsv_chgrole(CPD_CB *cb); diff --git 
a/osaf/services/saf/cpsv/cpd/cpd_evt.c
b/osaf/services/saf/cpsv/cpd/cpd_evt.c
--- a/osaf/services/saf/cpsv/cpd/cpd_evt.c
+++ b/osaf/services/saf/cpsv/cpd/cpd_evt.c
@@ -355,8 +355,14 @@ static uint32_t cpd_evt_proc_ckpt_create
      }
      if (is_first_rep)
          TRACE_2("cpd ckpt create success for first replica ckpt_id:%llx,dest 
:%"PRIu64,map_info->ckpt_id,sinfo->dest);
-    else
+    else
          TRACE_2("cpd ckpt create success ckpt_id:%llx,dest 
:%"PRIu64,map_info->ckpt_id,sinfo->dest);
+
+
+    /* In case the first user re-creates the existing
non-collocated checkpoint. All CPND should stop RD timer */
+    if ((is_first_rep == false) &&
(!(map_info->attributes.creationFlags &
SA_CKPT_CHECKPOINT_COLLOCATED)))
+        if (ckpt_node->num_users == 1)
+ cpd_proc_broadcast_RDSET_STOP(ckpt_node->ckpt_id, cb);
        TRACE_LEAVE();
      return proc_rc;
diff --git a/osaf/services/saf/cpsv/cpd/cpd_proc.c
b/osaf/services/saf/cpsv/cpd/cpd_proc.c
--- a/osaf/services/saf/cpsv/cpd/cpd_proc.c
+++ b/osaf/services/saf/cpsv/cpd/cpd_proc.c
@@ -1251,3 +1251,25 @@ uint32_t cpd_ckpt_reploc_imm_object_dele
      }
      return NCSCC_RC_SUCCESS;
  }
+
+/******************************************************************
+************************

+ * Name          : cpd_proc_broadcast_RDSET_STOP
+ *
+ * Description   : This routine broadcast message
CPND_EVT_D2ND_CKPT_RDSET with STOP
+ *
+ * Return Values : None
+ *
+ * Notes         : None
+*******************************************************************
+***********************/

+
+void cpd_proc_broadcast_RDSET_STOP(SaCkptCheckpointHandleT ckpt_id,
CPD_CB *cb)
+{
+    CPSV_EVT send_evt;
+
+    memset(&send_evt, 0, sizeof(CPSV_EVT));
+    send_evt.type = CPSV_EVT_TYPE_CPND;
+    send_evt.info.cpnd.type = CPND_EVT_D2ND_CKPT_RDSET;
+    send_evt.info.cpnd.info.rdset.ckpt_id = ckpt_id;
+    send_evt.info.cpnd.info.rdset.type = CPSV_CKPT_RDSET_STOP;
+    cpd_mds_bcast_send(cb, &send_evt, NCSMDS_SVC_ID_CPND); }












------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK Give your users 
amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741911&iu=/4140
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel



------------------------------------------------------------------------------
Go from Idea to Many App Stores Faster with Intel(R) XDK
Give your users amazing mobile app experiences with Intel(R) XDK.
Use one codebase in this all-in-one HTML5 development environment.
Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs.
http://pubads.g.doubleclick.net/gampad/clk?id=254741911&iu=/4140
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to