Re: [devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-08-09 Thread A V Mahesh
Hi Hoang,

This patch looks more stable; I haven't found any issues in basic
in-service upgrade testing.

I am going to ACK the patch.

-AVM


Re: [devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-08-03 Thread Vo Minh Hoang
Dear Mahesh,

I have just submitted a V4 patch that tries to eliminate the possible errors in
communication between the old and new versions.

My testing shows OK results, but since I cannot reproduce the problem exactly,
I do not have high confidence in it.

Would you please help me review the patch and check its results?

Thank you and best regards,
Hoang


Re: [devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-07-27 Thread Vo Minh Hoang
Dear Mahesh,

Thank you very much for the information you provided.
I will continue investigating this problem.

Sincerely,
Hoang


Re: [devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-07-27 Thread A V Mahesh
Hi Hoang,

I am still able to reproduce the problem. Sometimes the count is incremented
to twice the current number of readers,

and sometimes it is decremented to less than zero (the variables are
set to 0xfffffff6).
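The value 4294967286 (0xfffffff6) reported by immlist for NumWriters and NumReaders is what an unsigned 32-bit counter holds after being decremented ten times past zero. The following is only a minimal sketch of that wraparound, assuming the counters are plain SA_UINT32_T values as immlist shows them; the helper names are illustrative and not part of OpenSAF.

```python
UINT32_MODULUS = 1 << 32  # a 32-bit unsigned counter wraps at 2**32


def decrement_u32(counter: int, times: int = 1) -> int:
    """Decrement an unsigned 32-bit counter, wrapping around below zero."""
    return (counter - times) % UINT32_MODULUS


def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value %= UINT32_MODULUS
    return value - UINT32_MODULUS if value >= UINT32_MODULUS // 2 else value


if __name__ == "__main__":
    print(hex(decrement_u32(0, 10)))  # 0xfffffff6
    print(as_signed32(4294967286))    # -10
```

Read this way, the counter on PL-3 appears to have been decremented ten more times than it was incremented.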

Unfortunately, I don't have a specific order of steps, but this issue
occurs in a cluster setup with 1 new controller, 1 old controller, and 2
old payloads.

When two applications are opened and held on the old payloads (don't exit),
and you then do fail-overs of the controllers and exit the applications
on both payloads,

you will end up with the error.

In broad terms, I suggest you check that the new messages introduced
in this patch are guarded by a version check.
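One way to picture such a version check is sketched below. This is not the patch's code: the `Peer` type, the version numbers, the node name PL-4, and the choice of SC-1 as the new controller are all assumptions made for illustration. The idea is simply that a new message type is only sent to peers that advertise a version able to decode it.

```python
from dataclasses import dataclass

# Hypothetical version at which the patch's new messages are understood.
NEW_MSG_MIN_VERSION = 2


@dataclass(frozen=True)
class Peer:
    name: str
    version: int  # message-format version advertised by the peer (assumed)


def can_receive_new_message(peer: Peer, min_version: int = NEW_MSG_MIN_VERSION) -> bool:
    """Return True only if the peer is new enough to decode the new message."""
    return peer.version >= min_version


def split_by_version(peers, min_version: int = NEW_MSG_MIN_VERSION):
    """Partition peers into (new-message recipients, peers needing the old path)."""
    new = [p for p in peers if can_receive_new_message(p, min_version)]
    old = [p for p in peers if not can_receive_new_message(p, min_version)]
    return new, old


if __name__ == "__main__":
    # 1 new controller, 1 old controller, 2 old payloads (names partly assumed).
    cluster = [Peer("SC-1", 2), Peer("SC-2", 1), Peer("PL-3", 1), Peer("PL-4", 1)]
    new, old = split_by_version(cluster)
    print([p.name for p in new])  # ['SC-1']
    print([p.name for p in old])  # ['SC-2', 'PL-3', 'PL-4']
```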

===

PL-3:~ # immlist safCkpt=checkpoint_test77
Name                                Type         Value(s)
safCkpt                             SA_STRING_T  safCkpt=checkpoint_test77
saCkptCheckpointUsedSize            SA_UINT64_T  110 (0x6e)
saCkptCheckpointSize                SA_UINT64_T  2097152 (0x200000)
saCkptCheckpointRetDuration         SA_TIME_T    9223372036854775807 (0x7fffffffffffffff, Sat Apr 12 05:17:16 2262)
saCkptCheckpointNumWriters          SA_UINT32_T  4294967286 (0xfffffff6)
saCkptCheckpointNumSections         SA_UINT32_T  1 (0x1)
saCkptCheckpointNumReplicas         SA_UINT32_T  2 (0x2)
saCkptCheckpointNumReaders          SA_UINT32_T  4294967286 (0xfffffff6)
saCkptCheckpointNumOpeners          SA_UINT32_T  0 (0x0)
saCkptCheckpointNumCorruptSections  SA_UINT32_T  0 (0x0)
saCkptCheckpointMaxSections         SA_UINT32_T  1 (0x1)
saCkptCheckpointMaxSectionSize      SA_UINT64_T  2097152 (0x200000)
saCkptCheckpointMaxSectionIdSize    SA_UINT64_T  256 (0x100)
*saCkptCheckpointCreationTimestamp  SA_TIME_T    1469609754000000000 (0x14651a48f19e8400, Wed Jul 27 14:25:54 2016)*
saCkptCheckpointCreationFlags       SA_UINT32_T  2 (0x2)
SaImmAttrImplementerName            SA_STRING_T  safCheckPointService
SaImmAttrClassName                  SA_STRING_T  SaCkptCheckpoint
SaImmAttrAdminOwnerName             SA_STRING_T



SC-2:~ # immlist safCkpt=checkpoint_test77
Name                                Type         Value(s)
safCkpt                             SA_STRING_T  safCkpt=checkpoint_test77
saCkptCheckpointUsedSize            SA_UINT64_T  110 (0x6e)
saCkptCheckpointSize                SA_UINT64_T  2097152 (0x200000)
saCkptCheckpointRetDuration         SA_TIME_T    9223372036854775807 (0x7fffffffffffffff, Sat Apr 12 05:17:16 2262)
saCkptCheckpointNumWriters          SA_UINT32_T  20 (0x14)
saCkptCheckpointNumSections         SA_UINT32_T  1 (0x1)
saCkptCheckpointNumReplicas         SA_UINT32_T  2 (0x2)
saCkptCheckpointNumReaders          SA_UINT32_T  20 (0x14)
saCkptCheckpointNumOpeners          SA_UINT32_T  20 (0x14)
saCkptCheckpointNumCorruptSections  SA_UINT32_T  0 (0x0)
saCkptCheckpointMaxSections         SA_UINT32_T  1 (0x1)
saCkptCheckpointMaxSectionSize      SA_UINT64_T  2097152 (0x200000)
saCkptCheckpointMaxSectionIdSize    SA_UINT64_T  256 (0x100)
*saCkptCheckpointCreationTimestamp  SA_TIME_T    1469610614000000000 (0x14651b112d9d1c00, Wed Jul 27 14:40:14 2016)*
saCkptCheckpointCreationFlags       SA_UINT32_T  2 (0x2)
SaImmAttrImplementerName            SA_STRING_T  safCheckPointService
SaImmAttrClassName                  SA_STRING_T  SaCkptCheckpoint
SaImmAttrAdminOwnerName             SA_STRING_T

===
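The two outputs above are for the same checkpoint name, yet every user counter differs between PL-3 and SC-2. A small sketch of that comparison follows; the dictionaries hold the values copied from the two immlist outputs, while the function name and layout are illustrative assumptions rather than an existing tool.

```python
COUNTERS = (
    "saCkptCheckpointNumWriters",
    "saCkptCheckpointNumReaders",
    "saCkptCheckpointNumOpeners",
)

# Values copied from the two immlist outputs above.
PL3 = {
    "saCkptCheckpointNumWriters": 4294967286,
    "saCkptCheckpointNumReaders": 4294967286,
    "saCkptCheckpointNumOpeners": 0,
}
SC2 = {
    "saCkptCheckpointNumWriters": 20,
    "saCkptCheckpointNumReaders": 20,
    "saCkptCheckpointNumOpeners": 20,
}


def diff_counters(a: dict, b: dict, names=COUNTERS) -> dict:
    """Return {name: (a_value, b_value)} for every counter that differs."""
    return {n: (a[n], b[n]) for n in names if a[n] != b[n]}


if __name__ == "__main__":
    for name, (pl3_value, sc2_value) in diff_counters(PL3, SC2).items():
        print(f"{name}: PL-3={pl3_value} SC-2={sc2_value}")
```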

-AVM


Re: [devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-07-27 Thread A V Mahesh
Hi Hoang,

I did in-service testing on a 4-node setup (a combination of old and
new nodes), with applications opened on two payloads,
while restarting the controllers.

I am trying to recollect the test case; meanwhile, you can also give
it a try.

-AVM


Re: [devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-07-26 Thread A V Mahesh
Hi Hoang,

I need to refresh; I will set up the environment again and provide detailed
steps to reproduce.

-AVM


Re: [devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-07-25 Thread Vo Minh Hoang
Dear Mahesh,

Thank you very much for checking.

Unfortunately, I have been unable to reproduce this problem in our environment.
Would you please send us the trace logs of d and nd from both SC-1 and SC-2
when the error occurs, for investigation?

For reference, here are my reproduction steps:
1. Prepare SC-1 with the patch and SC-2 without the patch.
2. Create a checkpoint on SC-1.
3. Open the checkpoint on SC-2.
4. Run immlist to get the checkpoint information.
5. Unlink and close the checkpoint on SC-1.
6. Run immlist again to confirm its deletion.
7. Create the checkpoint again on SC-1.
8. List all replicas in sharemem. There is a difference here: in your error
log, why is sharemem different between SC-1 and SC-2? In my opinion, sharemem
should be one.
9. Run immlist to check the information.

Please tell us if I missed something.
I am sorry for any inconvenience.

Thank you and best regards.
Hoang


Re: [devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-07-14 Thread A V Mahesh
Hi Hoang / Nhat Pham,


Basic testing with an in-service upgrade (one old controller without
the patch and one new controller with the patch) corrupts the
Writers/Readers/Openers DB.

Please verify the in-service upgrade test with collocated and non-collocated
ckpts, address the new issue, and publish a V4 patch.

SC-1:/avm/opensaf_app/cpsv_applications/virtualaddr # immlist safCkpt=checkpoint_test77
Name                                Type         Value(s)
safCkpt                             SA_STRING_T  safCkpt=checkpoint_test77
saCkptCheckpointUsedSize            SA_UINT64_T  110 (0x6e)
saCkptCheckpointSize                SA_UINT64_T  2097152 (0x200000)
saCkptCheckpointRetDuration         SA_TIME_T    9223372036854775807 (0x7fffffffffffffff, Sat Apr 12 05:17:16 2262)
saCkptCheckpointNumWriters          SA_UINT32_T  4294967291 (0xfffffffb)
saCkptCheckpointNumSections         SA_UINT32_T  1 (0x1)
saCkptCheckpointNumReplicas         SA_UINT32_T  4 (0x4)
saCkptCheckpointNumReaders          SA_UINT32_T  4294967291 (0xfffffffb)
saCkptCheckpointNumOpeners          SA_UINT32_T  4294967291 (0xfffffffb)
saCkptCheckpointNumCorruptSections  SA_UINT32_T  0 (0x0)
saCkptCheckpointMaxSections         SA_UINT32_T  1 (0x1)
saCkptCheckpointMaxSectionSize      SA_UINT64_T  2097152 (0x200000)
saCkptCheckpointMaxSectionIdSize    SA_UINT64_T  256 (0x100)
saCkptCheckpointCreationTimestamp   SA_TIME_T    1468552553000000000 (0x146158c4278eda00, Fri Jul 15 08:45:53 2016)
saCkptCheckpointCreationFlags       SA_UINT32_T  2 (0x2)
SaImmAttrImplementerName            SA_STRING_T  safCheckPointService
SaImmAttrClassName                  SA_STRING_T  SaCkptCheckpoint
SaImmAttrAdminOwnerName             SA_STRING_T
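The SA_TIME_T values in this output can be cross-checked against the printed dates. The sketch below assumes SA_TIME_T counts nanoseconds since the Unix epoch, as the SA Forum SaTimeT type does; the function name is illustrative. The hex creation timestamp then decodes to 03:15:53 UTC, which would match the printed 08:45:53 if the node's local time zone is UTC+05:30 (an inference from the numbers, not something stated in the thread).

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def sa_time_to_utc(sa_time_ns: int) -> datetime:
    """Convert nanoseconds since the Unix epoch to a UTC datetime (whole seconds)."""
    return datetime.fromtimestamp(sa_time_ns // NANOS_PER_SECOND, tz=timezone.utc)


if __name__ == "__main__":
    created = 0x146158C4278EDA00  # hex value from the immlist output above
    print(created)                  # 1468552553000000000
    print(sa_time_to_utc(created))  # 2016-07-15 03:15:53+00:00
```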

-AVM


On 7/13/2016 12:44 PM, A V Mahesh wrote:
> Hi Hoang / Nhat Pham,
>
> I have just started testing; the following test case is failing. I may
> report more as soon as I find anything.
>
> Test case 1 :
>
> Step 1 : saCkptCheckpointOpen  on SC-1
>
> SC-1:# ./node_A
> 0 saCkptCheckpointOpen  returned checkpointHandle 626bf0
> 1 saCkptCheckpointOpen  returned checkpointHandle 626e70
> 2 saCkptCheckpointOpen  returned checkpointHandle 626ff0
> 3 saCkptCheckpointOpen  returned checkpointHandle 627170
> 4 saCkptCheckpointOpen  returned checkpointHandle 6272f0
> saCkptCheckpointWrite Waiting to Read from Checkpoint 
> saCkptCheckpointWrite Press  key to continue...
>
> 1 saCkptCheckpointWrite  checkpointHandle 626bf0
> 2 saCkptCheckpointWrite  checkpointHandle 626bf0
> 3 saCkptCheckpointWrite  checkpointHandle 626bf0
> 4 saCkptCheckpointWrite  checkpointHandle 626bf0
> 222 saCkptCheckpointWrite  checkpointHandle 626bf0
> saCkptCheckpointRead Waiting to Read from Checkpoint 
> saCkptCheckpointRead Press  key to continue...
>
> Step 2 : saCkptCheckpointOpen  on SC-2
>
> SC-2:/avm/opensaf_app/cpsv_applications/virtualaddr # ./node_B
> 0 saCkptCheckpointOpen  returned checkpointHandle 626bf0
> 1 saCkptCheckpointOpen  returned checkpointHandle 626e70
> 2 saCkptCheckpointOpen  returned checkpointHandle 626ff0
> 3 saCkptCheckpointOpen  returned checkpointHandle 627170
> 4 saCkptCheckpointOpen  returned checkpointHandle 6272f0
> saCkptCheckpointWrite Waiting to Read from Checkpoint 
> saCkptCheckpointWrite Press  key to continue...
>
> 1 saCkptCheckpointWrite  checkpointHandle 626bf0
> 2 saCkptCheckpointWrite  checkpointHandle 626bf0
> 3 saCkptCheckpointWrite  checkpointHandle 626bf0
> 4 saCkptCheckpointWrite  checkpointHandle 626bf0
> 222 saCkptCheckpointWrite  checkpointHandle 626bf0
> saCkptCheckpointRead Waiting to Read from Checkpoint 
> saCkptCheckpointRead Press  key to continue...
>
> Step 3 : do   # immlist safCkpt=checkpoint_test77
>
> Name                                Type         Value(s)
> safCkpt                             SA_STRING_T  safCkpt=checkpoint_test77
> saCkptCheckpointUsedSize            SA_UINT64_T  110 (0x6e)
> saCkptCheckpointSize                SA_UINT64_T  2097152 (0x200000)
> saCkptCheckpointRetDuration         SA_TIME_T    9223372036854775807 (0x7fffffffffffffff, Sat Apr 12 05:17:16 2262)
> saCkptCheckpointNumWriters          SA_UINT32_T  10 (0xa)
> saCkptCheckpointNumSections         SA_UINT32_T  1 (0x1)
> saCkptCheckpointNumReplicas         SA_UINT32_T  2 (0x2)
> saCkptCheckpointNumReaders          SA_UINT32_T  10 (0xa)
> saCkptCheckpointNumOpeners          SA_UINT32_T  10 (0xa)
> saCkptCheckpointNumCorruptSections 

Re: [devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-07-13 Thread A V Mahesh
Hi Hoang / Nhat Pham,

I have just started testing. The following test case is failing; I may
report more as soon as I find them.

Test case 1 :

Step 1 : saCkptCheckpointOpen  on SC-1

SC-1:# ./node_A
0 saCkptCheckpointOpen  returned checkpointHandle 626bf0
1 saCkptCheckpointOpen  returned checkpointHandle 626e70
2 saCkptCheckpointOpen  returned checkpointHandle 626ff0
3 saCkptCheckpointOpen  returned checkpointHandle 627170
4 saCkptCheckpointOpen  returned checkpointHandle 6272f0
saCkptCheckpointWrite Waiting to Read from Checkpoint 
saCkptCheckpointWrite Press  key to continue...

1 saCkptCheckpointWrite  checkpointHandle 626bf0
2 saCkptCheckpointWrite  checkpointHandle 626bf0
3 saCkptCheckpointWrite  checkpointHandle 626bf0
4 saCkptCheckpointWrite  checkpointHandle 626bf0
222 saCkptCheckpointWrite  checkpointHandle 626bf0
saCkptCheckpointRead Waiting to Read from Checkpoint 
saCkptCheckpointRead Press  key to continue...

Step 2 : saCkptCheckpointOpen  on SC-2

SC-2:/avm/opensaf_app/cpsv_applications/virtualaddr # ./node_B
0 saCkptCheckpointOpen  returned checkpointHandle 626bf0
1 saCkptCheckpointOpen  returned checkpointHandle 626e70
2 saCkptCheckpointOpen  returned checkpointHandle 626ff0
3 saCkptCheckpointOpen  returned checkpointHandle 627170
4 saCkptCheckpointOpen  returned checkpointHandle 6272f0
saCkptCheckpointWrite Waiting to Read from Checkpoint 
saCkptCheckpointWrite Press  key to continue...

1 saCkptCheckpointWrite  checkpointHandle 626bf0
2 saCkptCheckpointWrite  checkpointHandle 626bf0
3 saCkptCheckpointWrite  checkpointHandle 626bf0
4 saCkptCheckpointWrite  checkpointHandle 626bf0
222 saCkptCheckpointWrite  checkpointHandle 626bf0
saCkptCheckpointRead Waiting to Read from Checkpoint 
saCkptCheckpointRead Press  key to continue...

Step 3 : do   # immlist safCkpt=checkpoint_test77

Name                                Type         Value(s)
========================================================================
safCkpt                             SA_STRING_T  safCkpt=checkpoint_test77
saCkptCheckpointUsedSize            SA_UINT64_T  110 (0x6e)
saCkptCheckpointSize                SA_UINT64_T  2097152 (0x200000)
saCkptCheckpointRetDuration         SA_TIME_T    9223372036854775807 (0x7fffffffffffffff, Sat Apr 12 05:17:16 2262)
saCkptCheckpointNumWriters          SA_UINT32_T  10 (0xa)
saCkptCheckpointNumSections         SA_UINT32_T  1 (0x1)
saCkptCheckpointNumReplicas         SA_UINT32_T  2 (0x2)
saCkptCheckpointNumReaders          SA_UINT32_T  10 (0xa)
saCkptCheckpointNumOpeners          SA_UINT32_T  10 (0xa)
saCkptCheckpointNumCorruptSections  SA_UINT32_T  0 (0x0)
saCkptCheckpointMaxSections         SA_UINT32_T  1 (0x1)
saCkptCheckpointMaxSectionSize      SA_UINT64_T  2097152 (0x200000)
saCkptCheckpointMaxSectionIdSize    SA_UINT64_T  256 (0x100)
saCkptCheckpointCreationTimestamp   SA_TIME_T    1468392720000000000 (0x1460c766225ea000, Wed Jul 13 12:22:00 2016)
saCkptCheckpointCreationFlags       SA_UINT32_T  9 (0x9)
SaImmAttrImplementerName            SA_STRING_T  safCheckPointService
SaImmAttrClassName                  SA_STRING_T  SaCkptCheckpoint
SaImmAttrAdminOwnerName             SA_STRING_T


Step 4 :  saCkptCheckpointUnlink & saCkptCheckpointClose  on SC-1

Attempt 1-0 Read DataBuffer 
:VV
0 saCkptCheckpointClose  checkpointHandle 626bf0
1 saCkptCheckpointClose  checkpointHandle 626e70
2 saCkptCheckpointClose  checkpointHandle 626ff0
3 saCkptCheckpointClose  checkpointHandle 627170
4 saCkptCheckpointClose  checkpointHandle 6272f0

Step 5 :  do   # immlist safCkpt=checkpoint_test77

error - object or attribute does not exist    (this is the expected result)

Step 6 : saCkptCheckpointOpen on SC-1 again (note: the checkpoint is still
open on SC-2)

SC-1:# ./node_A
0 saCkptCheckpointOpen  returned checkpointHandle 626bf0
1 saCkptCheckpointOpen  returned checkpointHandle 626e70
2 saCkptCheckpointOpen  returned checkpointHandle 626ff0
3 saCkptCheckpointOpen  returned checkpointHandle 627170
4 saCkptCheckpointOpen  returned checkpointHandle 6272f0
saCkptCheckpointWrite Waiting to Read from Checkpoint 
saCkptCheckpointWrite Press  key to continue...

Step 7 : check the replicas on both controllers (you will see 2 replicas and,
partially, 10 openers)

SC-1: # ls /dev/shm/
opensaf_CPND_CHECKPOINT_INFO_131343 opensaf_NCS_GLND_LCK_CKPT_INFO  
opensaf_NCS_MQND_QUEUE_CKPT_INFO
opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO 
opensaf_safCkpt=checkpoint_test7_131343_2

SC-2: # ls
opensaf_CPND_CHECKPOINT_INFO_131599 opensaf_NCS_GLND_LCK_CKPT_INFO  
opensaf_NCS_MQND_QUEUE_CKPT_INFO
opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO 

[devel] [PATCH 1 of 1] cpsv: To update checkpoint user number for each node [#1669] V3

2016-05-04 Thread Nhat Pham
 osaf/libs/common/cpsv/include/cpd_cb.h   |2 +
 osaf/libs/common/cpsv/include/cpd_proc.h |3 +
 osaf/libs/common/cpsv/include/cpd_red.h  |   14 ++
 osaf/libs/common/cpsv/include/cpsv_evt.h |8 +
 osaf/services/saf/cpsv/cpd/cpd_db.c  |   14 ++-
 osaf/services/saf/cpsv/cpd/cpd_evt.c |8 +
 osaf/services/saf/cpsv/cpd/cpd_mbcsv.c   |   90 +-
 osaf/services/saf/cpsv/cpd/cpd_proc.c|  148 +++
 osaf/services/saf/cpsv/cpd/cpd_red.c |   32 +-
 osaf/services/saf/cpsv/cpd/cpd_sbevt.c   |   68 ++
 10 files changed, 370 insertions(+), 17 deletions(-)


Problem:
--------
The saCkptCheckpointNumOpeners attribute is not updated when a node that
hosts a checkpoint client restarts.

Solution:
---------
Currently the CPD does not store the number of users on each node. This patch
makes the CPD keep track of the users on each node for each checkpoint. When a
node restarts, the CPD updates the total number of users of each affected
checkpoint accordingly, so that the saCkptCheckpointNumOpeners attribute
reflects the correct value.
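
To illustrate the bookkeeping described above, here is a minimal, self-contained
sketch. It is not the patch code: the names ckpt_t, node_user_t, dest_t,
user_open(), user_close() and node_down() are hypothetical stand-ins, the node
key is a plain integer rather than an MDS_DEST, and only the user count is
tracked (the patch also carries reader and writer counts).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the OpenSAF definitions. */
typedef uint64_t dest_t;		/* plays the role of a node address */

typedef struct node_user {
	dest_t dest;			/* node this entry belongs to */
	uint32_t num_users;		/* opens currently held by that node */
	struct node_user *next;
} node_user_t;

typedef struct ckpt {
	uint32_t num_users;		/* total over all nodes */
	node_user_t *node_users;	/* per-node breakdown */
} ckpt_t;

static node_user_t *find_node(ckpt_t *c, dest_t dest)
{
	for (node_user_t *n = c->node_users; n != NULL; n = n->next)
		if (n->dest == dest)
			return n;
	return NULL;
}

/* Record one open from 'dest'. Returns 0 on success, -1 if out of memory. */
static int user_open(ckpt_t *c, dest_t dest)
{
	node_user_t *n = find_node(c, dest);

	if (n == NULL) {
		n = calloc(1, sizeof(*n));
		if (n == NULL)
			return -1;
		n->dest = dest;
		n->next = c->node_users;
		c->node_users = n;
	}
	n->num_users++;
	c->num_users++;
	return 0;
}

/* Record one close from 'dest'. A close without a matching open is ignored,
 * so the unsigned counters can never wrap below zero. */
static void user_close(ckpt_t *c, dest_t dest)
{
	node_user_t *n = find_node(c, dest);

	if (n == NULL || n->num_users == 0)
		return;
	n->num_users--;
	c->num_users--;
}

/* The node went away: subtract everything it held from the total and drop
 * its entry from the list. */
static void node_down(ckpt_t *c, dest_t dest)
{
	node_user_t **pp = &c->node_users;

	while (*pp != NULL) {
		node_user_t *n = *pp;

		if (n->dest == dest) {
			assert(c->num_users >= n->num_users);
			c->num_users -= n->num_users;
			*pp = n->next;
			free(n);
			return;
		}
		pp = &n->next;
	}
}
```

With this model, five opens from each of two nodes give a total of 10, and
node_down() for one of them leaves 5, without waiting for close requests that
the restarted node can no longer send.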

diff --git a/osaf/libs/common/cpsv/include/cpd_cb.h b/osaf/libs/common/cpsv/include/cpd_cb.h
--- a/osaf/libs/common/cpsv/include/cpd_cb.h
+++ b/osaf/libs/common/cpsv/include/cpd_cb.h
@@ -92,6 +92,8 @@ typedef struct cpd_ckpt_info_node {
 	uint32_t num_users;
 	uint32_t num_readers;
 	uint32_t num_writers;
+	uint32_t node_users_cnt;
+	CPD_NODE_USER_INFO *node_users;
 
 	/* for imm */
 	SaUint32T ckpt_used_size;
diff --git a/osaf/libs/common/cpsv/include/cpd_proc.h b/osaf/libs/common/cpsv/include/cpd_proc.h
--- a/osaf/libs/common/cpsv/include/cpd_proc.h
+++ b/osaf/libs/common/cpsv/include/cpd_proc.h
@@ -108,5 +108,8 @@ uint32_t cpd_mbcsv_enc_async_update(CPD_
 uint32_t cpd_mbcsv_close(CPD_CB *cb);
 bool cpd_is_noncollocated_replica_present_on_payload(CPD_CB *cb, CPD_CKPT_INFO_NODE *ckpt_node);
 uint32_t cpd_ckpt_reploc_imm_object_delete(CPD_CB *cb,  CPD_CKPT_REPLOC_INFO *ckpt_reploc_node ,bool is_unlink_set);
+void cpd_proc_increase_node_user_info(CPD_CKPT_INFO_NODE *ckpt_node, MDS_DEST cpnd_dest, SaCkptCheckpointOpenFlagsT open_flags);
+void cpd_proc_decrease_node_user_info(CPD_CKPT_INFO_NODE *ckpt_node, MDS_DEST cpnd_dest, SaCkptCheckpointOpenFlagsT open_flags);
+void cpd_proc_update_user_info_when_node_down(CPD_CB *cb, NODE_ID node_id);
 uint32_t cpd_proc_ckpt_update_post(CPD_CB *cb);
 #endif
diff --git a/osaf/libs/common/cpsv/include/cpd_red.h b/osaf/libs/common/cpsv/include/cpd_red.h
--- a/osaf/libs/common/cpsv/include/cpd_red.h
+++ b/osaf/libs/common/cpsv/include/cpd_red.h
@@ -28,6 +28,7 @@ typedef enum cpd_mbcsv_msg_type {
 	CPD_A2S_MSG_CKPT_UNLINK,
 	CPD_A2S_MSG_CKPT_USR_INFO,
 	CPD_A2S_MSG_CKPT_DEST_DOWN,
+	CPD_A2S_MSG_CKPT_USR_INFO_2,
 	CPD_A2S_MSG_MAX_EVT
 } CPD_MBCSV_MSG_TYPE;
 
@@ -64,6 +65,18 @@ typedef struct cpd_a2s_ckpt_usr_info {
 
 } CPD_A2S_CKPT_USR_INFO;
 
+typedef struct cpd_a2s_ckpt_usr_info_2 {
+	SaCkptCheckpointHandleT ckpt_id;
+	uint32_t num_user;
+	uint32_t num_writer;
+	uint32_t num_reader;
+	uint32_t num_sections;
+	uint32_t ckpt_on_scxb1;
+	uint32_t ckpt_on_scxb2;
+	uint32_t node_users_cnt;
+	CPD_NODE_USER_INFO *node_list;
+} CPD_A2S_CKPT_USR_INFO_2;
+
 typedef struct cpd_mbcsv_msg {
 	CPD_MBCSV_MSG_TYPE type;
 	union {
@@ -76,6 +89,7 @@ typedef struct cpd_mbcsv_msg {
 		CPD_A2S_CKPT_UNLINK ckpt_ulink;
 		CPD_A2S_CKPT_USR_INFO usr_info;
 		CPSV_CKPT_DEST_INFO dest_down;
+		CPD_A2S_CKPT_USR_INFO_2 usr_info_2;
 	} info;
 } CPD_MBCSV_MSG;
 
diff --git a/osaf/libs/common/cpsv/include/cpsv_evt.h b/osaf/libs/common/cpsv/include/cpsv_evt.h
--- a/osaf/libs/common/cpsv/include/cpsv_evt.h
+++ b/osaf/libs/common/cpsv/include/cpsv_evt.h
@@ -840,6 +840,14 @@ typedef struct cpd_tmr_info {
 	} info;
 } CPD_TMR_INFO;
 
+typedef struct cpd_node_user_info {
+	MDS_DEST dest;
+	uint32_t num_users;
+	uint32_t num_writers;
+	uint32_t num_readers;
+	struct cpd_node_user_info *next;
+} CPD_NODE_USER_INFO;
+
 /**
  CPD Event Data Structures
  
**/
diff --git a/osaf/services/saf/cpsv/cpd/cpd_db.c b/osaf/services/saf/cpsv/cpd/cpd_db.c
--- a/osaf/services/saf/cpsv/cpd/cpd_db.c
+++ b/osaf/services/saf/cpsv/cpd/cpd_db.c
@@ -137,6 +137,7 @@ uint32_t cpd_ckpt_node_delete(CPD_CB *cb
 {
 	uint32_t rc = NCSCC_RC_SUCCESS;
 	CPD_NODE_REF_INFO *nref_info, *next_info;
+	CPD_NODE_USER_INFO *node_user, *next_node_user;
 
 	TRACE_ENTER();
 
@@ -153,6 +154,13 @@ uint32_t cpd_ckpt_node_delete(CPD_CB *cb
 		nref_info = next_info;
 	}
 
+	node_user = ckpt_node->node_users;
+	while (node_user) {
+