Hi Sergio,
I have raised a ticket (https://sourceforge.net/p/opensaf/tickets/3306/) with
the OpenSAF community and attached the patch to it.
Could you please download the patch, test it in your scenario, and share
your observations?

Thanks & Regards
Mohan Kanakam | 91-8333082448
Senior Software Engineer
High Availability Solutions
 www.GetHighAvailability.com
Get High Availability Today !
NJ, USA: 1 508-507-6507    |    Hyderabad, India: 91 798-992-5293

-----Original Message-----
From: Mohan Kanakam [mailto:mo...@gethighavailability.com] 
Sent: 17 February 2022 00:43
To: 'Mohan Kanakam'; 'Sérgio Marques'; opensaf-users@lists.sourceforge.net
Subject: Re: [users] MDS_SND_RCV: Invalid Sync CTXT Len

Hi Sergio,
I have an update.
We have a patch for the scenario we were able to reproduce. We will test it
and then share it with you so that you can test it against your scenarios.
Once you confirm that it works, I will submit the patch to the community.

Thanks & Regards
Mohan Kanakam | 91-8333082448
-----Original Message-----
From: Mohan Kanakam [mailto:mo...@gethighavailability.com] 
Sent: 14 February 2022 22:24
To: 'Sérgio Marques'; 'opensaf-users@lists.sourceforge.net'
Subject: RE: [users] MDS_SND_RCV: Invalid Sync CTXT Len

Hi Sergio,
Thanks for the logs.
We were also able to reproduce the issue simply by running the demo
application, which logs the following:

<139>1 2022-02-14T22:15:09.496432+05:30 sc1-VirtualBox osafckptnd 27692
mds.log [meta sequenceId="2"] MDS_SND_RCV: Invalid Sync CTXT Len

We will debug the issue this week and will let you know.

Thanks & Regards
Mohan Kanakam | 91-8333082448
-----Original Message-----
From: Sérgio Marques [mailto:sergio-l-marq...@alticelabs.com] 
Sent: 14 February 2022 16:03
To: Mohan Kanakam; opensaf-users@lists.sourceforge.net
Subject: RE: [users] MDS_SND_RCV: Invalid Sync CTXT Len

Hi Mohan,

I have increased the MDS log level (export MDS_LOG_LEVEL=5) to get more
detail on the "MDS_SND_RCV: Invalid Sync CTXT Len" error.
The logs are attached. You can find the "Invalid Sync CTXT Len" errors in
the logs/sc/cc-2/mds.log file.
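
For anyone wanting to tally these entries quickly, here is a minimal Python sketch that counts the "Invalid Sync CTXT Len" errors per host and process. It assumes each entry sits on a single line in the real mds.log (the wrapping seen in this thread comes from the mail client), and the names LINE_RE, ERROR_TEXT and count_errors are illustrative only, not part of OpenSAF. The two sample lines are copied from this thread.

```python
import re
from collections import Counter

# Assumed layout, inferred from the log lines quoted in this thread:
# <pri>version timestamp host process pid msgid [meta sequenceId="N"] message
LINE_RE = re.compile(
    r'^<\d+>\d+ (?P<ts>\S+) (?P<host>\S+) (?P<app>\S+) (?P<pid>\d+) \S+ '
    r'\[meta sequenceId="(?P<seq>\d+)"\] (?P<msg>.*)$'
)
ERROR_TEXT = "MDS_SND_RCV: Invalid Sync CTXT Len"


def count_errors(lines):
    """Count matching error entries, keyed by (host, process name)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and ERROR_TEXT in match.group("msg"):
            counts[(match.group("host"), match.group("app"))] += 1
    return counts


# Two entries quoted in this thread, unwrapped onto single lines.
sample = [
    '<139>1 2022-02-14T22:15:09.496432+05:30 sc1-VirtualBox osafckptnd 27692 '
    'mds.log [meta sequenceId="2"] MDS_SND_RCV: Invalid Sync CTXT Len',
    '<141>1 2022-01-23T02:24:44.936703Z OLT2T4-UNICOM-2 osafckptd 2210 '
    'mds.log [meta sequenceId="750"] MDTM: svc down event for svc_id = '
    'CPA(18), subscri. by svc_id = CPD(16) pwe_id=1 Adest = '
    '<nodeid[0x20c0f]:osafckptd[2210]>',
]

for (host, app), n in count_errors(sample).items():
    print(host, app, n)
```

To run it against a real file, pass an open file object instead of the sample list, for example count_errors(open("logs/sc/cc-2/mds.log")).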

Thanks and regards,
Sérgio Marques

-----Original Message-----
From: Sérgio Marques 
Sent: 11 February 2022 15:23
To: Mohan Kanakam <mo...@gethighavailability.com>;
opensaf-users@lists.sourceforge.net
Subject: RE: [users] MDS_SND_RCV: Invalid Sync CTXT Len

Hi Mohan,

To reproduce the problem, I only need to power up the cluster nodes and
launch our applications on the SC and PL nodes immediately after OpenSAF
comes up.
These applications start by creating some checkpoints like the one below.

[root@OLT2T4-UNICOM-2~]# immlist safCkpt=CKPT_BACKPLANE_CONTROL,safApp=safCkptService
Name                                               Type         Value(s)
========================================================================
safCkpt                                            SA_STRING_T  safCkpt=CKPT_BACKPLANE_CONTROL
saCkptCheckpointUsedSize                           SA_UINT64_T  2024 (0x7e8)
saCkptCheckpointSize                               SA_UINT64_T  2024 (0x7e8)
saCkptCheckpointRetDuration                        SA_TIME_T    9223372036854775807 (0x7fffffffffffffff, Sat Jan 27 10:50:44 1990)
saCkptCheckpointNumWriters                         SA_UINT32_T  7 (0x7)
saCkptCheckpointNumSections                        SA_UINT32_T  22 (0x16)
saCkptCheckpointNumReplicas                        SA_UINT32_T  2 (0x2)
saCkptCheckpointNumReaders                         SA_UINT32_T  7 (0x7)
saCkptCheckpointNumOpeners                         SA_UINT32_T  7 (0x7)
saCkptCheckpointNumCorruptSections                 SA_UINT32_T  0 (0x0)
saCkptCheckpointMaxSections                        SA_UINT32_T  22 (0x16)
saCkptCheckpointMaxSectionSize                     SA_UINT64_T  92 (0x5c)
saCkptCheckpointMaxSectionIdSize                   SA_UINT64_T  1 (0x1)
saCkptCheckpointCreationTimestamp                  SA_TIME_T    1644591280000000000 (0x16d2c30e451ae000, Fri Feb 11 14:54:40 2022)
saCkptCheckpointCreationFlags                      SA_UINT32_T  2 (0x2)
SaImmAttrImplementerName                           SA_STRING_T  safCheckPointService
SaImmAttrClassName                                 SA_STRING_T  SaCkptCheckpoint
SaImmAttrAdminOwnerName                            SA_STRING_T  <Empty>
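
As a quick sanity check on the values above, the reported checkpoint size equals the maximum number of sections multiplied by the maximum section size, and the checkpoint is fully used. A minimal Python sketch of that arithmetic follows; the numbers are copied from the immlist output, while the variable and function names are only illustrative.

```python
# Values copied from the immlist output above.
max_sections = 22        # saCkptCheckpointMaxSections
max_section_size = 92    # saCkptCheckpointMaxSectionSize, in bytes
checkpoint_size = 2024   # saCkptCheckpointSize, in bytes
used_size = 2024         # saCkptCheckpointUsedSize, in bytes


def expected_size(sections, section_size):
    """Total bytes needed if every section is filled to its maximum size."""
    return sections * section_size


print(expected_size(max_sections, max_section_size) == checkpoint_size)
print(used_size == checkpoint_size)
```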

After creating these checkpoints, the nodes start creating, reading and
writing their sections.
I have attached the requested logs.
Please feel free to ask for more info/logs/tests.

Thanks a lot for your help,
Regards,
Sérgio Marques


-----Original Message-----
From: Mohan Kanakam <mo...@gethighavailability.com> 
Sent: 10 February 2022 17:57
To: 'Mohan Kanakam' <mo...@gethighavailability.com>; Sérgio Marques
<sergio-l-marq...@alticelabs.com>; opensaf-users@lists.sourceforge.net
Subject: RE: [users] MDS_SND_RCV: Invalid Sync CTXT Len

Hi Sergio,
Could you please share the syslog and mds.log from all the nodes?

Thanks & Regards
Mohan Kanakam | 91-8333082448
-----Original Message-----
From: Mohan Kanakam [mailto:mo...@gethighavailability.com]
Sent: 10 February 2022 23:23
To: 'Sérgio Marques'; 'opensaf-users@lists.sourceforge.net'
Subject: RE: [users] MDS_SND_RCV: Invalid Sync CTXT Len

Hi Sergio,
Thanks for the information.
Could you please share the steps (kind of checkpoint, frequency of
checkpointing, etc.) needed to reproduce the issue in our lab?
Could you also share the ckptd and ckptnd traces (you can enable/disable
them at runtime using "kill -USR2 <ckptnd_pid/ckptd_pid>") and the
application API calls?

Thanks & Regards
Mohan Kanakam | 91-8333082448

-----Original Message-----
From: Sérgio Marques [mailto:sergio-l-marq...@alticelabs.com]
Sent: 10 February 2022 20:17
To: Mohan Kanakam; opensaf-users@lists.sourceforge.net
Subject: RE: [users] MDS_SND_RCV: Invalid Sync CTXT Len

Hi Mohan,

Thanks for your quick answer.
I have a cluster with 2 controller and 2 payload boards. All of the nodes
are running the latest OpenSAF version, 5.22.01.
Some checkpoints are created by the payload and controller applications.
I see these errors every 3 to 10 seconds on the controller slave card.
Is there a way to debug this issue and find out exactly where the
messages are being lost?
Please feel free to ask for more info/logs/tests.

Thanks and regards,
Sérgio Marques

-----Original Message-----
From: Mohan Kanakam <mo...@gethighavailability.com>
Sent: 9 February 2022 17:48
To: Sérgio Marques <sergio-l-marq...@alticelabs.com>;
opensaf-users@lists.sourceforge.net
Subject: RE: [users] MDS_SND_RCV: Invalid Sync CTXT Len

Hi Sergio,
Could you please let us know which OpenSAF version you are using?
Since I don't know your test scenario or use case, I can only give a
generic answer.
If ckptnd sends a sync (checkpoint) message to the active ckptd and waits
for its reply, and the active ckptd is then stopped, ckptnd will never
receive a reply to that message, and the context of the sync message becomes
invalid.
It looks as though a few such messages are pending a reply from ckptd. If
the active controller reboots, the active ckptd is stopped, and all the sync
messages awaiting a reply at ckptnd may produce this error because their
contexts are lost.
To me, the log messages look genuine, but it may be a concern that a few of
ckptnd's checkpoint messages are not being answered and are lost.
So the application (if running on a payload) needs to send them again.
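
To illustrate the general idea only, here is a toy Python model of a sender that tracks outstanding synchronous requests. It is not OpenSAF's MDS implementation: the class SyncSender and its methods are hypothetical names, and the behaviour is just one reading of the description above (contexts are dropped when the peer stops, so a later reply finds no matching context and the caller has to resend).

```python
class SyncSender:
    """Toy model: tracks synchronous requests that still await a reply."""

    def __init__(self):
        self._next_id = 0
        self._pending = {}  # request id -> payload awaiting a reply

    def send(self, payload):
        """Register a sync request and return its context id."""
        self._next_id += 1
        self._pending[self._next_id] = payload
        return self._next_id

    def peer_down(self):
        """Peer stopped: drop every context, return payloads to resend."""
        lost = list(self._pending.values())
        self._pending.clear()
        return lost

    def on_reply(self, request_id):
        """Match a reply to its context; report it if the context is gone."""
        if request_id not in self._pending:
            return "invalid sync context"
        del self._pending[request_id]
        return "ok"


sender = SyncSender()
rid = sender.send("write section 1")
to_resend = sender.peer_down()   # the active peer is stopped
print(sender.on_reply(rid))      # prints "invalid sync context"
print(to_resend)                 # prints ['write section 1']
```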

Thanks & Regards
Mohan Kanakam | 91-8333082448

-----Original Message-----
From: Sérgio Marques [mailto:sergio-l-marq...@alticelabs.com]
Sent: 09 February 2022 14:50
To: opensaf-users@lists.sourceforge.net
Subject: [users] MDS_SND_RCV: Invalid Sync CTXT Len

Hi,

In my system, some "MDS_SND_RCV: Invalid Sync CTXT Len" events are being
logged in mds.log.
Are these errors normal? I ask because I'm experiencing very weird
problems when rebooting the active controller node.
Thanks in advance,
Sérgio Marques

<141>1 2022-01-23T02:24:44.936356Z OLT2T4-UNICOM-2 osaflcknd 2151 mds.log
[meta sequenceId="347"] MDTM: svc down event for svc_id = GLA(3), subscri.
by svc_id = GLND(4) pwe_id=1 Adest = <nodeid[0x20c0f]:osaflcknd[2151]>
<141>1 2022-01-23T02:24:44.936699Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4505"] MDTM: svc down event for svc_id = CPA(18), subscri.
by svc_id = CPND(17) pwe_id=1 Adest = <nodeid[0x20c0f]:osafckptnd[2169]>
<141>1 2022-01-23T02:24:44.936703Z OLT2T4-UNICOM-2 osafckptd 2210 mds.log
[meta sequenceId="750"] MDTM: svc down event for svc_id = CPA(18), subscri.
by svc_id = CPD(16) pwe_id=1 Adest = <nodeid[0x20c0f]:osafckptd[2210]>
<141>1 2022-01-23T02:24:44.937203Z OLT2T4-UNICOM-2 osafimmnd 1588 mds.log
[meta sequenceId="743"] MDTM: svc down event for svc_id = IMMA_OM(26),
subscri. by svc_id = IMMND(25) pwe_id=1 Adest =
<nodeid[0x20c0f]:osafimmnd[1588]>
<141>1 2022-01-23T02:24:44.937479Z OLT2T4-UNICOM-2 osafimmnd 1588 mds.log
[meta sequenceId="744"] MDTM: svc down event for svc_id = IMMA_OI(27),
subscri. by svc_id = IMMND(25) pwe_id=1 Adest =
<nodeid[0x20c0f]:osafimmnd[1588]>
<139>1 2022-01-23T02:24:46.71289Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4506"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.726201Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4507"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.73951Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4508"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.75349Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4509"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.768767Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4510"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.777688Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4511"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.788105Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4512"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.801785Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4513"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.815353Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4514"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.824419Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4515"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.833325Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4516"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.842368Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4517"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.851617Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4518"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.862324Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4519"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.876586Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4520"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:46.887585Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4521"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:48.764874Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4522"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:48.776664Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4523"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:48.788544Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4524"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:48.799566Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4525"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:48.806417Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4526"] MDS_SND_RCV: Invalid Sync CTXT Len
<139>1 2022-01-23T02:24:48.81328Z OLT2T4-UNICOM-2 osafckptnd 2169 mds.log
[meta sequenceId="4527"] MDS_SND_RCV: Invalid Sync CTXT Len

_______________________________________________
Opensaf-users mailing list
Opensaf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-users



