---

**[tickets:#3277] ntf: Discarded notifications accumulation causing standby 
controller reboot during cold sync**

**Status:** unassigned
**Milestone:** 5.21.10
**Created:** Mon Aug 02, 2021 01:31 PM UTC by Mohan Kanakam
**Last Updated:** Mon Aug 02, 2021 01:31 PM UTC
**Owner:** Mohan Kanakam


The Ntf service accumulates a large number of discarded notifications (around 
200,000) and checkpoints them to the Standby Ntf when the standby comes up in 
cold sync. The Standby Ntf takes more than 40 seconds to process them. During 
this time, the Active Ntf receives a few notifications and checkpoints their 
information to the Standby Ntf as async updates; each update is sent as a 
synchronous call with a timeout of 1 second. Because the Standby Ntf is busy 
processing the cold sync, it does not process the async updates, so the Active 
Ntf keeps timing out at 1-second intervals more than 40 times (i.e. for more 
than 40 seconds).
During the same period, the Standby Clmd sends an NtfInitialize request to the 
Active Ntf and times out 4 times (40 seconds in total). Amf then times out on 
the CSI (the CSI timeout is 40 seconds) and reboots the node that is coming up.
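
The timing above can be checked with a small back-of-the-envelope model. This 
is only a sketch: the function name is hypothetical, the 10-second 
NtfInitialize timeout is inferred from "4 times (40 seconds)", and the other 
defaults are the values quoted in this ticket.

```python
import math


def cold_sync_stall(cold_sync_secs, ckpt_timeout_secs=1.0,
                    init_timeout_secs=10.0, csi_timeout_secs=40.0):
    """Toy model of what happens while the Standby Ntf is busy in cold sync.

    All defaults are assumptions taken from the ticket text, not values
    read from OpenSAF configuration.
    """
    # Active Ntf: each checkpoint update to the standby blocks until its
    # timeout expires, then the next one is tried.
    ckpt_timeouts = math.floor(cold_sync_secs / ckpt_timeout_secs)
    # Standby Clmd: each NtfInitialize attempt towards the blocked Active Ntf
    # expires after init_timeout_secs.
    init_timeouts = math.floor(cold_sync_secs / init_timeout_secs)
    # Amf: if the stall lasts at least as long as the CSI timeout, the CSI
    # assignment fails and the node that is coming up is rebooted.
    node_rebooted = cold_sync_secs >= csi_timeout_secs
    return ckpt_timeouts, init_timeouts, node_rebooted


if __name__ == "__main__":
    print(cold_sync_stall(45))
```

With a 45-second cold sync, the model gives 45 checkpoint timeouts, 4 
NtfInitialize timeouts and a reboot, which matches the sequence described 
above. A 5-second cold sync stays well below the CSI timeout.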

The root cause is that Ntf loses the down event of a subscriber and never 
removes the subscriber information, so the number of discarded notifications 
keeps increasing each time a notification is sent.
The down event can be missed because the system is low on memory, because the 
event could not be posted to the mailbox, etc. We don't know the real root 
cause, but discarded notifications can accumulate only in such cases.
We could reproduce the problem; please see the steps below.

Steps to reproduce:
1.  Comment out the call to clientRemoveMDS() in the proc_ntfa_updn_mds_msg() 
function in ntfs_evt.c.
2.  Subscribe to the Ntf service using ntfsubscribe.
3.  Send notifications using ntfsend (ntfsend -s 1 --notificationType=0x4000 
--additionalText=TEXT --repeatSends=200000).
4.  While ntfsend is running, kill the ntfsubscribe process.
5.  Start the standby controller and check the discarded notifications in the 
osafntfd trace file.
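
The effect of these steps can be illustrated with a toy model. This is a 
sketch only, not OpenSAF code: the class, its methods and the data layout are 
hypothetical, and it simplifies the scenario by killing the subscriber before 
the first send. Skipping the removal in process_died() plays the role of 
commenting out clientRemoveMDS().

```python
class ToyNtfServer:
    """Toy stand-in for the Ntf server (hypothetical, not OpenSAF code)."""

    def __init__(self):
        # client_id -> {"alive": bool, "discarded": [notification ids]}
        self.subscribers = {}

    def subscribe(self, client_id):
        self.subscribers[client_id] = {"alive": True, "discarded": []}

    def process_died(self, client_id, down_event_delivered):
        self.subscribers[client_id]["alive"] = False
        # A delivered down event removes the subscriber information.
        # A lost down event leaves the stale entry in place.
        if down_event_delivered:
            del self.subscribers[client_id]

    def send(self, notification_id):
        # A notification that cannot reach a dead subscriber is recorded
        # as discarded for that subscriber.
        for sub in self.subscribers.values():
            if not sub["alive"]:
                sub["discarded"].append(notification_id)

    def cold_sync_backlog(self):
        # Number of discarded notifications a cold sync would checkpoint.
        return sum(len(s["discarded"]) for s in self.subscribers.values())


def reproduce(down_event_delivered, repeat_sends=200000):
    server = ToyNtfServer()
    server.subscribe("ntfsubscribe")
    server.process_died("ntfsubscribe", down_event_delivered)
    for notification_id in range(repeat_sends):
        server.send(notification_id)
    return server.cold_sync_backlog()


if __name__ == "__main__":
    print("down event lost:     ", reproduce(False))
    print("down event delivered:", reproduce(True))
```

In this model the backlog grows by one entry per send when the down event is 
lost, and stays at zero when the down event is delivered.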


