Hi Son Tran Ngoc,
Thanks for the steps.
I tried to reproduce the issue but i did not succeed.
these are the steps i performed to reproduce the issue:
1) I started the SC-1,SC-2 and PL-6 with 100k objects.
2) I enabled the network delay between SC-1 and PL-6 by running following
command on PL-6:
tc qdisc add dev enp0s9 root handle 1: htb default 0
tc class add dev enp0s9 parent 1: classid 1:1 htb rate 100mbit prio 2
tc filter add dev enp0s9 parent 1: protocol ip handle 800::1 prio 1 u32 flowid
1:1 match ip dst 192.168.56.122
tc qdisc add dev enp0s9 parent 1:1 handle 100: netem delay 1000ms
3) I restarted the PL-6 and run the following command on PL-6:
timeout 180 tail -n 0 -F /var/log/syslog | (grep -m 1 "SERVER STATE:
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT"; pkill -P $$ timeout) ;
4) I disabled the network delay between SC-1 and PL-6 by running following
command on PL-6:
tc filter del dev enp0s9 parent 1: prio 1 handle 800::1 u32
SC-1 log:
Nov 17 18:46:23 vem1-VirtualBox osafimmd[28119]: NO MDS event from svc_id 25
(change:3, dest:569611447702886)
Nov 17 18:46:25 vem1-VirtualBox osafimmd[28119]: NO Node 2060f request sync
sync-pid:5478 epoch:0
Nov 17 18:46:25 vem1-VirtualBox osafimmnd[28131]: NO Announce sync, epoch:10
Nov 17 18:46:25 vem1-VirtualBox osafimmnd[28131]: NO SERVER STATE:
IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
Nov 17 18:46:25 vem1-VirtualBox osafimmnd[28131]: NO NODE STATE->
IMM_NODE_R_AVAILABLE
Nov 17 18:46:25 vem1-VirtualBox osafimmd[28119]: NO Successfully announced
sync. New ruling epoch:10
Nov 17 18:46:25 vem1-VirtualBox osafimmloadd: logtrace: trace enabled to file
'osafimmnd', mask=0x0
Nov 17 18:46:25 vem1-VirtualBox osafimmloadd: NO Sync starting
Nov 17 18:46:30 vem1-VirtualBox dhclient[884]: DHCPREQUEST of 192.168.56.122 on
enp0s9 to 192.168.56.100 port 67 (xid=0x3b4cc293)
Nov 17 18:46:30 vem1-VirtualBox dhclient[884]: DHCPACK of 192.168.56.122 from
192.168.56.100
Nov 17 18:46:30 vem1-VirtualBox NetworkManager[726]: <info> [1668690990.3675]
address 192.168.56.122
Nov 17 18:46:30 vem1-VirtualBox NetworkManager[726]: <info> [1668690990.3678]
plen 24 (255.255.255.0)
Nov 17 18:46:30 vem1-VirtualBox NetworkManager[726]: <info> [1668690990.3681]
server identifier 192.168.56.100
Nov 17 18:46:30 vem1-VirtualBox NetworkManager[726]: <info> [1668690990.3684]
lease time 600
Nov 17 18:46:30 vem1-VirtualBox NetworkManager[726]: <info> [1668690990.3686]
dhcp4 (enp0s9): state changed bound -> bound
Nov 17 18:46:30 vem1-VirtualBox dbus[707]: [system] Activating via systemd:
service name='org.freedesktop.nm_dispatcher'
unit='dbus-org.freedesktop.nm-dispatcher.service'
Nov 17 18:46:30 vem1-VirtualBox systemd[1]: Starting Network Manager Script
Dispatcher Service...
Nov 17 18:46:30 vem1-VirtualBox dbus[707]: [system] Successfully activated
service 'org.freedesktop.nm_dispatcher'
Nov 17 18:46:30 vem1-VirtualBox systemd[1]: Started Network Manager Script
Dispatcher Service.
Nov 17 18:46:30 vem1-VirtualBox dhclient[884]: bound to 192.168.56.122 --
renewal in 289 seconds.
Nov 17 18:46:30 vem1-VirtualBox nm-dispatcher: req:1 'dhcp4-change' [enp0s9]:
new request (1 scripts)
Nov 17 18:46:30 vem1-VirtualBox nm-dispatcher: req:1 'dhcp4-change' [enp0s9]:
start running ordered scripts...
Nov 17 18:48:15 vem1-VirtualBox osafimmloadd: IN Synced 100627 objects in total
Nov 17 18:48:15 vem1-VirtualBox osafimmnd[28131]: NO NODE STATE->
IMM_NODE_FULLY_AVAILABLE 19360
PL-6 log:
Nov 17 18:46:25 pl3-VirtualBox osafimmnd[5478]: NO SERVER STATE:
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Nov 17 18:47:51 pl3-VirtualBox dhclient[849]: DHCPREQUEST of 192.168.56.107 on
enp0s9 to 192.168.56.100 port 67 (xid=0x3015643a)
Nov 17 18:47:51 pl3-VirtualBox dhclient[849]: DHCPACK of 192.168.56.107 from
192.168.56.100
Nov 17 18:47:51 pl3-VirtualBox NetworkManager[650]: <info> [1668691071.6233]
address 192.168.56.107
Nov 17 18:47:51 pl3-VirtualBox NetworkManager[650]: <info> [1668691071.6237]
plen 24 (255.255.255.0)
Nov 17 18:47:51 pl3-VirtualBox NetworkManager[650]: <info> [1668691071.6240]
server identifier 192.168.56.100
Nov 17 18:47:51 pl3-VirtualBox NetworkManager[650]: <info> [1668691071.6242]
lease time 600
Nov 17 18:47:51 pl3-VirtualBox NetworkManager[650]: <info> [1668691071.6245]
dhcp4 (enp0s9): state changed bound -> bound
Nov 17 18:47:51 pl3-VirtualBox dbus[635]: [system] Activating via systemd:
service name='org.freedesktop.nm_dispatcher'
unit='dbus-org.freedesktop.nm-dispatcher.service'
Nov 17 18:47:51 pl3-VirtualBox systemd[1]: Starting Network Manager Script
Dispatcher Service...
Nov 17 18:47:51 pl3-VirtualBox dbus[635]: [system] Successfully activated
service 'org.freedesktop.nm_dispatcher'
Nov 17 18:47:51 pl3-VirtualBox dhclient[849]: bound to 192.168.56.107 --
renewal in 260 seconds.
Nov 17 18:47:51 pl3-VirtualBox systemd[1]: Started Network Manager Script
Dispatcher Service.
Nov 17 18:47:51 pl3-VirtualBox nm-dispatcher: req:1 'dhcp4-change' [enp0s9]:
new request (1 scripts)
Nov 17 18:47:51 pl3-VirtualBox nm-dispatcher: req:1 'dhcp4-change' [enp0s9]:
start running ordered scripts...
Nov 17 18:48:16 pl3-VirtualBox osafimmnd[5478]: NO NODE STATE->
IMM_NODE_FULLY_AVAILABLE 2846
Nov 17 18:48:16 pl3-VirtualBox osafimmnd[5478]: NO RepositoryInitModeT is
SA_IMM_INIT_FROM_FILE
Thanks
Mohan
---
** [tickets:#3323] imm: PL sync failed after reconnected with SC**
**Status:** review
**Milestone:** 5.22.11
**Created:** Wed Oct 05, 2022 09:37 AM UTC by Son Tran Ngoc
**Last Updated:** Tue Oct 18, 2022 01:59 AM UTC
**Owner:** Son Tran Ngoc
Active SC1 and PL4 lost connect suddenly ( may be due to the environment
reason),they re-established contact but PL4 sync failed due to PL4 did not
update active SC1 information and discard message from IMMD SC1
PL4 sync failed log
2022-09-22 04:07:05.230 DEBUG: Syncing node PL-4 (timeout=120)
2022-09-22 04:08:06.325 WARNING: waiting more than 60 sec for node PL-4 to sync
PL4 discard message from SC1 log
2022-09-22 04:07:08.406 PL-4 osafimmnd[354]: WA DISCARD message from IMMD 2010f
as ACT:0 SBY:2020f
2022-09-22 04:07:09.013 PL-4 osafimmnd[354]: message repeated 243 times: [ WA
DISCARD message from IMMD 2010f as ACT:0 SBY:2020f]
step to reproduce.
1.start SCs, PLs.
2.Block traffic SC1 and PL4 ( make sure block traffic after IMM State :
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT ).
3.Unblock traffic SC1 and PL4.
---
Sent from sourceforge.net because [email protected] is
subscribed to https://sourceforge.net/p/opensaf/tickets/
To unsubscribe from further messages, a project admin can change settings at
https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a
mailing list, you can unsubscribe from the mailing list._______________________________________________
Opensaf-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets