- **status**: fixed --> review
- **assigned_to**: Minh Hon Chau
- **Comment**:

Sent out a V2 patch for review that does not rely on node_id.



---

**[tickets:#2737] amfd: Fail to saImmOiInitialize causes unexpected reboot in node shutting down**

**Status:** review
**Milestone:** 5.18.01
**Created:** Tue Dec 12, 2017 10:52 AM UTC by Minh Hon Chau
**Last Updated:** Fri Dec 22, 2017 03:58 AM UTC
**Owner:** Minh Hon Chau


When the active controller is shut down, the local amfnd terminates the OpenSAF 
components. As a result, amfd gets a bad handle and tries to reinitialize its 
service handles. At some point, amfd gets SA_AIS_ERR_LIBRARY returned from 
saImmOiInitialize() and exits. This causes an unexpected node reboot.

2017-12-07 21:14:14.663 SC-2 systemd[1]: Stopping OpenSAF daemon...
2017-12-07 21:14:14.675 SC-2 opensafd: Stopping OpenSAF Services
2017-12-07 21:14:14.682 SC-2 osafamfnd[262]: NO Shutdown initiated
2017-12-07 21:14:14.685 SC-2 osafamfnd[262]: NO Terminating all AMF components
2017-12-07 21:14:15.507 SC-2 osafimmd[196]: NO MDS event from svc_id 25 
(change:4, dest:567412424442034)
2017-12-07 21:14:15.508 SC-2 osafimmnd[207]: NO Global discard node received 
for nodeId:2040f pid:178
2017-12-07 21:14:15.521 SC-2 osafimmd[196]: NO MDS event from svc_id 25 
(change:4, dest:568511936069806)
2017-12-07 21:14:15.521 SC-2 osafimmnd[207]: NO Global discard node received 
for nodeId:2050f pid:174
2017-12-07 21:14:15.679 SC-2 osafimmd[196]: NO MDS event from svc_id 25 
(change:4, dest:566312912814258)
2017-12-07 21:14:15.679 SC-2 osafimmnd[207]: NO Global discard node received 
for nodeId:2030f pid:178
2017-12-07 21:14:16.013 SC-2 osaffmd[186]: NO IMMD down on: 2010f
2017-12-07 21:14:16.013 SC-2 osafimmd[196]: NO MDS event from svc_id 24 
(change:1, dest:13)
2017-12-07 21:14:16.013 SC-2 osafimmd[196]: NO MDS event from svc_id 24 
(change:6, dest:13)
2017-12-07 21:14:16.013 SC-2 osafimmd[196]: WA IMMD lost contact with peer IMMD 
(NCSMDS_RED_DOWN)
2017-12-07 21:14:16.014 SC-2 osafimmnd[207]: NO Implementer disconnected 10 <0, 
2010f> (@safSmf_applier1)
2017-12-07 21:14:16.014 SC-2 osafimmnd[207]: WA DISCARD DUPLICATE FEVS 
message:1246
2017-12-07 21:14:16.014 SC-2 osafimmnd[207]: WA Error code 2 returned for 
message type 82 - ignoring
2017-12-07 21:14:16.014 SC-2 osafimmnd[207]: WA DISCARD DUPLICATE FEVS 
message:1247
2017-12-07 21:14:16.014 SC-2 osafimmnd[207]: WA Error code 2 returned for 
message type 82 - ignoring
2017-12-07 21:14:16.056 SC-2 osafdtmd[150]: NO Lost contact with 'PL-5'
2017-12-07 21:14:16.058 SC-2 osafrded[177]: NO Peer down on node 0x2010f
2017-12-07 21:14:16.085 SC-2 osafdtmd[150]: NO Lost contact with 'PL-4'
2017-12-07 21:14:16.108 SC-2 osafdtmd[150]: NO Lost contact with 'PL-3'
2017-12-07 21:14:16.167 SC-2 osafamfwd[342]: exiting for shutdown
2017-12-07 21:14:16.168 SC-2 osafclmd[242]: exiting for shutdown
2017-12-07 21:14:16.170 SC-2 osafrded[177]: exiting for shutdown
2017-12-07 21:14:16.171 SC-2 osafclmna[168]: exiting for shutdown
2017-12-07 21:14:16.172 SC-2 osafsmfd[284]: exiting for shutdown
2017-12-07 21:14:16.177 SC-2 osafsmfnd[277]: exiting for shutdown
2017-12-07 21:14:16.219 SC-2 osaffmd[186]: NO FM down on: 2010f
2017-12-07 21:14:16.220 SC-2 osaffmd[186]: NO IMMND down on: 2010f
2017-12-07 21:14:16.220 SC-2 osafimmd[196]: NO MDS event from svc_id 25 
(change:4, dest:564113889558735)
2017-12-07 21:14:16.220 SC-2 osafimmd[196]: WA IMMND DOWN on active controller 
1 detected at standby immd!! 2. Possible failover
2017-12-07 21:14:16.221 SC-2 osafimmd[196]: NO Skipping re-send of fevs message 
1246 since it has recently been resent.
2017-12-07 21:14:16.221 SC-2 osafimmd[196]: NO Skipping re-send of fevs message 
1247 since it has recently been resent.
2017-12-07 21:14:16.221 SC-2 osafimmnd[207]: NO Global discard node received 
for nodeId:2010f pid:207
2017-12-07 21:14:16.221 SC-2 osafimmnd[207]: NO Implementer disconnected 1 <0, 
2010f(down)> (safLogService)
2017-12-07 21:14:16.221 SC-2 osafimmnd[207]: NO Implementer disconnected 2 <0, 
2010f(down)> (@safLogService_appl)
2017-12-07 21:14:16.221 SC-2 osafimmnd[207]: NO Implementer disconnected 3 <0, 
2010f(down)> (@OpenSafImmReplicatorA)
2017-12-07 21:14:16.221 SC-2 osafimmnd[207]: NO Implementer disconnected 4 <0, 
2010f(down)> (safClmService)
2017-12-07 21:14:16.221 SC-2 osafimmnd[207]: NO Implementer disconnected 5 <0, 
2010f(down)> (safAmfService)
2017-12-07 21:14:16.221 SC-2 osafimmnd[207]: NO Implementer disconnected 6 <0, 
2010f(down)> (OpenSafImmPBE)
2017-12-07 21:14:16.221 SC-2 osafimmnd[207]: NO Implementer disconnected 8 <0, 
2010f(down)> (safCheckPointService)
2017-12-07 21:14:16.221 SC-2 osafimmnd[207]: NO Implementer disconnected 9 <0, 
2010f(down)> (safSmfService)
2017-12-07 21:14:16.233 SC-2 osafimmd[196]: exiting for shutdown
2017-12-07 21:14:16.234 SC-2 osafimmnd[207]: WA SC Absence IS allowed:900 IMMD 
service is DOWN
2017-12-07 21:14:16.234 SC-2 osafimmnd[207]: NO IMMD SERVICE IS DOWN, HYDRA IS 
CONFIGURED => UNREGISTERING IMMND form MDS
2017-12-07 21:14:16.234 SC-2 osafimmnd[207]: NO Removing client id:1040002020f 
sv_id:26
2017-12-07 21:14:16.234 SC-2 osafimmnd[207]: NO Removing client id:5f0002020f 
sv_id:27
2017-12-07 21:14:16.234 SC-2 osafimmnd[207]: NO Removing client id:630002020f 
sv_id:27
2017-12-07 21:14:16.234 SC-2 osafntfimcnd[417]: NO saImmOiDispatch() Fail 
SA_AIS_ERR_BAD_HANDLE (9)
2017-12-07 21:14:16.234 SC-2 osafamfd[252]: NO Re-initializing with IMM
2017-12-07 21:14:16.235 SC-2 osafimmnd[207]: NO Implementer disconnected 11 
<99, 2020f> (@safAmfService2020f)
2017-12-07 21:14:16.235 SC-2 osafimmnd[207]: NO Removing client id:640002020f 
sv_id:26
2017-12-07 21:14:16.235 SC-2 osafimmnd[207]: NO Removing client id:670002020f 
sv_id:27
2017-12-07 21:14:16.235 SC-2 osafimmnd[207]: NO Removing client id:680002020f 
sv_id:27
2017-12-07 21:14:16.235 SC-2 osafimmnd[207]: NO Implementer disconnected 12 
<104, 2020f> (@OpenSafImmReplicatorB)
2017-12-07 21:14:16.235 SC-2 osafimmnd[207]: NO MDS unregisterede. sleeping ...
2017-12-07 21:14:16.240 SC-2 osafntfimcnd[601]: logtrace: trace enabled to file 
'osafntfimcn', mask=0x0
2017-12-07 21:14:16.244 SC-2 osafimmnd[207]: NO Sleep done registering IMMND 
with MDS
2017-12-07 21:14:16.245 SC-2 osafckptd[388]: exiting for shutdown
2017-12-07 21:14:16.245 SC-2 osafimmnd[207]: NO MDS: mds_register_callback: 
dest 2020f000000de already exist
2017-12-07 21:14:16.245 SC-2 osafimmnd[207]: NO SUCCESS IN REGISTERING IMMND 
WITH MDS
2017-12-07 21:14:16.246 SC-2 osafimmnd[207]: exiting for shutdown
2017-12-07 21:14:16.246 SC-2 osafckptnd[321]: exiting for shutdown
2017-12-07 21:14:16.247 SC-2 osafamfd[252]: WA imma_mds_svc_evt: 
mds_auth_server_connect failed
2017-12-07 21:14:16.248 SC-2 osaflogd[222]: exiting for shutdown
2017-12-07 21:14:16.248 SC-2 osaffmd[186]: exiting for shutdown
2017-12-07 21:14:17.249 SC-2 osafntfimcnd[601]: WA ntfimcn_imm_init 
saImmOiInitialize_2() returned SA_AIS_ERR_TIMEOUT (5)
2017-12-07 21:14:17.285 SC-2 kernel:[43550.099165] lxcbr0: port 5(vethM077NX) 
entered disabled state
2017-12-07 21:14:17.285 SC-2 kernel:[43550.100374] device vethM077NX left 
promiscuous mode
2017-12-07 21:14:17.285 SC-2 kernel:[43550.100376] lxcbr0: port 5(vethM077NX) 
entered disabled state
2017-12-07 21:14:21.248 SC-2 osafamfd[252]: ER saImmOiInitialize failed 2
2017-12-07 21:14:21.250 SC-2 osafamfnd[262]: ER AMFD has unexpectedly crashed. 
Rebooting node
2017-12-07 21:14:21.251 SC-2 osafamfnd[262]: Rebooting OpenSAF NodeId = 131599 
EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 
131599, SupervisionTime = 60
2017-12-07 21:14:21.261 SC-2 opensaf_reboot: Rebooting local node; timeout=60
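
The numeric code in "saImmOiInitialize failed 2" in the log above corresponds to SA_AIS_ERR_LIBRARY in the description. As a rough aid for reading such logs, here is a minimal Python sketch that maps the numeric codes back to their SaAisErrorT names. The helper name `decode_failures`, the regular expression, and the shortened excerpt are assumptions made for illustration only; they are not part of OpenSAF or of the patch under review.

```python
import re

# Subset of SaAisErrorT values from the SA Forum AIS specification.
AIS_ERROR_NAMES = {
    1: "SA_AIS_OK",
    2: "SA_AIS_ERR_LIBRARY",
    5: "SA_AIS_ERR_TIMEOUT",
    6: "SA_AIS_ERR_TRY_AGAIN",
    9: "SA_AIS_ERR_BAD_HANDLE",
}

# Hypothetical pattern for "ER <saImm API> failed <code>" lines, modelled on
# the amfd error line in the log above.
FAILED_RE = re.compile(
    r"(?P<proc>osaf\w+)\[\d+\]: ER (?P<api>saImm\w+) failed (?P<code>\d+)"
)


def decode_failures(log_text):
    """Return (process, api, error_name) for each matching error line."""
    results = []
    for match in FAILED_RE.finditer(log_text):
        code = int(match.group("code"))
        name = AIS_ERROR_NAMES.get(code, f"unknown ({code})")
        results.append((match.group("proc"), match.group("api"), name))
    return results


# Three lines taken from the log above (wrapped lines joined).
EXCERPT = """\
2017-12-07 21:14:16.234 SC-2 osafamfd[252]: NO Re-initializing with IMM
2017-12-07 21:14:21.248 SC-2 osafamfd[252]: ER saImmOiInitialize failed 2
2017-12-07 21:14:21.250 SC-2 osafamfnd[262]: ER AMFD has unexpectedly crashed. Rebooting node
"""

if __name__ == "__main__":
    for proc, api, name in decode_failures(EXCERPT):
        print(f"{proc}: {api} returned {name}")
```

Only the amfd line matches the pattern; the amfnd "unexpectedly crashed" line is a consequence of amfd exiting and carries no AIS code.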

