Classic split brain. You should tune DTM for the worst-case network latency
(TIPC link tolerance) and have a redundant communication path to the other
controller.
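
As a rough illustration of tuning for worst-case latency: with TCP-style keepalives, a silent peer is declared down after roughly the idle time plus the probe interval times the number of unanswered probes. The sketch below only does that arithmetic. The class, field names, and example values are hypothetical and are not actual DTM configuration parameters.

```python
from dataclasses import dataclass


@dataclass
class KeepaliveSettings:
    """Hypothetical keepalive tuning values (not real dtmd.conf names)."""
    idle_s: float      # seconds of silence before the first probe is sent
    interval_s: float  # seconds between unanswered probes
    probes: int        # unanswered probes before the peer is declared down

    def detection_time_s(self) -> float:
        # Approximate time from last traffic until the peer is declared down.
        return self.idle_s + self.interval_s * self.probes


def tolerates(settings: KeepaliveSettings, worst_case_outage_s: float) -> bool:
    # A network glitch shorter than the detection time should not be
    # reported as a lost node.
    return settings.detection_time_s() > worst_case_outage_s


if __name__ == "__main__":
    example = KeepaliveSettings(idle_s=2, interval_s=1, probes=2)
    print(example.detection_time_s(), tolerates(example, worst_case_outage_s=1.0))
```

The trade-off is the usual one: a longer detection time rides out short glitches but delays failover when a controller really is gone.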
---
**[tickets:#458] Dtmd flickering resulted in node reboot**
**Status:** unassigned
**Created:** Fri Jun 14, 2013 06:49 AM UTC by A V Mahesh (AVM)
**Last Updated:** Fri Jun 14, 2013 06:49 AM UTC
**Owner:** nobody
Jun 13 17:42:00 PL-3 osafimmnd[5650]: NO Implementer disconnected 54 <0,
2020f(down)> (safAmfService)
Jun 13 17:42:00 PL-3 osafimmnd[5650]: NO Implementer disconnected 43 <0,
2020f(down)> (safSmfService)
Jun 13 17:42:00 PL-3 osafimmnd[5650]: NO Implementer disconnected 44 <0,
2020f(down)> (safMsgGrpService)
Jun 13 17:42:00 PL-3 osafimmnd[5650]: NO Implementer disconnected 45 <0,
2020f(down)> (safCheckPointService)
Jun 13 17:42:00 PL-3 osafimmnd[5650]: NO Implementer disconnected 47 <0,
2020f(down)> (safLckService)
Jun 13 17:42:00 PL-3 osafimmnd[5650]: NO Implementer disconnected 46 <0,
2020f(down)> (safEvtService)
Jun 13 17:42:02 PL-3 osafimmnd[5650]: NO No IMMD service => cluster restart
Jun 13 17:42:03 PL-3 osafamfnd[3391]: NO
'safComp=IMMND,safSu=PL-3,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown'
: Recovery is 'componentRestart'
Jun 13 17:42:03 PL-3 osafimmnd[5982]: Started
Jun 13 17:42:03 PL-3 osafdtmd[3320]: NO Lost contact with 'SC-2'
Jun 13 17:42:04 PL-3 osafdtmd[3320]: NO Established contact with 'SC-2'
Jun 13 17:42:05 PL-3 osafamfnd[3391]: ER AMF director unexpectedly crashed
Jun 13 17:42:05 PL-3 osafamfnd[3391]: Rebooting OpenSAF NodeId = 131855 EE Name
= , Reason: local AVD down(Adest) or both AVD down(Vdest) received
Jun 13 17:42:05 PL-3 osafdtmd[3320]: NO Lost contact with 'SC-1'
Jun 13 17:42:05 PL-3 opensaf_reboot: Rebooting local node
Jun 13 17:42:06 PL-3 osafdtmd[3320]: NO Established contact with 'SC-1'
Jun 13 17:42:06 PL-3 osafimmnd[5982]: NO SERVER STATE: IMM_SERVER_ANONYMOUS -->
IMM_SERVER_CLUSTER_WAITING
Jun 13 17:42:07 PL-3 osafimmnd[5982]: WA Resending introduce-me - problems with
MDS ?
Jun 13 17:42:07 PL-3 osafimmnd[5982]: NO SERVER STATE:
IMM_SERVER_CLUSTER_WAITING --> IMM_SER
=============================================
It looks like an MDS problem.
DTM lost the connection to SC-2 and re-established it within 1 second.
Jun 13 17:42:03 PL-3 osafdtmd[3320]: NO Lost contact with 'SC-2'
Jun 13 17:42:04 PL-3 osafdtmd[3320]: NO Established contact with 'SC-2'
AMFND then received down events because it lost contact with both the active
and the standby AMFD, which resulted in AMFND rebooting the blade.
Jun 13 17:42:05 PL-3 osafamfnd[3391]: ER AMF director unexpectedly crashed
Jun 13 17:42:05 PL-3 osafamfnd[3391]: Rebooting OpenSAF NodeId = 131855 EE Name
= , Reason: local AVD down(Adest) or both AVD down(Vdest) received
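
To check how long each contact loss lasted, the osafdtmd lines can be paired up per peer. The following is a minimal sketch, assuming the syslog line format shown in this ticket. The function name `contact_gaps` and the regular expression are illustrative, and the year is an assumption because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Matches the osafdtmd "Lost contact" / "Established contact" lines as they
# appear in this ticket; other syslog layouts would need a different pattern.
LOG_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ osafdtmd\[\d+\]: NO "
    r"(?P<event>Lost|Established) contact with '(?P<peer>[^']+)'"
)


def contact_gaps(lines, year=2013):
    """Return (peer, seconds) for each lost -> established pair, in log order."""
    lost_at = {}
    gaps = []
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # ignore lines from other daemons
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        peer = m["peer"]
        if m["event"] == "Lost":
            lost_at[peer] = ts
        elif peer in lost_at:
            gaps.append((peer, (ts - lost_at.pop(peer)).total_seconds()))
    return gaps


SAMPLE = [
    "Jun 13 17:42:03 PL-3 osafdtmd[3320]: NO Lost contact with 'SC-2'",
    "Jun 13 17:42:04 PL-3 osafdtmd[3320]: NO Established contact with 'SC-2'",
    "Jun 13 17:42:05 PL-3 osafdtmd[3320]: NO Lost contact with 'SC-1'",
    "Jun 13 17:42:06 PL-3 osafdtmd[3320]: NO Established contact with 'SC-1'",
]

if __name__ == "__main__":
    for peer, seconds in contact_gaps(SAMPLE):
        print(f"{peer}: contact lost for about {seconds:.0f}s")
```

Syslog timestamps have one-second resolution, so a reported gap of 1 second only bounds the real outage to somewhere under 2 seconds.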
---
Sent from sourceforge.net because you indicated interest in
<https://sourceforge.net/p/opensaf/tickets/458/>