Hi all,

I'm running a simple test system with just one SC and one payload.

Setup:
- 1 controller (vm-sc1), 1 payload (vm-pl3)
- OpenSAF 4.6.0 running on RHEL 6.6 VMs with TCP
When I suspend the controller, the payload reboots after about 8-9 seconds, which is expected. What I'm trying to understand is why the osafamfnd process generates the "ER AMF director unexpectedly crashed" log. I've also seen this same log on setups with 2 controllers and 4 payloads, and when OpenSAF is being shut down.

Anyway, I've included the output from the messages, osafamfnd and osafimmnd log files around the time in question (Nov 16 18:18:19). Thanks.

/var/log/messages:
Nov 16 18:17:07 vm-pl3 osafamfnd[1880]: logtrace: trace enabled to file /var/log/opensaf/osafamfnd, mask=0xffffffff
Nov 16 18:17:27 vm-pl3 osafimmnd[1837]: logtrace: trace enabled to file /var/log/opensaf/osafimmnd, mask=0xffffffff
Nov 16 18:17:38 vm-pl3 osafclmna[1869]: logtrace: trace enabled to file /var/log/opensaf/osafclmna, mask=0xffffffff
Nov 16 18:17:43 vm-pl3 osafamfwd[1892]: logtrace: trace enabled to file /var/log/opensaf/osafamfwd, mask=0xffffffff
Nov 16 18:18:19 vm-pl3 osafdtmd[1819]: NO Lost contact with 'vm-sc1'
Nov 16 18:18:19 vm-pl3 osafimmnd[1837]: NO No IMMD service => cluster restart, exiting
Nov 16 18:18:19 vm-pl3 osafamfnd[1880]: ER AMF director unexpectedly crashed
Nov 16 18:18:19 vm-pl3 osafamfnd[1880]: Rebooting OpenSAF NodeId = 131855 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) received, OwnNodeId = 131855, SupervisionTime = 60
Nov 16 18:18:19 vm-pl3 opensaf_reboot: Rebooting local node; timeout=60
Nov 16 18:19:14 vm-pl3 osafamfwd[1892]: Last received healthcheck cnt=22 at Thu Nov 16 18:18:14 2017
Nov 16 18:19:14 vm-pl3 osafamfwd[1892]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: AMFND unresponsive, AMFWDOG initiated system reboot, OwnNodeId = 131855, SupervisionTime = 60

/var/log/opensaf/osafamfnd:
Nov 16 18:18:19.448727 osafamfnd [1880:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.448882 osafamfnd [1880:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.448908 osafamfnd [1880:main.cc:0642] >> avnd_evt_process
Nov 16 18:18:19.448915 osafamfnd [1880:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.449053 osafamfnd [1880:main.cc:0657] TR Evt type:43
Nov 16 18:18:19.449078 osafamfnd [1880:di.cc:0451] >> avnd_evt_mds_avd_dn_evh
Nov 16 18:18:19.449094 osafamfnd [1880:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.449130 osafamfnd [1880:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.449135 osafamfnd [1880:di.cc:0462] ER AMF director unexpectedly crashed
Nov 16 18:18:19.449161 osafamfnd [1880:clma_mds.c:0945] T2 CLMA Rcvd MDS subscribe evt from svc 34
Nov 16 18:18:19.449172 osafamfnd [1880:clma_mds.c:0959] TR CLMS down
Nov 16 18:18:19.449206 osafamfnd [1880:clma_mds.c:0945] T2 CLMA Rcvd MDS subscribe evt from svc 34
Nov 16 18:18:19.449213 osafamfnd [1880:clma_mds.c:0959] TR CLMS down
Nov 16 18:18:19.449221 osafamfnd [1880:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.450045 osafamfnd [1880:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
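Reading the di.cc trace above, avnd_evt_mds_avd_dn_evh seems to fire on the MDS service-down event for the director, and my assumption is that the handler cannot tell a crashed director from one that is merely unreachable (e.g. a suspended controller VM), so it always reports the event as a "crash". Here is a minimal sketch of what I think is happening; this is my own illustration, not the actual OpenSAF code, and the type/function names are made up:

    // Illustrative sketch only -- NOT the actual OpenSAF di.cc code.
    // Assumption: the node director (AMFND) only sees an MDS "service down"
    // event for the AMF director (AVD), so a real crash and a loss of
    // contact look identical from the payload's point of view.
    #include <cstdio>
    #include <string>

    enum class MdsSvcEvent { kUp, kDown };

    // Hypothetical reboot hook standing in for opensaf_reboot.
    static void request_node_reboot(const std::string& reason) {
      std::printf("Rebooting local node, reason: %s\n", reason.c_str());
    }

    // Hypothetical handler standing in for avnd_evt_mds_avd_dn_evh:
    // any AVD-down event is reported as a director "crash".
    static void on_avd_service_event(MdsSvcEvent evt) {
      if (evt != MdsSvcEvent::kDown) return;
      std::printf("ER AMF director unexpectedly crashed\n");
      request_node_reboot(
          "local AVD down(Adest) or both AVD down(Vdest) received");
    }

    int main() {
      // Both scenarios deliver the same MDS event to the payload,
      // so the payload logs the same message either way.
      on_avd_service_event(MdsSvcEvent::kDown);  // director process crashed
      on_avd_service_event(MdsSvcEvent::kDown);  // controller VM suspended
      return 0;
    }

The osafimmnd excerpt below appears to show the same pattern on the IMM side (IMMD service down => cluster restart).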
/var/log/opensaf/osafimmnd:
Nov 16 18:17:32.157768 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:17:37.189766 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:17:42.215437 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:17:47.240941 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:17:52.266672 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:17:57.292457 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:18:02.318141 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:18:07.343932 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:18:12.369621 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:18:17.395313 osafimmnd [1837:immnd_proc.c:1637] T5 tmout:1000 ste:10 ME:22 RE:22 crd:0 rim:KEEP_REPO 4.3A:1 2Pbe:0 VetA/B: 1/0 othsc:0/0
Nov 16 18:18:19.448718 osafimmnd [1837:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.448789 osafimmnd [1837:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.448857 osafimmnd [1837:immnd_mds.c:0549] >> immnd_mds_svc_evt
Nov 16 18:18:19.448878 osafimmnd [1837:immnd_mds.c:0576] TR Director Service in NOACTIVE state
Nov 16 18:18:19.448896 osafimmnd [1837:immnd_mds.c:0628] << immnd_mds_svc_evt
Nov 16 18:18:19.448912 osafimmnd [1837:immnd_mds.c:0549] >> immnd_mds_svc_evt
Nov 16 18:18:19.448920 osafimmnd [1837:immnd_mds.c:0556] TR IMMD SERVICE DOWN => CLUSTER GOING DOWN
Nov 16 18:18:19.448930 osafimmnd [1837:immnd_mds.c:0628] << immnd_mds_svc_evt
Nov 16 18:18:19.448941 osafimmnd [1837:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.448977 osafimmnd [1837:mds_dt_trans.c:0576] >> mdtm_process_poll_recv_data_tcp
Nov 16 18:18:19.449032 osafimmnd [1837:immsv_evt.c:5414] T8 Received: IMMND_EVT_MDS_INFO (1) from 0
Nov 16 18:18:19.449063 osafimmnd [1837:immsv_evt.c:5414] T8 Received: IMMND_EVT_MDS_INFO (1) from 0
Nov 16 18:18:19.449102 osafimmnd [1837:immnd_evt.c:9815] NO No IMMD service => cluster restart, exiting

Regards,
David
