---
**[tickets:#2338] amfd crashed while changing role from quiesced to active**
**Status:** unassigned
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Fri Mar 03, 2017 05:41 AM UTC
**Owner:** nobody
#Environment details
OS : SUSE 64-bit
Changeset : 8634 (5.2.FC)
Setup : 4 nodes (2 controllers and 2 payloads, with 1PBE enabled)
#Summary
amfd crashed while changing role from quiesced to active
#Steps followed & Observed behaviour
1. Invoke switchovers
2. After a few successful switchovers, SC-1 got the active role and SC-2 got
the standby role.
3. Invoke one more switchover. SC-1 took the quiesced role and SC-2
successfully became active. After this, cpd crashed on SC-2. While SC-1 was
changing role from quiesced back to active, amfd crashed on SC-1, which
resulted in a cluster reset.
>>For the cpd crash, refer to ticket #2337
Syslog of SC-1:
Mar 2 14:12:00 TestBed-R1 osafimmnd[2138]: NO PBE-OI established on this SC. Dumping incrementally to file imm.db
Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: ER Impl Set Failed for SaAmfNodeSwBundle, returned 5
Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: ER avd_imm_applier_set FAILED, 5
Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: src/amf/amfd/role.cc:807: avd_mds_qsd_role_evh: Assertion '0' failed.
Mar 2 14:12:03 TestBed-R1 osafamfnd[2188]: ER AMFD has unexpectedly crashed. Rebooting node
Mar 2 14:12:03 TestBed-R1 osafamfnd[2188]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 131343, SupervisionTime = 60
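The two `ER` lines above end in a bare return code of 5. As a rough aid for reading them, the sketch below maps such codes to names, assuming the value is an `SaAisErrorT` from the SA Forum AIS header (the log itself does not say so). Under that assumption 5 is `SA_AIS_ERR_TIMEOUT`. The helper name `decode_return_codes` and the regular expression are illustrative only and are not part of OpenSAF.

```python
import re

# Subset of SaAisErrorT values as defined in the SA Forum AIS C header (saAis.h).
SA_AIS_ERRORS = {
    1: "SA_AIS_OK",
    2: "SA_AIS_ERR_LIBRARY",
    3: "SA_AIS_ERR_VERSION",
    4: "SA_AIS_ERR_INIT",
    5: "SA_AIS_ERR_TIMEOUT",
    6: "SA_AIS_ERR_TRY_AGAIN",
    7: "SA_AIS_ERR_INVALID_PARAM",
    8: "SA_AIS_ERR_NO_MEMORY",
    9: "SA_AIS_ERR_BAD_HANDLE",
    10: "SA_AIS_ERR_BUSY",
    11: "SA_AIS_ERR_ACCESS",
    12: "SA_AIS_ERR_NOT_EXIST",
    13: "SA_AIS_ERR_NAME_TOO_LONG",
    14: "SA_AIS_ERR_EXIST",
}


def decode_return_codes(syslog_line):
    """Hypothetical helper: find 'returned N' or 'FAILED, N' and name each N.

    Assumes N is an SaAisErrorT value; codes outside the table map to 'unknown'.
    """
    codes = re.findall(r"(?:returned|FAILED,)\s+(\d+)", syslog_line)
    return [(int(c), SA_AIS_ERRORS.get(int(c), "unknown")) for c in codes]


line = "Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: ER avd_imm_applier_set FAILED, 5"
print(decode_return_codes(line))  # [(5, 'SA_AIS_ERR_TIMEOUT')]
```

If that reading is right, the implementer/applier set call toward IMM timed out just before the assertion at role.cc:807 fired; the attached amfd trace would be needed to confirm it.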
BT:
(gdb) thread apply all bt
Thread 4 (Thread 0x7f2e04fe4b00 (LWP 2182)):
0 0x00007f2e034174f6 in poll () from /lib64/libc.so.6
1 0x00007f2e0414633b in osaf_ppoll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout_ts=0x7f2e04fe4180, i_sigmask=0x0) at src/base/osaf_poll.c:105
2 0x00007f2e04146261 in osaf_poll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout=30000) at src/base/osaf_poll.c:44
3 0x00007f2e04146430 in osaf_poll_one_fd (i_fd=15, i_timeout=30000) at src/base/osaf_poll.c:128
4 0x00007f2e0418d360 in rda_read_msg (sockfd=15, msg=0x7f2e04fe4260 "10 1", size=64) at src/rde/agent/rda_papi.cc:673
5 0x00007f2e0418cb40 in rda_callback_task (rda_callback_cb=0x7f2e0549c440) at src/rde/agent/rda_papi.cc:150
6 0x00007f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
7 0x00007f2e034209cd in clone () from /lib64/libc.so.6
8 0x0000000000000000 in ?? ()
Thread 3 (Thread 0x7f2e05004b00 (LWP 2181)):
0 0x00007f2e034174f6 in poll () from /lib64/libc.so.6
1 0x00007f2e04188958 in mdtm_process_recv_events () at src/mds/mds_dt_tipc.c:669
2 0x00007f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
3 0x00007f2e034209cd in clone () from /lib64/libc.so.6
4 0x0000000000000000 in ?? ()
Thread 2 (Thread 0x7f2e0503ab00 (LWP 2180)):
0 0x00007f2e034174f6 in poll () from /lib64/libc.so.6
1 0x00007f2e0414633b in osaf_ppoll (io_fds=0x7f2e0503a270, i_nfds=1, i_timeout_ts=0x7f2e0503a2a0, i_sigmask=0x0) at src/base/osaf_poll.c:105
2 0x00007f2e04150604 in ncs_tmr_wait () at src/base/sysf_tmr.c:406
3 0x00007f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
4 0x00007f2e034209cd in clone () from /lib64/libc.so.6
5 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x7f2e05007720 (LWP 2178)):
0 0x00007f2e0337bb55 in raise () from /lib64/libc.so.6
1 0x00007f2e0337d131 in abort () from /lib64/libc.so.6
2 0x00007f2e0414b6e7 in __osafassert_fail (__file=0x7f2e05215e2f "src/amf/amfd/role.cc", __line=807, __func=0x7f2e05216c90 <avd_mds_qsd_role_evh(cl_cb_tag*, AVD_EVT*)::__FUNCTION__> "avd_mds_qsd_role_evh", __assertion=0x7f2e05216548 "0") at src/base/sysf_def.c:281
3 0x00007f2e05182755 in avd_mds_qsd_role_evh (cb=0x7f2e054640c0 <_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/role.cc:807
4 0x00007f2e05156536 in process_event (cb_now=0x7f2e054640c0 <_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/main.cc:811
5 0x00007f2e051560ee in main_loop () at src/amf/amfd/main.cc:702
6 0x00007f2e051566fd in main (argc=2, argv=0x7fff5826f318) at src/amf/amfd/main.cc:861
(gdb)
Notes:
1. Syslogs of both controllers attached
2. amfd bt attached
3. amfd trace attached
The two nodes are not in time sync; there is a time gap between them.
Relative to SC-2, SC-1 is about 50 minutes ahead.
Time Diff
==========
TestBed-R1:~ date
Thu Mar 2 16:34:45 IST 2017
TestBed-R2:~ date
Thu Mar 2 15:44:30 IST 2017
=========
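Because the clocks differ, syslog entries from the two controllers cannot be lined up directly. A minimal sketch of the correction follows, assuming the two `date` commands above were run at nearly the same moment (the ticket does not state this) and that Python runs with an English/C locale. The function names are illustrative.

```python
from datetime import datetime, timedelta


def parse_date_output(text):
    """Parse `date` output such as 'Thu Mar 2 16:34:45 IST 2017'.

    The zone token is dropped because both nodes report the same zone (IST)
    and strptime's %Z handling is platform dependent.
    """
    parts = text.split()
    without_zone = " ".join(parts[:4] + parts[5:])
    return datetime.strptime(without_zone, "%a %b %d %H:%M:%S %Y")


def to_sc2_clock(sc1_time, offset):
    """Convert a timestamp read on SC-1 to the equivalent SC-2 clock time."""
    return sc1_time - offset


sc1_now = parse_date_output("Thu Mar 2 16:34:45 IST 2017")  # TestBed-R1
sc2_now = parse_date_output("Thu Mar 2 15:44:30 IST 2017")  # TestBed-R2
offset = sc1_now - sc2_now
print(offset)  # 0:50:15

# The amfd assertion was logged on SC-1 at 14:12:03 (year taken from the ticket).
crash_on_sc1 = datetime(2017, 3, 2, 14, 12, 3)
print(to_sc2_clock(crash_on_sc1, offset))  # 2017-03-02 13:21:48
```

Under these assumptions, events near 14:12:03 in the SC-1 syslog correspond to roughly 13:21:48 in the SC-2 syslog.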
---