---

** [tickets:#2338] amfd crashed while changing role from quiesced to active**

**Status:** unassigned
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Fri Mar 03, 2017 05:41 AM UTC
**Owner:** nobody


#Environment details
OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )


#Summary
amfd crashed while changing role from quiesced to active

#Steps followed & Observed behaviour
   1. Invoke switchovers.
   2. After a few successful switchovers, SC-1 had the active role and SC-2 had the standby role.
   3. Invoke one more switchover. SC-1 took the quiesced role and SC-2 successfully became active. After this, cpd crashed on SC-2. While SC-1 was changing role from quiesced back to active, amfd crashed on SC-1, which resulted in a cluster reset.

>>For the cpd crash, refer to ticket #2337

Syslog of SC-1:
Mar  2 14:12:00 TestBed-R1 osafimmnd[2138]: NO PBE-OI established on this SC. Dumping incrementally to file imm.db
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER Impl Set Failed for SaAmfNodeSwBundle, returned 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER avd_imm_applier_set FAILED, 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: src/amf/amfd/role.cc:807: avd_mds_qsd_role_evh: Assertion '0' failed.
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: ER AMFD has unexpectedly crashed. Rebooting node
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 131343, SupervisionTime = 60



BT:
(gdb) thread apply all bt

Thread 4 (Thread 0x7f2e04fe4b00 (LWP 2182)):
0  0x00007f2e034174f6 in poll () from /lib64/libc.so.6
1  0x00007f2e0414633b in osaf_ppoll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout_ts=0x7f2e04fe4180, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x00007f2e04146261 in osaf_poll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout=30000) at src/base/osaf_poll.c:44
3  0x00007f2e04146430 in osaf_poll_one_fd (i_fd=15, i_timeout=30000) at src/base/osaf_poll.c:128
4  0x00007f2e0418d360 in rda_read_msg (sockfd=15, msg=0x7f2e04fe4260 "10 1", size=64) at src/rde/agent/rda_papi.cc:673
5  0x00007f2e0418cb40 in rda_callback_task (rda_callback_cb=0x7f2e0549c440) at src/rde/agent/rda_papi.cc:150
6  0x00007f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
7  0x00007f2e034209cd in clone () from /lib64/libc.so.6
8  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f2e05004b00 (LWP 2181)):
0  0x00007f2e034174f6 in poll () from /lib64/libc.so.6
1  0x00007f2e04188958 in mdtm_process_recv_events () at src/mds/mds_dt_tipc.c:669
2  0x00007f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
3  0x00007f2e034209cd in clone () from /lib64/libc.so.6
4  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f2e0503ab00 (LWP 2180)):
0  0x00007f2e034174f6 in poll () from /lib64/libc.so.6
1  0x00007f2e0414633b in osaf_ppoll (io_fds=0x7f2e0503a270, i_nfds=1, i_timeout_ts=0x7f2e0503a2a0, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x00007f2e04150604 in ncs_tmr_wait () at src/base/sysf_tmr.c:406
3  0x00007f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
4  0x00007f2e034209cd in clone () from /lib64/libc.so.6
5  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f2e05007720 (LWP 2178)):
0  0x00007f2e0337bb55 in raise () from /lib64/libc.so.6
1  0x00007f2e0337d131 in abort () from /lib64/libc.so.6
2  0x00007f2e0414b6e7 in __osafassert_fail (__file=0x7f2e05215e2f "src/amf/amfd/role.cc", __line=807, __func=0x7f2e05216c90 <avd_mds_qsd_role_evh(cl_cb_tag*, AVD_EVT*)::__FUNCTION__> "avd_mds_qsd_role_evh", __assertion=0x7f2e05216548 "0") at src/base/sysf_def.c:281
3  0x00007f2e05182755 in avd_mds_qsd_role_evh (cb=0x7f2e054640c0 <_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/role.cc:807
4  0x00007f2e05156536 in process_event (cb_now=0x7f2e054640c0 <_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/main.cc:811
5  0x00007f2e051560ee in main_loop () at src/amf/amfd/main.cc:702
6  0x00007f2e051566fd in main (argc=2, argv=0x7fff5826f318) at src/amf/amfd/main.cc:861
(gdb)





Notes:
1. Syslogs of both controllers attached
2. amfd bt attached
3. amfd trace attached

The two nodes are not in time sync; there is a time gap between them.
Relative to SC-2, SC-1 is about 50 minutes ahead.
Time Diff
==========
TestBed-R1:~  date
Thu Mar 2 16:34:45 IST 2017
TestBed-R2:~  date
Thu Mar 2 15:44:30 IST 2017
=========
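
Because the two syslogs use different clocks, SC-1 timestamps need shifting before they can be lined up against SC-2 events. The following is a minimal Python sketch of that arithmetic, not part of any OpenSAF tooling. The helper names `parse_date_output` and `sc1_to_sc2` are hypothetical. It assumes an English locale and that the two `date` commands above were run at roughly the same moment, so the computed skew is only approximate.

```python
from datetime import datetime, timedelta

def parse_date_output(text: str) -> datetime:
    """Parse `date` output such as 'Thu Mar 2 16:34:45 IST 2017'."""
    parts = text.split()
    parts.pop(4)  # drop the timezone token; both nodes report IST
    return datetime.strptime(" ".join(parts), "%a %b %d %H:%M:%S %Y")

# Values copied from the Time Diff block above.
sc1_now = parse_date_output("Thu Mar 2 16:34:45 IST 2017")  # TestBed-R1
sc2_now = parse_date_output("Thu Mar 2 15:44:30 IST 2017")  # TestBed-R2
skew = sc1_now - sc2_now  # how far SC-1 is ahead of SC-2

def sc1_to_sc2(ts: str, year: int = 2017) -> datetime:
    """Convert an SC-1 syslog timestamp ('Mar  2 14:12:03') to SC-2 clock time.

    Syslog timestamps carry no year, so it is supplied as an assumption.
    """
    sc1_time = datetime.strptime(f"{year} {ts}", "%Y %b %d %H:%M:%S")
    return sc1_time - skew

print(skew)                           # skew between the two nodes
print(sc1_to_sc2("Mar  2 14:12:03"))  # amfd assertion time on SC-2's clock
```

With the values above, the skew is 50 minutes 15 seconds, so the amfd assertion logged at 14:12:03 on SC-1 corresponds to roughly 13:21:48 in the SC-2 syslog.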


---
