[tickets] [opensaf:tickets] #2338 amfd got crashed while changing role from queised to active
- **status**: review --> fixed
- **Comment**:

changeset: 8687:43deca051ae2
branch: opensaf-5.0.x
parent: 8682:50a2033a8a8d
user: Nagendra Kumar
date: Fri Mar 10 15:30:59 2017 +0530
summary: amfd: handle TIMEOUT for avd_imm_applier_set [#2338]

changeset: 8688:c4271e0114d8
branch: opensaf-5.1.x
parent: 8683:59e265654232
user: Nagendra Kumar
date: Fri Mar 10 15:31:10 2017 +0530
summary: amfd: handle TIMEOUT for avd_imm_applier_set [#2338]

changeset: 8689:4cefc956fdf0
tag: tip
parent: 8686:03647db14f06
user: Nagendra Kumar
date: Fri Mar 10 15:31:28 2017 +0530
summary: amfd: handle TIMEOUT for avd_imm_applier_set [#2338]

[staging:43deca]
[staging:c4271e]
[staging:4cefc9]

---

** [tickets:#2338] amfd got crashed while changing role from queised to active**

**Status:** fixed
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Wed Mar 08, 2017 08:39 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- [osafamfd.tgz](https://sourceforge.net/p/opensaf/tickets/2338/attachment/osafamfd.tgz) (2.8 MB; application/octet-stream)
- [syslog.7z](https://sourceforge.net/p/opensaf/tickets/2338/attachment/syslog.7z) (649.4 kB; application/octet-stream)

#Environment details
OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )

#Summary
amfd crashed while changing role from quiesced to active

#Steps followed & Observed behaviour
1. Invoke switchovers.
2. After a few successful switchovers, SC-1 got the active role and SC-2 got the standby role.
3. Invoke one more switchover, in which SC-1 got the quiesced role and SC-2 successfully became active. After this, cpd crashed on SC-2. While SC-1 was changing role from quiesced to active, amfd crashed on SC-1, which resulted in a cluster reset.

>>For the CPD crash, refer to ticket #2337

Syslog of SC-1:
Mar 2 14:12:00 TestBed-R1 osafimmnd[2138]: NO PBE-OI established on this SC. Dumping incrementally to file imm.db
Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: ER Impl Set Failed for SaAmfNodeSwBundle, returned 5
Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: ER avd_imm_applier_set FAILED, 5
Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: src/amf/amfd/role.cc:807: avd_mds_qsd_role_evh: Assertion '0' failed.
Mar 2 14:12:03 TestBed-R1 osafamfnd[2188]: ER AMFD has unexpectedly crashed. Rebooting node
Mar 2 14:12:03 TestBed-R1 osafamfnd[2188]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 131343, SupervisionTime = 60

BT:
(gdb) thread apply all bt

Thread 4 (Thread 0x7f2e04fe4b00 (LWP 2182)):
0 0x7f2e034174f6 in poll () from /lib64/libc.so.6
1 0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout_ts=0x7f2e04fe4180, i_sigmask=0x0) at src/base/osaf_poll.c:105
2 0x7f2e04146261 in osaf_poll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout=3) at src/base/osaf_poll.c:44
3 0x7f2e04146430 in osaf_poll_one_fd (i_fd=15, i_timeout=3) at src/base/osaf_poll.c:128
4 0x7f2e0418d360 in rda_read_msg (sockfd=15, msg=0x7f2e04fe4260 "10 1", size=64) at src/rde/agent/rda_papi.cc:673
5 0x7f2e0418cb40 in rda_callback_task (rda_callback_cb=0x7f2e0549c440) at src/rde/agent/rda_papi.cc:150
6 0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
7 0x7f2e034209cd in clone () from /lib64/libc.so.6
8 0x in ?? ()

Thread 3 (Thread 0x7f2e05004b00 (LWP 2181)):
0 0x7f2e034174f6 in poll () from /lib64/libc.so.6
1 0x7f2e04188958 in mdtm_process_recv_events () at src/mds/mds_dt_tipc.c:669
2 0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
3 0x7f2e034209cd in clone () from /lib64/libc.so.6
4 0x in ?? ()

Thread 2 (Thread 0x7f2e0503ab00 (LWP 2180)):
0 0x7f2e034174f6 in poll () from /lib64/libc.so.6
1 0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e0503a270, i_nfds=1, i_timeout_ts=0x7f2e0503a2a0, i_sigmask=0x0) at src/base/osaf_poll.c:105
2 0x7f2e04150604 in ncs_tmr_wait () at src/base/sysf_tmr.c:406
3 0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
4 0x7f2e034209cd in clone () from /lib64/libc.so.6
5 0x in ?? ()

Thread 1 (Thread 0x7f2e05007720 (LWP 2178)):
0 0x7f2e0337bb55 in raise () from /lib64/libc.so.6
1 0x7f2e0337d131 in abort () from /lib64/libc.so.6
2 0x7f2e0414b6e7 in __osafassert_fail (__file=0x7f2e05215e2f "src/amf/amfd/role.cc", __line=807, __func=0x7f2e05216c90 "avd_mds_qsd_role_evh", __assertion=0x7f2e05216548 "0") at src/base/sysf_def.c:281
3 0x7f2e05182755 in avd_mds_qsd_role_evh (cb=0x7f2e054640c0 <_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/role.cc:807
4 0x7f2e05156536 in process_event
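
The syslog in this message shows `avd_imm_applier_set` failing with return code 5, which is `SA_AIS_ERR_TIMEOUT` in the SA Forum `SaAisErrorT` enumeration, immediately before the assertion at role.cc:807. The changeset summary says only that TIMEOUT is now handled; the actual patch is not quoted in this thread. The following is therefore a minimal sketch of one common way to treat such a code as transient, not the committed fix. The helper `call_with_retry`, its parameters, and the local enum (a stand-in for the real `saAis.h` header) are assumptions made for illustration.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Local stand-in for a few SaAisErrorT values (numbers as in the AIS spec).
// Real code would include saAis.h instead of redefining these.
enum SaAisErrorT {
  SA_AIS_OK = 1,
  SA_AIS_ERR_TIMEOUT = 5,
  SA_AIS_ERR_TRY_AGAIN = 6,
  SA_AIS_ERR_BAD_HANDLE = 9,
};

// Hypothetical helper: invokes `op` until it returns something other than
// TIMEOUT or TRY_AGAIN, or until `max_attempts` calls have been made.
// Returns the last code seen; with max_attempts <= 0 it returns
// SA_AIS_ERR_TIMEOUT without calling `op` at all.
SaAisErrorT call_with_retry(const std::function<SaAisErrorT()>& op,
                            int max_attempts,
                            std::chrono::milliseconds delay) {
  SaAisErrorT rc = SA_AIS_ERR_TIMEOUT;
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    rc = op();
    // Any code other than the two transient ones ends the loop at once.
    if (rc != SA_AIS_ERR_TIMEOUT && rc != SA_AIS_ERR_TRY_AGAIN) break;
    // Sleep only if another attempt will follow.
    if (attempt + 1 < max_attempts) std::this_thread::sleep_for(delay);
  }
  return rc;
}
```

A caller in the role-change path could then decide what to do with a code that is still TIMEOUT after the last attempt (for example, log it and continue in a degraded state) instead of asserting unconditionally. Whether a retry is safe for a given IMM call is a separate question that this sketch does not answer.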
[tickets] [opensaf:tickets] #2338 amfd got crashed while changing role from queised to active
- **status**: assigned --> accepted

---

** [tickets:#2338] amfd got crashed while changing role from queised to active**

**Status:** accepted
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Tue Mar 07, 2017 07:27 AM UTC
**Owner:** Nagendra Kumar
[tickets] [opensaf:tickets] #2338 amfd got crashed while changing role from queised to active
- **status**: unassigned --> assigned
- **assigned_to**: Nagendra Kumar

---

** [tickets:#2338] amfd got crashed while changing role from queised to active**

**Status:** assigned
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Fri Mar 03, 2017 05:42 AM UTC
**Owner:** Nagendra Kumar
[tickets] [opensaf:tickets] #2338 amfd got crashed while changing role from queised to active
- Attachments has changed:

Diff:

--- old
+++ new
@@ -0,0 +1,2 @@
+osafamfd.tgz (2.8 MB; application/octet-stream)
+syslog.7z (649.4 kB; application/octet-stream)

---

** [tickets:#2338] amfd got crashed while changing role from queised to active**

**Status:** unassigned
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Fri Mar 03, 2017 05:42 AM UTC
**Owner:** nobody
**Attachments:**

- [osafamfd.tgz](https://sourceforge.net/p/opensaf/tickets/2338/attachment/osafamfd.tgz) (2.8 MB; application/octet-stream)
- [syslog.7z](https://sourceforge.net/p/opensaf/tickets/2338/attachment/syslog.7z) (649.4 kB; application/octet-stream)
[tickets] [opensaf:tickets] #2338 amfd got crashed while changing role from queised to active
---

** [tickets:#2338] amfd got crashed while changing role from queised to active**

**Status:** unassigned
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Fri Mar 03, 2017 05:41 AM UTC
**Owner:** nobody

#Environment details
OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )

#Summary
amfd crashed while changing role from quiesced to active

#Steps followed & Observed behaviour
1. Invoke switchovers.
2. After a few successful switchovers, SC-1 got the active role and SC-2 got the standby role.
3. Invoke one more switchover, in which SC-1 got the quiesced role and SC-2 successfully became active. After this, cpd crashed on SC-2. While SC-1 was changing role from quiesced to active, amfd crashed on SC-1, which resulted in a cluster reset.

>>For the CPD crash, refer to ticket #2337

Syslog of SC-1:
Mar 2 14:12:00 TestBed-R1 osafimmnd[2138]: NO PBE-OI established on this SC. Dumping incrementally to file imm.db
Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: ER Impl Set Failed for SaAmfNodeSwBundle, returned 5
Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: ER avd_imm_applier_set FAILED, 5
Mar 2 14:12:03 TestBed-R1 osafamfd[2178]: src/amf/amfd/role.cc:807: avd_mds_qsd_role_evh: Assertion '0' failed.
Mar 2 14:12:03 TestBed-R1 osafamfnd[2188]: ER AMFD has unexpectedly crashed. Rebooting node
Mar 2 14:12:03 TestBed-R1 osafamfnd[2188]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 131343, SupervisionTime = 60

BT:
(gdb) thread apply all bt

Thread 4 (Thread 0x7f2e04fe4b00 (LWP 2182)):
0 0x7f2e034174f6 in poll () from /lib64/libc.so.6
1 0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout_ts=0x7f2e04fe4180, i_sigmask=0x0) at src/base/osaf_poll.c:105
2 0x7f2e04146261 in osaf_poll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout=3) at src/base/osaf_poll.c:44
3 0x7f2e04146430 in osaf_poll_one_fd (i_fd=15, i_timeout=3) at src/base/osaf_poll.c:128
4 0x7f2e0418d360 in rda_read_msg (sockfd=15, msg=0x7f2e04fe4260 "10 1", size=64) at src/rde/agent/rda_papi.cc:673
5 0x7f2e0418cb40 in rda_callback_task (rda_callback_cb=0x7f2e0549c440) at src/rde/agent/rda_papi.cc:150
6 0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
7 0x7f2e034209cd in clone () from /lib64/libc.so.6
8 0x in ?? ()

Thread 3 (Thread 0x7f2e05004b00 (LWP 2181)):
0 0x7f2e034174f6 in poll () from /lib64/libc.so.6
1 0x7f2e04188958 in mdtm_process_recv_events () at src/mds/mds_dt_tipc.c:669
2 0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
3 0x7f2e034209cd in clone () from /lib64/libc.so.6
4 0x in ?? ()

Thread 2 (Thread 0x7f2e0503ab00 (LWP 2180)):
0 0x7f2e034174f6 in poll () from /lib64/libc.so.6
1 0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e0503a270, i_nfds=1, i_timeout_ts=0x7f2e0503a2a0, i_sigmask=0x0) at src/base/osaf_poll.c:105
2 0x7f2e04150604 in ncs_tmr_wait () at src/base/sysf_tmr.c:406
3 0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
4 0x7f2e034209cd in clone () from /lib64/libc.so.6
5 0x in ?? ()

Thread 1 (Thread 0x7f2e05007720 (LWP 2178)):
0 0x7f2e0337bb55 in raise () from /lib64/libc.so.6
1 0x7f2e0337d131 in abort () from /lib64/libc.so.6
2 0x7f2e0414b6e7 in __osafassert_fail (__file=0x7f2e05215e2f "src/amf/amfd/role.cc", __line=807, __func=0x7f2e05216c90 "avd_mds_qsd_role_evh", __assertion=0x7f2e05216548 "0") at src/base/sysf_def.c:281
3 0x7f2e05182755 in avd_mds_qsd_role_evh (cb=0x7f2e054640c0 <_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/role.cc:807
4 0x7f2e05156536 in process_event (cb_now=0x7f2e054640c0 <_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/main.cc:811
5 0x7f2e051560ee in main_loop () at src/amf/amfd/main.cc:702
6 0x7f2e051566fd in main (argc=2, argv=0x7fff5826f318) at src/amf/amfd/main.cc:861
(gdb)

Notes:
1. Syslogs of both controllers are attached.
2. amfd bt is attached.
3. amfd trace is attached.

The two nodes are not time-synchronized; there is a time gap between them. Relative to SC-2, SC-1 is about 50 minutes ahead.

Time Diff:
TestBed-R1:~ date
Thu Mar 2 16:34:45 IST 2017
TestBed-R2:~ date
Thu Mar 2 15:44:30 IST 2017

---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.