TC #21: Node shutdown; result is the same as TC #17. Logs "TC 21" are attached to the ticket.

TC #22: Node lock and unlock (2nd time); result is the same as TC #18. Logs "TC 22" are attached to the ticket.

TC #23: Node group shutdown operation: Stop the controller during the shutdown operation. The node group remains in the shutdown state even after all the assignments are removed.

TC #24: Configuration: Start SC-1, PL-3 and PL-4. Create a 2N redundancy model with SU1 (Active) on PL-3 and SU2 (Standby) on PL-4. Create a node group containing PL-3 and PL-4. Lock and then lock-in the node group. Reboot the controller. Check the admin state of the node group; it will be 3 (locked-instantiation). Now unlock-in the node group: SU1 and SU2 are not instantiated (which is wrong). The node group admin state is now 2 (locked). Logs "TC 24" are attached to the ticket.

TC #25: Same configuration as TC #24: Lock the node group and keep gdb in amf_csi_set_callback in SU1's component (Active) and in amf_csi_remove_callback in SU2's component (Standby). Reboot the controller. When SC-1 comes up, let the SU1 and SU2 components respond from gdb. The following error appears, which means an extra SUSI remove is arriving:

Feb 11 19:46:27 PM_PL-3 osafamfnd[16881]: ER susi_assign_evh: 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' has no assignments

Logs "TC 25" are attached to the ticket.

TC #26: Same configuration as TC #24: Lock the node group and keep gdb in amf_csi_remove_callback in SU1's component (Active) and in amf_csi_remove_callback in SU2's component (Standby). Reboot the controller. When SC-1 comes up, let the SU1 and SU2 components respond from gdb. Amfnd crashes on both PL-3 and PL-4:

PL-3:
Feb 11 19:56:41 PM_PL-3 osafamfnd[17561]: di.cc:850: avnd_di_susi_resp_send: Assertion 'si' failed.
Feb 11 19:56:42 PM_PL-3 osafclmna[17552]: AL AMF Node Director is down, terminate this process

PL-4:
Feb 11 19:56:40 PM_PL-4 osafamfnd[10912]: di.cc:850: avnd_di_susi_resp_send: Assertion 'si' failed.
Feb 11 19:56:40 PM_PL-4 osafimmnd[10893]: NO Global discard node received for nodeId:2030f pid:17542
Feb 11 19:56:40 PM_PL-4 osafimmnd[10893]: NO Implementer disconnected 13 <0, 2030f(down)> (MsgQueueService131855)
Feb 11 19:56:40 PM_PL-4 amf_demo[11007]: AL AMF Node Director is down, terminate this process

Logs "TC 26" are attached to the ticket.

Thanks
-Nagu

> -----Original Message-----
> From: Nagendra Kumar
> Sent: 10 February 2016 17:30
> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au; Praveen Malviya
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add support for cloud resilience [#1620] V2
>
> TC #16. Same configuration as #12: Run SI shutdown and keep sleep of 5 sec
> before saAmfCSIQuiescingComplete and stop controller and then after sleep,
> reject saAmfCSIQuiescingComplete with SA_AIS_ERR_FAILED_OPERATION. All the
> assignments from SU1 on PL-3 and SU2 on PL-4 are removed and SI admin state
> is 2 (locked):
> saAmfSIAdminState SA_UINT32_T 2 (0x2)
>
> The SI going into locked state differs from the behaviour when this test
> case is run with the controller up and running. If the controller is
> available, the SI will be in unlocked state and all the assignments will be
> on SU2 as Act and SU3 as Standby (on PL-4). This needs either correction or
> documentation.
>
> TC #17. Same configuration as #12: Run SG shutdown and keep sleep of 5 sec
> before saAmfCSIQuiescingComplete and stop controller and then after sleep,
> reject saAmfCSIQuiescingComplete with SA_AIS_ERR_FAILED_OPERATION.
> Amfnd crashes [please note that this test case works with controller up]:
> Syslog and bt:
> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO component with QUIESCED/QUIESCING assignment failed
> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO recovery action 'comp restart' escalated to 'comp failover'
> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO SU failover probation timer started (timeout: 1200000000000 ns)
> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO Performing failover of 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' recovery action escalated from 'componentRestart' to 'componentFailover'
> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> Feb 10 11:44:29 PM_PL-3 amf_demo[15721]: saAmfHAStateGet FAILED - 7
> Feb 10 11:44:29 PM_PL-3 amf_demo[15721]: exiting (caught term signal)
> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO avnd_di_oper_send() deferred as AMF director is offline
> Feb 10 11:44:29 PM_PL-3 osafimmnd[15760]: AL AMF Node Director is down, terminate this process
>
> Program terminated with signal 11, Segmentation fault.
> #0 0x0000000000412b50 in avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
> (gdb) bt
> #0 0x0000000000412b50 in avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
> #1 0x000000000040a093 in avnd_comp_clc_terming_cleansucc_hdler(avnd_cb_tag*, avnd_comp_tag*) ()
> #2 0x000000000040c7d4 in avnd_comp_clc_fsm_run(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) ()
> #3 0x000000000040ce49 in avnd_evt_clc_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> #4 0x000000000042133f in avnd_main_process() () at main.cc:667
> #5 0x0000000000405517 in main () at main.cc:186
> (gdb) thread apply all bt
>
> Thread 4 (Thread 0x7fdaf3c05b00 (LWP 15512)):
> #0 0x00007fdaf2b2b415 in __lll_unlock_wake () from /lib64/libpthread.so.0
> #1 0x00007fdaf2b27ac4 in _L_unlock_553 () from /lib64/libpthread.so.0
> #2 0x00007fdaf2b279f7 in __pthread_mutex_unlock_usercnt () from /lib64/libpthread.so.0
> #3 0x00007fdaf37edac3 in ncs_os_lock () from /usr/local/lib/libopensaf_core.so.0
> #4 0x00007fdaf37e084d in ncs_ipc_send () from /usr/local/lib/libopensaf_core.so.0
> #5 0x000000000041eea1 in avnd_evt_send(avnd_cb_tag*, avnd_evt_tag*) ()
> #6 0x000000000040a2cb in comp_clc_resp_callback(NCS_OS_PROC_EXECUTE_TIMED_CB_INFO*) ()
> #7 0x00007fdaf37ecdfb in give_exec_mod_cb () from /usr/local/lib/libopensaf_core.so.0
> #8 0x00007fdaf37ecfde in ncs_exec_mod_hdlr () from /usr/local/lib/libopensaf_core.so.0
> #9 0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
> #10 0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
> #11 0x0000000000000000 in ?? ()
>
> Thread 3 (Thread 0x7fdaf3c25b00 (LWP 15510)):
> #0 0x00007fdaf20d14f6 in poll () from /lib64/libc.so.6
> #1 0x00007fdaf3817623 in mdtm_process_recv_events () from /usr/local/lib/libopensaf_core.so.0
> #2 0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
> #3 0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
> #4 0x0000000000000000 in ?? ()
>
> Thread 2 (Thread 0x7fdaf3c58b00 (LWP 15509)):
> #0 0x00007fdaf20d14f6 in poll () from /lib64/libc.so.6
> #1 0x00007fdaf37db22f in osaf_ppoll () from /usr/local/lib/libopensaf_core.so.0
> #2 0x00007fdaf37e2acf in ncs_tmr_wait () from /usr/local/lib/libopensaf_core.so.0
> #3 0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
> #4 0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
> #5 0x0000000000000000 in ?? ()
>
> Thread 1 (Thread 0x7fdaf3c28720 (LWP 15508)):
> #0 0x0000000000412b50 in avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
> #1 0x000000000040a093 in avnd_comp_clc_terming_cleansucc_hdler(avnd_cb_tag*, avnd_comp_tag*) ()
> #2 0x000000000040c7d4 in avnd_comp_clc_fsm_run(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) ()
> #3 0x000000000040ce49 in avnd_evt_clc_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> #4 0x000000000042133f in avnd_main_process() () at main.cc:667
> #5 0x0000000000405517 in main () at main.cc:186
>
> TC #18. Same configuration as #12: Run SG lock and keep gdb in
> amf_csi_remove_callback and stop controller and then start the controller
> and bring it up. Now release Amfnd from gdb so that it can respond to csi
> remove (please note that the controller has rebooted and is available now).
> Now, issue SG unlock.
> Amfnd crashes on PL-3 and PL-4 at the same location [please note that this
> test case works with controller up]:
> Syslog and Bt:
> Feb 10 16:35:51 PM_PL-3 amf_demo[26623]: CSI Remove for all CSIs
> Feb 10 16:35:51 PM_PL-3 osafamfnd[26545]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: NO Assigning 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]: CSI Set - add 'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Active
> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]: name: abcdef, value: val1
> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]: name: abcdef, value: val2
> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: NO Assigned 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: di.cc:850: avnd_di_susi_resp_send: Assertion 'si' failed.
> Feb 10 16:36:02 PM_PL-3 osafclmna[26536]: AL AMF Node Director is down, terminate this process
>
> Core was generated by `/usr/local/lib/opensaf/osafamfnd --tracemask=0xffffffff'.
> Program terminated with signal 6, Aborted.
> #0 0x00007f022ebd9b55 in raise () from /lib64/libc.so.6
> (gdb) bt
> #0 0x00007f022ebd9b55 in raise () from /lib64/libc.so.6
> #1 0x00007f022ebdb131 in abort () from /lib64/libc.so.6
> #2 0x00007f023038331b in __osafassert_fail () from /usr/local/lib/libopensaf_core.so.0
> #3 0x000000000041b399 in avnd_di_susi_resp_send(avnd_cb_tag*, avnd_su_tag*, avnd_su_si_rec*) ()
> #4 0x000000000042e9fa in avnd_su_si_oper_done(avnd_cb_tag*, avnd_su_tag*, avnd_su_si_rec*) ()
> #5 0x0000000000411622 in avnd_comp_csi_assign_done(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_csi_rec*) ()
> #6 0x0000000000407397 in avnd_evt_ava_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> #7 0x000000000042133f in avnd_main_process() () at main.cc:667
> #8 0x0000000000405517 in main () at main.cc:186
>
> TC #19. Same configuration as #12: Run Node lock and keep sleep of 5 sec in
> amf_csi_set_callback and stop controller. Reject the quiesced assignment in
> amf_csi_set_callback; Amfnd crashes. Syslog and gdb output are the same as
> in TC #17.
>
> TC #20. Same configuration as #12: Issue Node shutdown and keep sleep of
> 5 sec in amf_csi_set_callback before sending saAmfResponse() and stop
> controller.
> Amfnd crashes:
> Syslog:
>
> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO component with QUIESCED/QUIESCING assignment failed
> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO recovery action 'comp restart' escalated to 'comp failover'
> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO SU failover probation timer started (timeout: 1200000000000 ns)
> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO Performing failover of 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' recovery action escalated from 'componentRestart' to 'componentFailover'
> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: saAmfHAStateGet FAILED - 7
> Feb 10 17:21:10 PM_PL-3 osafimmnd[29561]: AL AMF Node Director is down, terminate this process
> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: AL AMF Node Director is down, terminate this process
> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: exiting (caught term signal)
> Feb 10 17:21:10 PM_PL-3 osafclmna[29321]: AL AMF Node Director is down, terminate this process
>
> Bt:
> Core was generated by `/usr/local/lib/opensaf/osafamfnd --tracemask=0xffffffff'.
> Program terminated with signal 11, Segmentation fault.
> #0 0x00000000004117c9 in avnd_comp_csi_assign_done(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_csi_rec*) ()
> (gdb) bt
> #0 0x00000000004117c9 in avnd_comp_csi_assign_done(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_csi_rec*) ()
> #1 0x0000000000406a3b in avnd_evt_ava_csi_quiescing_compl_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> #2 0x000000000042133f in avnd_main_process() () at main.cc:667
> #3 0x0000000000405517 in main () at main.cc:186
>
> Thanks
> -Nagu
>
> > -----Original Message-----
> > From: Nagendra Kumar
> > Sent: 09 February 2016 21:39
> > To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au; Praveen Malviya
> > Cc: opensaf-devel@lists.sourceforge.net
> > Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add support for cloud resilience [#1620] V2
> >
> > 15. Same configuration as Test Case #12, SI lock. Keep gdb in both the SUs
> > for csi remove and keep timeout as 100 sec. Lock SI and stop controller.
> > Start controller and allow csi remove to time out.
> > Two things:
> > SU2 has Standby assignment (which is wrong), SU1 has no assignment.
> > Error at PL-4: SU-SI record addition failed
> >
> > PM_SC-1:/home/nagu/views/staging # amf-state siass
> > safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
> >         saAmfSISUHAState=ACTIVE(1)
> >         saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> > safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
> >         saAmfSISUHAState=ACTIVE(1)
> >         saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> > safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> >         saAmfSISUHAState=STANDBY(2)
> >         saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> > safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
> >         saAmfSISUHAState=ACTIVE(1)
> >         saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> > safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
> >         saAmfSISUHAState=ACTIVE(1)
> >         saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >
> > Syslog of PL-4:
> >
> > Feb 9 21:24:50 PM_PL-4 osafamfnd[7998]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' component restart probation timer started (timeout: 60000000000 ns)
> > Feb 9 21:24:50 PM_PL-4 osafamfnd[7998]: NO Restarting a component of 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' (comp restart count: 1)
> > Feb 9 21:24:50 PM_PL-4 osafamfnd[7998]: NO 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
> > Feb 9 21:24:55 PM_PL-4 amf_demo_script: killproc /opt/amf_demo/amf_demo failed
> > Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' started
> > Feb 9 21:24:55 PM_PL-4 osafamfnd[7998]: NO Removed 'safSi=AmfDemo1,safApp=AmfDemo1' from 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
> > Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: HC started with AMF
> > Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: Registered with AMF
> > Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: CSI Set - add 'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Standby
> > Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: name: abcdef, value: val1
> > Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: name: abcdef, value: val2
> > Feb 9 21:24:55 PM_PL-4 osafamfnd[7998]: CR SU-SI record addition failed, SU= safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1 : SI=safSi=AmfDemo,safApp=AmfDemo1
> > Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: Health check 1
> > Feb 9 21:25:50 PM_PL-4 osafamfnd[7998]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' Component or SU restart probation timer expired
> >
> > Thanks
> > -Nagu
> >
> > ------------------------------------------------------------------------------
> > Site24x7 APM Insight: Get Deep Visibility into Application Performance
> > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
> > Monitor end-to-end web transactions and take corrective actions now
> > Troubleshoot faster and improve end-user experience. Signup Now!
> > http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
> > _______________________________________________
> > Opensaf-devel mailing list
> > Opensaf-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/opensaf-devel