>> SI Swap again and the commands come out with success, but swap doesn't
>> happen and syslog prints:
Modification to #13: SU1 gets Act, but SU2 gets its assignment removed as an
outcome of the SI swap. The next SI swap failed because only one assignment
exists.

> -----Original Message-----
> From: Nagendra Kumar
> Sent: 09 February 2016 20:41
> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
> Praveen Malviya
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add support
> for cloud resilience [#1620] V2
>
> 12. Issue shutdown on SI and keep sleep in csi set callback, stop
> controller and let csi set callback timeout. Start SC-1 and immlist the SI,
> it is in shutting down state:
> saAmfSIAdminState SA_UINT32_T 4 (0x4)
> 13. Issue SI Swap of appl SI (SU1 Act, SU2 Std): Keep gdb in Quiesced csi
> callback and allow to timeout and stop the controller.
> At one time: Start the controller, SU1 gets Standby and SU2 gets Act. Now
> issue, SI Swap again and the commands come out with success, but swap
> doesn't happen and syslog prints:
> Feb 9 20:33:51 PM_SC-1 osafamfd[9497]: NO safSi=AmfDemo,safApp=AmfDemo1 Swap initiated
>
> Please find the amfd trace attached.
>
> 14.) test Case #13: At another time: Amfnd crash: Bt and syslog(below) and
> Amfnd traces(osafamfnd-PL-3) attached.
>
> Program terminated with signal 11, Segmentation fault.
> #0 0x000000000041deaa in avnd_err_process(avnd_cb_tag*, avnd_comp_tag*, avnd_err_tag*) ()
> (gdb) bt
> #0 0x000000000041deaa in avnd_err_process(avnd_cb_tag*, avnd_comp_tag*, avnd_err_tag*) ()
> #1 0x0000000000407559 in avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> #2 0x000000000042133f in avnd_main_process() () at main.cc:667
> #3 0x0000000000405517 in main () at main.cc:186
> (gdb) thread apply bt all
> (gdb) thread apply all bt
>
> Thread 4 (Thread 0x7fe84b5b3b00 (LWP 7892)):
> #0 0x00007fe84a4d976d in read () from /lib64/libpthread.so.0
> #1 0x00007fe84b19af17 in ncs_exec_mod_hdlr () from /usr/local/lib/libopensaf_core.so.0
> #2 0x00007fe84a4d27b6 in start_thread () from /lib64/libpthread.so.0
> #3 0x00007fe849a889cd in clone () from /lib64/libc.so.6
> #4 0x0000000000000000 in ?? ()
>
> Thread 3 (Thread 0x7fe84b5d3b00 (LWP 7890)):
> #0 0x00007fe849a7f4f6 in poll () from /lib64/libc.so.6
> #1 0x00007fe84b1c5623 in mdtm_process_recv_events () from /usr/local/lib/libopensaf_core.so.0
> #2 0x00007fe84a4d27b6 in start_thread () from /lib64/libpthread.so.0
> #3 0x00007fe849a889cd in clone () from /lib64/libc.so.6
> #4 0x0000000000000000 in ?? ()
>
> Thread 2 (Thread 0x7fe84b606b00 (LWP 7889)):
> #0 0x00007fe849a7f4f6 in poll () from /lib64/libc.so.6
> #1 0x00007fe84b18922f in osaf_ppoll () from /usr/local/lib/libopensaf_core.so.0
> #2 0x00007fe84b190acf in ncs_tmr_wait () from /usr/local/lib/libopensaf_core.so.0
> #3 0x00007fe84a4d27b6 in start_thread () from /lib64/libpthread.so.0
> #4 0x00007fe849a889cd in clone () from /lib64/libc.so.6
> #5 0x0000000000000000 in ?? ()
> ---Type <return> to continue, or q <return> to quit---
>
> Thread 1 (Thread 0x7fe84b5d6720 (LWP 7888)):
> #0 0x000000000041deaa in avnd_err_process(avnd_cb_tag*, avnd_comp_tag*, avnd_err_tag*) ()
> #1 0x0000000000407559 in avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> #2 0x000000000042133f in avnd_main_process() () at main.cc:667
> #3 0x0000000000405517 in main () at main.cc:186
>
> Syslog:
> Feb 9 20:05:44 PM_PL-3 osafimmnd[7869]: NO Re-introduce-me highestProcessed:1514 highestReceived:1514
> Feb 9 20:05:46 PM_PL-3 kernel: [117927.208595] TIPC: Resetting link <1.1.3:eth0-1.1.1:eth0>, peer not responding
> Feb 9 20:05:46 PM_PL-3 kernel: [117927.208604] TIPC: Lost link <1.1.3:eth0-1.1.1:eth0> on network plane A
> Feb 9 20:05:46 PM_PL-3 kernel: [117927.208610] TIPC: Lost contact with <1.1.1>
> Feb 9 20:05:49 PM_PL-3 osafimmnd[7869]: WA MDS Send Failed to service:IMMD rc:2
> Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO component with QUIESCED/QUIESCING assignment failed
> Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO recovery action 'comp restart' escalated to 'comp failover'
> Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO SU failover probation timer started (timeout: 1200000000000 ns)
> Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO Performing failover of 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
> Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' recovery action escalated from 'componentRestart' to 'componentFailover'
> Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiSetcallbackTimeout' : Recovery is 'componentFailover'
> Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
> Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO Assigned 'safSi=AmfDemo1,safApp=AmfDemo1' QUIESCED to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> Feb 9 20:05:49 PM_PL-3 osafclmna[7879]: AL AMF Node Director is down, terminate this process
> Feb 9 20:05:49 PM_PL-3 osafamfwd[7947]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: AMF unexpectedly crashed, OwnNodeId = 131855, SupervisionTime = 60
> Feb 9 20:05:49 PM_PL-3 osafckptnd[7937]: AL AMF Node Director is down, terminate this process
> Feb 9 20:05:49 PM_PL-3 osaflcknd[7927]: AL AMF Node Director is down, terminate this process
> Feb 9 20:05:49 PM_PL-3 osafimmnd[7869]: AL AMF Node Director is down, terminate this process
> Feb 9 20:05:49 PM_PL-3 osafmsgnd[7908]: AL AMF Node Director is down, terminate this process
> Feb 9 20:05:49 PM_PL-3 osafsmfnd[7898]: AL AMF Node Director is down, terminate this process
> Feb 9 20:05:49 PM_PL-3 opensaf_reboot: Rebooting local node; timeout=60
>
>
> > -----Original Message-----
> > From: Nagendra Kumar
> > Sent: 09 February 2016 19:40
> > To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
> > Praveen Malviya
> > Cc: opensaf-devel@lists.sourceforge.net
> > Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
> > support for cloud resilience [#1620] V2
> >
> > Testing continued....
> >
> > 11. Lock SI and then unlock SI and keep sleep in csi set callback and then
> > reboot SC-1. Allow csi set timeout. When SC-1 is coming Amfd crashes.
> > Complete Amfd Logs attached and Amfnd of SC-1 and PL-3 is coming in
> > next email.
> >
> > Thanks
> > -Nagu
> >
> > > -----Original Message-----
> > > From: Nagendra Kumar
> > > Sent: 09 February 2016 15:57
> > > To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
> > > Praveen Malviya
> > > Cc: opensaf-devel@lists.sourceforge.net
> > > Subject: RE: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
> > > support for cloud resilience [#1620] V2
> > >
> > > Continued....
> > >
> > > > -----Original Message-----
> > > > From: Nagendra Kumar [mailto:nagendr...@oracle.com]
> > > > Sent: 09 February 2016 15:56
> > > > To: 'minh chau'; 'hans.nordeb...@ericsson.com';
> > > > 'gary....@dektech.com.au'; Praveen Malviya
> > > > Cc: 'opensaf-devel@lists.sourceforge.net'
> > > > Subject: RE: [devel] FW: [PATCH 0 of 5] Review Request for amf:
> > > > Add support for cloud resilience [#1620] V2
> > > >
> > > > Hi Hans N,
> > > > Please find the amfd and amfnd of SC-1 and amfnd of PL-3 traces
> > > > attached in 3 emails coming (because of limit of devel list, I am
> > > > not able to send it in one go). It took second reboot to reproduce
> > > > it for TC #6, but it is coming at the same location.
> > > >
> > > > Feb 9 15:32:28 PM_SC-1 osafamfd[3962]: NO Received node_up from 2010f: msg_id 1
> > > > Feb 9 15:32:28 PM_SC-1 osafamfd[3962]: siass.cc:842: avd_susi_recreate: Assertion 'su' failed.
> > > > Feb 9 15:32:28 PM_SC-1 osafamfnd[3972]: WA AMF director unexpectedly crashed
> > > > Feb 9 15:32:28 PM_SC-1 osafamfnd[3972]: WA AMF director unexpectedly crashed
> > > >
> > > > Thanks
> > > > -Nagu

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel