15. Same configuration as Test Case #12, but with SI lock. Keep gdb in both SUs
for csi remove and keep the timeout at 100 sec. Lock the SI and stop the controller.
Start the controller and allow csi remove to time out.
Two observations:
        SU2 has a Standby assignment (which is wrong); SU1 has no assignment.
        Error at PL-4: SU-SI record addition failed

PM_SC-1:/home/nagu/views/staging # amf-state  siass
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
        saAmfSISUHAState=STANDBY(2)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
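
To spot this kind of state quickly, the `amf-state siass` output above can be scanned for SIs that have no ACTIVE assignment. The following is only a rough sketch, assuming the output layout shown above (a `safSISU=` line followed by indented attribute lines); the function names and the trimmed `SAMPLE` text are made up for illustration and are not part of OpenSAF.

```python
import re
from collections import defaultdict

# Two entries copied from the amf-state output in this mail (trimmed).
SAMPLE = r"""safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
        saAmfSISUHAState=STANDBY(2)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
"""


def parse_siass(text):
    """Return {si_dn: [(su_dn, ha_state), ...]} from amf-state siass text."""
    assignments = defaultdict(list)
    su = si = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("safSISU="):
            # Commas inside the SU DN are escaped as "\,", so the first
            # unescaped comma separates the SU DN from the SI DN.
            su_part, si = re.split(r"(?<!\\),", line[len("safSISU="):], maxsplit=1)
            su = su_part.replace("\\,", ",")
        elif line.startswith("saAmfSISUHAState=") and si is not None:
            # "STANDBY(2)" -> "STANDBY"
            state = line.split("=", 1)[1].split("(")[0]
            assignments[si].append((su, state))
    return dict(assignments)


def sis_without_active(assignments):
    """List SIs for which no SU holds the ACTIVE HA state."""
    return sorted(
        si
        for si, pairs in assignments.items()
        if all(state != "ACTIVE" for _, state in pairs)
    )


if __name__ == "__main__":
    for si in sis_without_active(parse_siass(SAMPLE)):
        print("no ACTIVE assignment:", si)
```

Feeding it the full output above would flag only safSi=AmfDemo,safApp=AmfDemo1, which is the SI left with a lone Standby on SU2.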

Syslog of PL-4:

Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' component restart probation timer started (timeout: 60000000000 ns)
Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO Restarting a component of 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' (comp restart count: 1)
Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
Feb  9 21:24:55 PM_PL-4 amf_demo_script: killproc /opt/amf_demo/amf_demo failed
Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' started
Feb  9 21:24:55 PM_PL-4 osafamfnd[7998]: NO Removed 'safSi=AmfDemo1,safApp=AmfDemo1' from 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: HC started with AMF
Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: Registered with AMF
Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: CSI Set - add 'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Standby
Feb  9 21:24:55 PM_PL-4 amf_demo[8200]:         name: abcdef, value: val1
Feb  9 21:24:55 PM_PL-4 amf_demo[8200]:         name: abcdef, value: val2
Feb  9 21:24:55 PM_PL-4 osafamfnd[7998]: CR SU-SI record addition failed, SU= safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1 : SI=safSi=AmfDemo,safApp=AmfDemo1
Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: Health check 1
Feb  9 21:25:50 PM_PL-4 osafamfnd[7998]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' Component or SU restart probation timer expired
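
For anyone going through the syslog above, here is a small sketch of how the mail-wrapped entries could be joined back together, the CR entry picked out, and the probation timeout converted from ns to seconds. It assumes each entry starts with a "Feb  9 hh:mm:ss" style timestamp and that CR is the severity tag of interest; the helper names and the shortened `SAMPLE` are hypothetical.

```python
import re

# A new syslog entry starts with a timestamp such as "Feb  9 21:24:55 ".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} ")
TIMEOUT_NS = re.compile(r"timeout: (\d+) ns")

# Two entries from the PL-4 syslog, wrapped the way a mail client wraps them.
SAMPLE = """Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO
'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' component restart probation timer
started (timeout: 60000000000 ns)
Feb  9 21:24:55 PM_PL-4 osafamfnd[7998]: CR SU-SI record addition failed, SU=
safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1 : SI=safSi=AmfDemo,safApp=AmfDemo1
"""


def unwrap(text):
    """Join continuation lines so that each syslog entry is one string."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def critical_entries(entries):
    """Keep only entries carrying the CR severity tag after 'proc[pid]: '."""
    return [e for e in entries if re.search(r"\]: CR ", e)]


def timeouts_seconds(entries):
    """Convert every 'timeout: <n> ns' occurrence to seconds."""
    return [int(m.group(1)) / 1e9 for e in entries for m in TIMEOUT_NS.finditer(e)]


if __name__ == "__main__":
    entries = unwrap(SAMPLE)
    for entry in critical_entries(entries):
        print(entry)
    print(timeouts_seconds(entries))
```

The 60000000000 ns probation timer is 60 sec, which matches the gap between the 21:24:50 start and the 21:25:50 expiry in the log.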

Thanks
-Nagu

> -----Original Message-----
> From: Nagendra Kumar
> Sent: 09 February 2016 20:44
> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
> Praveen Malviya
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add support
> for cloud resilience [#1620] V2
> 
> >> SI Swap again and the commands come out with success, but swap
> doesn't happen and syslog prints:
> 
> Correction to #13: SU1 gets Act, but SU2 gets its assignment removed as an
> outcome of the SI swap.
> 
> The next SI swap fails because there is only one assignment.
> 
> > -----Original Message-----
> > From: Nagendra Kumar
> > Sent: 09 February 2016 20:41
> > To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
> > Praveen Malviya
> > Cc: opensaf-devel@lists.sourceforge.net
> > Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
> > support for cloud resilience [#1620] V2
> >
> > 12. Issue shutdown on the SI and keep a sleep in the csi set callback, stop
> > the controller and let the csi set callback time out. Start SC-1 and immlist
> > the SI; it is in shutting down state:
> > saAmfSIAdminState                                  SA_UINT32_T  4 (0x4)
> > 13. Issue SI Swap of the appl SI (SU1 Act, SU2 Std): keep gdb in the Quiesced
> > csi callback, allow it to time out and stop the controller.
> >  At one time: Start the controller; SU1 gets Standby and SU2 gets Act.
> > Now issue SI Swap again: the command returns success, but the
> > swap doesn't happen and syslog prints:
> > Feb  9 20:33:51 PM_SC-1 osafamfd[9497]: NO safSi=AmfDemo,safApp=AmfDemo1 Swap initiated
> >
> > Please find the amfd trace attached.
> >
> > 14. Test Case #13, at another time: Amfnd crashes. Bt and syslog (below)
> > and Amfnd traces (osafamfnd-PL-3) attached.
> >
> > Program terminated with signal 11, Segmentation fault.
> > #0  0x000000000041deaa in avnd_err_process(avnd_cb_tag*, avnd_comp_tag*, avnd_err_tag*) ()
> > (gdb) bt
> > #0  0x000000000041deaa in avnd_err_process(avnd_cb_tag*, avnd_comp_tag*, avnd_err_tag*) ()
> > #1  0x0000000000407559 in avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> > #2  0x000000000042133f in avnd_main_process() () at main.cc:667
> > #3  0x0000000000405517 in main () at main.cc:186
> > (gdb) thread apply bt all
> > (gdb) thread apply all bt
> >
> > Thread 4 (Thread 0x7fe84b5b3b00 (LWP 7892)):
> > #0  0x00007fe84a4d976d in read () from /lib64/libpthread.so.0
> > #1  0x00007fe84b19af17 in ncs_exec_mod_hdlr () from /usr/local/lib/libopensaf_core.so.0
> > #2  0x00007fe84a4d27b6 in start_thread () from /lib64/libpthread.so.0
> > #3  0x00007fe849a889cd in clone () from /lib64/libc.so.6
> > #4  0x0000000000000000 in ?? ()
> >
> > Thread 3 (Thread 0x7fe84b5d3b00 (LWP 7890)):
> > #0  0x00007fe849a7f4f6 in poll () from /lib64/libc.so.6
> > #1  0x00007fe84b1c5623 in mdtm_process_recv_events () from /usr/local/lib/libopensaf_core.so.0
> > #2  0x00007fe84a4d27b6 in start_thread () from /lib64/libpthread.so.0
> > #3  0x00007fe849a889cd in clone () from /lib64/libc.so.6
> > #4  0x0000000000000000 in ?? ()
> >
> > Thread 2 (Thread 0x7fe84b606b00 (LWP 7889)):
> > #0  0x00007fe849a7f4f6 in poll () from /lib64/libc.so.6
> > #1  0x00007fe84b18922f in osaf_ppoll () from /usr/local/lib/libopensaf_core.so.0
> > #2  0x00007fe84b190acf in ncs_tmr_wait () from /usr/local/lib/libopensaf_core.so.0
> > #3  0x00007fe84a4d27b6 in start_thread () from /lib64/libpthread.so.0
> > #4  0x00007fe849a889cd in clone () from /lib64/libc.so.6
> > #5  0x0000000000000000 in ?? ()
> > ---Type <return> to continue, or q <return> to quit---
> >
> > Thread 1 (Thread 0x7fe84b5d6720 (LWP 7888)):
> > #0  0x000000000041deaa in avnd_err_process(avnd_cb_tag*, avnd_comp_tag*, avnd_err_tag*) ()
> > #1  0x0000000000407559 in avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> > #2  0x000000000042133f in avnd_main_process() () at main.cc:667
> > #3  0x0000000000405517 in main () at main.cc:186
> >
> > Syslog:
> > Feb  9 20:05:44 PM_PL-3 osafimmnd[7869]: NO Re-introduce-me highestProcessed:1514 highestReceived:1514
> > Feb  9 20:05:46 PM_PL-3 kernel: [117927.208595] TIPC: Resetting link <1.1.3:eth0-1.1.1:eth0>, peer not responding
> > Feb  9 20:05:46 PM_PL-3 kernel: [117927.208604] TIPC: Lost link <1.1.3:eth0-1.1.1:eth0> on network plane A
> > Feb  9 20:05:46 PM_PL-3 kernel: [117927.208610] TIPC: Lost contact with <1.1.1>
> > Feb  9 20:05:49 PM_PL-3 osafimmnd[7869]: WA MDS Send Failed to service:IMMD rc:2
> > Feb  9 20:05:49 PM_PL-3 osafamfnd[7888]: NO component with QUIESCED/QUIESCING assignment failed
> > Feb  9 20:05:49 PM_PL-3 osafamfnd[7888]: NO recovery action 'comp restart' escalated to 'comp failover'
> > Feb  9 20:05:49 PM_PL-3 osafamfnd[7888]: NO SU failover probation timer started (timeout: 1200000000000 ns)
> > Feb  9 20:05:49 PM_PL-3 osafamfnd[7888]: NO Performing failover of 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
> > Feb  9 20:05:49 PM_PL-3 osafamfnd[7888]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' recovery action escalated from 'componentRestart' to 'componentFailover'
> > Feb  9 20:05:49 PM_PL-3 osafamfnd[7888]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiSetcallbackTimeout' : Recovery is 'componentFailover'
> > Feb  9 20:05:49 PM_PL-3 osafamfnd[7888]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
> > Feb  9 20:05:49 PM_PL-3 osafamfnd[7888]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> > Feb  9 20:05:49 PM_PL-3 osafamfnd[7888]: NO Assigned 'safSi=AmfDemo1,safApp=AmfDemo1' QUIESCED to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> > Feb  9 20:05:49 PM_PL-3 osafclmna[7879]: AL AMF Node Director is down, terminate this process
> > Feb  9 20:05:49 PM_PL-3 osafamfwd[7947]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: AMF unexpectedly crashed, OwnNodeId = 131855, SupervisionTime = 60
> > Feb  9 20:05:49 PM_PL-3 osafckptnd[7937]: AL AMF Node Director is down, terminate this process
> > Feb  9 20:05:49 PM_PL-3 osaflcknd[7927]: AL AMF Node Director is down, terminate this process
> > Feb  9 20:05:49 PM_PL-3 osafimmnd[7869]: AL AMF Node Director is down, terminate this process
> > Feb  9 20:05:49 PM_PL-3 osafmsgnd[7908]: AL AMF Node Director is down, terminate this process
> > Feb  9 20:05:49 PM_PL-3 osafsmfnd[7898]: AL AMF Node Director is down, terminate this process
> > Feb  9 20:05:49 PM_PL-3 opensaf_reboot: Rebooting local node; timeout=60
> >
> >
> > > -----Original Message-----
> > > From: Nagendra Kumar
> > > Sent: 09 February 2016 19:40
> > > To: minh chau; hans.nordeb...@ericsson.com;
> gary....@dektech.com.au;
> > > Praveen Malviya
> > > Cc: opensaf-devel@lists.sourceforge.net
> > > Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
> > > support for cloud resilience [#1620] V2
> > >
> > > Testing continued....
> > >
> > > 11. Lock the SI, then unlock it, keep a sleep in the csi set callback
> > > and then reboot SC-1. Allow csi set to time out. When SC-1 comes up,
> > > Amfd crashes. Complete Amfd logs are attached; Amfnd traces of SC-1 and
> > > PL-3 are coming in the next email.
> > >
> > > Thanks
> > > -Nagu
> > >
> > > > -----Original Message-----
> > > > From: Nagendra Kumar
> > > > Sent: 09 February 2016 15:57
> > > > To: minh chau; hans.nordeb...@ericsson.com;
> > gary....@dektech.com.au;
> > > > Praveen Malviya
> > > > Cc: opensaf-devel@lists.sourceforge.net
> > > > Subject: RE: [devel] FW: [PATCH 0 of 5] Review Request for amf:
> > > > Add support for cloud resilience [#1620] V2
> > > >
> > > > Continued....
> > > >
> > > > > -----Original Message-----
> > > > > From: Nagendra Kumar [mailto:nagendr...@oracle.com]
> > > > > Sent: 09 February 2016 15:56
> > > > > To: 'minh chau'; 'hans.nordeb...@ericsson.com';
> > > > > 'gary....@dektech.com.au'; Praveen Malviya
> > > > > Cc: 'opensaf-devel@lists.sourceforge.net'
> > > > > Subject: RE: [devel] FW: [PATCH 0 of 5] Review Request for amf:
> > > > > Add support for cloud resilience [#1620] V2
> > > > >
> > > > > Hi Hans N,
> > > > >               Please find the amfd and amfnd traces of SC-1 and the
> > > > > amfnd traces of PL-3 attached in the 3 emails to come (because of
> > > > > the size limit of the devel list, I am not able to send them in one
> > > > > go). It took a second reboot to reproduce it for TC #6, but it
> > > > > occurs at the same location.
> > > > >
> > > > > Feb  9 15:32:28 PM_SC-1 osafamfd[3962]: NO Received node_up from 2010f: msg_id 1
> > > > > Feb  9 15:32:28 PM_SC-1 osafamfd[3962]: siass.cc:842: avd_susi_recreate: Assertion 'su' failed.
> > > > > Feb  9 15:32:28 PM_SC-1 osafamfnd[3972]: WA AMF director unexpectedly crashed
> > > > > Feb  9 15:32:28 PM_SC-1 osafamfnd[3972]: WA AMF director unexpectedly crashed
> > > > >
> > > > > Thanks
> > > > > -Nagu
> 
> ------------------------------------------------------------------------------
> Site24x7 APM Insight: Get Deep Visibility into Application Performance APM +
> Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor
> end-to-end web transactions and take corrective actions now Troubleshoot
> faster and improve end-user experience. Signup Now!
> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
> _______________________________________________
> Opensaf-devel mailing list
> Opensaf-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
