Hi Minh,
I thought the traces would be sufficient, so I didn't upload the
configuration file. I will upload it going forward.

Thanks
-Nagu

> -----Original Message-----
> From: minh chau [mailto:minh.c...@dektech.com.au]
> Sent: 15 February 2016 14:27
> To: Nagendra Kumar; Gary Lee
> Cc: hans.nordeb...@ericsson.com; Praveen Malviya;
> opensaf-de...@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 0 of 5] Review Request for amf: Add support for
> cloud resilience [#1620] V2
> 
> Hi Nagu,
> 
> One thing that would help us reproduce your problems: could you attach
> to the ticket the models you are using for the tests?
> 
> Thanks,
> Minh
> 
> On 15/02/16 19:32, Nagendra Kumar wrote:
> > Hi Gary,
> > I am using the patch tar sent by Minh (9 Feb on the devel list), and I am
> > using these on the same changeset #7280 mentioned by Minh. So, please
> > contact him for any clarifications.
> >
> > Are you finding a mismatch between the traces attached to ticket #1620 (for
> > many test cases) and the source code of Amfd and Amfnd?
> >
> > BTW, I am attaching the tar sent by Minh and listing how I applied the
> > patches on top of #7280. Please note that I have taken 010_log_1179.patch
> > from my repo, as the tar sent by Minh did not have the correct log patch
> > for 1179. So, ideally, the Amf patches should be the same; please check
> > that. I enabled the cloud feature (IMMSV_SC_ABSENCE_ALLOWED) manually.
> > ==================================================
> > patch -p1 < /tmp/sf_cloud_resilience_integration/777_osaftimer_2.diff
> > patch -p1 <  ../OpensafHeadless/patches/010_log_1179.patch
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1620_README_V2.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1620_amfd_V3.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1620_amfnd_V3.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1180_ntf_agent.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1180_ntf_libs_common.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1180_ntf_readme.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/180_ntf_test.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1180_ntf_tools.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1620_common_libs_V2.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1620_config.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1621_ckpt.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_1.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_2.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_3.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_4.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_5.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_6.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_7_compile_err.diff
> > patch -p1 < /tmp/sf_cloud_resilience_integration/1646_clm.patch
> > ==================================================
> >
> > I manually compared the installed Amfd and Amfnd binary files with the
> > binary files created in the source code repo while compiling. I compiled
> > again and they are the same. All the patches are applied.
> > So, please check from your side and let me know if I am making any mistake.
> >
> > Thanks
> > -Nagu
> >> -----Original Message-----
> >> From: Gary Lee [mailto:gary....@dektech.com.au]
> >> Sent: 15 February 2016 13:36
> >> To: Nagendra Kumar
> >> Cc: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya;
> >> opensaf-de...@lists.sourceforge.net
> >> Subject: Re: [devel] [PATCH 0 of 5] Review Request for amf: Add
> >> support for cloud resilience [#1620] V2
> >>
> >> Hi Nagu
> >>
> >> I think we need to make sure we’re all looking at the same source code.
> >>
> >> I have trouble recreating some of the problems you’ve seen, but I see
> >> other problems.
> >>
> >> Perhaps we can set up a fork of opensaf-staging on source forge, and
> >> check in the patches?
> >>
> >> Thanks
> >> Gary
> >>
> >>
> >>> On 15 Feb 2016, at 4:00 PM, Gary Lee <gary....@dektech.com.au> wrote:
> >>>
> >>> Hi Nagu
> >>>
> >>> Just wanted to confirm that when you attach gdb to a process, the
> >>> process is amf_demo, and not amfnd?
> >>> Thanks
> >>> Gary
> >>>
> >>>> On 13 Feb 2016, at 1:35 AM, Nagendra Kumar <nagendr...@oracle.com> wrote:
> >>>> TC #27: Same configuration as TC #24:
> >>>> Add a new Csi in the running demo appl: Keep gdb in the SU1 comp in
> >>>> amf_csi_set_callback and add a new csi to the existing si. Stop the
> >>>> controller, and then respond from gdb. Start the controller. Only the Act
> >>>> assignment is given to the SU1 component. The Standby csi assignment is
> >>>> not given to the SU2 component:
> >>>> safCSIComp=safComp=AmfDemo\,safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1
> >>>> safCSIComp=safComp=AmfDemo\,safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1
> >>>> safCSIComp=safComp=AmfDemo\,safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> >>>>
> >>>> Logs " TC 27" are attached in ticket.
> >>>>
> >>>> TC #28: Same configuration as TC #24:
> >>>> Delete a Csi in the running demo appl: Add a new csi, keep gdb in the SU1
> >>>> comp in amf_csi_remove_callback, and then delete the csi. Stop the
> >>>> controller, and then respond from gdb. Start the controller. Amfd crashes:
> >>>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: Started
> >>>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: NO Sending node up due to NCSMDS_UP
> >>>> Feb 12 15:52:32 PM_SC-1 osafamfd[16086]: NO Received node_up from 2010f: msg_id 1
> >>>> Feb 12 15:52:32 PM_SC-1 osafamfd[16086]: csi.cc:1470: avd_compcsi_recreate: Assertion 'csi' failed.
> >>>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: WA AMF director unexpectedly crashed
> >>>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: WA AMF director unexpectedly crashed
> >>>>
> >>>> Logs " TC 28" are attached in ticket.
> >>>>
> >>>> TC #29: Configuration: One controller and two PLs. 2N SG, SU1 and SU2
> >>>> with 3 comps, each with one csi and an associated SI. The SIs have si deps:
> >>>> safSi=B,safApp=Test depends on safSi=A,safApp=Test and
> >>>> safSi=C,safApp=Test depends on safSi=B,safApp=Test. SU1 (Act) is on PL-3
> >>>> and SU2 (Standby) is on PL-4.
> >>>> Lock SI A, keep gdb in the quiesced callback, and stop the controller.
> >>>> Respond from gdb and start the controller. After the controller comes up,
> >>>> SU2 has two Standby assignments and no active assignments, which looks
> >>>> like a serious problem.
> >>>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=B,safApp=Test
> >>>>        saAmfSISUHAState=STANDBY(2)
> >>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
> >>>>        saAmfSISUHAState=STANDBY(2)
> >>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>
> >>>> At the same time, there are two error messages in the controller:
> >>>> Feb 12 19:32:01 PM_SC-1 osafamfd[1572]: EM sg_2n_fsm.cc:1439: safSu=ABC2,safSg=2N,safApp=Test (31)
> >>>> Feb 12 19:32:01 PM_SC-1 osafamfd[1572]: EM sg_2n_fsm.cc:1439: safSu=ABC2,safSg=2N,safApp=Test (31)
> >>>>
> >>>> Logs " TC 29" are attached in ticket.
> >>>>
> >>>> TC #30: Configuration same as TC #29: In this case, lock SI B, which is
> >>>> dependent on SI A and is the sponsor of SI C.
> >>>> Only the assignment of SI B is removed and the SI C assignment remains:
> >>>> safSISU=safSu=ABC1\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
> >>>>        saAmfSISUHAState=ACTIVE(1)
> >>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=A,safApp=Test
> >>>>        saAmfSISUHAState=STANDBY(2)
> >>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>> safSISU=safSu=ABC1\,safSg=2N\,safApp=Test,safSi=A,safApp=Test
> >>>>        saAmfSISUHAState=ACTIVE(1)
> >>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
> >>>>        saAmfSISUHAState=STANDBY(2)
> >>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>
> >>>> Logs " TC 30" are attached in ticket.
> >>>>
> >>>> TC #31: Same configuration as TC #30. Lock SI B, which is dependent on
> >>>> SI A and is the sponsor of SI C. The assignment of SI B is removed and the
> >>>> tolerance timer will start running.
> >>>> Reboot the controller. The assignment of SI C should be removed because
> >>>> its sponsor is in locked state.
> >>>> Logs " TC 30" are attached in ticket.
> >>>>
> >>>> Thanks
> >>>> -Nagu
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Nagendra Kumar
> >>>>> Sent: 11 February 2016 20:06
> >>>>> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
> >>>>> Praveen Malviya
> >>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf:
> >>>>> Add support for cloud resilience [#1620] V2
> >>>>>
> >>>>> TC #21: Node shutdown and same as TC #17. Logs " TC 21" are
> >>>>> attached in ticket.
> >>>>>
> >>>>> TC #22: Node lock and unlock (2nd time), result same as TC #18.
> >>>>> Logs " TC 22" are attached in ticket.
> >>>>>
> >>>>> TC #23: Node Group shutdown operation: Stop controller during
> >>>>> shutdown operation and it will remain in shutdown state even if
> >>>>> all the assignments are removed.
> >>>>>
> >>>>> TC #24: Configuration: Start SC-1, PL-3 and PL-4, and create a 2N Red
> >>>>> model with SU1 (Act) on PL-3 and SU2 (Standby) on PL-4. Create a node
> >>>>> group having PL-3 and PL-4.
> >>>>> Now, lock and lock-in the node group. Reboot the controller. Check the
> >>>>> admin state of the node group; it will be 3. Now, do unlock-in of the
> >>>>> node group; SU1 and SU2 are not instantiated (which is wrong). The node
> >>>>> group admin state is now 2 (locked). Logs " TC 24" are attached in ticket.
> >>>>>
> >>>>> TC #25: Same configuration as TC #24:
> >>>>> Lock the node group and keep gdb in amf_csi_set_callback in SU1's
> >>>>> comp (Act) and amf_csi_remove_callback in SU2's comp (Standby). Reboot
> >>>>> the controller. When SC-1 comes up, respond from gdb from the SU1 and
> >>>>> SU2 components. The following error appears, which means an extra susi
> >>>>> remove is coming:
> >>>>> Feb 11 19:46:27 PM_PL-3 osafamfnd[16881]: ER susi_assign_evh: 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' has no assignments
> >>>>>
> >>>>> Logs " TC 25" are attached in ticket.
> >>>>>
> >>>>> TC #26: Same configuration as TC #24:
> >>>>> Lock the node group and keep gdb in amf_csi_remove_callback in SU1's
> >>>>> comp (Act) and amf_csi_remove_callback in SU2's comp (Standby). Reboot
> >>>>> the controller. When SC-1 comes up, respond from gdb from the SU1 and
> >>>>> SU2 components. Amfnd on PL-3 and PL-4 crashes:
> >>>>> PL-3:
> >>>>> Feb 11 19:56:41 PM_PL-3 osafamfnd[17561]: di.cc:850: avnd_di_susi_resp_send: Assertion 'si' failed.
> >>>>> Feb 11 19:56:42 PM_PL-3 osafclmna[17552]: AL AMF Node Director is down, terminate this process
> >>>>> PL-4:
> >>>>> Feb 11 19:56:40 PM_PL-4 osafamfnd[10912]: di.cc:850: avnd_di_susi_resp_send: Assertion 'si' failed.
> >>>>> Feb 11 19:56:40 PM_PL-4 osafimmnd[10893]: NO Global discard node received for nodeId:2030f pid:17542
> >>>>> Feb 11 19:56:40 PM_PL-4 osafimmnd[10893]: NO Implementer disconnected 13 <0, 2030f(down)> (MsgQueueService131855)
> >>>>> Feb 11 19:56:40 PM_PL-4 amf_demo[11007]: AL AMF Node Director is down, terminate this process
> >>>>>
> >>>>> Logs " TC 26" are attached in ticket.
> >>>>>
> >>>>> Thanks
> >>>>> -Nagu
> >>>>>> -----Original Message-----
> >>>>>> From: Nagendra Kumar
> >>>>>> Sent: 10 February 2016 17:30
> >>>>>> To: minh chau; hans.nordeb...@ericsson.com;
> >>>>>> gary....@dektech.com.au; Praveen Malviya
> >>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf:
> >>>>>> Add support for cloud resilience [#1620] V2
> >>>>>>
> >>>>>> TC #16. Same configuration as #12: Run SI shutdown, keep a sleep of
> >>>>>> 5 sec before saAmfCSIQuiescingComplete, and stop the controller; then,
> >>>>>> after the sleep, reject saAmfCSIQuiescingComplete with
> >>>>>> SA_AIS_ERR_FAILED_OPERATION. All the assignments from SU1 on PL-3 and
> >>>>>> SU2 on PL-4 are removed and the SI admin state is 2 (locked):
> >>>>>> saAmfSIAdminState                                  SA_UINT32_T  2 (0x2)
> >>>>>>
> >>>>>> The SI going into locked state differs from the behaviour seen when
> >>>>>> this test case is run with the controller up and running. When the
> >>>>>> controller is available, the SI will be in unlocked state and all the
> >>>>>> assignments will be on SU2 as Act and SU3 as Standby (on PL-4).
> >>>>>> This needs either correction or documentation.
> >>>>>> TC #17. Same configuration as #12: Run SG shutdown, keep a sleep of
> >>>>>> 5 sec before saAmfCSIQuiescingComplete, and stop the controller; then,
> >>>>>> after the sleep, reject saAmfCSIQuiescingComplete with
> >>>>>> SA_AIS_ERR_FAILED_OPERATION. Amfnd crashes [please note that this
> >>>>>> test case works with the controller up]:
> >>>>>> Syslog and bt:
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO component with QUIESCED/QUIESCING assignment failed
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO recovery action 'comp restart' escalated to 'comp failover'
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO SU failover probation timer started (timeout: 1200000000000 ns)
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO Performing failover of 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' recovery action escalated from 'componentRestart' to 'componentFailover'
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> >>>>>> Feb 10 11:44:29 PM_PL-3 amf_demo[15721]: saAmfHAStateGet FAILED - 7
> >>>>>> Feb 10 11:44:29 PM_PL-3 amf_demo[15721]: exiting (caught term signal)
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO avnd_di_oper_send() deferred as AMF director is offline
> >>>>>> Feb 10 11:44:29 PM_PL-3 osafimmnd[15760]: AL AMF Node Director is down, terminate this process
> >>>>>>
> >>>>>> Program terminated with signal 11, Segmentation fault.
> >>>>>> #0  0x0000000000412b50 in avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
> >>>>>> (gdb) bt
> >>>>>> #0  0x0000000000412b50 in avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
> >>>>>> #1  0x000000000040a093 in avnd_comp_clc_terming_cleansucc_hdler(avnd_cb_tag*, avnd_comp_tag*) ()
> >>>>>> #2  0x000000000040c7d4 in avnd_comp_clc_fsm_run(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) ()
> >>>>>> #3  0x000000000040ce49 in avnd_evt_clc_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> >>>>>> #4  0x000000000042133f in avnd_main_process() () at main.cc:667
> >>>>>> #5  0x0000000000405517 in main () at main.cc:186
> >>>>>> (gdb) thread apply all bt
> >>>>>>
> >>>>>> Thread 4 (Thread 0x7fdaf3c05b00 (LWP 15512)):
> >>>>>> #0  0x00007fdaf2b2b415 in __lll_unlock_wake () from /lib64/libpthread.so.0
> >>>>>> #1  0x00007fdaf2b27ac4 in _L_unlock_553 () from /lib64/libpthread.so.0
> >>>>>> #2  0x00007fdaf2b279f7 in __pthread_mutex_unlock_usercnt () from /lib64/libpthread.so.0
> >>>>>> #3  0x00007fdaf37edac3 in ncs_os_lock () from /usr/local/lib/libopensaf_core.so.0
> >>>>>> #4  0x00007fdaf37e084d in ncs_ipc_send () from /usr/local/lib/libopensaf_core.so.0
> >>>>>> #5  0x000000000041eea1 in avnd_evt_send(avnd_cb_tag*, avnd_evt_tag*) ()
> >>>>>> #6  0x000000000040a2cb in comp_clc_resp_callback(NCS_OS_PROC_EXECUTE_TIMED_CB_INFO*) ()
> >>>>>> #7  0x00007fdaf37ecdfb in give_exec_mod_cb () from /usr/local/lib/libopensaf_core.so.0
> >>>>>> #8  0x00007fdaf37ecfde in ncs_exec_mod_hdlr () from /usr/local/lib/libopensaf_core.so.0
> >>>>>> #9  0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
> >>>>>> #10 0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
> >>>>>> #11 0x0000000000000000 in ?? ()
> >>>>>>
> >>>>>> Thread 3 (Thread 0x7fdaf3c25b00 (LWP 15510)):
> >>>>>> #0  0x00007fdaf20d14f6 in poll () from /lib64/libc.so.6
> >>>>>> #1  0x00007fdaf3817623 in mdtm_process_recv_events () from /usr/local/lib/libopensaf_core.so.0
> >>>>>> #2  0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
> >>>>>> #3  0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
> >>>>>> #4  0x0000000000000000 in ?? ()
> >>>>>>
> >>>>>> Thread 2 (Thread 0x7fdaf3c58b00 (LWP 15509)):
> >>>>>> #0  0x00007fdaf20d14f6 in poll () from /lib64/libc.so.6
> >>>>>> #1  0x00007fdaf37db22f in osaf_ppoll () from /usr/local/lib/libopensaf_core.so.0
> >>>>>> #2  0x00007fdaf37e2acf in ncs_tmr_wait () from /usr/local/lib/libopensaf_core.so.0
> >>>>>> #3  0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
> >>>>>> #4  0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
> >>>>>> #5  0x0000000000000000 in ?? ()
> >>>>>>
> >>>>>> Thread 1 (Thread 0x7fdaf3c28720 (LWP 15508)):
> >>>>>> #0  0x0000000000412b50 in avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
> >>>>>> #1  0x000000000040a093 in avnd_comp_clc_terming_cleansucc_hdler(avnd_cb_tag*, avnd_comp_tag*) ()
> >>>>>> #2  0x000000000040c7d4 in avnd_comp_clc_fsm_run(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) ()
> >>>>>> #3  0x000000000040ce49 in avnd_evt_clc_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> >>>>>> #4  0x000000000042133f in avnd_main_process() () at main.cc:667
> >>>>>> #5  0x0000000000405517 in main () at main.cc:186
> >>>>>>
> >>>>>> TC #18. Same configuration as #12: Run SG lock, keep gdb in
> >>>>>> amf_csi_remove_callback, and stop the controller; then start the
> >>>>>> controller and bring it up. Now release Amfnd from gdb so that it
> >>>>>> can respond to the csi remove (please note that the controller has
> >>>>>> rebooted and is available now). Now, issue SG unlock. Amfnd crashes on
> >>>>>> PL-3 and PL-4 at the same location [please note that this test
> >>>>>> case works with the controller up]:
> >>>>>> Syslog and Bt:
> >>>>>> Feb 10 16:35:51 PM_PL-3 amf_demo[26623]: CSI Remove for all CSIs
> >>>>>> Feb 10 16:35:51 PM_PL-3 osafamfnd[26545]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> >>>>>> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: NO Assigning 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> >>>>>> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]: CSI Set - add 'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Active
> >>>>>> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]:        name: abcdef, value: val1
> >>>>>> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]:        name: abcdef, value: val2
> >>>>>> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: NO Assigned 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> >>>>>> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: di.cc:850: avnd_di_susi_resp_send: Assertion 'si' failed.
> >>>>>> Feb 10 16:36:02 PM_PL-3 osafclmna[26536]: AL AMF Node Director is down, terminate this process
> >>>>>>
> >>>>>> Core was generated by `/usr/local/lib/opensaf/osafamfnd --tracemask=0xffffffff'.
> >>>>>> Program terminated with signal 6, Aborted.
> >>>>>> #0  0x00007f022ebd9b55 in raise () from /lib64/libc.so.6
> >>>>>> (gdb) bt
> >>>>>> #0  0x00007f022ebd9b55 in raise () from /lib64/libc.so.6
> >>>>>> #1  0x00007f022ebdb131 in abort () from /lib64/libc.so.6
> >>>>>> #2  0x00007f023038331b in __osafassert_fail () from /usr/local/lib/libopensaf_core.so.0
> >>>>>> #3  0x000000000041b399 in avnd_di_susi_resp_send(avnd_cb_tag*, avnd_su_tag*, avnd_su_si_rec*) ()
> >>>>>> #4  0x000000000042e9fa in avnd_su_si_oper_done(avnd_cb_tag*, avnd_su_tag*, avnd_su_si_rec*) ()
> >>>>>> #5  0x0000000000411622 in avnd_comp_csi_assign_done(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_csi_rec*) ()
> >>>>>> #6  0x0000000000407397 in avnd_evt_ava_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> >>>>>> #7  0x000000000042133f in avnd_main_process() () at main.cc:667
> >>>>>> #8  0x0000000000405517 in main () at main.cc:186
> >>>>>>
> >>>>>> TC #19. Same configuration as #12: Run Node lock, keep a sleep of
> >>>>>> 5 sec in amf_csi_set_callback, and stop the controller. Reject the
> >>>>>> quiesced assignment in amf_csi_set_callback; Amfnd crashes. Syslog and
> >>>>>> gdb output are the same as in TC #17.
> >>>>>>
> >>>>>> TC #20. Same configuration as #12: Issue Node shutdown, keep a sleep
> >>>>>> of 5 sec in amf_csi_set_callback before sending saAmfResponse(), and
> >>>>>> stop the controller. Amfnd crashes:
> >>>>>> Syslog:
> >>>>>>
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO component with QUIESCED/QUIESCING assignment failed
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO recovery action 'comp restart' escalated to 'comp failover'
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO SU failover probation timer started (timeout: 1200000000000 ns)
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO Performing failover of 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' recovery action escalated from 'componentRestart' to 'componentFailover'
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> >>>>>> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: saAmfHAStateGet FAILED - 7
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafimmnd[29561]: AL AMF Node Director is down, terminate this process
> >>>>>> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: AL AMF Node Director is down, terminate this process
> >>>>>> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: exiting (caught term signal)
> >>>>>> Feb 10 17:21:10 PM_PL-3 osafclmna[29321]: AL AMF Node Director is down, terminate this process
> >>>>>>
> >>>>>> Bt:
> >>>>>> Core was generated by `/usr/local/lib/opensaf/osafamfnd --tracemask=0xffffffff'.
> >>>>>> Program terminated with signal 11, Segmentation fault.
> >>>>>> #0  0x00000000004117c9 in avnd_comp_csi_assign_done(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_csi_rec*) ()
> >>>>>> (gdb) bt
> >>>>>> #0  0x00000000004117c9 in avnd_comp_csi_assign_done(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_csi_rec*) ()
> >>>>>> #1  0x0000000000406a3b in avnd_evt_ava_csi_quiescing_compl_evh(avnd_cb_tag*, avnd_evt_tag*) ()
> >>>>>> #2  0x000000000042133f in avnd_main_process() () at main.cc:667
> >>>>>> #3  0x0000000000405517 in main () at main.cc:186
> >>>>>>
> >>>>>>
> >>>>>> Thanks
> >>>>>> -Nagu
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Nagendra Kumar
> >>>>>>> Sent: 09 February 2016 21:39
> >>>>>>> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
> >>>>>>> Praveen Malviya
> >>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>>> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
> >>>>>>> support for cloud resilience [#1620] V2
> >>>>>>>
> >>>>>>> 15. Same configuration as Test Case #12, SI lock. Keep gdb in both the
> >>>>>>> SUs for csi remove and keep the timeout at 100 sec. Lock the SI and stop
> >>>>>>> the controller. Start the controller and allow csi remove to time out.
> >>>>>>> Two things:
> >>>>>>>       SU2 has a Standby assignment (which is wrong); SU1 has no assignment.
> >>>>>>>       Error at PL-4 : SU-SI record addition failed
> >>>>>>>
> >>>>>>> PM_SC-1:/home/nagu/views/staging # amf-state  siass
> >>>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
> >>>>>>>        saAmfSISUHAState=ACTIVE(1)
> >>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
> >>>>>>>        saAmfSISUHAState=ACTIVE(1)
> >>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> >>>>>>>        saAmfSISUHAState=STANDBY(2)
> >>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
> >>>>>>>        saAmfSISUHAState=ACTIVE(1)
> >>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
> >>>>>>>        saAmfSISUHAState=ACTIVE(1)
> >>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>>>>
> >>>>>>> Syslog of PL-4:
> >>>>>>>
> >>>>>>> Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' component restart probation timer started (timeout: 60000000000 ns)
> >>>>>>> Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO Restarting a component of 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' (comp restart count: 1)
> >>>>>>> Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
> >>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo_script: killproc /opt/amf_demo/amf_demo failed
> >>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' started
> >>>>>>> Feb  9 21:24:55 PM_PL-4 osafamfnd[7998]: NO Removed 'safSi=AmfDemo1,safApp=AmfDemo1' from 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
> >>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: HC started with AMF
> >>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: Registered with AMF
> >>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: CSI Set - add 'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Standby
> >>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]:         name: abcdef, value: val1
> >>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]:         name: abcdef, value: val2
> >>>>>>> Feb  9 21:24:55 PM_PL-4 osafamfnd[7998]: CR SU-SI record addition failed, SU= safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1 : SI=safSi=AmfDemo,safApp=AmfDemo1
> >>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: Health check 1
> >>>>>>> Feb  9 21:25:50 PM_PL-4 osafamfnd[7998]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' Component or SU restart probation timer expired
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> -Nagu
> >>>>>>>
> >>>>>>> ------------------------------------------------------------------------------
> >>>>>>> Site24x7 APM Insight: Get Deep Visibility into Application Performance
> >>>>>>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
> >>>>>>> Monitor end-to-end web transactions and take corrective actions now
> >>>>>>> Troubleshoot faster and improve end-user experience. Signup Now!
> >>>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
> >>>>>>> _______________________________________________
> >>>>>>> Opensaf-devel mailing list
> >>>>>>> Opensaf-devel@lists.sourceforge.net
> >>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
> 
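TC #15 and TC #29 in the quoted thread are judged by reading `amf-state siass` output by eye. A minimal Python sketch of automating that check follows. It assumes the output has exactly the layout quoted in the thread (a `safSISU=` line followed by indented attribute lines); `parse_siass` and `sis_without_active` are hypothetical helper names, not part of OpenSAF.

```python
import re
from collections import defaultdict

# Excerpt of the `amf-state siass` output quoted for TC #29.
TC29_SIASS = r"""
safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=B,safApp=Test
       saAmfSISUHAState=STANDBY(2)
       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
       saAmfSISUHAState=STANDBY(2)
       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
"""


def parse_siass(text):
    """Map each SI DN to a list of (SU DN, HA state) pairs."""
    assignments = defaultdict(list)
    current = None  # (su_dn, si_dn) of the SISU object being read
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("safSISU="):
            # The SU DN is embedded with escaped commas ("\,"); the first
            # unescaped comma separates it from the SI DN.
            su_part, si_dn = re.split(
                r"(?<!\\),", line[len("safSISU="):], maxsplit=1
            )
            current = (su_part.replace("\\,", ","), si_dn)
        elif current and line.startswith("saAmfSISUHAState="):
            # "STANDBY(2)" -> "STANDBY"
            state = line.split("=", 1)[1].split("(", 1)[0]
            assignments[current[1]].append((current[0], state))
    return dict(assignments)


def sis_without_active(assignments):
    """Return the SIs that have assignments but none in ACTIVE state."""
    return sorted(
        si
        for si, pairs in assignments.items()
        if all(state != "ACTIVE" for _, state in pairs)
    )


if __name__ == "__main__":
    for si in sis_without_active(parse_siass(TC29_SIASS)):
        print("no ACTIVE assignment:", si)
```

Run against the TC #29 excerpt, this flags both `safSi=B,safApp=Test` and `safSi=C,safApp=Test`. An SI with no assignment at all does not appear in the output, so it would need a separate check against the configured SI list.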

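The manual comparison of installed and freshly built Amfd/Amfnd binaries described in the thread could be scripted roughly as below. This is only a sketch: `sha256_of` and `compare_binaries` are hypothetical helpers, the caller has to supply the real file locations, and matching hashes can only be expected if the install step copies the binaries unmodified (for example, without stripping them).

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_binaries(pairs):
    """pairs maps a label to (installed_path, built_path).

    Returns {label: True} when both files have identical contents,
    {label: False} otherwise.
    """
    return {
        label: sha256_of(installed) == sha256_of(built)
        for label, (installed, built) in pairs.items()
    }
```

A call might look like `compare_binaries({"osafamfnd": ("/usr/local/lib/opensaf/osafamfnd", built_path)})`, where the installed path is the one shown in the core dump header in the thread and `built_path` is a placeholder for wherever the build tree leaves the binary.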

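Several of the test cases in this thread end in the same syslog signature, `Assertion '...' failed`. A rough Python sketch of pulling those lines out of a syslog excerpt, so that crashes can be grouped by location, is given below. The regular expression is inferred only from the lines quoted in this thread, and `find_assertions` and `count_by_location` are hypothetical helper names.

```python
import re
from collections import Counter

# Matches lines such as:
#   Feb 11 19:56:41 PM_PL-3 osafamfnd[17561]: di.cc:850:
#   avnd_di_susi_resp_send: Assertion 'si' failed.
# (shown wrapped here; in syslog it is a single line)
ASSERT_RE = re.compile(
    r"(?P<host>\S+) (?P<proc>\w+)\[(?P<pid>\d+)\]: "
    r"(?P<file>[\w.]+):(?P<line>\d+): (?P<func>\w+): "
    r"Assertion '(?P<expr>[^']+)' failed"
)


def find_assertions(syslog_text):
    """Return (host, process, file:line, function, expression) tuples."""
    return [
        (m["host"], m["proc"], f"{m['file']}:{m['line']}", m["func"], m["expr"])
        for m in ASSERT_RE.finditer(syslog_text)
    ]


def count_by_location(syslog_text):
    """Count assertion failures per (file:line, function)."""
    return Counter(
        (location, func)
        for _, _, location, func, _ in find_assertions(syslog_text)
    )
```

Fed the syslog lines from TC #26 and TC #28, this groups the two `di.cc:850` failures in `avnd_di_susi_resp_send` together and lists the `csi.cc:1470` failure in `avd_compcsi_recreate` separately. Segmentation faults such as those in TC #17 and TC #20 leave no such line and would have to be identified from the backtraces instead.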