Hi Nagu,

One thing that would help us reproduce your problems: could you
attach to the ticket the models you are using for the tests?

Thanks,
Minh

On 15/02/16 19:32, Nagendra Kumar wrote:
> Hi Gary,
>       I am using the patch tar sent by Minh (9 Feb on the devel list), and I am
> applying these on the same changeset #7280 mentioned by Minh. So, please
> contact him for any clarifications.
>
> Are you finding a mismatch between the traces attached to ticket #1620 (for
> many test cases) and the source code of Amfd and Amfnd?
>
> BTW, I am attaching the tar sent by Minh and showing how I applied the patches
> on top of #7280. Please note that I took 010_log_1179.patch from my own repo,
> as the tar sent by Minh did not have the correct log patch for 1179. So,
> ideally, the Amf patches should be the same; please check that. I enabled the
> cloud feature (IMMSV_SC_ABSENCE_ALLOWED) manually.
> ==================================================
> patch -p1 < /tmp/sf_cloud_resilience_integration/777_osaftimer_2.diff
> patch -p1 <  ../OpensafHeadless/patches/010_log_1179.patch
> patch -p1 < /tmp/sf_cloud_resilience_integration/1620_README_V2.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1620_amfd_V3.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1620_amfnd_V3.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1180_ntf_agent.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1180_ntf_libs_common.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1180_ntf_readme.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/180_ntf_test.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1180_ntf_tools.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1620_common_libs_V2.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1620_config.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1621_ckpt.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_1.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_2.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_3.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_4.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_5.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_6.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1625_imm_7_compile_err.diff
> patch -p1 < /tmp/sf_cloud_resilience_integration/1646_clm.patch
> ==================================================
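Since the series above mixes two source directories and many file names, a quick existence check before running any `patch` command can catch a mistyped path. The following is only a minimal sketch: the function name `check_series` and the idea of keeping the paths in a list file (one path per line) are assumptions for illustration, not part of the workflow described in this thread.

```shell
#!/bin/sh
# check_series LIST: LIST is a text file with one patch path per line
# (hypothetical helper, not an OpenSAF tool).
# Prints every listed path that is missing or unreadable and returns
# non-zero if any such path was found. Nothing is applied to the tree.
check_series() {
    list_file=$1
    missing=0
    while IFS= read -r patch_path; do
        # Ignore blank lines and '#' comments in the list.
        case $patch_path in
            ''|'#'*) continue ;;
        esac
        if [ ! -r "$patch_path" ]; then
            echo "missing: $patch_path"
            missing=1
        fi
    done < "$list_file"
    return "$missing"
}
```

If the check passes, the same list can then be fed to `patch -p1` one entry at a time in the listed order.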
>
> I manually compared the installed Amfd and Amfnd binaries with the binaries
> built in the source code repo. I compiled again and they are the same. All
> the patches are applied.
> So, please check from your side and let me know if I am making any mistake.
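The binary comparison described above can be scripted so that everyone on the thread runs the same check. This is a sketch under assumptions: `same_binary` is a hypothetical helper name, and the location of the freshly built binary in the source tree is not stated in this thread, so it is passed in as a parameter.

```shell
#!/bin/sh
# same_binary INSTALLED BUILT (hypothetical helper): succeeds only if both
# files exist and are byte-for-byte identical. cmp -s compares silently and
# reports the result through its exit status.
same_binary() {
    installed=$1
    built=$2
    if [ ! -f "$installed" ] || [ ! -f "$built" ]; then
        echo "cannot compare: $installed vs $built" >&2
        return 2
    fi
    if cmp -s "$installed" "$built"; then
        echo "identical: $installed"
    else
        echo "DIFFERENT: $installed vs $built"
        return 1
    fi
}
```

For example, `same_binary /usr/local/lib/opensaf/osafamfnd "$BUILD_DIR/osafamfnd"`, where `BUILD_DIR` is a placeholder for wherever the build leaves the binary. A mismatch does not by itself prove the sources differ, since an install step may modify the file (for instance by stripping symbols).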
>
> Thanks
> -Nagu
>> -----Original Message-----
>> From: Gary Lee [mailto:gary....@dektech.com.au]
>> Sent: 15 February 2016 13:36
>> To: Nagendra Kumar
>> Cc: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya; opensaf-
>> de...@lists.sourceforge.net
>> Subject: Re: [devel] [PATCH 0 of 5] Review Request for amf: Add support for
>> cloud resilience [#1620] V2
>>
>> Hi Nagu
>>
>> I think we need to make sure we’re all looking at the same source code.
>>
>> I have trouble recreating some of the problems you’ve seen, but I see other
>> problems.
>>
>> Perhaps we can set up a fork of opensaf-staging on source forge, and check
>> in the patches?
>>
>> Thanks
>> Gary
>>
>>
>>> On 15 Feb 2016, at 4:00 PM, Gary Lee <gary....@dektech.com.au> wrote:
>>>
>>> Hi Nagu
>>>
>>> Just wanted to confirm that when you attach gdb to a process, the process
>>> is amf_demo, and not amfnd?
>>> Thanks
>>> Gary
>>>
>>>> On 13 Feb 2016, at 1:35 AM, Nagendra Kumar <nagendr...@oracle.com> wrote:
>>>> TC #27: Same configuration as TC #24:
>>>> Add a new CSI to the running demo application: keep gdb in the SU1 comp in
>>>> amf_csi_set_callback and add a new CSI to the existing SI. Stop the
>>>> controller, and then respond from gdb. Start the controller. Only the Active
>>>> assignment is given to the SU1 component. The Standby CSI assignment is not
>>>> given to the SU2 component:
>>>> safCSIComp=safComp=AmfDemo\,safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1
>>>> safCSIComp=safComp=AmfDemo\,safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1
>>>> safCSIComp=safComp=AmfDemo\,safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
>>>>
>>>> Logs "TC 27" are attached in ticket.
>>>>
>>>> TC #28: Same configuration as TC #24:
>>>> Delete a CSI in the running demo application: add a new CSI, keep gdb in the
>>>> SU1 comp in amf_csi_remove_callback, and then delete the CSI. Stop the
>>>> controller, and then respond from gdb. Start the controller. Amfd crashes:
>>>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: Started
>>>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: NO Sending node up due to NCSMDS_UP
>>>> Feb 12 15:52:32 PM_SC-1 osafamfd[16086]: NO Received node_up from 2010f: msg_id 1
>>>> Feb 12 15:52:32 PM_SC-1 osafamfd[16086]: csi.cc:1470: avd_compcsi_recreate: Assertion 'csi' failed.
>>>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: WA AMF director unexpectedly crashed
>>>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: WA AMF director unexpectedly crashed
>>>>
>>>> Logs "TC 28" are attached in ticket.
>>>>
>>>> TC #29: Configuration: One controller and two PLs. 2N SG, SU1 and SU2
>>>> with 3 comps, with one CSI and SI associated. The SIs are under SI dependencies:
>>>> safSi=B,safApp=Test depends on safSi=A,safApp=Test and
>>>> safSi=C,safApp=Test depends on safSi=B,safApp=Test. SU1(Act) on PL-3 and
>>>> SU2(Standby) on PL-4.
>>>> Lock SI A, keep gdb in the quiesced callback, and stop the controller. Respond
>>>> from gdb and start the controller. After the controller comes up, SU2 has two
>>>> Standby assignments and no Active assignments, which looks like a serious problem.
>>>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=B,safApp=Test
>>>>        saAmfSISUHAState=STANDBY(2)
>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
>>>>        saAmfSISUHAState=STANDBY(2)
>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>
>>>> At the same time, there are two error messages in the controller:
>>>> Feb 12 19:32:01 PM_SC-1 osafamfd[1572]: EM sg_2n_fsm.cc:1439: safSu=ABC2,safSg=2N,safApp=Test (31)
>>>> Feb 12 19:32:01 PM_SC-1 osafamfd[1572]: EM sg_2n_fsm.cc:1439: safSu=ABC2,safSg=2N,safApp=Test (31)
>>>>
>>>> Logs "TC 29" are attached in ticket.
>>>>
>>>> TC #30: Configuration same as TC #29: In this case, lock SI B, which is
>>>> dependent on SI A and is the sponsor of SI C.
>>>> Only the assignment of SI B is removed and the SI C assignment remains:
>>>> safSISU=safSu=ABC1\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
>>>>        saAmfSISUHAState=ACTIVE(1)
>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=A,safApp=Test
>>>>        saAmfSISUHAState=STANDBY(2)
>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>> safSISU=safSu=ABC1\,safSg=2N\,safApp=Test,safSi=A,safApp=Test
>>>>        saAmfSISUHAState=ACTIVE(1)
>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
>>>>        saAmfSISUHAState=STANDBY(2)
>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>
>>>> Logs " TC 30" are attached in ticket.
>>>>
>>>> TC #31: Same configuration as TC #30. Lock SI B, which is dependent on SI A
>>>> and is the sponsor of SI C. The assignment of SI B is removed and the
>>>> tolerance timer starts running.
>>>> Reboot the controller. The assignment of SI C should be removed because
>>>> its sponsor is in the locked state.
>>>> Logs " TC 30" are attached in ticket.
>>>>
>>>> Thanks
>>>> -Nagu
>>>>
>>>>> -----Original Message-----
>>>>> From: Nagendra Kumar
>>>>> Sent: 11 February 2016 20:06
>>>>> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
>>>>> Praveen Malviya
>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
>>>>> support for cloud resilience [#1620] V2
>>>>>
>>>>> TC #21: Node shutdown; result same as TC #17. Logs "TC 21" are attached
>>>>> in ticket.
>>>>>
>>>>> TC #22: Node lock and unlock (2nd time); result same as TC #18. Logs
>>>>> "TC 22" are attached in ticket.
>>>>>
>>>>> TC #23: Node group shutdown operation: stop the controller during the
>>>>> shutdown operation, and it will remain in the shutdown state even if all
>>>>> the assignments are removed.
>>>>>
>>>>> TC #24: Configuration: Start SC-1, PL-3 and PL-4, and create a 2N
>>>>> redundancy model with SU1(Act) on PL-3 and SU2(Standby) on PL-4. Create a
>>>>> node group containing PL-3 and PL-4.
>>>>> Now, lock and lock-in the node group. Reboot the controller. Check
>>>>> the admin state of the node group; it will be 3. Now, do unlock-in of the
>>>>> node group: SU1 and SU2 are not instantiated (which is wrong). The node
>>>>> group admin state is now 2 (locked). Logs "TC 24" are attached in ticket.
>>>>>
>>>>> TC #25: Same configuration as TC #24:
>>>>> Lock the node group and keep gdb in amf_csi_set_callback in SU1's
>>>>> comp(Act) and amf_csi_remove_callback in SU2's comp(Standby). Reboot
>>>>> the controller. When SC-1 comes up, respond from gdb from the SU1 and
>>>>> SU2 components. The following error appears, which means an extra SUSI
>>>>> remove is coming:
>>>>> Feb 11 19:46:27 PM_PL-3 osafamfnd[16881]: ER susi_assign_evh: 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' has no assignments
>>>>>
>>>>> Logs "TC 25" are attached in ticket.
>>>>>
>>>>> TC #26: Same configuration as TC #24:
>>>>> Lock the node group and keep gdb in amf_csi_remove_callback in SU1's
>>>>> comp(Act) and amf_csi_remove_callback in SU2's comp(Standby). Reboot
>>>>> the controller. When SC-1 comes up, respond from gdb from the SU1 and
>>>>> SU2 components. Amfnd on PL-3 and PL-4 crashes:
>>>>> PL-3:
>>>>> Feb 11 19:56:41 PM_PL-3 osafamfnd[17561]: di.cc:850: avnd_di_susi_resp_send: Assertion 'si' failed.
>>>>> Feb 11 19:56:42 PM_PL-3 osafclmna[17552]: AL AMF Node Director is down, terminate this process
>>>>> PL-4:
>>>>> Feb 11 19:56:40 PM_PL-4 osafamfnd[10912]: di.cc:850: avnd_di_susi_resp_send: Assertion 'si' failed.
>>>>> Feb 11 19:56:40 PM_PL-4 osafimmnd[10893]: NO Global discard node received for nodeId:2030f pid:17542
>>>>> Feb 11 19:56:40 PM_PL-4 osafimmnd[10893]: NO Implementer disconnected 13 <0, 2030f(down)> (MsgQueueService131855)
>>>>> Feb 11 19:56:40 PM_PL-4 amf_demo[11007]: AL AMF Node Director is down, terminate this process
>>>>>
>>>>> Logs "TC 26" are attached in ticket.
>>>>>
>>>>> Thanks
>>>>> -Nagu
>>>>>> -----Original Message-----
>>>>>> From: Nagendra Kumar
>>>>>> Sent: 10 February 2016 17:30
>>>>>> To: minh chau; hans.nordeb...@ericsson.com;
>>>>>> gary....@dektech.com.au; Praveen Malviya
>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
>>>>>> support for cloud resilience [#1620] V2
>>>>>>
>>>>>> TC #16. Same configuration as #12: Run SI shutdown with a sleep
>>>>>> of 5 sec before saAmfCSIQuiescingComplete, stop the controller, and
>>>>>> then, after the sleep, reject saAmfCSIQuiescingComplete with
>>>>>> SA_AIS_ERR_FAILED_OPERATION. All the assignments from SU1 on PL-3 and
>>>>>> SU2 on PL-4 are removed and the SI admin state is 2 (locked):
>>>>>> saAmfSIAdminState                                  SA_UINT32_T  2 (0x2)
>>>>>>
>>>>>> The SI going into the locked state differs from the behaviour seen when
>>>>>> this test case is run with the controller up and running. When the
>>>>>> controller is available, the SI will be in the unlocked state and all the
>>>>>> assignments will be on SU2 as Act and SU3 as Standby (on PL-4). This needs
>>>>>> either correction or documentation.
>>>>>>
>>>>>> TC #17. Same configuration as #12: Run SG shutdown with a sleep
>>>>>> of 5 sec before saAmfCSIQuiescingComplete, stop the controller, and
>>>>>> then, after the sleep, reject saAmfCSIQuiescingComplete with
>>>>>> SA_AIS_ERR_FAILED_OPERATION. Amfnd crashes [please note that this
>>>>>> test case works with the controller up]:
>>>>>> Syslog and bt:
>>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO component with QUIESCED/QUIESCING assignment failed
>>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO recovery action 'comp restart' escalated to 'comp failover'
>>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO SU failover probation timer started (timeout: 1200000000000 ns)
>>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO Performing failover of 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
>>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' recovery action escalated from 'componentRestart' to 'componentFailover'
>>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
>>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
>>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>>>> Feb 10 11:44:29 PM_PL-3 amf_demo[15721]: saAmfHAStateGet FAILED - 7
>>>>>> Feb 10 11:44:29 PM_PL-3 amf_demo[15721]: exiting (caught term signal)
>>>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO avnd_di_oper_send() deferred as AMF director is offline
>>>>>> Feb 10 11:44:29 PM_PL-3 osafimmnd[15760]: AL AMF Node Director is down, terminate this process
>>>>>>
>>>>>> Program terminated with signal 11, Segmentation fault.
>>>>>> #0  0x0000000000412b50 in avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
>>>>>> (gdb) bt
>>>>>> #0  0x0000000000412b50 in avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
>>>>>> #1  0x000000000040a093 in avnd_comp_clc_terming_cleansucc_hdler(avnd_cb_tag*, avnd_comp_tag*) ()
>>>>>> #2  0x000000000040c7d4 in avnd_comp_clc_fsm_run(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) ()
>>>>>> #3  0x000000000040ce49 in avnd_evt_clc_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
>>>>>> #4  0x000000000042133f in avnd_main_process() () at main.cc:667
>>>>>> #5  0x0000000000405517 in main () at main.cc:186
>>>>>> (gdb) thread apply all bt
>>>>>>
>>>>>> Thread 4 (Thread 0x7fdaf3c05b00 (LWP 15512)):
>>>>>> #0  0x00007fdaf2b2b415 in __lll_unlock_wake () from /lib64/libpthread.so.0
>>>>>> #1  0x00007fdaf2b27ac4 in _L_unlock_553 () from /lib64/libpthread.so.0
>>>>>> #2  0x00007fdaf2b279f7 in __pthread_mutex_unlock_usercnt () from /lib64/libpthread.so.0
>>>>>> #3  0x00007fdaf37edac3 in ncs_os_lock () from /usr/local/lib/libopensaf_core.so.0
>>>>>> #4  0x00007fdaf37e084d in ncs_ipc_send () from /usr/local/lib/libopensaf_core.so.0
>>>>>> #5  0x000000000041eea1 in avnd_evt_send(avnd_cb_tag*, avnd_evt_tag*) ()
>>>>>> #6  0x000000000040a2cb in comp_clc_resp_callback(NCS_OS_PROC_EXECUTE_TIMED_CB_INFO*) ()
>>>>>> #7  0x00007fdaf37ecdfb in give_exec_mod_cb () from /usr/local/lib/libopensaf_core.so.0
>>>>>> #8  0x00007fdaf37ecfde in ncs_exec_mod_hdlr () from /usr/local/lib/libopensaf_core.so.0
>>>>>> #9  0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
>>>>>> #10 0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
>>>>>> #11 0x0000000000000000 in ?? ()
>>>>>>
>>>>>> Thread 3 (Thread 0x7fdaf3c25b00 (LWP 15510)):
>>>>>> #0  0x00007fdaf20d14f6 in poll () from /lib64/libc.so.6
>>>>>> #1  0x00007fdaf3817623 in mdtm_process_recv_events () from /usr/local/lib/libopensaf_core.so.0
>>>>>> #2  0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
>>>>>> #3  0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
>>>>>> #4  0x0000000000000000 in ?? ()
>>>>>>
>>>>>> Thread 2 (Thread 0x7fdaf3c58b00 (LWP 15509)):
>>>>>> #0  0x00007fdaf20d14f6 in poll () from /lib64/libc.so.6
>>>>>> #1  0x00007fdaf37db22f in osaf_ppoll () from /usr/local/lib/libopensaf_core.so.0
>>>>>> #2  0x00007fdaf37e2acf in ncs_tmr_wait () from /usr/local/lib/libopensaf_core.so.0
>>>>>> #3  0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
>>>>>> #4  0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
>>>>>> #5  0x0000000000000000 in ?? ()
>>>>>>
>>>>>> Thread 1 (Thread 0x7fdaf3c28720 (LWP 15508)):
>>>>>> #0  0x0000000000412b50 in avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
>>>>>> #1  0x000000000040a093 in avnd_comp_clc_terming_cleansucc_hdler(avnd_cb_tag*, avnd_comp_tag*) ()
>>>>>> #2  0x000000000040c7d4 in avnd_comp_clc_fsm_run(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) ()
>>>>>> #3  0x000000000040ce49 in avnd_evt_clc_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
>>>>>> #4  0x000000000042133f in avnd_main_process() () at main.cc:667
>>>>>> #5  0x0000000000405517 in main () at main.cc:186
>>>>>>
>>>>>> TC #18. Same configuration as #12: Run SG lock, keep gdb in
>>>>>> amf_csi_remove_callback, stop the controller, and then start the
>>>>>> controller and bring it up. Now release Amfnd from gdb so that it
>>>>>> can respond to the CSI remove (please note that the controller has
>>>>>> rebooted and is available now). Now, issue SG unlock. Amfnd crashes on PL-3
>>>>>> and PL-4 at the same location [please note that this test case works
>>>>>> with the controller up]:
>>>>>> Syslog and bt:
>>>>>> Feb 10 16:35:51 PM_PL-3 amf_demo[26623]: CSI Remove for all CSIs
>>>>>> Feb 10 16:35:51 PM_PL-3 osafamfnd[26545]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>>>> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: NO Assigning 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>>>> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]: CSI Set - add 'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Active
>>>>>> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]:        name: abcdef, value: val1
>>>>>> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]:        name: abcdef, value: val2
>>>>>> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: NO Assigned 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>>>> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: di.cc:850: avnd_di_susi_resp_send: Assertion 'si' failed.
>>>>>> Feb 10 16:36:02 PM_PL-3 osafclmna[26536]: AL AMF Node Director is down, terminate this process
>>>>>>
>>>>>> Core was generated by `/usr/local/lib/opensaf/osafamfnd --tracemask=0xffffffff'.
>>>>>> Program terminated with signal 6, Aborted.
>>>>>> #0  0x00007f022ebd9b55 in raise () from /lib64/libc.so.6
>>>>>> (gdb) bt
>>>>>> #0  0x00007f022ebd9b55 in raise () from /lib64/libc.so.6
>>>>>> #1  0x00007f022ebdb131 in abort () from /lib64/libc.so.6
>>>>>> #2  0x00007f023038331b in __osafassert_fail () from /usr/local/lib/libopensaf_core.so.0
>>>>>> #3  0x000000000041b399 in avnd_di_susi_resp_send(avnd_cb_tag*, avnd_su_tag*, avnd_su_si_rec*) ()
>>>>>> #4  0x000000000042e9fa in avnd_su_si_oper_done(avnd_cb_tag*, avnd_su_tag*, avnd_su_si_rec*) ()
>>>>>> #5  0x0000000000411622 in avnd_comp_csi_assign_done(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_csi_rec*) ()
>>>>>> #6  0x0000000000407397 in avnd_evt_ava_resp_evh(avnd_cb_tag*, avnd_evt_tag*) ()
>>>>>> #7  0x000000000042133f in avnd_main_process() () at main.cc:667
>>>>>> #8  0x0000000000405517 in main () at main.cc:186
>>>>>>
>>>>>> TC #19. Same configuration as #12: Run Node lock with a sleep of
>>>>>> 5 sec in amf_csi_set_callback and stop the controller. Reject the quiesced
>>>>>> assignment in amf_csi_set_callback; Amfnd crashes. The syslog and gdb
>>>>>> output are the same as in TC #17.
>>>>>>
>>>>>> TC #20. Same configuration as #12: Issue Node shutdown with a
>>>>>> sleep of 5 sec in amf_csi_set_callback before sending
>>>>>> saAmfResponse(), and stop the controller. Amfnd crashes:
>>>>>> Syslog:
>>>>>>
>>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO component with QUIESCED/QUIESCING assignment failed
>>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO recovery action 'comp restart' escalated to 'comp failover'
>>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO SU failover probation timer started (timeout: 1200000000000 ns)
>>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO Performing failover of 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
>>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' recovery action escalated from 'componentRestart' to 'componentFailover'
>>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
>>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
>>>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO Removed 'safSi=AmfDemo,safApp=AmfDemo1' from 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>>>> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: saAmfHAStateGet FAILED - 7
>>>>>> Feb 10 17:21:10 PM_PL-3 osafimmnd[29561]: AL AMF Node Director is down, terminate this process
>>>>>> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: AL AMF Node Director is down, terminate this process
>>>>>> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: exiting (caught term signal)
>>>>>> Feb 10 17:21:10 PM_PL-3 osafclmna[29321]: AL AMF Node Director is down, terminate this process
>>>>>>
>>>>>> Bt:
>>>>>> Core was generated by `/usr/local/lib/opensaf/osafamfnd --tracemask=0xffffffff'.
>>>>>> Program terminated with signal 11, Segmentation fault.
>>>>>> #0  0x00000000004117c9 in avnd_comp_csi_assign_done(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_csi_rec*) ()
>>>>>> (gdb) bt
>>>>>> #0  0x00000000004117c9 in avnd_comp_csi_assign_done(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_csi_rec*) ()
>>>>>> #1  0x0000000000406a3b in avnd_evt_ava_csi_quiescing_compl_evh(avnd_cb_tag*, avnd_evt_tag*) ()
>>>>>> #2  0x000000000042133f in avnd_main_process() () at main.cc:667
>>>>>> #3  0x0000000000405517 in main () at main.cc:186
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> -Nagu
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Nagendra Kumar
>>>>>>> Sent: 09 February 2016 21:39
>>>>>>> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
>>>>>>> Praveen Malviya
>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
>>>>>>> support for cloud resilience [#1620] V2
>>>>>>>
>>>>>>> 15. Same configuration as Test Case #12, SI lock. Keep gdb in both
>>>>>>> the SUs for CSI remove and keep the timeout as 100 sec. Lock the SI and
>>>>>>> stop the controller.
>>>>>>> Start the controller and allow the CSI remove to time out.
>>>>>>> Two things:
>>>>>>>         SU2 has a Standby assignment (which is wrong); SU1 has no assignment.
>>>>>>>         Error at PL-4: SU-SI record addition failed
>>>>>>>
>>>>>>> PM_SC-1:/home/nagu/views/staging # amf-state  siass
>>>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>>        saAmfSISUHAState=ACTIVE(1)
>>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>>        saAmfSISUHAState=ACTIVE(1)
>>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
>>>>>>>        saAmfSISUHAState=STANDBY(2)
>>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>>>        saAmfSISUHAState=ACTIVE(1)
>>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>        saAmfSISUHAState=ACTIVE(1)
>>>>>>>        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>>>>
>>>>>>> Syslog of PL-4:
>>>>>>>
>>>>>>> Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' component restart probation timer started (timeout: 60000000000 ns)
>>>>>>> Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO Restarting a component of 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' (comp restart count: 1)
>>>>>>> Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
>>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo_script: killproc /opt/amf_demo/amf_demo failed
>>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' started
>>>>>>> Feb  9 21:24:55 PM_PL-4 osafamfnd[7998]: NO Removed 'safSi=AmfDemo1,safApp=AmfDemo1' from 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
>>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: HC started with AMF
>>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: Registered with AMF
>>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: CSI Set - add 'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Standby
>>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]:         name: abcdef, value: val1
>>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]:         name: abcdef, value: val2
>>>>>>> Feb  9 21:24:55 PM_PL-4 osafamfnd[7998]: CR SU-SI record addition failed, SU= safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1 : SI=safSi=AmfDemo,safApp=AmfDemo1
>>>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: Health check 1
>>>>>>> Feb  9 21:25:50 PM_PL-4 osafamfnd[7998]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' Component or SU restart probation timer expired
>>>>>>>
>>>>>>> Thanks
>>>>>>> -Nagu
>>>>>>>
>>>>>>> ------------------------------------------------------------------
>>>>>>> --
>>>>>>> ----------
>>>>>>> Site24x7 APM Insight: Get Deep Visibility into Application
>>>>>>> Performance APM
>>>>>> +
>>>>>>> Mobile APM + RUM: Monitor 3 App instances at just $35/Month
>>>>>>> Monitor end-to-end web transactions and take corrective actions
>>>>>>> now
>>>>>> Troubleshoot
>>>>>>> faster and improve end-user experience. Signup Now!
>>>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
>>>>>>> _______________________________________________
>>>>>>> Opensaf-devel mailing list
>>>>>>> Opensaf-devel@lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel


------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
