Hi Nagu

I think we need to make sure we’re all looking at the same source code.

I have trouble recreating some of the problems you’ve seen, but I see other 
problems.

Perhaps we can set up a fork of opensaf-staging on SourceForge and check in 
the patches?

Thanks
Gary


> On 15 Feb 2016, at 4:00 PM, Gary Lee <gary....@dektech.com.au> wrote:
> 
> Hi Nagu
> 
> Just wanted to confirm that when you attach gdb to a process, the process is 
> amf_demo, and not amfnd?
> 
> Thanks
> Gary
> 
>> On 13 Feb 2016, at 1:35 AM, Nagendra Kumar <nagendr...@oracle.com> wrote:
>> 
>> TC #27: Same configuration as TC #24:
>> Add a new CSI to the running demo application: keep gdb in the SU1 component 
>> in amf_csi_set_callback and add a new CSI to the existing SI. Stop the 
>> controller, then respond from gdb. Start the controller. Only the Active 
>> assignment is given to the SU1 component; the Standby CSI assignment is not 
>> given to the SU2 component:
>> safCSIComp=safComp=AmfDemo\,safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1
>> safCSIComp=safComp=AmfDemo\,safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1
>> safCSIComp=safComp=AmfDemo\,safSu=SU1\,safSg=AmfDemo\,safApp=AmfDemo1,safCsi=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
>> 
>> Logs "TC 27" are attached in the ticket.
>> 
>> TC #28: Same configuration as TC #24:
>> Delete a CSI in the running demo application: add a new CSI, keep gdb in the 
>> SU1 component in amf_csi_remove_callback, and then delete the CSI. Stop the 
>> controller, then respond from gdb. Start the controller. Amfd crashes:
>> 
>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: Started
>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: NO Sending node up due to NCSMDS_UP
>> Feb 12 15:52:32 PM_SC-1 osafamfd[16086]: NO Received node_up from 2010f: 
>> msg_id 1
>> Feb 12 15:52:32 PM_SC-1 osafamfd[16086]: csi.cc:1470: avd_compcsi_recreate: 
>> Assertion 'csi' failed.
>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: WA AMF director unexpectedly 
>> crashed
>> Feb 12 15:52:32 PM_SC-1 osafamfnd[16096]: WA AMF director unexpectedly 
>> crashed
>> 
>> Logs "TC 28" are attached in the ticket.
>> 
>> TC #29: Configuration: one controller and two PLs. 2N SG, SU1 and SU2 with 
>> 3 components, with one CSI and SI associated. The SIs are under SI 
>> dependencies: safSi=B,safApp=Test depends on safSi=A,safApp=Test, and 
>> safSi=C,safApp=Test depends on safSi=B,safApp=Test. SU1 (Active) is on PL-3 
>> and SU2 (Standby) is on PL-4.
>> 
>> Lock SI A, keep gdb in the quiesced callback, and stop the controller. 
>> Respond from gdb and start the controller. After the controller comes up, 
>> SU2 has two Standby assignments and no Active assignments, which looks like 
>> a serious problem.
>> 
>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=B,safApp=Test
>>       saAmfSISUHAState=STANDBY(2)
>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
>>       saAmfSISUHAState=STANDBY(2)
>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>> 
>> At the same time, there are two error messages on the controller:
>> Feb 12 19:32:01 PM_SC-1 osafamfd[1572]: EM sg_2n_fsm.cc:1439: 
>> safSu=ABC2,safSg=2N,safApp=Test (31)
>> Feb 12 19:32:01 PM_SC-1 osafamfd[1572]: EM sg_2n_fsm.cc:1439: 
>> safSu=ABC2,safSg=2N,safApp=Test (31)
>> 
>> Logs "TC 29" are attached in the ticket.
>> 
>> TC #30: Same configuration as TC #29. In this case, lock SI B, which is 
>> dependent on SI A and is the sponsor of SI C.
>> Only the assignment of SI B is removed; the SI C assignment remains:
>> safSISU=safSu=ABC1\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
>>       saAmfSISUHAState=ACTIVE(1)
>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=A,safApp=Test
>>       saAmfSISUHAState=STANDBY(2)
>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>> safSISU=safSu=ABC1\,safSg=2N\,safApp=Test,safSi=A,safApp=Test
>>       saAmfSISUHAState=ACTIVE(1)
>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>> safSISU=safSu=ABC2\,safSg=2N\,safApp=Test,safSi=C,safApp=Test
>>       saAmfSISUHAState=STANDBY(2)
>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>> 
>> Logs "TC 30" are attached in the ticket.
>> 
>> TC #31: Same configuration as TC #30. Lock SI B, which is dependent on SI A 
>> and is the sponsor of SI C. The assignment of SI B is removed and the 
>> tolerance timer starts running.
>> Reboot the controller. The assignment of SI C should be removed because its 
>> sponsor is in the locked state.
>> Logs "TC 30" are attached in the ticket.
>> 
>> Thanks
>> -Nagu
>> 
>>> -----Original Message-----
>>> From: Nagendra Kumar
>>> Sent: 11 February 2016 20:06
>>> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
>>> Praveen Malviya
>>> Cc: opensaf-devel@lists.sourceforge.net
>>> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add support
>>> for cloud resilience [#1620] V2
>>> 
>>> TC #21: Node shutdown; result same as TC #17. Logs "TC 21" are attached in
>>> the ticket.
>>> 
>>> TC #22: Node lock and unlock (2nd time); result same as TC #18. Logs "TC 22"
>>> are attached in the ticket.
>>> 
>>> TC #23: Node group shutdown operation: stop the controller during the
>>> shutdown operation, and it will remain in the shutdown state even if all the
>>> assignments are removed.
>>> 
>>> TC #24: Configuration: Start SC-1, PL-3 and PL-4, and create a 2N redundancy
>>> model with SU1 (Active) on PL-3 and SU2 (Standby) on PL-4. Create a node
>>> group containing PL-3 and PL-4.
>>> Now lock and lock-in the node group. Reboot the controller. Check the admin
>>> state of the node group; it will be 3. Now do unlock-in of the node group:
>>> SU1 and SU2 are not instantiated (which is wrong). The node group admin
>>> state is now 2 (locked). Logs "TC 24" are attached in the ticket.
>>> 
>>> TC #25: Same configuration as TC #24:
>>> Lock the node group and keep gdb in amf_csi_set_callback in SU1's
>>> component (Active) and in amf_csi_remove_callback in SU2's component
>>> (Standby). Reboot the controller. When SC-1 comes up, respond from gdb in
>>> the SU1 and SU2 components. The following error appears, which means an
>>> extra SUSI remove is arriving:
>>> Feb 11 19:46:27 PM_PL-3 osafamfnd[16881]: ER susi_assign_evh:
>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' has no assignments
>>> 
>>> Logs "TC 25" are attached in the ticket.
>>> 
>>> TC #26: Same configuration as TC #24:
>>> Lock the node group and keep gdb in amf_csi_remove_callback in SU1's
>>> component (Active) and in amf_csi_remove_callback in SU2's component
>>> (Standby). Reboot the controller. When SC-1 comes up, respond from gdb in
>>> the SU1 and SU2 components. Amfnd on PL-3 and PL-4 crashes:
>>> PL-3:
>>> Feb 11 19:56:41 PM_PL-3 osafamfnd[17561]: di.cc:850:
>>> avnd_di_susi_resp_send: Assertion 'si' failed.
>>> Feb 11 19:56:42 PM_PL-3 osafclmna[17552]: AL AMF Node Director is down,
>>> terminate this process
>>> PL-4:
>>> Feb 11 19:56:40 PM_PL-4 osafamfnd[10912]: di.cc:850:
>>> avnd_di_susi_resp_send: Assertion 'si' failed.
>>> Feb 11 19:56:40 PM_PL-4 osafimmnd[10893]: NO Global discard node
>>> received for nodeId:2030f pid:17542
>>> Feb 11 19:56:40 PM_PL-4 osafimmnd[10893]: NO Implementer disconnected 13
>>> <0, 2030f(down)> (MsgQueueService131855)
>>> Feb 11 19:56:40 PM_PL-4 amf_demo[11007]: AL AMF Node Director is down,
>>> terminate this process
>>> 
>>> Logs "TC 26" are attached in the ticket.
>>> 
>>> Thanks
>>> -Nagu
>>>> -----Original Message-----
>>>> From: Nagendra Kumar
>>>> Sent: 10 February 2016 17:30
>>>> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
>>>> Praveen Malviya
>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
>>>> support for cloud resilience [#1620] V2
>>>> 
>>>> TC #16. Same configuration as #12: Run SI shutdown with a sleep of 5 sec
>>>> before saAmfCSIQuiescingComplete, stop the controller, and then, after the
>>>> sleep, reject saAmfCSIQuiescingComplete with SA_AIS_ERR_FAILED_OPERATION.
>>>> All the assignments from SU1 on PL-3 and SU2 on PL-4 are removed and the SI
>>>> admin state is 2 (locked):
>>>> saAmfSIAdminState                                  SA_UINT32_T  2 (0x2)
>>>> 
>>>> "SI going into locked state" differs from the behaviour seen when this test
>>>> case is run with the controller up and running. When the controller is
>>>> available, the SI will be in the unlocked state and all the assignments will
>>>> be on SU2 as Active and SU3 as Standby (on PL-4). This needs either
>>>> correction or documentation.
>>>> 
>>>> TC #17. Same configuration as #12: Run SG shutdown with a sleep of 5 sec
>>>> before saAmfCSIQuiescingComplete, stop the controller, and then, after the
>>>> sleep, reject saAmfCSIQuiescingComplete with SA_AIS_ERR_FAILED_OPERATION.
>>>> Amfnd crashes [please note that this test case works with the controller
>>>> up]:
>>>> Syslog and bt:
>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO component with
>>>> QUIESCED/QUIESCING assignment failed
>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO recovery action 'comp restart'
>>>> escalated to 'comp failover'
>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO SU failover probation timer
>>>> started (timeout: 1200000000000 ns)
>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO Performing failover of
>>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO
>>>> 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>> recovery action escalated from 'componentRestart' to 'componentFailover'
>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO
>>>> 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>> faulted due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO
>>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED =>
>>>> TERMINATING
>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO Removed
>>>> 'safSi=AmfDemo,safApp=AmfDemo1' from
>>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>> Feb 10 11:44:29 PM_PL-3 amf_demo[15721]: saAmfHAStateGet FAILED - 7
>>>> Feb 10 11:44:29 PM_PL-3 amf_demo[15721]: exiting (caught term signal)
>>>> Feb 10 11:44:29 PM_PL-3 osafamfnd[15508]: NO avnd_di_oper_send()
>>>> deferred as AMF director is offline
>>>> Feb 10 11:44:29 PM_PL-3 osafimmnd[15760]: AL AMF Node Director is down,
>>>> terminate this process
>>>> 
>>>> Program terminated with signal 11, Segmentation fault.
>>>> #0  0x0000000000412b50 in
>>>> avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
>>>> (gdb) bt
>>>> #0  0x0000000000412b50 in
>>>> avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
>>>> #1  0x000000000040a093 in
>>>> avnd_comp_clc_terming_cleansucc_hdler(avnd_cb_tag*, avnd_comp_tag*) ()
>>>> #2  0x000000000040c7d4 in avnd_comp_clc_fsm_run(avnd_cb_tag*,
>>>> avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) ()
>>>> #3  0x000000000040ce49 in avnd_evt_clc_resp_evh(avnd_cb_tag*,
>>>> avnd_evt_tag*) ()
>>>> #4  0x000000000042133f in avnd_main_process() () at main.cc:667
>>>> #5  0x0000000000405517 in main () at main.cc:186
>>>> (gdb) thread apply all bt
>>>> 
>>>> Thread 4 (Thread 0x7fdaf3c05b00 (LWP 15512)):
>>>> #0  0x00007fdaf2b2b415 in __lll_unlock_wake () from
>>>> /lib64/libpthread.so.0
>>>> #1  0x00007fdaf2b27ac4 in _L_unlock_553 () from /lib64/libpthread.so.0
>>>> #2  0x00007fdaf2b279f7 in __pthread_mutex_unlock_usercnt () from
>>>> /lib64/libpthread.so.0
>>>> #3  0x00007fdaf37edac3 in ncs_os_lock () from
>>>> /usr/local/lib/libopensaf_core.so.0
>>>> #4  0x00007fdaf37e084d in ncs_ipc_send () from
>>>> /usr/local/lib/libopensaf_core.so.0
>>>> #5  0x000000000041eea1 in avnd_evt_send(avnd_cb_tag*, avnd_evt_tag*)
>>>> ()
>>>> #6  0x000000000040a2cb in
>>>> comp_clc_resp_callback(NCS_OS_PROC_EXECUTE_TIMED_CB_INFO*) ()
>>>> #7  0x00007fdaf37ecdfb in give_exec_mod_cb () from
>>>> /usr/local/lib/libopensaf_core.so.0
>>>> #8  0x00007fdaf37ecfde in ncs_exec_mod_hdlr () from
>>>> /usr/local/lib/libopensaf_core.so.0
>>>> #9  0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
>>>> #10 0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
>>>> #11 0x0000000000000000 in ?? ()
>>>> 
>>>> Thread 3 (Thread 0x7fdaf3c25b00 (LWP 15510)):
>>>> #0  0x00007fdaf20d14f6 in poll () from /lib64/libc.so.6
>>>> #1  0x00007fdaf3817623 in mdtm_process_recv_events () from
>>>> /usr/local/lib/libopensaf_core.so.0
>>>> #2  0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
>>>> #3  0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
>>>> #4  0x0000000000000000 in ?? ()
>>>> 
>>>> Thread 2 (Thread 0x7fdaf3c58b00 (LWP 15509)):
>>>> #0  0x00007fdaf20d14f6 in poll () from /lib64/libc.so.6
>>>> #1  0x00007fdaf37db22f in osaf_ppoll () from
>>>> /usr/local/lib/libopensaf_core.so.0
>>>> #2  0x00007fdaf37e2acf in ncs_tmr_wait () from
>>>> /usr/local/lib/libopensaf_core.so.0
>>>> #3  0x00007fdaf2b247b6 in start_thread () from /lib64/libpthread.so.0
>>>> #4  0x00007fdaf20da9cd in clone () from /lib64/libc.so.6
>>>> #5  0x0000000000000000 in ?? ()
>>>> 
>>>> Thread 1 (Thread 0x7fdaf3c28720 (LWP 15508)):
>>>> #0  0x0000000000412b50 in
>>>> avnd_comp_cmplete_all_assignment(avnd_cb_tag*, avnd_comp_tag*) ()
>>>> #1  0x000000000040a093 in
>>>> avnd_comp_clc_terming_cleansucc_hdler(avnd_cb_tag*, avnd_comp_tag*) ()
>>>> #2  0x000000000040c7d4 in avnd_comp_clc_fsm_run(avnd_cb_tag*,
>>>> avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) ()
>>>> #3  0x000000000040ce49 in avnd_evt_clc_resp_evh(avnd_cb_tag*,
>>>> avnd_evt_tag*) ()
>>>> #4  0x000000000042133f in avnd_main_process() () at main.cc:667
>>>> #5  0x0000000000405517 in main () at main.cc:186
>>>> 
>>>> TC #18. Same configuration as #12: Run SG lock, keep gdb in
>>>> amf_csi_remove_callback, stop the controller, and then start the
>>>> controller and bring it up. Now release Amfnd from gdb so that it can
>>>> respond to the CSI remove (please note that the controller has rebooted and
>>>> is available now). Now issue SG unlock. Amfnd crashes on PL-3 and PL-4 at
>>>> the same location [please note that this test case works with the
>>>> controller up]:
>>>> Syslog and Bt:
>>>> Feb 10 16:35:51 PM_PL-3 amf_demo[26623]: CSI Remove for all CSIs
>>>> Feb 10 16:35:51 PM_PL-3 osafamfnd[26545]: NO Removed
>>>> 'safSi=AmfDemo,safApp=AmfDemo1' from
>>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: NO Assigning
>>>> 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to
>>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]: CSI Set - add
>>>> 'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Active
>>>> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]:        name: abcdef, value: val1
>>>> Feb 10 16:36:02 PM_PL-3 amf_demo[26623]:        name: abcdef, value: val2
>>>> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: NO Assigned
>>>> 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to
>>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>> Feb 10 16:36:02 PM_PL-3 osafamfnd[26545]: di.cc:850:
>>>> avnd_di_susi_resp_send: Assertion 'si' failed.
>>>> Feb 10 16:36:02 PM_PL-3 osafclmna[26536]: AL AMF Node Director is
>>>> down, terminate this process
>>>> 
>>>> Core was generated by `/usr/local/lib/opensaf/osafamfnd --
>>>> tracemask=0xffffffff'.
>>>> Program terminated with signal 6, Aborted.
>>>> #0  0x00007f022ebd9b55 in raise () from /lib64/libc.so.6
>>>> (gdb) bt
>>>> #0  0x00007f022ebd9b55 in raise () from /lib64/libc.so.6
>>>> #1  0x00007f022ebdb131 in abort () from /lib64/libc.so.6
>>>> #2  0x00007f023038331b in __osafassert_fail () from
>>>> /usr/local/lib/libopensaf_core.so.0
>>>> #3  0x000000000041b399 in avnd_di_susi_resp_send(avnd_cb_tag*,
>>>> avnd_su_tag*, avnd_su_si_rec*) ()
>>>> #4  0x000000000042e9fa in avnd_su_si_oper_done(avnd_cb_tag*,
>>>> avnd_su_tag*, avnd_su_si_rec*) ()
>>>> #5  0x0000000000411622 in avnd_comp_csi_assign_done(avnd_cb_tag*,
>>>> avnd_comp_tag*, avnd_comp_csi_rec*) ()
>>>> #6  0x0000000000407397 in avnd_evt_ava_resp_evh(avnd_cb_tag*,
>>>> avnd_evt_tag*) ()
>>>> #7  0x000000000042133f in avnd_main_process() () at main.cc:667
>>>> #8  0x0000000000405517 in main () at main.cc:186
>>>> 
>>>> TC #19. Same configuration as #12: Run Node lock with a sleep of 5 sec in
>>>> amf_csi_set_callback and stop the controller. Reject the quiesced
>>>> assignment in amf_csi_set_callback; Amfnd crashes. Syslog and gdb output
>>>> are the same as in TC #17.
>>>> 
>>>> TC #20. Same configuration as #12: Issue Node shutdown with a sleep of
>>>> 5 sec in amf_csi_set_callback before sending saAmfResponse(), and stop the
>>>> controller. Amfnd crashes:
>>>> Syslog:
>>>> 
>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO component with
>>>> QUIESCED/QUIESCING assignment failed
>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO recovery action 'comp restart'
>>>> escalated to 'comp failover'
>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO SU failover probation timer
>>>> started (timeout: 1200000000000 ns)
>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO Performing failover of
>>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO
>>>> 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>> recovery action escalated from 'componentRestart' to 'componentFailover'
>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO
>>>> 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>> faulted due to 'csiSetcallbackFailed' : Recovery is 'componentFailover'
>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO
>>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED =>
>>>> TERMINATING
>>>> Feb 10 17:21:10 PM_PL-3 osafamfnd[29330]: NO Removed
>>>> 'safSi=AmfDemo,safApp=AmfDemo1' from
>>>> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
>>>> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: saAmfHAStateGet FAILED - 7
>>>> Feb 10 17:21:10 PM_PL-3 osafimmnd[29561]: AL AMF Node Director is down,
>>>> terminate this process
>>>> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: AL AMF Node Director is down,
>>>> terminate this process
>>>> Feb 10 17:21:10 PM_PL-3 amf_demo[29519]: exiting (caught term signal)
>>>> Feb 10 17:21:10 PM_PL-3 osafclmna[29321]: AL AMF Node Director is down,
>>>> terminate this process
>>>> 
>>>> Bt:
>>>> Core was generated by `/usr/local/lib/opensaf/osafamfnd --
>>>> tracemask=0xffffffff'.
>>>> Program terminated with signal 11, Segmentation fault.
>>>> #0  0x00000000004117c9 in avnd_comp_csi_assign_done(avnd_cb_tag*,
>>>> avnd_comp_tag*, avnd_comp_csi_rec*) ()
>>>> (gdb) bt
>>>> #0  0x00000000004117c9 in avnd_comp_csi_assign_done(avnd_cb_tag*,
>>>> avnd_comp_tag*, avnd_comp_csi_rec*) ()
>>>> #1  0x0000000000406a3b in
>>>> avnd_evt_ava_csi_quiescing_compl_evh(avnd_cb_tag*, avnd_evt_tag*) ()
>>>> #2  0x000000000042133f in avnd_main_process() () at main.cc:667
>>>> #3  0x0000000000405517 in main () at main.cc:186
>>>> 
>>>> 
>>>> Thanks
>>>> -Nagu
>>>> 
>>>>> -----Original Message-----
>>>>> From: Nagendra Kumar
>>>>> Sent: 09 February 2016 21:39
>>>>> To: minh chau; hans.nordeb...@ericsson.com; gary....@dektech.com.au;
>>>>> Praveen Malviya
>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
>>>>> support for cloud resilience [#1620] V2
>>>>> 
>>>>> 15. Same configuration as Test Case #12, SI lock. Keep gdb in both the SUs
>>>>> for CSI remove and set the timeout to 100 sec. Lock the SI and stop the
>>>>> controller. Start the controller and allow the CSI remove to time out.
>>>>> Two things:
>>>>>   SU2 has a Standby assignment (which is wrong); SU1 has no assignment.
>>>>>   Error at PL-4: SU-SI record addition failed
>>>>> 
>>>>> PM_SC-1:/home/nagu/views/staging # amf-state  siass
>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>       saAmfSISUHAState=ACTIVE(1)
>>>>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>       saAmfSISUHAState=ACTIVE(1)
>>>>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>> safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
>>>>>       saAmfSISUHAState=STANDBY(2)
>>>>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>       saAmfSISUHAState=ACTIVE(1)
>>>>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>       saAmfSISUHAState=ACTIVE(1)
>>>>>       saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
>>>>> 
>>>>> Syslog of PL-4:
>>>>> 
>>>>> Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO
>>>>> 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' component restart
>>>>> probation timer started (timeout: 60000000000 ns)
>>>>> Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO Restarting a component of
>>>>> 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' (comp restart count: 1)
>>>>> Feb  9 21:24:50 PM_PL-4 osafamfnd[7998]: NO
>>>>> 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
>>>>> faulted due to 'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo_script: killproc
>>>>> /opt/amf_demo/amf_demo failed
>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]:
>>>>> 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' started
>>>>> Feb  9 21:24:55 PM_PL-4 osafamfnd[7998]: NO Removed
>>>>> 'safSi=AmfDemo1,safApp=AmfDemo1' from
>>>>> 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: HC started with AMF
>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: Registered with AMF
>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: CSI Set - add
>>>>> 'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Standby
>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]:         name: abcdef, value: val1
>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]:         name: abcdef, value: val2
>>>>> Feb  9 21:24:55 PM_PL-4 osafamfnd[7998]: CR SU-SI record addition
>>>>> failed, SU= safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1 :
>>>>> SI=safSi=AmfDemo,safApp=AmfDemo1
>>>>> Feb  9 21:24:55 PM_PL-4 amf_demo[8200]: Health check 1
>>>>> Feb  9 21:25:50 PM_PL-4 osafamfnd[7998]: NO
>>>>> 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
>>>>> Component or SU restart probation timer expired
>>>>> 
>>>>> Thanks
>>>>> -Nagu
>>>>> 
>>>>> ------------------------------------------------------------------------------
>>>>> Site24x7 APM Insight: Get Deep Visibility into Application Performance
>>>>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
>>>>> Monitor end-to-end web transactions and take corrective actions now
>>>>> Troubleshoot faster and improve end-user experience. Signup Now!
>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
>>>>> _______________________________________________
>>>>> Opensaf-devel mailing list
>>>>> Opensaf-devel@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
>>>> 
>>> 
> 

