- **status**: review --> accepted
- **Comment**:
The same crash is observed in one more flow while testing #1863, when the SI
has more than one CSI. In the already reported case, the crash occurs when
su-failover escalation is triggered because the comp replies with ERROR_CODE
to the csi_remove callback. The published patch takes care of this.
In the new case, the component faults with su-failover recovery after
successfully responding to the quiesced callback. AMFND launches cleanup of the
components. Because the component faulted only after its successful reply,
AMFND had already responded to AMFD for the quiesced callback, so AMFD performs
fail-over of the SU and makes SU2 active. After SU2 becomes active, AMFD sends
the assignment removal to the AMFND of SU1. AMFND therefore receives the
removal of the assignment while su-failover escalation is still in progress.
Since none of the comps are registered, no callbacks are issued; AMFND tries to
respond to AMFD for completion of the assignment and crashes.
amfnd trace:
Jun 8 16:33:07.030112 osafamfnd [17092:sidb.cc:0876] << avnd_su_si_csi_rec_del
Jun 8 16:33:07.030127 osafamfnd [17092:sidb.cc:0816] << avnd_su_si_csi_del: 1
Jun 8 16:33:07.030142 osafamfnd [17092:sidb.cc:0737] T1 SU-SI record deleted, SU= safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1 : SI=safSi=AmfDemo,safApp=AmfDemo1
Jun 8 16:33:07.030159 osafamfnd [17092:sidb.cc:0785] << avnd_su_si_del: 1
Jun 8 16:33:07.030199 osafamfnd [17092:susm.cc:0255] >> avnd_su_siq_prc: SU 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Jun 8 16:33:07.030253 osafamfnd [17092:susm.cc:0260] << avnd_su_siq_prc
Jun 8 16:33:07.030270 osafamfnd [17092:susm.cc:1177] << avnd_su_si_oper_done: 1
Jun 8 16:33:07.030286 osafamfnd [17092:comp.cc:1822] << avnd_comp_csi_remove_done: 1
Jun 8 16:33:07.030301 osafamfnd [17092:comp.cc:1321] << avnd_comp_csi_remove: 1
Jun 8 16:33:07.030316 osafamfnd [17092:comp.cc:1678] >> all_csis_in_removed_state: 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Jun 8 16:33:07.030331 osafamfnd [17092:comp.cc:1691] << all_csis_in_removed_state: 1
Jun 8 16:33:07.030346 osafamfnd [17092:susm.cc:1021] >> avnd_su_si_oper_done: 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' '(null)'
Jun 8 16:33:07.030360 osafamfnd [17092:susm.cc:0845] >> susi_operation_in_progress: 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' '(null)'
Jun 8 16:33:07.030376 osafamfnd [17092:susm.cc:0890] << susi_operation_in_progress: 1
Jun 8 16:33:07.030391 osafamfnd [17092:err.cc:1595] >> is_no_assignment_due_to_escalations
Jun 8 16:33:07.030406 osafamfnd [17092:err.cc:1600] << is_no_assignment_due_to_escalations: true
I will publish a new version of the patch that also covers this case.
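
For illustration only, here is a minimal C++ sketch of the kind of guard this
flow seems to call for. The trace does not show the faulting statement, so the
sketch rests on an assumption: that the completion response is built for an
"all SIs" operation (SI printed as '(null)') after the SU-SI records have
already been deleted. The types `Su` and `SuSiRec` and the helper
`si_to_report` are hypothetical stand-ins, not the real amfnd structures or
the actual patch.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the real amfnd types.
struct SuSiRec {
  std::string si_name;
};

struct Su {
  std::string name;
  std::vector<SuSiRec> si_list;  // empty once every SU-SI record is deleted
};

// Picks the SI name to put into the completion response sent to the director.
// A null `si` means the operation applies to all SIs of the SU. Returns
// std::nullopt when there is nothing left to report, so the caller can skip
// the response instead of dereferencing a record that no longer exists.
std::optional<std::string> si_to_report(const Su& su, const SuSiRec* si) {
  if (si != nullptr) {
    return si->si_name;  // operation on one specific SI
  }
  if (su.si_list.empty()) {
    return std::nullopt;  // records already removed during escalation
  }
  return su.si_list.front().si_name;  // any remaining SI identifies the SU
}
```

A caller would check `has_value()` before building the message and simply
return when the result is empty.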
---
**[tickets:#1770] AMF : amfnd segfaulted during su failover escalation**
**Status:** accepted
**Milestone:** 4.7.2
**Created:** Tue Apr 19, 2016 06:53 AM UTC by Srikanth R
**Last Updated:** Wed May 04, 2016 06:55 PM UTC
**Owner:** Praveen
Setup :
5 node cluster with 3 payloads
changeset : 7438 ( opensaf 5.0.FC)
Application : 2N with 5 SUs ( si-si deps enabled & su failover flag enabled)
Issue :
AMFND hosting the faulty SU segfaulted during su Failover escalation as part
of SU lock operation
Steps performed :
-> Initially bring up the application and ensure that it is fully assigned.
-> Perform one fault operation on the SU hosting the active assignment, in such
a way that the next fault is escalated to su failover.
-> Perform a lock operation on the SU hosting the active assignment.
-> Do not respond to the CSI removal callback, so that this fault is escalated
to su failover.
-> AMFND segfaulted with the following bt file:
signal: 11 pid: 320 uid: 0
/usr/lib64/libopensaf_core.so.0(+0x1fd9d)[0x7f1d79294d9d]
/lib64/libpthread.so.0(+0xf7c0)[0x7f1d782b67c0]
/usr/lib64/opensaf/osafamfnd[0x43b1ff]
/usr/lib64/opensaf/osafamfnd[0x417f89]
/usr/lib64/opensaf/osafamfnd[0x408469]
/usr/lib64/opensaf/osafamfnd[0x42c65a]
/usr/lib64/opensaf/osafamfnd[0x42c4a0]
/usr/lib64/opensaf/osafamfnd[0x42b979]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7f1d77ac1c36]
/usr/lib64/opensaf/osafamfnd[0x405f29]
-> Below is the entry in osafamfnd trace :
Apr 19 11:23:44.684918 osafamfnd [29522:clc.cc:0870] T1 'safComp=COMP2SU5TWONAPP,safSu=SU5,safSg=SGONE,safApp=TWONAPP':FSM Enter presence state: 'SA_AMF_PRESENCE_TERMINATING(4)':FSM Exit presence state:SA_AMF_PRESENCE_TERMINATING(4)
Apr 19 11:23:44.684924 osafamfnd [29522:clc.cc:0889] << avnd_comp_clc_fsm_run: 1
Apr 19 11:23:44.684930 osafamfnd [29522:err.cc:1120] << avnd_err_su_repair: retval=1
Apr 19 11:23:44.684936 osafamfnd [29522:susm.cc:0255] >> avnd_su_siq_prc: SU 'safSu=SU5,safSg=SGONE,safApp=TWONAPP'
Apr 19 11:23:44.684942 osafamfnd [29522:susm.cc:0260] << avnd_su_siq_prc
Apr 19 11:23:44.684947 osafamfnd [29522:susm.cc:1176] << avnd_su_si_oper_done: 1
Apr 19 11:23:44.684953 osafamfnd [29522:comp.cc:1822] << avnd_comp_csi_remove_done: 1
Apr 19 11:23:44.684959 osafamfnd [29522:comp.cc:1321] << avnd_comp_csi_remove: 1
Apr 19 11:23:44.685055 osafamfnd [29522:comp.cc:1678] >> all_csis_in_removed_state: 'safSu=SU5,safSg=SGONE,safApp=TWONAPP'
Apr 19 11:23:44.685064 osafamfnd [29522:comp.cc:1691] << all_csis_in_removed_state: 1
Apr 19 11:23:44.685070 osafamfnd [29522:susm.cc:1021] >> avnd_su_si_oper_done: 'safSu=SU5,safSg=SGONE,safApp=TWONAPP' '(null)'
Apr 19 11:23:44.685076 osafamfnd [29522:susm.cc:0845] >> susi_operation_in_progress: 'safSu=SU5,safSg=SGONE,safApp=TWONAPP' '(null)'
Apr 19 11:23:44.685082 osafamfnd [29522:susm.cc:0890] << susi_operation_in_progress: 1
Apr 19 11:23:44.685096 osafamfnd [29522:err.cc:1586] >> is_no_assignment_due_to_escalations
Apr 19 11:23:44.685102 osafamfnd [29522:err.cc:1591] << is_no_assignment_due_to_escalations: true
Apr 19 11:24:51.153931 osafamfnd [2500:ncs_main_pub.c:0223] TR NCS:PROCESS_ID=2500