- **Type**: defect --> discussion
---
**[tickets:#300] amfnd crashed on payload node**
**Status:** unassigned
**Created:** Wed May 22, 2013 11:39 AM UTC by Nagendra Kumar
**Last Updated:** Wed May 22, 2013 11:41 AM UTC
**Owner:** nobody
Migrated from http://devel.opensaf.org/ticket/3127
Changeset : 4200
Transport : TCP/ipv6 ( link local )
patches : 2794
PBE enabled.
Model : 2n
configuration : 1SG,2SUs,4comps in each su, 4Sis with 1csi each.
SU1 is hosted on PL-3 and SU2 on PL-4
scenario:
perform shutdown of sg. Reject the quiescing state on active SU. Then unlock
the sg. Amfnd crashes on pl-3.
gdb output:
Program terminated with signal 6, Aborted.
#0 0x0000003c0be328a5 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install
opensaf-amf-nodedirector-4.3.RC1-1.el6.x86_64
(gdb) bt
#0 0x0000003c0be328a5 in raise () from /lib64/libc.so.6
#1 0x0000003c0be34085 in abort () from /lib64/libc.so.6
#2 0x0000003223e18fbb in osafassert_fail () from /usr/lib64/libopensaf_core.so.0
#3 0x00000000004317cf in ?? ()
#4 0x0000000000443c24 in ?? ()
/var/log/messages on pl-3:
Apr 22 14:15:32 OEL-64BIT-SLOT2 osafamfnd[2371]: avnd_di.c:572:
avnd_di_susi_resp_send: Assertion 'si' failed.
Apr 22 14:15:32 OEL-64BIT-SLOT2 python2.5: AL AMF Node Director is down,
terminate this process
Apr 22 14:15:32 OEL-64BIT-SLOT2 python2.5: AL AMF Node Director is down,
terminate this process
Apr 22 14:15:32 OEL-64BIT-SLOT2 python2.5: AL AMF Node Director is down,
terminate this process
Apr 22 14:15:32 OEL-64BIT-SLOT2 python2.5: AL AMF Node Director is down,
terminate this process
Apr 22 14:15:32 OEL-64BIT-SLOT2 osafamfwd[2467]: Rebooting OpenSAF NodeId = 0
EE Name = No EE Mapped, Reason: AMF unexpectedly crashed
Changed 4 weeks ago by nagendra
Could you upload the bt and bt full. The bt shared contains garbage.
Changed 4 weeks ago by surenderk (in reply to comment 2)
Replying to nagendra:
Find the backtrace below.
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6.x86_64
(gdb) bt
#0 0x0000003c0be328a5 in raise () from /lib64/libc.so.6
#1 0x0000003c0be34085 in abort () from /lib64/libc.so.6
#2 0x0000003223e18fbb in osafassert_fail (file=0x456e49 "avnd_di.c", line=572,
func=0x457470 "avnd_di_susi_resp_send", assertion=0x457125 "si") at
sysf_def.c:301
#3 0x00000000004317cf in avnd_di_susi_resp_send (cb=0x66b3e0, su=0xf92dc0,
si=0x0) at avnd_di.c:572
#4 0x0000000000443c24 in avnd_su_si_oper_done (cb=0x66b3e0, su=0xf92dc0,
si=0x0) at avnd_susm.c:944
#5 0x00000000004265d9 in avnd_comp_csi_assign_done (cb=0x66b3e0, comp=0xf95650,
csi=0xfa3e60)
at avnd_comp.c:1598
#6 0x000000000040ab18 in avnd_evt_ava_resp_evh (cb=0x66b3e0,
evt=0x7f86580025a0) at avnd_cbq.c:429
#7 0x000000000043e7e4 in avnd_evt_process (evt=0x7f86580025a0) at
avnd_proc.c:279
#8 0x000000000043e621 in avnd_main_process () at avnd_proc.c:220
#9 0x0000000000408b34 in main (argc=2, argv=0x7fff2329e238) at amfnd_main.c:53
(gdb) bt full
#0 0x0000003c0be328a5 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x0000003c0be34085 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x0000003223e18fbb in osafassert_fail (file=0x456e49 "avnd_di.c", line=572,
func=0x457470 "avnd_di_susi_resp_send", assertion=0x457125 "si") at
sysf_def.c:301
No locals.
#3 0x00000000004317cf in avnd_di_susi_resp_send (cb=0x66b3e0, su=0xf92dc0,
si=0x0) at avnd_di.c:572
curr_si = 0xf98400
msg = {type = AVND_MSG_AVD, info = {avd = 0xf93830, avnd = 0xf93830, ava =
0xf93830}}
rc = 1
FUNCTION = "avnd_di_susi_resp_send"
#4 0x0000000000443c24 in avnd_su_si_oper_done (cb=0x66b3e0, su=0xf92dc0,
si=0x0) at avnd_susm.c:944
curr_si = 0x0
curr_csi = 0x0
t_csi = 0x0
are_si_assigned = false
rc = 1
opr_done = true
FUNCTION = "avnd_su_si_oper_done"
#5 0x00000000004265d9 in avnd_comp_csi_assign_done (cb=0x66b3e0, comp=0xf95650,
csi=0xfa3e60)
at avnd_comp.c:1598
rank = 2
curr_csi = 0xfa3e60
rc = 1
csiname = 0xfa3e92 "safCsi=CSI4,safSi=SI4,safApp=test2nApp"
FUNCTION = "avnd_comp_csi_assign_done"
#6 0x000000000040ab18 in avnd_evt_ava_resp_evh (cb=0x66b3e0,
evt=0x7f86580025a0) at avnd_cbq.c:429
api_info = 0x7f8658002348
resp = 0x7f8658002358
comp = 0xf95650
cbk_rec = 0xfa2620
hc_rec = 0x0
csi = 0xfa3e60
err_info = {src = 589947440, rec_rcvr = {raw = 32767, avsv_ext = 32767, saf_amf
= 32767}}
rc = 1
msg_from_avnd = false
int_ext_comp = false
amf_rc = SA_AIS_OK
FUNCTION = "avnd_evt_ava_resp_evh"
#7 0x000000000043e7e4 in avnd_evt_process (evt=0x7f86580025a0) at
avnd_proc.c:279
cb = 0x66b3e0
rc = 1
FUNCTION = "avnd_evt_process"
#8 0x000000000043e621 in avnd_main_process () at avnd_proc.c:220
ret = 1
mbx_fd = {raise_obj = 10, rmv_obj = 11}
fds = {{fd = 11, events = 1, revents = 1}, {fd = 15, events = 1, revents = 0},
{fd = 13, events = 1,
revents = 0}, {fd = 601960803, events = 50, revents = 0}}
nfds = 3
evt = 0x7f86580025a0
FUNCTION = "avnd_main_process"
#9 0x0000000000408b34 in main (argc=2, argv=0x7fff2329e238) at amfnd_main.c:53
error = 0
Changed 4 weeks ago by hafe
Hm, yes, I added that assert. It sort of means that the SUSI action Assign
is only valid with an SI, because you can't do a new assign with no SIs.
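To make the intent of that assert concrete, here is a minimal sketch of the invariant as a plain check. All names (`susi_action_model`, `si_model`, `susi_resp_args_valid`) are hypothetical and are not the OpenSAF types or functions. It also assumes that modify and remove responses may legitimately carry no SI (meaning "all SIs"), which this ticket does not confirm.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the SUSI actions; not the real OpenSAF enum. */
enum susi_action_model { SUSI_ACT_ASSIGN, SUSI_ACT_MODIFY, SUSI_ACT_REMOVE };

/* Hypothetical stand-in for an SI record. */
struct si_model { const char *name; };

/* Returns true when the (action, si) pair is acceptable for a response.
 * A new assignment always needs a concrete SI. For the other actions a
 * NULL si is assumed to mean "all SIs of the SU". */
bool susi_resp_args_valid(enum susi_action_model act, const struct si_model *si)
{
    if (act == SUSI_ACT_ASSIGN)
        return si != NULL;
    return true;
}
```

In the backtrace above, `avnd_di_susi_resp_send` was entered with `si=0x0`, which corresponds to the first branch returning false in this model.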
Changed 4 weeks ago by nagendra
- owner changed from ravisekhar to nagendra
- status changed from new to accepted
Changed 4 weeks ago by nagendra
- patch_waiting changed from no to yes
Changed 4 weeks ago by nagendra
Analysis :
===================
During SG shutdown, Amfd sends a SUSI Quiescing message to Amfnd. At Amfnd, a
component faults and Amfnd sends an SU Out of Service (OOS) message to Amfd.
When Amfd receives the SU OOS (caused by the component fault) from Amfnd, it
sends a Quiesced assignment to Amfnd. Amfnd now has two SUSI messages to respond
to: first the Quiescing message and then the Quiesced message.
When Amfnd responds to both, Amfd sends SUSI remove to Amfnd twice.
The second SUSI remove message is dropped at Amfnd, but
AVND_SU_FLAG_ALL_SI still gets set because of the lines below in avnd_su_si_msg_prc:
/* If the request targets all SIs, set flag once early for all cases */
if (avsv_sa_name_is_null(&info->si_name))
m_AVND_SU_ALL_SI_SET(su);
When the SG unlock is then done, amfnd crashes because the flag is still set.
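The ordering problem described in this analysis can be sketched with a small model. Everything here is hypothetical: `su_model`, `susi_msg_model`, `SU_FLAG_ALL_SI` and both handlers are illustrative stand-ins, not the amfnd code. The model also assumes that a successfully processed remove clears the flag, and that a remove is redundant when the SU has no SIs left. The second handler shows one possible ordering, not the fix that was actually applied.

```c
#include <stdbool.h>
#include <stddef.h>

#define SU_FLAG_ALL_SI 0x1u      /* models AVND_SU_FLAG_ALL_SI */

struct su_model {
    unsigned flags;
    size_t   si_count;           /* SIs currently assigned to the SU */
};

struct susi_msg_model {
    bool si_name_is_null;        /* true when the request targets all SIs */
    bool is_remove;
};

/* Assumed completion step: remove all SIs and clear the all-SI flag. */
static void complete_remove(struct su_model *su)
{
    su->si_count = 0;
    su->flags &= ~SU_FLAG_ALL_SI;
}

/* Flag set early, as in the quoted lines: a message that is dropped
 * afterwards still leaves the flag set on the SU. */
bool handle_msg_flag_first(struct su_model *su, const struct susi_msg_model *msg)
{
    if (msg->si_name_is_null)
        su->flags |= SU_FLAG_ALL_SI;
    if (msg->is_remove && su->si_count == 0)
        return false;            /* redundant remove: dropped, flag stays set */
    if (msg->is_remove)
        complete_remove(su);
    return true;
}

/* Alternative ordering: decide whether to drop the message before
 * touching any SU state. */
bool handle_msg_check_first(struct su_model *su, const struct susi_msg_model *msg)
{
    if (msg->is_remove && su->si_count == 0)
        return false;            /* redundant remove: dropped, SU untouched */
    if (msg->si_name_is_null)
        su->flags |= SU_FLAG_ALL_SI;
    if (msg->is_remove)
        complete_remove(su);
    return true;
}
```

Feeding the same remove message twice to `handle_msg_flag_first` leaves `SU_FLAG_ALL_SI` set after the second (dropped) message, which is the stale state this analysis points to. With `handle_msg_check_first` the flag is clear after both calls.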
Changed 4 weeks ago by hafe (follow-up: comment 9)
See http://devel.opensaf.org/ticket/3026
I never got a response to that analysis. Same problem again. We need to decide
the action amfnd should take when it receives a redundant SUSI delete message.
Right now it is silently discarded.
Logging added in changeset 4177 (#3083) would clearly show this is happening.
That is why I moved the logs.
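As an illustration of making a redundant remove visible instead of discarding it silently, here is a small sketch. The names (`su_state_model`, `remove_result_model`, `handle_remove`) and the log text are hypothetical; this is not the logging added in changeset 4177. The caller-supplied buffer stands in for whatever logging facility is used.

```c
#include <stdio.h>
#include <stddef.h>

enum remove_result_model { REMOVE_DONE, REMOVE_REDUNDANT };

/* Hypothetical per-SU state; not the real amfnd SU record. */
struct su_state_model {
    const char *name;
    size_t      si_count;           /* SIs currently assigned */
    unsigned    redundant_removes;  /* how many redundant removes were seen */
};

/* Processes a "remove all SIs" request. A request arriving when nothing
 * is assigned is counted and reported in logbuf rather than dropped
 * without a trace. */
enum remove_result_model handle_remove(struct su_state_model *su,
                                       char *logbuf, size_t loglen)
{
    if (su->si_count == 0) {
        su->redundant_removes++;
        snprintf(logbuf, loglen,
                 "redundant SUSI remove for '%s' discarded (count=%u)",
                 su->name, su->redundant_removes);
        return REMOVE_REDUNDANT;
    }
    su->si_count = 0;
    if (loglen > 0)
        logbuf[0] = '\0';           /* nothing to report */
    return REMOVE_DONE;
}
```

Returning a distinct result code leaves the decision about the follow-up action (ignore, respond with an error, or escalate) to the caller, which is the open question raised in this comment.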
Changed 4 weeks ago by nagendra (in reply to comment 8)
Replying to hafe:
As of now, we are discarding the message.
Going forward, we need to collect data on these cases, analyse it, and define a
protocol between Amfd and Amfnd for handling them.