- **status**: unassigned --> assigned
- **assigned_to**: Praveen
- **Milestone**: future --> 4.3.3
---
** [tickets:#212] amf: amfnd crashed on active controller**
**Status:** assigned
**Milestone:** 4.3.3
**Created:** Wed May 15, 2013 07:12 AM UTC by Praveen
**Last Updated:** Wed May 15, 2013 07:12 AM UTC
**Owner:** Praveen
Migrated from http://devel.opensaf.org/ticket/3010.
changeset : 3969 with patches : 2986, 2884, 2865, 2977
Model : 2N
configuration : 1 SG, 5 SUs, 5 SIs; each SU has 3 comps; 3 CSIs in each SI
csi-csi deps configured in SI1, SI5 as: CSI1<-CSI2<-CSI3 (chain)
si-si deps configured as SI1<-SI2<-SI3<-SI4
SaAmfCSIAttribute is set for all the CSIs.
GDB output:
(gdb) bt
#0 0x00007fd460520b55 in raise () from /lib64/libc.so.6
#1 0x00007fd460522131 in abort () from /lib64/libc.so.6
#2 0x00007fd461525e44 in osafassert_fail (file=0x45bc7d "avnd_susm.c",
line=915,
func=0x45be60 "avnd_su_si_oper_done", assertion=0x45bd7a "0") at sysf_def.c:301
#3 0x0000000000444b1b in avnd_su_si_oper_done (cb=0x66bfe0, su=0x694890,
si=0x699730) at avnd_susm.c:915
#4 0x000000000042767f in avnd_comp_csi_remove_done (cb=0x66bfe0,
comp=0x6a2ea0, csi=0x6afe50)
at avnd_comp.c:1713
#5 0x00000000004263ce in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a2ea0,
csi=0x6afe50) at avnd_comp.c:1245
#6 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0,
comp=0x6a4f00, csi=0x6a7ba0)
at avnd_comp.c:1710
#7 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a4f00,
csi=0x6a7ba0) at avnd_comp.c:1292
#8 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0,
comp=0x6a3a70, csi=0x6a7510)
at avnd_comp.c:1710
#9 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a3a70,
csi=0x6a7510) at avnd_comp.c:1292
#10 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0,
comp=0x6a2ea0, csi=0x6a6e80)
at avnd_comp.c:1710
#11 0x0000000000420db3 in avnd_comp_clc_terming_cleansucc_hdler (cb=0x66bfe0,
comp=0x6a2ea0) at avnd_clc.c:2047
#12 0x000000000041d78d in avnd_comp_clc_fsm_run (cb=0x66bfe0, comp=0x6a2ea0,
ev=AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_SUCC) at avnd_clc.c:876
#13 0x000000000041c99b in avnd_evt_clc_resp_evh (cb=0x66bfe0, evt=0x694b20) at
avnd_clc.c:446
#14 0x000000000043f85a in avnd_evt_process (evt=0x694b20) at avnd_proc.c:279
#15 0x000000000043f698 in avnd_main_process () at avnd_proc.c:220
#16 0x0000000000408fb5 in main (argc=2, argv=0x7fffd92745d8) at amfnd_main.c:45
(gdb) bt full
#0 0x00007fd460520b55 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007fd460522131 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007fd461525e44 in osafassert_fail (file=0x45bc7d "avnd_susm.c",
line=915,
func=0x45be60 "avnd_su_si_oper_done", assertion=0x45bd7a "0") at sysf_def.c:301
No locals.
#3 0x0000000000444b1b in avnd_su_si_oper_done (cb=0x66bfe0, su=0x694890,
si=0x699730) at avnd_susm.c:915
curr_si = 0x699730
curr_csi = 0x0
t_csi = 0x0
are_si_assigned = false
rc = 1
opr_done = true
FUNCTION = "avnd_su_si_oper_done"
#4 0x000000000042767f in avnd_comp_csi_remove_done (cb=0x66bfe0,
comp=0x6a2ea0, csi=0x6afe50)
at avnd_comp.c:1713
curr_csi = 0x0
rc = 1
csiname = 0x6afe82 "safCsi=CSI4SI2,safSi=TWONSI2,safApp=TWONAPP"
FUNCTION = "avnd_comp_csi_remove_done"
#5 0x00000000004263ce in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a2ea0,
csi=0x6afe50) at avnd_comp.c:1245
curr_csi = 0x0
is_assigned = false
rc = 1
csiname = 0x6afe82 "safCsi=CSI4SI2,safSi=TWONSI2,safApp=TWONAPP"
FUNCTION = "avnd_comp_csi_remove"
#6 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, comp=0x6a4f00,
csi=0x6a7ba0)
at avnd_comp.c:1710
curr_csi = 0x6afe50
rc = 1
csiname = 0x6a7bd2 "safCsi=CSI3SI2,safSi=TWONSI2,safApp=TWONAPP"
FUNCTION = "avnd_comp_csi_remove_done"
#7 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a4f00,
csi=0x6a7ba0) at avnd_comp.c:1292
curr_csi = 0x0
is_assigned = false
rc = 1
csiname = 0x6a7bd2 "safCsi=CSI3SI2,safSi=TWONSI2,safApp=TWONAPP"
FUNCTION = "avnd_comp_csi_remove"
#8 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0,
comp=0x6a3a70, csi=0x6a7510)
at avnd_comp.c:1710
curr_csi = 0x6a7ba0
rc = 1
csiname = 0x6a7542 "safCsi=CSI2SI2,safSi=TWONSI2,safApp=TWONAPP"
FUNCTION = "avnd_comp_csi_remove_done"
#9 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a3a70,
csi=0x6a7510) at avnd_comp.c:1292
curr_csi = 0x0
is_assigned = false
rc = 1
csiname = 0x6a7542 "safCsi=CSI2SI2,safSi=TWONSI2,safApp=TWONAPP"
FUNCTION = "avnd_comp_csi_remove"
#10 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0,
comp=0x6a2ea0, csi=0x6a6e80)
at avnd_comp.c:1710
curr_csi = 0x6a7510
rc = 1
csiname = 0x6a6eb2 "safCsi=CSI1SI2,safSi=TWONSI2,safApp=TWONAPP"
FUNCTION = "avnd_comp_csi_remove_done"
#11 0x0000000000420db3 in avnd_comp_clc_terming_cleansucc_hdler (cb=0x66bfe0,
comp=0x6a2ea0) at avnd_clc.c:2047
csi = 0x6a6e80
csis_removed = 2
rc = 1
FUNCTION = "avnd_comp_clc_terming_cleansucc_hdler"
#12 0x000000000041d78d in avnd_comp_clc_fsm_run (cb=0x66bfe0, comp=0x6a2ea0,
ev=AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_SUCC) at avnd_clc.c:876
prv_st = SA_AMF_PRESENCE_TERMINATING
final_st = 6958752
rc = 1
FUNCTION = "avnd_comp_clc_fsm_run"
#13 0x000000000041c99b in avnd_evt_clc_resp_evh (cb=0x66bfe0, evt=0x694b20) at
avnd_clc.c:446
ev = AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_SUCC
clc_evt = 0x694b40
comp = 0x6a2ea0
rc = 1
amcmd = 0
FUNCTION = "avnd_evt_clc_resp_evh"
#14 0x000000000043f85a in avnd_evt_process (evt=0x694b20) at avnd_proc.c:279
cb = 0x66bfe0
rc = 1
FUNCTION = "avnd_evt_process"
#15 0x000000000043f698 in avnd_main_process () at avnd_proc.c:220
ret = 1
mbx_fd = {raise_obj = 11, rmv_obj = 12}
fds = {{fd = 12, events = 1, revents = 1}, {fd = 16, events = 1, revents = 0},
{fd = 14, events = 1,
revents = 0}, {fd = 1637552552, events = 1, revents = 0}}
nfds = 3
evt = 0x694b20
FUNCTION = "avnd_main_process"
#16 0x0000000000408fb5 in main (argc=2, argv=0x7fffd92745d8) at amfnd_main.c:45
error = 0
Logs are not available. Will provide once it is reproducible.
Changed 3 months ago by praveenmalviya
priority changed from major to minor
Changing it to minor because amfnd crashed during the shutdown sequence:
Feb 22 0:55:17.869499 osafamfnd [2997:avnd_term.c:0126] >>
avnd_evt_last_step_term_evh
Feb 22 0:55:17.869517 osafamfnd [2997:avnd_term.c:0130] NO Shutdown initiated
Feb 22 0:55:17.869530 osafamfnd [2997:avnd_term.c:0164] NO Removing
assignments from AMF components
Feb 22 0:55:17.869540 osafamfnd [2997:avnd_susm.c:0678] >> avnd_su_si_remove:
'safSu=SU3,safSg=SGONE,safApp=TWONAPP' 'safSi=TWONSI5,safApp=TWONAPP'
Feb 22 0:55:17.869561 osafamfnd [2997:avnd_susm.c:0679] NO Removing
'safSi=TWONSI5,safApp=TWONAPP' from 'safSu=SU3,safSg=SGONE,safApp=TWONAPP'
Feb 22 0:55:17.869568 osafamfnd [2997:avnd_comp.c:1212] >>
avnd_comp_csi_remove: comp: 'safComp=COMP1SU3TWONAPP,s
Changed 2 months ago by praveenmalviya
Analysis (Not very informative):
This assert fires when the SI is in either the ASSIGNED or the REMOVED state.
SU3 has two SIs assigned to it. When shutdown started, avnd_su_si_remove() was
called only for safSi=TWONSI5 and not for safSi=TWONSI2. This means
safSi=TWONSI2 was already in the REMOVED state, which is consistent with the
assert. However, CSI remove callbacks were later issued to the components for
safSi=TWONSI2, which contradicts the idea that the CSIs of this SI had already
been removed and that it was in the REMOVED state. Before this point, the amfnd
traces contain only assignment-related information.
If possible, can you please provide:
1)Steps to reproduce.
2)Configuration details.
3)From fr 3, please provide value of:
si->curr_assign_state
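The state check described in the analysis above can be modelled roughly as follows. This is a minimal sketch, not the code in avnd_susm.c: the enum and the function `si_oper_done` are hypothetical names invented for illustration, and the function returns `false` at the point where the real code would hit the assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the SI assignment states
 * discussed in this ticket (not the real amfnd definitions). */
typedef enum {
    SI_STATE_UNASSIGNED,
    SI_STATE_ASSIGNING,
    SI_STATE_ASSIGNED,
    SI_STATE_REMOVING,
    SI_STATE_REMOVED
} si_state_t;

/* Model of an "operation done" handler. Completion is only meaningful
 * while an operation is in progress (ASSIGNING or REMOVING). For any
 * other state, including ASSIGNED and REMOVED, it returns false; this
 * is the branch where the real code would assert. */
bool si_oper_done(si_state_t *state)
{
    assert(state != NULL);

    switch (*state) {
    case SI_STATE_ASSIGNING:
        *state = SI_STATE_ASSIGNED;
        return true;
    case SI_STATE_REMOVING:
        *state = SI_STATE_REMOVED;
        return true;
    default:
        /* No operation in progress: completion is unexpected. */
        return false;
    }
}
```

In this model, calling `si_oper_done` on an SI that is still ASSIGNED is rejected and leaves the state unchanged, which is why the value of si->curr_assign_state in frame 3 is useful for narrowing down the cause.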
Changed 2 months ago by surenderk
Replying to praveenmalviya:
I am not exactly sure what steps I performed. It was a shutdown of an SU or SI
(you can probably see this in the logs), with a component rejecting the
quiescing state.
(gdb) fr 3
#3 0x0000000000444b1b in avnd_su_si_oper_done (cb=0x66bfe0, su=0x694890,
si=0x699730) at avnd_susm.c:915
915 avnd_susm.c: No such file or directory.
in avnd_susm.c
(gdb) p si->curr_assign_state
$1 = AVND_SU_SI_ASSIGN_STATE_ASSIGNED
(gdb)
Changed 2 months ago by praveenmalviya
Removal of the assignment for safSi=TWONSI5 was started first. When the comp
responded to the CSI remove callback, the CSIs of safSi=TWONSI2 were picked up
and their removal was started without moving the SI into the REMOVING state.
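The sequence described above can be illustrated with a small model. This is only a sketch under stated assumptions: the types `si_t` and `csi_t`, the functions `csi_remove_done` and `remove_chain`, and the `guard` flag are all hypothetical and do not come from the amfnd sources. The guard shows one conceivable way to avoid the inconsistency, not necessarily the fix that was applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of SI states for this model. */
typedef enum { SI_ASSIGNED, SI_REMOVING, SI_REMOVED } si_state_t;

typedef struct {
    const char *name;   /* e.g. "SI5" */
    si_state_t  state;
    int         csis_left;
} si_t;

typedef struct csi {
    si_t       *si;     /* owning SI */
    struct csi *next;   /* next CSI picked up for removal */
    bool        removed;
} csi_t;

/* Complete the removal of one CSI. Returns false if the owning SI was
 * never moved to REMOVING, which models the crash condition. */
bool csi_remove_done(csi_t *csi)
{
    assert(csi != NULL && csi->si != NULL);

    if (csi->si->state != SI_REMOVING)
        return false;

    csi->removed = true;
    if (--csi->si->csis_left == 0)
        csi->si->state = SI_REMOVED;
    return true;
}

/* Walk the chain of CSIs. With guard set, an SI that is still ASSIGNED
 * is moved to REMOVING before any of its CSIs are removed. */
bool remove_chain(csi_t *head, bool guard)
{
    for (csi_t *c = head; c != NULL; c = c->next) {
        if (guard && c->si->state == SI_ASSIGNED)
            c->si->state = SI_REMOVING;
        if (!csi_remove_done(c))
            return false;
    }
    return true;
}
```

Without the guard, a chain that starts in an SI being removed and continues into a CSI of an SI that is still ASSIGNED fails on the second SI, mirroring the backtrace where the assert is reached through nested avnd_comp_csi_remove_done() calls.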
---