- **status**: unassigned --> assigned
- **assigned_to**: Praveen
- **Milestone**: future --> 4.3.3



---

**[tickets:#212] amf: amfnd crashed on active controller**

**Status:** assigned
**Milestone:** 4.3.3
**Created:** Wed May 15, 2013 07:12 AM UTC by Praveen
**Last Updated:** Wed May 15, 2013 07:12 AM UTC
**Owner:** Praveen

Migrated from http://devel.opensaf.org/ticket/3010.

changeset : 3969 with patches : 2986, 2884, 2865, 2977
 Model : 2N
 configuration : 1 SG, 5 SUs, 5 SIs; each SU has 3 comps, each SI has 3 CSIs
 csi-csi deps configured in SI1,SI5 as: CSI1<-CSI2<-CSI3 ( chain )
 si-si deps configured as SI1<-SI2<-SI3<-SI4
 SaAmfCSIAttribute is set for all the CSIs. 


GDB output:
(gdb) bt
 #0 0x00007fd460520b55 in raise () from /lib64/libc.so.6
 #1 0x00007fd460522131 in abort () from /lib64/libc.so.6
 #2 0x00007fd461525e44 in osafassert_fail (file=0x45bc7d "avnd_susm.c", 
line=915, 
func=0x45be60 "avnd_su_si_oper_done", assertion=0x45bd7a "0") at sysf_def.c:301
#3 0x0000000000444b1b in avnd_su_si_oper_done (cb=0x66bfe0, su=0x694890, 
si=0x699730) at avnd_susm.c:915
 #4 0x000000000042767f in avnd_comp_csi_remove_done (cb=0x66bfe0, 
comp=0x6a2ea0, csi=0x6afe50)
at avnd_comp.c:1713
#5 0x00000000004263ce in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a2ea0, 
csi=0x6afe50) at avnd_comp.c:1245
 #6 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, 
comp=0x6a4f00, csi=0x6a7ba0)
at avnd_comp.c:1710
#7 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a4f00, 
csi=0x6a7ba0) at avnd_comp.c:1292
 #8 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, 
comp=0x6a3a70, csi=0x6a7510)
 at avnd_comp.c:1710
 #9 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a3a70, 
csi=0x6a7510) at avnd_comp.c:1292
 #10 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, 
comp=0x6a2ea0, csi=0x6a6e80)
 at avnd_comp.c:1710
 #11 0x0000000000420db3 in avnd_comp_clc_terming_cleansucc_hdler (cb=0x66bfe0, 
comp=0x6a2ea0) at avnd_clc.c:2047
 #12 0x000000000041d78d in avnd_comp_clc_fsm_run (cb=0x66bfe0, comp=0x6a2ea0, 
ev=AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_SUCC) at avnd_clc.c:876
#13 0x000000000041c99b in avnd_evt_clc_resp_evh (cb=0x66bfe0, evt=0x694b20) at 
avnd_clc.c:446
 #14 0x000000000043f85a in avnd_evt_process (evt=0x694b20) at avnd_proc.c:279
 #15 0x000000000043f698 in avnd_main_process () at avnd_proc.c:220
 #16 0x0000000000408fb5 in main (argc=2, argv=0x7fffd92745d8) at amfnd_main.c:45
 (gdb) bt full
 #0 0x00007fd460520b55 in raise () from /lib64/libc.so.6
 No symbol table info available.
 #1 0x00007fd460522131 in abort () from /lib64/libc.so.6
 No symbol table info available.
 #2 0x00007fd461525e44 in osafassert_fail (file=0x45bc7d "avnd_susm.c", 
line=915, 
func=0x45be60 "avnd_su_si_oper_done", assertion=0x45bd7a "0") at sysf_def.c:301
 No locals.
 #3 0x0000000000444b1b in avnd_su_si_oper_done (cb=0x66bfe0, su=0x694890, 
si=0x699730) at avnd_susm.c:915
 curr_si = 0x699730
 curr_csi = 0x0
 t_csi = 0x0
 are_si_assigned = false
 rc = 1
 opr_done = true
 FUNCTION = "avnd_su_si_oper_done"
 #4 0x000000000042767f in avnd_comp_csi_remove_done (cb=0x66bfe0, 
comp=0x6a2ea0, csi=0x6afe50)
 at avnd_comp.c:1713
 curr_csi = 0x0
 rc = 1
 csiname = 0x6afe82 "safCsi=CSI4SI2,safSi=TWONSI2,safApp=TWONAPP"
 FUNCTION = "avnd_comp_csi_remove_done"
 #5 0x00000000004263ce in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a2ea0, 
csi=0x6afe50) at avnd_comp.c:1245
curr_csi = 0x0
 is_assigned = false
 rc = 1
 csiname = 0x6afe82 "safCsi=CSI4SI2,safSi=TWONSI2,safApp=TWONAPP"
 FUNCTION = "avnd_comp_csi_remove"
#6 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, comp=0x6a4f00, 
csi=0x6a7ba0)
at avnd_comp.c:1710
curr_csi = 0x6afe50
 rc = 1
 csiname = 0x6a7bd2 "safCsi=CSI3SI2,safSi=TWONSI2,safApp=TWONAPP"
 FUNCTION = "avnd_comp_csi_remove_done"
#7 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a4f00, 
csi=0x6a7ba0) at avnd_comp.c:1292
 curr_csi = 0x0
 is_assigned = false
 rc = 1
 csiname = 0x6a7bd2 "safCsi=CSI3SI2,safSi=TWONSI2,safApp=TWONAPP"
 FUNCTION = "avnd_comp_csi_remove"
 #8 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, 
comp=0x6a3a70, csi=0x6a7510)
 at avnd_comp.c:1710
 curr_csi = 0x6a7ba0
 rc = 1
 csiname = 0x6a7542 "safCsi=CSI2SI2,safSi=TWONSI2,safApp=TWONAPP"
 FUNCTION = "avnd_comp_csi_remove_done"
 #9 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a3a70, 
csi=0x6a7510) at avnd_comp.c:1292
 curr_csi = 0x0
 is_assigned = false
 rc = 1
 csiname = 0x6a7542 "safCsi=CSI2SI2,safSi=TWONSI2,safApp=TWONAPP"
 FUNCTION = "avnd_comp_csi_remove"
 #10 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, 
comp=0x6a2ea0, csi=0x6a6e80)
 at avnd_comp.c:1710
 curr_csi = 0x6a7510
 rc = 1
 csiname = 0x6a6eb2 "safCsi=CSI1SI2,safSi=TWONSI2,safApp=TWONAPP"
 FUNCTION = "avnd_comp_csi_remove_done"
 #11 0x0000000000420db3 in avnd_comp_clc_terming_cleansucc_hdler (cb=0x66bfe0, 
comp=0x6a2ea0) at avnd_clc.c:2047
 csi = 0x6a6e80
 csis_removed = 2
 rc = 1
 FUNCTION = "avnd_comp_clc_terming_cleansucc_hdler"
 #12 0x000000000041d78d in avnd_comp_clc_fsm_run (cb=0x66bfe0, comp=0x6a2ea0, 
ev=AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_SUCC) at avnd_clc.c:876
prv_st = SA_AMF_PRESENCE_TERMINATING
 final_st = 6958752
 rc = 1
 FUNCTION = "avnd_comp_clc_fsm_run"
#13 0x000000000041c99b in avnd_evt_clc_resp_evh (cb=0x66bfe0, evt=0x694b20) at 
avnd_clc.c:446
ev = AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_SUCC
 clc_evt = 0x694b40
 comp = 0x6a2ea0
 rc = 1
 amcmd = 0
 FUNCTION = "avnd_evt_clc_resp_evh"
#14 0x000000000043f85a in avnd_evt_process (evt=0x694b20) at avnd_proc.c:279
cb = 0x66bfe0
 rc = 1
 FUNCTION = "avnd_evt_process"
#15 0x000000000043f698 in avnd_main_process () at avnd_proc.c:220
ret = 1
 mbx_fd = {raise_obj = 11, rmv_obj = 12}
 fds = {{fd = 12, events = 1, revents = 1}, {fd = 16, events = 1, revents = 0}, 
{fd = 14, events = 1, 
revents = 0}, {fd = 1637552552, events = 1, revents = 0}}
nfds = 3
 evt = 0x694b20
 FUNCTION = "avnd_main_process"

#16 0x0000000000408fb5 in main (argc=2, argv=0x7fffd92745d8) at amfnd_main.c:45
error = 0
Logs are not available. I will provide them once the crash is reproduced.
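
The recursion in frames 4 to 10 is easier to follow when the `csiname` locals are listed in call order. The following is a minimal sketch, assuming the `bt full` text is available as a string. The helper name `csi_removal_chain` and the regular expressions are illustrative only, and the sample holds just the frame headers and `csiname` lines from the trace above, with wrapped lines joined.

```python
import re

# Frame headers and csiname locals for frames 4-10, copied from the
# `bt full` output above with the wrapped lines joined.
BT_FULL = '''\
#4 0x000000000042767f in avnd_comp_csi_remove_done (cb=0x66bfe0, comp=0x6a2ea0, csi=0x6afe50) at avnd_comp.c:1713
 csiname = 0x6afe82 "safCsi=CSI4SI2,safSi=TWONSI2,safApp=TWONAPP"
#5 0x00000000004263ce in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a2ea0, csi=0x6afe50) at avnd_comp.c:1245
 csiname = 0x6afe82 "safCsi=CSI4SI2,safSi=TWONSI2,safApp=TWONAPP"
#6 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, comp=0x6a4f00, csi=0x6a7ba0) at avnd_comp.c:1710
 csiname = 0x6a7bd2 "safCsi=CSI3SI2,safSi=TWONSI2,safApp=TWONAPP"
#7 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a4f00, csi=0x6a7ba0) at avnd_comp.c:1292
 csiname = 0x6a7bd2 "safCsi=CSI3SI2,safSi=TWONSI2,safApp=TWONAPP"
#8 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, comp=0x6a3a70, csi=0x6a7510) at avnd_comp.c:1710
 csiname = 0x6a7542 "safCsi=CSI2SI2,safSi=TWONSI2,safApp=TWONAPP"
#9 0x000000000042659e in avnd_comp_csi_remove (cb=0x66bfe0, comp=0x6a3a70, csi=0x6a7510) at avnd_comp.c:1292
 csiname = 0x6a7542 "safCsi=CSI2SI2,safSi=TWONSI2,safApp=TWONAPP"
#10 0x000000000042765b in avnd_comp_csi_remove_done (cb=0x66bfe0, comp=0x6a2ea0, csi=0x6a6e80) at avnd_comp.c:1710
 csiname = 0x6a6eb2 "safCsi=CSI1SI2,safSi=TWONSI2,safApp=TWONAPP"
'''

# Illustrative patterns: a gdb frame header carrying a comp= argument,
# and the csiname local that gdb prints beneath it.
FRAME_RE = re.compile(r'^\s*#(\d+)\s+0x[0-9a-f]+ in (\w+) \(.*\bcomp=(0x[0-9a-f]+)')
CSINAME_RE = re.compile(r'csiname = 0x[0-9a-f]+ "safCsi=([^,]+),safSi=([^,]+),')


def csi_removal_chain(bt_full):
    """Return (frame, function, comp, csi, si) tuples in call order."""
    chain = []
    frame = None
    for line in bt_full.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frame = (int(m.group(1)), m.group(2), m.group(3))
            continue
        m = CSINAME_RE.search(line)
        if m and frame is not None:
            chain.append((*frame, m.group(1), m.group(2)))
            frame = None
    # gdb prints the innermost frame first; reverse to get call order.
    return chain[::-1]


chain = csi_removal_chain(BT_FULL)
for frame_no, func, comp, csi, si in chain:
    print(f"#{frame_no:<2} {func:<26} comp={comp} {csi} ({si})")

# Every CSI in the recursion belongs to the same SI.
sis = {entry[4] for entry in chain}
```

Listed this way, the chain runs from CSI1SI2 to CSI4SI2, and every entry belongs to safSi=TWONSI2.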



Changed 3 months ago by praveenmalviya 
- priority changed from major to minor

 Changing the priority to minor because amfnd crashed during the shutdown sequence:
Feb 22 0:55:17.869499 osafamfnd [2997:avnd_term.c:0126] >> 
avnd_evt_last_step_term_evh
 Feb 22 0:55:17.869517 osafamfnd [2997:avnd_term.c:0130] NO Shutdown initiated
 Feb 22 0:55:17.869530 osafamfnd [2997:avnd_term.c:0164] NO Removing 
assignments from AMF components
 Feb 22 0:55:17.869540 osafamfnd [2997:avnd_susm.c:0678] >> avnd_su_si_remove: 
'safSu=SU3,safSg=SGONE,safApp=TWONAPP' 'safSi=TWONSI5,safApp=TWONAPP'
 Feb 22 0:55:17.869561 osafamfnd [2997:avnd_susm.c:0679] NO Removing 
'safSi=TWONSI5,safApp=TWONAPP' from 'safSu=SU3,safSg=SGONE,safApp=TWONAPP'
 Feb 22 0:55:17.869568 osafamfnd [2997:avnd_comp.c:1212] >> 
avnd_comp_csi_remove: comp: 'safComp=COMP1SU3TWONAPP,s
 
follow-up: ↓ 4   Changed 2 months ago by praveenmalviya 
Analysis (not very informative):
 This assert fires when the SI is in either the ASSIGNED or the REMOVED state.
SU3 has two SIs assigned to it. When shutdown started, avnd_su_si_remove() was
called only for safSi=TWONSI5 and not for safSi=TWONSI2. This suggests that
safSi=TWONSI2 was already in the REMOVED state, which is consistent with the
assert. However, CSI remove callbacks were later issued to the components for
safSi=TWONSI2, which contradicts the idea that the CSIs of this SI had already
been removed and that the SI was in the REMOVED state. Before this point, the
amfnd traces contain only assignment-related information.
 
If possible, can you please provide:
 1) Steps to reproduce.
 2) Configuration details.
 3) From frame 3, the value of:
 si->curr_assign_state
 
in reply to: ↑ 3   Changed 2 months ago by surenderk 
Replying to praveenmalviya:

I am not exactly sure which steps I performed. It was a shutdown of an SU or SI
(you can probably tell from the logs), with a component rejecting the quiescing
state.

(gdb) fr 3
 #3 0x0000000000444b1b in avnd_su_si_oper_done (cb=0x66bfe0, su=0x694890, 
si=0x699730) at avnd_susm.c:915
 915 avnd_susm.c: No such file or directory.
in avnd_susm.c
(gdb) p si->curr_assign_state
 $1 = AVND_SU_SI_ASSIGN_STATE_ASSIGNED
 (gdb)
 
 Changed 2 months ago by praveenmalviya 
Removal of the assignment for safSi=TWONSI5 was started first. When the comp
responded to the CSI remove callback, the CSIs of safSi=TWONSI2 were picked and
their removal started without moving the SI into the REMOVING state.
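
The sequence above can be sketched as a small state model. This is a simplified illustration under stated assumptions, not the amfnd code: the `AssignState`, `SI`, `start_removal` and `oper_done` names are hypothetical. The state names follow `AVND_SU_SI_ASSIGN_STATE_ASSIGNED` from the gdb session and the states named in the analysis, and it is assumed that avnd_su_si_remove() is the step that moves an SI to REMOVING.

```python
from enum import Enum, auto


class AssignState(Enum):
    ASSIGNING = auto()
    ASSIGNED = auto()
    REMOVING = auto()
    REMOVED = auto()


class SI:
    """Hypothetical stand-in for an SU-SI record."""

    def __init__(self, name, state):
        self.name = name
        self.state = state


def start_removal(si):
    # Assumed effect of avnd_su_si_remove(): the SI enters REMOVING
    # before any of its CSIs are removed.
    si.state = AssignState.REMOVING


def oper_done(si):
    # Modeled on the analysis: only ASSIGNING and REMOVING are valid when
    # the operation completes; any other state stands for the failed
    # assertion "0" reported in frame 3.
    if si.state is AssignState.ASSIGNING:
        si.state = AssignState.ASSIGNED
    elif si.state is AssignState.REMOVING:
        si.state = AssignState.REMOVED
    else:
        raise AssertionError(f"{si.name}: oper_done in state {si.state.name}")


si5 = SI("safSi=TWONSI5", AssignState.ASSIGNED)
si2 = SI("safSi=TWONSI2", AssignState.ASSIGNED)

# Normal path: removal is started for TWONSI5, then completes.
start_removal(si5)
oper_done(si5)

# Faulty path: the CSIs of TWONSI2 are removed and completion is reported
# while the SI is still ASSIGNED, because start_removal() was never called.
try:
    oper_done(si2)
    crashed = False
except AssertionError as exc:
    crashed = True
    print("assert:", exc)
```

In this model the second `oper_done` call fails with the SI still in ASSIGNED, which matches the `si->curr_assign_state` value printed from frame 3.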



