- **status**: review --> fixed
- **assigned_to**: Hans Feldt -->  nobody 
- **Milestone**: future --> 4.3.3
- **Comment**:

I think we can close this one now. I will open a new enhancement ticket (unless 
one already exists) to handle the case where IMM is read in the context of 
receiving REG_SU and other events.



---

**[tickets:#574] amfnd abort causing node reboot**

**Status:** fixed
**Milestone:** 4.3.3
**Created:** Tue Sep 24, 2013 08:33 AM UTC by Hans Feldt
**Last Updated:** Tue Apr 08, 2014 03:06 PM UTC
**Owner:** nobody

amfnd fails to read from IMM (comp capability) for an unknown reason, which 
causes an abort in immutil and a core dump. That in turn causes the AMF 
watchdog to reboot the node.
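
To make the failure mode concrete, here is a minimal sketch of the alternative to aborting: retry a bounded number of times and hand the error back to the caller. Every name in it (`sketch_rc_t`, `init_with_bounded_retry`, the stub functions) is hypothetical and not taken from immutil or amfnd; the value 5 for the timeout code matches the `rc = 5` / `SA_AIS_ERR_TIMEOUT` pair seen in the log and backtrace in this ticket.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the AIS return codes involved here. */
typedef enum {
    SKETCH_OK = 1,
    SKETCH_ERR_TIMEOUT = 5,   /* the "rc = 5" reported in the log */
    SKETCH_ERR_TRY_AGAIN = 6
} sketch_rc_t;

/* Hypothetical signature for one attempt at initializing an IMM handle. */
typedef sketch_rc_t (*init_attempt_fn)(void *ctx);

/* Retry only while the service asks us to try again, at most max_tries
 * attempts, and return the last result instead of calling abort(). */
sketch_rc_t init_with_bounded_retry(init_attempt_fn attempt, void *ctx,
                                    unsigned max_tries)
{
    sketch_rc_t rc = SKETCH_ERR_TRY_AGAIN;
    unsigned n;

    for (n = 0; n < max_tries; n++) {
        rc = attempt(ctx);
        if (rc != SKETCH_ERR_TRY_AGAIN)
            break;
    }
    return rc;
}

/* Caller side: a failed read becomes an error return, not a core dump. */
int read_comp_capability_or_fail(init_attempt_fn attempt, void *ctx)
{
    sketch_rc_t rc = init_with_bounded_retry(attempt, ctx, 5);

    if (rc != SKETCH_OK) {
        fprintf(stderr, "IMM init failed, rc = %d\n", (int)rc);
        return -1;
    }
    return 0;
}

/* Example stub: always times out, mimicking the failure in this ticket. */
sketch_rc_t always_timeout(void *ctx)
{
    (void)ctx;
    return SKETCH_ERR_TIMEOUT;
}

/* Example stub: asks for a retry twice, then succeeds on the third call. */
sketch_rc_t succeed_on_third(void *ctx)
{
    unsigned *calls = ctx;
    return (++*calls < 3) ? SKETCH_ERR_TRY_AGAIN : SKETCH_OK;
}
```

Whether the caller can do anything useful with the error in the middle of an assignment is a separate question; the sketch only shows that the process does not have to die for the error to be reported.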

This particular IMM read is in the critical switch-over logic, which runs when 
the application is already added and up, providing service. The read of comp 
capability can easily be avoided by including a little more information in an 
amfd-amfnd message.
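
A rough sketch of that idea, under stated assumptions: the director puts the component capability into the per-CSI part of the assignment message, and the node director checks it locally instead of reading IMM. The struct, enum and function names below are hypothetical and do not reflect the actual amfd-amfnd message layout. The check for "x_active or 1_active" is inferred only from the name `avnd_comp_cap_x_act_or_1_act_check` in the backtrace.

```c
#include <stdbool.h>

/* Hypothetical capability values for this sketch; the real enum comes
 * from the AMF headers. */
typedef enum {
    CAP_X_ACTIVE_AND_Y_STANDBY,
    CAP_X_ACTIVE_OR_Y_STANDBY,
    CAP_X_ACTIVE,
    CAP_1_ACTIVE
} sketch_comp_cap_t;

/* Hypothetical per-CSI payload of an amfd -> amfnd assignment message. */
typedef struct {
    const char *csi_name;
    sketch_comp_cap_t comp_cap;  /* filled in by the director, which
                                    already holds the IMM configuration */
} sketch_csi_assign_t;

/* Local replacement for the IMM lookup: true when the capability carried
 * in the message is x_active or 1_active. No IMM handle is needed. */
bool csi_cap_is_x_act_or_1_act(const sketch_csi_assign_t *csi)
{
    return csi->comp_cap == CAP_X_ACTIVE || csi->comp_cap == CAP_1_ACTIVE;
}
```

With the capability in the message, the assignment path has no dependency on IMM availability at switch-over time, which is the point made above.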

==================================================================================

2013-09-09 11:49:52  osafamfnd SC-2-1 notice osafamfnd[5336]: NO Assigning 'all (37) SIs' STANDBY to 'safSu=1,safSg=2N,safApp=SomeApp'
2013-09-09 11:49:52  osafamfnd SC-2-1 notice osafamfnd[5336]: NO Assigning 'safSi=CS,safApp=SomeApp' STANDBY to 'safSu=1,safSg=2N,safApp=SomeApp'
2013-09-09 11:50:02  osafamfnd SC-2-1 err osafamfnd[5336]: saImmOmInitialize FAILED, rc = 5
2013-09-09 11:50:04  osafrded SC-2-1 alert osafrded[5113]: AL AMF Node Director is down, terminate this process
2013-09-09 11:50:04  osaffmd SC-2-1 alert osaffmd[5122]: AL AMF Node Director is down, terminate this process
2013-09-09 11:50:04  osafimmnd SC-2-1 alert osafimmnd[5142]: AL AMF Node Director is down, terminate this process
2013-09-09 11:50:06  osafpmnd SC-2-1 alert osafpmnd[5405]: AL AMF Node Director is down, terminate this process
2013-09-09 11:50:04  osafpmd SC-2-1 alert osafpmd[5421]: AL AMF Node Director is down, terminate this process
2013-09-09 11:50:04  osafamfwd SC-2-1 crit osafamfwd[5463]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: AMF unexpectedly crashed, OwnNodeId = 131343, SupervisionTime = 60
2013-09-09 11:50:04  osafckptd SC-2-1 alert osafckptd[5520]: AL AMF Node Director is down, terminate this process

(gdb) bt full
#0 0x00007fab46742b35 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007fab46744111 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00000000004051f8 in defaultImmutilError (fmt=0x43fef0 "rc = %d")
at ../../../../../osaf/tools/safimm/src/immutil.c:72
ap = {{gp_offset = 16, fp_offset = 48, overflow_arg_area = 0x7fffa48dc300, 
reg_save_area = 0x7fffa48dc230}}
ap2 = {{gp_offset = 16, fp_offset = 48, overflow_arg_area = 0x7fffa48dc300,
reg_save_area = 0x7fffa48dc230}}
#3 0x00000000004065f4 in immutil_saImmOmInitialize (immHandle=0x7fffa48dc490, 
immCallbacks=0x0,
version=0x7fffa48dc4b0) at ../../../../../osaf/tools/safimm/src/immutil.c:1127
localVer = {releaseCode = 65 'A', majorVersion = 2 '\002', minorVersion = 12 
'\f'}
rc = SA_AIS_ERR_TIMEOUT
nTries = 6886170
#4 0x000000000041df97 in avnd_comp_cap_x_act_or_1_act_check 
(comp_type=0x69131a, csi_type=0x6f0142)
at avnd_comp.c:911
rc = <optimized out>
error = <optimized out>
dn = {length = 97,
value = 
"safSupportedCsType=safVersion=1.0.0\\,safCSType=X,safVersion=R2B,safCompType=X",
 '\000' <repeats 158 times>}
accessorHandle = 0
attributes = <optimized out>
comp_cap = <optimized out>
attributeNames = {0x443552 "ULL", 0x0}
immOmHandle = 0
immVersion = {releaseCode = 65 'A', majorVersion = 2 '\002', minorVersion = 1 
'\001'}
__FUNCTION__ = "avnd_comp_cap_x_act_or_1_act_check"
#5 0x000000000041e43b in avnd_comp_csi_assign (cb=0x6578c0, comp=0x6911e0, 
csi=0x0) at avnd_comp.c:1017
npi_prv_inst = <optimized out>
npi_curr_inst = <optimized out>
curr_csi = 0x6f0010
comp_ev = <optimized out>
rc = <optimized out>
csiname = 0x4434f1 "%u"
__FUNCTION__ = "avnd_comp_csi_assign"
#6 0x0000000000436d9c in assign_si_to_su (si=0x69ccc0, su=0x66f770, 
single_csi=0) at avnd_susm.c:561
npi_prv_inst = <optimized out>
npi_curr_inst = 6
su_ev = 4294967295
rc = 6933746
curr_csi = 0x6f0010
__FUNCTION__ = "assign_si_to_su"
#7 0x0000000000437219 in avnd_su_si_assign (cb=<optimized out>, su=0x66f770, 
si=0x69ccc0) at avnd_susm.c:606
rc = <optimized out>
rank = <optimized out>
curr_si = <optimized out>
curr_csi = <optimized out>
__FUNCTION__ = "avnd_su_si_assign"
#8 0x0000000000434b9d in avnd_su_si_msg_prc (cb=0x6578c0, su=0x66f770, 
info=<optimized out>) at avnd_susm.c:349
csi_param = 0x6f8df8
si = <optimized out>
rc = 1
csi = <optimized out>
__FUNCTION__ = "avnd_su_si_msg_prc"
#9 0x000000000043216e in avnd_evt_avd_info_su_si_assign_evh (cb=0x6578c0, 
evt=<optimized out>) at avnd_su.c:258
info = <optimized out>
siq = <optimized out>
su = 0x66f770
rc = <optimized out>
__FUNCTION__ = "avnd_evt_avd_info_su_si_assign_evh"
#10 0x0000000000430190 in avnd_main_process () at avnd_proc.c:218
ret = 0
mbx_fd = <optimized out>
fds = {{fd = 11, events = 1, revents = 1}, {fd = 15, events = 1, revents = 0}, 
{fd = 13, events = 1,
revents = 0}, {fd = 0, events = 0, revents = 0}}
evt = 0x6c5190
__FUNCTION__ = "avnd_main_process"
#11 0x0000000000408815 in main (argc=1, argv=0x7fffa48dc7a8) at amfnd_main.c:61
error = 32767
ret = <optimized out>



