- **Milestone**: 4.7.2 --> 5.0.2


** [tickets:#2040] clmd seg faulted on active controller during switchover**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Fri Sep 16, 2016 11:30 AM UTC by Ritu Raj
**Last Updated:** Fri Sep 16, 2016 11:30 AM UTC
**Owner:** nobody

 (3.6 MB; application/x-bzip)
- [clmd_bt](https://sourceforge.net/p/opensaf/tickets/2040/attachment/clmd_bt) 
(3.2 kB; application/octet-stream)

# Environment details
OS : Suse 64bit
Changeset : 7997 ( 5.1.FC)
Setup : 4 nodes (2 controllers and 2 payloads, with headless feature disabled & 1PBE with 30K objects)

clmd seg faulted on active controller during controller switchover 

# Steps followed & observed behaviour
1. Invoked controller switchover (SC-1 is the Active).
2. During the role change, clmd crashed on SC-1 and the node went for reboot, as 
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown'.
3. After the Active controller went for reboot, NTFD crashed on the Standby 
controller and a cluster reset happened. A ticket has already been raised for 
the NTFD crash.

*Syslog :

Sep 16 15:33:30 sofo-s1 osafamfnd[2162]: NO 
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Sep 16 15:33:30 sofo-s1 osafamfnd[2162]: ER 
safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
Sep 16 15:33:30 sofo-s1 osafamfnd[2162]: Rebooting OpenSAF NodeId = 131343 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131343, SupervisionTime = 60

*Below is the bt:

0  0x00007f3db18e1b55 in raise () from /lib64/libc.so.6
1  0x00007f3db18e3131 in abort () from /lib64/libc.so.6
2  0x00007f3db191ec2f in __libc_message () from /lib64/libc.so.6
3  0x00007f3db1924358 in malloc_printerr () from /lib64/libc.so.6
4  0x00007f3db19292fc in free () from /lib64/libc.so.6
5  0x00007f3db223db52 in timer_delete@@GLIBC_2.3.3 () from /lib64/librt.so.1
6  0x00000000004055b3 in amf_quiesced_state_handler (cb=0x633820 <_clms_cb>, 
invocation=4288675847) at clms_amf.c:123
7  0x0000000000405795 in clms_amf_csi_set_callback (invocation=4288675847, 
compName=0x6bac88, new_haState=SA_AMF_HA_QUIESCED, csiDescriptor=...) at 
8  0x00007f3db332e1f1 in ava_hdl_cbk_rec_prc (info=0x6bac70, 
reg_cbk=0x7fff3e5fafe0) at ava_hdl.cc:645
9  0x00007f3db332d896 in ava_hdl_cbk_dispatch_all (cb=0x7fff3e5fb0b0, 
hdl_rec=0x7fff3e5fb0b8) at ava_hdl.cc:446
10 0x00007f3db332d376 in ava_hdl_cbk_dispatch (cb=0x7fff3e5fb0b0, 
hdl_rec=0x7fff3e5fb0b8, flags=SA_DISPATCH_ALL) at ava_hdl.cc:320
11 0x00007f3db3325a49 in AmfAgent::Dispatch (hdl=4285530114, 
flags=SA_DISPATCH_ALL) at amf_agent.cc:283
12 0x00007f3db332588e in saAmfDispatch (hdl=4285530114, flags=SA_DISPATCH_ALL) 
at amf_agent.cc:244
13 0x0000000000413966 in main (argc=2, argv=0x7fff3e5fb208) at clms_main.c:515
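
The backtrace shows glibc aborting inside free(), reached from timer_delete(), which is called from amf_quiesced_state_handler() at clms_amf.c:123. One possible cause, which is an assumption and not confirmed by this ticket, is that the handler deletes a timer id that was never created or was already deleted. Passing such an id to timer_delete() is undefined behaviour. A minimal sketch of a guarded delete follows. The names guarded_timer, guarded_timer_create and guarded_timer_delete are hypothetical and are not taken from the clmd source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Hypothetical wrapper (not from clmd): pairs a POSIX timer id with a
 * flag that records whether the id currently refers to a live timer. */
struct guarded_timer {
    timer_t id;
    bool armed; /* true only between a successful create and the delete */
};

/* Create a timer that sends no notification on expiry (SIGEV_NONE).
 * Returns 0 on success, -1 if timer_create() fails. */
static int guarded_timer_create(struct guarded_timer *t)
{
    struct sigevent sev;

    assert(t != NULL);
    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_NONE;
    if (timer_create(CLOCK_MONOTONIC, &sev, &t->id) != 0)
        return -1;
    t->armed = true;
    return 0;
}

/* Delete the timer at most once.
 * Returns 1 if a timer was deleted, 0 if there was nothing to delete,
 * -1 if timer_delete() reported an error. */
static int guarded_timer_delete(struct guarded_timer *t)
{
    assert(t != NULL);
    if (!t->armed)
        return 0;
    t->armed = false; /* clear first so a repeated call is a no-op */
    return timer_delete(t->id) == 0 ? 1 : -1;
}
```

With this pattern a second call to guarded_timer_delete() on the same object returns 0 instead of handing a stale id to timer_delete(). Whether this matches the actual defect in clmd would need to be checked against the attached trace and the code at clms_amf.c:123.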

1. Issue is random
2. Syslog, clmd trace and bt file attached 

