The issue is always reproducible.
Test:
A campaign is modeled to include PL-5 and an SU on this node, using the
script '/usr/share/opensaf/immxml/immxml-modify-config'. During rollback, a
clm crash is observed. The campaign performs a lock/lock-in operation on PL-5
while the script immxml-modify-config simultaneously tries to perform an
admin lock. If the lines below are commented out in immxml-modify-config, the
rollback goes fine; if they are enabled, clm crashes:
PLMNODE=`cat $CURRENT_NODECFG | grep ".. $node " | awk '{ print $ 3 }'`
trace "PLMNODE: $PLMNODE"
cmd="amf-adm lock safNode=$PLMNODE,safCluster=myClmCluster"
The scripts and configuration are attached.
Attachment: scripts.tgz (4.9 kB; application/x-compressed-tar)
---
** [tickets:#227] clmd asserts on active controller during node lock timeout**
**Status:** unassigned
**Created:** Wed May 15, 2013 10:23 AM UTC by Mathi Naickan
**Last Updated:** Fri Jun 28, 2013 10:45 AM UTC
**Owner:** Mathi Naickan
I have asked for traces from the submitter.
changeset : 4007 with patch 2865
scenario:
========
Trying to do lock/lock-in of PL-5.
amf-adm lock safNode=PL-5,safCluster=myClmCluster
error - saImmOmAdminOperationInvoke_2 FAILED: SA_AIS_ERR_TIMEOUT (5)
error: failed to eval/store amf-adm lock safNode=PL-5,safCluster=myClmCluster
failed. Aborting script! exitCode: 1
#0 0x00007fb446240b55 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007fb446240b55 in raise () from /lib64/libc.so.6
#1 0x00007fb446242131 in abort () from /lib64/libc.so.6
#2 0x00007fb447881e44 in osafassert_fail (file=0x420380 "clms_evt.c", line=390,
func=0x420680 "proc_node_lock_tmr_exp_msg", assertion=0x42069b "op_node !=
NULL") at sysf_def.c:301
#3 0x000000000040954a in proc_node_lock_tmr_exp_msg (evt=0x655290) at
clms_evt.c:390
#4 0x000000000040bc42 in clms_process_mbx (mbx=0x6298a0) at clms_evt.c:1272
#5 0x0000000000412b3b in main (argc=1, argv=0x7fff3162cb28) at clms_main.c:455
(gdb) bt full
#0 0x00007fb446240b55 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007fb446242131 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007fb447881e44 in osafassert_fail (file=0x420380 "clms_evt.c", line=390,
func=0x420680 "proc_node_lock_tmr_exp_msg", assertion=0x42069b "op_node !=
NULL") at sysf_def.c:301
No locals.
#3 0x000000000040954a in proc_node_lock_tmr_exp_msg (evt=0x655290) at
clms_evt.c:390
rc = 1
node_id = 132367
op_node = 0x0
FUNCTION = "proc_node_lock_tmr_exp_msg"
#4 0x000000000040bc42 in clms_process_mbx (mbx=0x6298a0) at clms_evt.c:1272
msg = 0x655290
FUNCTION = "clms_process_mbx"
#5 0x0000000000412b3b in main (argc=1, argv=0x7fff3162cb28) at clms_main.c:455
ret = 1
mbx_fd = {raise_obj = 11, rmv_obj = 12}
error = SA_AIS_OK
rc = 1
FUNCTION = "main"
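
The backtrace above shows the lock-timer expiry handler asserting because the
node lookup for node_id = 132367 returned op_node = 0x0. As a rough
illustration only, the sketch below shows one way a timer-expiry handler can
treat a failed lookup as a stale event instead of aborting. Everything in it
(node_t, node_table_t, node_lookup, on_lock_timer_expired) is hypothetical and
is not taken from clms_evt.c; it is not the OpenSAF fix, and whether ignoring
the event is appropriate in clmd is an open question for the maintainers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; NOT the clmd data structures. */
typedef struct {
    uint32_t node_id;
    int lock_timer_running;   /* 1 while a lock timer is armed for the node */
    int admin_op_pending;     /* 1 while an admin lock is in progress */
} node_t;

typedef struct {
    node_t *nodes;
    size_t count;
} node_table_t;

/* Linear lookup by node id; returns NULL if the node is not in the table. */
static node_t *node_lookup(const node_table_t *tbl, uint32_t node_id)
{
    assert(tbl != NULL);      /* programming error, not a runtime condition */
    for (size_t i = 0; i < tbl->count; i++) {
        if (tbl->nodes[i].node_id == node_id)
            return &tbl->nodes[i];
    }
    return NULL;
}

/*
 * Handle a lock-timer expiry for node_id.
 * Returns 0 if the timeout was processed, -1 if the event referred to a
 * node that could not be found and was therefore dropped.
 */
static int on_lock_timer_expired(node_table_t *tbl, uint32_t node_id)
{
    node_t *op_node = node_lookup(tbl, node_id);

    if (op_node == NULL) {
        /* Log and drop the event rather than asserting. */
        fprintf(stderr, "lock timer expired for unknown node id %u, ignoring\n",
                (unsigned)node_id);
        return -1;
    }

    /* Normal path: clear the timer and the pending admin operation. */
    op_node->lock_timer_running = 0;
    op_node->admin_op_pending = 0;
    return 0;
}
```

With a table that contains only node 131343, calling
on_lock_timer_expired(&tbl, 132367) returns -1 and leaves the table untouched,
whereas the same call for 131343 clears that node's timer state and returns 0.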
syslog on sc-1:
==============
Mar 13 12:27:23 SLES1 osafclmd[6575]: clms_evt.c:390:
proc_node_lock_tmr_exp_msg: Assertion 'op_node != NULL' failed.
Mar 13 12:27:23 SLES1 osafamfnd[6604]: NO
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' :
Recovery is 'nodeFailfast'
Mar 13 12:27:23 SLES1 osafamfnd[6604]: ER
safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery
is:nodeFailfast
Mar 13 12:27:23 SLES1 osafamfnd[6604]: Rebooting OpenSAF NodeId = 131343 EE
Name = , Reason: Component faulted: recovery is node failfast
Mar 13 12:27:23 SLES1 opensaf_reboot: Rebooting local node