Hi Praveen,
Thank you so much for the info.
I have another identical setup running OpenSAF 4.2.2. Following your input, I
modified imm.xml as below:
<object class="SaAmfComp">
<dn>safComp=HMSComp_n11s4,safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4</dn>
<attr>
<name>saAmfCompType</name>
<value>safVersion=4.0.0,safCompType=hmsCompType_n11s4</value>
</attr>
<attr>
<name>saAmfCompInstantiationLevel</name>
<value>1</value>
</attr>
<attr>
<name>saAmfCompCleanupTimeout </name>
<value>10000000000</value>
</attr>
<attr>
<name>saAmfCompCmdEnv</name>
<value>AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2</value>
<value>AMF_DEMO_VAR3=COMP1_VALUE3</value>
<value>AMF_DEMO_VAR4=COMP1_VALUE4</value>
</attr>
</object>
But this didn’t work; the issue still exists.
Do I need to test this on version 4.4.2 only?
However, I observed that setting “OPENSAF_TERMTIMEOUT=1000” in nid.conf gives
me the expected result, but only once or twice.
Below are the logs captured during these attempts:
1st attempt:
= = = = =
Aug 30 17:22:14 localhost kernel: grsec: From 172.16.11.2: signal 11 sent to
/hegw/gsw/bin/hms[hms:31056] uid/euid:0/0 gid/egid:0/0, parent
/sbin/init[init:1] uid/euid:0/0 gid/egid:0/0 by /bin/bash[bash:29706]
uid/euid:0/0 gid/egid:0/0, parent /bin/login[login:29705] uid/euid:0/0
gid/egid:0/0
Aug 30 17:22:28 localhost osafamfnd[30931]:
'safComp=HMSComp_n11s4,safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
faulted due to 'healthCheckcallbackTimeout' : Recovery is 'componentRestart'
Aug 30 17:22:29 localhost AMF_DEMO: CMD=cleanup
Aug 30 17:22:29 localhost AMF_DEMO_VAR: AMF_DEMO_VAR4=COMP1_VALUE4
Aug 30 17:22:29 localhost AMF_DEMO_VAR: AMF_DEMO_VAR1=CT_VALUE1
Aug 30 17:22:29 localhost AMF_DEMO_VAR: AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2
Aug 30 17:22:29 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 30 17:22:29 localhost AMF_DEMO_VAR: AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2
Aug 30 17:22:29 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 30 17:22:32 localhost AMF_DEMO: CMD=instantiate
Aug 30 17:22:32 localhost AMF_DEMO_VAR: AMF_DEMO_VAR4=COMP1_VALUE4
Aug 30 17:22:32 localhost AMF_DEMO_VAR: AMF_DEMO_VAR1=CT_VALUE1
Aug 30 17:22:32 localhost AMF_DEMO_VAR: AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2
Aug 30 17:22:32 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 30 17:22:32 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 30 17:22:39 localhost osafamfnd[30931]:
'safComp=HMSComp_n11s4,safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
faulted due to 'healthCheckcallbackTimeout' : Recovery is 'componentRestart'
Aug 30 17:22:39 localhost AMF_DEMO: CMD=cleanup
Aug 30 17:22:39 localhost AMF_DEMO_VAR: AMF_DEMO_VAR4=COMP1_VALUE4
Aug 30 17:22:39 localhost AMF_DEMO_VAR: AMF_DEMO_VAR1=CT_VALUE1
Aug 30 17:22:39 localhost AMF_DEMO_VAR: AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2
Aug 30 17:22:39 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 30 17:22:39 localhost AMF_DEMO: CMD=instantiate
Aug 30 17:22:39 localhost AMF_DEMO_VAR: AMF_DEMO_VAR4=COMP1_VALUE4
Aug 30 17:22:39 localhost AMF_DEMO_VAR: AMF_DEMO_VAR1=CT_VALUE1
Aug 30 17:22:39 localhost AMF_DEMO_VAR: AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2
Aug 30 17:22:39 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 30 17:22:46 localhost osafamfnd[30931]:
'safComp=HMSComp_n11s4,safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
faulted due to 'healthCheckcallbackTimeout' : Recovery is 'componentRestart'
Aug 30 17:22:46 localhost AMF_DEMO: CMD=cleanup
Aug 30 17:22:46 localhost AMF_DEMO_VAR: AMF_DEMO_VAR4=COMP1_VALUE4
Aug 30 17:22:46 localhost AMF_DEMO_VAR: AMF_DEMO_VAR1=CT_VALUE1
Aug 30 17:22:46 localhost AMF_DEMO_VAR: AMF_DEMO_VAR1=CT_VALUE1
Aug 30 17:22:46 localhost AMF_DEMO_VAR: AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2
Aug 30 17:22:46 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 30 17:22:49 localhost AMF_DEMO: CMD=instantiate
Aug 30 17:22:49 localhost AMF_DEMO_VAR: AMF_DEMO_VAR4=COMP1_VALUE4
Aug 30 17:22:49 localhost AMF_DEMO_VAR: AMF_DEMO_VAR1=CT_VALUE1
Aug 30 17:22:49 localhost AMF_DEMO_VAR: AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2
Aug 30 17:22:49 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
2nd attempt:
= = = = = =
Aug 30 17:32:16 localhost osafamfnd[1726]:
'safComp=HMSComp_n11s4,safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
faulted due to 'healthCheckcallbackTimeout' : Recovery is 'componentRestart'
Aug 30 17:32:16 localhost AMF_DEMO: CMD=cleanup
Aug 30 17:32:16 localhost AMF_DEMO_VAR: AMF_DEMO_VAR4=COMP1_VALUE4
Aug 30 17:32:16 localhost AMF_DEMO_VAR: AMF_DEMO_VAR1=CT_VALUE1
Aug 30 17:32:16 localhost AMF_DEMO_VAR: AMF_DEMO_VAR2=COMP1_OVERLOAD_VALUE2
Aug 30 17:32:16 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 30 17:32:16 localhost AMF_DEMO_VAR: AMF_DEMO_VAR3=COMP1_VALUE3
Aug 30 17:32:26 localhost osafamfnd[1726]: Cleanup of
'safComp=HMSComp_n11s4,safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
failed
Aug 30 17:32:26 localhost osafamfnd[1726]: Reason:'Script did not exit within
time'
Aug 30 17:32:26 localhost osafamfnd[1726]: SU Failover trigerred for
'safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4': Failed component:
'safComp=HMSComp_n11s4,safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
Aug 30 17:32:26 localhost osafamfnd[1726]:
'safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4' Presence State
INSTANTIATED => TERMINATION_FAILED
Aug 30 17:32:26 localhost osafamfnd[1726]: Assigning
'safSi=HenbGw,safApp=HenbGwApp_PL_n11s4' QUIESCED to
'safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
Aug 30 17:32:26 localhost osafamfnd[1726]: Assigned
'safSi=HenbGw,safApp=HenbGwApp_PL_n11s4' QUIESCED to
'safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
Aug 30 17:32:26 localhost osafamfnd[1726]: Removing
'safSi=HenbGw,safApp=HenbGwApp_PL_n11s4' from
'safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
Aug 30 17:32:26 localhost osafamfnd[1726]: Removed
'safSi=HenbGw,safApp=HenbGwApp_PL_n11s4' from
'safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
Aug 30 17:32:26 localhost osafamfnd[1726]: Removed
'safSi=HenbGw,safApp=HenbGwApp_PL_n11s4' from
'safSu=SU-n11s4,safSg=HenbGw-SG,safApp=HenbGwApp_PL_n11s4'
Aug 30 17:36:37 localhost xinetd[1321]: START: shell pid=2307 from=172.16.11.2
Aug 30 17:36:37 localhost rshd[2308]: root@n11s2 as root:
cmd='/hegw/hgsm/sbin/computeOtherModuleUsage 4'
Aug 30 17:36:42 localhost xinetd[1321]: EXIT: shell status=0 pid=2307
duration=5(sec)
Aug 30 17:36:49 localhost xinetd[1321]: START: shell pid=2341 from=172.16.11.2
Aug 30 17:36:49 localhost rshd[2342]: root@n11s2 as root:
cmd='/hegw/hgsm/sbin/computeOtherProcsUsage 4'
Aug 30 17:36:49 localhost xinetd[1321]: EXIT: shell status=0 pid=2341
duration=0(sec)
My question is: why does changing the value of OPENSAF_TERMTIMEOUT not work
every time?
My expectation is that OpenSAF will try 'componentRestart' several times, as in
the 1st attempt log.
I need your help.
Thank you.
Regards
Dheeraj
>>
[Praveen] I guess the intention is to increase the timeout for the cleanup
script. This can be done by changing saAmfCompCleanupTimeout in the component
(an object of class "SaAmfComp") or by changing saAmfCtDefClcCliTimeout in the
comptype (class "SaAmfCompType") of the component. If changed in the comptype,
it applies to every component of that comptype, provided the component does not
override it by configuring saAmfCompCleanupTimeout.
Thanks
Praveen
>>
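
For reference, the comptype-level alternative Praveen mentions might look
roughly like the fragment below. This is only a sketch: the DN is taken from the
saAmfCompType value in the component above, the timeout value is illustrative,
and the other attributes a real SaAmfCompType object needs are omitted. AMF
timeouts are SaTimeT values in nanoseconds, so 10000000000 is 10 seconds, which
is the same interval seen between CMD=cleanup (17:32:16) and "Script did not
exit within time" (17:32:26) in the 2nd attempt log.

```xml
<!-- Sketch only: illustrative value, remaining comptype attributes omitted. -->
<object class="SaAmfCompType">
  <dn>safVersion=4.0.0,safCompType=hmsCompType_n11s4</dn>
  <attr>
    <!-- Default CLC-CLI timeout for components of this type.
         Unit is nanoseconds: 10000000000 = 10 s. -->
    <name>saAmfCtDefClcCliTimeout</name>
    <value>10000000000</value>
  </attr>
</object>
```

A component that sets saAmfCompCleanupTimeout itself would still use its own
value for cleanup, as described above.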