Thanks for the information. I think the root cause of this issue is that amfnd lets
two component life cycle sequences run in parallel.
. The first error is the component registration timeout:
Oct 23 12:47:26.169639 osafamfnd [421:clc.cc:0500] NO Reason: component
registration timer expired
Oct 23 12:47:26.169657 osafamfnd [421:clc.cc:1573] >>
avnd_comp_clc_xxxing_instfail_hdler:
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon': Instantiate
fail event in Instantiating/Restarting State
Oct 23 12:47:26.169691 osafamfnd [421:clc.cc:2942] T1 CLC CLI command
arguments[1] ='cleanup'
. The second error is a component crash detected by ava_down:
Oct 23 12:47:29.177223 osafamfnd [421:err.cc:0407] NO
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' faulted due to
'avaDown' : Recovery is 'componentRestart'
Oct 23 12:47:29.177508 osafamfnd [421:clc.cc:1573] >>
avnd_comp_clc_xxxing_instfail_hdler:
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon': Instantiate
fail event in Instantiating/Restarting State
Oct 23 12:47:29.177548 osafamfnd [421:clc.cc:2942] T1 CLC CLI command
arguments[1] ='cleanup'
The first cleanup succeeds, and amfnd starts the clc instantiate:
Oct 23 12:47:29.181102 osafamfnd [421:clc.cc:1702] >>
avnd_comp_clc_xxxing_cleansucc_hdler:
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon': Cleanup
success event in the instantiating/restarting state
Oct 23 12:47:29.181127 osafamfnd [421:clc.cc:2942] T1 CLC CLI command
arguments[1] ='instantiate'
The problem comes up when amfnd is waiting for the clc instantiate to return and
the response of the second clc cleanup arrives. The second clc cleanup was
redundant from the start, because the first clc cleanup was already in progress.
Based on this, I created a patch for this ticket (attached). The patch avoids
running the same clc command twice in a row if amfnd has not yet received the
clc_resp (which indicates timeout/error) for the earlier one. I am still testing it.
@Praveen: does the solution in this patch sound OK to you?
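To illustrate the idea (this is not the content of 1557.patch), here is a minimal sketch of such a guard. All names in it (`ClcCmd`, `CompSketch`, `clc_cmd_launch`, `clc_resp_received`) are hypothetical and are not taken from amfnd; the sketch only assumes that each component can remember which clc command is still waiting for its response.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified model of a component; not amfnd's real data types.
enum class ClcCmd { None, Instantiate, Cleanup };

struct CompSketch {
  std::string name;
  ClcCmd pending = ClcCmd::None;  // clc command whose response is outstanding
};

// Launch a clc command unless the same command is already in progress.
// Returns true if the command would be started, false if it is suppressed.
bool clc_cmd_launch(CompSketch& comp, ClcCmd cmd) {
  if (comp.pending == cmd) {
    return false;  // duplicate request: the earlier one has not responded yet
  }
  comp.pending = cmd;
  return true;
}

// Record the response (success, error or timeout) of a clc command.
void clc_resp_received(CompSketch& comp, ClcCmd cmd) {
  if (comp.pending == cmd) {
    comp.pending = ClcCmd::None;
  }
}
```

With this guard, the cleanup requested by the avaDown handling at 12:47:29.177 would be dropped, because the cleanup started at 12:47:26.169 has not returned yet, so only one cleanup success event would be delivered.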
Attachments:
- [1557.patch](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/680d40d6/8dc7/attachment/1557.patch) (1.5 kB; application/octet-stream)
---
**[tickets:#1557] Comp fails in INSTANTIATION_FAILED because comp crashes after compRegistration timeout**
**Status:** accepted
**Milestone:** 4.5.2
**Labels:** INSTANTIATION_FAILED component registration
**Created:** Fri Oct 23, 2015 02:17 AM UTC by Minh Hon Chau
**Last Updated:** Tue Oct 27, 2015 07:00 AM UTC
**Owner:** Minh Hon Chau
**Attachments:**
- [app3_twon2su1si.xml](https://sourceforge.net/p/opensaf/tickets/1557/attachment/app3_twon2su1si.xml) (10.5 kB; text/xml)
- [amf_demo_script](https://sourceforge.net/p/opensaf/tickets/1557/attachment/amf_demo_script) (1.9 kB; application/octet-stream)
- [log.tgz](https://sourceforge.net/p/opensaf/tickets/1557/attachment/log.tgz) (698.3 kB; application/x-compressed-tar)
- [amf_demo.diff](https://sourceforge.net/p/opensaf/tickets/1557/attachment/amf_demo.diff) (2.0 kB; text/x-patch)
Steps to reproduce:
. Apply amf_demo.diff and build amf_demo; use the attached amf_demo_script as the
clc script
. Run commands:
. immcfg -f app3_twon2su1si.xml
. echo 1 > /root/hu23992
. amf-adm unlock-in safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Logs:
Oct 23 12:47:19 PL-4 osafamfnd[421]: NO
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State UNINSTANTIATED
=> INSTANTIATING
Oct 23 12:47:19 PL-4 amf_demo_script: CLC-START:
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:22 PL-4 amf_demo[585]:
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' started
Oct 23 12:47:26 PL-4 osafamfnd[421]: NO Instantiation of
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' failed
Oct 23 12:47:26 PL-4 osafamfnd[421]: NO Reason: component registration timer
expired
Oct 23 12:47:26 PL-4 amf_demo_script: CLC-STOP:
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:27 PL-4 amf_demo[585]: Registered with AMF and HC started
Oct 23 12:47:27 PL-4 amf_demo[585]: Health check 1
Oct 23 12:47:29 PL-4 amf_demo[585]: exiting (caught term signal)
Oct 23 12:47:29 PL-4 osafamfnd[421]: NO
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' component restart probation
timer started (timeout: 10000000000 ns)
Oct 23 12:47:29 PL-4 osafamfnd[421]: NO Restarting a component of
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' (comp restart count: 1)
Oct 23 12:47:29 PL-4 osafamfnd[421]: NO
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' faulted due to
'avaDown' : Recovery is 'componentRestart'
Oct 23 12:47:29 PL-4 amf_demo_script: CLC-STOP:
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:29 PL-4 amf_demo_script: CLC-START:
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:32 PL-4 amf_demo[628]:
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' started
Oct 23 12:47:32 PL-4 amf_demo[628]: exiting (caught term signal)
Oct 23 12:47:32 PL-4 osafamfnd[421]: WA
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State
INSTANTIATING => INSTANTIATION_FAILED
Oct 23 12:47:32 PL-4 osafamfnd[421]: NO
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State INSTANTIATING
=> INSTANTIATION_FAILED
Trace is also attached.
Initial analysis:
. After the comp times out in the component_registration phase, amfnd enters
instantiating_fail, so the cleanup clc is called.
. Then the comp crashes and amfnd receives ava_mds_down; amfnd enters
instantiating_fail for the component again, and another cleanup clc is called.
. Eventually, when the two cleanup clc commands return, amfnd enters
cleanup_success twice while the component is in the instantiating state.
. At the second cleanup_success, the retry_counter has reached retry_max, so
the component goes to INSTANTIATION_FAILED.
My first thought is that amfnd should not enter instantiating_fail when the comp
crashes, since it is already handling instantiating_fail.
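The sequence above can be sketched as a toy model. It is only an illustration of the reasoning, not amfnd code: the type and function names are hypothetical, and `retry_max = 1` is an assumed value chosen so that the second cleanup_success exhausts the retries.

```cpp
#include <cassert>

// Hypothetical model of the presence state handling; not amfnd's real types.
enum class PresState { Instantiating, InstantiationFailed };

struct CompModel {
  PresState state = PresState::Instantiating;
  int retry_count = 0;
  int retry_max = 1;          // assumed value, for illustration only
  int instantiate_calls = 0;  // how many times clc instantiate was re-run
};

// Cleanup success event while the component is in the instantiating state:
// retry the instantiation if retries remain, otherwise give up.
void on_cleanup_success(CompModel& comp) {
  if (comp.state != PresState::Instantiating) {
    return;  // already failed, nothing more to do
  }
  if (comp.retry_count < comp.retry_max) {
    ++comp.retry_count;
    ++comp.instantiate_calls;  // stands in for launching clc instantiate
  } else {
    comp.state = PresState::InstantiationFailed;
  }
}
```

Calling `on_cleanup_success` once re-runs the instantiation; calling it a second time for the redundant cleanup finds the retries used up and moves the component to INSTANTIATION_FAILED, even though the re-instantiation from the first call is still in progress.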