Thanks for the information. I think the root cause of this issue is that amfnd
lets two component life cycle sequences run in parallel.
. The first error is the component registration timeout
Oct 23 12:47:26.169639 osafamfnd [421:clc.cc:0500] NO Reason: component 
registration timer expired
Oct 23 12:47:26.169657 osafamfnd [421:clc.cc:1573] >> 
avnd_comp_clc_xxxing_instfail_hdler: 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon': Instantiate 
fail event in Instantiating/Restarting State
Oct 23 12:47:26.169691 osafamfnd [421:clc.cc:2942] T1 CLC CLI command 
arguments[1] ='cleanup'
. The second error is the component crash detected by ava_down
Oct 23 12:47:29.177223 osafamfnd [421:err.cc:0407] NO 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' faulted due to 
'avaDown' : Recovery is 'componentRestart'
Oct 23 12:47:29.177508 osafamfnd [421:clc.cc:1573] >> 
avnd_comp_clc_xxxing_instfail_hdler: 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon': Instantiate 
fail event in Instantiating/Restarting State
Oct 23 12:47:29.177548 osafamfnd [421:clc.cc:2942] T1 CLC CLI command 
arguments[1] ='cleanup'

The first cleanup succeeds, so amfnd starts the clc instantiate
Oct 23 12:47:29.181102 osafamfnd [421:clc.cc:1702] >> 
avnd_comp_clc_xxxing_cleansucc_hdler: 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon': Cleanup 
success event in the instantiating/restarting state
Oct 23 12:47:29.181127 osafamfnd [421:clc.cc:2942] T1 CLC CLI command 
arguments[1] ='instantiate'

The problem arises when the response to the second clc cleanup arrives while
amfnd is still waiting for the clc instantiate to return. The second clc
cleanup was redundant from the beginning, because the first clc cleanup was
already in progress.

Based on this, I created a patch for this ticket (see attachment). The patch
avoids running the same clc command again while amfnd has not yet received the
clc_resp (which indicates timeout/error) for the earlier one. I am still
testing it.
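
To make the idea concrete, here is a toy model of such a guard. It is not the attached patch and not amfnd code: `ClcGuard`, `request` and `on_clc_resp` are hypothetical names chosen for illustration only. The sketch assumes one pending clc command per component.

```python
# Toy model only: ClcGuard and its methods are hypothetical names,
# not amfnd code and not the contents of 1557.patch.


class ClcGuard:
    """Tracks which clc command is awaiting a response for one component."""

    def __init__(self):
        self.pending = None   # command currently awaiting clc_resp, if any
        self.executed = []    # commands that were actually launched

    def request(self, cmd):
        """Launch cmd unless the same command is already in flight."""
        if self.pending == cmd:
            return False      # duplicate request: skip it
        self.pending = cmd
        self.executed.append(cmd)
        return True

    def on_clc_resp(self, cmd):
        """The response (success, error or timeout) for cmd has arrived."""
        if self.pending == cmd:
            self.pending = None


guard = ClcGuard()
first = guard.request("cleanup")       # registration timer expired
second = guard.request("cleanup")      # avaDown while cleanup is still pending
guard.on_clc_resp("cleanup")           # only one cleanup response comes back
third = guard.request("instantiate")   # restart continues normally
print(first, second, third, guard.executed)
```

In this model the second cleanup request is dropped, so only one cleanup response is ever processed before the instantiate.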

@Praveen: does the approach in this patch sound OK to you?




Attachments:

- [1557.patch](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/680d40d6/8dc7/attachment/1557.patch) (1.5 kB; application/octet-stream)


---

**[tickets:#1557] Comp fails in INSTANTIATION_FAILED because comp crashes after compRegistration timeout**

**Status:** accepted
**Milestone:** 4.5.2
**Labels:** INSTANTIATION_FAILED component registration 
**Created:** Fri Oct 23, 2015 02:17 AM UTC by Minh Hon Chau
**Last Updated:** Tue Oct 27, 2015 07:00 AM UTC
**Owner:** Minh Hon Chau
**Attachments:**

- [app3_twon2su1si.xml](https://sourceforge.net/p/opensaf/tickets/1557/attachment/app3_twon2su1si.xml) (10.5 kB; text/xml)
- [amf_demo_script](https://sourceforge.net/p/opensaf/tickets/1557/attachment/amf_demo_script) (1.9 kB; application/octet-stream)
- [log.tgz](https://sourceforge.net/p/opensaf/tickets/1557/attachment/log.tgz) (698.3 kB; application/x-compressed-tar)
- [amf_demo.diff](https://sourceforge.net/p/opensaf/tickets/1557/attachment/amf_demo.diff) (2.0 kB; text/x-patch)


Steps to reproduce:
. Apply amf_demo.diff and build amf_demo, using the attached amf_demo_script as
the clc script
. Run commands:
   . immcfg -f app3_twon2su1si.xml
   . echo 1 > /root/hu23992
   . amf-adm unlock-in safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon

Logs:
Oct 23 12:47:19 PL-4 osafamfnd[421]: NO 
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State UNINSTANTIATED 
=> INSTANTIATING
Oct 23 12:47:19 PL-4 amf_demo_script: CLC-START: 
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:22 PL-4 amf_demo[585]: 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' started
Oct 23 12:47:26 PL-4 osafamfnd[421]: NO Instantiation of 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' failed
Oct 23 12:47:26 PL-4 osafamfnd[421]: NO Reason: component registration timer 
expired
Oct 23 12:47:26 PL-4 amf_demo_script: CLC-STOP: 
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:27 PL-4 amf_demo[585]: Registered with AMF and HC started
Oct 23 12:47:27 PL-4 amf_demo[585]: Health check 1
Oct 23 12:47:29 PL-4 amf_demo[585]: exiting (caught term signal)
Oct 23 12:47:29 PL-4 osafamfnd[421]: NO 
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' component restart probation 
timer started (timeout: 10000000000 ns)
Oct 23 12:47:29 PL-4 osafamfnd[421]: NO Restarting a component of 
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' (comp restart count: 1)
Oct 23 12:47:29 PL-4 osafamfnd[421]: NO 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' faulted due to 
'avaDown' : Recovery is 'componentRestart'
Oct 23 12:47:29 PL-4 amf_demo_script: CLC-STOP: 
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:29 PL-4 amf_demo_script: CLC-START: 
safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
Oct 23 12:47:32 PL-4 amf_demo[628]: 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' started
Oct 23 12:47:32 PL-4 amf_demo[628]: exiting (caught term signal)
Oct 23 12:47:32 PL-4 osafamfnd[421]: WA 
'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State 
INSTANTIATING => INSTANTIATION_FAILED
Oct 23 12:47:32 PL-4 osafamfnd[421]: NO 
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State INSTANTIATING 
=> INSTANTIATION_FAILED

Trace is also attached.

Initial analysis:
. After the comp times out in the component registration phase, amfnd enters
instantiating_fail, so the cleanup clc is called
. Then the comp crashes and amfnd receives ava_mds_down; amfnd enters
instantiating_fail for the component again, and another cleanup clc is called.
. Eventually, when the two cleanup clc commands return, amfnd enters
cleanup_success twice while the component is in the instantiating state
. At the second cleanup_success, the retry_counter has reached retry_max, so
the component goes to INSTANTIATION_FAILED

As a first thought, amfnd should not enter instantiating_fail when the comp
crashes, since it is already handling instantiating_fail.
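
The sequence in the initial analysis above can be replayed with a small model. This is a simplified sketch, not amfnd's actual state machine: the `Comp` class, its handler names, and the value `RETRY_MAX = 1` are assumptions made only to show how two cleanup_success events can exhaust the retry budget.

```python
# Simplified model; RETRY_MAX and all names here are assumptions for
# illustration and do not reproduce amfnd's real retry accounting.
RETRY_MAX = 1


class Comp:
    def __init__(self):
        self.state = "INSTANTIATING"
        self.retry_count = 0
        self.log = []

    def on_instantiate_fail(self, reason):
        # Every instantiate failure in this state triggers a cleanup.
        self.log.append("cleanup (" + reason + ")")

    def on_cleanup_success(self):
        if self.state != "INSTANTIATING":
            return
        if self.retry_count < RETRY_MAX:
            self.retry_count += 1
            self.log.append("instantiate")
        else:
            self.state = "INSTANTIATION_FAILED"


comp = Comp()
comp.on_instantiate_fail("registration timer expired")
comp.on_instantiate_fail("avaDown")
comp.on_cleanup_success()   # first cleanup returns: retry the instantiate
comp.on_cleanup_success()   # second cleanup returns: retries exhausted
print(comp.state, comp.log)
```

With only one failure event and one cleanup response, the same model would stay in INSTANTIATING after a single retry.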


