On 18-Dec-15 11:12 AM, Minh Hon Chau wrote:
> Agree with you that amfnd should perform comp-failover. The timer expires
> because no other fault happens, while the previous recovery (comp-failover)
> had been determined but has not completed.
>
Also note that the default imm.xml sets saAmfNodeSuFailOverProb high, at
around 20 minutes, so the timer will run for a long time, and I do not
think any cleanup script will take that long. At the same time, a high
value of saAmfNodeSuFailOverProb is justified: a node should not reboot so
easily because of a single SU-level fault. For that case there is the
option of configuring node-failover directly as the recovery policy.
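
To make the role of the probation timer concrete, here is a minimal sketch of node-level escalation as I understand it. It is a toy model, not amfnd code: the class name, the field names and the exact threshold comparison are assumptions, and the "max" value stands in for saAmfNodeSuFailoverMax.

```python
from dataclasses import dataclass
from typing import Optional

NS_PER_SEC = 1_000_000_000  # AMF time attributes are expressed in nanoseconds


@dataclass
class NodeEscalation:
    """Toy model of SU failover escalation on one node (hypothetical names)."""
    prob_ns: int               # stands in for saAmfNodeSuFailOverProb
    su_failover_max: int       # stands in for saAmfNodeSuFailoverMax
    window_start_ns: Optional[int] = None
    count: int = 0

    def on_su_failover(self, now_ns: int) -> str:
        # Open a new probation window if none is running or the old one expired.
        expired = (self.window_start_ns is None
                   or now_ns - self.window_start_ns >= self.prob_ns)
        if expired:
            self.window_start_ns = now_ns
            self.count = 0
        self.count += 1
        # Assumption: escalate only once the count exceeds the configured max.
        if self.count > self.su_failover_max:
            return "node-failover"
        return "su-failover"


node = NodeEscalation(prob_ns=20 * 60 * NS_PER_SEC, su_failover_max=2)
first = node.on_su_failover(0)  # a single SU-level fault stays at SU level
```

In this model a single fault never escalates; only repeated SU failovers inside one probation window do.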

Thanks,
Praveen
> ------------------------------------------------------------------------
>
> *[tickets:#1590] <http://sourceforge.net/p/opensaf/tickets/1590/>
> Shutdown node hang if component calls saAmfFinalize during component
> failover*
>
> *Status:* accepted
> *Milestone:* 4.6.2
> *Labels:* hanging shutdown node shutdown nodegroup
> *Created:* Tue Nov 10, 2015 05:03 AM UTC by Minh Hon Chau
> *Last Updated:* Tue Nov 10, 2015 07:41 AM UTC
> *Owner:* Minh Hon Chau
> *Attachments:*
>
>   * osafamfnd
>     <https://sourceforge.net/p/opensaf/tickets/1590/attachment/osafamfnd>
>     (336.3 kB; application/octet-stream)
>   * syslog
>     <https://sourceforge.net/p/opensaf/tickets/1590/attachment/syslog>
>     (314.7 kB; application/octet-stream)
>
> The admin command shutdown node (or nodegroup) hangs if a component
> calls saAmfFinalize during component failover. Trace is attached.
>
> Scenario:
> . Issue admin shutdown node.
> . The component rejects the quiescing assignment with
> saAmfCSIQuiescingComplete(SA_AIS_ERR_FAILED_OPERATION).
> . The component calls saAmfFinalize, finalizing its handle.
> . Because the quiescing assignment failed, component failover recovery is
> started. As a result, clc cleanup is called.
> . The finalize-handle event arrives before clc cleanup returns ok.
> . avnd_comp_clc_terming_cleansucc_hdler() handles the cleanup success
> case. The quiescing sequence can't continue because
> avnd_comp_cmplete_all_assignment() currently seems to handle only the
> normal case, in which the callback list exists. Here, however, the
> component is unregistered and all handles have been deleted by
> saAmfFinalize. No su_si_oper_done is sent to amfd at the end, so the
> command hangs until timeout.
>
> Another similar test was done on amf_demo, which calls saAmfFinalize
> when the component receives SIGTERM. The assignment is quiesced and then
> removed successfully, since amfnd is "aware of" the unregistered
> component during the quiesced assignment sequence.
>
> The quiescing assignment sequence should be aware of the unregistered
> component in this case, in order to avoid hanging the node shutdown.
> Alternatively, saAmfFinalize should return TRY_AGAIN; this is still to
> be analyzed ...
>
> ------------------------------------------------------------------------
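
The hang described in the scenario can be sketched with a small model. This is only an illustration under stated assumptions: `Component`, `complete_all_assignments` and the `handle_unregistered` switch are hypothetical names, not the amfnd implementation, and the string "su_si_oper_done" merely stands in for the message sent to amfd.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """Hypothetical stand-in for a component record kept by the node director."""
    name: str
    registered: bool = True
    pending_callbacks: List[str] = field(default_factory=list)

    def finalize(self) -> None:
        # Models the effect of saAmfFinalize described in the ticket:
        # the component becomes unregistered and its callback records go away.
        self.registered = False
        self.pending_callbacks.clear()


def complete_all_assignments(comp: Component, sent: List[str],
                             handle_unregistered: bool) -> List[str]:
    """Run after cleanup succeeds; records the reply to amfd in `sent`."""
    if comp.pending_callbacks:
        # Normal case: a callback list exists, so the assignment is closed.
        comp.pending_callbacks.clear()
        sent.append("su_si_oper_done")
    elif handle_unregistered and not comp.registered:
        # Proposed behaviour: still close the assignment for an
        # unregistered component instead of waiting forever.
        sent.append("su_si_oper_done")
    return sent


comp = Component("demo", pending_callbacks=["quiescing"])
comp.finalize()  # finalize arrives before cleanup returns
current = complete_all_assignments(comp, [], handle_unregistered=False)
proposed = complete_all_assignments(comp, [], handle_unregistered=True)
```

With the switch off, nothing is sent and the shutdown would wait for its timeout; with it on, the reply is sent even though the callback list is empty.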

