Hi Praveen,
Regarding the use of saAmfNodeAutoRepair in amfnd to decide whether to reboot
immediately: I think amfnd can obtain this attribute the same way it obtains
saAmfNodeSuFailoverMax and saAmfNodeSuFailOverProb, that is, by adding
saAmfNodeAutoRepair to AVSV_D2N_NODE_UP_MSG, AVSV_D2N_DATA_VERIFY_MSG, and
AVSV_D2N_OPERATION_REQUEST_MSG.
But a node reboot due to nodeFailfast, which is determined by
saAmfNodeFailfastOnTerminationFailure/saAmfNodeFailfastOnInstantiationFailure,
has to wait until SC comes back from headless, because a node failfast relies
on saAmfSGAutoRepair and amfnd has no information about the SG, so amfnd cannot
decide on the failfast reboot by itself.
So, to minimize changes (no need to change the messages and bump the msg
version) and to support node reboot escalation (node failover/switchover/
failfast) uniformly during headless, I think it would be better to wait until
SC comes back, when amfd has enough information to decide on a node reboot.
Below is a summary of AMF's escalation behavior during headless for this
ticket:
- componentRestart/suRestart (PI SU): performed as normal, the same as when
not headless.
- componentFailover: amfnd just cleans up the faulty component and keeps the
CSI assignments of the other healthy components. These CSI assignments will be
switched over once SC comes back.
- suFailover: all components are cleaned up.
- nodeFailover: all SUs are cleaned up.
- nodeSwitchover: only the faulty SUs are cleaned up; the other healthy SUs
will be switched over once SC comes back.
- nodeFailfast: a node failfast due to TERM-FAILED/INST-FAILED will be
performed once SC comes back (already implemented).
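
The list above can be read as a lookup from the configured recovery to what
amfnd does while headless. A minimal sketch of that reading follows; the enum
and function names are hypothetical and are not taken from the amfnd source.

```cpp
// Hypothetical names for illustration only; not the actual amfnd types.
enum class Recovery {
  kComponentRestart,
  kSuRestart,
  kComponentFailover,
  kSuFailover,
  kNodeFailover,
  kNodeSwitchover,
  kNodeFailfast
};

enum class HeadlessAction {
  kRestartAsNormal,         // same handling as when SC is up
  kCleanupFaultyComponent,  // CSI assignments of healthy components are kept
  kCleanupAllComponents,    // suFailover
  kCleanupAllSus,           // nodeFailover
  kCleanupFaultySusOnly,    // healthy SUs are switched over after SC returns
  kDeferUntilScReturns      // reboot is performed once SC comes back
};

// Maps a recovery to the headless behavior summarized in the list above.
HeadlessAction headless_action(Recovery recovery) {
  switch (recovery) {
    case Recovery::kComponentRestart:
    case Recovery::kSuRestart:
      return HeadlessAction::kRestartAsNormal;
    case Recovery::kComponentFailover:
      return HeadlessAction::kCleanupFaultyComponent;
    case Recovery::kSuFailover:
      return HeadlessAction::kCleanupAllComponents;
    case Recovery::kNodeFailover:
      return HeadlessAction::kCleanupAllSus;
    case Recovery::kNodeSwitchover:
      return HeadlessAction::kCleanupFaultySusOnly;
    case Recovery::kNodeFailfast:
      return HeadlessAction::kDeferUntilScReturns;
  }
  // Not reachable for valid enum values; keeps compilers quiet.
  return HeadlessAction::kDeferUntilScReturns;
}
```

The point of the sketch is only that no entry in the table reboots the node
while SC is down.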
Also, amfnd needs to cache AVSV_N2D_OPERATION_STATE_MSG for NPI suRestart,
nodeFailover, and nodeSwitchover so that amfnd can resend this msg once SC
comes back. That will trigger avd_su_oper_state_evh(), and the
failover/switchover can then continue.
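
To illustrate the cache-and-resend idea, here is a rough sketch. It assumes a
simple FIFO of pending messages; OperStateMsg, HeadlessMsgCache, and the send
callback are hypothetical stand-ins, not the real amfnd structures or MDS
calls.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for AVSV_N2D_OPERATION_STATE_MSG; the real message
// carries more fields than shown here.
struct OperStateMsg {
  std::string su_name;
  bool oper_enabled;
};

// Hypothetical cache: holds messages while SC is down, resends in order later.
class HeadlessMsgCache {
 public:
  using Sender = std::function<bool(const OperStateMsg&)>;

  explicit HeadlessMsgCache(Sender send) : send_(std::move(send)) {}

  void on_sc_down() { sc_up_ = false; }

  // Send immediately if SC is reachable; otherwise keep the message.
  void submit(const OperStateMsg& msg) {
    if (sc_up_ && send_(msg)) return;
    pending_.push_back(msg);
  }

  // Called once SC comes back: resend cached messages in their original
  // order and return how many were delivered.
  std::size_t on_sc_up() {
    sc_up_ = true;
    std::size_t sent = 0;
    while (!pending_.empty() && send_(pending_.front())) {
      pending_.pop_front();
      ++sent;
    }
    return sent;
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  Sender send_;
  bool sc_up_ = true;
  std::deque<OperStateMsg> pending_;
};
```

Keeping the original order matters in this sketch because amfd would process
the resent messages one by one after SC returns.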
Please let me know if anything is wrong with the above behavior.
Thanks,
Minh
---
** [tickets:#1902] AMF: Remove node reboot if su/comp failover during headless**
**Status:** assigned
**Milestone:** 5.1.FC
**Created:** Wed Jun 29, 2016 12:02 PM UTC by Minh Hon Chau
**Last Updated:** Mon Jul 04, 2016 05:26 AM UTC
**Owner:** Minh Hon Chau
If a comp/su failover occurs during headless, amfnd will escalate to a reboot.
This unexpectedly impacts other comps/SUs that are up and running when no node
failover escalation is configured for the faulty comp/su.
2016-06-29 21:30:07 PL-4 osafamfnd[429]: NO
'safComp=AmfDemo2,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' faulted due
to 'avaDown' : Recovery is 'suFailover'
2016-06-29 21:30:07 PL-4 osafamfnd[429]: NO Terminating components of
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon'(abruptly & unordered)
2016-06-29 21:30:07 PL-4 osafamfnd[429]: NO
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State INSTANTIATED =>
TERMINATING
2016-06-29 21:30:07 PL-4 osafamfnd[429]: NO
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State TERMINATING =>
TERMINATING
2016-06-29 21:30:07 PL-4 osafamfnd[429]: NO
'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' Presence State TERMINATING =>
TERMINATING
2016-06-29 21:30:07 PL-4 osafamfnd[429]: Rebooting OpenSAF NodeId = 132111 EE
Name = , Reason: Can't perform recovery while controllers are down. Recovery is
node failfast., OwnNodeId = 132111, SupervisionTime = 60
2016-06-29 21:30:07 PL-4 opensaf_reboot: Rebooting local node; timeout=60
This ticket will remove the unexpected reboot due to failover during headless,
which is mentioned as a limitation in the OpenSAF AMF documentation.