Hi Minh,
The case "1) When comp completes assignments in headless state" can still occur
when AMFNDs have responded to AMFD for the assignments but the cluster becomes
headless before AMFD processes these mailbox messages. In this case, after the
headless state, AMFD will read the SUSI FSM state as MODIFY from IMM, and
since AMFND has already responded for the assignment, it will not send any
further SUSI response. AMFD will then wrongly expect that an assignment event
will arrive and trigger the SG FSM.
So inside the function avd_cluster_tmr_init_evh(), the condition:
    if (i_sg->any_assignment_in_progress() == false) {
      i_sg->set_fsm_state(AVD_SG_FSM_STABLE);
    }
will find assignments in progress (AMFD has read at least one SUSI in MODIFY
state from IMM), and thus the SG will remain unstable. There will then be no
event from AMFND to trigger the SG FSM. In such cases, AMFD will have to
trigger the FSM by itself.
I think that, while recreating SUSIs from the AMFNDs, AMFD can track whether,
for any SU in the SG, at least one SUSI is in assigning state. This would
ensure that at least one event will come to trigger the SG FSM. But if all SIs
are in assigned/removed state in all SUs at all AMFNDs, then there will not be
any outside event to trigger the SG FSM.
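To illustrate the idea, here is a rough sketch of the decision AMFD could make per SG once the SUSIs have been recreated. All types and names below (SusiFsm, SusiRecord, SgRecovery, classify_sg) are hypothetical stand-ins for illustration, not the actual AMFD definitions, and the rule itself is only my assumption of how the tracking could work:

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified SUSI FSM states (not the real AMFD enum).
enum class SusiFsm { kAssigning, kAssigned, kModifying, kUnassigning, kRemoved };

// One recreated SUSI: what IMM says versus what AMFND reported in the sync.
struct SusiRecord {
  SusiFsm imm_state;    // state AMFD read back from IMM after headless
  SusiFsm amfnd_state;  // state reported by AMFND when SUSIs were recreated
};

enum class SgRecovery {
  kStable,        // nothing in progress anywhere: SG can be marked stable
  kWaitForAmfnd,  // at least one AMFND response is still expected
  kSelfTrigger    // IMM shows work in progress, but no response will come
};

inline bool in_progress(SusiFsm s) {
  return s == SusiFsm::kAssigning || s == SusiFsm::kModifying ||
         s == SusiFsm::kUnassigning;
}

// Assumption: an outside event arrives only if some AMFND still has a SUSI
// in an in-progress state; otherwise AMFD must drive the SG FSM itself.
SgRecovery classify_sg(const std::vector<SusiRecord>& susis) {
  bool imm_pending = false;
  bool amfnd_pending = false;
  for (const auto& s : susis) {
    imm_pending = imm_pending || in_progress(s.imm_state);
    amfnd_pending = amfnd_pending || in_progress(s.amfnd_state);
  }
  if (amfnd_pending) return SgRecovery::kWaitForAmfnd;
  if (imm_pending) return SgRecovery::kSelfTrigger;
  return SgRecovery::kStable;
}
```

The kSelfTrigger result corresponds to the problem case above: IMM still shows MODIFY, but every AMFND has already finished, so no response will ever reach AMFD.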
Thanks,
Praveen
---
**[tickets:#1725] AMF: Recover transient SUSIs left over from headless**
**Status:** review
**Milestone:** 5.1.FC
**Created:** Wed Apr 06, 2016 07:16 AM UTC by Minh Hon Chau
**Last Updated:** Thu Aug 11, 2016 12:42 PM UTC
**Owner:** Minh Hon Chau
This ticket is more of an enhancement that targets how AMFD detects and
recovers the transient SUSIs left over from headless. There are three major
situations:
(1) - The cluster goes headless; su/node failover can happen on any payload, or
any payload can be hard rebooted/powered off by the operator; then the cluster
recovers
(2) - An admin op is issued on any AMF entity, then the cluster goes headless.
During headless, the intermediate HA assignments of the admin op sequence
between AMFND and components could be:
(2.1) The assignment completes, the component returns OK for the csi callback,
then the cluster recovers
(2.2) The assignment is still in progress when the cluster recovers. The
assignment could complete afterward, or the csi callback could return
FAILED_OPERATION, or an error could also occur
At the time the cluster recovers, amfd has collected all assignments from all
amfnd(s). These assignments can be in assigned or assigning state while their
HA states do not conform to the SG redundancy model. Any of (1), (2.1), (2.2)
can happen in combination, which means that while an admin op is being issued
(2), the cluster can go headless and any kind of failover (1) can happen
during headless.
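As a minimal sketch of the detection step, the check below audits the assignments collected from the amfnd(s) for one SG. It is only an illustration under stated assumptions: the types (HaState, AssignState, CollectedSusi, Audit) and the function audit_sg are hypothetical, and the conformance rule (each SI has exactly one active assignment, as in a 2N-like model) is an assumed simplification, not the actual amfd logic.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified view of one assignment collected after headless.
enum class HaState { kActive, kStandby, kQuiesced, kQuiescing };
enum class AssignState { kAssigning, kAssigned };

struct CollectedSusi {
  std::string su;
  std::string si;
  HaState ha;
  AssignState state;
};

struct Audit {
  bool transient = false;      // some assignment is still in assigning state
  bool nonconforming = false;  // HA states break the assumed redundancy rule
};

// Assumed rule (2N-like): every SI that has assignments must have exactly
// one active assignment. Other redundancy models would need other rules.
Audit audit_sg(const std::vector<CollectedSusi>& susis) {
  Audit result;
  std::map<std::string, int> active_per_si;
  for (const auto& s : susis) {
    if (s.state == AssignState::kAssigning) result.transient = true;
    int& count = active_per_si[s.si];  // entry starts at 0 for a new SI
    if (s.ha == HaState::kActive) ++count;
  }
  for (const auto& entry : active_per_si) {
    if (entry.second != 1) result.nonconforming = true;
  }
  return result;
}
```

Under this sketch, situation (1) typically shows up as nonconforming only (for example a standby left without its active), while (2.2) shows up as transient, possibly together with nonconforming.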