Hi Praveen,

Thanks for the review. I have commented inline.

  * Escalation and Recovery during SC absence period:
-Restarts will work as normal, but failover or switchover will result in Node
-Failfast. The repair action will be initiated when a SC returns if
-saAmfSGAutoRepair is enabled.
+Component and su restarts will work as normal. Any fail-over or switch-over at
+component, su, and node level will only cleanup faulty components. Recovery will
+be delayed until a SC returns: the fail-over or switch-over of SI assignments
+will be initiated if saAmfSGAutoRepair is enabled, the node will be rebooted if
+saAmfNodeAutoRepair, saAmfNodeFailfastOnTerminationFailure, or
+saAmfNodeFailfastOnInstantiationFailure is enabled.
[Praveen] I think failover and switchover of assignments do not depend on
saAmfSGAutoRepair.
Should the sentence be like this?
  " Recovery (failover or switchover of assignments) will be delayed until a SC
returns.
When the first SC comes up after the SC absence state, AMF will perform pending
repairs:

[Minh]: This part is about escalation and recovery, which are initiated by the
su_oper message. It does depend on saAmfSGAutoRepair, which is checked in
su_try_repair(), so I am not going to change the text.
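
As a rough illustration of the behaviour the README text describes (this is
not the amfd/amfnd code; the class, the function names and the action strings
are invented for the sketch, only the attribute names come from the text
above, and the node-reboot condition is simplified to "any of the three
attributes is enabled"):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Config:
    # Field names mirror the attributes named in the README text.
    # The defaults are arbitrary and chosen only for the example.
    sa_amf_sg_auto_repair: bool = True
    sa_amf_node_auto_repair: bool = True
    sa_amf_node_failfast_on_termination_failure: bool = False
    sa_amf_node_failfast_on_instantiation_failure: bool = False


def actions_while_headless(recovery: str) -> List[str]:
    """Hypothetical: what is done immediately while no SC is present."""
    if recovery in ("comp_restart", "su_restart"):
        # Component and su restarts work as normal.
        return ["restart"]
    # Fail-over or switch-over at component, su or node level:
    # only clean up the faulty components, defer the rest.
    return ["cleanup_faulty_components", "defer_recovery"]


def actions_when_sc_returns(cfg: Config) -> List[str]:
    """Hypothetical: deferred recovery performed once an SC is back."""
    actions = []
    if cfg.sa_amf_sg_auto_repair:
        actions.append("failover_or_switchover_si_assignments")
    if (cfg.sa_amf_node_auto_repair
            or cfg.sa_amf_node_failfast_on_termination_failure
            or cfg.sa_amf_node_failfast_on_instantiation_failure):
        actions.append("reboot_node")
    return actions
```

With both auto-repair attributes disabled and no failfast attribute set,
actions_when_sc_returns() returns an empty list, i.e. nothing is recovered
automatically when the SC comes back.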

+* Possible loss of RTA updates and SI assignment messages
+If both SCs go down abruptly (SCs are immediately powered-off for instance),
+AMFD could fail to update RTA to IMM, the SI assignment messages sent from
+AMFND could not reach to AMFD, recovery could be impossible.
+
[Praveen] Should we mention the case of loss of the assignment response from
AMFND to AMFD?
Also, I think we should mention the impact of this loss, something like:
"In case of loss of RTA and SI assignments, AMF will not be able to fully
recover assignments. Thus the application
may go into an inconsistent state."

[Minh]: I have rewritten the text as: "If both SCs go down abruptly (SCs are
immediately powered-off for instance), AMFD could fail to update RTA to IMM,
the SI assignment request message sent from AMFD could fail to reach AMFND, or
the SI assignment response message sent from AMFND could fail to reach AMFD.
In such cases, recovery could be impossible and the application may have
inappropriate assignment states."
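
A toy model of the three loss points may make the impact easier to see. It
assumes (my assumption, not taken from the code) that the order is: AMFD sends
the request, AMFND applies it and responds, and only then AMFD writes RTA to
IMM. The enum, the function name and the state strings are all invented for
the sketch:

```python
from enum import Enum
from typing import Tuple


class Loss(Enum):
    NONE = 0
    REQUEST = 1      # AMFD -> AMFND assignment request never arrives
    RESPONSE = 2     # AMFND -> AMFD assignment response never arrives
    RTA_UPDATE = 3   # AMFD never writes the new state to IMM


def views_after_sc_loss(loss: Loss,
                        old: str = "standby",
                        new: str = "active") -> Tuple[str, str]:
    """Return (HA state recorded in IMM, HA state on the payload node)."""
    imm_state = old
    node_state = old
    # Step 1 (assumed): request travels from AMFD to AMFND.
    if loss is Loss.REQUEST:
        return imm_state, node_state
    node_state = new
    # Step 2 (assumed): response travels from AMFND back to AMFD.
    if loss is Loss.RESPONSE:
        return imm_state, node_state
    # Step 3 (assumed): AMFD records the new state as RTA in IMM.
    if loss is Loss.RTA_UPDATE:
        return imm_state, node_state
    imm_state = new
    return imm_state, node_state
```

In this model a lost request leaves both views agreeing on the old state,
although the intended change never happened. A lost response or a lost RTA
update leaves IMM with the old state while the node already has the new one.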

One query: it is known from ticket #2210 that an mbcsv checkpoint can be lost
during an SC failover in a normal cluster, similar to the loss of RTA when both
SCs go headless. As for the loss of SI assignment messages: although AMFD uses
MDS in a redundant view, the SI assignment is not synchronized. What happens if
someone abruptly powers off the active controller just as the active amfd is
about to receive the assignment message, or just after amfnd has sent the
assignment response but before it reaches the amfds?



On 15/03/17 16:26, praveen malviya wrote:
> +saAmfNodeFailfastOnInstantiationFailure is enabled.
> [Praveen] I think failover and switchover of assignments do not depend on
> saAmfSGAutoRepair.
> Should the sentence be like this?
>   " Recovery (failover or switchover of assignments) will be delayed until a
> SC returns.
> When the first SC comes up after the SC absence state, AMF will perform
> pending repairs:


_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
