Hi David,

Just adding to Mohan's response: you can configure the probation time (10 seconds in the given example) as per your requirements.
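
As a small illustration of choosing your own probation time: the values in Mohan's message are given in nanoseconds, so 10 seconds is written as 10000000000. The Python sketch below only does that arithmetic. The helper name seconds_to_nanoseconds and the dictionary of example values are made up for this illustration; the three attribute names are the probation attributes quoted in the thread.

```python
# Minimal sketch: convert a probation time in seconds to the nanosecond
# value used for the *Prob attributes mentioned in this thread.
# The helper name and the example values are illustrative assumptions.

NANOSECONDS_PER_SECOND = 1_000_000_000


def seconds_to_nanoseconds(seconds: float) -> int:
    """Return the given duration in whole nanoseconds."""
    if seconds < 0:
        raise ValueError("probation time must be non-negative")
    return round(seconds * NANOSECONDS_PER_SECOND)


# Probation times in seconds; pick whatever suits your requirements.
probation_seconds = {
    "saAmfSGCompRestartProb": 10,
    "saAmfSGSuRestartProb": 10,
    "saAmfNodeSuFailOverProb": 10,
}

for attribute, seconds in probation_seconds.items():
    # Print each attribute with the nanosecond value to configure.
    print(f"{attribute}: {seconds_to_nanoseconds(seconds)}")
```

With 10 seconds as input, each attribute comes out as 10000000000, matching the example value in the message below.
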

Thanks & Best Regards
-Nagendra, +91-9866424860
www.GetHighAvailability.com
Get High Availability Today!
NJ, USA: +1 508-507-6507 | Hyderabad, India: +91 798-992-5293

-----Original Message-----
From: Mohan Kanakam [mailto:mo...@hasolutions.in]
Sent: 04 May 2020 16:13
To: 'Hoyt, David'; Opensaf-users@lists.sourceforge.net
Cc: Nagendra Kumar
Subject: RE: [users] escalate to node reboot

Hi David,

You can set the following parameters in the Service Group (the last two, saAmfNodeSuFailoverMax and saAmfNodeSuFailOverProb, are set on the AMF node):

Component restart max hits threshold:

saAmfSGCompRestartMax: Set this to 2 if you want to escalate to SU restart after 2 component restarts.
saAmfSGCompRestartProb: Set this in nanoseconds. For example, 10000000000 (i.e. 10 seconds) if you want to escalate to SU restart when the component restarts 2 times within 10 seconds.

saAmfSGSuRestartMax: Set this to 2 if you want to escalate to SU failover after 2 SU restarts.
saAmfSGSuRestartProb: Set this in nanoseconds. For example, 10000000000 (i.e. 10 seconds) if you want to escalate to SU failover when the SU restarts 2 times within 10 seconds.

saAmfNodeSuFailoverMax: Set this to 2 if you want to escalate to node failover after 2 SU failovers.
saAmfNodeSuFailOverProb: Set this in nanoseconds. For example, 10000000000 (i.e. 10 seconds) if you want to escalate to node failover when SU failover happens 2 times within 10 seconds.

>> Basically, after a couple of retries, I want the node to reboot if the application cannot run on it.

If you want to jump directly to node failover after 2 component restarts, set the following values:

saAmfSGCompRestartMax: 0
saAmfSGCompRestartProb: 1000000000 (i.e. 1 second)
saAmfSGSuRestartMax: 0
saAmfSGSuRestartProb: 1000000000 (i.e. 1 second)
saAmfNodeSuFailoverMax: 2
saAmfNodeSuFailOverProb: 10000000000 (i.e. 10 seconds)

After 2 kills, if you kill the component again, it will reboot the node, as shown below:

2020-05-04T15:58:34.068017+05:30 osafamfnd[7209]: NO SU failovers have reached configured limit of 2
2020-05-04T15:58:34.071693+05:30 VirtualBox osafamfnd[7209]: NO SU failover probation timer stopped
2020-05-04T15:58:34.073870+05:30 VirtualBox osafamfnd[7209]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' recovery action escalated from 'componentRestart' to 'nodeFailover'
2020-05-04T15:58:34.077648+05:30 VirtualBox osafamfnd[7209]: NO 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'avaDown' : Recovery is 'nodeFailover'
2020-05-04T15:58:34.081016+05:30 VirtualBox osafamfnd[7209]: NO Terminating all application components (abruptly & unordered)
2020-05-04T15:58:34.105582+05:30 VirtualBox osafamfnd[7209]: IN 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
2020-05-04T15:58:34.110504+05:30 VirtualBox osafamfnd[7209]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => TERMINATING
2020-05-04T15:58:34.111269+05:30 VirtualBox osafamfnd[7209]: NO 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State TERMINATING => TERMINATING
2020-05-04T15:58:34.279655+05:30 VirtualBox amf_demo_script: 1. Stopping component....: 0
2020-05-04T15:58:34.320264+05:30 VirtualBox amf_demo_script: 2. Stopping component....: 0
2020-05-04T15:58:34.339804+05:30 VirtualBox osafamfnd[7209]: IN 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State TERMINATING => UNINSTANTIATED
2020-05-04T15:58:34.340415+05:30 VirtualBox osafamfnd[7209]: NO Terminated all application components
2020-05-04T15:58:34.341337+05:30 VirtualBox osafamfnd[7209]: NO Informing director of node fail-over

Hope it helps!

Thanks & Regards
Mohan Kanakam | +91-8333082448
Software Engineer
High Availability Solutions
www.GetHighAvailability.com
Get High Availability Today!
NJ, USA: +1 508-422-7725 | Hyderabad, India: +91 798-992-5293

-----Original Message-----
From: Hoyt, David [mailto:dh...@rbbn.com]
Sent: Tuesday, April 28, 2020 7:06 PM
To: Opensaf-users@lists.sourceforge.net
Subject: [users] escalate to node reboot

Hi all,

With all the SG, SU and component variables, I'm trying to determine what I need to set in the imm.xml file for the following:
* Component restart max hits threshold, escalate to SU failure
* SU failure max escalates to SU failover followed by node reboot

Basically, after a couple of retries, I want the node to reboot if the application cannot run on it.

Setup:
2 nodes: SC-1,SC-2
Running opensaf-5.19.10
Virtualization: kvm
Operating System: Red Hat Enterprise Linux Server 7.8 (Maipo)
Kernel: Linux 3.10.0-1127.el7.x86_64
Architecture: x86-64

Regards,
David

-----------------------------------------------------------------------
Notice: This e-mail together with any attachments may contain information of Ribbon Communications Inc. that is confidential and/or proprietary for the sole use of the intended recipient. Any review, disclosure, reliance or distribution by others or forwarding without express permission is strictly prohibited. If you are not the intended recipient, please notify the sender immediately and then delete all copies, including any attachments.
-----------------------------------------------------------------------

_______________________________________________
Opensaf-users mailing list
Opensaf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-users