Hi Tocino García, José Tomás,

I was able to reproduce the issue you are facing. There are two problems:

1. The binary is not started because the script does not use its absolute path.
2. The passive monitoring binary (amfpm) needs to know the PID file path.
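To illustrate both points, here is a minimal sketch rather than your actual setup. It assumes a hypothetical stand-in script called demo_bin in a temporary directory instead of simplepm. The kill -0 check only shows what a monitor could do once it is given a PID file; it is not a description of how amfpm works internally.

```shell
#!/bin/sh
# Hypothetical stand-in for the real binary, created in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/demo_bin" <<'EOF'
#!/bin/sh
exec sleep 30
EOF
chmod +x "$workdir/demo_bin"

# Problem 1: a relative path is resolved against the current working
# directory, so from an unrelated directory "./demo_bin" is not found.
cd /
if ./demo_bin >/dev/null 2>&1; then
    relative_result="started"
else
    relative_result="not found"
fi
echo "relative path: $relative_result"

# The absolute path works regardless of the working directory.
nohup "$workdir/demo_bin" >/dev/null 2>&1 &
pid=$!

# Problem 2: $! holds the PID of the most recent background job.
# Writing it to a file lets another program find the process later.
pidfile="$workdir/demo_bin.pid"
echo "$pid" > "$pidfile"

# Illustrative liveness check using only the PID file.
if kill -0 "$(cat "$pidfile")" 2>/dev/null; then
    absolute_result="running"
else
    absolute_result="not running"
fi
echo "absolute path: $absolute_result"

# Clean up the demonstration process and files.
kill "$pid" 2>/dev/null
wait "$pid" 2>/dev/null || true
rm -rf "$workdir"
```

On a typical Linux system this should report "not found" for the relative path and "running" for the absolute one, which matches the two problems listed above.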
Please correct the following two things and everything should work fine.

1. Change your script as shown below (here the PID is stored in the file /var/run/simplepm.pid):

#!/bin/sh
nohup /etc/init.d/simplepm > /dev/null 2>&1 &
echo $! > /var/run/simplepm.pid
exit 0

2. Add the PID file path (--file=/var/run/simplepm.pid, as written by the script above) to the monitoring commands in net-snmp.xml:

<attr>
  <name>saAmfCtRelPathAmStartCmd</name>
  <value>../../usr/local/sbin/amfpm --start --file=/var/run/simplepm.pid</value>
</attr>
<attr>
  <name>saAmfCtRelPathAmStopCmd</name>
  <value>../../usr/local/sbin/amfpm --stop --file=/var/run/simplepm.pid</value>
</attr>

Now you can kill the component and Amf will detect it. However, since saAmfSutDefSUFailover is set to true and saAmfSGAutoRepair is set to false, Amf will not restart the component. To see the component being restarted, please also make the following change in net-snmp.xml:

<attr>
  <name>saAmfSutDefSUFailover</name>
  <value>0</value>
</attr>
<attr>
  <name>saAmfSGAutoRepair</name>
  <value>1</value>
</attr>

Now, if you kill simplepm, Amf will restart it. Try it and let me know.

Thanks & Best Regards
-Nagendra | +91-9866424860
www.GetHighAvailability.com
Get High Availability Today!
NJ, USA: +1 508-422-7725 | Hyderabad, India: +91 798-992-5293

-----Original Message-----
From: Tocino García, José Tomás [ELIMCO] [mailto:elimco.jttoci...@navantia.es]
Sent: 20 February 2020 18:51
To: opensaf-users@lists.sourceforge.net
Subject: [users] Example of basic passive monitoring fails

Hello.

I'm trying to create a simple test to launch a non-sa-aware, non-proxied binary (called simplepm; it prints a message to syslog every second, indefinitely) with passive monitoring. Given that this binary is not a daemon, if I set saAmfCtRelPathInstantiateCmd to "simplepm" it fails after a while with a timeout, because the instantiation script does not exit.

Feb 20 12:04:01 proc0105 osafamfnd[3334]: NO Instantiation of 'safComp=simplepm,safSu=1,safSg=2N,safApp=simplepm' failed
Feb 20 12:04:01 proc0105 osafamfnd[3334]: NO Reason:'Script did not exit within time'

That's why I've tried creating a simple "instantiate.sh" script to send the binary to the background:

#!/bin/bash
nohup ./simplepm &
exit 0

But it does not seem to work at all. When I unlock and unlock-in the service unit, this appears:

Feb 20 12:34:13 proc0105 osafamfnd[5612]: NO Assigning 'safSi=1,safApp=simplepm' ACTIVE to 'safSu=1,safSg=2N,safApp=simplepm'
Feb 20 12:34:13 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State UNINSTANTIATED => INSTANTIATING
Feb 20 12:34:13 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State INSTANTIATING => INSTANTIATED
Feb 20 12:34:13 proc0105 osafamfnd[5612]: NO Assigned 'safSi=1,safApp=simplepm' ACTIVE to 'safSu=1,safSg=2N,safApp=simplepm'

Seemingly the process should be running but it's not (ps shows nothing), and after 20 seconds the following appears:

Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO saAmfSUFailover is true for 'safSu=1,safSg=2N,safApp=simplepm'
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO SU failover probation timer started (timeout: 1200000000000 ns, failovers: 0, max failovers: 2) after SU failover.
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO Performing failover of 'safSu=1,safSg=2N,safApp=simplepm' (SU failover count: 1)
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safComp=simplepm,safSu=1,safSg=2N,safApp=simplepm' recovery action escalated from 'componentFailover' to 'suFailover'
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safComp=simplepm,safSu=1,safSg=2N,safApp=simplepm' faulted due to 'activeMonitorFailed' : Recovery is 'suFailover'
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO Terminating components of 'safSu=1,safSg=2N,safApp=simplepm'(abruptly & unordered)
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State INSTANTIATED => TERMINATING
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State TERMINATING => TERMINATING
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO Terminated all components in 'safSu=1,safSg=2N,safApp=simplepm'
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO Informing director of sufailover
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State TERMINATING => UNINSTANTIATED

I don't know if it's related, but I'm using amfpm for the monitoring, with the following configuration:

<attr>
  <name>saAmfCtRelPathInstantiateCmd</name>
  <value>instantiate.sh</value>
</attr>
<attr>
  <name>saAmfCtRelPathCleanupCmd</name>
  <value>cleanup.sh</value>
</attr>
<attr>
  <name>saAmfCtRelPathTerminateCmd</name>
  <value>terminate.sh</value>
</attr>
<attr>
  <name>saAmfCtRelPathAmStartCmd</name>
  <value>../../usr/sbin/amfpm --start</value>
</attr>
<attr>
  <name>saAmfCtRelPathAmStopCmd</name>
  <value>../../usr/sbin/amfpm --stop</value>
</attr>

What am I missing? Thanks in advance.

Regards.

--
José Tomás Tocino García
Computer Engineer - System Infrastructure Team
Location: Edif.
Integración LBTS F110 / F105 / SCOMBA, Navantia Sistemas, SF
Email: elimco.jttoci...@navantia.es
Tel: 856 30 9163

_______________________________________________
Opensaf-users mailing list
Opensaf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-users