Hi Tocino García, José Tomás,

I was able to reproduce the issue you are facing.
There are two problems:
1. The binary is not started because the script does not call it by its absolute path.
2. The passive monitoring binary needs to know the PID file path.

Please correct the following two things and everything should work fine.
1. Change your script as shown below (it stores the PID in the file
/var/run/simplepm.pid):

#!/bin/sh
nohup /etc/init.d/simplepm > /dev/null 2>&1 & echo $! > /var/run/simplepm.pid
exit 0

2. Add the PID file path (--file=/var/run/simplepm.pid, the file written by
the script above) to the monitoring commands in net-snmp.xml:
                <attr>
                        <name>saAmfCtRelPathAmStartCmd</name>
                        <value>../../usr/local/sbin/amfpm --start --file=/var/run/simplepm.pid</value>
                </attr>
                <attr>
                        <name>saAmfCtRelPathAmStopCmd</name>
                        <value>../../usr/local/sbin/amfpm --stop --file=/var/run/simplepm.pid</value>
                </attr>
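
As a rough sketch of the pattern behind these two steps (not something shipped with OpenSAF): start the process in the background, write its PID to a file, and check that the PID in that file belongs to a live process, since that file is what the monitoring command is pointed at. The helper names start_with_pidfile and pidfile_alive, the temporary PID file, and the stand-in command "sleep 30" are assumptions for illustration only; on the node you would use the real binary and /var/run/simplepm.pid.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenSAF): run a command in the
# background and record its PID in the given file.
start_with_pidfile() {
    pidfile=$1
    shift
    # Detach so that the calling script can return immediately.
    nohup "$@" > /dev/null 2>&1 &
    echo $! > "$pidfile"
}

# Hypothetical helper: succeeds if the PID file is non-empty and the
# PID stored in it belongs to a live process.
pidfile_alive() {
    [ -s "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}

# Demo with a stand-in command and a temporary PID file (assumptions).
PIDFILE=$(mktemp)
start_with_pidfile "$PIDFILE" sleep 30
pidfile_alive "$PIDFILE" && echo "running with PID $(cat "$PIDFILE")"
```

Running the same pidfile_alive check by hand against /var/run/simplepm.pid right after instantiation is a quick way to confirm that the instantiate script really left a process behind.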


Now, if you kill the component, Amf will detect it. However, because
saAmfSutDefSUFailover is set to true and saAmfSGAutoRepair is set to
false, Amf will not restart the component. To see the component
restarted, please also make the following change in net-snmp.xml:

                <attr>
                        <name>saAmfSutDefSUFailover</name>
                        <value>0</value>
                </attr>

                <attr>
                        <name>saAmfSGAutoRepair</name>
                        <value>1</value>
                </attr>

Now, if you kill simplepm, then Amf will restart it.
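
If you want to check the restart from a shell rather than by watching syslog, something like the following sketch could help. It assumes that the instantiate script runs again on restart and rewrites the PID file with a new PID; wait_for_new_pid is a hypothetical helper written for this mail, not an OpenSAF tool.

```shell
#!/bin/sh
# Hypothetical helper: poll once per second until the PID file names a
# live process whose PID differs from the old one.
# Arguments: old PID, PID file path, number of tries (default 30).
wait_for_new_pid() {
    old_pid=$1
    pidfile=$2
    tries=${3:-30}
    while [ "$tries" -gt 0 ]; do
        new_pid=$(cat "$pidfile" 2>/dev/null)
        if [ -n "$new_pid" ] && [ "$new_pid" != "$old_pid" ] \
            && kill -0 "$new_pid" 2>/dev/null; then
            echo "$new_pid"
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Intended use on the node (as root), with the PID file path from the script:
#   OLD=$(cat /var/run/simplepm.pid)
#   kill -9 "$OLD"
#   wait_for_new_pid "$OLD" /var/run/simplepm.pid 30 && echo "simplepm restarted"
```

The function prints the new PID on success and returns non-zero if no new live PID shows up within the given number of tries.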

Please try this and let me know.

Thanks & Best Regards
-Nagendra | +91-9866424860
www.GetHighAvailability.com 
Get High Availability Today!
NJ, USA: +1 508-422-7725    |    Hyderabad, India: +91 798-992-5293 


-----Original Message-----
From: Tocino García, José Tomás [ELIMCO]
[mailto:elimco.jttoci...@navantia.es] 
Sent: 20 February 2020 18:51
To: opensaf-users@lists.sourceforge.net
Subject: [users] Example of basic passive monitoring fails

Hello.

I'm trying to create a simple test to launch a non-sa-aware non-proxied
binary (called simplepm, it prints a message every second on syslog
indefinitely) with passive monitoring. Given this binary is not a daemon, if
I set the saAmfCtRelPathInstantiateCmd to "simplepm" it will fail after a
while with a timeout because the instantiation script does not exit.

Feb 20 12:04:01 proc0105 osafamfnd[3334]: NO Instantiation of 'safComp=simplepm,safSu=1,safSg=2N,safApp=simplepm' failed
Feb 20 12:04:01 proc0105 osafamfnd[3334]: NO Reason:'Script did not exit within time'

That's why I've tried creating a simple "instantiate.sh" script to send the
binary to the background:

#!/bin/bash

nohup ./simplepm &
exit 0

But it does not seem to work at all. When I unlock and unlock-in the service
unit, this appears:

Feb 20 12:34:13 proc0105 osafamfnd[5612]: NO Assigning 'safSi=1,safApp=simplepm' ACTIVE to 'safSu=1,safSg=2N,safApp=simplepm'
Feb 20 12:34:13 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State UNINSTANTIATED => INSTANTIATING
Feb 20 12:34:13 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State INSTANTIATING => INSTANTIATED
Feb 20 12:34:13 proc0105 osafamfnd[5612]: NO Assigned 'safSi=1,safApp=simplepm' ACTIVE to 'safSu=1,safSg=2N,safApp=simplepm'

Seemingly the process should be running, but it is not (ps shows nothing), and
after 20 seconds the following appears:

Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO saAmfSUFailover is true for 'safSu=1,safSg=2N,safApp=simplepm'
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO SU failover probation timer started (timeout: 1200000000000 ns, failovers: 0, max failovers: 2) after SU failover.
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO Performing failover of 'safSu=1,safSg=2N,safApp=simplepm' (SU failover count: 1)
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safComp=simplepm,safSu=1,safSg=2N,safApp=simplepm' recovery action escalated from 'componentFailover' to 'suFailover'
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safComp=simplepm,safSu=1,safSg=2N,safApp=simplepm' faulted due to 'activeMonitorFailed' : Recovery is 'suFailover'
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO Terminating components of 'safSu=1,safSg=2N,safApp=simplepm'(abruptly & unordered)
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State INSTANTIATED => TERMINATING
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State TERMINATING => TERMINATING
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO Terminated all components in 'safSu=1,safSg=2N,safApp=simplepm'
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO Informing director of sufailover
Feb 20 12:34:33 proc0105 osafamfnd[5612]: NO 'safSu=1,safSg=2N,safApp=simplepm' Presence State TERMINATING => UNINSTANTIATED

I don't know if it's related, but I'm using amfpm for the monitoring, with
the following configuration:

        <attr>
            <name>saAmfCtRelPathInstantiateCmd</name>
            <value>instantiate.sh</value>
        </attr>
        <attr>
            <name>saAmfCtRelPathCleanupCmd</name>
            <value>cleanup.sh</value>
        </attr>
        <attr>
            <name>saAmfCtRelPathTerminateCmd</name>
            <value>terminate.sh</value>
        </attr>
        <attr>
            <name>saAmfCtRelPathAmStartCmd</name>
            <value>../../usr/sbin/amfpm --start</value>
        </attr>
        <attr>
            <name>saAmfCtRelPathAmStopCmd</name>
            <value>../../usr/sbin/amfpm --stop</value>
        </attr>

What am I missing?

Thanks in advance.
Regards.

--
José Tomás Tocino García
Computer Engineer - System Infrastructure Team

Location: Edif. Integración LBTS F110 / F105 / SCOMBA, Navantia Sistemas, SF
Email: elimco.jttoci...@navantia.es
Phone: 856 30 9163

_______________________________________________
Opensaf-users mailing list
Opensaf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-users
