Tested the following scenarios on SLES SP2 (x86_64):
1. Did a kill and an admin restart on ckptnd; it came up successfully.
2. Added a sleep in the ckptnd terminate callback and did an admin restart.
Ckptnd came up successfully.

Ack with the following comments:
1.      Similar changes need to be done for Amfwd.
2.      While testing with a sleep in the terminate callback, I observed that
        the start clc-cli script waits at killproc for as long as Ckptnd keeps
        sleeping and returns only when Ckptnd exits. I hope that is expected;
        I mean, it does not terminate the process while it is still alive.
3.      In all 'stop' (cleanup) cases, 'rm -f $termfile' could be placed outside
        the RETVAL check.
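
For point 3, a rough sketch of what I mean, in shell. It is not taken from the
patch: fake_killproc and KILL_RC are hypothetical stand-ins for the LSB
killproc helper so that the snippet runs on its own, and the files live in a
temporary directory rather than in $pkgpiddir and $lockdir.

```shell
#!/bin/sh
# Sketch only: scratch paths and a stubbed killproc, not the real init script.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
lockfile=$workdir/osaf-ckptnd.lock
termfile=$workdir/osafckptnd_termstate

# Hypothetical stub for killproc; KILL_RC simulates its exit status.
fake_killproc() { return "${KILL_RC:-0}"; }

stop() {
	fake_killproc
	RETVAL=$?
	# Placed outside the RETVAL check: the marker is removed on every
	# stop/cleanup attempt, whether or not the kill succeeded.
	rm -f "$termfile"
	if [ $RETVAL -eq 0 ] || [ $RETVAL -eq 7 ]; then
		rm -f "$lockfile"
		RETVAL=0
	fi
	return $RETVAL
}

# Simulate a failed kill (status 1) with a marker and a lock file present.
touch "$termfile" "$lockfile"
KILL_RC=1
stop_rc=0
stop || stop_rc=$?
echo "stop returned $stop_rc"
```

With the simulated failure, the lock file is kept but the termstate marker is
gone, so a later start would presumably not see a stale marker.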

Thanks
-Nagu

> -----Original Message-----
> From: Mathivanan Naickan Palanivelu
> Sent: 06 May 2015 20:17
> To: [email protected]; Ramesh Babu Betham;
> [email protected]; Nagendra Kumar; Praveen Malviya
> Cc: [email protected]
> Subject: [PATCH 1 of 1] osaf: During adminrestart of node directors, before
> re-instantiating kill them [#1326]
> 
>  osaf/services/saf/cpsv/cpnd/cpnd_amf.c              |  14 +++++++++++++-
>  osaf/services/saf/cpsv/cpnd/scripts/osaf-ckptnd.in  |  13 +++++++++++++
>  osaf/services/saf/glsv/glnd/glnd_amf.c              |  14 ++++++++++++++
>  osaf/services/saf/glsv/glnd/scripts/osaf-lcknd.in   |  13 +++++++++++++
>  osaf/services/saf/immsv/immnd/immnd_amf.c           |  11 +++++++++++
>  osaf/services/saf/immsv/immnd/scripts/osaf-immnd.in |  18 ++++++++++++++++++
>  osaf/services/saf/mqsv/mqnd/mqnd_amf.c              |  11 +++++++++++
>  osaf/services/saf/mqsv/mqnd/scripts/osaf-msgnd.in   |  13 +++++++++++++
>  osaf/services/saf/smfsv/smfnd/scripts/osaf-smfnd.in |  13 +++++++++++++
>  osaf/services/saf/smfsv/smfnd/smfnd_amf.c           |  11 +++++++++++
>  10 files changed, 130 insertions(+), 1 deletions(-)
> 
> 
> The command $ amf-adm restart <DN name> is one way of administratively
> restarting an AMF component.
> As part of this admin operation, AMF sends the component terminate callback
> to the PI components. It is up to the component to release all its resources
> and report the status of its self-termination to AMF before (typically)
> exiting the process itself.
> After receiving the response from the component, AMF invokes the
> instantiation script of the component. At this point, the previously running
> instance of the component's process may not yet have exited. Attempting a new
> instantiation while a daemon/process is already running can cause the
> instantiation script to return failure.
> This patch creates a temporary term_state_file from inside the component
> terminate callback of the node directors.
> In the instantiation scripts, a check is done to distinguish a fresh
> instantiation from an instantiation after a termination.
> If the term_state_file exists, it is an instantiation after termination.
> If so, just attempt to kill the process again (using killproc) before calling
> start_daemon.
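
[Inline sketch of the flow described above, in shell; it is not part of the
patch. fake_killproc and fake_start_daemon are hypothetical stand-ins for the
LSB killproc and start_daemon helpers, and the marker lives in a temporary
directory so that the snippet is self-contained.]

```shell
#!/bin/sh
# Sketch only: scratch marker path and stubbed LSB helpers.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
termfile=$workdir/osafckptnd_termstate
killed=0

# Hypothetical stubs: count kill attempts, and always start successfully.
fake_killproc() { killed=$((killed + 1)); }
fake_start_daemon() { return 0; }

start() {
	# Marker present: this is an instantiation after a termination,
	# so try to kill any leftover process before starting a new one.
	[ -e "$termfile" ] && fake_killproc
	fake_start_daemon
	RETVAL=$?
	# Clear the marker only once the daemon has started.
	[ $RETVAL -eq 0 ] && rm -f "$termfile"
	return $RETVAL
}

start                 # fresh instantiation: no marker, no kill attempt
touch "$termfile"     # stands in for the terminate callback creating the file
start                 # instantiation after termination: kill, then start
echo "kill attempts: $killed"
```

Only the second start sees the marker, so only that one attempts the kill.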
> 
> Note: There has been mention of using the start_daemon -f option, which will
> create another copy of the daemon if the previous daemon is still running.
> Using this option may not be ideal for us, as it can create inconsistency
> between the two daemons when they use any resources, and there is no proof
> or documentation of start_daemon -f working successfully. This is even more
> significant given that some distros are really slow in becoming LSB
> compliant, particularly with respect to start_daemon and the like.
> 
> diff --git a/osaf/services/saf/cpsv/cpnd/cpnd_amf.c b/osaf/services/saf/cpsv/cpnd/cpnd_amf.c
> --- a/osaf/services/saf/cpsv/cpnd/cpnd_amf.c
> +++ b/osaf/services/saf/cpsv/cpnd/cpnd_amf.c
> @@ -35,7 +35,9 @@
> 
> *****************************************************************
> *************/
> 
>  #include "cpnd.h"
> +#include "configmake.h"
> 
> +static const char *term_state_file = PKGPIDDIR "/osafckptnd_termstate";
> 
> /****************************************************************
> ************
>   * Name          : cpnd_saf_health_chk_callback
>   *
> @@ -232,13 +234,23 @@ void cpnd_amf_comp_terminate_callback(Sa
>  {
>       CPND_CB *cb = NULL;
>       SaAisErrorT saErr = SA_AIS_OK;
> +     int fd;
> +     TRACE_ENTER();
> 
> -     TRACE_ENTER();
>       cb = ncshm_take_hdl(NCS_SERVICE_ID_CPND, gl_cpnd_cb_hdl);
>       if (cb == NULL) {
>               LOG_ER("cpnd cb take handle failed in amf term callback");
>               return;
>       }
> +
> +     fd = open(term_state_file, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
> +
> +     if (fd >=0)
> +             (void)close(fd);
> +     else
> +             LOG_NO("cannot create termstate file %s: %s",
> +                                     term_state_file, strerror(errno));
> +
>       saAmfResponse(cb->amf_hdl, invocation, saErr);
>       ncshm_give_hdl(gl_cpnd_cb_hdl);
>       LOG_NO("Received AMF component terminate callback, exiting");
> diff --git a/osaf/services/saf/cpsv/cpnd/scripts/osaf-ckptnd.in b/osaf/services/saf/cpsv/cpnd/scripts/osaf-ckptnd.in
> --- a/osaf/services/saf/cpsv/cpnd/scripts/osaf-ckptnd.in
> +++ b/osaf/services/saf/cpsv/cpnd/scripts/osaf-ckptnd.in
> @@ -30,10 +30,21 @@ fi
>  binary=$pkglibdir/$prog
>  pidfile=$pkgpiddir/$prog.pid
>  lockfile=$lockdir/$initscript
> +termfile=$pkgpiddir/$prog"_termstate"
> 
>  RETVAL=0
> 
>  start() {
> +     #If the term file exists, it means instantiation is
> +     #attempted after a termination For eg:- during administrative
> +     #restart of a component. In this case, first try to kill
> +     #the component since it might be seen as still running while exiting
> +     #via the termination callback or termination scripts(in case of NPI).
> +     #Note: start_daemon -f may also be used to create another copy of the daemon,
> +     #but the behaviour of -f option has not been tested yet!
> +
> +     [ -e $termfile ] && killproc -p $pidfile $binary
> +
>       export LD_LIBRARY_PATH=$pkglibdir:$LD_LIBRARY_PATH
>       [ -x $binary ] || exit 5
>       echo -n "Starting $prog: "
> @@ -41,6 +52,7 @@ start() {
>       RETVAL=$?
>       if [ $RETVAL -eq 0 ]; then
>               touch $lockfile
> +             rm -f $termfile
>               log_success_msg
>       else
>               log_failure_msg
> @@ -55,6 +67,7 @@ stop() {
>       if [ $RETVAL -eq 0 ] || [ $RETVAL -eq 7 ]; then
>               rm -f $lockfile
>               log_success_msg
> +             rm -f $termfile
>               RETVAL=0
>       else
>               log_failure_msg
> diff --git a/osaf/services/saf/glsv/glnd/glnd_amf.c b/osaf/services/saf/glsv/glnd/glnd_amf.c
> --- a/osaf/services/saf/glsv/glnd/glnd_amf.c
> +++ b/osaf/services/saf/glsv/glnd/glnd_amf.c
> @@ -35,6 +35,8 @@
> 
> *****************************************************************
> *************/
> 
>  #include "glnd.h"
> +#include "configmake.h"
> +
>  void glnd_amf_comp_terminate_callback(SaInvocationT invocation, const SaNameT *compName);
>  void glnd_saf_health_chk_callback(SaInvocationT invocation,
>                                 const SaNameT *compName, const SaAmfHealthcheckKeyT *checkType);
> @@ -45,6 +47,7 @@ void glnd_amf_CSI_set_callback(SaInvocat
>  void glnd_amf_csi_rmv_callback(SaInvocationT invocation,
>                              const SaNameT *compName, const SaNameT *csiName, SaAmfCSIFlagsT csiFlags);
> 
> +static const char *term_state_file = PKGPIDDIR "/osaflcknd_termstate";
> 
> /****************************************************************
> ************
>   * Name          : glnd_saf_health_chk_callback
>   *
> @@ -114,17 +117,28 @@ void glnd_amf_comp_terminate_callback(Sa
>       GLND_CB *glnd_cb;
>       SaAisErrorT error = SA_AIS_OK;
>       TRACE_ENTER2("Component Name: %s", compName->value);
> +     int fd;
> 
>       /* take the handle */
>       glnd_cb = (GLND_CB *)m_GLND_TAKE_GLND_CB;
>       if (!glnd_cb) {
>               LOG_ER("GLND cb take handle failed");
>       } else {
> +
> +             fd = open(term_state_file, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
> +
> +             if (fd >=0)
> +                     (void)close(fd);
> +             else
> +                     LOG_NO("cannot create termstate file %s: %s",
> +                                     term_state_file, strerror(errno));
> +
>               if (saAmfResponse(glnd_cb->amf_hdl, invocation, error) != SA_AIS_OK)
>                       LOG_ER("GLND amf response failed");
>               /* giveup the handle */
>               m_GLND_GIVEUP_GLND_CB;
>       }
> +
>       LOG_NO("Received AMF component terminate callback, exiting");
>       TRACE_LEAVE();
> 
> diff --git a/osaf/services/saf/glsv/glnd/scripts/osaf-lcknd.in b/osaf/services/saf/glsv/glnd/scripts/osaf-lcknd.in
> --- a/osaf/services/saf/glsv/glnd/scripts/osaf-lcknd.in
> +++ b/osaf/services/saf/glsv/glnd/scripts/osaf-lcknd.in
> @@ -30,10 +30,21 @@ fi
>  binary=$pkglibdir/$prog
>  pidfile=$pkgpiddir/$prog.pid
>  lockfile=$lockdir/$initscript
> +termfile=$pkgpiddir/$prog"_termstate"
> 
>  RETVAL=0
> 
>  start() {
> +     #If the term file exists, it means instantiation is
> +     #attempted after a termination For eg:- during administrative
> +     #restart of a component. In this case, first try to kill
> +     #the component since it might be seen as still running while exiting
> +     #via the termination callback or termination scripts(in case of NPI).
> +     #Note: start_daemon -f may also be used to create another copy of the daemon,
> +     #but the behaviour of -f option has not been tested yet!
> +
> +     [ -e $termfile ] && killproc -p $pidfile $binary
> +
>       export LD_LIBRARY_PATH=$pkglibdir:$LD_LIBRARY_PATH
>       [ -x $binary ] || exit 5
>       echo -n "Starting $prog: "
> @@ -41,6 +52,7 @@ start() {
>       RETVAL=$?
>       if [ $RETVAL -eq 0 ]; then
>               touch $lockfile
> +             rm -f $termfile
>               log_success_msg
>       else
>               log_failure_msg
> @@ -55,6 +67,7 @@ stop() {
>       if [ $RETVAL -eq 0 ] || [ $RETVAL -eq 7 ]; then
>               rm -f $lockfile
>               log_success_msg
> +             rm -f $termfile
>               RETVAL=0
>       else
>               log_failure_msg
> diff --git a/osaf/services/saf/immsv/immnd/immnd_amf.c b/osaf/services/saf/immsv/immnd/immnd_amf.c
> --- a/osaf/services/saf/immsv/immnd/immnd_amf.c
> +++ b/osaf/services/saf/immsv/immnd/immnd_amf.c
> @@ -18,7 +18,9 @@
>  #include "immnd.h"
>  #include <nid_start_util.h>
>  #include "osaf_extended_name.h"
> +#include "configmake.h"
> 
> +static const char *term_state_file = PKGPIDDIR "/osafimmnd_termstate";
> 
> /****************************************************************
> ************
>   * Name          : immnd_saf_health_chk_callback
>   *
> @@ -73,12 +75,21 @@ static void immnd_saf_health_chk_callbac
>  static void immnd_amf_comp_terminate_callback(SaInvocationT invocation, const SaNameT *compName)
>  {
>       TRACE_ENTER();
> +     int fd;
> 
>       if (immnd_cb->pbePid > 0)
>               kill(immnd_cb->pbePid, SIGTERM);
>       if (immnd_cb->syncPid > 0)
>               kill(immnd_cb->syncPid, SIGTERM);
> 
> +     fd = open(term_state_file, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
> +
> +     if (fd >=0)
> +             (void)close(fd);
> +     else
> +             LOG_NO("cannot create termstate file %s: %s",
> +                                     term_state_file, strerror(errno));
> +
>       LOG_NO("Received AMF component terminate callback, exiting");
>       saAmfResponse(immnd_cb->amf_hdl, invocation, SA_AIS_OK);
> 
> diff --git a/osaf/services/saf/immsv/immnd/scripts/osaf-immnd.in b/osaf/services/saf/immsv/immnd/scripts/osaf-immnd.in
> --- a/osaf/services/saf/immsv/immnd/scripts/osaf-immnd.in
> +++ b/osaf/services/saf/immsv/immnd/scripts/osaf-immnd.in
> @@ -31,12 +31,17 @@ fi
>  binary=$pkglibdir/$prog
>  pidfile=$pkgpiddir/$prog.pid
>  lockfile=$lockdir/$initscript
> +termfile=$pkgpiddir/$prog"_termstate"
> 
>  RETVAL=0
>  NIDSERV="IMMND"
>  COMPNAMEFILE=$pkglocalstatedir/immnd_comp_name
> 
>  start() {
> +     # remove any termination file created previously via
> +     # AMF component termination callback
> +     rm -f $termfile
> +
>       export LD_LIBRARY_PATH=$pkglibdir:$LD_LIBRARY_PATH
>       [ -p $NIDFIFO ] || exit 1
>          if [ ! -x $binary ]; then
> @@ -58,6 +63,16 @@ start() {
>  }
> 
>  instantiate() {
> +     #If the term file exists, it means instantiation is
> +     #attempted after a termination For eg:- during administrative
> +     #restart of a component. In this case, first try to kill
> +     #the component since it might be seen as still running while exiting
> +     #via the termination callback or termination scripts(in case of NPI).
> +     #Note: start_daemon -f may also be used to create another copy of the daemon,
> +     #but the behaviour of -f option has not been tested yet!
> +
> +     [ -e $termfile ] && killproc -p $pidfile $binary
> +
>       echo -n "AMF Instantiating $prog: "
>       echo $SA_AMF_COMPONENT_NAME > $COMPNAMEFILE
>       pidofproc -p $pidfile $binary
> @@ -71,9 +86,11 @@ instantiate() {
>       fi
>       if [ $RETVAL -eq 0 ]; then
>               log_success_msg
> +             rm -f $termfile
>       else
>               log_failure_msg
>       fi
> +
>       return $RETVAL
>  }
> 
> @@ -86,6 +103,7 @@ stop() {
>       if [ $RETVAL -eq 0 ] || [ $RETVAL -eq 7 ]; then
>               rm -f $lockfile
>               rm -f $COMPNAMEFILE
> +             rm -f $termfile
>               log_success_msg
>               RETVAL=0
>       else
> diff --git a/osaf/services/saf/mqsv/mqnd/mqnd_amf.c b/osaf/services/saf/mqsv/mqnd/mqnd_amf.c
> --- a/osaf/services/saf/mqsv/mqnd/mqnd_amf.c
> +++ b/osaf/services/saf/mqsv/mqnd/mqnd_amf.c
> @@ -35,6 +35,7 @@
> 
> *****************************************************************
> *************/
> 
>  #include "mqnd.h"
> +#include "configmake.h"
> 
>  static void mqnd_saf_health_chk_callback(SaInvocationT invocation, const SaNameT *compName,
>                                        SaAmfHealthcheckKeyT *checkType);
> @@ -47,6 +48,7 @@ static void mqnd_amf_CSI_set_callback(Sa
>                                     const SaNameT *compName,
>                                     SaAmfHAStateT haState, SaAmfCSIDescriptorT csiDescriptor);
> 
> +static const char *term_state_file = PKGPIDDIR "/osafmsgnd_termstate";
> 
> /****************************************************************
> ************
>   * Name          : mqnd_saf_health_chk_callback
>   *
> @@ -227,6 +229,7 @@ static void mqnd_amf_comp_terminate_call
>       TRACE_ENTER();
> 
>       uint32_t cb_hdl = m_MQND_GET_HDL();
> +     int fd;
> 
>       /* Get the CB from the handle */
>       mqnd_cb = ncshm_take_hdl(NCS_SERVICE_ID_MQND, cb_hdl);
> @@ -236,6 +239,14 @@ static void mqnd_amf_comp_terminate_call
>               return;
>       }
> 
> +     fd = open(term_state_file, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
> +
> +     if (fd >=0)
> +             (void)close(fd);
> +     else
> +             LOG_NO("cannot create termstate file %s: %s",
> +                                     term_state_file, strerror(errno));
> +
>       saAmfResponse(mqnd_cb->amf_hdl, invocation, saErr);
>       LOG_ER("Amf Terminate Callback called");
> 
> diff --git a/osaf/services/saf/mqsv/mqnd/scripts/osaf-msgnd.in b/osaf/services/saf/mqsv/mqnd/scripts/osaf-msgnd.in
> --- a/osaf/services/saf/mqsv/mqnd/scripts/osaf-msgnd.in
> +++ b/osaf/services/saf/mqsv/mqnd/scripts/osaf-msgnd.in
> @@ -30,10 +30,21 @@ fi
>  binary=$pkglibdir/$prog
>  pidfile=$pkgpiddir/$prog.pid
>  lockfile=$lockdir/$initscript
> +termfile=$pkgpiddir/$prog"_termstate"
> 
>  RETVAL=0
> 
>  start() {
> +     #If the term file exists, it means instantiation is
> +     #attempted after a termination For eg:- during administrative
> +     #restart of a component. In this case, first try to kill
> +     #the component since it might be seen as still running while exiting
> +     #via the termination callback or termination scripts(in case of NPI).
> +     #Note: start_daemon -f may also be used to create another copy of the daemon,
> +     #but the behaviour of -f option has not been tested yet!
> +
> +     [ -e $termfile ] && killproc -p $pidfile $binary
> +
>       export LD_LIBRARY_PATH=$pkglibdir:$LD_LIBRARY_PATH
>       [ -x $binary ] || exit 5
>       echo -n "Starting $prog: "
> @@ -42,6 +53,7 @@ start() {
>       if [ $RETVAL -eq 0 ]; then
>               touch $lockfile
>               log_success_msg
> +             rm -f $termfile
>       else
>               log_failure_msg
>       fi
> @@ -55,6 +67,7 @@ stop() {
>       if [ $RETVAL -eq 0 ] || [ $RETVAL -eq 7 ]; then
>               rm -f $lockfile
>               log_success_msg
> +             rm -f $termfile
>               RETVAL=0
>       else
>               log_failure_msg
> diff --git a/osaf/services/saf/smfsv/smfnd/scripts/osaf-smfnd.in b/osaf/services/saf/smfsv/smfnd/scripts/osaf-smfnd.in
> --- a/osaf/services/saf/smfsv/smfnd/scripts/osaf-smfnd.in
> +++ b/osaf/services/saf/smfsv/smfnd/scripts/osaf-smfnd.in
> @@ -30,10 +30,21 @@ fi
>  binary=$pkglibdir/$prog
>  pidfile=$pkgpiddir/$prog.pid
>  lockfile=$lockdir/$initscript
> +termfile=$pkgpiddir/$prog"_termstate"
> 
>  RETVAL=0
> 
>  start() {
> +     #If the term file exists, it means instantiation is
> +     #attempted after a termination For eg:- during administrative
> +     #restart of a component. In this case, first try to kill
> +     #the component since it might be seen as still running while exiting
> +     #via the termination callback or termination scripts(in case of NPI).
> +     #Note: start_daemon -f may also be used to create another copy of the daemon,
> +     #but the behaviour of -f option has not been tested yet!
> +
> +     [ -e $termfile ] && killproc -p $pidfile $binary
> +
>       export LD_LIBRARY_PATH=$pkglibdir:$LD_LIBRARY_PATH
>       [ -x $binary ] || exit 5
>       echo -n "Starting $prog: "
> @@ -42,6 +53,7 @@ start() {
>       if [ $RETVAL -eq 0 ]; then
>               touch $lockfile
>               log_success_msg
> +             rm -f $termfile
>       else
>               log_failure_msg
>       fi
> @@ -55,6 +67,7 @@ stop() {
>       if [ $RETVAL -eq 0 ] || [ $RETVAL -eq 7 ]; then
>               rm -f $lockfile
>               log_success_msg
> +             rm -f $termfile
>               RETVAL=0
>       else
>               log_failure_msg
> diff --git a/osaf/services/saf/smfsv/smfnd/smfnd_amf.c b/osaf/services/saf/smfsv/smfnd/smfnd_amf.c
> --- a/osaf/services/saf/smfsv/smfnd/smfnd_amf.c
> +++ b/osaf/services/saf/smfsv/smfnd/smfnd_amf.c
> @@ -20,7 +20,9 @@
>   */
> 
>  #include "smfnd.h"
> +#include "configmake.h"
> 
> +static const char *term_state_file = PKGPIDDIR "/osafsmfnd_termstate";
> 
> /****************************************************************
> ************
>   * Name          : amf_health_chk_callback
>   *
> @@ -107,6 +109,15 @@ static void amf_csi_set_callback(SaInvoc
>  static void amf_comp_terminate_callback(SaInvocationT invocation, const SaNameT * compName)
>  {
>       TRACE_ENTER();
> +     int fd;
> +
> +     fd = open(term_state_file, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
> +
> +     if (fd >=0)
> +             (void)close(fd);
> +     else
> +             LOG_NO("cannot create termstate file %s: %s",
> +                                     term_state_file, strerror(errno));
> 
>       saAmfResponse(smfnd_cb->amf_hdl, invocation, SA_AIS_OK);
> 
