Hi Hans N,
                Thanks for review. 
As explained in the ticket and as I explained in the previous emails and in the 
review request, that this patch solves chicken-egg problem of Immnd and Amfnd.

I am not able to simulate the problem mentioned in the ticket by Bertil because 
that was rare case and was never explained how to reproduce.

Since there are many places Amfnd is using saImmOmSearchInitialize_2(), so 
keeping 180 seconds sleep will definitely cause the problem, which you have 
seen, but will not be seen if the same thing happens during Immnd goes down and 
Amfnd wants to read information in SU Instantiation. So, the patch solves the 
problem even if you keep sleep of 180 sec during SU instantiation.

So, the problem of reading from Immnd hangs issue will be there in many flows, 
which needs further action of modifying the code,
But the patch sent for review is the first thing to start.
So, after this patch, we need to work on the following work items in this 
ticket:

1.      Pushing Imm reading information in other flows of the code in the 
separate thread created in the floated patch: This is not going to be easy as 
Amfnd execution and Reading Imm goes together and segregating them into two 
flows are not easy.
2.      Handling of BAD_HANDLE in all the places as done in Amfd.
3.      Handling of TIMEOUT in all the places as done in Amfd.

So, these items will come and we need to work on these items further. 

The easy steps to get chicken-egg issue:
1. Start controllers, upload Amf demo.
2. Now, add a new component in the SU. This is done because a new component 
will be read from avnd_evt_avd_su_pres_evh-> avnd_comp_config_get_su-> 
avnd_comp_create(). The old components(uploaded via immcfg -f) are already read 
in avnd_evt_avd_reg_su_evh().
1.     Keep sleep of 3 sec in Amfd in unlock_instantiation().
2.    Issue unlock-in on Demo SU.
3.    Keep 5 sec sleep in stop function of Immnd.
4.   Before sleep expiry, kill Immnd.
5.    After sleep, Amfd sends a request to Amfnd to instantiate SU, but it goes 
and call Imm api to read configuration.
        Since Immnd is not there, so it hangs in Initialize().
6. When Immnd comes up  after sleep and responds CLC script, it waits for Amfnd 
to call instantiate CLC, which is not possible as Amfnd is hung in Imm api.

Since this patch has a separate thread now to read Imm (implemented SU 
Instantiation), so would be first step to go.

Thanks
-Nagu
> -----Original Message-----
> From: Hans Nordebäck [mailto:hans.nordeb...@ericsson.com]
> Sent: 17 March 2016 21:29
> To: Nagendra Kumar; Praveen Malviya; minh.c...@dektech.com.au;
> gary....@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net; Hans Nordebäck
> Subject: Re: [PATCH 1 of 1] amfnd: process su instantiation in a separate
> thread [#517]
> 
> Hi Nagu,
> 
> I did a quick test with a patch (attached in this mail) that provokes the 
> fault
> reported in #517. I did a quick test with the patch for review and the
> "provoking" patch, the fault seems to remain.
> 
> I tested according to below:
> - Apply both patches.
> - Start the system in UML.
> - Check traces on SC-2, tail -f root/var/SC-2/log/messages
> - When the system is up load the AmfDemo configuration, immcfg -f immcfg
> -f /hostfs/AppConfig-2N.xml
> 
> 
> Mar 17 16:55:12 SC-2 local0.err osafamfnd[467]: ER XXXX: Provoke ticket
> 517 fault, safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
> Mar 17 16:55:33 SC-2 user.notice kernel: random: nonblocking pool is
> initialized
> Mar 17 16:56:09 SC-2 local0.err osafamfwd[631]: TIMEOUT receiving AMF
> health check request, generating core for amfnd
> Mar 17 16:56:09 SC-2 local0.err osafamfwd[631]: Last received
> healthcheck cnt=2
> 
> /Thanks HansN
> 
> 
> On 03/15/2016 09:26 AM, nagendr...@oracle.com wrote:
> >   osaf/services/saf/amf/amfnd/Makefile.am    |    3 +-
> >   osaf/services/saf/amf/amfnd/evt.cc         |   14 ++-
> >   osaf/services/saf/amf/amfnd/imm.cc         |  137
> +++++++++++++++++++++++++++++
> >   osaf/services/saf/amf/amfnd/include/avnd.h |    6 +
> >   osaf/services/saf/amf/amfnd/main.cc        |    3 +-
> >   5 files changed, 158 insertions(+), 5 deletions(-)
> >
> >
> > If immnd is down and Amf decides to instantiate some SU, then there
> > is a deadlock between Amfnd and Imm. Amf calls Imm api to read
> > imm attributes during SU instantiation and stuck because Imm
> > is down. Imm waits for Amfnd to instantiate/restart it.
> > So, both waits for each other. This is a deadlock.
> > So, the fix bypasses main thread for SU instantiation and does
> > process of SU instantiation in a separate thread.
> > So, when Imm is down, its instantiation/restart happens in the
> > main thread. If any SU instantiation request comes, then Amfnd
> > routes that request in a separate thread, that threads stuck in Imm
> > api call untill Imm is not restarted and responds to Imm api call.
> > This obviates the deadlock between Amfnd and Imm.
> > So, this fix has advantage for not having any mutex between threads.
> > The reason is that SU is yet to instantiate, so it is still
> > immune to faults of any components and hence no need of having any
> > mutex among the threads.
> >
> > diff --git a/osaf/services/saf/amf/amfnd/Makefile.am
> b/osaf/services/saf/amf/amfnd/Makefile.am
> > --- a/osaf/services/saf/amf/amfnd/Makefile.am
> > +++ b/osaf/services/saf/amf/amfnd/Makefile.am
> > @@ -60,7 +60,8 @@ osafamfnd_SOURCES = \
> >     term.cc \
> >     tmr.cc \
> >     util.cc \
> > -   verify.cc
> > +   verify.cc \
> > +   imm.cc
> >
> >   osafamfnd_LDADD = \
> >     $(top_builddir)/osaf/tools/safimm/src/libimmutil.la \
> > diff --git a/osaf/services/saf/amf/amfnd/evt.cc
> b/osaf/services/saf/amf/amfnd/evt.cc
> > --- a/osaf/services/saf/amf/amfnd/evt.cc
> > +++ b/osaf/services/saf/amf/amfnd/evt.cc
> > @@ -319,9 +319,17 @@ void avnd_evt_destroy(AVND_EVT *evt)
> >
> **************************************************************
> ****************/
> >   uint32_t avnd_evt_send(AVND_CB *cb, AVND_EVT *evt)
> >   {
> > -   uint32_t rc = m_NCS_IPC_SEND(&cb->mbx, evt, evt->priority);
> > -   if (NCSCC_RC_SUCCESS != rc)
> > -           LOG_CR("AvND send event to mailbox failed, type = %u",
> evt->type);
> > +   uint32_t rc;
> > +   if (evt->type == AVND_EVT_AVD_SU_PRES_MSG) {
> > +           TRACE("AVND_EVT_AVD_REG_SU_MSG");
> > +           rc = m_NCS_IPC_SEND(&ir_cb.mbx, evt, evt->priority);
> > +           if (NCSCC_RC_SUCCESS != rc)
> > +                   LOG_CR("AvND send event to mailbox failed, type =
> %u", evt->type);
> > +   } else {
> > +           rc = m_NCS_IPC_SEND(&cb->mbx, evt, evt->priority);
> > +           if (NCSCC_RC_SUCCESS != rc)
> > +                   LOG_CR("AvND send event to mailbox failed, type =
> %u", evt->type);
> > +   }
> >
> >     return rc;
> >   }
> > diff --git a/osaf/services/saf/amf/amfnd/imm.cc
> b/osaf/services/saf/amf/amfnd/imm.cc
> > new file mode 100644
> > --- /dev/null
> > +++ b/osaf/services/saf/amf/amfnd/imm.cc
> > @@ -0,0 +1,137 @@
> > +/*      -*- OpenSAF  -*-
> > + *
> > + * (C) Copyright 2016 The OpenSAF Foundation
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY
> > + * or FITNESS FOR A PARTICULAR PURPOSE. This file and program are
> licensed
> > + * under the GNU Lesser General Public License Version 2.1, February
> 1999.
> > + * The complete license can be accessed from the following location:
> > + * http://opensource.org/licenses/lgpl-license.php
> > + * See the Copying file included with the OpenSAF distribution for full
> > + * licensing terms.
> > + *
> > + * Author(s): Oracle
> > + *
> > + */
> > +
> > +#include "avnd.h"
> > +#include <poll.h>
> > +
> > +enum {
> > +   FD_MBX = 0
> > +};
> > +
> > +ir_cb_t ir_cb;
> > +
> > +extern const AVND_EVT_HDLR g_avnd_func_list[AVND_EVT_MAX];
> > +
> > +static struct pollfd fds[FD_MBX + 1];
> > +static nfds_t irfds = FD_MBX + 1;
> > +static void ir_process_event(AVND_EVT *evt);
> > +
> > +/**
> > + * This thread will read classes and object information thereby allowing
> main
> > + * thread to proceed with other important work.
> > + * As of now, SU instantiation needs imm data to be read, so
> > + * only this event is being handled here.
> > + * Going forward, we need to use this function for similar purposes.
> > + * @param _cb
> > + *
> > + * @return void*
> > + */
> > +static void *imm_reader_thread(void *_cb)
> > +{
> > +   uint32_t rc = SA_AIS_OK;
> > +   AVND_EVT *evt;
> > +   TRACE_ENTER();
> > +   NCS_SEL_OBJ mbx_fd;
> > +
> > +   /* create the mail box */
> > +   rc = m_NCS_IPC_CREATE(&ir_cb.mbx);
> > +   if (NCSCC_RC_SUCCESS != rc) {
> > +           LOG_CR("AvND Mailbox creation failed");
> > +           exit(EXIT_FAILURE);
> > +   }
> > +   TRACE("Ir mailbox creation success");
> > +
> > +   /* attach the mail box */
> > +   rc = m_NCS_IPC_ATTACH(&ir_cb.mbx);
> > +   if (NCSCC_RC_SUCCESS != rc) {
> > +           LOG_CR("Ir mailbox attach failed");
> > +           exit(EXIT_FAILURE);
> > +   }
> > +
> > +   mbx_fd = ncs_ipc_get_sel_obj(&ir_cb.mbx);
> > +   fds[FD_MBX].fd = mbx_fd.rmv_obj;
> > +   fds[FD_MBX].events = POLLIN;
> > +
> > +   while (1) {
> > +           if (poll(fds, irfds, -1) == (-1)) {
> > +                   if (errno == EINTR) {
> > +                           continue;
> > +                   }
> > +
> > +                   LOG_ER("poll Fail - %s", strerror(errno));
> > +                   exit(EXIT_FAILURE);
> > +           }
> > +
> > +           if (fds[FD_MBX].revents & POLLIN) {
> > +                   while (nullptr != (evt = (AVND_EVT
> *)ncs_ipc_non_blk_recv(&ir_cb.mbx)))
> > +                           ir_process_event(evt);
> > +           }
> > +   }
> > +   TRACE_LEAVE();
> > +}
> > +
> > +/**
> > + * Start a imm reader thread.
> > + * @param Nothing.
> > + * @return Nothing.
> > + */
> > +void imm_reader_thread_create(void)
> > +{
> > +   pthread_t thread;
> > +   pthread_attr_t attr;
> > +
> > +   TRACE_ENTER();
> > +
> > +   pthread_attr_init(&attr);
> > +   pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
> > +
> > +   if (pthread_create(&thread, &attr, imm_reader_thread, &ir_cb) != 0)
> {
> > +           LOG_ER("pthread_create FAILED: %s", strerror(errno));
> > +           exit(EXIT_FAILURE);
> > +   }
> > +
> > +   pthread_attr_destroy(&attr);
> > +
> > +   TRACE_LEAVE();
> > +}
> > +
> > +/**
> > + * This function process events, which requires reading data from imm.
> > + * As of now, SU instantiation needs to read data, so diverting SU
> > + * instantiation in this thread. Going forward, we need to use this
> > + * function for similar purposes.
> > + * @param Event pointer.
> > + * @return Nothing.
> > + */
> > +void ir_process_event(AVND_EVT *evt)
> > +{
> > +   uint32_t res = NCSCC_RC_SUCCESS;
> > +   AVND_EVT_TYPE evt_type =  evt->type;
> > +   TRACE_ENTER2("Evt type:%u", evt_type);
> > +
> > +   osafassert(AVND_EVT_AVD_SU_PRES_MSG == evt_type);
> > +
> > +   res = g_avnd_func_list[evt_type] (avnd_cb, evt);
> > +
> > +   /* log the result of event processing */
> > +   TRACE("Evt Type: '%u', '%s'", evt_type, ((res == NCSCC_RC_SUCCESS)
> ? "success" : "failure"));
> > +
> > +   if (evt)
> > +           avnd_evt_destroy(evt);
> > +
> > +   TRACE_LEAVE2("res '%u'", res);
> > +}
> > diff --git a/osaf/services/saf/amf/amfnd/include/avnd.h
> b/osaf/services/saf/amf/amfnd/include/avnd.h
> > --- a/osaf/services/saf/amf/amfnd/include/avnd.h
> > +++ b/osaf/services/saf/amf/amfnd/include/avnd.h
> > @@ -69,5 +69,11 @@
> >
> >   extern uint32_t avnd_create(void);
> >   extern void avnd_sigterm_handler(void);
> > +typedef struct {
> > +        SYSF_MBX mbx;
> > +}ir_cb_t;
> > +
> > +extern ir_cb_t ir_cb;
> > +extern void imm_reader_thread_create(void);
> >
> >   #endif
> > diff --git a/osaf/services/saf/amf/amfnd/main.cc
> b/osaf/services/saf/amf/amfnd/main.cc
> > --- a/osaf/services/saf/amf/amfnd/main.cc
> > +++ b/osaf/services/saf/amf/amfnd/main.cc
> > @@ -552,7 +552,7 @@ void avnd_main_process(void)
> >             LOG_ER("signal TERM failed: %s", strerror(errno));
> >             goto done;
> >     }
> > -
> > +   imm_reader_thread_create();
> >     mbx_fd = ncs_ipc_get_sel_obj(&avnd_cb->mbx);
> >     fds[FD_MBX].fd = mbx_fd.rmv_obj;
> >     fds[FD_MBX].events = POLLIN;
> > @@ -638,6 +638,7 @@ void avnd_evt_process(AVND_EVT *evt)
> >   done:
> >     if (evt)
> >             avnd_evt_destroy(evt);
> > +   TRACE_LEAVE();
> >   }
> >
> >
> /*************************************************************
> ****************
> 

------------------------------------------------------------------------------
Transform Data into Opportunity.
Accelerate data analysis in your applications with
Intel Data Analytics Acceleration Library.
Click to learn more.
http://pubads.g.doubleclick.net/gampad/clk?id=278785231&iu=/4140
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to