Hi Hans N, I concur with your understanding and we need to make Amfnd implementation also the same way as is there in Amfd going forward as I mentioned in the email.
Thanks for your review and testing time. Thanks -Nagu > -----Original Message----- > From: Hans Nordebäck [mailto:hans.nordeb...@ericsson.com] > Sent: 18 March 2016 12:16 > To: Nagendra Kumar; Praveen Malviya; minh.c...@dektech.com.au; > gary....@dektech.com.au > Cc: opensaf-devel@lists.sourceforge.net > Subject: Re: [PATCH 1 of 1] amfnd: process su instantiation in a separate > thread [#517] > > Hi Nagu, > > the purpose with the "provoke" was to point to two problems mentioned in > the ticket, "immutil should not do abort and the other problem is that > immutil do sleeps in try again loops. Avnd is event based and immutil is > configurable but errors and try-again logic has to be managed by avnd." > if you change the sleep(180) to return an error code instead, e.g. > return SA_AIS_ERR_BAD_HANDLE; immutil takes the decision to abort, > shouldn't we configure immutil for amfnd the same as for amfd? > > I will try to finish the review today. > > /Thanks HansN > > > On 03/18/2016 06:15 AM, Nagendra Kumar wrote: > > Hi Hans N, > > Thanks for review. > > As explained in the ticket and as I explained in the previous emails and in > the review request, that this patch solves chicken-egg problem of Immnd and > Amfnd. > > > > I am not able to simulate the problem mentioned in the ticket by Bertil > because that was rare case and was never explained how to reproduce. > > > > Since there are many places Amfnd is using saImmOmSearchInitialize_2(), > so keeping 180 seconds sleep will definitely cause the problem, which you > have seen, but will not be seen if the same thing happens during Immnd goes > down and Amfnd wants to read information in SU Instantiation. So, the patch > solves the problem even if you keep sleep of 180 sec during SU instantiation. > > > > So, the problem of reading from Immnd hangs issue will be there in > > many flows, which needs further action of modifying the code, But the > patch sent for review is the first thing to start. > > So, after this patch, we need to work on the following work items in this > ticket: > > > > 1. Pushing Imm reading information in other flows of the code in the > separate thread created in the floated patch: This is not going to be easy as > Amfnd execution and Reading Imm goes together and segregating them into > two flows are not easy. > > 2. Handling of BAD_HANDLE in all the places as done in Amfd. > > 3. Handling of TIMEOUT in all the places as done in Amfd. > > > > So, these items will come and we need to work on these items further. > > > > The easy steps to get chicken-egg issue: > > 1. Start controllers, upload Amf demo. > > 2. Now, add a new component in the SU. This is done because a new > component will be read from avnd_evt_avd_su_pres_evh-> > avnd_comp_config_get_su-> avnd_comp_create(). The old > components(uploaded via immcfg -f) are already read in > avnd_evt_avd_reg_su_evh(). > > 1. Keep sleep of 3 sec in Amfd in unlock_instantiation(). > > 2. Issue unlock-in on Demo SU. > > 3. Keep 5 sec sleep in stop function of Immnd. > > 4. Before sleep expiry, kill Immnd. > > 5. After sleep, Amfd sends a request to Amfnd to instantiate SU, but it > goes and call Imm api to read configuration. > > Since Immnd is not there, so it hangs in Initialize(). > > 6. When Immnd comes up after sleep and responds CLC script, it waits for > Amfnd to call instantiate CLC, which is not possible as Amfnd is hung in Imm > api. > > > > Since this patch has a separate thread now to read Imm (implemented SU > Instantiation), so would be first step to go. > > > > Thanks > > -Nagu > >> -----Original Message----- > >> From: Hans Nordebäck [mailto:hans.nordeb...@ericsson.com] > >> Sent: 17 March 2016 21:29 > >> To: Nagendra Kumar; Praveen Malviya; minh.c...@dektech.com.au; > >> gary....@dektech.com.au > >> Cc: opensaf-devel@lists.sourceforge.net; Hans Nordebäck > >> Subject: Re: [PATCH 1 of 1] amfnd: process su instantiation in a > >> separate thread [#517] > >> > >> Hi Nagu, > >> > >> I did a quick test with a patch (attached in this mail) that provokes > >> the fault reported in #517. I did a quick test with the patch for > >> review and the "provoking" patch, the fault seems to remain. > >> > >> I tested according to below: > >> - Apply both patches. > >> - Start the system in UML. > >> - Check traces on SC-2, tail -f root/var/SC-2/log/messages > >> - When the system is up load the AmfDemo configuration, immcfg -f > >> immcfg -f /hostfs/AppConfig-2N.xml > >> > >> > >> Mar 17 16:55:12 SC-2 local0.err osafamfnd[467]: ER XXXX: Provoke > >> ticket > >> 517 fault, safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1 > >> Mar 17 16:55:33 SC-2 user.notice kernel: random: nonblocking pool is > >> initialized Mar 17 16:56:09 SC-2 local0.err osafamfwd[631]: TIMEOUT > >> receiving AMF health check request, generating core for amfnd Mar 17 > >> 16:56:09 SC-2 local0.err osafamfwd[631]: Last received healthcheck > >> cnt=2 > >> > >> /Thanks HansN > >> > >> > >> On 03/15/2016 09:26 AM, nagendr...@oracle.com wrote: > >>> osaf/services/saf/amf/amfnd/Makefile.am | 3 +- > >>> osaf/services/saf/amf/amfnd/evt.cc | 14 ++- > >>> osaf/services/saf/amf/amfnd/imm.cc | 137 > >> +++++++++++++++++++++++++++++ > >>> osaf/services/saf/amf/amfnd/include/avnd.h | 6 + > >>> osaf/services/saf/amf/amfnd/main.cc | 3 +- > >>> 5 files changed, 158 insertions(+), 5 deletions(-) > >>> > >>> > >>> If immnd is down and Amf decides to instantiate some SU, then there > >>> is a deadlock between Amfnd and Imm. Amf calls Imm api to read imm > >>> attributes during SU instantiation and stuck because Imm is down. > >>> Imm waits for Amfnd to instantiate/restart it. > >>> So, both waits for each other. This is a deadlock. > >>> So, the fix bypasses main thread for SU instantiation and does > >>> process of SU instantiation in a separate thread. > >>> So, when Imm is down, its instantiation/restart happens in the main > >>> thread. If any SU instantiation request comes, then Amfnd routes > >>> that request in a separate thread, that threads stuck in Imm api > >>> call untill Imm is not restarted and responds to Imm api call. > >>> This obviates the deadlock between Amfnd and Imm. > >>> So, this fix has advantage for not having any mutex between threads. > >>> The reason is that SU is yet to instantiate, so it is still immune > >>> to faults of any components and hence no need of having any mutex > >>> among the threads. > >>> > >>> diff --git a/osaf/services/saf/amf/amfnd/Makefile.am > >> b/osaf/services/saf/amf/amfnd/Makefile.am > >>> --- a/osaf/services/saf/amf/amfnd/Makefile.am > >>> +++ b/osaf/services/saf/amf/amfnd/Makefile.am > >>> @@ -60,7 +60,8 @@ osafamfnd_SOURCES = \ > >>> term.cc \ > >>> tmr.cc \ > >>> util.cc \ > >>> - verify.cc > >>> + verify.cc \ > >>> + imm.cc > >>> > >>> osafamfnd_LDADD = \ > >>> $(top_builddir)/osaf/tools/safimm/src/libimmutil.la \ diff --git > >>> a/osaf/services/saf/amf/amfnd/evt.cc > >> b/osaf/services/saf/amf/amfnd/evt.cc > >>> --- a/osaf/services/saf/amf/amfnd/evt.cc > >>> +++ b/osaf/services/saf/amf/amfnd/evt.cc > >>> @@ -319,9 +319,17 @@ void avnd_evt_destroy(AVND_EVT *evt) > >>> > >> > ************************************************************** > >> ****************/ > >>> uint32_t avnd_evt_send(AVND_CB *cb, AVND_EVT *evt) > >>> { > >>> - uint32_t rc = m_NCS_IPC_SEND(&cb->mbx, evt, evt->priority); > >>> - if (NCSCC_RC_SUCCESS != rc) > >>> - LOG_CR("AvND send event to mailbox failed, type = %u", > >> evt->type); > >>> + uint32_t rc; > >>> + if (evt->type == AVND_EVT_AVD_SU_PRES_MSG) { > >>> + TRACE("AVND_EVT_AVD_REG_SU_MSG"); > >>> + rc = m_NCS_IPC_SEND(&ir_cb.mbx, evt, evt->priority); > >>> + if (NCSCC_RC_SUCCESS != rc) > >>> + LOG_CR("AvND send event to mailbox failed, type = > >> %u", evt->type); > >>> + } else { > >>> + rc = m_NCS_IPC_SEND(&cb->mbx, evt, evt->priority); > >>> + if (NCSCC_RC_SUCCESS != rc) > >>> + LOG_CR("AvND send event to mailbox failed, type = > >> %u", evt->type); > >>> + } > >>> > >>> return rc; > >>> } > >>> diff --git a/osaf/services/saf/amf/amfnd/imm.cc > >> b/osaf/services/saf/amf/amfnd/imm.cc > >>> new file mode 100644 > >>> --- /dev/null > >>> +++ b/osaf/services/saf/amf/amfnd/imm.cc > >>> @@ -0,0 +1,137 @@ > >>> +/* -*- OpenSAF -*- > >>> + * > >>> + * (C) Copyright 2016 The OpenSAF Foundation > >>> + * > >>> + * This program is distributed in the hope that it will be useful, > >>> +but > >>> + * WITHOUT ANY WARRANTY; without even the implied warranty of > >> MERCHANTABILITY > >>> + * or FITNESS FOR A PARTICULAR PURPOSE. This file and program are > >> licensed > >>> + * under the GNU Lesser General Public License Version 2.1, > >>> + February > >> 1999. > >>> + * The complete license can be accessed from the following location: > >>> + * http://opensource.org/licenses/lgpl-license.php > >>> + * See the Copying file included with the OpenSAF distribution for > >>> + full > >>> + * licensing terms. > >>> + * > >>> + * Author(s): Oracle > >>> + * > >>> + */ > >>> + > >>> +#include "avnd.h" > >>> +#include <poll.h> > >>> + > >>> +enum { > >>> + FD_MBX = 0 > >>> +}; > >>> + > >>> +ir_cb_t ir_cb; > >>> + > >>> +extern const AVND_EVT_HDLR g_avnd_func_list[AVND_EVT_MAX]; > >>> + > >>> +static struct pollfd fds[FD_MBX + 1]; static nfds_t irfds = FD_MBX > >>> ++ 1; static void ir_process_event(AVND_EVT *evt); > >>> + > >>> +/** > >>> + * This thread will read classes and object information thereby > >>> +allowing > >> main > >>> + * thread to proceed with other important work. > >>> + * As of now, SU instantiation needs imm data to be read, so > >>> + * only this event is being handled here. > >>> + * Going forward, we need to use this function for similar purposes. > >>> + * @param _cb > >>> + * > >>> + * @return void* > >>> + */ > >>> +static void *imm_reader_thread(void *_cb) { > >>> + uint32_t rc = SA_AIS_OK; > >>> + AVND_EVT *evt; > >>> + TRACE_ENTER(); > >>> + NCS_SEL_OBJ mbx_fd; > >>> + > >>> + /* create the mail box */ > >>> + rc = m_NCS_IPC_CREATE(&ir_cb.mbx); > >>> + if (NCSCC_RC_SUCCESS != rc) { > >>> + LOG_CR("AvND Mailbox creation failed"); > >>> + exit(EXIT_FAILURE); > >>> + } > >>> + TRACE("Ir mailbox creation success"); > >>> + > >>> + /* attach the mail box */ > >>> + rc = m_NCS_IPC_ATTACH(&ir_cb.mbx); > >>> + if (NCSCC_RC_SUCCESS != rc) { > >>> + LOG_CR("Ir mailbox attach failed"); > >>> + exit(EXIT_FAILURE); > >>> + } > >>> + > >>> + mbx_fd = ncs_ipc_get_sel_obj(&ir_cb.mbx); > >>> + fds[FD_MBX].fd = mbx_fd.rmv_obj; > >>> + fds[FD_MBX].events = POLLIN; > >>> + > >>> + while (1) { > >>> + if (poll(fds, irfds, -1) == (-1)) { > >>> + if (errno == EINTR) { > >>> + continue; > >>> + } > >>> + > >>> + LOG_ER("poll Fail - %s", strerror(errno)); > >>> + exit(EXIT_FAILURE); > >>> + } > >>> + > >>> + if (fds[FD_MBX].revents & POLLIN) { > >>> + while (nullptr != (evt = (AVND_EVT > >> *)ncs_ipc_non_blk_recv(&ir_cb.mbx))) > >>> + ir_process_event(evt); > >>> + } > >>> + } > >>> + TRACE_LEAVE(); > >>> +} > >>> + > >>> +/** > >>> + * Start a imm reader thread. > >>> + * @param Nothing. > >>> + * @return Nothing. > >>> + */ > >>> +void imm_reader_thread_create(void) { > >>> + pthread_t thread; > >>> + pthread_attr_t attr; > >>> + > >>> + TRACE_ENTER(); > >>> + > >>> + pthread_attr_init(&attr); > >>> + pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); > >>> + > >>> + if (pthread_create(&thread, &attr, imm_reader_thread, &ir_cb) != > >>> +0) > >> { > >>> + LOG_ER("pthread_create FAILED: %s", strerror(errno)); > >>> + exit(EXIT_FAILURE); > >>> + } > >>> + > >>> + pthread_attr_destroy(&attr); > >>> + > >>> + TRACE_LEAVE(); > >>> +} > >>> + > >>> +/** > >>> + * This function process events, which requires reading data from imm. > >>> + * As of now, SU instantiation needs to read data, so diverting SU > >>> + * instantiation in this thread. Going forward, we need to use this > >>> + * function for similar purposes. > >>> + * @param Event pointer. > >>> + * @return Nothing. > >>> + */ > >>> +void ir_process_event(AVND_EVT *evt) { > >>> + uint32_t res = NCSCC_RC_SUCCESS; > >>> + AVND_EVT_TYPE evt_type = evt->type; > >>> + TRACE_ENTER2("Evt type:%u", evt_type); > >>> + > >>> + osafassert(AVND_EVT_AVD_SU_PRES_MSG == evt_type); > >>> + > >>> + res = g_avnd_func_list[evt_type] (avnd_cb, evt); > >>> + > >>> + /* log the result of event processing */ > >>> + TRACE("Evt Type: '%u', '%s'", evt_type, ((res == NCSCC_RC_SUCCESS) > >> ? "success" : "failure")); > >>> + > >>> + if (evt) > >>> + avnd_evt_destroy(evt); > >>> + > >>> + TRACE_LEAVE2("res '%u'", res); > >>> +} > >>> diff --git a/osaf/services/saf/amf/amfnd/include/avnd.h > >> b/osaf/services/saf/amf/amfnd/include/avnd.h > >>> --- a/osaf/services/saf/amf/amfnd/include/avnd.h > >>> +++ b/osaf/services/saf/amf/amfnd/include/avnd.h > >>> @@ -69,5 +69,11 @@ > >>> > >>> extern uint32_t avnd_create(void); > >>> extern void avnd_sigterm_handler(void); > >>> +typedef struct { > >>> + SYSF_MBX mbx; > >>> +}ir_cb_t; > >>> + > >>> +extern ir_cb_t ir_cb; > >>> +extern void imm_reader_thread_create(void); > >>> > >>> #endif > >>> diff --git a/osaf/services/saf/amf/amfnd/main.cc > >> b/osaf/services/saf/amf/amfnd/main.cc > >>> --- a/osaf/services/saf/amf/amfnd/main.cc > >>> +++ b/osaf/services/saf/amf/amfnd/main.cc > >>> @@ -552,7 +552,7 @@ void avnd_main_process(void) > >>> LOG_ER("signal TERM failed: %s", strerror(errno)); > >>> goto done; > >>> } > >>> - > >>> + imm_reader_thread_create(); > >>> mbx_fd = ncs_ipc_get_sel_obj(&avnd_cb->mbx); > >>> fds[FD_MBX].fd = mbx_fd.rmv_obj; > >>> fds[FD_MBX].events = POLLIN; > >>> @@ -638,6 +638,7 @@ void avnd_evt_process(AVND_EVT *evt) > >>> done: > >>> if (evt) > >>> avnd_evt_destroy(evt); > >>> + TRACE_LEAVE(); > >>> } > >>> > >>> > >> > /************************************************************* > >> **************** > >> > > ------------------------------------------------------------------------------ Transform Data into Opportunity. Accelerate data analysis in your applications with Intel Data Analytics Acceleration Library. Click to learn more. http://pubads.g.doubleclick.net/gampad/clk?id=278785231&iu=/4140 _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel