Hi Nagu,

That’s true. But to fix this in 5.0, I think we would have to introduce a new event (to notify the main thread from the mds thread), which may not be appropriate.
Can we make your suggestion an enhancement ticket for 5.2?

Gary

> On 20 Oct. 2016, at 9:58 pm, Nagendra Kumar <[email protected]> wrote:
>
> Hi Gary,
> It would be better to check whether AmfD has got NCSMDS_UP for
> the particular AmfND in avd_node_up_evh() and drop it if it has not got, else
> proceed.
>
> Thanks
> -Nagu
>
>> -----Original Message-----
>> From: Gary Lee [mailto:[email protected]]
>> Sent: 20 October 2016 04:21
>> To: [email protected]; [email protected];
>> [email protected]; Nagendra Kumar; Praveen Malviya;
>> [email protected]
>> Cc: [email protected]
>> Subject: [PATCH 1 of 1] amfd: handle late arrival of amfnd svc up event
>> [#2124]
>>
>> osaf/services/saf/amf/amfd/ndmsg.cc | 23 +++++++++++++++++++++++
>> 1 files changed, 23 insertions(+), 0 deletions(-)
>>
>>
>> if the svc up event for amfnd arrives after N2D_NODE_UP, then amfd may
>> fail to send D2N_NODE_UP to amfnd. This will eventually cause the
>> respective amfnd to reboot the node, due to message ID mismatches.
>> By resetting msg IDs to 0 and adest to 0, we allow amfnd to send
>> another N2D_NODE_UP and resume node join sequence.
>>
>> diff --git a/osaf/services/saf/amf/amfd/ndmsg.cc b/osaf/services/saf/amf/amfd/ndmsg.cc
>> --- a/osaf/services/saf/amf/amfd/ndmsg.cc
>> +++ b/osaf/services/saf/amf/amfd/ndmsg.cc
>> @@ -206,6 +206,21 @@ static void avd_d2n_msg_enqueue(AVD_CL_C
>>      cb->nd_msg_queue_list.push(nd_msg);
>>  }
>>
>> +void handle_nodeup_failure(AVD_CL_CB *cb, const AVSV_DND_MSG *msg)
>> +{
>> +    AVD_AVND* node = avd_node_find_nodeid(msg->msg_info.d2n_node_up.node_id);
>> +    osafassert(node != nullptr);
>> +
>> +    LOG_WA("failed to send node up to '%s'", node->node_name.c_str());
>> +
>> +    node->snd_msg_id = 0;
>> +    node->rcv_msg_id = 0;
>> +    node->adest = 0;
>> +    avd_node_state_set(node, AVD_AVND_STATE_ABSENT);
>> +
>> +    m_AVSV_SEND_CKPT_UPDT_ASYNC_UPDT(cb, node, AVSV_CKPT_AVD_NODE_CONFIG);
>> +}
>> +
>>
>>  /****************************************************************************
>>    Name : avd_d2n_msg_dequeue
>>
>> @@ -234,6 +249,14 @@ uint32_t avd_d2n_msg_dequeue(AVD_CL_CB *
>>           */
>>          if ((rc = ncsmds_api(&queue_elem->snd_msg)) != NCSCC_RC_SUCCESS) {
>>              LOG_ER("%s: ncsmds_api failed %u", __FUNCTION__, rc);
>> +
>> +            // if the svc up event for amfnd arrives after N2D_NODE_UP, then
>> +            // ncsmds_api(D2N_NODE_UP_MSG) may fail. Reset msg IDs to 0, and
>> +            // wait for the next N2D_NODE UP from amfnd
>> +            const AVSV_DND_MSG *d2n_msg = static_cast<AVD_DND_MSG *>(queue_elem->snd_msg.info.svc_send.i_msg);
>> +            if (d2n_msg->msg_type == AVSV_D2N_NODE_UP_MSG) {
>> +                handle_nodeup_failure(cb, d2n_msg);
>> +            }
>>          }
>>
>>          d2n_msg_free((AVD_DND_MSG *)queue_elem->snd_msg.info.svc_send.i_msg);

_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
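
The two approaches discussed in this thread can be compared in a small standalone model. What follows is only a rough sketch under stated assumptions, not OpenSAF code: `Node`, `Director`, `NodeState`, `on_node_up` and `on_node_up_send_failure` are hypothetical names invented for illustration, and the real amfd structures, MDS calls and checkpointing are left out. `on_node_up` models Nagu's suggestion (drop the node up message unless the svc up event for that amfnd has already been seen), and `on_node_up_send_failure` models the reset done by the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins for amfd's node bookkeeping (not the real types).
enum class NodeState { Absent, Joining };

struct Node {
  std::string name;
  uint32_t snd_msg_id = 0;  // last message ID sent to the node director
  uint32_t rcv_msg_id = 0;  // last message ID received from the node director
  uint64_t adest = 0;       // destination address of the node director, 0 if unknown
  NodeState state = NodeState::Absent;
};

struct Director {
  std::map<uint32_t, Node> nodes;     // keyed by node id
  std::set<uint32_t> svc_up_seen;     // node ids whose svc up event has arrived
};

// Suggested alternative: accept a node up message only if svc up was already
// seen for that node. Returns false, leaving all state untouched, when the
// message is dropped; the node director is expected to send node up again.
bool on_node_up(Director &d, uint32_t node_id, uint64_t adest, uint32_t msg_id) {
  auto it = d.nodes.find(node_id);
  assert(it != d.nodes.end());
  if (d.svc_up_seen.count(node_id) == 0) {
    return false;  // svc up not seen yet: drop the message
  }
  Node &node = it->second;
  node.adest = adest;
  node.rcv_msg_id = msg_id;
  node.state = NodeState::Joining;
  return true;
}

// Approach taken by the patch: after a failed send of the node up reply,
// forget the message IDs and the address so a later node up starts cleanly.
void on_node_up_send_failure(Director &d, uint32_t node_id) {
  auto it = d.nodes.find(node_id);
  assert(it != d.nodes.end());
  Node &node = it->second;
  node.snd_msg_id = 0;
  node.rcv_msg_id = 0;
  node.adest = 0;
  node.state = NodeState::Absent;
}
```

In this model both paths leave the node record in the same clean state; the difference is whether the check happens before the node up message is processed or only after the reply fails to send.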
