What I see is that avnd_diq_del() is called as soon as the system becomes headless, which deletes all pending messages. But when a component responds during the SCs' absence, a new message is generated and buffered. On node_up, AMFD acks the message, but amfnd calls avnd_diq_rec_del() (not avnd_diq_del()) in avnd_di_msg_ack_process(). We need to call avnd_diq_del() when processing the ack so that msg_id gets updated. I am looking into this further.
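
To make the mismatch concrete, here is a minimal standalone sketch of the kind of sequence check behind the "invalid msg id 1 ... should be 3" warning quoted below. It is not OpenSAF code: DirectorView, NodeView and their members are hypothetical names, and the rule "accept only last accepted id + 1" is an assumption about what avd_msg_sanity_chk enforces, inferred from the log line.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the message id check; not OpenSAF code.
struct DirectorView {
  uint32_t rcv_msg_id = 0;  // id of the last message accepted from the node

  // Accept a message only if it carries the next id in sequence
  // (assumed rule: expected id == last accepted id + 1).
  bool accept(uint32_t msg_id) {
    if (msg_id != rcv_msg_id + 1) return false;  // the "invalid msg id" case
    rcv_msg_id = msg_id;
    return true;
  }
};

struct NodeView {
  uint32_t snd_msg_id = 0;  // id of the last message sent to the director

  // Number the next outgoing message.
  uint32_t next_id() { return ++snd_msg_id; }
};
```

While both counters move together, every message is accepted. If only the node side restarts its numbering (one hypothetical way for the counters to diverge), the next message carries id 1 while the director expects 3, so it is dropped until the node's counter is brought back in line with the director's.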

Thanks.
Praveen

On 17-May-17 1:50 PM, praveen malviya wrote:
> Hi Minh,
>
> While testing this, I am observing that amfd is dropping the assignment
> message because of message id mismatch:
> May 17 12:37:39.522117 osafamfd [7686:7686:src/amf/amfd/sgproc.cc:1171]
> >> avd_su_si_assign_evh: id:1, node:2030f, act:5,
> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', '', ha:3, err:1, single:0
> ....
> ....
> May 17 12:37:39.522404 osafamfd [7686:7686:src/amf/amfd/ndproc.cc:0075]
> WA avd_msg_sanity_chk: invalid msg id 1, msg type 5, from 2030f should be 3
> May 17 12:37:39.522418 osafamfd [7686:7686:src/amf/amfd/sgproc.cc:1777]
> << avd_su_si_assign_evh
>
> I am also looking into this. For your reference I had attached amfd and
> amfnd traces from SC-1 and PL-3 respectively in the ticket.
> I am working with one controller and one payload.
>
>
> Thanks
> Praveen
>
> On 15-May-17 1:06 PM, Minh Chau wrote:
>> When amfnd-payload responds susi assignment response just before both SC
>> go down, and that response message does not come to director. Therefore,
>> the status of that assignment could be seen as "modifying" in IMM. When
>> SC comes back, active amfd will be waiting for that response forever.
>>
>> Patch checks if a susi assignment response is sent but not-ack just before
>> both SC come down, amfnd-payload will buffer it in a way as a susi get
>> assigned during SC absence
>> ---
>>  src/amf/amfnd/di.cc | 53
>> +++++++++++++++++++++++++++++++++++++++++++++--------
>>  1 file changed, 45 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/amf/amfnd/di.cc b/src/amf/amfnd/di.cc
>> index e06b9260d..3776a09dc 100644
>> --- a/src/amf/amfnd/di.cc
>> +++ b/src/amf/amfnd/di.cc
>> @@ -1282,16 +1282,53 @@ void avnd_di_msg_ack_process(AVND_CB *cb, uint32_t
>> mid) {
>>    Notes : None.
>>
>>  ******************************************************************************/
>>  void avnd_diq_del(AVND_CB *cb) {
>> -  AVND_DND_MSG_LIST *rec = 0;
>>
>> -  do {
>> -    /* pop the record */
>> -    m_AVND_DIQ_REC_POP(cb, rec);
>> -    if (!rec) break;
>> +  if ((cb->dnd_list.head != nullptr)) {
>> +    AVND_DND_MSG_LIST *rec = 0;
>> +    bool found = true;
>> +    while (found) {
>> +      found = false;
>> +      for (rec = cb->dnd_list.head; rec != nullptr;
>> +           rec = rec->next) {
>> +        osafassert(rec->msg.type == AVND_MSG_AVD);
>> +        // delete all pending messages that haven't been sent out
>> +        if (rec->no_retries == 0) {
>> +          m_AVND_DIQ_REC_POP(cb, rec);
>> +          avnd_diq_rec_del(cb, rec);
>> +          break;
>> +        } else {
>> +          // Assignment response had been sent, but not ack because last
>> +          // controller go down, reset msg_id and will be resent later
>> +          if (rec->msg.info.avd->msg_type ==
>> AVSV_N2D_INFO_SU_SI_ASSIGN_MSG) {
>> +            if (rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id != 0) {
>> +              rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0;
>> +              found = true;
>> +              LOG_NO(
>> +                  "Found not-ack su_si_assign msg for SU:'%s', "
>> +                  "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', "
>> +                  "error:'%u', msg_id:'%u'",
>> +                  osaf_extended_name_borrow(&rec->msg.info.avd->msg_info
>> +                                                 .n2d_su_si_assign.su_name),
>> +                  osaf_extended_name_borrow(&rec->msg.info.avd->msg_info
>> +                                                 .n2d_su_si_assign.si_name),
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign.ha_state,
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_act,
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign
>> +                      .single_csi,
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign.error,
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id);
>> +            }
>> +          } else {
>> +            // delete other messages for now
>> +            m_AVND_DIQ_REC_POP(cb, rec);
>> +            avnd_diq_rec_del(cb, rec);
>> +            break;
>> +          }
>> +        }
>>
>> -    /* delete the record */
>> -    avnd_diq_rec_del(cb, rec);
>> -  } while (1);
>> +      }
>> +    }
>> +  }
>>
>>    return;
>>  }
>>

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel