What I see is that avnd_diq_del() is called as soon as the system becomes
headless, which deletes all pending messages. But when a component
responds during the SCs' absence, a new message is generated and buffered.
For node_up, amfd will ack the message, but amfnd calls
avnd_diq_rec_del() (not avnd_diq_del()) in avnd_di_msg_ack_process().
We need to call avnd_diq_del() when the ack message is processed so that
msg_id gets updated.
Still looking into it.
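To make the symptom concrete, here is a minimal toy model of the id
mismatch reported in the amfd trace ("invalid msg id 1 ... should be 3").
It is only a sketch under assumptions: ToyDirector, ToyNode and
resend_after_counter_restart() are hypothetical names, the "next id must be
last + 1" rule is inferred from the log line, and the counter restart on the
node side is an assumed cause, not something confirmed from the amfnd code.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the director-side check. This is NOT the real
// avd_msg_sanity_chk(); it only assumes "next id must be last + 1".
struct ToyDirector {
  uint32_t last_rcvd_id = 0;  // id of the last message accepted

  uint32_t expected_id() const { return last_rcvd_id + 1; }

  // Accept a message only when it carries exactly the expected id.
  bool accept(uint32_t msg_id) {
    if (msg_id != expected_id()) return false;  // dropped, counter unchanged
    last_rcvd_id = msg_id;
    return true;
  }
};

// Toy stand-in for the node-side send counter (hypothetical name).
struct ToyNode {
  uint32_t snd_msg_id = 0;
  uint32_t next_id() { return ++snd_msg_id; }
};

// Replays the assumed sequence: two messages are accepted, the node-side
// counter restarts while the director-side one does not, and the next
// message is therefore sent with id 1 while id 3 is expected.
bool resend_after_counter_restart(ToyDirector &amfd, ToyNode &amfnd) {
  bool first = amfd.accept(amfnd.next_id());   // id 1, accepted
  bool second = amfd.accept(amfnd.next_id());  // id 2, accepted
  assert(first && second);
  amfnd.snd_msg_id = 0;                 // assumed restart on the node side only
  return amfd.accept(amfnd.next_id());  // id 1 again, rejected
}
```

Called with freshly constructed objects, resend_after_counter_restart()
returns false and the director still expects id 3, which is the same shape
as the warning in the trace. Whichever side is actually out of step, the
two counters have to be brought back in sync before the buffered response
is resent.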
Thanks.
Praveen
On 17-May-17 1:50 PM, praveen malviya wrote:
> Hi Minh,
>
> While testing this, I am observing that amfd is dropping the assignment
> message because of message id mismatch:
> May 17 12:37:39.522117 osafamfd [7686:7686:src/amf/amfd/sgproc.cc:1171]
> >> avd_su_si_assign_evh: id:1, node:2030f, act:5,
> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', '', ha:3, err:1, single:0
> ....
> ....
> May 17 12:37:39.522404 osafamfd [7686:7686:src/amf/amfd/ndproc.cc:0075]
> WA avd_msg_sanity_chk: invalid msg id 1, msg type 5, from 2030f should be 3
> May 17 12:37:39.522418 osafamfd [7686:7686:src/amf/amfd/sgproc.cc:1777]
> << avd_su_si_assign_evh
>
> I am also looking into this. For your reference, I have attached amfd and
> amfnd traces from SC-1 and PL-3 respectively to the ticket.
> I am working with one controller and one payload.
>
>
> Thanks
> Praveen
>
> On 15-May-17 1:06 PM, Minh Chau wrote:
>> When an amfnd payload sends a susi assignment response just before both
>> SCs go down, that response message does not reach the director. As a
>> result, the status of that assignment can be seen as "modifying" in IMM,
>> and when an SC comes back, the active amfd waits for that response forever.
>>
>> This patch checks whether a susi assignment response was sent but not acked
>> just before both SCs went down. If so, the amfnd payload buffers it in the
>> same way as a susi assigned during SC absence.
>> ---
>> src/amf/amfnd/di.cc | 53 +++++++++++++++++++++++++++++++++++++++++++++--------
>> 1 file changed, 45 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/amf/amfnd/di.cc b/src/amf/amfnd/di.cc
>> index e06b9260d..3776a09dc 100644
>> --- a/src/amf/amfnd/di.cc
>> +++ b/src/amf/amfnd/di.cc
>> @@ -1282,16 +1282,53 @@ void avnd_di_msg_ack_process(AVND_CB *cb, uint32_t mid) {
>> Notes : None.
>>
>> ******************************************************************************/
>> void avnd_diq_del(AVND_CB *cb) {
>> - AVND_DND_MSG_LIST *rec = 0;
>>
>> - do {
>> - /* pop the record */
>> - m_AVND_DIQ_REC_POP(cb, rec);
>> - if (!rec) break;
>> + if ((cb->dnd_list.head != nullptr)) {
>> + AVND_DND_MSG_LIST *rec = 0;
>> + bool found = true;
>> + while (found) {
>> + found = false;
>> + for (rec = cb->dnd_list.head; rec != nullptr;
>> + rec = rec->next) {
>> + osafassert(rec->msg.type == AVND_MSG_AVD);
>> + // delete all pending messages that haven't been sent out
>> + if (rec->no_retries == 0) {
>> + m_AVND_DIQ_REC_POP(cb, rec);
>> + avnd_diq_rec_del(cb, rec);
>> + break;
>> + } else {
>> + // Assignment response had been sent, but not ack because last
>> + // controller go down, reset msg_id and will be resent later
>> + if (rec->msg.info.avd->msg_type ==
>> AVSV_N2D_INFO_SU_SI_ASSIGN_MSG) {
>> + if (rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id != 0) {
>> + rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0;
>> + found = true;
>> + LOG_NO(
>> + "Found not-ack su_si_assign msg for SU:'%s', "
>> + "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', "
>> + "error:'%u', msg_id:'%u'",
>> + osaf_extended_name_borrow(&rec->msg.info.avd->msg_info
>> + .n2d_su_si_assign.su_name),
>> + osaf_extended_name_borrow(&rec->msg.info.avd->msg_info
>> + .n2d_su_si_assign.si_name),
>> + rec->msg.info.avd->msg_info.n2d_su_si_assign.ha_state,
>> + rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_act,
>> + rec->msg.info.avd->msg_info.n2d_su_si_assign
>> + .single_csi,
>> + rec->msg.info.avd->msg_info.n2d_su_si_assign.error,
>> + rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id);
>> + }
>> + } else {
>> + // delete other messages for now
>> + m_AVND_DIQ_REC_POP(cb, rec);
>> + avnd_diq_rec_del(cb, rec);
>> + break;
>> + }
>> + }
>>
>> - /* delete the record */
>> - avnd_diq_rec_del(cb, rec);
>> - } while (1);
>> + }
>> + }
>> + }
>>
>> return;
>> }
>>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel