What I see is that avnd_diq_del() is called as soon as the system becomes
headless, which deletes all pending messages. But when a component
responds during the SCs' absence, a new message is generated and buffered.
For node_up, amfd acks the message, but amfnd calls
avnd_diq_rec_del() (not avnd_diq_del()) in avnd_di_msg_ack_process().
We need to call avnd_diq_del() when processing the ack so that msg_id gets
updated. I am still looking into it.
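
To make the msg_id point concrete, here is a minimal sketch of the idea,
not the real amfnd code. QueuedMsg, NodeQueue, FlushOnHeadless() and
Resend() are hypothetical names, and the rule "a message with msg_id 0 is
stamped with the next id when it is resent" is an assumption based on the
"reset msg_id and will be resent later" comment in the patch below.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one queued node-to-director message.
struct QueuedMsg {
  uint32_t msg_id;      // 0 means "no id stamped yet"
  uint32_t no_retries;  // > 0 means the message was sent at least once
};

// Hypothetical stand-in for the node-side queue and its send counter.
struct NodeQueue {
  uint32_t snd_msg_id = 0;  // last id stamped on an outgoing message
  std::vector<QueuedMsg> msgs;

  // Flush when the cluster becomes headless (assumed behaviour): drop
  // messages that were never sent, keep sent-but-unacked ones and clear
  // their id so that they can be stamped again later.
  void FlushOnHeadless() {
    std::vector<QueuedMsg> kept;
    for (QueuedMsg m : msgs) {
      if (m.no_retries == 0) continue;  // never sent: delete
      m.msg_id = 0;                     // sent, not acked: reset the id
      kept.push_back(m);
    }
    msgs = kept;
  }

  // Resend once a controller is back: only messages without an id get a
  // fresh one, so a message that skipped the flush keeps its stale id.
  void Resend() {
    for (QueuedMsg &m : msgs) {
      if (m.msg_id == 0) m.msg_id = ++snd_msg_id;
      ++m.no_retries;
    }
  }
};
```

In this model a message that goes through FlushOnHeadless() is resent with
the next id in sequence, while a message buffered after the flush with an
old id is resent unchanged. That is the kind of stale id I suspect we are
hitting.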


Thanks.
Praveen



On 17-May-17 1:50 PM, praveen malviya wrote:
> Hi Minh,
> 
> While testing this, I observe that amfd drops the assignment
> message because of a message id mismatch:
> May 17 12:37:39.522117 osafamfd [7686:7686:src/amf/amfd/sgproc.cc:1171]
>   >> avd_su_si_assign_evh: id:1, node:2030f, act:5,
> 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', '', ha:3, err:1, single:0
> ....
> ....
> May 17 12:37:39.522404 osafamfd [7686:7686:src/amf/amfd/ndproc.cc:0075]
> WA avd_msg_sanity_chk: invalid msg id 1, msg type 5, from 2030f should be 3
> May 17 12:37:39.522418 osafamfd [7686:7686:src/amf/amfd/sgproc.cc:1777]
> << avd_su_si_assign_evh
> 
> I am also looking into this. For your reference, I have attached the amfd
> and amfnd traces from SC-1 and PL-3 respectively to the ticket.
> I am working with one controller and one payload.
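
The warning in the trace above is consistent with a strict "next id" check
on the director side. A minimal sketch under that assumption follows.
NodeState and SanityCheck() are hypothetical names, not the real
avd_msg_sanity_chk(), and the field name rcv_msg_id is assumed. Only the
wording of the warning is taken from the trace.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical per-node state kept by the director.
struct NodeState {
  uint32_t node_id;
  uint32_t rcv_msg_id;  // id of the last message accepted from this node
};

// Accept a message only if its id is exactly one more than the id of the
// last accepted message; otherwise log a warning and drop it.
bool SanityCheck(NodeState &node, uint32_t msg_id, uint32_t msg_type) {
  const uint32_t expected = node.rcv_msg_id + 1;
  if (msg_id != expected) {
    std::fprintf(stderr,
                 "invalid msg id %u, msg type %u, from %x should be %u\n",
                 msg_id, msg_type, node.node_id, expected);
    return false;  // dropped, the counter is left unchanged
  }
  node.rcv_msg_id = msg_id;
  return true;
}
```

With rcv_msg_id at 2 for node 2030f, a message stamped with id 1 is
rejected and only id 3 is accepted, which matches the "invalid msg id 1 ...
should be 3" line.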
> 
> 
> Thanks
> Praveen
> 
> On 15-May-17 1:06 PM, Minh Chau wrote:
>> When an amfnd payload sends a susi assignment response just before both
>> SCs go down and that response does not reach the director, the status of
>> that assignment can be seen as "modifying" in IMM. When an SC comes back,
>> the active amfd will wait for that response forever.
>>
>> This patch checks whether a susi assignment response was sent but not
>> acked just before both SCs went down. If so, the amfnd payload buffers it
>> in the same way as a susi that gets assigned during SC absence.
>> ---
>>    src/amf/amfnd/di.cc | 53 +++++++++++++++++++++++++++++++++++++++++++++--------
>>    1 file changed, 45 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/amf/amfnd/di.cc b/src/amf/amfnd/di.cc
>> index e06b9260d..3776a09dc 100644
>> --- a/src/amf/amfnd/di.cc
>> +++ b/src/amf/amfnd/di.cc
>> @@ -1282,16 +1282,53 @@ void avnd_di_msg_ack_process(AVND_CB *cb, uint32_t mid) {
>>      Notes         : None.
>>    ******************************************************************************/
>>    void avnd_diq_del(AVND_CB *cb) {
>> -  AVND_DND_MSG_LIST *rec = 0;
>>    
>> -  do {
>> -    /* pop the record */
>> -    m_AVND_DIQ_REC_POP(cb, rec);
>> -    if (!rec) break;
>> +  if ((cb->dnd_list.head != nullptr)) {
>> +    AVND_DND_MSG_LIST *rec = 0;
>> +    bool found = true;
>> +    while (found) {
>> +      found = false;
>> +      for (rec = cb->dnd_list.head; rec != nullptr;
>> +           rec = rec->next) {
>> +        osafassert(rec->msg.type == AVND_MSG_AVD);
>> +        // delete all pending messages that haven't been sent out
>> +        if (rec->no_retries == 0) {
>> +          m_AVND_DIQ_REC_POP(cb, rec);
>> +          avnd_diq_rec_del(cb, rec);
>> +          break;
>> +        } else {
>> +          // Assignment response had been sent, but not ack because last
>> +          // controller go down, reset msg_id and will be resent later
>> +          if (rec->msg.info.avd->msg_type == AVSV_N2D_INFO_SU_SI_ASSIGN_MSG) {
>> +            if (rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id != 0) {
>> +              rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0;
>> +              found = true;
>> +              LOG_NO(
>> +                  "Found not-ack su_si_assign msg for SU:'%s', "
>> +                  "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', "
>> +                  "error:'%u', msg_id:'%u'",
>> +                  osaf_extended_name_borrow(&rec->msg.info.avd->msg_info
>> +                                                 .n2d_su_si_assign.su_name),
>> +                  osaf_extended_name_borrow(&rec->msg.info.avd->msg_info
>> +                                                 .n2d_su_si_assign.si_name),
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign.ha_state,
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_act,
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign
>> +                      .single_csi,
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign.error,
>> +                  rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id);
>> +            }
>> +          } else {
>> +            // delete other messages for now
>> +            m_AVND_DIQ_REC_POP(cb, rec);
>> +            avnd_diq_rec_del(cb, rec);
>> +            break;
>> +          }
>> +        }
>>    
>> -    /* delete the record */
>> -    avnd_diq_rec_del(cb, rec);
>> -  } while (1);
>> +      }
>> +    }
>> +  }
>>    
>>      return;
>>    }
>>
> 

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
