Hi Nagu

I think there might still be race conditions that need to be fixed.

Seems to be at least two places where rcv_msg_id is reset to 0 in the main 
thread, that might interfere with the MDS thread’s use of rcv_msg_id.

Thanks
Gary



On 12/04/2016, 5:12 PM, "Nagendra Kumar" <nagendr...@oracle.com> wrote:

>Hi Hans N,
>               Yes, sure. That would be useful. Thanks for the information.    
>
>-Nagu
>
>> -----Original Message-----
>> From: Hans Nordebäck [mailto:hans.nordeb...@ericsson.com]
>> Sent: 12 April 2016 12:35
>> To: Nagendra Kumar; Praveen Malviya; minh.c...@dektech.com.au;
>> gary....@dektech.com.au
>> Cc: opensaf-devel@lists.sourceforge.net
>> Subject: Re: [PATCH 1 of 1] amfnd: remove receive message id counter from
>> threads [#1728]
>> 
>> Hi Nagu,
>> 
>> fine, you can also use the valgrind tool helgrind to check for threading
>> issues, e.g in amfnd.conf add:
>> 
>> export TOOL="valgrind --tool=helgrind --log-file=/tmp/amfnd.valgrind"
>> 
>> and in /usr/local/lib/opensaf/clc-cli/osaf-amfnd change:
>> 
>>      #start_daemon -p $pidfile $binary $args
>>      $TOOL $binary $args
>> 
>> /Thanks HansN
>> 
>> 
>> 
>> On 04/12/2016 08:58 AM, Nagendra Kumar wrote:
>> > Hi Hans N,
>> >            I checked avnd_comp_clc_cmd_execute() and found that
>> comp->clc_info.exec_cmd is being used and overwritten in this function.
>> > Since SU being instantiated in imm reader thread is being instantiated 
>> > first
>> time and the components are just being instantiated, so they will not be
>> violating any data access with main thread, where other SUs may be getting
>> instantiating, termination, etc.
>> > After running instantiate command on one of the component, any way, the
>> response will be processed from main thread only. So, in one meaning, Imm
>> reader thread is just reading all imm specific information and invoking
>> initialization for one component of SU and ends its job.
>> >
>> > If you find any other specific thing, then please let me know.
>> >
>> > Thanks
>> > -Nagu
>> >> -----Original Message-----
>> >> From: Hans Nordebäck [mailto:hans.nordeb...@ericsson.com]
>> >> Sent: 11 April 2016 19:27
>> >> To: Nagendra Kumar; Praveen Malviya; minh.c...@dektech.com.au;
>> >> gary....@dektech.com.au
>> >> Cc: opensaf-devel@lists.sourceforge.net
>> >> Subject: Re: [PATCH 1 of 1] amfnd: remove receive message id counter
>> from
>> >> threads [#1728]
>> >>
>> >> Hi Nagu,
>> >>
>> >> a question, it looks like e.g. avnd_comp_clc_cmd_execute() may run in
>> >> parallel wrt the main thread
>> >> and the new ImmReader, not sure but have you checked for data races in
>> >> this function?
>> >>
>> >> /Thanks HansN
>> >>
>> >> On 04/11/2016 01:49 PM, nagendr...@oracle.com wrote:
>> >>>    osaf/services/saf/amf/amfnd/amfnd.cc |   3 -
>> >>>    osaf/services/saf/amf/amfnd/comp.cc  |   3 -
>> >>>    osaf/services/saf/amf/amfnd/di.cc    |  24 ---------
>> >>>    osaf/services/saf/amf/amfnd/mds.cc   |  89
>> >> ++++++++++++++++++++++++++++++++++++
>> >>>    osaf/services/saf/amf/amfnd/su.cc    |  12 ----
>> >>>    osaf/services/saf/amf/amfnd/susm.cc  |   9 ---
>> >>>    osaf/services/saf/amf/amfnd/term.cc  |   4 -
>> >>>    7 files changed, 89 insertions(+), 55 deletions(-)
>> >>>
>> >>>
>> >>> Since, the message id counter is updated by main thread and
>> >>> imm reader thread simultaneouspy, so there is chance that
>> >>> the counter may be updated in any order and make Amfd and
>> >>> Amfnd counters Out of Order.
>> >>> So, such msg id counter update should be removed
>> >>> from both and should be kep at a common location.
>> >>> This patch fixes the same.
>> >>>
>> >>> diff --git a/osaf/services/saf/amf/amfnd/amfnd.cc
>> >> b/osaf/services/saf/amf/amfnd/amfnd.cc
>> >>> --- a/osaf/services/saf/amf/amfnd/amfnd.cc
>> >>> +++ b/osaf/services/saf/amf/amfnd/amfnd.cc
>> >>> @@ -386,9 +386,6 @@ uint32_t avnd_evt_avd_reboot_evh(AVND_CB
>> >>>
>> >>>          osafassert(AVSV_D2N_REBOOT_MSG == evt->info.avd->msg_type);
>> >>>
>> >>> -        avnd_msgid_assert(info->msg_id);
>> >>> -        cb->rcv_msg_id = info->msg_id;
>> >>> -
>> >>>          /* Clear error report related alarms before reboot.
>> >>>             TODO: This for loop can be removed if AVD remembers and
>> >> checkpoints
>> >>>             alarms sent due to error report.
>> >>> diff --git a/osaf/services/saf/amf/amfnd/comp.cc
>> >> b/osaf/services/saf/amf/amfnd/comp.cc
>> >>> --- a/osaf/services/saf/amf/amfnd/comp.cc
>> >>> +++ b/osaf/services/saf/amf/amfnd/comp.cc
>> >>> @@ -2691,9 +2691,6 @@ uint32_t
>> avnd_evt_comp_admin_op_req(AVND
>> >>>
>> >>>          TRACE_ENTER2("'%s' op=%u", info->dn.value, info->oper_id);
>> >>>
>> >>> -        avnd_msgid_assert(info->msg_id);
>> >>> -        cb->rcv_msg_id = info->msg_id;
>> >>> -
>> >>>          comp = m_AVND_COMPDB_REC_GET(cb->compdb, info->dn);
>> >>>          osafassert( comp != nullptr);
>> >>>
>> >>> diff --git a/osaf/services/saf/amf/amfnd/di.cc
>> >> b/osaf/services/saf/amf/amfnd/di.cc
>> >>> --- a/osaf/services/saf/amf/amfnd/di.cc
>> >>> +++ b/osaf/services/saf/amf/amfnd/di.cc
>> >>> @@ -108,9 +108,6 @@ static uint32_t avnd_evt_node_admin_op_r
>> >>>
>> >>>          TRACE_ENTER2("%s op=%u", info->dn.value, info->oper_id);
>> >>>
>> >>> -        avnd_msgid_assert(info->msg_id);
>> >>> -        cb->rcv_msg_id = info->msg_id;
>> >>> -
>> >>>          switch(info->oper_id) {
>> >>>          default:
>> >>>                  LOG_NO("%s: unsupported adm op %u", __FUNCTION__,
>> >> info->oper_id);
>> >>> @@ -200,17 +197,6 @@ uint32_t avnd_evt_avd_operation_request_
>> >>>
>> >>>          TRACE_ENTER2("Class=%u, action=%u", param->class_id, param-
>> >>> act);
>> >>>
>> >>> -        /* dont process unless AvD is up */
>> >>> -        if (!m_AVND_CB_IS_AVD_UP(cb))
>> >>> -                goto done;
>> >>> -
>> >>> -        // TODO() hide the below code in a "set_msg_id()" function
>> >>> -        // If message was not broadcasted, (msg_id == 0)
>> >>> -        if (info->msg_id != 0) {
>> >>> -                avnd_msgid_assert(info->msg_id);
>> >>> -                cb->rcv_msg_id = info->msg_id;
>> >>> -        }
>> >>> -
>> >>>          switch (param->class_id) {
>> >>>          case AVSV_SA_AMF_NODE:
>> >>>                  rc = avnd_node_oper_req(cb, param);
>> >>> @@ -260,7 +246,6 @@ uint32_t avnd_evt_avd_operation_request_
>> >>>                  avnd_msg_content_free(cb, &msg);
>> >>>          }
>> >>>
>> >>> -done:
>> >>>          TRACE_LEAVE();
>> >>>          return rc;
>> >>>    }
>> >>> @@ -1410,20 +1395,11 @@ uint32_t
>> avnd_evt_avd_role_change_evh(AV
>> >>>
>> >>>          TRACE_ENTER();
>> >>>
>> >>> -        /* dont process unless AvD is up */
>> >>> -        if (!m_AVND_CB_IS_AVD_UP(cb)){
>> >>> -                LOG_IN("AVD is not up yet");
>> >>> -                return NCSCC_RC_FAILURE;
>> >>> -        }
>> >>> -
>> >>>          info = &evt->info.avd->msg_info.d2n_role_change_info;
>> >>>
>> >>>          TRACE("MsgId: %u,NodeId:%u, role rcvd:%u role present:%u",\
>> >>>                                info->msg_id, info->node_id, info->role, 
>> >>> cb-
>> >>> avail_state_avnd);
>> >>>
>> >>> -        avnd_msgid_assert(info->msg_id);
>> >>> -        cb->rcv_msg_id = info->msg_id;
>> >>> -
>> >>>          prev_ha_state = cb->avail_state_avnd;
>> >>>
>> >>>          /* Ignore the duplicate roles. */
>> >>> diff --git a/osaf/services/saf/amf/amfnd/mds.cc
>> >> b/osaf/services/saf/amf/amfnd/mds.cc
>> >>> --- a/osaf/services/saf/amf/amfnd/mds.cc
>> >>> +++ b/osaf/services/saf/amf/amfnd/mds.cc
>> >>> @@ -300,6 +300,93 @@ done:
>> >>>          return rc;
>> >>>    }
>> >>>
>> >>> +/**
>> >>> + * @brief  This function update the receive msg id with the msg id sent
>> >>> + *         from Amfd.
>> >>> + * @param  ptr to avnd_cb.
>> >>> + * @param  ptr to evt.
>> >>> + * @return  Nothing.
>> >>> + */
>> >>> +static void update_rcv_msg_id(AVND_CB *cb, AVND_EVT *evt)
>> >>> +{
>> >>> +        uint32_t msg_id;
>> >>> +        switch (evt->type) {
>> >>> +                case AVND_EVT_AVD_OPERATION_REQUEST_MSG:
>> >>> +                        {
>> >>> +
>> >>   AVSV_D2N_OPERATION_REQUEST_MSG_INFO *op_info = &evt-
>> >>> info.avd->msg_info.d2n_op_req;
>> >>> +                                msg_id = op_info->msg_id;
>> >>> +                                /* dont update the counter unless AND 
>> >>> is up.
>> >> */
>> >>> +                                if (!m_AVND_CB_IS_AVD_UP(cb))
>> >>> +                                        goto done;
>> >>> +
>> >>> +                                // TODO() hide the below code in a
>> >> "set_msg_id()" function
>> >>> +                                // If message was not broadcasted, 
>> >>> (msg_id
>> >> == 0)
>> >>> +                                if (msg_id == 0)
>> >>> +                                        goto done;
>> >>> +                        }
>> >>> +                        break;
>> >>> +                case AVND_EVT_AVD_ROLE_CHANGE_MSG:
>> >>> +                        {
>> >>> +                                AVSV_D2N_ROLE_CHANGE_INFO *rc_info =
>> >> &evt->info.avd->msg_info.d2n_role_change_info;
>> >>> +                                msg_id = rc_info->msg_id;
>> >>> +                                /* dont update the counter unless AND 
>> >>> is up.
>> >> */
>> >>> +                                if (!m_AVND_CB_IS_AVD_UP(cb))
>> >>> +                                        goto done;
>> >>> +                        }
>> >>> +                        break;
>> >>> +                case AVND_EVT_AVD_SU_PRES_MSG:
>> >>> +                        {
>> >>> +                                AVSV_D2N_PRESENCE_SU_MSG_INFO
>> >> *pres_info = &evt->info.avd->msg_info.d2n_prsc_su;
>> >>> +                                msg_id = pres_info->msg_id;
>> >>> +                                /* dont update the counter unless AND 
>> >>> is up.
>> >> */
>> >>> +                                if (!m_AVND_CB_IS_AVD_UP(cb))
>> >>> +                                        return;
>> >>> +                        }
>> >>> +                        break;
>> >>> +                case AVND_EVT_AVD_REG_SU_MSG:
>> >>> +                        {
>> >>> +                                AVSV_D2N_REG_SU_MSG_INFO *reg_info =
>> >> &evt->info.avd->msg_info.d2n_reg_su;
>> >>> +                                msg_id = reg_info->msg_id;
>> >>> +                                /* dont update the counter unless AND 
>> >>> is up.
>> >> */
>> >>> +                                if (!m_AVND_CB_IS_AVD_UP(cb))
>> >>> +                                        return;
>> >>> +                        }
>> >>> +                        break;
>> >>> +                case AVND_EVT_AVD_INFO_SU_SI_ASSIGN_MSG:
>> >>> +                        {
>> >>> +                                AVSV_D2N_INFO_SU_SI_ASSIGN_MSG_INFO
>> >> *susi_info = &evt->info.avd->msg_info.d2n_su_si_assign;
>> >>> +                                msg_id = susi_info->msg_id;
>> >>> +                        }
>> >>> +                        break;
>> >>> +                case AVND_EVT_AVD_SET_LEDS_MSG:
>> >>> +                        {
>> >>> +                                AVSV_D2N_SET_LEDS_MSG_INFO *led_info
>> >> = &evt->info.avd->msg_info.d2n_set_leds;
>> >>> +                                msg_id = led_info->msg_id;
>> >>> +                        }
>> >>> +                        break;
>> >>> +                case AVND_EVT_AVD_ADMIN_OP_REQ_MSG:
>> >>> +                        {
>> >>> +                                AVSV_D2N_ADMIN_OP_REQ_MSG_INFO
>> >> *adm_info = &evt->info.avd->msg_info.d2n_admin_op_req_info;
>> >>> +                                msg_id = adm_info->msg_id;
>> >>> +                        }
>> >>> +                        break;
>> >>> +                case AVND_EVT_AVD_REBOOT_MSG:
>> >>> +                        {
>> >>> +                                AVSV_D2N_REBOOT_MSG_INFO
>> >> *reboot_info = &evt->info.avd->msg_info.d2n_reboot_info;
>> >>> +                                msg_id = reboot_info->msg_id;
>> >>> +                                break;
>> >>> +                        }
>> >>> +                default:
>> >>> +                        goto done;
>> >>> +        }
>> >>> +
>> >>> +
>> >>> +        avnd_msgid_assert(msg_id);
>> >>> +        cb->rcv_msg_id = msg_id;
>> >>> +done:
>> >>> +        return;
>> >>> +}
>> >>> +
>> >>>
>> >>
>> /*************************************************************
>> >> ***************
>> >>>      Name          : avnd_mds_rcv
>> >>>
>> >>> @@ -432,6 +519,8 @@ uint32_t avnd_mds_rcv(AVND_CB *cb, MDS_C
>> >>>          /* nullify the msg as it is used in the event */
>> >>>          memset(&msg, 0, sizeof(AVND_MSG));
>> >>>
>> >>> +        /* Update the receive msg id counter. */
>> >>> +        update_rcv_msg_id(cb, evt);
>> >>>          /* send the event */
>> >>>          rc = avnd_evt_send(cb, evt);
>> >>>
>> >>> diff --git a/osaf/services/saf/amf/amfnd/su.cc
>> >> b/osaf/services/saf/amf/amfnd/su.cc
>> >>> --- a/osaf/services/saf/amf/amfnd/su.cc
>> >>> +++ b/osaf/services/saf/amf/amfnd/su.cc
>> >>> @@ -125,15 +125,8 @@ uint32_t avnd_evt_avd_reg_su_evh(AVND_CB
>> >>>
>> >>>          TRACE_ENTER();
>> >>>
>> >>> -        /* dont process unless AvD is up */
>> >>> -        if (!m_AVND_CB_IS_AVD_UP(cb))
>> >>> -                goto done;
>> >>> -
>> >>>          info = &evt->info.avd->msg_info.d2n_reg_su;
>> >>>
>> >>> -        avnd_msgid_assert(info->msg_id);
>> >>> -        cb->rcv_msg_id = info->msg_id;
>> >>> -
>> >>>          /*
>> >>>           * Check whether SU updates are received after fail-over then
>> >>>           * call a separate processing function.
>> >>> @@ -396,8 +389,6 @@ uint32_t avnd_evt_avd_info_su_si_assign_
>> >>>                  }
>> >>>          }
>> >>>
>> >>> -        avnd_msgid_assert(info->msg_id);
>> >>> -        cb->rcv_msg_id = info->msg_id;
>> >>>
>> >>>          if (info->msg_act == AVSV_SUSI_ACT_ASGN) {
>> >>>                  /* SI rank and CSI capability (originally from 
>> >>> SaAmfCtCsType)
>> >>> @@ -622,9 +613,6 @@ uint32_t avnd_evt_su_admin_op_req(AVND_C
>> >>>
>> >>>          TRACE_ENTER2("%s op=%u", info->dn.value, info->oper_id);
>> >>>
>> >>> -        avnd_msgid_assert(info->msg_id);
>> >>> -        cb->rcv_msg_id = info->msg_id;
>> >>> -
>> >>>          su = m_AVND_SUDB_REC_GET(cb->sudb, info->dn);
>> >>>          osafassert(su != nullptr);
>> >>>
>> >>> diff --git a/osaf/services/saf/amf/amfnd/susm.cc
>> >> b/osaf/services/saf/amf/amfnd/susm.cc
>> >>> --- a/osaf/services/saf/amf/amfnd/susm.cc
>> >>> +++ b/osaf/services/saf/amf/amfnd/susm.cc
>> >>> @@ -1277,12 +1277,6 @@ uint32_t
>> avnd_evt_avd_su_pres_evh(AVND_C
>> >>>
>> >>>          TRACE_ENTER();
>> >>>
>> >>> -        /* dont process unless AvD is up */
>> >>> -        if (!m_AVND_CB_IS_AVD_UP(cb)) {
>> >>> -                TRACE("AVD is not yet up");
>> >>> -                goto done;
>> >>> -        }
>> >>> -
>> >>>          /* since AMFND is going down no need to process SU
>> >> instantiate/terminate msg
>> >>>           * because SU instantiate will cause component information to be
>> >> read from IMMND
>> >>>           * and IMMND might have been already terminated and in that case
>> >> AMFND will osafassert */
>> >>> @@ -1293,9 +1287,6 @@ uint32_t avnd_evt_avd_su_pres_evh(AVND_C
>> >>>
>> >>>          info = &evt->info.avd->msg_info.d2n_prsc_su;
>> >>>
>> >>> -        avnd_msgid_assert(info->msg_id);
>> >>> -        cb->rcv_msg_id = info->msg_id;
>> >>> -
>> >>>          /* get the su */
>> >>>          su = m_AVND_SUDB_REC_GET(cb->sudb, info->su_name);
>> >>>          if (!su) {
>> >>> diff --git a/osaf/services/saf/amf/amfnd/term.cc
>> >> b/osaf/services/saf/amf/amfnd/term.cc
>> >>> --- a/osaf/services/saf/amf/amfnd/term.cc
>> >>> +++ b/osaf/services/saf/amf/amfnd/term.cc
>> >>> @@ -196,14 +196,10 @@ done:
>> >>>
>> >>
>> **************************************************************
>> >> ****************/
>> >>>    uint32_t avnd_evt_avd_set_leds_evh(AVND_CB *cb, AVND_EVT *evt)
>> >>>    {
>> >>> -        AVSV_D2N_SET_LEDS_MSG_INFO *info = &evt->info.avd-
>> >>> msg_info.d2n_set_leds;
>> >>>          uint32_t rc = NCSCC_RC_SUCCESS;
>> >>>
>> >>>          TRACE_ENTER();
>> >>>
>> >>> -        avnd_msgid_assert(info->msg_id);
>> >>> -        cb->rcv_msg_id = info->msg_id;
>> >>> -
>> >>>          if (cb->led_state == AVND_LED_STATE_GREEN) {
>> >>>                  /* Nothing to be done we have already got this msg */
>> >>>                  goto done;
>> 
>> 


------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to