Hi Nagu,

The patch looks good.
Could we consider an alternative: make the call to
avd_process_state_info_queue() conditional on node_state being
AVD_AVND_STATE_ABSENT, so that we don't have to introduce a new static variable?
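
A rough sketch of the idea, not a drop-in change for ndfsm.cc. The Node struct, on_node_up(), process_state_info_queue() and the call counter are hypothetical stand-ins so that the snippet compiles on its own. It also assumes that the node leaves AVD_AVND_STATE_ABSENT once the first node up has been handled, which would need to be checked against the real node FSM.

```cpp
#include <cassert>

// Stand-in for the amfd node states; only the values used here are modelled.
enum NodeState { AVD_AVND_STATE_ABSENT, AVD_AVND_STATE_PRESENT };

// Hypothetical, simplified node record (not the real AVD_AVND).
struct Node {
  NodeState node_state = AVD_AVND_STATE_ABSENT;
};

// Counts how often the queue processing runs, so the sketch can be checked.
static int process_calls = 0;

// Placeholder for avd_process_state_info_queue(cb).
void process_state_info_queue() { ++process_calls; }

// Hypothetical node up handler: only the first node up from a node
// (while it is still ABSENT) triggers the state info processing.
void on_node_up(Node &node) {
  if (node.node_state == AVD_AVND_STATE_ABSENT) {
    process_state_info_queue();
    // Assumption: the real FSM moves the node out of ABSENT at this point.
    node.node_state = AVD_AVND_STATE_PRESENT;
  }
  // A duplicate node up finds the node already past ABSENT and skips the call.
}
```

With a guard like this at the call site, a duplicate node up never reaches the alarm reset, so the function itself would not need a counter.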

Thanks,
Minh

On 23/06/16 20:45, nagendr...@oracle.com wrote:
>   osaf/services/saf/amf/amfd/ndfsm.cc |  12 +++++++++++-
>   1 files changed, 11 insertions(+), 1 deletions(-)
>
>
> When amfd receives duplicate node up messages from the Act
> amfnd, it tries to reset alarm_sent for the SIs.
> This happens when the cluster is recovering from headless state.
> If that happens, then when those SIs get assigned,
> the alarms are not reset.
> This patch fixes this issue. It avoids resetting alarm_sent
> when duplicate node ups are received.
>
> diff --git a/osaf/services/saf/amf/amfd/ndfsm.cc b/osaf/services/saf/amf/amfd/ndfsm.cc
> --- a/osaf/services/saf/amf/amfd/ndfsm.cc
> +++ b/osaf/services/saf/amf/amfd/ndfsm.cc
> @@ -51,11 +51,13 @@ void avd_process_state_info_queue(AVD_CL
>       uint32_t i;
>       const auto queue_size = cb->evt_queue.size();
>       AVD_EVT_QUEUE *queue_evt = nullptr;
> +     /* Counter for Act Amfnd node up message.*/
> +     static int act_amfnd_node_up_count = 0;
>   
>       TRACE_ENTER();
>   
>       TRACE("queue_size before processing: %lu", (unsigned long) queue_size);
> -
> +     act_amfnd_node_up_count ++;
>       // recover assignments from state info
>       for(i=0 ; i<queue_size ; i++) {
>               queue_evt = cb->evt_queue.front();
> @@ -91,6 +93,13 @@ void avd_process_state_info_queue(AVD_CL
>               }
>       }
>   
> +     /* Alarms shouldn't be reset on subsequent node up messages,
> +        because in the previous node up messages queue_size might have
> +        been zero. In the subsequent node up messages, this might cause
> +        alarm_sent to get reset, and this may cause the unassigned alarm
> +        to persist even though those SIs are assigned after some time. */
> +     if (act_amfnd_node_up_count > 1) goto done;
> +
>       // Once active amfd looks up the state info from queue, that means node sync
>       // finishes. Therefore, if the queue is empty, this active amfd is coming
>       // from a cluster restart, the alarm state should be reset.
> @@ -115,6 +124,7 @@ void avd_process_state_info_queue(AVD_CL
>                       }
>               }
>       }
> +done:
>       TRACE("queue_size after processing: %lu", (unsigned long) cb->evt_queue.size());
>       TRACE_LEAVE();
>   }
>


_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
