Hi Minh,

Thanks for reviewing.
I will change Error to Warning before pushing.

Thanks,
Praveen

On 09-Aug-16 4:48 AM, minh chau wrote:
> Hi Praveen,
>
> This patch also fixes the coredump in the other tests that were failing in
> the test report of #1725 part 1, namely 14, 64, 68, 84, 124, and 128.
> In those test cases, I still get "ER avd_sg_su_oper_list_del: su not
> found".
> Can we change ER to WA?
> Ack from me with this minor comment.
>
> Thanks,
> Minh
>
> On 11/05/16 02:26, praveen.malv...@oracle.com wrote:
>>   osaf/services/saf/amf/amfd/sg_2n_fsm.cc |  24 +++++++++++++++++++-----
>>   1 files changed, 19 insertions(+), 5 deletions(-)
>>
>>
>> In the reported problem, AMFND asserted when the SU was unlocked.
>>
>> For the complete analysis, please refer to the ticket. In short, while
>> AMFND was removing the assignments, it received a duplicate removal of
>> assignment for the same SU because the node hosting the active SU
>> rebooted. This duplicate message was buffered and picked up when the
>> ongoing removal completed. AMFND then processed the buffered message and
>> set the assignment-related flags. Since the SUSIs had already been deleted
>> by the previous removal, no callback processing took place and no response
>> was sent to AMFD for it. AMFND resets all assignment-related flags while
>> responding to AMFD, so for the buffered message the flags were never reset.
>> Later the SU was unlocked and fresh assignments were given to it. After the
>> callback completes, AMFND tries to respond to AMFD; at that point it
>> expects a valid SI pointer for the fresh assignment and checks this with an
>> assert statement. AMFND asserts here as a side effect of the stale
>> assignment-related flags still being set.
>>
>> The patch fixes the problem by not sending a duplicate removal of
>> assignments to AMFND.
>>
>> diff --git a/osaf/services/saf/amf/amfd/sg_2n_fsm.cc
>> b/osaf/services/saf/amf/amfd/sg_2n_fsm.cc
>> --- a/osaf/services/saf/amf/amfd/sg_2n_fsm.cc
>> +++ b/osaf/services/saf/amf/amfd/sg_2n_fsm.cc
>> @@ -3339,9 +3339,7 @@ void SG_2N::node_fail(AVD_CL_CB *cb, AVD
>>             if ((avd_su_state_determine(su) != SA_AMF_HA_STANDBY) &&
>>               !((avd_su_state_determine(su) == SA_AMF_HA_QUIESCED) &&
>> -              (avd_su_fsm_state_determine(su) == AVD_SU_SI_STATE_UNASGN)
>> -            )
>> -            ) {
>> +              (avd_su_fsm_state_determine(su) ==
>> AVD_SU_SI_STATE_UNASGN))) {
>>               /* SU is not standby */
>>               a_susi = avd_sg_2n_act_susi(cb, su->sg_of_su, &s_susi);
>>   @@ -3388,11 +3386,27 @@ void SG_2N::node_fail(AVD_CL_CB *cb, AVD
>>                   } else {
>>                       /* the other SU has quiesced or standby assigned
>> and is in the
>>                        * operation list and is out of service.
>> -                     * Send a D2N-INFO_SU_SI_ASSIGN with remove all
>> to that SU.
>> +                     * Send a D2N-INFO_SU_SI_ASSIGN with remove all
>> to that SU
>> +                     * if not sent already.
>>                        * Remove this SU from operation list. Free the
>>                        * SU SI relationships of this SU.
>>                        */
>> -                    avd_sg_su_si_del_snd(cb, o_su);
>> +
>> +
>> +                    /*
>> +                       As mentioned above other su (o_su) is OOS for
>> quiesced or
>> +                       standby state, it means some admin operation
>> is going on it or
>> +                       it has faulted (su level) which led to OOS.
>> +                       In this function, we are processing node_fail
>> of active/quiesced
>> +                       su. These active/quiesced assignments will be
>> deleted because of
>> +                       node fault and also other su cannot be made
>> active as it is OOS.
>> +                       So AMF will have to remove assignments of
>> other su (o_su) also.
>> +                       Since o_su is OOS, there is a possibility that
>> AMF would have
>> +                       sent deletion of assignment to it because of
>> admin op or fault.
>> +                       If not sent then send it now.
>> +                     */
>> +                    if (all_unassigned(o_su) == false)
>> +                        avd_sg_su_si_del_snd(cb, o_su);
>>                       su->delete_all_susis();
>>                       avd_sg_su_oper_list_del(cb, su, false);
>>                       m_AVD_CHK_OPLIST(o_su, flag);
>>
>
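
For readers following the thread, the "send only if not sent already" guard from the patch can be illustrated with a small standalone model. This is only a sketch, not OpenSAF code: `Su`, `SusiFsm`, `send_remove_all` and `simulate_two_triggers` are hypothetical stand-ins, and `all_unassigned` here merely assumes that the helper used in the patch checks whether every assignment of the SU is already being removed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for AMFD types; not the real OpenSAF definitions.
enum class SusiFsm { Assigned, Unassigning };

struct Su {
  std::vector<SusiFsm> susis;  // one entry per SU-SI assignment
  int removals_sent = 0;       // "remove all" messages sent to the node director
};

// Assumed semantics: true when every assignment is already being removed.
bool all_unassigned(const Su& su) {
  for (SusiFsm s : su.susis) {
    if (s != SusiFsm::Unassigning) return false;
  }
  return true;
}

// Models sending a "remove all assignments" message and marking each SUSI.
void send_remove_all(Su& su) {
  for (SusiFsm& s : su.susis) s = SusiFsm::Unassigning;
  ++su.removals_sent;
}

// Guarded send, mirroring the shape of the patched node_fail() branch.
void remove_if_not_already_sent(Su& su) {
  if (!all_unassigned(su)) send_remove_all(su);
}

// Two independent triggers hit the same out-of-service SU: first an admin
// operation or SU fault, then node_fail of the peer SU.
int simulate_two_triggers() {
  Su o_su;
  o_su.susis = {SusiFsm::Assigned, SusiFsm::Assigned};
  remove_if_not_already_sent(o_su);  // first trigger sends the removal
  remove_if_not_already_sent(o_su);  // second trigger is suppressed
  assert(all_unassigned(o_su));
  return o_su.removals_sent;
}
```

In this model the second trigger sends nothing, so the receiver never has a duplicate removal to buffer, which is the condition the commit message identifies as the root of the stale flags.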

