On 01/31/2017 05:54 AM, Neelakanta Reddy wrote:
> Hi Rafael,
>
> Comments inline.
>
> On 2017/01/30 07:33 PM, Rafael Odzakow wrote:
>> ACK, with comments after [rafael] tag
>>
> I will add the comments, while pushing.
>> As an improvement after the retries the operation can fail the 
>> campaign. Today we only move the problem over to the next campaign?
>>
> No, the first campaign will not fail and the campaign state will be 
> moved to campaign completed.
> But saAmfSUMaintenanceCampaign  is not cleared, when the next campaign 
> is tried it will fail with the error.
That is my point. Why would we want to move a problem over to the next 
campaign? Is it not better to fail at the right time?
>
> The patch, retries the modification of saAmfSUMaintenanceCampaign, if 
> the ccb operation is aborted because of resource abort.
>
> Thanks,
> Neel.
>>
>> On 01/30/2017 02:25 PM, reddy.neelaka...@oracle.com wrote:
>>> osaf/services/saf/smfsv/smfd/SmfImmOperation.cc |  21 
>>> ++++++++++++++++++---
>>>   osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc |  16 
>>> ++++++++++++++--
>>>   2 files changed, 32 insertions(+), 5 deletions(-)
>>>
>>>
>>> When the campaign is commited, while modifying 
>>> saAmfSUMaintenanceCampaign  flags, if any of the IMMND restarts or
>>> new node joins the cluster, then the non-commited ccbs will be aborted.
>>>
>>> This is a rare case, when saAmfSUMaintenanceCampaign will fail due 
>>> to IMMND sync, and and the next campaign will fails with
>>> ER Failed to set maintenance state in step=safSmfStep=0001
>>>
>>> diff --git a/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc 
>>> b/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc
>>> --- a/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc
>>> +++ b/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc
>>> @@ -1033,11 +1033,26 @@ SmfImmModifyOperation::execute(SmfRollba
>>>   //   objectName.length = (SaUint16T) objectLen;
>>>   //   memcpy(objectName.value, object, objectLen);
>>>   +    const SaStringT * errStrings = NULL;
>>>       result = immutil_saImmOmCcbObjectModify_2(m_ccbHandle, 
>>> &objectName, (const SaImmAttrModificationT_2 **)
>>> -                          m_immAttrMods);
>>> -    if (result != SA_AIS_OK) {
>>> +            m_immAttrMods);
>>> +    if (result != SA_AIS_OK && result == 
>>> SA_AIS_ERR_FAILED_OPERATION) {
>>> +        result = saImmOmCcbGetErrorStrings(m_ccbHandle, &errStrings);
>>> +        if (errStrings){
>>> +            TRACE("Received error string is %s", errStrings[0]);
>>> +            char * type = NULL;
>>> +            type = strstr(errStrings[0], "IMM: Resource abort: ");
>>> +            if(type != NULL) {
>>> +                TRACE("SA_AIS_ERR_FAILED_OPERATION is modified to 
>>> SA_AIS_ERR_TRY_AGAIN because of ccb resourse abort" );
>>> +                result = SA_AIS_ERR_TRY_AGAIN;
>>> +                TRACE_LEAVE();
>>> +                return result;
>>> +            }
>>> +        }
>>> +    }
>>> +    if (result != SA_AIS_OK){
>>>           LOG_NO("SmfImmModifyOperation::execute, 
>>> saImmOmCcbObjectModify failed %s", saf_error(result));
>>> -                TRACE_LEAVE();
>>> +        TRACE_LEAVE();
>>>           return result;
>>>       }
>>>   diff --git a/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc 
>>> b/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc
>>> --- a/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc
>>> +++ b/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc
>>> @@ -1063,6 +1063,7 @@ SmfUpgradeCampaign::resetMaintenanceStat
>>>           std::list < std::string > objectList;
>>>       SmfImmUtils immUtil;
>>>           (void)immUtil.getChildren("", objectList, SA_IMM_SUBTREE, 
>>> "SaAmfSU");
>>> +    SaAisErrorT rc = SA_AIS_OK;
>>>             //Reset saAmfSUMaintenanceCampaign for all found SUs
>>>           const std::string campDn = 
>>> SmfCampaignThread::instance()->campaign()->getDn();
>>> @@ -1093,9 +1094,20 @@ SmfUpgradeCampaign::resetMaintenanceStat
>>>           }
>>>           }
>>>   -        if (immUtil.doImmOperations(operations) != SA_AIS_OK) {
>>> - LOG_NO("SmfUpgradeStep::setMaintenanceState(), fails to reset all 
>>> saAmfSUMaintenanceCampaign attr");
>>> +    const uint32_t MAX_NO_RETRIES = 2;
>>> +    uint32_t retry_cnt = 0;
>>> +    while (++retry_cnt <= MAX_NO_RETRIES) {
>>> +        rc = immUtil.doImmOperations(operations);
>> [rafael] whitespace above
>>> +        if (rc != SA_AIS_OK && rc == SA_AIS_ERR_TRY_AGAIN) {
>>> +            /* TRY_AGAIN is returned only when ccb is aborted
>> [rafael] whitespace above
>>> +               with Resource abort in error string. Resurce abort in
>>> +               error string is checked presently for modify 
>>> operation */
>> [rafael] no strict rule but I never use this comment style over 
>> multiple lines. Maybe try:
>> /*
>> *
>> */
>> or
>> //
>> //
>> //
>>> +            continue;
>>> +        }
>>>           }
>>> +    if (rc != SA_AIS_OK){
>>> +        LOG_NO("SmfUpgradeStep::setMaintenanceState(), fails to 
>>> reset all saAmfSUMaintenanceCampaign attr");
>>> +    }
>>>             //Delete the created SmfImmModifyOperation instances
>>>           std::list < SmfImmOperation * > ::iterator operIter;
>>
>


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to