Hi Rafael,

Comments inline.

On 2017/01/30 07:33 PM, Rafael Odzakow wrote:
> ACK, with comments after [rafael] tag
>
I will add the comments, while pushing.
> As an improvement after the retries the operation can fail the 
> campaign. Today we only move the problem over to the next campaign?
>
No, the first campaign will not fail and the campaign state will be 
moved to campaign completed.
But saAmfSUMaintenanceCampaign  is not cleared, when the next campaign 
is tried it will fail with the error.

The patch, retries the modification of saAmfSUMaintenanceCampaign, if 
the ccb operation is aborted because of resource abort.

Thanks,
Neel.
>
> On 01/30/2017 02:25 PM, reddy.neelaka...@oracle.com wrote:
>> osaf/services/saf/smfsv/smfd/SmfImmOperation.cc    |  21 
>> ++++++++++++++++++---
>>   osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc |  16 
>> ++++++++++++++--
>>   2 files changed, 32 insertions(+), 5 deletions(-)
>>
>>
>> When the campaign is commited, while modifying 
>> saAmfSUMaintenanceCampaign  flags, if any of the IMMND restarts or
>> new node joins the cluster, then the non-commited ccbs will be aborted.
>>
>> This is a rare case, when saAmfSUMaintenanceCampaign will fail due to 
>> IMMND sync, and and the next campaign will fails with
>> ER Failed to set maintenance state in step=safSmfStep=0001
>>
>> diff --git a/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc 
>> b/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc
>> --- a/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc
>> +++ b/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc
>> @@ -1033,11 +1033,26 @@ SmfImmModifyOperation::execute(SmfRollba
>>   //   objectName.length = (SaUint16T) objectLen;
>>   //   memcpy(objectName.value, object, objectLen);
>>   +    const SaStringT * errStrings = NULL;
>>       result = immutil_saImmOmCcbObjectModify_2(m_ccbHandle, 
>> &objectName, (const SaImmAttrModificationT_2 **)
>> -                          m_immAttrMods);
>> -    if (result != SA_AIS_OK) {
>> +            m_immAttrMods);
>> +    if (result != SA_AIS_OK && result == SA_AIS_ERR_FAILED_OPERATION) {
>> +        result = saImmOmCcbGetErrorStrings(m_ccbHandle, &errStrings);
>> +        if (errStrings){
>> +            TRACE("Received error string is %s", errStrings[0]);
>> +            char * type = NULL;
>> +            type = strstr(errStrings[0], "IMM: Resource abort: ");
>> +            if(type != NULL) {
>> +                TRACE("SA_AIS_ERR_FAILED_OPERATION is modified to 
>> SA_AIS_ERR_TRY_AGAIN because of ccb resourse abort" );
>> +                result = SA_AIS_ERR_TRY_AGAIN;
>> +                TRACE_LEAVE();
>> +                return result;
>> +            }
>> +        }
>> +    }
>> +    if (result != SA_AIS_OK){
>>           LOG_NO("SmfImmModifyOperation::execute, 
>> saImmOmCcbObjectModify failed %s", saf_error(result));
>> -                TRACE_LEAVE();
>> +        TRACE_LEAVE();
>>           return result;
>>       }
>>   diff --git a/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc 
>> b/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc
>> --- a/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc
>> +++ b/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc
>> @@ -1063,6 +1063,7 @@ SmfUpgradeCampaign::resetMaintenanceStat
>>           std::list < std::string > objectList;
>>       SmfImmUtils immUtil;
>>           (void)immUtil.getChildren("", objectList, SA_IMM_SUBTREE, 
>> "SaAmfSU");
>> +    SaAisErrorT rc = SA_AIS_OK;
>>             //Reset saAmfSUMaintenanceCampaign for all found SUs
>>           const std::string campDn = 
>> SmfCampaignThread::instance()->campaign()->getDn();
>> @@ -1093,9 +1094,20 @@ SmfUpgradeCampaign::resetMaintenanceStat
>>           }
>>           }
>>   -        if (immUtil.doImmOperations(operations) != SA_AIS_OK) {
>> -                LOG_NO("SmfUpgradeStep::setMaintenanceState(), fails 
>> to reset all saAmfSUMaintenanceCampaign attr");
>> +    const uint32_t MAX_NO_RETRIES = 2;
>> +    uint32_t retry_cnt = 0;
>> +    while (++retry_cnt <= MAX_NO_RETRIES) {
>> +        rc = immUtil.doImmOperations(operations);
> [rafael] whitespace above
>> +        if (rc != SA_AIS_OK && rc == SA_AIS_ERR_TRY_AGAIN) {
>> +            /* TRY_AGAIN is returned only when ccb is aborted
> [rafael] whitespace above
>> +               with Resource abort in error string. Resurce abort in
>> +               error string is checked presently for modify 
>> operation */
> [rafael] no strict rule but I never use this comment style over 
> multiple lines. Maybe try:
> /*
> *
> */
> or
> //
> //
> //
>> +            continue;
>> +        }
>>           }
>> +    if (rc != SA_AIS_OK){
>> +        LOG_NO("SmfUpgradeStep::setMaintenanceState(), fails to 
>> reset all saAmfSUMaintenanceCampaign attr");
>> +    }
>>             //Delete the created SmfImmModifyOperation instances
>>           std::list < SmfImmOperation * > ::iterator operIter;
>


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to