Hi Rafael, comments inline.
On 2017/01/31 03:54 PM, Rafael Odzakow wrote: > > > On 01/31/2017 05:54 AM, Neelakanta Reddy wrote: >> Hi Rafael, >> >> Comments inline. >> >> On 2017/01/30 07:33 PM, Rafael Odzakow wrote: >>> ACK, with comments after [rafael] tag >>> >> I will add the comments, while pushing. >>> As an improvement after the retries the operation can fail the >>> campaign. Today we only move the problem over to the next campaign? >>> >> No, the first campaign will not fail and the campaign state will be >> moved to campaign completed. >> But saAmfSUMaintenanceCampaign is not cleared, when the next >> campaign is tried it will fail with the error. > That is my point. Why would we want to move a problem over to the next > campaign? Is it not better to fail at the right time? As per the campaign state model, there is no failed state once you commit the campaign. The only state once the campaign committed is "campaign completed". Thanks, Neel. >> >> The patch, retries the modification of saAmfSUMaintenanceCampaign, if >> the ccb operation is aborted because of resource abort. >> >> Thanks, >> Neel. >>> >>> On 01/30/2017 02:25 PM, reddy.neelaka...@oracle.com wrote: >>>> osaf/services/saf/smfsv/smfd/SmfImmOperation.cc | 21 >>>> ++++++++++++++++++--- >>>> osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc | 16 >>>> ++++++++++++++-- >>>> 2 files changed, 32 insertions(+), 5 deletions(-) >>>> >>>> >>>> When the campaign is commited, while modifying >>>> saAmfSUMaintenanceCampaign flags, if any of the IMMND restarts or >>>> new node joins the cluster, then the non-commited ccbs will be >>>> aborted. >>>> >>>> This is a rare case, when saAmfSUMaintenanceCampaign will fail due >>>> to IMMND sync, and and the next campaign will fails with >>>> ER Failed to set maintenance state in step=safSmfStep=0001 >>>> >>>> diff --git a/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc >>>> b/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc >>>> --- a/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc >>>> +++ b/osaf/services/saf/smfsv/smfd/SmfImmOperation.cc >>>> @@ -1033,11 +1033,26 @@ SmfImmModifyOperation::execute(SmfRollba >>>> // objectName.length = (SaUint16T) objectLen; >>>> // memcpy(objectName.value, object, objectLen); >>>> + const SaStringT * errStrings = NULL; >>>> result = immutil_saImmOmCcbObjectModify_2(m_ccbHandle, >>>> &objectName, (const SaImmAttrModificationT_2 **) >>>> - m_immAttrMods); >>>> - if (result != SA_AIS_OK) { >>>> + m_immAttrMods); >>>> + if (result != SA_AIS_OK && result == >>>> SA_AIS_ERR_FAILED_OPERATION) { >>>> + result = saImmOmCcbGetErrorStrings(m_ccbHandle, &errStrings); >>>> + if (errStrings){ >>>> + TRACE("Received error string is %s", errStrings[0]); >>>> + char * type = NULL; >>>> + type = strstr(errStrings[0], "IMM: Resource abort: "); >>>> + if(type != NULL) { >>>> + TRACE("SA_AIS_ERR_FAILED_OPERATION is modified to >>>> SA_AIS_ERR_TRY_AGAIN because of ccb resourse abort" ); >>>> + result = SA_AIS_ERR_TRY_AGAIN; >>>> + TRACE_LEAVE(); >>>> + return result; >>>> + } >>>> + } >>>> + } >>>> + if (result != SA_AIS_OK){ >>>> LOG_NO("SmfImmModifyOperation::execute, >>>> saImmOmCcbObjectModify failed %s", saf_error(result)); >>>> - TRACE_LEAVE(); >>>> + TRACE_LEAVE(); >>>> return result; >>>> } >>>> diff --git a/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc >>>> b/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc >>>> --- a/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc >>>> +++ b/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc >>>> @@ -1063,6 +1063,7 @@ SmfUpgradeCampaign::resetMaintenanceStat >>>> std::list < std::string > objectList; >>>> SmfImmUtils immUtil; >>>> (void)immUtil.getChildren("", objectList, SA_IMM_SUBTREE, >>>> "SaAmfSU"); >>>> + SaAisErrorT rc = SA_AIS_OK; >>>> //Reset saAmfSUMaintenanceCampaign for all found SUs >>>> const std::string campDn = >>>> SmfCampaignThread::instance()->campaign()->getDn(); >>>> @@ -1093,9 +1094,20 @@ SmfUpgradeCampaign::resetMaintenanceStat >>>> } >>>> } >>>> - if (immUtil.doImmOperations(operations) != SA_AIS_OK) { >>>> - LOG_NO("SmfUpgradeStep::setMaintenanceState(), fails to reset all >>>> saAmfSUMaintenanceCampaign attr"); >>>> + const uint32_t MAX_NO_RETRIES = 2; >>>> + uint32_t retry_cnt = 0; >>>> + while (++retry_cnt <= MAX_NO_RETRIES) { >>>> + rc = immUtil.doImmOperations(operations); >>> [rafael] whitespace above >>>> + if (rc != SA_AIS_OK && rc == SA_AIS_ERR_TRY_AGAIN) { >>>> + /* TRY_AGAIN is returned only when ccb is aborted >>> [rafael] whitespace above >>>> + with Resource abort in error string. Resurce abort in >>>> + error string is checked presently for modify >>>> operation */ >>> [rafael] no strict rule but I never use this comment style over >>> multiple lines. Maybe try: >>> /* >>> * >>> */ >>> or >>> // >>> // >>> // >>>> + continue; >>>> + } >>>> } >>>> + if (rc != SA_AIS_OK){ >>>> + LOG_NO("SmfUpgradeStep::setMaintenanceState(), fails to >>>> reset all saAmfSUMaintenanceCampaign attr"); >>>> + } >>>> //Delete the created SmfImmModifyOperation instances >>>> std::list < SmfImmOperation * > ::iterator operIter; >>> >> > ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel