Hi Lennart,

On 2016/09/27 02:57 PM, Lennart Lund wrote:
> Hi Neel
>
> I can see a problem here. If a timeout occurs there are two possibilities:
> either the SI-swap has been done or the SI-swap has not been done.
> From the AIS spec:
> "SA_AIS_ERR_TIMEOUT - An implementation-dependent timeout occurred, or the
> timeout, specified by the timeout parameter, occurred before the call could 
> complete.
> It is unspecified whether the call succeeded or whether it did not."
>
> If the SI-swap did succeed, SMF must also have answered the CSI set callback
> for handling this. This means that this process is no longer ACTIVE and
> cannot be allowed to handle the campaign any longer, e.g. to fail the
> campaign. Is this handled correctly as is?
In the case of TIMEOUT, the following condition is checked:

  while (SmfCampaignThread::instance() != NULL) {

When SMF receives the Quiesced state, campaign_oi_deactivate is called,
which terminates SmfCampaignThread.
However, the retry is still performed even after SmfCampaignThread has
terminated.

The call rc = admOp.execute(0); must be guarded with a check of
SmfCampaignThread::instance().
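
A rough sketch of such a guard, not the actual patch. The names SwapRc,
SwapOutcome and run_swap_with_guard are hypothetical. The execute callback
stands in for admOp.execute(0), and campaign_thread_alive stands in for
SmfCampaignThread::instance() != NULL. The two-minute wait in the existing
code is left out to keep the example short.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for the AIS return codes used in the patch.
enum SwapRc { RC_OK, RC_TRY_AGAIN, RC_BUSY, RC_TIMEOUT, RC_FAILED_OPERATION };

// Hypothetical result type for this sketch.
enum SwapOutcome { SWAP_DONE, SWAP_THREAD_GONE, SWAP_FAILED };

// execute:               stands in for admOp.execute(0)
// campaign_thread_alive: stands in for SmfCampaignThread::instance() != NULL
SwapOutcome run_swap_with_guard(
    const std::function<SwapRc()>& execute,
    const std::function<bool()>& campaign_thread_alive,
    int max_retries) {
  assert(max_retries >= 0);
  int retry_cnt = 0;
  for (;;) {
    // Guard: never issue (or re-issue) the swap once the campaign thread
    // has been terminated, e.g. after a Quiesced state was handled.
    if (!campaign_thread_alive()) return SWAP_THREAD_GONE;

    SwapRc rc = execute();
    if (rc == RC_OK) return SWAP_DONE;

    bool retryable = (rc == RC_TRY_AGAIN) || (rc == RC_BUSY) ||
                     (rc == RC_TIMEOUT) || (rc == RC_FAILED_OPERATION);
    if (!retryable) return SWAP_FAILED;

    // Give up once the retry budget is used.
    if (++retry_cnt > max_retries) return SWAP_FAILED;
  }
}
```

With this shape, a TIMEOUT that was in fact a successful swap ends the loop
as soon as the campaign thread is gone, instead of sending a second swap
request.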

I will resend the patch.

Thanks,
Neel.

> It should be possible to check the smfd_cb->ha_state enum to determine if we 
> are still ACTIVE. If not this process must end all campaign handling and 
> become standby. A problem here is that I don't think this is handled. Also if 
> this check is done it would be possible to try again in case of timeout if we 
> are still ACTIVE.
>
> So even if your solution is implemented, a timeout error still requires a
> check of whether we are active or not. If we are still active the campaign
> should fail, but if we are standby nothing should be done.
>
> Thanks
> Lennart
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> Sent: den 27 september 2016 09:21
>> To: Lennart Lund <[email protected]>; Rafael Odzakow
>> <[email protected]>
>> Cc: [email protected]
>> Subject: [PATCH 1 of 1] smf:retry for TIMEOUT in si-swap to be avoided
>> [#2069]
>>
>>   osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc |  4 +---
>>   1 files changed, 1 insertions(+), 3 deletions(-)
>>
>>
>> diff --git a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
>> b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
>> --- a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
>> +++ b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
>> @@ -4247,7 +4247,6 @@ SmfSwapThread::main(void)
>>      int rc = admOp.execute(0);
>>      while ((rc == SA_AIS_ERR_TRY_AGAIN) ||
>>                 (rc == SA_AIS_ERR_BUSY) ||
>> -              (rc == SA_AIS_ERR_TIMEOUT) ||
>>                 (rc == SA_AIS_ERR_FAILED_OPERATION)) {
>>
>>                   if (retryCnt > max_swap_retry) {
>> @@ -4255,8 +4254,7 @@ SmfSwapThread::main(void)
>>                           goto exit_error;
>>                   }
>>
>> -                if ((rc == SA_AIS_ERR_TIMEOUT) ||
>> -                    (rc == SA_AIS_ERR_FAILED_OPERATION)) {
>> +                if (rc == SA_AIS_ERR_FAILED_OPERATION) {
>>                           //A timeout or failed operation occur. It is undefined if the operation was successful or not.
>>                           //We wait for maximum two minutes to see if the campaign thread is terminated (which it is in a successful swap)
>>                           //If not terminated, retry the SWAP operation.

