Hi Anders Widell, Please find my response .
On 1/5/2016 3:26 PM, Anders Widell wrote: > See comments inline. > > regards, > Anders Widell > > On 01/04/2016 11:00 AM, A V Mahesh wrote: >> Hi Anders Widell, >> >> Please find my answers. >> >> On 1/4/2016 2:20 PM, Anders Widell wrote: >>> Ok good, that will solve the backwards compatibility issue! Though I >>> still don't understand what the problem would be if we let the >>> function time out at the time specified by the user, even when the >>> user wants us to time out after less than fourteen seconds. If we go >>> for a new API version, wouldn't it be best if we let the new version >>> of the API treat any non-negative timeout parameter value as valid? >> [AVM] Currently all OpenSaf (IMMA/CPA/LGA/NTFA/ect .... ) API are >> tested using default ~10 second as its MAX Transport MDS timeout >> value (Implicitly by CPA or explicitly passed by user/application) , >> and we haven't tested Opensaf with less value than >> this on a normal 4 node cluster , if we have a requirement of even >> when the user wants us to time out after less than default timeout >> vault based >> on his cluster size or performance of system, we can >> make this as exportable/configurable variable at Transport level. > [AndersW] The synchronous SAF API functions that don't take an > explicit timeout parameter is a different story - for them I think it > is fine to just test with the default timeout of 10 seconds although > the user has the possibility to change the timeout using an > environment variable (at least for the IMM API). > > For the API functions that take an explicit timeout parameter, we > can't expect the user to always set this parameter to ten seconds. I > understand that you cannot test with all possible values of the > timeout parameter, since the timeout parameter is a 64-bit integer. > You ought to have one test case where the SAF API call does time out, > and anther test case where it doesn't time out. That, plus a few tests > with extreme timeout values (0, 1, SA_TIME_MAX, and -1) ought to be > enough. > > If we go for bumping the CKPT API version then we don't have to worry > about backwards compatibility, and then I think the new version of the > API should support the full range of the timeout parameter, i.e. [0, > SA_TIME_MAX]. That is better than imposing some arbitrary limits (why > 14 seconds?). It shouldn't be difficult to implement support for the > full range of timeout values, since the underlying POSIX functions > that we use do support them. And for the testing: we anyhow do need to > test both the case when the timeout happens and when it doesn't. [AVM] If we should support the full range of the timeout parameter, i.e. [0, SA_TIME_MAX] , even we don't have an issue of application backwards compatibility ( bumping the CKPT API version is NOT required ). OK then I will do following change in new ticket for the SAF API functions in all services that take an explicit timeout parameter: 1) API should support the full range of the timeout parameter, i.e. [0, SA_TIME_MAX]. 2) API will unconditionally return SA_AIS_ERR_TIMEOUT if application specify > 0x98968000000000 as timeout and it will be documented as well ( because the current implementation of MDS header adopting the size of the timeout parameter from 32 bits to 64 bits is complex , so I will explore the possibility of this in an enhancement ) . >> >>> >>> Another thing: it looks like the timeout parameter is used both by >>> the agent (to time out waiting for the reply from the server) as >>> well as by the server. Or, at least it is passed as a parameter in >>> the Open request message to the CKPT node director. However, when I >>> look at the CKPT node director code, as far as I can tell it looks >>> like the timeout parameter isn't really used for anything. If I am >>> correct, we end up in the cpnd_evt_proc_ckpt_open() function. It has >>> a local variable called "timeout", and it reads the timeout >>> parameter from the message it has received from the agent. But it is >>> not used! Could you please take a look at this? >> [AVM] This timeout is used only for Transport (MDS) sync_send >> message timeout between CPA<--->CPND ( >> mds_info.info.svc_send.info.sndrsp.i_time_to_wait ) >> and CPND only has interaction with /dev/shmR/W >> operation to create/read/write to Physical ckpt replicas ( the real >> checkpoint data) , which varies for system to system configuration. >> >> CPND--->CPD--->CPND only control data, so the the >> default timeout value is used between CPND & CPD . > [AndersW] Ok, so then the part for handling the timeout variable in > cpnd_evt_proc_ckpt_open() is just dead code that ought to be removed? [AVM] yes this can be removed. -AVM >> >> -AVM >> >>> >>> thanks, >>> Anders Widell >>> >>> On 01/04/2016 07:19 AM, A V Mahesh wrote: >>>> Hi Anders Widell >>>> >>>> Ok I will bump the minor version of the CKPT API, or to use ELF >>>> symbol version as you suggested. >>>> >>>> -AVM >>>> >>>> On 12/29/2015 3:49 PM, Anders Widell wrote: >>>>> Hi! >>>>> >>>>> See my comments marked [AndersW] inline. >>>>> >>>>> regards, >>>>> Anders Widell >>>>> >>>>> On 12/28/2015 05:35 AM, A V Mahesh wrote: >>>>>> Hi Anders Widell, >>>>>> >>>>>> Please find my answers. >>>>>> >>>>>> On 12/22/2015 10:41 PM, Anders Widell wrote: >>>>>>> See comments inline. >>>>>>> >>>>>>> regards, >>>>>>> Anders Widell >>>>>>> >>>>>>> On 12/21/2015 11:04 AM, A V Mahesh wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> - The timeout here is the duration for which the API should >>>>>>>> wait before returning to the user. >>>>>>>> - The patch does not changes the minimum value of 14 seconds. >>>>>>>> IE. Before this patch also the default value was 14 seconds. >>>>>>>> This value was set based on implementation specific fine >>>>>>>> tuning, particularly in large size cluster. >>>>>>> I think it does change the minimum timeout. I am here referring >>>>>>> to the functions that take an SaTimeT timeout parameter. You can >>>>>>> try this example program, that used to work before ticket >>>>>>> [#1583] was pushed: >>>>>>> >>>>>>> const SaNameT checkpointName = {strlen("safCkpt=hello"), >>>>>>> "safCkpt=hello"}; >>>>>>> do { >>>>>>> err = saCkptCheckpointOpen( >>>>>>> ckptHandle, >>>>>>> &checkpointName, >>>>>>> &checkpointCreationAttributes, >>>>>>> SA_CKPT_CHECKPOINT_READ | >>>>>>> SA_CKPT_CHECKPOINT_WRITE | >>>>>>> SA_CKPT_CHECKPOINT_CREATE, >>>>>>> 10 * SA_TIME_ONE_SECOND, >>>>>>> &checkpointHandle); >>>>>>> >>>>>> [AVM] >>>>>> >>>>>> Even though previously ( before this patch) >>>>>> Application was passing 10 * SA_TIME_ONE_SECOND ( ) >>>>>> because of #1583 reported real issue of `SaTimeT >>>>>> timeout parameter is not being converted in to TEN_MILLI_SEC() >>>>>> while passing to mds_msg_sync_send()` >>>>>> internally CPA was assuming Application is >>>>>> passed 10 SECOND * 10000000 ( 10 * 1000000000 * 10000000) >>>>>> ,which is being treated as larger than the default >>>>>> timeout time of 14 SECOND, so previously the >>>>>> Application only has an impression that it will get time out in >>>>>> 10 SECOND , but in piratical it was considered larger >>>>>> than 14 SECOND default time. >>>>> [AndersW] Yes this is a bug and I am not arguing against fixing >>>>> it. I don't think a fix for this would be backwards incompatible, >>>>> since an application shouldn't depend on a bug in OpenSAF (we >>>>> don't need to be bug-backwards-compatible, except perhaps in some >>>>> unusual situations). But this is not the problem that I am talking >>>>> about. The problem I am talking about is that the range of allowed >>>>> timeout values has become smaller as part of the fix. Such a >>>>> change is typically not backwards compatible, since an application >>>>> could be using a timeout value which lies within the previously >>>>> allowed range of timeout values, but which is outside the new >>>>> (smaller) range of allowed values. The example program I provided >>>>> demonstrates this problem. It works before [#1583] was pushed, but >>>>> not afterwards. The program uses the checkpoint API in a perfectly >>>>> valid and reasonable way. Thus, it provides a clear evidence that >>>>> the change is not backwards compatible. If you don't believe me, >>>>> please try the program for yourself. >>>>>> >>>>>> ================================================================================================================= >>>>>> >>>>>> >>>>>> Before #1583 Fix : >>>>>> saCkptCheckpointOpen( ckptHandle, >>>>>> &checkpointName, ....., 10 * SA_TIME_ONE_SECOND, >>>>>> &checkpointHandle); >>>>>> timeout = 10 * SA_TIME_ONE_SECOND; >>>>>> proc_rc = >>>>>> cpa_mds_msg_sync_send(cb->cpa_mds_hdl, &(cb->cpnd_mds_dest), >>>>>> &evt, &out_evt, timeout); >>>>>> >>>>>> After #1583 Fix : >>>>>> timeout = 10 * SA_TIME_ONE_SECOND; >>>>>> time_out = >>>>>> m_CPSV_CONVERT_SATIME_TEN_MILLI_SEC(timeout); >>>>>> proc_rc = >>>>>> cpa_mds_msg_sync_send(cb->cpa_mds_hdl, &(cb->cpnd_mds_dest), >>>>>> &evt, &out_evt, time_out); >>>>>> >>>>>> ================================================================================================================= >>>>>> >>>>>> >>>>>> >>>>>> In other words , before this patch in reality the >>>>>> saCkptCheckpointOpen() was not being returned TIMEOUT even after >>>>>> 10 SECOND ( but Application expecting it). >>>>>> Application was only in assumption it was passing >>>>>> 10 SECOND , so no change in behavioral difference in Application >>>>>> before and after this patch. >>>>> [AndersW] As I said above, this is not the part of the change that >>>>> I have problems with. It is the following passage: >>>>> >>>>> Before (note: NCS_SAF_MIN_ACCEPT_TIME has the value 10): >>>>> >>>>> if (time_out < NCS_SAF_MIN_ACCEPT_TIME) { >>>>> rc = SA_AIS_ERR_TIMEOUT; >>>>> TRACE_4("Cpa CkptOpen failed Api failed with return >>>>> value:%d,ckptHandle:%llx,timeout:%llu", rc, ckptHandle, timeout); >>>>> goto mds_send_fail; >>>>> } >>>>> >>>>> >>>>> After (note: CPSV_WAIT_TIME has the value 1400): >>>>> >>>>> if (time_out < CPSV_WAIT_TIME) { >>>>> rc = SA_AIS_ERR_TIMEOUT; >>>>> TRACE_4("Cpa CkptOpen failed Api failed with return >>>>> value:%d,ckptHandle:%llx,timeout:%llu", >>>>> rc, ckptHandle, timeout); >>>>> goto mds_send_fail; >>>>> } >>>>> >>>>> So yes, there is a change in behaviour from the application's >>>>> point of view. If the timeout (after conversion to a unit of 10 >>>>> milliseconds) falls within the range 10 to 1400, the application >>>>> will now always get the SA_AIS_ERR_TIMEOUT. Previously, the >>>>> function may have succeeded. >>>>>> >>>>>>> >>>>>>>> - Even though it is not a practical case of passing 49 days >>>>>>>> (greater than or equal to 0x98968000000000) as a timeout value, >>>>>>>> yes what you say can be done across all services. >>>>>>> Shouldn't we then document 0x98968000000000 as the upper limit >>>>>>> for the timeout parameter, and validate that the parameter given >>>>>>> by the user is less than this upper limit. As it is coded right >>>>>>> now, the function will unconditionally return SA_AIS_ERR_TIMEOUT >>>>>>> if you specify 0x98968000000000 as timeout (you can try this by >>>>>>> replacing 10 * SA_TIME_ONE_SECOND with 0x98968000000000 in the >>>>>>> example program above). >>>>>> >>>>>> [AVM] Ok ,as if this issue exist with all service , I will >>>>>> raise a new ticket and will be fixed/document for all services >>>>>> point of view . >>>>> [AndersW] Instead of documenting this as a limitation, please fix >>>>> MDS by changing the size of the timeout parameter from 32 bits to >>>>> 64 bits. Note that there are perfectly reasonable timeout values >>>>> that are very large, e.g. someone might be using SA_TIME_MAX to >>>>> get a (for all practical purposes) infinite timeout. >>>>>> >>>>>> -AVM >>>>>>>> >>>>>>>> Thanks, >>>>>>>> AV. >>>>>>>> >>>>>>>> >>>>>>>> On 12/18/2015 2:43 PM, Anders Widell wrote: >>>>>>>>> Hi! >>>>>>>>> >>>>>>>>> I have some questions regarding this change: >>>>>>>>> >>>>>>>>> 1) This change is not backwards compatible. Have you >>>>>>>>> considered to either bump the minor version of the CKPT API, >>>>>>>>> or to use ELF symbol versioning, to avoid breaking old >>>>>>>>> applications? >>>>>>>>> >>>>>>>>> 2) Why was the minimum possible time-out increased from 0.1 >>>>>>>>> seconds to 14 seconds? 14 seconds is like an eternity in a >>>>>>>>> real-time system. >>>>>>>>> >>>>>>>>> 3) If the user-specified timeout is less than 14 seconds, the >>>>>>>>> functions will always fail without even trying. Wouldn't it be >>>>>>>>> better to at least try to perform the operation within the >>>>>>>>> time limit specified by the user? The user may have a loop >>>>>>>>> where it keeps retrying the API function call when the >>>>>>>>> SA_AIS_ERR_TIMEOUT error code is returned. I would expect such >>>>>>>>> a loop to eventually terminate, especially if the timeout >>>>>>>>> parameter has a reasonable value (and there are many >>>>>>>>> reasonable timeout values which are less than 14 seconds). >>>>>>>>> >>>>>>>>> 3) Can't the conversion from nanoseconds represented as a >>>>>>>>> signed 64-bit integer to a unit of 10 milliseconds stored in >>>>>>>>> an unsigned 32-bit integer ('uint32_t time_out' in the code >>>>>>>>> below) overflow? I.e. if the timeout parameter has a value >>>>>>>>> greater than or equal to 0x98968000000000? >>>>>>>>> >>>>>>>>> / Anders Widell >>>>>>>>> >>>>>>>>> On 11/06/2015 06:33 AM, mahesh.va...@oracle.com wrote: >>>>>>>>>> osaf/libs/agents/saf/cpa/cpa_api.c | 111 >>>>>>>>>> ++++++++++++++++++++++++++++++------ >>>>>>>>>> 1 files changed, 91 insertions(+), 20 deletions(-) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> If we test upper and lower limits of SaTimeT timeout >>>>>>>>>> arguments to CPSV API , >>>>>>>>>> some of CPA APIs are doesn't returns or hungs. >>>>>>>>>> >>>>>>>>>> Current cpa code doesn't validating or passing SaTimeT >>>>>>>>>> timeout arguments to CPSV API , >>>>>>>>>> which are below default API timeout limit of CPSV_WAIT_TIME >>>>>>>>>> (1400 X 10ms) and >>>>>>>>>> above SaTimeT, type long long int Maximum value >>>>>>>>>> 9223372036854775807 (2^63-1) >>>>>>>>>> >>>>>>>>>> For example : >>>>>>>>>> >>>>>>>>>> i) const SaTimeT timeout = 1215752192; >>>>>>>>>> ii) const SaTimeT timeout = 2 * SA_TIME_ONE_MINUTE; >>>>>>>>>> iii) const SaTimeT timeout = 9223372036854775809; >>>>>>>>>> >>>>>>>>>> saCkptCheckpointOpen(...., timeout, ...); >>>>>>>>>> >>>>>>>>>> Because of that in some of variables are over flowing and , >>>>>>>>>> API calls are not working properly >>>>>>>>>> >>>>>>>>>> Fix : >>>>>>>>>> >>>>>>>>>> 1 - Validated if (time_out < CPSV_WAIT_TIME) return >>>>>>>>>> SA_AIS_ERR_INVALID_PARAM >>>>>>>>>> 2 - Validates if (expirationTime < 0 ) return >>>>>>>>>> SA_AIS_ERR_INVALID_PARAM >>>>>>>>>> 3 - Pass proper timeout values to internal functions of Opensaf >>>>>>>>>> >>>>>>>>>> diff --git a/osaf/libs/agents/saf/cpa/cpa_api.c >>>>>>>>>> b/osaf/libs/agents/saf/cpa/cpa_api.c >>>>>>>>>> --- a/osaf/libs/agents/saf/cpa/cpa_api.c >>>>>>>>>> +++ b/osaf/libs/agents/saf/cpa/cpa_api.c >>>>>>>>>> @@ -27,7 +27,6 @@ >>>>>>>>>> ******************************************************************************/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> #include "cpa.h" >>>>>>>>>> -#define NCS_SAF_MIN_ACCEPT_TIME 10 >>>>>>>>>> /**************************************************************************** >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Name : SaCkptInitialize >>>>>>>>>> @@ -889,6 +888,14 @@ SaAisErrorT saCkptCheckpointOpen(SaCkptH >>>>>>>>>> return SA_AIS_ERR_INVALID_PARAM; >>>>>>>>>> } >>>>>>>>>> + if (timeout < 0 ) { >>>>>>>>>> + TRACE_4("Cpa CkptOpen: failed invalid timeout return >>>>>>>>>> value:%d,ckptHandle:%llx", >>>>>>>>>> + SA_AIS_ERR_INVALID_PARAM, ckptHandle); >>>>>>>>>> + TRACE_LEAVE2("API return code = %u", rc); >>>>>>>>>> + return SA_AIS_ERR_INVALID_PARAM; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + >>>>>>>>>> /* Draft Validations */ >>>>>>>>>> rc = >>>>>>>>>> cpa_open_attr_validate(checkpointCreationAttributes, >>>>>>>>>> checkpointOpenFlags, checkpointName); >>>>>>>>>> if (rc != SA_AIS_OK) { >>>>>>>>>> @@ -975,13 +982,13 @@ SaAisErrorT saCkptCheckpointOpen(SaCkptH >>>>>>>>>> } >>>>>>>>>> evt.info.cpnd.info.openReq.ckpt_flags = >>>>>>>>>> checkpointOpenFlags; >>>>>>>>>> - >>>>>>>>>> /* convert the timeout to 10 ms value and add it to the >>>>>>>>>> sync send timeout */ >>>>>>>>>> time_out = m_CPSV_CONVERT_SATIME_TEN_MILLI_SEC(timeout); >>>>>>>>>> - if (time_out < NCS_SAF_MIN_ACCEPT_TIME) { >>>>>>>>>> + if (time_out < CPSV_WAIT_TIME) { >>>>>>>>>> rc = SA_AIS_ERR_TIMEOUT; >>>>>>>>>> - TRACE_4("Cpa CkptOpen failed Api failed with return >>>>>>>>>> value:%d,ckptHandle:%llx,timeout:%llu", rc, ckptHandle, >>>>>>>>>> timeout); >>>>>>>>>> + TRACE_4("Cpa CkptOpen failed Api failed with return >>>>>>>>>> value:%d,ckptHandle:%llx,timeout:%llu", >>>>>>>>>> + rc, ckptHandle, timeout); >>>>>>>>>> goto mds_send_fail; >>>>>>>>>> } >>>>>>>>>> @@ -998,9 +1005,8 @@ SaAisErrorT saCkptCheckpointOpen(SaCkptH >>>>>>>>>> TRACE_4("cpa CkptOpen:CPND_DOWN Api failed with >>>>>>>>>> return value:%d,ckptHandle:%llx", rc, ckptHandle); >>>>>>>>>> goto mds_send_fail; >>>>>>>>>> } >>>>>>>>>> - >>>>>>>>>> /* Send the evt to CPND */ >>>>>>>>>> - proc_rc = cpa_mds_msg_sync_send(cb->cpa_mds_hdl, >>>>>>>>>> &(cb->cpnd_mds_dest), &evt, &out_evt, timeout); >>>>>>>>>> + proc_rc = cpa_mds_msg_sync_send(cb->cpa_mds_hdl, >>>>>>>>>> &(cb->cpnd_mds_dest), &evt, &out_evt, time_out); >>>>>>>>>> /* Generate rc from proc_rc */ >>>>>>>>>> switch (proc_rc) { >>>>>>>>>> @@ -1709,6 +1715,13 @@ SaAisErrorT saCkptCheckpointRetentionDur >>>>>>>>>> goto done; >>>>>>>>>> } >>>>>>>>>> + if (retentionDuration < 0 ) { >>>>>>>>>> + TRACE_4("Cpa RetentionDurationSet: failed >>>>>>>>>> invalid retentionDuration return value:%d,ckptHandle:%llx", >>>>>>>>>> + SA_AIS_ERR_INVALID_PARAM, checkpointHandle); >>>>>>>>>> + rc = SA_AIS_ERR_INVALID_PARAM; >>>>>>>>>> + goto done; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> /* get the CB Lock */ >>>>>>>>>> if (m_NCS_LOCK(&cb->cb_lock, NCS_LOCK_WRITE) != >>>>>>>>>> NCSCC_RC_SUCCESS) { >>>>>>>>>> rc = SA_AIS_ERR_LIBRARY; >>>>>>>>>> @@ -2183,6 +2196,7 @@ SaAisErrorT saCkptSectionCreate(SaCkptCh >>>>>>>>>> bool gen_sec_flag = false; >>>>>>>>>> SaCkptSectionIdT app_ptr; >>>>>>>>>> CPA_CLIENT_NODE *cl_node = NULL; >>>>>>>>>> + uint32_t time_out; >>>>>>>>>> TRACE_ENTER2("SaCkptCheckpointHandleT passed is >>>>>>>>>> %llx",checkpointHandle); >>>>>>>>>> /* Validate the Input Parameters */ >>>>>>>>>> @@ -2312,8 +2326,21 @@ SaAisErrorT saCkptSectionCreate(SaCkptCh >>>>>>>>>> evt.info.cpnd.info.sec_creatReq.init_size = >>>>>>>>>> initialDataSize; >>>>>>>>>> m_NCS_UNLOCK(&cb->cb_lock, NCS_LOCK_WRITE); >>>>>>>>>> + >>>>>>>>>> + time_out = CPA_WAIT_TIME(initialDataSize); >>>>>>>>>> + >>>>>>>>>> + if (time_out < 0 ) { >>>>>>>>>> + rc = SA_AIS_ERR_TIMEOUT; >>>>>>>>>> + TRACE_4("cpa SectCreate Api failed because of larg >>>>>>>>>> initialDataSize return :%d,ckptHandle:%llx", >>>>>>>>>> + rc, checkpointHandle); >>>>>>>>>> + goto fail1; >>>>>>>>>> + } >>>>>>>>>> + else if (time_out < CPSV_WAIT_TIME) { >>>>>>>>>> + time_out = CPSV_WAIT_TIME; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> proc_rc = cpa_mds_msg_sync_send(cb->cpa_mds_hdl, >>>>>>>>>> &(gc_node->active_mds_dest), >>>>>>>>>> - &evt, &out_evt, >>>>>>>>>> CPA_WAIT_TIME(initialDataSize)); >>>>>>>>>> + &evt, &out_evt, time_out); >>>>>>>>>> /* Generate rc from proc_rc */ >>>>>>>>>> switch (proc_rc) { >>>>>>>>>> @@ -2684,6 +2711,14 @@ SaAisErrorT saCkptSectionExpirationTimeS >>>>>>>>>> return rc; >>>>>>>>>> } >>>>>>>>>> + if (expirationTime < 0 ) { >>>>>>>>>> + TRACE_4("Cpa ExpirationTimeSet: failed >>>>>>>>>> invalid expirationTime return value:%d,ckptHandle:%llx", >>>>>>>>>> + SA_AIS_ERR_INVALID_PARAM, checkpointHandle); >>>>>>>>>> + rc = SA_AIS_ERR_INVALID_PARAM; >>>>>>>>>> + TRACE_LEAVE2("API return code = %u", rc); >>>>>>>>>> + return rc; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> /* retrieve CPA CB */ >>>>>>>>>> m_CPA_RETRIEVE_CB(cb); >>>>>>>>>> if (!cb) { >>>>>>>>>> @@ -2847,6 +2882,14 @@ SaAisErrorT saCkptSectionIterationInitia >>>>>>>>>> return rc; >>>>>>>>>> } >>>>>>>>>> + if (expirationTime < 0 ) { >>>>>>>>>> + TRACE_4("Cpa SectionIterationInitialize: failed >>>>>>>>>> invalid expirationTime return value:%d,ckptHandle:%llx", >>>>>>>>>> + SA_AIS_ERR_INVALID_PARAM, checkpointHandle); >>>>>>>>>> + rc = SA_AIS_ERR_INVALID_PARAM; >>>>>>>>>> + TRACE_LEAVE2("API return code = %u", rc); >>>>>>>>>> + return rc; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> /* retrieve CPA CB */ >>>>>>>>>> m_CPA_RETRIEVE_CB(cb); >>>>>>>>>> if (!cb) { >>>>>>>>>> @@ -3360,6 +3403,7 @@ SaAisErrorT saCkptCheckpointWrite(SaCkpt >>>>>>>>>> CPA_CLIENT_NODE *cl_node = NULL; >>>>>>>>>> SaSizeT all_ioVector_size = 0; >>>>>>>>>> uint32_t err_flag = 0; >>>>>>>>>> + uint32_t time_out; >>>>>>>>>> >>>>>>>>>> TRACE_ENTER2("SaCkptCheckpointHandleT passed is >>>>>>>>>> %llx",checkpointHandle); >>>>>>>>>> @@ -3479,18 +3523,27 @@ SaAisErrorT >>>>>>>>>> saCkptCheckpointWrite(SaCkpt >>>>>>>>>> /* Unlock cpa_lock before calling mds api */ >>>>>>>>>> m_NCS_UNLOCK(&cb->cb_lock, NCS_LOCK_WRITE); >>>>>>>>>> -#if 1 >>>>>>>>>> - for (iter = 0; iter < numberOfElements; iter++) >>>>>>>>>> + >>>>>>>>>> + for (iter = 0; iter < numberOfElements; iter++) { >>>>>>>>>> all_ioVector_size += ioVector[iter].dataSize; >>>>>>>>>> - >>>>>>>>>> - proc_rc = cpa_mds_msg_sync_send(cb->cpa_mds_hdl, >>>>>>>>>> &(gc_node->active_mds_dest), >>>>>>>>>> - &evt, &out_evt, >>>>>>>>>> CPA_WAIT_TIME(all_ioVector_size)); >>>>>>>>>> -#else >>>>>>>>>> - proc_rc = cpa_mds_msg_sync_send(cb->cpa_mds_hdl, >>>>>>>>>> &(gc_node->active_mds_dest), &evt, &out_evt, CPSV_WAIT_TIME); >>>>>>>>>> -#endif >>>>>>>>>> - /* Generate rc from proc_rc */ >>>>>>>>>> - switch (proc_rc) >>>>>>>>>> - { >>>>>>>>>> + time_out = CPA_WAIT_TIME(all_ioVector_size); >>>>>>>>>> + >>>>>>>>>> + if (time_out < 0 ) { >>>>>>>>>> + rc = SA_AIS_ERR_TIMEOUT; >>>>>>>>>> + TRACE_4("cpa ckpointWrite Api failed because of >>>>>>>>>> larg ioVector size return :%d,ckptHandle:%llx", >>>>>>>>>> + rc, checkpointHandle); >>>>>>>>>> + goto done; >>>>>>>>>> + } >>>>>>>>>> + else if (time_out < CPSV_WAIT_TIME) { >>>>>>>>>> + time_out = CPSV_WAIT_TIME; >>>>>>>>>> + } >>>>>>>>>> + proc_rc = cpa_mds_msg_sync_send(cb->cpa_mds_hdl, >>>>>>>>>> &(gc_node->active_mds_dest), >>>>>>>>>> + &evt, &out_evt, time_out); >>>>>>>>>> + >>>>>>>>>> + } >>>>>>>>>> + /* Generate rc from proc_rc */ >>>>>>>>>> + switch (proc_rc) >>>>>>>>>> + { >>>>>>>>>> case NCSCC_RC_SUCCESS: >>>>>>>>>> break; >>>>>>>>>> case NCSCC_RC_REQ_TIMOUT: >>>>>>>>>> @@ -3577,6 +3630,7 @@ SaAisErrorT saCkptSectionOverwrite(SaCkp >>>>>>>>>> CPA_GLOBAL_CKPT_NODE *gc_node = NULL; >>>>>>>>>> bool add_flag = false; >>>>>>>>>> CPA_CLIENT_NODE *cl_node = NULL; >>>>>>>>>> + uint32_t time_out; >>>>>>>>>> TRACE_ENTER2("SaCkptCheckpointHandleT passed is >>>>>>>>>> %llx",checkpointHandle); >>>>>>>>>> memset(&ckpt_data, '\0', sizeof(CPSV_CKPT_DATA)); >>>>>>>>>> @@ -3684,8 +3738,19 @@ SaAisErrorT saCkptSectionOverwrite(SaCkp >>>>>>>>>> /* Unlock cpa_lock before calling mds api */ >>>>>>>>>> m_NCS_UNLOCK(&cb->cb_lock, NCS_LOCK_WRITE); >>>>>>>>>> + time_out = CPA_WAIT_TIME(dataSize); >>>>>>>>>> + >>>>>>>>>> + if (time_out < 0 ) { >>>>>>>>>> + rc = SA_AIS_ERR_TIMEOUT; >>>>>>>>>> + TRACE_4("cpa SectionOverwrite Api failed because of >>>>>>>>>> larg dataSize return :%d,ckptHandle:%llx", rc, >>>>>>>>>> checkpointHandle); >>>>>>>>>> + goto done; >>>>>>>>>> + } >>>>>>>>>> + else if (time_out < CPSV_WAIT_TIME) { >>>>>>>>>> + time_out = CPSV_WAIT_TIME; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> proc_rc = cpa_mds_msg_sync_send(cb->cpa_mds_hdl, >>>>>>>>>> &(gc_node->active_mds_dest), >>>>>>>>>> - &evt, &out_evt, CPA_WAIT_TIME(dataSize)); >>>>>>>>>> + &evt, &out_evt, time_out); >>>>>>>>>> /* Generate rc from proc_rc */ >>>>>>>>>> switch (proc_rc) { >>>>>>>>>> @@ -4053,6 +4118,12 @@ SaAisErrorT saCkptCheckpointSynchronize( >>>>>>>>>> /* For Ckpt Sync CPA will not perform any operation >>>>>>>>>> other than passing the request to CPND */ >>>>>>>>>> + if (timeout < 0 ) { >>>>>>>>>> + TRACE_4("Cpa CheckpointSynchronize: failed with >>>>>>>>>> invalied timeout return value:%d,ckptHandle:%llx", >>>>>>>>>> SA_AIS_ERR_INVALID_PARAM, checkpointHandle); >>>>>>>>>> + rc =SA_AIS_ERR_INVALID_PARAM; >>>>>>>>>> + goto done; >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> /* retrieve CPA CB */ >>>>>>>>>> m_CPA_RETRIEVE_CB(cb); >>>>>>>>>> if (!cb) { >>>>>>>>>> @@ -4139,7 +4210,7 @@ SaAisErrorT saCkptCheckpointSynchronize( >>>>>>>>>> /* Convert the time from saTimeT to millisecs */ >>>>>>>>>> timeout = m_CPSV_CONVERT_SATIME_TEN_MILLI_SEC(timeout); >>>>>>>>>> - if (timeout < NCS_SAF_MIN_ACCEPT_TIME) { >>>>>>>>>> + if (timeout < CPSV_WAIT_TIME) { >>>>>>>>>> rc = SA_AIS_ERR_TIMEOUT; >>>>>>>>>> TRACE_4("cpa CkptSynchronize Api failed with return >>>>>>>>>> value:%d,ckptHandle:%llx ", rc, checkpointHandle); >>>>>>>>>> goto fail1; >>>>>>>>>> >>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> Opensaf-devel mailing list >>>>>>>>>> Opensaf-devel@lists.sourceforge.net >>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > ------------------------------------------------------------------------------ _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel