Dear Mahesh, Thank you very much for your review and comments. I understood that I missed the point at n_secs calculation.
Because checking cpnd_sec_hdr_update() result by calling it again is quite weird. Is that acceptable if I move n_secs-- part to cpnd_ckpt_sec_del_db() function? something like this: ============================================================================ ======== .......... if (localSecMap) { if (sectionInfo) { LocalSectionIdMap::iterator it(localSecMap->find(sectionInfo->lcl_sec_id)); if (it != localSecMap->end()) localSecMap->erase(it); } } else { LOG_ER("can't find local sec map in cpnd_ckpt_sec_del"); osafassert(false); } // Move following: if (sectionInfo) { cp_node->replica_info.n_secs--; cp_node->replica_info.mem_used = cp_node->replica_info.mem_used - (sectionInfo->sec_size); } ........... ============================================================================ ======= The ideal is that cpnd_ckpt_sec_del() keep original the cpnd_ckpt_sec_del_db() is revert of action before cpnd_sec_hdr_update() cpnd_ckpt_sec_del_db() can be called after cpnd_ckpt_sec_del() many time without affecting system. Please consider and tell me your judgement. Best regards, Hoang -----Original Message----- From: Vo Minh Hoang [mailto:hoang.m...@dektech.com.au] Sent: Friday, November 25, 2016 5:14 PM To: 'A V Mahesh' <mahesh.va...@oracle.com> Cc: opensaf-devel@lists.sourceforge.net Subject: Re: [devel] [PATCH 1 of 1] cpnd: fix error handling while section_hdr_update_fail [#2207] Dear Mahesh, Thank you very much for your review and comments. I understood that I missed the point at n_secs calculation. Because checking cpnd_sec_hdr_update() -----Original Message----- From: A V Mahesh [mailto:mahesh.va...@oracle.com] Sent: Friday, November 25, 2016 3:28 PM To: Vo Minh Hoang <hoang.m...@dektech.com.au> Cc: anders.wid...@ericsson.com; opensaf-devel@lists.sourceforge.net Subject: Re: [PATCH 1 of 1] cpnd: fix error handling while section_hdr_update_fail [#2207] Hi Hoang, >>I am sorry that I cannot find the ER that you point out. Ok I understand that `map & localSecMap` will not become become null even all elements are removed in fist attempt of cpnd_ckpt_sec_del_db() invocation, so we will not core dump. But section DB still corrupted, because of cpnd_ckpt_sec_de() function split ( this patch ) Irrelevant of cpnd_sec_hdr_update() success or failure, the `cp_node->replica_info.n_secs++;` was occurring before cpnd_sec_hdr_update(), so we need to do `cp_node->replica_info.n_secs--;` If cpnd_sec_hdr_update() successful and cpnd_ckpt_hdr_update() fails, we also need to do roll-backing SECTION HEADER with roll backed sectionInfo , so we need to UPDATE THE SECTION HEADER. If you see original cpnd_ckpt_sec_del() function this is required irrelevant of cpnd_sec_hdr_update() success or failure ============================================================================ ======================= if (sectionInfo) { cp_node->replica_info.n_secs--; cp_node->replica_info.mem_used = cp_node->replica_info.mem_used - (sectionInfo->sec_size); // UPDATE THE SECTION HEADER uint32_t rc(cpnd_sec_hdr_update(sectionInfo, cp_node)); if (rc == NCSCC_RC_FAILURE) { TRACE_4("cpnd sect hdr update failed"); } ============================================================================ ======================= So splitting the function is not required you need to only ensure cpnd_ckpt_hdr_update() done or not yet done put some filter and do UPDATE THE CHECKPOINT HEADER based on cpnd_ckpt_hdr_update() ============================================================================ ======================= if ( cpnd_ckpt_hdr_update() == was successful ) { // UPDATE THE CHECKPOINT HEADER rc = cpnd_ckpt_hdr_update(cp_node); if (rc == NCSCC_RC_FAILURE) { TRACE_4("cpnd ckpt hdr update failed"); } } ============================================================================ ======================= -AVM On 11/25/2016 12:08 PM, Vo Minh Hoang wrote: > Dear Mahesh, > > The first call erase a element with id from map. > The second call with same id, no element found so iterator will point > to > map->end(). > Because of that map->erase() does not execute. > > I am sorry that I cannot find the ER that you point out. > > Sincerely, > Hoang > > -----Original Message----- > From: A V Mahesh [mailto:mahesh.va...@oracle.com] > Sent: Friday, November 25, 2016 1:20 PM > To: Vo Minh Hoang <hoang.m...@dektech.com.au> > Cc: anders.wid...@ericsson.com; opensaf-devel@lists.sourceforge.net > Subject: Re: [PATCH 1 of 1] cpnd: fix error handling while > section_hdr_update_fail [#2207] > > Hi Hoang, > > The second invocation of cpnd_ckpt_sec_del_db() with generate code dump. > > ============================================================= > > SectionMap *map(static_cast<SectionMap > *>(cp_node->replica_info.section_db)); > > if (map) { > SectionMap::iterator it(map->find(id)); > > if (it != map->end()) { > sectionInfo = it->second; > map->erase(it); > } > } > else { > LOG_ER("can't find map in cpnd_ckpt_sec_del"); > *osafassert(false);* > } > ============================================================= > > > -AVM > > On 11/25/2016 11:16 AM, Vo Minh Hoang wrote: >> Dear Mahesh, >> >> Thank you very much for your comment. >> >> Would you please verify again my understanding about this. >> >> Old cpnd_ckpt_sec_del() function and new cpnd_ckpt_sec_del_db() >> search for the sectionInfo in 2 maps by its id and remove if found. >> In case ckpt_hdr_update_fails, the cpnd_ckpt_sec_del_db() is called >> twice, the first one remove sectionInfo from maps, the second one >> does not found sectionInfo and cannot remove anything, will not >> generate error. >> >> Sincerely, >> Hoang >> >> -----Original Message----- >> From: A V Mahesh [mailto:mahesh.va...@oracle.com] >> Sent: Friday, November 25, 2016 12:12 PM >> To: Hoang Vo <hoang.m...@dektech.com.au>; anders.wid...@ericsson.com >> Cc: opensaf-devel@lists.sourceforge.net >> Subject: Re: [PATCH 1 of 1] cpnd: fix error handling while >> section_hdr_update_fail [#2207] >> >> Hi Hoang, >> >> NACK >> >> This patch has some fundamental problem, this will corrupt the cpsv >> database . >> This patch instead of resolving the issue , increase the percentage >> of problem occurrence . >> >> Basically this patch splits cpnd_ckpt_sec_del() in to two parts >> cpnd_ckpt_sec_del_db() & cpnd_ckpt_sec_del() and the new >> cpnd_ckpt_sec_del() invokes cpnd_ckpt_sec_del_db() to fullfil the >> old function behavior , so that to support other old invocations and >> changed the error cleanup in >> cpnd_ckpt_sec_add() as follows : >> >> ckpt_hdr_update_fails: >> cpnd_ckpt_sec_del(cp_node, id); >> >> section_hdr_update_fails: >> cpnd_ckpt_sec_del_db(cp_node, id); >> >> This means cpnd_ckpt_sec_del_db() is always called twice in case of >> ckpt_hdr_update_fails , which will generate core >> >> -AVM >> >> >> On 11/24/2016 3:26 PM, Hoang Vo wrote: >>> osaf/libs/common/cpsv/include/cpnd_sec.h | 3 +++ >>> osaf/services/saf/cpsv/cpnd/cpnd_db.c | 4 +++- >>> osaf/services/saf/cpsv/cpnd/cpnd_sec.cc | 31 >> +++++++++++++++++++++++++++---- >>> 3 files changed, 33 insertions(+), 5 deletions(-) >>> >>> >>> problem: >>> the steps to add a section is add_db_tree -> update_sec_hdr -> >>> update_ckpt_hdr so if an error occur cpsv should handle error in >>> reverse >> order. >>> currently, section_hdr_update_fails, cpsv revert ckpt_hdr also that >>> case error >>> >>> solution: >>> only revert db_tree in case section_hdr_update_fails >>> >>> diff --git a/osaf/libs/common/cpsv/include/cpnd_sec.h >>> b/osaf/libs/common/cpsv/include/cpnd_sec.h >>> --- a/osaf/libs/common/cpsv/include/cpnd_sec.h >>> +++ b/osaf/libs/common/cpsv/include/cpnd_sec.h >>> @@ -39,6 +39,9 @@ CPND_CKPT_SECTION_INFO * >>> cpnd_ckpt_sec_get_create(const CPND_CKPT_NODE *, const >>> SaCkptSectionIdT *); >>> >>> CPND_CKPT_SECTION_INFO * >>> +cpnd_ckpt_sec_del_db(CPND_CKPT_NODE *, SaCkptSectionIdT *); >>> + >>> +CPND_CKPT_SECTION_INFO * >>> cpnd_ckpt_sec_del(CPND_CKPT_NODE *, SaCkptSectionIdT *); >>> >>> CPND_CKPT_SECTION_INFO * >>> diff --git a/osaf/services/saf/cpsv/cpnd/cpnd_db.c >>> b/osaf/services/saf/cpsv/cpnd/cpnd_db.c >>> --- a/osaf/services/saf/cpsv/cpnd/cpnd_db.c >>> +++ b/osaf/services/saf/cpsv/cpnd/cpnd_db.c >>> @@ -468,10 +468,12 @@ CPND_CKPT_SECTION_INFO *cpnd_ckpt_sec_ad >>> TRACE_LEAVE(); >>> return pSecPtr; >>> >>> - section_hdr_update_fails: >>> ckpt_hdr_update_fails: >>> cpnd_ckpt_sec_del(cp_node, id); >>> >>> + section_hdr_update_fails: >>> + cpnd_ckpt_sec_del_db(cp_node, id); >>> + >>> section_add_fails: >>> if (pSecPtr->sec_id.id != NULL) >>> m_MMGR_FREE_CPND_DEFAULT(pSecPtr->sec_id.id); >>> diff --git a/osaf/services/saf/cpsv/cpnd/cpnd_sec.cc >>> b/osaf/services/saf/cpsv/cpnd/cpnd_sec.cc >>> --- a/osaf/services/saf/cpsv/cpnd/cpnd_sec.cc >>> +++ b/osaf/services/saf/cpsv/cpnd/cpnd_sec.cc >>> @@ -157,19 +157,19 @@ cpnd_ckpt_sec_find(const CPND_CKPT_NODE >>> } >>> >>> >> /******************************************************************** >> * >> ****** >> * >>> - * Name : cpnd_ckpt_sec_del >>> + * Name : cpnd_ckpt_sec_del_db >>> * >>> - * Description : Function to remove the section from a checkpoint. >>> + * Description : Function to remove the section from a checkpoint map >> db. >>> * >>> * Arguments : CPND_CKPT_NODE *cp_node - Check point node. >>> * : SaCkptSectionIdT id - Section Identifier >>> - * >>> + * >>> * Return Values : ptr to CPND_CKPT_SECTION_INFO/NULL; >>> * >>> * Notes : None. >>> >> ********************************************************************* >> * >> ****** >> */ >>> CPND_CKPT_SECTION_INFO * >>> -cpnd_ckpt_sec_del(CPND_CKPT_NODE *cp_node, SaCkptSectionIdT *id) >>> +cpnd_ckpt_sec_del_db(CPND_CKPT_NODE *cp_node, SaCkptSectionIdT *id) >>> { >>> CPND_CKPT_SECTION_INFO *sectionInfo(0); >>> >>> @@ -206,6 +206,29 @@ cpnd_ckpt_sec_del(CPND_CKPT_NODE *cp_nod >>> osafassert(false); >>> } >>> >>> + TRACE_LEAVE(); >>> + >>> + return sectionInfo; >>> +} >>> + >>> >> +/******************************************************************* >> +* >> +****** >> ** >>> + * Name : cpnd_ckpt_sec_del >>> + * >>> + * Description : Function to remove the section from a checkpoint. >>> + * >>> + * Arguments : CPND_CKPT_NODE *cp_node - Check point node. >>> + * : SaCkptSectionIdT id - Section Identifier >>> + * >>> + * Return Values : ptr to CPND_CKPT_SECTION_INFO/NULL; >>> + * >>> + * Notes : None. >>> + >>> +******************************************************************* >>> +* >>> +* >>> +********/ >>> +CPND_CKPT_SECTION_INFO * >>> +cpnd_ckpt_sec_del(CPND_CKPT_NODE *cp_node, SaCkptSectionIdT *id) { >>> + TRACE_ENTER(); >>> + CPND_CKPT_SECTION_INFO *sectionInfo = >>> +cpnd_ckpt_sec_del_db(cp_node, id); >>> + >>> if (sectionInfo) { >>> cp_node->replica_info.n_secs--; >>> cp_node->replica_info.mem_used = >>> cp_node->replica_info.mem_used >>> - (sectionInfo->sec_size); > ---------------------------------------------------------------------------- -- _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel ------------------------------------------------------------------------------ _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel