I updated solution and sent out V2.

-----Original Message-----
From: Minh Hon Chau <[email protected]>
Sent: Tuesday, April 21, 2020 2:23 PM
To: Thuan Tran <[email protected]>; Thang Duc Nguyen <[email protected]>
Cc: [email protected]
Subject: Re: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get operation invoke name [#3178]

Agree.

On 21/4/20 12:24 pm, Thuan Tran wrote:
> Hi,
>
> If there is no way to get admin owner or object implementer in middle of one
> CCB many operations.
> Then a "unknown" invoker is better than keep restarting by each operation of
> that CCB.
>
> Best Regards,
> ThuanTr
>
> -----Original Message-----
> From: Thang Duc Nguyen <[email protected]>
> Sent: Tuesday, April 21, 2020 8:39 AM
> To: Thang Duc Nguyen <[email protected]>; Minh Hon Chau
> <[email protected]>; Thuan Tran <[email protected]>
> Cc: [email protected]
> Subject: RE: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
> operation invoke name [#3178]
>
> Update.
>
> If we accept to avoid coredump, there is @operation_invoke_name that needs to
> be freed before exit?
> [Thang]: as above can fill invoke_name as unknown in this case to avoid the
> coredump.
> And free in applyccbcb.
>
> -----Original Message-----
> From: Thang Duc Nguyen <[email protected]>
> Sent: Tuesday, April 21, 2020 8:29 AM
> To: Minh Hon Chau <[email protected]>; Thuan Tran
> <[email protected]>
> Cc: [email protected]
> Subject: Re: [devel] [PATCH 1/1] ntf: restart ntfimcnd if it fails to
> get operation invoke name [#3178]
>
> Hi Minh,
> See my comment inline.
>
> -----Original Message-----
> From: Minh Hon Chau <[email protected]>
> Sent: Monday, April 20, 2020 5:24 PM
> To: Thang Duc Nguyen <[email protected]>; Thuan Tran
> <[email protected]>
> Cc: [email protected]
> Subject: Re: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
> operation invoke name [#3178]
>
> Hi Thang,
>
> I understand the invoke_name is only present in the first callback, thus
> ntfimcn must memorize it in the userdata. My question is, is it ok that this
> userdata being lost because ntfimcn restart? I think it is, since the ccb has
> not committed.
> [Thang]: can accept it and fill invoke_name as unknown instead of do nothing.
>
> If we accept the userdata being lost, then we can look at to avoid the
> coredump, otherwise Thuan can give an idea if it is imm issue that causes the
> lost userdata.
>
> If we accept to avoid coredump, there is @operation_invoke_name that needs to
> be freed before exit?
> [Thang]: as above can fill invoke_name as unknown in this case to avoid the
> coredump.
>
>
> thanks
>
> Minh
>
> On 20/4/20 6:30 pm, Thang Duc Nguyen wrote:
>> Hi Minh,
>>
>> See my comment inline.
>>
>> -----Original Message-----
>> From: Minh Hon Chau <[email protected]>
>> Sent: Monday, April 20, 2020 11:51 AM
>> To: Thuan Tran <[email protected]>; Thang Duc Nguyen
>> <[email protected]>
>> Cc: [email protected]
>> Subject: Re: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
>> operation invoke name [#3178]
>>
>> Hi,
>>
>> One similarity to #2859 is that the invoke_name is only present in the first
>> callback, so ntfimcn must memorize it in ccb userdata.
>>
>> But after ntfimcn calls ccbutil_ccbAddModifyOperation, this userdata is not
>> written to immnd and sync across the other immnd(s)?
>> Meaning the userdata is only stored in imm agent? So after switchover, the
>> next ccb callback does not have the invoke_name, and ntfimcn has lost its
>> user data since restart.
>>
>> [Thang]: with a ccb with multi ops. The invoke_name, in this case only the
>> first op contain the adminOwnername. And after ntfimcnd restarts, it
>> received the second or larger op modify. And this modify callback does not
>> contain any more about this invoke_name.
>> Maybe we can retrieve the invoke_name from imm db but we cannot get all
>> info about all ops in that ccb.
>>
>> Thanks
>>
>> Minh
>>
>> On 16/4/20 3:32 pm, Thuan Tran wrote:
>>> Hi,
>>>
>>> I think this is just enhancement, not an urgent fix.
>>> Then we should make it better if possible.
>>>
>>> About #2859, I am not reviewer at that time.
>>> But I would not agree that solution as we can see service keep
>>> restart if service still start in middle of one CCB many operations.
>>>
>>> Best Regards,
>>> ThuanTr
>>>
>>> -----Original Message-----
>>> From: Thang Duc Nguyen <[email protected]>
>>> Sent: Thursday, April 16, 2020 10:51 AM
>>> To: Thuan Tran <[email protected]>; Minh Hon Chau
>>> <[email protected]>
>>> Cc: [email protected]
>>> Subject: RE: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
>>> operation invoke name [#3178]
>>>
>>> Hi Thuan,
>>>
>>> Thanks for your comment.
>>> First this issue happen only in specific situation. And I think restart it
>>> is no cause big issue.
>>> And the ccb is internal data managed by ntf/ntfimcnd. After
>>> ntfimcnd restart, it reinitialize CcbUtilCcbData and operation invoke name
>>> is empty.
>>>
>>> Moreover, in current code in ntfimcn_imm.c, there are many place use
>>> imcn_exit(EXIT_FAILURE) when detect the error. Example for this is #2859.
>>> We consider to open a new ticket to consider your suggestion by
>>> refactor/change current behavior of ntfimcnd.
>>>
>>> B.R/Thang
>>>
>>> -----Original Message-----
>>> From: Thuan Tran <[email protected]>
>>> Sent: Thursday, April 16, 2020 10:16 AM
>>> To: Thang Duc Nguyen <[email protected]>; Minh Hon Chau
>>> <[email protected]>
>>> Cc: [email protected]
>>> Subject: RE: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
>>> operation invoke name [#3178]
>>>
>>> Hi Thang,
>>>
>>> From reproduce method, with solution after exit (instead of crash), user
>>> continue input another operation then service exit again.
>>> The point is why we cannot get admin owner or object implementer via 2nd
>>> imm modify callback in this scenario?
>>> Is it an IMM limit that don't include admin owner or object implementer
>>> from 2nd modify callback?
>>>
>>> If limit, can we use another way to get admin owner or object implementer
>>> base on object name?
>>> By this, we can avoid continuous exit if user keep going on operations by
>>> same CCB.
>>>
>>> Best Regards,
>>> ThuanTr
>>>
>>> -----Original Message-----
>>> From: Thang Duc Nguyen <[email protected]>
>>> Sent: Wednesday, April 15, 2020 3:43 PM
>>> To: Minh Hon Chau <[email protected]>; Thuan Tran
>>> <[email protected]>
>>> Cc: [email protected]; Thang Duc Nguyen
>>> <[email protected]>
>>> Subject: [PATCH 1/1] ntf: restart ntfimcnd if it fails to get
>>> operation invoke name [#3178]
>>>
>>> If ntfimcnd is restarted during ccb modify, it will initialize
>>> ccbUtilCcbData that not contain operation invoke name.
>>> This causes ntfimcnd crashed due to operation invoke name not existed.
>>>
>>> The fix is to restart ntfimcnd instead of raising the coredump.
>>> ---
>>>  src/ntf/ntfimcnd/ntfimcn_imm.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/ntf/ntfimcnd/ntfimcn_imm.c b/src/ntf/ntfimcnd/ntfimcn_imm.c
>>> index 3c0a8c02a..3563a2264 100644
>>> --- a/src/ntf/ntfimcnd/ntfimcn_imm.c
>>> +++ b/src/ntf/ntfimcnd/ntfimcn_imm.c
>>> @@ -376,9 +376,9 @@ get_operation_invoke_name_modify(SaImmOiCcbIdT ccbId,
>>>  			goto done;
>>>  		}
>>>  	}
>>> -	/* If we get here no name is found! */
>>> +	/* ntfimcnd was restarted during ccb midify */
>>>  	LOG_ER("%s no name was found", __FUNCTION__);
>>> -	osafassert(0);
>>> +	imcn_exit(EXIT_FAILURE);
>>>
>>>  done:
>>>  	TRACE_LEAVE();
>>> --
>>> 2.17.1
>>>
> _______________________________________________
> Opensaf-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
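
For reference, the alternative the thread converges on (fill invoke_name with "unknown" when the memorized name is missing after a restart, and free it in the apply callback) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the V2 patch: struct ccb_user_data, dup_string, get_invoke_name_or_unknown and release_ccb_user_data are hypothetical names, and no OpenSAF or IMM APIs are called.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define UNKNOWN_INVOKER "unknown"

/* Hypothetical stand-in for the per-CCB user data kept by ntfimcnd. */
struct ccb_user_data {
	char *invoke_name; /* heap-allocated; NULL if lost after a restart */
};

/* Portable replacement for strdup(); returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Return the memorized invoke name. If none was memorized (for example
 * because the process restarted in the middle of a multi-operation CCB),
 * store a heap copy of "unknown" instead of asserting or exiting.
 */
static const char *get_invoke_name_or_unknown(struct ccb_user_data *ud)
{
	assert(ud != NULL);
	if (ud->invoke_name == NULL) {
		ud->invoke_name = dup_string(UNKNOWN_INVOKER);
		if (ud->invoke_name == NULL)
			return UNKNOWN_INVOKER; /* out of memory: use the literal */
	}
	return ud->invoke_name;
}

/* Hypothetical cleanup, meant to be called from the CCB apply/abort path. */
static void release_ccb_user_data(struct ccb_user_data *ud)
{
	assert(ud != NULL);
	free(ud->invoke_name); /* free(NULL) is a no-op */
	ud->invoke_name = NULL;
}
```

With this shape, a modify callback that arrives after a restart would report "unknown" as the invoker for the remaining operations of that CCB rather than exiting on each one, and the fallback string is released together with the rest of the CCB data.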
