Ack for the original patch. Thanks, Praveen
On 31-Mar-16 3:08 PM, Quyen Dao wrote: > Hi Praveen, > > Please see my comments inline. > > Thanks, > Quyen > > -----Original Message----- > From: praveen malviya [mailto:praveen.malv...@oracle.com] > Sent: Thursday, March 31, 2016 3:32 PM > To: Quyen Dao; 'Hans Nordebäck'; gary....@dektech.com.au; > nagendr...@oracle.com > Cc: opensaf-devel@lists.sourceforge.net > Subject: Re: [PATCH 1 of 1] amfd: do not checkpoint the nway sg fsm in case > of no change [#1697] > > Hi Quyen, > avd_sg_nway_si_assign() is called even in the cases when SG is not stable > for example in > > 1) node_fail_su_oper() { > ... > ... > /* if sg hasnt transitioned to sg-realign, initiate new si assignments > */ > if (AVD_SG_FSM_SG_REALIGN != sg->sg_fsm_state) > avd_sg_nway_si_assign(avd_cb, sg); > } > 2) In avd_sg_nway_node_fail_sg_realign () also. > In such cases check-pointing is needed. > > So in order to fix reported problem in #1697, check-pointing can be done > conditionally if old_fsm state before marking SG stable is other than > Stable. So patch can be modified to: > > diff --git a/osaf/services/saf/amf/amfd/sg_nway_fsm.cc > b/osaf/services/saf/amf/amfd/sg_nway_fsm.cc > --- a/osaf/services/saf/amf/amfd/sg_nway_fsm.cc > +++ b/osaf/services/saf/amf/amfd/sg_nway_fsm.cc > @@ -1230,11 +1230,15 @@ uint32_t avd_sg_nway_si_assign(AVD_CL_CB > bool is_act_ass_sent = false, is_all_su_oos = true, is_all_si_ok = > false, su_found = true; > uint32_t rc = NCSCC_RC_SUCCESS; > AVD_SU_SI_REL *tmp_susi; > + AVD_SG_FSM_STATE old_fsm_state; > > TRACE_ENTER2("%s", sg->name.value); > > + //Remember old sg_fsm_state. > + old_fsm_state = sg->sg_fsm_state; > m_AVD_SET_SG_FSM(cb, sg, AVD_SG_FSM_STABLE); > - m_AVSV_SEND_CKPT_UPDT_ASYNC_UPDT(cb, sg, AVSV_CKPT_SG_FSM_STATE); > + if (old_fsm_state != AVD_SG_FSM_STABLE) > + m_AVSV_SEND_CKPT_UPDT_ASYNC_UPDT(cb, sg, > AVSV_CKPT_SG_FSM_STATE); > > avd_sidep_update_si_dep_state_for_all_sis(sg); > /* assign active assignments to unassigned sis */ > > What do you think? > [Quyen] your suggested modification is not needed. It is already covered in > the macro `m_AVD_SET_SG_FSM` > (one line before the line I removed) > > void AVD_SG::set_fsm_state(AVD_SG_FSM_STATE state) { > TRACE_ENTER(); > > if (sg_fsm_state != state) { > TRACE("%s sg_fsm_state %u => %u", name.value, sg_fsm_state, > state); > sg_fsm_state = state; > m_AVSV_SEND_CKPT_UPDT_ASYNC_UPDT(avd_cb, this, > AVSV_CKPT_SG_FSM_STATE); > } > > Please double check. Thanks! > > Thanks, > Praveen > > On 31-Mar-16 11:50 AM, Quyen Dao wrote: >> Hi, >> >> I will ask Hans N to push this patch tomorrow if there are no more > comments. >> >> Thanks, >> Quyen >> >> -----Original Message----- >> From: Hans Nordebäck [mailto:hans.nordeb...@ericsson.com] >> Sent: Friday, March 11, 2016 3:23 PM >> To: Quyen Dao; gary....@dektech.com.au; praveen.malv...@oracle.com; >> nagendr...@oracle.com >> Cc: opensaf-devel@lists.sourceforge.net >> Subject: Re: [PATCH 1 of 1] amfd: do not checkpoint the nway sg fsm in > case >> of no change [#1697] >> >> ack, code review only/Thanks HansN >> >> On 03/09/2016 11:35 AM, Quyen Dao wrote: >>> osaf/services/saf/amf/amfd/sg_nway_fsm.cc | 1 - >>> 1 files changed, 0 insertions(+), 1 deletions(-) >>> >>> >>> After the nway application AMF entities are created, active amfd calls >>> avd_sg_nway_si_assign function to assign any unassigned SI but all SUs >>> are locked so no SI are assigned and the FSM is still stable. In this >>> function, it always checkpoints the nway sg fsm with state STABLE (at >>> the beginning of the function) to standby even the fsm state is not >> changed. >>> When the standby amfd receives the checkpoint message, it founds that >>> the nway sg doesn't exist in its database and does an assert to >>> generate the coredump. The nway sg doesn't exist in standby is because >>> checkpoint message arrives before the ccb_apply_ccb (which informs the >> nway sg creation) from immnd. >>> >>> Object's checkpoint message arrives to standby when the corresponding >>> object doesn't exist is a rare and race condition case, it needs a > general >> solution. >>> >>> But to solve for this particular crash, it is changed to not >>> checkpoint the nway sg fsm state as it's not changed (this is the correct >> behaviour and aligned with other sg). >>> Since nway sg fsm is not checkpointed because of no change, crash will > not >> happen. >>> >>> diff --git a/osaf/services/saf/amf/amfd/sg_nway_fsm.cc >>> b/osaf/services/saf/amf/amfd/sg_nway_fsm.cc >>> --- a/osaf/services/saf/amf/amfd/sg_nway_fsm.cc >>> +++ b/osaf/services/saf/amf/amfd/sg_nway_fsm.cc >>> @@ -1234,7 +1234,6 @@ uint32_t avd_sg_nway_si_assign(AVD_CL_CB >>> TRACE_ENTER2("%s", sg->name.value); >>> >>> m_AVD_SET_SG_FSM(cb, sg, AVD_SG_FSM_STABLE); >>> - m_AVSV_SEND_CKPT_UPDT_ASYNC_UPDT(cb, sg, AVSV_CKPT_SG_FSM_STATE); >>> >>> avd_sidep_update_si_dep_state_for_all_sis(sg); >>> /* assign active assignments to unassigned sis */ >> > ------------------------------------------------------------------------------ Transform Data into Opportunity. Accelerate data analysis in your applications with Intel Data Analytics Acceleration Library. Click to learn more. http://pubads.g.doubleclick.net/gampad/clk?id=278785471&iu=/4140 _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel