Hi Quyen, avd_sg_nway_si_assign() is called even in the cases when SG is not stable for example in
1) node_fail_su_oper() { ... ... /* if sg hasnt transitioned to sg-realign, initiate new si assignments */ if (AVD_SG_FSM_SG_REALIGN != sg->sg_fsm_state) avd_sg_nway_si_assign(avd_cb, sg); } 2) In avd_sg_nway_node_fail_sg_realign () also. In such cases check-pointing is needed. So in order to fix reported problem in #1697, check-pointing can be done conditionally if old_fsm state before marking SG stable is other than Stable. So patch can be modified to: diff --git a/osaf/services/saf/amf/amfd/sg_nway_fsm.cc b/osaf/services/saf/amf/amfd/sg_nway_fsm.cc --- a/osaf/services/saf/amf/amfd/sg_nway_fsm.cc +++ b/osaf/services/saf/amf/amfd/sg_nway_fsm.cc @@ -1230,11 +1230,15 @@ uint32_t avd_sg_nway_si_assign(AVD_CL_CB bool is_act_ass_sent = false, is_all_su_oos = true, is_all_si_ok = false, su_found = true; uint32_t rc = NCSCC_RC_SUCCESS; AVD_SU_SI_REL *tmp_susi; + AVD_SG_FSM_STATE old_fsm_state; TRACE_ENTER2("%s", sg->name.value); + //Remember old sg_fsm_state. + old_fsm_state = sg->sg_fsm_state; m_AVD_SET_SG_FSM(cb, sg, AVD_SG_FSM_STABLE); - m_AVSV_SEND_CKPT_UPDT_ASYNC_UPDT(cb, sg, AVSV_CKPT_SG_FSM_STATE); + if (old_fsm_state != AVD_SG_FSM_STABLE) + m_AVSV_SEND_CKPT_UPDT_ASYNC_UPDT(cb, sg, AVSV_CKPT_SG_FSM_STATE); avd_sidep_update_si_dep_state_for_all_sis(sg); /* assign active assignments to unassigned sis */ What do you think? Thanks, Praveen On 31-Mar-16 11:50 AM, Quyen Dao wrote: > Hi, > > I will ask Hans N to push this patch tomorrow if there are no more comments. > > Thanks, > Quyen > > -----Original Message----- > From: Hans Nordebäck [mailto:hans.nordeb...@ericsson.com] > Sent: Friday, March 11, 2016 3:23 PM > To: Quyen Dao; gary....@dektech.com.au; praveen.malv...@oracle.com; > nagendr...@oracle.com > Cc: opensaf-devel@lists.sourceforge.net > Subject: Re: [PATCH 1 of 1] amfd: do not checkpoint the nway sg fsm in case > of no change [#1697] > > ack, code review only/Thanks HansN > > On 03/09/2016 11:35 AM, Quyen Dao wrote: >> osaf/services/saf/amf/amfd/sg_nway_fsm.cc | 1 - >> 1 files changed, 0 insertions(+), 1 deletions(-) >> >> >> After the nway application AMF entities are created, active amfd calls >> avd_sg_nway_si_assign function to assign any unassigned SI but all SUs >> are locked so no SI are assigned and the FSM is still stable. In this >> function, it always checkpoints the nway sg fsm with state STABLE (at >> the beginning of the function) to standby even the fsm state is not > changed. >> When the standby amfd receives the checkpoint message, it founds that >> the nway sg doesn't exist in its database and does an assert to >> generate the coredump. The nway sg doesn't exist in standby is because >> checkpoint message arrives before the ccb_apply_ccb (which informs the > nway sg creation) from immnd. >> >> Object's checkpoint message arrives to standby when the corresponding >> object doesn't exist is a rare and race condition case, it needs a general > solution. >> >> But to solve for this particular crash, it is changed to not >> checkpoint the nway sg fsm state as it's not changed (this is the correct > behaviour and aligned with other sg). >> Since nway sg fsm is not checkpointed because of no change, crash will not > happen. >> >> diff --git a/osaf/services/saf/amf/amfd/sg_nway_fsm.cc >> b/osaf/services/saf/amf/amfd/sg_nway_fsm.cc >> --- a/osaf/services/saf/amf/amfd/sg_nway_fsm.cc >> +++ b/osaf/services/saf/amf/amfd/sg_nway_fsm.cc >> @@ -1234,7 +1234,6 @@ uint32_t avd_sg_nway_si_assign(AVD_CL_CB >> TRACE_ENTER2("%s", sg->name.value); >> >> m_AVD_SET_SG_FSM(cb, sg, AVD_SG_FSM_STABLE); >> - m_AVSV_SEND_CKPT_UPDT_ASYNC_UPDT(cb, sg, AVSV_CKPT_SG_FSM_STATE); >> >> avd_sidep_update_si_dep_state_for_all_sis(sg); >> /* assign active assignments to unassigned sis */ > ------------------------------------------------------------------------------ Transform Data into Opportunity. Accelerate data analysis in your applications with Intel Data Analytics Acceleration Library. Click to learn more. http://pubads.g.doubleclick.net/gampad/clk?id=278785471&iu=/4140 _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel