Hi Nell,

I was testing with 200K objects and observed some issues; please verify before pushing.

1) Bring up SC-1 as active with 200K objects.
2) Bring up PL-3.
3) Bring up PL-4.
4) Try to bring up SC-2 as standby.
5) You will observe osafimmnd restarting on the payload(s), and they never re-join.

==================================================================================================================================
Jul 3 10:49:27 PL-4 osafamfnd[3651]: NO 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' Presence State UNINSTANTIATED => INSTANTIATING
Jul 3 10:49:27 PL-4 osafamfwd[3661]: Started
Jul 3 10:49:27 PL-4 osafckptnd[3671]: Started
Jul 3 10:49:27 PL-4 osaflcknd[3681]: Started
Jul 3 10:49:27 PL-4 osafmsgnd[3699]: Started
Jul 3 10:49:27 PL-4 osafimmnd[3624]: NO Implementer connected: 12 (MsgQueueService132111) <49, 2040f>
Jul 3 10:49:27 PL-4 osafsmfnd[3710]: Started
Jul 3 10:49:27 PL-4 osafamfnd[3651]: NO 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' Presence State INSTANTIATING => INSTANTIATED
Jul 3 10:49:27 PL-4 osafamfnd[3651]: NO Assigning 'safSi=NoRed8,safApp=OpenSAF' ACTIVE to 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF'
Jul 3 10:49:27 PL-4 osafamfnd[3651]: NO Assigned 'safSi=NoRed8,safApp=OpenSAF' ACTIVE to 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF'
Jul 3 10:49:27 PL-4 opensafd: OpenSAF(4.5.0 - ) services successfully started
done
PL-4:~ #
Jul 3 10:49:42 PL-4 kernel: [ 568.167588] tipc: Established link <1.1.4:eth2-1.1.2:eth2> on network plane B
Jul 3 10:49:42 PL-4 kernel: [ 568.168970] tipc: Established link <1.1.4:eth0-1.1.2:eth3> on network plane A
Jul 3 10:49:43 PL-4 osafimmnd[3624]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
Jul 3 10:50:09 PL-4 osafamfnd[3651]: NO 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' component restart probation timer started (timeout: 60000000000 ns)
Jul 3 10:50:09 PL-4 osafamfnd[3651]: NO Restarting a component of 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)
Jul 3 10:50:09 PL-4 osafamfnd[3651]: NO 'safComp=IMMND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'componentRestart'
Jul 3 10:50:09 PL-4 osafimmnd[3751]: Started
Jul 3 10:50:09 PL-4 osafimmnd[3751]: NO Persistent Back-End capability configured, Pbe file:imm.db (suffix may get added)
Jul 3 10:50:09 PL-4 osafimmnd[3751]: NO Fevs count adjusted to 203641 preLoadPid: 0
Jul 3 10:50:09 PL-4 osafimmnd[3751]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Jul 3 10:50:09 PL-4 osafimmnd[3751]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Jul 3 10:50:09 PL-4 osafimmnd[3751]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Jul 3 10:50:09 PL-4 osafimmnd[3751]: NO NODE STATE-> IMM_NODE_ISOLATED
Jul 3 10:50:13 PL-4 osafamfnd[3651]: NO Restarting a component of 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' (comp restart count: 2)
Jul 3 10:50:13 PL-4 osafamfnd[3651]: NO 'safComp=IMMND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'componentRestart'
Jul 3 10:50:13 PL-4 osafimmnd[3773]: Started
Jul 3 10:50:13 PL-4 osafimmnd[3773]: NO Persistent Back-End capability configured, Pbe file:imm.db (suffix may get added)
Jul 3 10:50:13 PL-4 osafimmnd[3773]: NO Fevs count adjusted to 203732 preLoadPid: 0
Jul 3 10:50:13 PL-4 osafimmnd[3773]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Jul 3 10:50:14 PL-4 osafimmnd[3773]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Jul 3 10:50:14 PL-4 osafimmnd[3773]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Jul 3 10:50:14 PL-4 osafimmnd[3773]: NO NODE STATE-> IMM_NODE_ISOLATED
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCompBaseType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSUBaseType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSGBaseType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfAppBaseType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSvcBaseType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCSBaseType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCompGlobalAttributes
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCompType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCSType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCtCsType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfHealthcheckType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSvcType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSvcTypeCSTypes
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSUType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSutCompType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSGType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfAppType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCluster
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfNode
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfNodeGroup
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfNodeSwBundle
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfApplication
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSG
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSI
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCSI
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCSIAttribute
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSU
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfComp
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfHealthcheck
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfCompCsType
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSIDependency
Jul 3 10:50:26 PL-4 osafimmnd[3773]: NO Sync client discarded classimplementer set. Impl-id:13 Class:SaAmfSIRankedSU
Jul 3 10:50:28 PL-4 osafimmnd[3773]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Jul 3 10:50:29 PL-4 osafimmnd[3773]: NO SERVER STATE: IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Jul 3 10:50:30 PL-4 osafimmnd[3773]: NO Implementer connected: 14 (MsgQueueService131599) <0, 2020f>
==================================================================================================================================

-AVM

On 7/2/2015 2:54 PM, Anders Björnerstedt wrote:
> Ack from me.
> Not tested.
> Good work!
>
> One thought that struck me is that the message types:
>
> IMMND_EVT_A2ND_IMM_FEVS_2
> IMMD_EVT_ND2D_FEVS_REQ_2
> IMMND_EVT_D2ND_GLOB_FEVS_REQ_2
>
> should (in some later cleanup) be renamed to reflect that they are only used
> for imm-sync.
> e.g. IMMND_EVT_A2ND_IMM_SYNC_FEVS
> Not for this ticket though.
>
> /AndersBj
>
>
> -----Original Message-----
> From: reddy.neelaka...@oracle.com [mailto:reddy.neelaka...@oracle.com]
> Sent: 1 July 2015 16:16
> To: Anders Björnerstedt; Zoran Milinkovic; mahesh.va...@oracle.com
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: [PATCH 1 of 1] imm:checkpoint only FEVS header for sync messages
> [#952] v2
>
> osaf/services/saf/immsv/immd/immd_evt.c | 15 ++++++++++++---
> osaf/services/saf/immsv/immnd/immnd_evt.c | 9 ++++++++-
> 2 files changed, 20 insertions(+), 4 deletions(-)
>
>
> At the time of sync, when check-pointing to standby IMMD for
> IMMND_EVT_D2ND_GLOB_FEVS_REQ_2, the fevs message buffer will be set to NULL
> and the message size will be set to 0, so that the MBCSV check-pointing
> happens only for the header.
>
> diff --git a/osaf/services/saf/immsv/immd/immd_evt.c b/osaf/services/saf/immsv/immd/immd_evt.c
> --- a/osaf/services/saf/immsv/immd/immd_evt.c
> +++ b/osaf/services/saf/immsv/immd/immd_evt.c
> @@ -251,7 +251,7 @@ uint32_t immd_evt_proc_fevs_req(IMMD_CB
>      /* Populate & Send the FEVS Event to IMMND */
>      memset(&send_evt, 0, sizeof(IMMSV_EVT));
>      send_evt.type = IMMSV_EVT_TYPE_IMMND;
> -    send_evt.info.immnd.type = (evt->type == IMMD_EVT_ND2D_FEVS_REQ_2)?
> +    send_evt.info.immnd.type = ((evt->type == IMMD_EVT_ND2D_FEVS_REQ_2)||(evt->type == 0))?
>          IMMND_EVT_D2ND_GLOB_FEVS_REQ_2: IMMND_EVT_D2ND_GLOB_FEVS_REQ;
>
>      if ((evt->type == 0) && (fevs_req->sender_count > 0)) {
> @@ -266,8 +266,8 @@ uint32_t immd_evt_proc_fevs_req(IMMD_CB
>      send_evt.info.immnd.info.fevsReq.msg.size = fevs_req->msg.size;
>      /*Borrow the buffer from the input message instead of copying */
>      send_evt.info.immnd.info.fevsReq.msg.buf = fevs_req->msg.buf;
> -    send_evt.info.immnd.info.fevsReq.isObjSync = (evt->type == IMMD_EVT_ND2D_FEVS_REQ_2)?
> -        (fevs_req->isObjSync):0x0;
> +    send_evt.info.immnd.info.fevsReq.isObjSync = ((evt->type == IMMD_EVT_ND2D_FEVS_REQ_2) ||
> +        (evt->type == 0 ))? (fevs_req->isObjSync):0x0;
>
>      TRACE_5("immd_evt_proc_fevs_req send_count:%llu size:%u",
>          send_evt.info.immnd.info.fevsReq.sender_count, send_evt.info.immnd.info.fevsReq.msg.size);
> @@ -280,6 +280,15 @@ uint32_t immd_evt_proc_fevs_req(IMMD_CB
>      mbcp_msg.type = IMMD_A2S_MSG_FEVS;
>      mbcp_msg.info.fevsReq = send_evt.info.immnd.info.fevsReq;
>
> +    /* FEVS_REQ_2 messages are object sync messages. since this is mbcsv checkpointing
> +       to standby, at the time of sync checkpointing complete fevs event is not required.
> +       Checkpointing the header is sufficient to have the standby SC in sync with the fevs count.*/
> +
> +    if(evt->type == IMMD_EVT_ND2D_FEVS_REQ_2){
> +        mbcp_msg.info.fevsReq.msg.size = 0;
> +        mbcp_msg.info.fevsReq.msg.buf = NULL;
> +        mbcp_msg.info.fevsReq.isObjSync = 0x0;
> +    }
>      /*Checkpoint the message to standby director.
>         Syncronous call=>wait for ack */
>      proc_rc = immd_mbcsv_sync_update(cb, &mbcp_msg);
> diff --git a/osaf/services/saf/immsv/immnd/immnd_evt.c b/osaf/services/saf/immsv/immnd/immnd_evt.c
> --- a/osaf/services/saf/immsv/immnd/immnd_evt.c
> +++ b/osaf/services/saf/immsv/immnd/immnd_evt.c
> @@ -8702,7 +8702,7 @@ static uint32_t immnd_evt_proc_fevs_rcv(
>      SaBoolT originatedAtThisNd = (m_IMMSV_UNPACK_HANDLE_LOW(clnt_hdl) == cb->node_id);
>
>      if (originatedAtThisNd) {
> -        osafassert(!reply_dest || (reply_dest == cb->immnd_mdest_id));
> +        osafassert(!reply_dest || (reply_dest == cb->immnd_mdest_id) || isObjSync );
>          if (cb->fevs_replies_pending) {
>              --(cb->fevs_replies_pending); /*flow control towards IMMD */
>          }
> @@ -8731,6 +8731,12 @@ static uint32_t immnd_evt_proc_fevs_rcv(
>          }
>      }
>
> +    if ((evt->type == IMMND_EVT_D2ND_GLOB_FEVS_REQ_2) && (msg->size == 0) && (msg->buf == NULL)){
> +        // This is sync message Re-broadcasted by IMMD standby because of failover
> +        TRACE("Re-broadcasted FEVS at the time of sync");
> +        goto done;
> +    }
> +
>      /*NORMAL CASE: Received the expected in-order message. */
>
>      SaAisErrorT err = SA_AIS_OK;
> @@ -8749,6 +8755,7 @@ static uint32_t immnd_evt_proc_fevs_rcv(
>          }
>      }
>
> + done:
>      cb->highestProcessed++;
>      dequeue_outgoing(cb);
>      TRACE_LEAVE();
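
For anyone reading the thread without the source tree at hand, the idea in the patch description (checkpoint only the FEVS header for sync messages, and let the receiver skip a payload-less sync message) can be sketched in isolation. This is only a rough standalone illustration under my own assumptions, not OpenSAF code: the type `fevs_req_t`, the enum values `EVT_FEVS_REQ` and `EVT_FEVS_REQ_2`, and the functions `make_checkpoint_copy` and `is_header_only_rebroadcast` are hypothetical names chosen to mirror the fields touched in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real IMMSV event types (illustration only). */
typedef enum { EVT_FEVS_REQ = 1, EVT_FEVS_REQ_2 = 2 } evt_type_t;

/* Hypothetical, simplified FEVS request: a counter (the "header") plus payload. */
typedef struct {
    uint64_t sender_count;  /* FEVS sequence number the standby must track */
    uint32_t size;          /* payload size in bytes */
    const char *buf;        /* payload, borrowed from the incoming message */
    uint8_t is_obj_sync;    /* set for object-sync traffic */
} fevs_req_t;

/* Build the copy that would be checkpointed to the standby.
 * For sync traffic (EVT_FEVS_REQ_2) only the header survives: the payload
 * pointer and size are cleared. The caller's request is left untouched. */
fevs_req_t make_checkpoint_copy(evt_type_t type, const fevs_req_t *req)
{
    assert(req != NULL);
    fevs_req_t cp = *req;   /* shallow copy; buf still points at caller data */
    if (type == EVT_FEVS_REQ_2) {
        cp.size = 0;
        cp.buf = NULL;
        cp.is_obj_sync = 0;
    }
    return cp;
}

/* Receiver side: a sync-type message with no payload is treated as a
 * header-only re-broadcast, so there is nothing to apply and only the
 * processed-message counter would advance. */
bool is_header_only_rebroadcast(evt_type_t type, const fevs_req_t *msg)
{
    assert(msg != NULL);
    return type == EVT_FEVS_REQ_2 && msg->size == 0 && msg->buf == NULL;
}
```

In this sketch a normal request passed through `make_checkpoint_copy` keeps its payload, while a sync request comes out with `size == 0` and `buf == NULL`, which is exactly the shape the receiver-side check looks for.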