Hi Zoran/Neel, This was a misconfiguration; please ignore.
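For anyone who hits the same "rejecting node" warning: the logs below suggest that one SC had the `IMMSV_PBE_FILE` line active in immnd.conf while the other still had it commented out. Here is a minimal sketch of how that could be checked, assuming the conf line has the `export IMMSV_PBE_FILE=...` form quoted below. The helper names `pbe_file` and `pbe_mismatch` are made up for illustration and are not part of OpenSAF.

```python
import re

# Matches an active (uncommented) setting such as: export IMMSV_PBE_FILE=imm.db
# A leading '#' prevents the match, so commented-out lines are ignored.
_PBE_LINE = re.compile(r'^\s*(?:export\s+)?IMMSV_PBE_FILE=(\S+)')


def pbe_file(conf_text):
    """Return the configured PBE file name, or None if the setting is
    absent or commented out (hypothetical helper, not an OpenSAF API)."""
    for line in conf_text.splitlines():
        match = _PBE_LINE.match(line)
        if match:
            return match.group(1)
    return None


def pbe_mismatch(confs):
    """confs maps a node name to the text of its immnd.conf.
    Return True if some nodes enable PBE and others do not."""
    states = {pbe_file(text) is not None for text in confs.values()}
    return len(states) > 1


if __name__ == "__main__":
    # Illustrative input only: one SC with the line commented out, one with it active.
    confs = {
        "SC-1": "#export IMMSV_PBE_FILE=imm.db\n",
        "SC-2": "export IMMSV_PBE_FILE=imm.db\n",
    }
    for node, text in confs.items():
        print(node, "PBE file:", pbe_file(text))
    print("mismatch:", pbe_mismatch(confs))
```

To use it on a real cluster, read the immnd.conf text from each SC into the dictionary; if `pbe_mismatch` returns True, make the setting the same on both SCs before restarting them.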
-AVM On 2/19/2016 9:40 AM, A V Mahesh wrote: > Hi Zoran/Neel, > > I have the cloud resilience feature enabled, with OpenSAF RPMs built with > PBE ( --enable-imm-pbe ), > but the PBE configuration is not enabled ( `#export > IMMSV_PBE_FILE=imm.db` is still not enabled in immnd.conf). > When the 4-node cluster was in a stable state and the CPSV application was running, > I stopped > both SCs and received Return Value : SA_AIS_ERR_TRY_AGAIN for > Finalize ckptHandle, > as expected. At that moment I started both SCs, but neither SC > joined; the error was > ` WA PBE is configured at first attached SC-immnd, but no Pbe file is > configured for immnd at node 2010f - rejecting node` > > Is there anything missing in the configuration? > > =============================================================================================================== > > Feb 19 09:20:43 SC-1 opensafd: Starting OpenSAF Services(5.0.M0 - ) > (Using TIPC) > Starting OpenSAF Services (Using TIPC):Feb 19 09:20:43 SC-1 > osafrded[23407]: Started > Feb 19 09:20:43 SC-1 osafrded[23407]: NO Peer rde@2020f has no state, my > nodeid is less => Setting Active role > Feb 19 09:20:43 SC-1 osaffmd[23416]: Started > Feb 19 09:20:43 SC-1 osafimmd[23426]: Started > Feb 19 09:20:43 SC-1 osafimmd[23426]: NO ******* SC_ABSENCE_ALLOWED > (Headless Hydra) is configured: 900 *********** > Feb 19 09:20:43 SC-1 osafimmd[23426]: NO Waiting 3 seconds to allow > IMMND MDS attachments to get processed. > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND > process at node 2030f old epoch: 0 new epoch:4 > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Ruling epoch changed to:4 > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of fevs count from 0 to > 2628 from 2030f. > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of admoId count from 0 > to 4 from 2030f. > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of ccbId count from 0 > to 2 from 2030f. 
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of impl count from 0 to > 14 from 2030f. > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Attached Nodes:2 Accepted > nodes:0 KnownVeteran:1 doReply:1 > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO First Veteran IMMND found > (payload) at 2030f this IMMD at 2010f. Apparent IMMD lapse, *not* 2PBE > => designating that IMMND as coordinator > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND > process at node 2040f old epoch: 0 new epoch:4 > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted > nodes:1 KnownVeteran:1 doReply:1 > Feb 19 09:20:46 SC-1 osafimmnd[23437]: Started > Feb 19 09:20:46 SC-1 osafimmnd[23437]: NO IMMD service is UP ... > ScAbsenseAllowed?:0 introduced?:0 > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO New IMMND process is on STANDBY > Controller at 2020f > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Extended intro from node 2020f > Feb 19 09:20:46 SC-1 osafimmd[23426]: WA PBE not configured at first > attached SC-immnd, but Pbe is configured for immnd at 2020f - possible > upgrade from pre 4.4 > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Attached Nodes:4 Accepted > nodes:2 KnownVeteran:0 doReply:1 > Feb 19 09:20:46 SC-1 osafimmnd[23437]: NO SERVER STATE: > IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO New IMMND process is on ACTIVE > Controller at 2010f > Feb 19 09:20:46 SC-1 osafimmd[23426]: WA PBE is configured at first > attached SC-immnd, but no Pbe file is configured for immnd at node 2010f > - rejecting node > Feb 19 09:20:46 SC-1 osafimmnd[23437]: NO SETTING COORD TO 0 CLOUD PROTO > Feb 19 09:20:46 SC-1 osafimmnd[23437]: ER IMMND forced to restart on > order from IMMD, exiting > Feb 19 09:20:46 SC-1 opensafd[23368]: ER Failed DESC:IMMND > Feb 19 09:20:46 SC-1 opensafd[23368]: ER Going for recovery > Feb 19 09:20:46 SC-1 opensafd[23368]: ER Trying To RESPAWN > /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #1 > Feb 19 09:20:46 SC-1 
opensafd[23368]: ER Sending SIGKILL to IMMND, pid=23431 > Feb 19 09:20:46 SC-1 osafimmd[23426]: WA Error returned from processing > message err:2 msg-type:2 > Feb 19 09:20:46 SC-1 osafimmd[23426]: WA IMMND on controller (not > currently coord) requests sync > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Sc Absence Allowed is > configured (900) => IMMND coord at payload node:2030f dest566313894805508 > Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Node 2020f request sync > sync-pid:8766 epoch:0 > Feb 19 09:20:48 SC-1 osafimmd[23426]: NO Successfully announced sync. > New ruling epoch:5 > Feb 19 09:20:49 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND > process at node 2030f old epoch: 4 new epoch:5 > Feb 19 09:20:49 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted > nodes:3 KnownVeteran:0 doReply:0 > Feb 19 09:20:49 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND > process at node 2040f old epoch: 4 new epoch:5 > Feb 19 09:20:49 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted > nodes:3 KnownVeteran:0 doReply:0 > Feb 19 09:20:49 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND > process at node 2020f old epoch: 0 new epoch:5 > Feb 19 09:20:49 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted > nodes:3 KnownVeteran:0 doReply:0 > Feb 19 09:21:01 SC-1 osafimmnd[23462]: Started > Feb 19 09:21:01 SC-1 osafimmnd[23462]: NO IMMD service is UP ... 
> ScAbsenseAllowed?:0 introduced?:0 > Feb 19 09:21:02 SC-1 osafimmd[23426]: NO New IMMND process is on ACTIVE > Controller at 2010f > Feb 19 09:21:02 SC-1 osafimmd[23426]: WA PBE is configured at first > attached SC-immnd, but no Pbe file is configured for immnd at node 2010f > - rejecting node > Feb 19 09:21:02 SC-1 osafimmd[23426]: WA Error returned from processing > message err:2 msg-type:2 > Feb 19 09:21:02 SC-1 osafimmnd[23462]: NO SERVER STATE: > IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING > Feb 19 09:21:02 SC-1 osafimmnd[23462]: NO SETTING COORD TO 0 CLOUD PROTO > Feb 19 09:21:02 SC-1 osafimmnd[23462]: ER IMMND forced to restart on > order from IMMD, exiting > Feb 19 09:21:02 SC-1 opensafd[23368]: ER Could Not RESPAWN IMMND > Feb 19 09:21:02 SC-1 opensafd[23368]: ER Failed DESC:IMMND > Feb 19 09:21:02 SC-1 opensafd[23368]: ER Trying To RESPAWN > /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #2 > Feb 19 09:21:02 SC-1 opensafd[23368]: ER Sending SIGKILL to IMMND, pid=23456 > Feb 19 09:21:17 SC-1 osafimmnd[23487]: Started > Feb 19 09:21:17 SC-1 osafimmnd[23487]: NO IMMD service is UP ... 
> ScAbsenseAllowed?:0 introduced?:0 > Feb 19 09:21:17 SC-1 osafimmd[23426]: NO New IMMND process is on ACTIVE > Controller at 2010f > Feb 19 09:21:17 SC-1 osafimmd[23426]: WA PBE is configured at first > attached SC-immnd, but no Pbe file is configured for immnd at node 2010f > - rejecting node > Feb 19 09:21:17 SC-1 osafimmd[23426]: WA Error returned from processing > message err:2 msg-type:2 > Feb 19 09:21:17 SC-1 osafimmnd[23487]: NO SERVER STATE: > IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING > Feb 19 09:21:17 SC-1 osafimmnd[23487]: NO SETTING COORD TO 0 CLOUD PROTO > Feb 19 09:21:17 SC-1 osafimmnd[23487]: ER IMMND forced to restart on > order from IMMD, exiting > Feb 19 09:21:17 SC-1 opensafd[23368]: ER Could Not RESPAWN IMMND > Feb 19 09:21:17 SC-1 opensafd[23368]: ER Failed DESC:IMMND > Feb 19 09:21:17 SC-1 opensafd[23368]: ER FAILED TO RESPAWN > Feb 19 09:21:17 SC-1 osaffmd[23416]: exiting for shutdown > Feb 19 09:21:17 SC-1 osafimmd[23426]: exiting for shutdown > Feb 19 09:21:17 SC-1 osafrded[23407]: exiting for shutdown > Feb 19 09:21:17 SC-1 opensafd: warning: TIPC module unloading failed > failed > Feb 19 09:21:17 SC-1 opensafd: Starting OpenSAF failed > > > Feb 19 09:20:44 SC-2 opensafd: OpenSAF services successfully stopped > Feb 19 09:20:44 SC-2 opensafd: Starting OpenSAF Services(5.0.M0 - ) > (Using TIPC) > Feb 19 09:20:44 SC-2 kernel: [161176.504234] tipc: Activated (version 2.0.0) > Feb 19 09:20:44 SC-2 kernel: [161176.504623] NET: Registered protocol > family 30 > Feb 19 09:20:44 SC-2 kernel: [161176.504626] tipc: Started in single > node mode > Feb 19 09:20:44 SC-2 kernel: [161176.512492] tipc: Started in network mode > Feb 19 09:20:44 SC-2 kernel: [161176.512497] tipc: Own node address > <1.1.2>, network identity 7777 > Feb 19 09:20:44 SC-2 kernel: [161176.514875] tipc: Enabled bearer > <eth:eth3>, discovery domain <1.1.0>, priority 10 > Feb 19 09:20:44 SC-2 kernel: [161176.515726] tipc: Enabled bearer > <eth:eth2>, discovery domain 
<1.1.0>, priority 10 > Feb 19 09:20:44 SC-2 kernel: [161176.516587] tipc: Established link > <1.1.2:eth2-1.1.1:eth1> on network plane B > Feb 19 09:20:44 SC-2 kernel: [161176.516643] tipc: Established link > <1.1.2:eth3-1.1.3:eth4> on network plane A > Feb 19 09:20:44 SC-2 kernel: [161176.517021] tipc: Established link > <1.1.2:eth3-1.1.4:eth0> on network plane A > Feb 19 09:20:44 SC-2 kernel: [161176.518091] tipc: Established link > <1.1.2:eth3-1.1.1:eth0> on network plane A > Feb 19 09:20:44 SC-2 osafrded[8736]: Started > Feb 19 09:20:44 SC-2 kernel: [161176.645456] tipc: Established link > <1.1.2:eth2-1.1.4:eth2> on network plane B > Feb 19 09:20:44 SC-2 kernel: [161176.645566] tipc: Established link > <1.1.2:eth2-1.1.3:eth1> on network plane B > Feb 19 09:20:46 SC-2 osafrded[8736]: NO Peer rde@2010f has no state, my > nodeid is greater => Setting Standby role > Feb 19 09:20:46 SC-2 osaffmd[8745]: Started > Feb 19 09:20:46 SC-2 osafimmd[8755]: Started > Feb 19 09:20:46 SC-2 osafimmd[8755]: NO ******* SC_ABSENCE_ALLOWED > (Headless Hydra) is configured: 900 *********** > Feb 19 09:20:46 SC-2 osafimmd[8755]: NO Waiting 3 seconds to allow IMMND > MDS attachments to get processed. > Feb 19 09:20:49 SC-2 osafimmnd[8766]: Started > Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO Persistent Back-End capability > configured, Pbe file:imm.db (suffix may get added) > Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO IMMD service is UP ... > ScAbsenseAllowed?:0 introduced?:0 > Feb 19 09:20:49 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process > at node 2040f old epoch: 0 new epoch:4 > Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SERVER STATE: > IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING > Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SETTING COORD TO 0 CLOUD PROTO > Feb 19 09:20:49 SC-2 osafimmd[8755]: WA IMMND DOWN on active controller > f1 detected at standby immd!! f2. 
Possible failover > Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO Fevs count adjusted to 2628 > preLoadPid: 0 > Feb 19 09:20:49 SC-2 osafimmd[8755]: WA Message count:2629 + 1 != 2629 > Feb 19 09:20:49 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2629 > Feb 19 09:20:49 SC-2 osafimmnd[8766]: WA Error code 2 returned for > message type 82 - ignoring > Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SERVER STATE: > IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING > Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO ABT REQUESTING SYNC > Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SERVER STATE: > IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING > Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO NODE STATE-> IMM_NODE_ISOLATED > Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: Ruling epoch noted as:5 > Feb 19 09:20:51 SC-2 osafimmd[8755]: NO IMMND coord at 2030f > Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO NODE STATE-> IMM_NODE_W_AVAILABLE > Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO SERVER STATE: > IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT > Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO NODE STATE-> > IMM_NODE_FULLY_AVAILABLE 2715 > Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO RepositoryInitModeT is > SA_IMM_INIT_FROM_FILE > Feb 19 09:20:51 SC-2 osafimmnd[8766]: WA IMM Access Control mode is > DISABLED! 
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO Epoch set to 5 in ImmModel > Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process > at node 2030f old epoch: 4 new epoch:5 > Feb 19 09:20:51 SC-2 osafimmd[8755]: NO IMMND coord at 2030f > Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process > at node 2040f old epoch: 4 new epoch:5 > Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO Implementer connected: 15 > (MsgQueueService131855) <0, 2030f> > Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process > at node 2020f old epoch: 0 new epoch:5 > Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO Implementer connected: 16 > (MsgQueueService132111) <0, 2040f> > Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO SERVER STATE: > IMM_SERVER_SYNC_CLIENT --> IMM_SERVER_READY > Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO ABT ImmModel received > scAbsenceAllowed 900 > Feb 19 09:20:51 SC-2 osaflogd[8776]: Started > Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LOGSV_DATA_GROUPNAME not found > Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LOG root directory is: > "/var/log/opensaf/saflog" > Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LOG data group is: "" > Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LGS_MBCSV_VERSION = 5 > Feb 19 09:20:51 SC-2 osafntfd[8787]: Started > Feb 19 09:21:01 SC-2 osafntfd[8787]: WA saLogInitialize returns try > again, retries... > Feb 19 09:21:04 SC-2 osafimmd[8755]: WA IMMND DOWN on active controller > f1 detected at standby immd!! f2. 
Possible failover > Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2806 > Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA Error code 2 returned for > message type 82 - ignoring > Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2807 > Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA Error code 2 returned for > message type 82 - ignoring > Feb 19 09:21:04 SC-2 osafimmnd[8766]: NO Global discard node received > for nodeId:2010f pid:0 > Feb 19 09:21:04 SC-2 osafimmd[8755]: WA Message count:2808 + 1 != 2808 > Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2808 > Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA Error code 2 returned for > message type 82 - ignoring > Feb 19 09:21:20 SC-2 osafimmd[8755]: WA IMMND DOWN on active controller > f1 detected at standby immd!! f2. Possible failover > Feb 19 09:21:20 SC-2 osafimmd[8755]: NO Skipping re-send of fevs message > 2807 since it has recently been resent. > Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2808 > Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA Error code 2 returned for > message type 82 - ignoring > Feb 19 09:21:20 SC-2 osafimmnd[8766]: NO Global discard node received > for nodeId:2010f pid:0 > Feb 19 09:21:20 SC-2 osafimmd[8755]: WA Message count:2809 + 1 != 2809 > Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2809 > Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA Error code 2 returned for > message type 82 - ignoring > Feb 19 09:21:20 SC-2 osafimmd[8755]: WA IMMD lost contact with peer IMMD > (NCSMDS_RED_DOWN) > Feb 19 09:21:20 SC-2 osafimmd[8755]: NO Skipping re-send of fevs message > 2808 since it has recently been resent. 
> Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2809 > Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA Error code 2 returned for > message type 82 - ignoring > Feb 19 09:21:31 SC-2 opensafd[8704]: ER Timed-out for response from NTFD > Feb 19 09:21:31 SC-2 opensafd[8704]: ER > Feb 19 09:21:31 SC-2 opensafd[8704]: ER Going for recovery > Feb 19 09:21:31 SC-2 opensafd[8704]: ER Trying To RESPAWN > /usr/lib64/opensaf/clc-cli/osaf-ntfd attempt #1 > Feb 19 09:21:31 SC-2 opensafd[8704]: ER Sending SIGABRT to NTFD, > pid=8787, (origin parent pid=8782) > Feb 19 09:21:47 SC-2 osafntfd[8821]: Started > Feb 19 09:21:57 SC-2 osafntfd[8821]: WA saLogInitialize returns try > again, retries... > Feb 19 09:22:27 SC-2 opensafd[8704]: ER Timed-out for response from NTFD > Feb 19 09:22:27 SC-2 opensafd[8704]: ER Could Not RESPAWN NTFD > Feb 19 09:22:27 SC-2 opensafd[8704]: ER > Feb 19 09:22:27 SC-2 opensafd[8704]: ER Trying To RESPAWN > /usr/lib64/opensaf/clc-cli/osaf-ntfd attempt #2 > Feb 19 09:22:27 SC-2 opensafd[8704]: ER Sending SIGABRT to NTFD, > pid=8821, (origin parent pid=8816) > Feb 19 09:22:42 SC-2 osafntfd[8851]: Started > Feb 19 09:22:52 SC-2 osafntfd[8851]: WA saLogInitialize returns try > again, retries... 
> Feb 19 09:23:22 SC-2 opensafd[8704]: ER Timed-out for response from NTFD > Feb 19 09:23:22 SC-2 opensafd[8704]: ER Could Not RESPAWN NTFD > Feb 19 09:23:22 SC-2 opensafd[8704]: ER > Feb 19 09:23:22 SC-2 opensafd[8704]: ER FAILED TO RESPAWN > Feb 19 09:23:22 SC-2 osaffmd[8745]: exiting for shutdown > Feb 19 09:23:22 SC-2 osafimmd[8755]: exiting for shutdown > Feb 19 09:23:22 SC-2 osafimmnd[8766]: WA SC Absence IS allowed:900 IMMD > service is DOWN > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO IMMD SERVICE IS DOWN, HYDRA IS > CONFIGURED => UNREGISTERING IMMND form MDS > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Removing client id:10002020f > sv_id:27 > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO ABT discard_connection OK > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO ABT Client node REMOVED > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO ABT DONE REMOVING CLIENTS > ENTERING immModel_isolateThisNode(cb) > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Implementer disconnected 16 <0, > 2040f(down)> (MsgQueueService132111) > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Impl Discarded node 2040f > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Implementer disconnected 15 <0, > 2030f(down)> (MsgQueueService131855) > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Impl Discarded node 2030f > Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO MDS unregisterede. sleeping ... 
> Feb 19 09:23:23 SC-2 osafimmnd[8766]: NO Sleep done registering IMMND > with MDS > Feb 19 09:23:23 SC-2 osafimmnd[8766]: NO MDS: mds_register_callback: > dest 2020f2a460010 already exist > Feb 19 09:23:23 SC-2 osafimmnd[8766]: NO SUCCESS IN REGISTERING IMMND > WITH MDS > Feb 19 09:23:23 SC-2 osafimmnd[8766]: exiting for shutdown > Feb 19 09:23:23 SC-2 osaflogd[8776]: exiting for shutdown > Feb 19 09:23:23 SC-2 osafrded[8736]: exiting for shutdown > Feb 19 09:23:23 SC-2 kernel: [161335.919458] tipc: Disabling bearer > <eth:eth3> > Feb 19 09:23:23 SC-2 kernel: [161335.919469] tipc: Lost link > <1.1.2:eth3-1.1.3:eth4> on network plane A > Feb 19 09:23:23 SC-2 kernel: [161335.919616] tipc: Lost link > <1.1.2:eth3-1.1.4:eth0> on network plane A > Feb 19 09:23:23 SC-2 kernel: [161335.919672] tipc: Lost link > <1.1.2:eth3-1.1.1:eth0> on network plane A > Feb 19 09:23:23 SC-2 kernel: [161335.919691] tipc: Disabling bearer > <eth:eth2> > Feb 19 09:23:23 SC-2 kernel: [161335.919694] tipc: Lost link > <1.1.2:eth2-1.1.1:eth1> on network plane B > Feb 19 09:23:23 SC-2 kernel: [161335.919697] tipc: Lost contact with <1.1.1> > Feb 19 09:23:23 SC-2 kernel: [161335.919702] tipc: Lost link > <1.1.2:eth2-1.1.4:eth2> on network plane B > Feb 19 09:23:23 SC-2 kernel: [161335.919704] tipc: Lost contact with <1.1.4> > Feb 19 09:23:23 SC-2 kernel: [161335.919708] tipc: Lost link > <1.1.2:eth2-1.1.3:eth1> on network plane B > Feb 19 09:23:23 SC-2 kernel: [161335.919710] tipc: Lost contact with <1.1.3> > Feb 19 09:23:23 SC-2 kernel: [161335.919752] tipc: Left network mode > Feb 19 09:23:23 SC-2 kernel: [161335.919914] NET: Unregistered protocol > family 30 > Feb 19 09:23:23 SC-2 kernel: [161335.919920] tipc: Deactivated > Feb 19 09:23:23 SC-2 opensafd: Starting OpenSAF failed > > =============================================================================================================== > > -AVM > > > On 2/4/2016 3:11 PM, Hung Nguyen wrote: >> Hi Zoran, >> >> Please find my comment 
inline. >> >> BR, >> >> Hung Nguyen - DEK Technologies >> >> >> -------------------------------------------------------------------------------- >> From: Zoran Milinkovic zoran.milinko...@ericsson.com >> Sent: Tuesday, December 22, 2015 9:14PM >> To: Neelakanta Reddy >> reddy.neelaka...@oracle.com >> Cc: Opensaf-devel >> opensaf-devel@lists.sourceforge.net >> Subject: [devel] [PATCH 4 of 5] imm: add IMMND support for cloud resilience >> feature [#1625] >> >> >> osaf/services/saf/immsv/immnd/ImmModel.cc | 115 ++++++++++++++++++++ >> osaf/services/saf/immsv/immnd/ImmModel.hh | 9 +- >> osaf/services/saf/immsv/immnd/immnd_cb.h | 11 +- >> osaf/services/saf/immsv/immnd/immnd_evt.c | 166 >> ++++++++++++++++++++++++---- >> osaf/services/saf/immsv/immnd/immnd_init.h | 13 ++- >> osaf/services/saf/immsv/immnd/immnd_main.c | 7 + >> osaf/services/saf/immsv/immnd/immnd_proc.c | 120 ++++++++++++++++---- >> 7 files changed, 381 insertions(+), 60 deletions(-) >> >> >> The patch contains IMMND code that is needed for supporting cloud resilience >> feature. 
>> >> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc >> b/osaf/services/saf/immsv/immnd/ImmModel.cc >> --- a/osaf/services/saf/immsv/immnd/ImmModel.cc >> +++ b/osaf/services/saf/immsv/immnd/ImmModel.cc >> @@ -446,6 +446,7 @@ static const std::string immPbeBSlaveNam >> static const std::string immLongDnsAllowed(OPENSAF_IMM_LONG_DNS_ALLOWED); >> static const std::string >> immAccessControlMode(OPENSAF_IMM_ACCESS_CONTROL_MODE); >> static const std::string >> immAuthorizedGroup(OPENSAF_IMM_AUTHORIZED_GROUP); >> +static const std::string >> immScAbsenceAllowed(OPENSAF_IMM_SC_ABSENCE_ALLOWED); >> >> static const std::string immMngtClass("SaImmMngt"); >> static const std::string >> immManagementDn("safRdn=immManagement,safApp=safImmService"); >> @@ -492,6 +493,17 @@ struct CcbIdIs >> }; >> >> >> +void >> +immModel_setScAbsenceAllowed(IMMND_CB *cb) >> +{ >> + if(cb->mCanBeCoord == 4) { >> + osafassert(cb->mScAbsenceAllowed > 0); >> + } else { >> + osafassert(cb->mScAbsenceAllowed == 0); >> + } >> + >> ImmModel::instance(&cb->immModel)->setScAbsenceAllowed(cb->mScAbsenceAllowed); >> +} >> + >> SaAisErrorT >> immModel_ccbResult(IMMND_CB *cb, SaUint32T ccbId) >> { >> @@ -511,6 +523,32 @@ immModel_abortSync(IMMND_CB *cb) >> } >> >> void >> +immModel_isolateThisNode(IMMND_CB *cb) >> +{ >> + ImmModel::instance(&cb->immModel)->isolateThisNode(cb->node_id, >> cb->mIsCoord); >> +} >> + >> +void >> +immModel_abortNonCriticalCcbs(IMMND_CB *cb) >> +{ >> + SaUint32T arrSize; >> + SaUint32T* implConnArr = NULL; >> + SaUint32T client; >> + SaClmNodeIdT pbeNodeId; >> + SaUint32T nodeId; >> + CcbVector::iterator i3 = sCcbVector.begin(); >> + for(; i3!=sCcbVector.end(); ++i3) { >> + if((*i3)->mState < IMM_CCB_CRITICAL) { >> + osafassert(immModel_ccbAbort(cb, (*i3)->mId, &arrSize, >> &implConnArr, &client, &nodeId, &pbeNodeId)); >> + osafassert(immModel_ccbFinalize(cb, (*i3)->mId) == SA_AIS_OK); >> + if (arrSize) { >> + free(implConnArr); >> + } >> + } >> + } >> +} >> + >> +void >> 
immModel_pbePrtoPurgeMutations(IMMND_CB *cb, SaUint32T nodeId, SaUint32T >> *reqArrSize, >> SaUint32T **reqConnArr) >> { >> @@ -17171,6 +17209,27 @@ ImmModel::getParentDn(std::string& paren >> TRACE_LEAVE(); >> } >> >> +void >> +ImmModel::setScAbsenceAllowed(SaUint16T scAbsenceAllowed) >> +{ >> + ObjectMap::iterator oi = sObjectMap.find(immObjectDn); >> + osafassert(oi != sObjectMap.end()); >> + ObjectInfo* immObject = oi->second; >> + ImmAttrValueMap::iterator avi = >> + immObject->mAttrValueMap.find(immScAbsenceAllowed); >> + if(avi == immObject->mAttrValueMap.end()) { >> + LOG_WA("Attribue '%s' does not exist in object '%s'", >> + immScAbsenceAllowed.c_str(), immObjectDn.c_str()); >> + return; >> + } >> + >> + osafassert(!(avi->second->isMultiValued())); >> + ImmAttrValue* valuep = (ImmAttrValue *) avi->second; >> + valuep->setValue_int(scAbsenceAllowed); >> + >> + LOG_NO("ABT ImmModel received scAbsenceAllowed %u", scAbsenceAllowed); >> +} >> + >> SaAisErrorT >> ImmModel::finalizeSync(ImmsvOmFinalizeSync* req, bool isCoord, >> bool isSyncClient) >> @@ -18067,3 +18126,59 @@ ImmModel::finalizeSync(ImmsvOmFinalizeSy >> return err; >> } >> >> +void >> +ImmModel::isolateThisNode(unsigned int thisNode, bool isAtCoord) >> +{ >> + /* Move this logic up to immModel_isolate... No need for this extra >> level. >> + But need to abort and terminate ccbs. 
>> + */ >> + ImplementerVector::iterator i; >> + AdminOwnerVector::iterator i2; >> + CcbVector::iterator i3; >> + unsigned int otherNode; >> + >> + if((sImmNodeState != IMM_NODE_FULLY_AVAILABLE) && (sImmNodeState != >> IMM_NODE_R_AVAILABLE)) { >> + LOG_NO("SC abscence interrupted sync of this IMMND - exiting"); >> + exit(0); >> + } >> + >> + i = sImplementerVector.begin(); >> + while(i != sImplementerVector.end()) { >> + IdVector cv, gv; >> + ImplementerInfo* info = (*i); >> + otherNode = info->mNodeId; >> + if(otherNode == thisNode || otherNode == 0) { >> + i++; >> + } else { >> + info = NULL; >> + this->discardNode(otherNode, cv, gv, isAtCoord); >> + LOG_NO("Impl Discarded node %x", otherNode); >> + /* Discard ccbs. */ >> + >> + i = sImplementerVector.begin(); /* restart iteration. */ >> + } >> + } >> + >> + i2 = sOwnerVector.begin(); >> + while(i2 != sOwnerVector.end()) { >> + IdVector cv, gv; >> + AdminOwnerInfo* ainfo = (*i2); >> + otherNode = ainfo->mNodeId; >> + if(otherNode == thisNode || otherNode == 0) { >> + /* ??? (otherNode == 0) is that really correct ??? */ >> + i2++; >> + } else { >> + ainfo = NULL; >> + this->discardNode(otherNode, cv, gv, isAtCoord); >> + LOG_NO("Admo Discarded node %x", otherNode); >> + /* Discard ccbs */ >> + >> + i2 = sOwnerVector.begin(); /* restart iteration. */ >> + } >> + } >> + >> + /* Verify that all noncritical CCBs are aborted. 
>> + Ccbs where client resided at this node chould already have been >> handled in >> + immnd_proc_discard_other_nodes() that calls >> immnd_proc_imma_discard_connection() >> + */ >> +} >> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.hh >> b/osaf/services/saf/immsv/immnd/ImmModel.hh >> --- a/osaf/services/saf/immsv/immnd/ImmModel.hh >> +++ b/osaf/services/saf/immsv/immnd/ImmModel.hh >> @@ -145,12 +145,6 @@ public: >> const immsv_octet_string* >> clName, >> ImmsvOmClassDescr* res); >> >> - SaAisErrorT classSerialize( >> - const char* className, >> - char** data, >> - size_t* size); >> - >> - >> SaAisErrorT attrCreate( >> ClassInfo* classInfo, >> const ImmsvAttrDefinition* attr, >> @@ -480,6 +474,8 @@ public: >> const struct >> ImmsvAdminOperationParam *reqparams, >> struct ImmsvAdminOperationParam >> **rparams, >> SaUint64T searchcount); >> + >> + void setScAbsenceAllowed(SaUint16T scAbsenceAllowed); >> >> SaAisErrorT objectSync(const ImmsvOmObjectSync* req); >> bool fetchRtUpdate(ImmsvOmObjectSync* syncReq, >> @@ -517,6 +513,7 @@ public: >> void recognizedIsolated(); >> bool syncComplete(bool isJoining); >> void abortSync(); >> + void isolateThisNode(unsigned int thisNode, bool >> isAtCoord); >> void pbePrtoPurgeMutations(unsigned int nodeId, >> ConnVector& connVector); >> SaAisErrorT ccbResult(SaUint32T ccbId); >> ImmsvAttrNameList * ccbGrabErrStrings(SaUint32T ccbId); >> diff --git a/osaf/services/saf/immsv/immnd/immnd_cb.h >> b/osaf/services/saf/immsv/immnd/immnd_cb.h >> --- a/osaf/services/saf/immsv/immnd/immnd_cb.h >> +++ b/osaf/services/saf/immsv/immnd/immnd_cb.h >> @@ -113,13 +113,17 @@ typedef struct immnd_cb_tag { >> SaUint32T mMyEpoch; //Epoch counter, used in synch of immnds >> SaUint32T mMyPid; //Is this needed ?? >> SaUint32T mRulingEpoch; >> - uint8_t mAccepted; //Should all fevs messages be processed? 
>> + SaUint32T mLatestAdmoId; >> + SaUint32T mLatestImplId; >> + SaUint32T mLatestCcbId; >> + >> + uint8_t mAccepted; //If=!0 Fevs messages can be processed. 2=>IMMD >> re-introduce. >> uint8_t mIntroduced; //Ack received on introduce message >> uint8_t mSyncRequested; //true=> I am coord, other req sync >> uint8_t mPendSync; //1=>sync announced but not received. >> uint8_t mSyncFinalizing; //1=>finalizeSync sent but not received. >> uint8_t mSync; //true => this node is being synced (client). >> - uint8_t mCanBeCoord; //If!=0 then SC, if 2 the 2pbe arbitration. >> + uint8_t mCanBeCoord; //If!=0 then SC, 2 => 2pbe arbitration, 4 => >> absentScAllowed. >> uint8_t mIsCoord; >> uint8_t mLostNodes; //Detached & not syncreq => delay sync start >> uint8_t mBlockPbeEnable; //Current PBE has not completed shutdown yet. >> @@ -128,6 +132,8 @@ typedef struct immnd_cb_tag { >> bool mIsOtherScUp; //If set & this is an SC then other SC is up(2pbe). >> //False=> *allow* 1safe 2pbe. May err conservatively (true) >> bool mForceClean; //true => Force cleanTheHouse to run once *now*. >> + SaUint16T mScAbsenceAllowed; /* Non zero if "headless Hydra" allowed >> (loss of both IMMDs/SCs). >> + Value is number of seconds of SC absence >> tolerated. */ >> >> /* Information about the IMMD */ >> MDS_DEST immd_mdest_id; >> @@ -161,6 +167,7 @@ typedef struct immnd_cb_tag { >> uint8_t mPbeVeteran; //false => regenerate. true => re-attach >> db-file >> uint8_t mPbeVeteranB; //false => regenerate. true => re-attach >> db-file >> uint8_t mPbeOldVeteranB; //false => restarted, true => stable. (only >> to reduce logging). >> + uint8_t mPbeUsesSharedFs; //false => not use SFS, true => use SFS >> >> SaAmfHAStateT ha_state; // present AMF HA state of the component >> EDU_HDL immnd_edu_hdl; // edu handle, obscurely needed by mds. 
>> diff --git a/osaf/services/saf/immsv/immnd/immnd_evt.c >> b/osaf/services/saf/immsv/immnd/immnd_evt.c >> --- a/osaf/services/saf/immsv/immnd/immnd_evt.c >> +++ b/osaf/services/saf/immsv/immnd/immnd_evt.c >> @@ -75,9 +75,9 @@ static void immnd_evt_proc_admo_finalize >> IMMND_EVT *evt, >> SaBoolT originatedAtThisNd, >> SaImmHandleT clnt_hdl, MDS_DEST reply_dest); >> >> -static void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb, >> - IMMND_EVT *evt, >> - SaBoolT originatedAtThisNd, >> SaImmHandleT clnt_hdl, MDS_DEST reply_dest); >> +//static void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb, >> +// IMMND_EVT *evt, >> +// SaBoolT originatedAtThisNd, >> SaImmHandleT clnt_hdl, MDS_DEST reply_dest); >> >> static void immnd_evt_proc_admo_set(IMMND_CB *cb, >> IMMND_EVT *evt, >> @@ -1515,7 +1515,7 @@ static uint32_t immnd_evt_proc_search_ne >> on a previous syncronous call. Discard the >> connection and return >> BAD_HANDLE to allow client to recover and make >> progress. >> */ >> - immnd_proc_imma_discard_connection(cb, cl_node); >> + immnd_proc_imma_discard_connection(cb, cl_node, false); >> rc = immnd_client_node_del(cb, cl_node); >> osafassert(rc == NCSCC_RC_SUCCESS); >> free(cl_node); >> @@ -1973,7 +1973,7 @@ static uint32_t immnd_evt_proc_imm_final >> goto agent_rsp; >> } >> >> - immnd_proc_imma_discard_connection(cb, cl_node); >> + immnd_proc_imma_discard_connection(cb, cl_node, false); >> >> rc = immnd_client_node_del(cb, cl_node); >> if (rc == NCSCC_RC_FAILURE) { >> @@ -2197,9 +2197,11 @@ static uint32_t immnd_evt_proc_imm_clien >> cl_node->mIsResurrect = 0x1; >> >> if (immnd_client_node_add(cb, cl_node) != NCSCC_RC_SUCCESS) { >> +#if 0 //CLOUD-PROTO ABT clients should be discarded !!!! >> LOG_ER("IMMND - Adding temporary imma client Failed."); >> /*free(cl_node);*/ >> abort(); >> +#endif >> } >> >> TRACE_2("Added client with id: %llx <node:%x, count:%u>", >> @@ -2314,7 +2316,7 @@ static uint32_t immnd_evt_proc_admowner_ >> on a previous syncronous call. 
Discard the >> connection and return >> BAD_HANDLE to allow client to recover and make >> progress. >> */ >> - immnd_proc_imma_discard_connection(cb, cl_node); >> + immnd_proc_imma_discard_connection(cb, cl_node, false); >> rc = immnd_client_node_del(cb, cl_node); >> osafassert(rc == NCSCC_RC_SUCCESS); >> free(cl_node); >> @@ -2442,7 +2444,7 @@ static uint32_t immnd_evt_proc_impl_set( >> on a previous syncronous call. Discard the >> connection and return >> BAD_HANDLE to allow client to recover and make >> progress. >> */ >> - immnd_proc_imma_discard_connection(cb, cl_node); >> + immnd_proc_imma_discard_connection(cb, cl_node, false); >> rc = immnd_client_node_del(cb, cl_node); >> osafassert(rc == NCSCC_RC_SUCCESS); >> free(cl_node); >> @@ -2573,7 +2575,7 @@ static uint32_t immnd_evt_proc_ccb_init( >> on a previous syncronous call. Discard the >> connection and return >> BAD_HANDLE to allow client to recover and make >> progress. >> */ >> - immnd_proc_imma_discard_connection(cb, cl_node); >> + immnd_proc_imma_discard_connection(cb, cl_node, false); >> rc = immnd_client_node_del(cb, cl_node); >> osafassert(rc == NCSCC_RC_SUCCESS); >> free(cl_node); >> @@ -2680,7 +2682,7 @@ static uint32_t immnd_evt_proc_rt_update >> on a previous syncronous call. Discard the >> connection and return >> BAD_HANDLE to allow client to recover and make >> progress. >> */ >> - immnd_proc_imma_discard_connection(cb, cl_node); >> + immnd_proc_imma_discard_connection(cb, cl_node, false); >> rc = immnd_client_node_del(cb, cl_node); >> osafassert(rc == NCSCC_RC_SUCCESS); >> free(cl_node); >> @@ -2866,7 +2868,7 @@ static uint32_t immnd_evt_proc_fevs_forw >> on a previous syncronous call. Discard the >> connection and return >> BAD_HANDLE to allow client to recover and >> make progress. 
>> */ >> - immnd_proc_imma_discard_connection(cb, cl_node); >> + immnd_proc_imma_discard_connection(cb, cl_node, >> false); >> rc = immnd_client_node_del(cb, cl_node); >> osafassert(rc == NCSCC_RC_SUCCESS); >> free(cl_node); >> @@ -8317,7 +8319,7 @@ uint32_t immnd_evt_proc_abort_sync(IMMND >> if (cb->mState == IMM_SERVER_SYNC_CLIENT || >> cb->mState == IMM_SERVER_SYNC_PENDING) { /* Sync >> client will have to restart the sync */ >> cb->mState = IMM_SERVER_LOADING_PENDING; >> - LOG_WA("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM >> SERVER LOADING PENDING (sync aborted)"); >> + LOG_WA("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> >> IMM_SERVER_LOADING_PENDING (sync aborted)"); >> cb->mStep = 0; >> cb->mJobStart = time(NULL); >> osafassert(cb->mJobStart >= ((time_t) 0)); >> @@ -8451,6 +8453,7 @@ static uint32_t immnd_evt_proc_start_syn >> with respect to the just arriving start-sync. >> Search for "ticket:#598" in immnd_proc.c >> */ >> + immModel_setScAbsenceAllowed(cb); >> } else if ((cb->mState == IMM_SERVER_SYNC_CLIENT) && >> (immnd_syncComplete(cb, SA_FALSE, cb->mStep))) { >> cb->mStep = 0; >> cb->mJobStart = time(NULL); >> @@ -8467,6 +8470,7 @@ static uint32_t immnd_evt_proc_start_syn >> with respect to the just arriving start-sync. 
>> Search for "ticket:#599" in immnd_proc.c >> */ >> + immModel_setScAbsenceAllowed(cb); >> } >> >> cb->mRulingEpoch = evt->info.ctrl.rulingEpoch; >> @@ -8543,7 +8547,7 @@ static uint32_t immnd_evt_proc_start_syn >> static uint32_t immnd_evt_proc_reset(IMMND_CB *cb, IMMND_EVT *evt, >> IMMSV_SEND_INFO *sinfo) >> { >> TRACE_ENTER(); >> - if (cb->mIntroduced) { >> + if (cb->mIntroduced==1) { >> LOG_ER("IMMND forced to restart on order from IMMD, exiting"); >> if(cb->mState < IMM_SERVER_READY) { >> immnd_ackToNid(NCSCC_RC_FAILURE); >> @@ -8668,11 +8672,15 @@ static uint32_t immnd_evt_proc_intro_rsp >> evt->info.ctrl.nodeId != cb->node_id); >> cb->mNumNodes++; >> TRACE("immnd_evt_proc_intro_rsp cb->mNumNodes: %u", cb->mNumNodes); >> + LOG_IN("immnd_evt_proc_intro_rsp: epoch:%i rulingEpoch:%u", >> cb->mMyEpoch, evt->info.ctrl.rulingEpoch); >> + if(evt->info.ctrl.rulingEpoch > cb->mRulingEpoch) { >> + cb->mRulingEpoch = evt->info.ctrl.rulingEpoch; >> + } >> >> if (evt->info.ctrl.nodeId == cb->node_id) { >> /*This node was introduced to the IMM cluster */ >> uint8_t oldCanBeCoord = cb->mCanBeCoord; >> - cb->mIntroduced = true; >> + cb->mIntroduced = 1; >> if(evt->info.ctrl.canBeCoord == 3) { >> cb->m2Pbe = 1; >> evt->info.ctrl.canBeCoord = 1; >> @@ -8708,6 +8716,14 @@ static uint32_t immnd_evt_proc_intro_rsp >> ((oldCanBeCoord == 2)?"load":"sync")); >> } >> >> + if(cb->mCanBeCoord == 4) { >> + osafassert(!(cb->m2Pbe)); >> + cb->mScAbsenceAllowed = evt->info.ctrl.ndExecPid; >> + LOG_IN("ABT cb->mScAbsenceAllowed:%u >> evt->info.ctrl.ndExecPid:%u", cb->mScAbsenceAllowed, >> evt->info.ctrl.ndExecPid); >> + LOG_IN("SC_ABSENCE_ALLOWED (Headless Hydra) is >> configured for %u seconds. 
CanBeCoord:%u", >> + cb->mScAbsenceAllowed, cb->mCanBeCoord); >> + } >> + >> if (evt->info.ctrl.isCoord) { >> if (cb->mIsCoord) { >> LOG_NO("This IMMND re-elected coord >> redundantly, failover ?"); >> @@ -8733,7 +8749,14 @@ static uint32_t immnd_evt_proc_intro_rsp >> >> } >> } >> - cb->mIsCoord = evt->info.ctrl.isCoord; >> + if(cb->mIsCoord) { >> + if(!(evt->info.ctrl.isCoord)) { >> + LOG_NO("ABT CLOUD PROTO avoided canceling coord >> - SHOULD NOT GET HERE"); >> + } >> + } else { >> + LOG_NO("SETTING COORD TO %u CLOUD PROTO", >> evt->info.ctrl.isCoord); >> + cb->mIsCoord = evt->info.ctrl.isCoord; >> + } >> osafassert(!cb->mIsCoord || cb->mCanBeCoord); >> cb->mRulingEpoch = evt->info.ctrl.rulingEpoch; >> if (cb->mRulingEpoch) { >> @@ -8751,7 +8774,7 @@ static uint32_t immnd_evt_proc_intro_rsp >> >> */ >> if(cb->mCanBeCoord && evt->info.ctrl.canBeCoord) { >> - LOG_IN("Other SC node (%x) has been introduced", >> evt->info.ctrl.nodeId); >> + LOG_IN("Other %s IMMND node (%x) has been introduced", >> (cb->mScAbsenceAllowed)?"candidate coord":"SC", evt->info.ctrl.nodeId); >> cb->mIsOtherScUp = true; /* Prevents oneSafe2PBEAllowed >> from being turned on */ >> cb->other_sc_node_id = evt->info.ctrl.nodeId; >> >> @@ -9066,7 +9089,9 @@ static void immnd_evt_proc_adminit_rsp(I >> SaUint32T conn; >> SaUint32T ownerId = 0; >> >> - osafassert(evt); >> + /* Remember latest admo_id for IMMD recovery. */ >> + cb->mLatestAdmoId = evt->info.adminitGlobal.globalOwnerId; >> + >> conn = m_IMMSV_UNPACK_HANDLE_HIGH(clnt_hdl); >> nodeId = m_IMMSV_UNPACK_HANDLE_LOW(clnt_hdl); >> ownerId = evt->info.adminitGlobal.globalOwnerId; >> @@ -9231,6 +9256,45 @@ static void immnd_evt_proc_finalize_sync >> /*This adjust-epoch will persistify the new epoch for: >> veterans. */ >> immnd_adjustEpoch(cb, SA_TRUE); /* Will osafassert if >> immd is down. */ >> } >> + >> + if(cb->mScAbsenceAllowed) {/* Coord and veteran nodes. 
*/ >> + IMMND_IMM_CLIENT_NODE *cl_node = NULL; >> + SaImmHandleT prev_hdl; >> + unsigned int count = 0; >> + IMMSV_EVT send_evt; >> + /* Sync completed for veteran & headless allowed => >> trigger active >> + resurrect. */ >> + memset(&send_evt, '\0', sizeof(IMMSV_EVT)); >> + send_evt.type = IMMSV_EVT_TYPE_IMMA; >> + send_evt.info.imma.type = >> IMMA_EVT_ND2A_PROC_STALE_CLIENTS; >> + immnd_client_node_getnext(cb, 0, &cl_node); >> + while (cl_node) { >> + prev_hdl = cl_node->imm_app_hdl; >> + if(!(cl_node->mIsResurrect)) { >> + LOG_IN("Veteran node found active >> client id: %llx " >> + "version:%c %u %u, after sync.", >> + cl_node->imm_app_hdl, >> cl_node->version.releaseCode, >> + cl_node->version.majorVersion, >> + cl_node->version.minorVersion); >> + immnd_client_node_getnext(cb, prev_hdl, >> &cl_node); >> + continue; >> + } >> + /* Send resurrect message. */ >> + if (immnd_mds_msg_send(cb, cl_node->sv_id, >> + cl_node->agent_mds_dest, >> &send_evt)!=NCSCC_RC_SUCCESS) >> + { >> + LOG_WA("Failed to send active resurrect >> message"); >> + } >> + /* Remove the temporary client node. */ >> + immnd_client_node_del(cb, cl_node); >> + memset(cl_node, '\0', >> sizeof(IMMND_IMM_CLIENT_NODE)); >> + free(cl_node); >> + cl_node = NULL; >> + ++count; >> + immnd_client_node_getnext(cb, 0, &cl_node); >> + } >> + TRACE_2("Triggered %u active resurrects at veteran >> node", count); >> + } >> } >> >> done: >> @@ -9485,7 +9549,7 @@ static void immnd_evt_proc_admo_finalize >> * is to be sent (only relevant >> if >> * originatedAtThisNode is >> false). 
>> >> *****************************************************************************/ >> -static void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb, >> +void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb, >> IMMND_EVT *evt, >> SaBoolT originatedAtThisNd, >> SaImmHandleT clnt_hdl, MDS_DEST reply_dest) >> { >> @@ -9550,6 +9614,9 @@ static void immnd_evt_proc_impl_set_rsp( >> evt->info.implSet.oi_timeout = 0; >> } >> >> + /* Remember latest impl_id for IMMD recovery. */ >> + cb->mLatestImplId = evt->info.implSet.impl_id; >> + >> err = immModel_implementerSet(cb, &(evt->info.implSet.impl_name), >> (originatedAtThisNd) ? conn : 0, nodeId, implId, >> reply_dest, evt->info.implSet.oi_timeout, >> &discardImplementer); >> @@ -9934,6 +10001,9 @@ static void immnd_evt_proc_ccbinit_rsp(I >> nodeId = m_IMMSV_UNPACK_HANDLE_LOW(clnt_hdl); >> ccbId = evt->info.ccbinitGlobal.globalCcbId; >> >> + /* Remember latest ccb_id for IMMD recovery. */ >> + cb->mLatestCcbId = evt->info.ccbinitGlobal.globalCcbId; >> + >> err = immModel_ccbCreate(cb, >> evt->info.ccbinitGlobal.i.adminOwnerId, >> evt->info.ccbinitGlobal.i.ccbFlags, >> @@ -10053,12 +10123,61 @@ static uint32_t immnd_evt_proc_mds_evt(I >> immnd_proc_imma_down(cb, evt->info.mds_info.dest, >> evt->info.mds_info.svc_id); >> } else if ((evt->info.mds_info.change == NCSMDS_DOWN) && >> evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMD) { >> /* Cluster is going down. */ >> - LOG_NO("No IMMD service => cluster restart, exiting"); >> - if(cb->mState < IMM_SERVER_SYNC_SERVER) { >> - immnd_ackToNid(NCSCC_RC_FAILURE); >> - } >> - exit(1); >> - >> + if(cb->mScAbsenceAllowed == 0) { >> + /* Regular (non Hydra) exit on IMMD DOWN. 
*/ >> + LOG_ER("No IMMD service => cluster restart, exiting"); >> + if(cb->mState < IMM_SERVER_SYNC_SERVER) { >> + immnd_ackToNid(NCSCC_RC_FAILURE); >> + } >> + exit(1); >> + } else { /* SC ABSENCE ALLOWED */ >> + LOG_WA("SC Absence IS allowed:%u IMMD service is DOWN", >> cb->mScAbsenceAllowed); >> + if(cb->mIsCoord) { >> + /* Note that normally the coord will reside at >> SCs so this branch will >> + only be relevant if REPEATED toal scAbsence >> occurs. After SC absence >> + and subsequent return of SC, the coord will >> be elected at a payload. >> + That coord will be active untill restart of >> that payload.. >> + unless we add functionality for the payload >> coord to restart after >> + a few minutes .. ? >> + */ >> + LOG_WA("This IMMND coord has to exit allowing >> restarted IMMD to select new coord"); >> + if(cb->mState < IMM_SERVER_SYNC_SERVER) { >> + immnd_ackToNid(NCSCC_RC_FAILURE); >> + } >> + exit(1); >> + } else if(cb->mState <= IMM_SERVER_LOADING_PENDING) { >> + /* Reset state in payloads that had not joined. >> No need to restart. */ >> + LOG_IN("Resetting IMMND state from %u to >> IMM_SERVER_ANONYMOUS", cb->mState); >> + cb->mState = IMM_SERVER_ANONYMOUS; >> + } else if(cb->mState < IMM_SERVER_READY) { >> + LOG_WA("IMMND was being synced or loaded (%u), >> has to restart", cb->mState); >> + if(cb->mState < IMM_SERVER_SYNC_SERVER) { >> + immnd_ackToNid(NCSCC_RC_FAILURE); >> + } >> + exit(1); >> + } >> + } >> + cb->mIntroduced = 2; >> + LOG_NO("IMMD SERVICE IS DOWN, HYDRA IS CONFIGURED => >> UNREGISTERING IMMND form MDS"); >> + immnd_mds_unregister(cb); >> + /* Discard local clients ... */ >> + immnd_proc_discard_other_nodes(cb); /* Isolate from the rest of >> cluster */ >> + LOG_NO("MDS unregisterede. 
sleeping ..."); >> + sleep(1); >> + LOG_NO("Sleep done registering IMMND with MDS"); >> + rc = immnd_mds_register(immnd_cb); >> + if(rc == NCSCC_RC_SUCCESS) { >> + LOG_NO("SUCCESS IN REGISTERING IMMND WITH MDS"); >> + } else { >> + LOG_ER("FAILURE IN REGISTERING IMMND WITH MDS - >> exiting"); >> + exit(1); >> + } >> + } else if ((evt->info.mds_info.change == NCSMDS_UP) && >> (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMD)) { >> + LOG_NO("IMMD service is UP ... ScAbsenseAllowed?:%u >> introduced?:%u", >> + cb->mScAbsenceAllowed, cb->mIntroduced); >> + if((cb->mIntroduced==2) && (immnd_introduceMe(cb) != >> NCSCC_RC_SUCCESS)) { >> + LOG_WA("IMMND re-introduceMe after IMMD restart failed, >> will retry"); >> + } >> } else if ((evt->info.mds_info.change == NCSMDS_UP) && >> (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMA_OM || >> evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMA_OM)) { >> @@ -10073,7 +10192,6 @@ static uint32_t immnd_evt_proc_mds_evt(I >> TRACE_2("IMMD FAILOVER"); >> /* The IMMD has failed over. 
*/ >> immnd_proc_imma_discard_stales(cb); >> - >> } else if (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMND) { >> LOG_NO("MDS SERVICE EVENT OF TYPE IMMND!!"); >> } >> diff --git a/osaf/services/saf/immsv/immnd/immnd_init.h >> b/osaf/services/saf/immsv/immnd/immnd_init.h >> --- a/osaf/services/saf/immsv/immnd/immnd_init.h >> +++ b/osaf/services/saf/immsv/immnd/immnd_init.h >> @@ -39,8 +39,10 @@ extern IMMND_CB *immnd_cb; >> >> /* file : - immnd_proc.c */ >> >> +void immnd_proc_discard_other_nodes(IMMND_CB *cb); >> + >> void immnd_proc_imma_down(IMMND_CB *cb, MDS_DEST dest, NCSMDS_SVC_ID >> sv_id); >> -uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, >> IMMND_IMM_CLIENT_NODE *cl_node); >> +uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, >> IMMND_IMM_CLIENT_NODE *cl_node, bool scAbsenceAllowed); >> void immnd_proc_imma_discard_stales(IMMND_CB *cb); >> >> void immnd_cb_dump(void); >> @@ -75,6 +77,10 @@ extern "C" { >> >> void immModel_abortSync(IMMND_CB *cb); >> >> + void immModel_isolateThisNode(IMMND_CB *cb); >> + >> + void immModel_abortNonCriticalCcbs(IMMND_CB *cb); >> + >> void immModel_pbePrtoPurgeMutations(IMMND_CB *cb, unsigned int nodeId, >> SaUint32T *reqArrSize, >> SaUint32T **reqConArr); >> >> @@ -433,6 +439,8 @@ extern "C" { >> const char *errorString, >> ...); >> >> + void immModel_setScAbsenceAllowed(IMMND_CB *cb); >> + >> #ifdef __cplusplus >> } >> #endif >> @@ -471,6 +479,9 @@ uint32_t immnd_mds_get_handle(IMMND_CB * >> /* File : ---- immnd_evt.c */ >> void immnd_process_evt(void); >> uint32_t immnd_evt_destroy(IMMSV_EVT *evt, SaBoolT onheap, uint32_t >> line); >> +void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb, IMMND_EVT *evt, >> + SaBoolT originatedAtThisNd, SaImmHandleT clnt_hdl, MDS_DEST reply_dest); >> + >> /* End : ---- immnd_evt.c */ >> >> /* File : ---- immnd_proc.c */ >> diff --git a/osaf/services/saf/immsv/immnd/immnd_main.c >> b/osaf/services/saf/immsv/immnd/immnd_main.c >> --- 
a/osaf/services/saf/immsv/immnd/immnd_main.c >> +++ b/osaf/services/saf/immsv/immnd/immnd_main.c >> @@ -169,6 +169,13 @@ static uint32_t immnd_initialize(char *p >> immnd_cb->mPbeFile); >> } >> >> + if ((envVar = getenv("IMMSV_USE_SHARED_FS"))) { >> + int useSharedFs = atoi(envVar); >> + if(useSharedFs != 0) { >> + immnd_cb->mPbeUsesSharedFs = 1; >> + } >> + } >> + >> immnd_cb->mRim = SA_IMM_INIT_FROM_FILE; >> immnd_cb->mPbeVeteran = SA_FALSE; >> immnd_cb->mPbeVeteranB = SA_FALSE; >> diff --git a/osaf/services/saf/immsv/immnd/immnd_proc.c >> b/osaf/services/saf/immsv/immnd/immnd_proc.c >> --- a/osaf/services/saf/immsv/immnd/immnd_proc.c >> +++ b/osaf/services/saf/immsv/immnd/immnd_proc.c >> @@ -34,6 +34,7 @@ >> >> #include "immnd.h" >> #include "immsv_api.h" >> +#include "immnd_init.h" >> >> static const char *loaderBase = "osafimmloadd"; >> static const char *pbeBase = "osafimmpbed"; >> @@ -76,7 +77,7 @@ void immnd_proc_immd_down(IMMND_CB *cb) >> * Notes : Policy used for handling immd down is to blindly >> cleanup >> * :immnd_cb >> >> ****************************************************************************/ >> -uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, >> IMMND_IMM_CLIENT_NODE *cl_node) >> +uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, >> IMMND_IMM_CLIENT_NODE *cl_node, bool scAbsence) >> { >> SaUint32T client_id; >> SaUint32T node_id; >> @@ -129,7 +130,8 @@ uint32_t immnd_proc_imma_discard_connect >> send_evt.type = IMMSV_EVT_TYPE_IMMD; >> send_evt.info.immd.type = IMMD_EVT_ND2D_DISCARD_IMPL; >> send_evt.info.immd.info.impl_set.r.impl_id = implId; >> - if (immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, >> cb->immd_mdest_id, &send_evt) != NCSCC_RC_SUCCESS) { >> + >> + if (!scAbsence && immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, >> cb->immd_mdest_id, &send_evt) != NCSCC_RC_SUCCESS) { >> if (immnd_is_immd_up(cb)) { >> LOG_ER("Discard implementer failed for >> implId:%u " >> "but IMMD is up !? - case not handled. 
>> Client will be orphanded", implId);
>> @@ -142,7 +144,8 @@ uint32_t immnd_proc_imma_discard_connect
>> 		/*Discard the local implementer directly and redundantly to
>> avoid
>> 		   race conditions using this implementer (ccb's causing abort
>> upcalls).
>> 		 */
>> -		immModel_discardImplementer(cb, implId, SA_FALSE, NULL, NULL);
>> +		//immModel_discardImplementer(cb, implId, SA_FALSE, NULL, NULL);
>> +		immModel_discardImplementer(cb, implId, scAbsence, NULL, NULL);
>> 	}
>>
>> 	if (cl_node->mIsStale) {
>> @@ -163,7 +166,7 @@ uint32_t immnd_proc_imma_discard_connect
>> 		for (ix = 0; ix < arrSize && !(cl_node->mIsStale); ++ix) {
>> 			send_evt.info.immd.info.ccbId = idArr[ix];
>> 			TRACE_5("Discarding Ccb id:%u originating at dead
>> connection: %u", idArr[ix], client_id);
>> -			if (immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD,
>> cb->immd_mdest_id,
>> +			if (!scAbsence && immnd_mds_msg_send(cb,
>> NCSMDS_SVC_ID_IMMD, cb->immd_mdest_id,
>> [Hung] We don't need this ...
>>
>> 						&send_evt) != NCSCC_RC_SUCCESS) {
>> 				if (immnd_is_immd_up(cb)) {
>> 					LOG_ER("Failure to broadcast discard
>> Ccb for ccbId:%u "
>> @@ -174,6 +177,8 @@ uint32_t immnd_proc_imma_discard_connect
>> 						"(immd down)- will retry later",
>> idArr[ix]);
>> 				}
>> 				cl_node->mIsStale = true;
>> +			} else if(scAbsence) {
>> +				/* ABT TODO discard local ccbs ??*/
>> [Hung] ... and this. When 'scAbsence' is true, the code will not send
>> out any message. We can just simply do something like this, it will be
>> faster. *if (!scAbsence) immModel_getCcbIdsForOrigCon(cb, client_id,
>> &arrSize, &idArr);* 'arrSize' is initialized with '0' so it will not
>> enter the 'if' block.
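To make the suggestion above concrete, here is a minimal, self-contained sketch of the guard Hung describes: skip the CCB-id lookup entirely when running headless, so the array size stays 0 and the send loop never runs. This is only an illustration under stated assumptions. `get_ccb_ids_for_conn` and `send_discard_ccb` are hypothetical stand-ins for `immModel_getCcbIdsForOrigCon` and the `immnd_mds_msg_send` call towards IMMD; they do not reproduce the real OpenSAF signatures, and the ids returned are made up.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for immModel_getCcbIdsForOrigCon(): reports the ids
 * of the CCBs that originated at a connection. The three ids are invented. */
static void get_ccb_ids_for_conn(uint32_t conn, uint32_t *arr_size, uint32_t **id_arr)
{
	(void)conn;
	*arr_size = 3;
	*id_arr = malloc(*arr_size * sizeof(uint32_t));
	if (*id_arr == NULL) {
		*arr_size = 0;
		return;
	}
	for (uint32_t ix = 0; ix < *arr_size; ++ix)
		(*id_arr)[ix] = 100 + ix;
}

/* Hypothetical stand-in for sending a discard-CCB message to IMMD.
 * Always succeeds in this sketch. */
static bool send_discard_ccb(uint32_t ccb_id)
{
	(void)ccb_id;
	return true;
}

/* Returns the number of discard messages sent for the connection.
 * With sc_absence set, the lookup is skipped, arr_size stays 0 and
 * nothing is sent, so no per-iteration scAbsence check is needed. */
unsigned discard_ccbs_for_conn(uint32_t conn, bool sc_absence)
{
	uint32_t arr_size = 0;
	uint32_t *id_arr = NULL;
	unsigned sent = 0;

	if (!sc_absence)
		get_ccb_ids_for_conn(conn, &arr_size, &id_arr);

	for (uint32_t ix = 0; ix < arr_size; ++ix) {
		if (send_discard_ccb(id_arr[ix]))
			++sent;
	}

	free(id_arr); /* free(NULL) is a no-op in the headless case */
	return sent;
}
```

In this sketch `discard_ccbs_for_conn(7, false)` walks all three stub ids, while `discard_ccbs_for_conn(7, true)` returns without touching the loop body. Whether locally originated CCBs still need separate cleanup in the headless case (the "ABT TODO" in the patch) is not addressed here.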
>> >> } >> } >> free(idArr); >> @@ -197,20 +202,29 @@ uint32_t immnd_proc_imma_discard_connect >> send_evt.type = IMMSV_EVT_TYPE_IMMD; >> send_evt.info.immd.type = IMMD_EVT_ND2D_ADMO_HARD_FINALIZE; >> for (ix = 0; ix < arrSize && !(cl_node->mIsStale); ++ix) { >> - send_evt.info.immd.info.admoId = idArr[ix]; >> TRACE_5("Hard finalize of AdmOwner id:%u originating at >> " >> "dead connection: %u", idArr[ix], client_id); >> - if (immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, >> cb->immd_mdest_id, >> + if (scAbsence) { >> + SaImmHandleT clnt_hdl; >> + MDS_DEST reply_dest; >> + memset(&clnt_hdl, '\0', sizeof(SaImmHandleT)); >> + memset(&reply_dest, '\0', sizeof(MDS_DEST)); >> + send_evt.info.immnd.info.admFinReq.adm_owner_id >> = idArr[ix]; >> + immnd_evt_proc_admo_hard_finalize(cb, >> &send_evt.info.immnd, false, clnt_hdl, reply_dest); >> + } else { >> + send_evt.info.immd.info.admoId = idArr[ix]; >> + if(immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, >> cb->immd_mdest_id, >> &send_evt) != NCSCC_RC_SUCCESS) { >> - if (immnd_is_immd_up(cb)) { >> - LOG_ER("Failure to broadcast discard >> admo0wner for ccbId:%u " >> - "but IMMD is up !? - case not >> handled. Client will " >> - "be orphanded", implId); >> - } else { >> - LOG_WA("Failure to broadcast discard >> admowner for id:%u " >> - "(immd down)- will retry later", >> idArr[ix]); >> + if (immnd_is_immd_up(cb)) { >> + LOG_ER("Failure to broadcast >> discard admo0wner for ccbId:%u " >> + "but IMMD is up !? - >> case not handled. 
Client will " >> + "be orphanded", implId); >> + } else { >> + LOG_WA("Failure to broadcast >> discard admowner for id:%u " >> + "(immd down)- will >> retry later", idArr[ix]); >> + } >> + cl_node->mIsStale = true; >> } >> - cl_node->mIsStale = true; >> } >> } >> free(idArr); >> @@ -251,7 +265,7 @@ void immnd_proc_imma_down(IMMND_CB *cb, >> prev_hdl = cl_node->imm_app_hdl; >> >> if ((memcmp(&dest, &cl_node->agent_mds_dest, sizeof(MDS_DEST)) >> == 0) && sv_id == cl_node->sv_id) { >> - if (immnd_proc_imma_discard_connection(cb, cl_node)) { >> + if (immnd_proc_imma_discard_connection(cb, cl_node, >> false)) { >> TRACE_5("Removing client id:%llx sv_id:%u", >> cl_node->imm_app_hdl, cl_node->sv_id); >> immnd_client_node_del(cb, cl_node); >> memset(cl_node, '\0', >> sizeof(IMMND_IMM_CLIENT_NODE)); >> @@ -300,7 +314,7 @@ void immnd_proc_imma_discard_stales(IMMN >> prev_hdl = cl_node->imm_app_hdl; >> if (cl_node->mIsStale) { >> cl_node->mIsStale = false; >> - if (immnd_proc_imma_discard_connection(cb, cl_node)) { >> + if (immnd_proc_imma_discard_connection(cb, cl_node, >> false)) { >> TRACE_5("Removing client id:%llx sv_id:%u", >> cl_node->imm_app_hdl, cl_node->sv_id); >> immnd_client_node_del(cb, cl_node); >> memset(cl_node, '\0', >> sizeof(IMMND_IMM_CLIENT_NODE)); >> @@ -422,6 +436,17 @@ uint32_t immnd_introduceMe(IMMND_CB *cb) >> send_evt.info.immd.info.ctrl_msg.pbeEnabled, >> send_evt.info.immd.info.ctrl_msg.dir.size); >> >> + if(cb->mIntroduced==2) { >> + LOG_NO("Re-introduce-me highestProcessed:%llu >> highestReceived:%llu", >> + cb->highestProcessed, cb->highestReceived); >> + send_evt.info.immd.info.ctrl_msg.refresh = 2; >> + send_evt.info.immd.info.ctrl_msg.fevs_count = >> cb->highestReceived; >> + >> + send_evt.info.immd.info.ctrl_msg.admo_id_count = >> cb->mLatestAdmoId;; >> + send_evt.info.immd.info.ctrl_msg.ccb_id_count = >> cb->mLatestCcbId; >> + send_evt.info.immd.info.ctrl_msg.impl_count = cb->mLatestImplId; >> + } >> + >> if (!immnd_is_immd_up(cb)) { >> 
return NCSCC_RC_FAILURE; >> } >> @@ -480,7 +505,7 @@ static int32_t immnd_iAmLoader(IMMND_CB >> TRACE_5("Loading is not possible, preLoader still attached"); >> return (-3); >> } >> - >> +LOG_IN("ABT CLOUD PROTO cb->mMyEpoch:%u != cb->mRulingEpoch:%u", >> cb->mMyEpoch, cb->mRulingEpoch); >> if (cb->mMyEpoch != cb->mRulingEpoch) { >> /*We are joining the cluster, need to sync this IMMND. */ >> return (-2); >> @@ -536,7 +561,7 @@ static uint32_t immnd_requestSync(IMMND_ >> uint32_t rc = NCSCC_RC_SUCCESS; >> IMMSV_EVT send_evt; >> memset(&send_evt, '\0', sizeof(IMMSV_EVT)); >> - >> +LOG_NO("ABT REQUESTING SYNC"); >> send_evt.type = IMMSV_EVT_TYPE_IMMD; >> send_evt.info.immd.type = IMMD_EVT_ND2D_REQ_SYNC; >> send_evt.info.immd.info.ctrl_msg.ndExecPid = cb->mMyPid; >> @@ -546,6 +571,7 @@ static uint32_t immnd_requestSync(IMMND_ >> if (immnd_is_immd_up(cb)) { >> rc = immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, >> cb->immd_mdest_id, &send_evt); >> } else { >> + LOG_IN("Could not request sync because IMMD is not UP"); >> rc = NCSCC_RC_FAILURE; >> } >> return (rc == NCSCC_RC_SUCCESS); >> @@ -1571,13 +1597,19 @@ static int immnd_forkPbe(IMMND_CB *cb) >> if (pid == 0) { /*child */ >> /* TODO: Should close file-descriptors ... */ >> /*char * const pbeArgs[5] = { (char *) execPath, "--recover", >> "--pbeXX", dbFilePath, 0 };*/ >> - char * pbeArgs[5]; >> + char * pbeArgs[6]; >> bool veteran = (cb->mIsCoord) ? 
(cb->mPbeVeteran) : (cb->m2Pbe >> && cb->mPbeVeteranB); >> pbeArgs[0] = (char *) execPath; >> - if(veteran) { >> + if(veteran && cb->mScAbsenceAllowed && !cb->mPbeUsesSharedFs) { >> + pbeArgs[1] = "--recover"; >> + pbeArgs[2] = "--check-objects"; >> + pbeArgs[3] = >> (cb->m2Pbe)?((cb->mIsCoord)?"--pbe2A":"--pbe2B"):"--pbe"; >> + pbeArgs[4] = dbFilePath; >> + pbeArgs[5] = 0; >> + } else if(veteran) { >> pbeArgs[1] = "--recover"; >> pbeArgs[2] = >> (cb->m2Pbe)?((cb->mIsCoord)?"--pbe2A":"--pbe2B"):"--pbe"; >> - pbeArgs[3] = dbFilePath; >> + pbeArgs[3] = dbFilePath; >> pbeArgs[4] = 0; >> } else { >> pbeArgs[1] = >> (cb->m2Pbe)?((cb->mIsCoord)?"--pbe2A":"--pbe2B"):"--pbe"; >> @@ -1685,7 +1717,7 @@ uint32_t immnd_proc_server(uint32_t *tim >> cb->mJobStart = now; >> } >> } else { /*We are not ready to start loading yet */ >> - if(cb->mIntroduced) { >> + if(cb->mIntroduced==1) { >> if((cb->m2Pbe == 2) && !(cb->preLoadPid)) { >> cb->preLoadPid = immnd_forkLoader(cb, >> true); >> } >> @@ -1833,6 +1865,7 @@ uint32_t immnd_proc_server(uint32_t *tim >> cb->mState = IMM_SERVER_READY; >> immnd_ackToNid(NCSCC_RC_SUCCESS); >> LOG_NO("SERVER STATE: IMM_SERVER_LOADING_SERVER >> --> IMM_SERVER_READY"); >> + immModel_setScAbsenceAllowed(cb); >> cb->mJobStart = now; >> if (cb->mPbeFile) {/* Pbe enabled */ >> cb->mRim = >> immModel_getRepositoryInitMode(cb); >> @@ -1876,6 +1909,7 @@ uint32_t immnd_proc_server(uint32_t *tim >> cb->mState = IMM_SERVER_READY; >> cb->mJobStart = now; >> LOG_NO("SERVER STATE: IMM_SERVER_LOADING_CLIENT --> >> IMM_SERVER_READY"); >> + immModel_setScAbsenceAllowed(cb); >> if (cb->mPbeFile) {/* Pbe configured */ >> cb->mRim = immModel_getRepositoryInitMode(cb); >> >> @@ -1896,7 +1930,9 @@ uint32_t immnd_proc_server(uint32_t *tim >> cb->mJobStart = now; >> cb->mState = IMM_SERVER_READY; >> immnd_ackToNid(NCSCC_RC_SUCCESS); >> - LOG_NO("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM >> SERVER READY"); >> + LOG_NO("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> >> 
IMM_SERVER_READY"); >> + immModel_setScAbsenceAllowed(cb); >> + >> /* >> This code case duplicated in immnd_evt.c >> Search for: "ticket:#599" >> @@ -1927,7 +1963,7 @@ uint32_t immnd_proc_server(uint32_t *tim >> cb->mStep = 0; >> cb->mJobStart = now; >> cb->mState = IMM_SERVER_READY; >> - LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER >> --> IMM SERVER READY"); >> + LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER >> --> IMM_SERVER_READY"); >> } >> if (!(cb->mStep % 60)) { >> LOG_IN("Sync Phase-1, waiting for existing " >> @@ -1944,7 +1980,7 @@ uint32_t immnd_proc_server(uint32_t *tim >> cb->mStep = 0; >> cb->mJobStart = now; >> cb->mState = IMM_SERVER_READY; >> - LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER >> --> IMM SERVER READY"); >> + LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER >> --> IMM_SERVER_READY"); >> } >> >> /* PBE may intentionally be restarted by sync. Catch >> this here. */ >> @@ -1977,7 +2013,7 @@ uint32_t immnd_proc_server(uint32_t *tim >> cb->mJobStart = now; >> cb->mState = IMM_SERVER_READY; >> immnd_abortSync(cb); >> - LOG_NO("SERVER STATE: >> IMM_SERVER_SYNC_SERVER --> IMM SERVER READY"); >> + LOG_NO("SERVER STATE: >> IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY"); >> } else { >> LOG_IN("Sync Phase-2: Ccbs are >> terminated, IMM in " >> "read-only mode, forked sync >> process pid:%u", cb->syncPid); >> @@ -1991,7 +2027,7 @@ uint32_t immnd_proc_server(uint32_t *tim >> cb->mStep = 0; >> cb->mJobStart = now; >> cb->mState = IMM_SERVER_READY; >> - LOG_NO("SERVER STATE: >> IMM_SERVER_SYNC_SERVER --> IMM SERVER READY"); >> + LOG_NO("SERVER STATE: >> IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY"); >> } else if (!(cb->mSyncFinalizing)) { >> int status = 0; >> if (waitpid(cb->syncPid, &status, >> WNOHANG) > 0) { >> @@ -2031,6 +2067,11 @@ uint32_t immnd_proc_server(uint32_t *tim >> } >> } >> >> + if(cb->mIntroduced == 2) { >> + immnd_introduceMe(cb); >> + break; >> + } >> + >> coord = immnd_iAmCoordinator(cb); >> >> if (cb->pbePid > 0) { >> @@ -2275,3 +2316,28 
@@ void immnd_dump_client_info(IMMND_IMM_CL
>> 	}
>>
>> #endif
>> +
>> +/* Only for scAbsenceAllowed (headless hydra) */
>> +void immnd_proc_discard_other_nodes(IMMND_CB *cb)
>> +{
>> +	TRACE_ENTER();
>> +	/* Discard all clients. */
>> +
>> +	IMMND_IMM_CLIENT_NODE *cl_node = NULL;
>> +	immnd_client_node_getnext(cb, 0, &cl_node);
>> +	while (cl_node) {
>> +		LOG_NO("Removing client id:%llx sv_id:%u",
>> cl_node->imm_app_hdl, cl_node->sv_id);
>> +		osafassert(immnd_proc_imma_discard_connection(cb, cl_node,
>> true));
>> +		LOG_NO("ABT discard_connection OK");
>> +		osafassert(immnd_client_node_del(cb, cl_node) ==
>> NCSCC_RC_SUCCESS);
>> +		free(cl_node);
>> +		cl_node = NULL;
>> +		LOG_NO("ABT Client node REMOVED");
>> +		immnd_client_node_getnext(cb, 0, &cl_node);
>> +	}
>> +
>> +	LOG_NO("ABT DONE REMOVING CLIENTS ENTERING immModel_isolateThisNode(cb)
>> ");
>> +	immModel_isolateThisNode(cb);
>> +	immModel_abortNonCriticalCcbs(cb);
>> +	TRACE_LEAVE();
>> +}
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Opensaf-devel mailing list
>> Opensaf-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
>>
>>
>> ------------------------------------------------------------------------------
>> Site24x7 APM Insight: Get Deep Visibility into Application Performance
>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
>> Monitor end-to-end web transactions and take corrective actions now
>> Troubleshoot faster and improve end-user experience. Signup Now!
>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
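
For reference when reading the IMMD DOWN branch that the quoted patch adds to immnd_evt_proc_mds_evt(), the decision it makes can be summarised as a small pure function. This is a sketch under assumptions, not the IMMND code: the enum names are hypothetical, and the ordering of the server states is inferred from the comparisons in the patch (`<= IMM_SERVER_LOADING_PENDING`, `< IMM_SERVER_READY`) rather than copied from the OpenSAF headers.

```c
#include <stdbool.h>

/* Hypothetical mirror of the IMMND server states. The ordering is an
 * assumption inferred from the comparisons used in the patch. */
typedef enum {
	SRV_ANONYMOUS,
	SRV_CLUSTER_WAITING,
	SRV_LOADING_PENDING,
	SRV_LOADING_SERVER,
	SRV_LOADING_CLIENT,
	SRV_SYNC_PENDING,
	SRV_SYNC_CLIENT,
	SRV_SYNC_SERVER,
	SRV_READY
} srv_state_t;

/* What the IMMND does when the IMMD service goes down (names are made up). */
typedef enum {
	ACT_EXIT,               /* process exits and is restarted */
	ACT_RESET_AND_REATTACH, /* state reset to anonymous, then re-register with MDS */
	ACT_REATTACH            /* keep state, isolate node, re-register with MDS */
} immd_down_action_t;

immd_down_action_t on_immd_down(unsigned sc_absence_allowed, bool is_coord,
				srv_state_t state)
{
	if (sc_absence_allowed == 0)
		return ACT_EXIT;               /* regular (non Hydra) behaviour */
	if (is_coord)
		return ACT_EXIT;               /* let the restarted IMMD pick a new coord */
	if (state <= SRV_LOADING_PENDING)
		return ACT_RESET_AND_REATTACH; /* node had not joined yet */
	if (state < SRV_READY)
		return ACT_EXIT;               /* was being loaded or synced */
	return ACT_REATTACH;               /* ready veteran survives headless */
}
```

Read this way, only a non-coordinator IMMND that was already in the ready state keeps its contents across the SC absence. Every other case either exits or starts over from the anonymous state.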