Hi Zoran/Neel,

I have the cloud resilience feature enabled, with OpenSAF RPMs built with PBE support ( --enable-imm-pbe ), but the PBE configuration itself is not enabled ( `#export IMMSV_PBE_FILE=imm.db` is still commented out in immnd.conf ).

With the 4-node cluster in a stable state and a CPSV application running, I stopped both SCs and, as expected, received Return Value : SA_AIS_ERR_TRY_AGAIN for Finalize ckptHandle. At that point I started both SCs again, but neither SC rejoined the cluster. They failed with the error:

` WA PBE is configured at first attached SC-immnd, but no Pbe file is configured for immnd at node 2010f - rejecting node`
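For reference, one way to confirm how IMMSV_PBE_FILE is actually set on each SC is sketched below. This is only a sketch under assumptions: the path /etc/opensaf/immnd.conf is an assumed default location, and `pbe_file_setting` and `IMMND_CONF` are hypothetical names used for illustration, not part of OpenSAF.

```shell
#!/bin/sh
# Hypothetical helper (not an OpenSAF tool): print the active, uncommented
# IMMSV_PBE_FILE value from an immnd.conf-style file, or "<unset>" when the
# export line is commented out, absent, or the file cannot be read.
pbe_file_setting() {
    conf="$1"
    value=$(sed -n 's/^[[:space:]]*export[[:space:]][[:space:]]*IMMSV_PBE_FILE=//p' "$conf" 2>/dev/null | tail -n 1)
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf '<unset>\n'
    fi
}

# Assumed default location; override IMMND_CONF if the file lives elsewhere.
CONF="${IMMND_CONF:-/etc/opensaf/immnd.conf}"
echo "$(uname -n): IMMSV_PBE_FILE=$(pbe_file_setting "$CONF")"
```

Running this on SC-1 and on SC-2 and comparing the two results shows whether both controllers agree on the setting. That may be worth checking here, since the SC-2 log reports "Persistent Back-End capability configured, Pbe file:imm.db".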
Is there anything missing in the configuration?

===============================================================================================================
Feb 19 09:20:43 SC-1 opensafd: Starting OpenSAF Services(5.0.M0 - ) (Using TIPC)
Starting OpenSAF Services (Using TIPC):Feb 19 09:20:43 SC-1 osafrded[23407]: Started
Feb 19 09:20:43 SC-1 osafrded[23407]: NO Peer rde@2020f has no state, my nodeid is less => Setting Active role
Feb 19 09:20:43 SC-1 osaffmd[23416]: Started
Feb 19 09:20:43 SC-1 osafimmd[23426]: Started
Feb 19 09:20:43 SC-1 osafimmd[23426]: NO ******* SC_ABSENCE_ALLOWED (Headless Hydra) is configured: 900 ***********
Feb 19 09:20:43 SC-1 osafimmd[23426]: NO Waiting 3 seconds to allow IMMND MDS attachments to get processed.
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND process at node 2030f old epoch: 0 new epoch:4
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Ruling epoch changed to:4
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of fevs count from 0 to 2628 from 2030f.
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of admoId count from 0 to 4 from 2030f.
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of ccbId count from 0 to 2 from 2030f.
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of impl count from 0 to 14 from 2030f.
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Attached Nodes:2 Accepted nodes:0 KnownVeteran:1 doReply:1
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO First Veteran IMMND found (payload) at 2030f this IMMD at 2010f. Apparent IMMD lapse, *not* 2PBE => designating that IMMND as coordinator
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND process at node 2040f old epoch: 0 new epoch:4
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted nodes:1 KnownVeteran:1 doReply:1
Feb 19 09:20:46 SC-1 osafimmnd[23437]: Started
Feb 19 09:20:46 SC-1 osafimmnd[23437]: NO IMMD service is UP ... ScAbsenseAllowed?:0 introduced?:0
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO New IMMND process is on STANDBY Controller at 2020f
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Extended intro from node 2020f
Feb 19 09:20:46 SC-1 osafimmd[23426]: WA PBE not configured at first attached SC-immnd, but Pbe is configured for immnd at 2020f - possible upgrade from pre 4.4
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Attached Nodes:4 Accepted nodes:2 KnownVeteran:0 doReply:1
Feb 19 09:20:46 SC-1 osafimmnd[23437]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO New IMMND process is on ACTIVE Controller at 2010f
Feb 19 09:20:46 SC-1 osafimmd[23426]: WA PBE is configured at first attached SC-immnd, but no Pbe file is configured for immnd at node 2010f - rejecting node
Feb 19 09:20:46 SC-1 osafimmnd[23437]: NO SETTING COORD TO 0 CLOUD PROTO
Feb 19 09:20:46 SC-1 osafimmnd[23437]: ER IMMND forced to restart on order from IMMD, exiting
Feb 19 09:20:46 SC-1 opensafd[23368]: ER Failed DESC:IMMND
Feb 19 09:20:46 SC-1 opensafd[23368]: ER Going for recovery
Feb 19 09:20:46 SC-1 opensafd[23368]: ER Trying To RESPAWN /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #1
Feb 19 09:20:46 SC-1 opensafd[23368]: ER Sending SIGKILL to IMMND, pid=23431
Feb 19 09:20:46 SC-1 osafimmd[23426]: WA Error returned from processing message err:2 msg-type:2
Feb 19 09:20:46 SC-1 osafimmd[23426]: WA IMMND on controller (not currently coord) requests sync
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Sc Absence Allowed is configured (900) => IMMND coord at payload node:2030f dest566313894805508
Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Node 2020f request sync sync-pid:8766 epoch:0
Feb 19 09:20:48 SC-1 osafimmd[23426]: NO Successfully announced sync. New ruling epoch:5
Feb 19 09:20:49 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND process at node 2030f old epoch: 4 new epoch:5
Feb 19 09:20:49 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted nodes:3 KnownVeteran:0 doReply:0
Feb 19 09:20:49 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND process at node 2040f old epoch: 4 new epoch:5
Feb 19 09:20:49 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted nodes:3 KnownVeteran:0 doReply:0
Feb 19 09:20:49 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND process at node 2020f old epoch: 0 new epoch:5
Feb 19 09:20:49 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted nodes:3 KnownVeteran:0 doReply:0
Feb 19 09:21:01 SC-1 osafimmnd[23462]: Started
Feb 19 09:21:01 SC-1 osafimmnd[23462]: NO IMMD service is UP ... ScAbsenseAllowed?:0 introduced?:0
Feb 19 09:21:02 SC-1 osafimmd[23426]: NO New IMMND process is on ACTIVE Controller at 2010f
Feb 19 09:21:02 SC-1 osafimmd[23426]: WA PBE is configured at first attached SC-immnd, but no Pbe file is configured for immnd at node 2010f - rejecting node
Feb 19 09:21:02 SC-1 osafimmd[23426]: WA Error returned from processing message err:2 msg-type:2
Feb 19 09:21:02 SC-1 osafimmnd[23462]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Feb 19 09:21:02 SC-1 osafimmnd[23462]: NO SETTING COORD TO 0 CLOUD PROTO
Feb 19 09:21:02 SC-1 osafimmnd[23462]: ER IMMND forced to restart on order from IMMD, exiting
Feb 19 09:21:02 SC-1 opensafd[23368]: ER Could Not RESPAWN IMMND
Feb 19 09:21:02 SC-1 opensafd[23368]: ER Failed DESC:IMMND
Feb 19 09:21:02 SC-1 opensafd[23368]: ER Trying To RESPAWN /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #2
Feb 19 09:21:02 SC-1 opensafd[23368]: ER Sending SIGKILL to IMMND, pid=23456
Feb 19 09:21:17 SC-1 osafimmnd[23487]: Started
Feb 19 09:21:17 SC-1 osafimmnd[23487]: NO IMMD service is UP ... ScAbsenseAllowed?:0 introduced?:0
Feb 19 09:21:17 SC-1 osafimmd[23426]: NO New IMMND process is on ACTIVE Controller at 2010f
Feb 19 09:21:17 SC-1 osafimmd[23426]: WA PBE is configured at first attached SC-immnd, but no Pbe file is configured for immnd at node 2010f - rejecting node
Feb 19 09:21:17 SC-1 osafimmd[23426]: WA Error returned from processing message err:2 msg-type:2
Feb 19 09:21:17 SC-1 osafimmnd[23487]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Feb 19 09:21:17 SC-1 osafimmnd[23487]: NO SETTING COORD TO 0 CLOUD PROTO
Feb 19 09:21:17 SC-1 osafimmnd[23487]: ER IMMND forced to restart on order from IMMD, exiting
Feb 19 09:21:17 SC-1 opensafd[23368]: ER Could Not RESPAWN IMMND
Feb 19 09:21:17 SC-1 opensafd[23368]: ER Failed DESC:IMMND
Feb 19 09:21:17 SC-1 opensafd[23368]: ER FAILED TO RESPAWN
Feb 19 09:21:17 SC-1 osaffmd[23416]: exiting for shutdown
Feb 19 09:21:17 SC-1 osafimmd[23426]: exiting for shutdown
Feb 19 09:21:17 SC-1 osafrded[23407]: exiting for shutdown
Feb 19 09:21:17 SC-1 opensafd: warning: TIPC module unloading failed failed
Feb 19 09:21:17 SC-1 opensafd: Starting OpenSAF failed

Feb 19 09:20:44 SC-2 opensafd: OpenSAF services successfully stopped
Feb 19 09:20:44 SC-2 opensafd: Starting OpenSAF Services(5.0.M0 - ) (Using TIPC)
Feb 19 09:20:44 SC-2 kernel: [161176.504234] tipc: Activated (version 2.0.0)
Feb 19 09:20:44 SC-2 kernel: [161176.504623] NET: Registered protocol family 30
Feb 19 09:20:44 SC-2 kernel: [161176.504626] tipc: Started in single node mode
Feb 19 09:20:44 SC-2 kernel: [161176.512492] tipc: Started in network mode
Feb 19 09:20:44 SC-2 kernel: [161176.512497] tipc: Own node address <1.1.2>, network identity 7777
Feb 19 09:20:44 SC-2 kernel: [161176.514875] tipc: Enabled bearer <eth:eth3>, discovery domain <1.1.0>, priority 10
Feb 19 09:20:44 SC-2 kernel: [161176.515726] tipc: Enabled bearer <eth:eth2>, discovery domain <1.1.0>, priority 10
Feb 19 09:20:44 SC-2 kernel: [161176.516587] tipc: Established link <1.1.2:eth2-1.1.1:eth1> on network plane B
Feb 19 09:20:44 SC-2 kernel: [161176.516643] tipc: Established link <1.1.2:eth3-1.1.3:eth4> on network plane A
Feb 19 09:20:44 SC-2 kernel: [161176.517021] tipc: Established link <1.1.2:eth3-1.1.4:eth0> on network plane A
Feb 19 09:20:44 SC-2 kernel: [161176.518091] tipc: Established link <1.1.2:eth3-1.1.1:eth0> on network plane A
Feb 19 09:20:44 SC-2 osafrded[8736]: Started
Feb 19 09:20:44 SC-2 kernel: [161176.645456] tipc: Established link <1.1.2:eth2-1.1.4:eth2> on network plane B
Feb 19 09:20:44 SC-2 kernel: [161176.645566] tipc: Established link <1.1.2:eth2-1.1.3:eth1> on network plane B
Feb 19 09:20:46 SC-2 osafrded[8736]: NO Peer rde@2010f has no state, my nodeid is greater => Setting Standby role
Feb 19 09:20:46 SC-2 osaffmd[8745]: Started
Feb 19 09:20:46 SC-2 osafimmd[8755]: Started
Feb 19 09:20:46 SC-2 osafimmd[8755]: NO ******* SC_ABSENCE_ALLOWED (Headless Hydra) is configured: 900 ***********
Feb 19 09:20:46 SC-2 osafimmd[8755]: NO Waiting 3 seconds to allow IMMND MDS attachments to get processed.
Feb 19 09:20:49 SC-2 osafimmnd[8766]: Started
Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO Persistent Back-End capability configured, Pbe file:imm.db (suffix may get added)
Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO IMMD service is UP ... ScAbsenseAllowed?:0 introduced?:0
Feb 19 09:20:49 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process at node 2040f old epoch: 0 new epoch:4
Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SETTING COORD TO 0 CLOUD PROTO
Feb 19 09:20:49 SC-2 osafimmd[8755]: WA IMMND DOWN on active controller f1 detected at standby immd!! f2. Possible failover
Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO Fevs count adjusted to 2628 preLoadPid: 0
Feb 19 09:20:49 SC-2 osafimmd[8755]: WA Message count:2629 + 1 != 2629
Feb 19 09:20:49 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2629
Feb 19 09:20:49 SC-2 osafimmnd[8766]: WA Error code 2 returned for message type 82 - ignoring
Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO ABT REQUESTING SYNC
Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO NODE STATE-> IMM_NODE_ISOLATED
Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: Ruling epoch noted as:5
Feb 19 09:20:51 SC-2 osafimmd[8755]: NO IMMND coord at 2030f
Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO SERVER STATE: IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE 2715
Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO RepositoryInitModeT is SA_IMM_INIT_FROM_FILE
Feb 19 09:20:51 SC-2 osafimmnd[8766]: WA IMM Access Control mode is DISABLED!
Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO Epoch set to 5 in ImmModel
Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process at node 2030f old epoch: 4 new epoch:5
Feb 19 09:20:51 SC-2 osafimmd[8755]: NO IMMND coord at 2030f
Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process at node 2040f old epoch: 4 new epoch:5
Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO Implementer connected: 15 (MsgQueueService131855) <0, 2030f>
Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process at node 2020f old epoch: 0 new epoch:5
Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO Implementer connected: 16 (MsgQueueService132111) <0, 2040f>
Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM_SERVER_READY
Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO ABT ImmModel received scAbsenceAllowed 900
Feb 19 09:20:51 SC-2 osaflogd[8776]: Started
Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LOGSV_DATA_GROUPNAME not found
Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LOG root directory is: "/var/log/opensaf/saflog"
Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LOG data group is: ""
Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LGS_MBCSV_VERSION = 5
Feb 19 09:20:51 SC-2 osafntfd[8787]: Started
Feb 19 09:21:01 SC-2 osafntfd[8787]: WA saLogInitialize returns try again, retries...
Feb 19 09:21:04 SC-2 osafimmd[8755]: WA IMMND DOWN on active controller f1 detected at standby immd!! f2. Possible failover
Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2806
Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA Error code 2 returned for message type 82 - ignoring
Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2807
Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA Error code 2 returned for message type 82 - ignoring
Feb 19 09:21:04 SC-2 osafimmnd[8766]: NO Global discard node received for nodeId:2010f pid:0
Feb 19 09:21:04 SC-2 osafimmd[8755]: WA Message count:2808 + 1 != 2808
Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2808
Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA Error code 2 returned for message type 82 - ignoring
Feb 19 09:21:20 SC-2 osafimmd[8755]: WA IMMND DOWN on active controller f1 detected at standby immd!! f2. Possible failover
Feb 19 09:21:20 SC-2 osafimmd[8755]: NO Skipping re-send of fevs message 2807 since it has recently been resent.
Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2808
Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA Error code 2 returned for message type 82 - ignoring
Feb 19 09:21:20 SC-2 osafimmnd[8766]: NO Global discard node received for nodeId:2010f pid:0
Feb 19 09:21:20 SC-2 osafimmd[8755]: WA Message count:2809 + 1 != 2809
Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2809
Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA Error code 2 returned for message type 82 - ignoring
Feb 19 09:21:20 SC-2 osafimmd[8755]: WA IMMD lost contact with peer IMMD (NCSMDS_RED_DOWN)
Feb 19 09:21:20 SC-2 osafimmd[8755]: NO Skipping re-send of fevs message 2808 since it has recently been resent.
Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2809
Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA Error code 2 returned for message type 82 - ignoring
Feb 19 09:21:31 SC-2 opensafd[8704]: ER Timed-out for response from NTFD
Feb 19 09:21:31 SC-2 opensafd[8704]: ER
Feb 19 09:21:31 SC-2 opensafd[8704]: ER Going for recovery
Feb 19 09:21:31 SC-2 opensafd[8704]: ER Trying To RESPAWN /usr/lib64/opensaf/clc-cli/osaf-ntfd attempt #1
Feb 19 09:21:31 SC-2 opensafd[8704]: ER Sending SIGABRT to NTFD, pid=8787, (origin parent pid=8782)
Feb 19 09:21:47 SC-2 osafntfd[8821]: Started
Feb 19 09:21:57 SC-2 osafntfd[8821]: WA saLogInitialize returns try again, retries...
Feb 19 09:22:27 SC-2 opensafd[8704]: ER Timed-out for response from NTFD
Feb 19 09:22:27 SC-2 opensafd[8704]: ER Could Not RESPAWN NTFD
Feb 19 09:22:27 SC-2 opensafd[8704]: ER
Feb 19 09:22:27 SC-2 opensafd[8704]: ER Trying To RESPAWN /usr/lib64/opensaf/clc-cli/osaf-ntfd attempt #2
Feb 19 09:22:27 SC-2 opensafd[8704]: ER Sending SIGABRT to NTFD, pid=8821, (origin parent pid=8816)
Feb 19 09:22:42 SC-2 osafntfd[8851]: Started
Feb 19 09:22:52 SC-2 osafntfd[8851]: WA saLogInitialize returns try again, retries...
Feb 19 09:23:22 SC-2 opensafd[8704]: ER Timed-out for response from NTFD
Feb 19 09:23:22 SC-2 opensafd[8704]: ER Could Not RESPAWN NTFD
Feb 19 09:23:22 SC-2 opensafd[8704]: ER
Feb 19 09:23:22 SC-2 opensafd[8704]: ER FAILED TO RESPAWN
Feb 19 09:23:22 SC-2 osaffmd[8745]: exiting for shutdown
Feb 19 09:23:22 SC-2 osafimmd[8755]: exiting for shutdown
Feb 19 09:23:22 SC-2 osafimmnd[8766]: WA SC Absence IS allowed:900 IMMD service is DOWN
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO IMMD SERVICE IS DOWN, HYDRA IS CONFIGURED => UNREGISTERING IMMND form MDS
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Removing client id:10002020f sv_id:27
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO ABT discard_connection OK
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO ABT Client node REMOVED
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO ABT DONE REMOVING CLIENTS ENTERING immModel_isolateThisNode(cb)
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Implementer disconnected 16 <0, 2040f(down)> (MsgQueueService132111)
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Impl Discarded node 2040f
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Implementer disconnected 15 <0, 2030f(down)> (MsgQueueService131855)
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Impl Discarded node 2030f
Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO MDS unregisterede. sleeping ...
Feb 19 09:23:23 SC-2 osafimmnd[8766]: NO Sleep done registering IMMND with MDS
Feb 19 09:23:23 SC-2 osafimmnd[8766]: NO MDS: mds_register_callback: dest 2020f2a460010 already exist
Feb 19 09:23:23 SC-2 osafimmnd[8766]: NO SUCCESS IN REGISTERING IMMND WITH MDS
Feb 19 09:23:23 SC-2 osafimmnd[8766]: exiting for shutdown
Feb 19 09:23:23 SC-2 osaflogd[8776]: exiting for shutdown
Feb 19 09:23:23 SC-2 osafrded[8736]: exiting for shutdown
Feb 19 09:23:23 SC-2 kernel: [161335.919458] tipc: Disabling bearer <eth:eth3>
Feb 19 09:23:23 SC-2 kernel: [161335.919469] tipc: Lost link <1.1.2:eth3-1.1.3:eth4> on network plane A
Feb 19 09:23:23 SC-2 kernel: [161335.919616] tipc: Lost link <1.1.2:eth3-1.1.4:eth0> on network plane A
Feb 19 09:23:23 SC-2 kernel: [161335.919672] tipc: Lost link <1.1.2:eth3-1.1.1:eth0> on network plane A
Feb 19 09:23:23 SC-2 kernel: [161335.919691] tipc: Disabling bearer <eth:eth2>
Feb 19 09:23:23 SC-2 kernel: [161335.919694] tipc: Lost link <1.1.2:eth2-1.1.1:eth1> on network plane B
Feb 19 09:23:23 SC-2 kernel: [161335.919697] tipc: Lost contact with <1.1.1>
Feb 19 09:23:23 SC-2 kernel: [161335.919702] tipc: Lost link <1.1.2:eth2-1.1.4:eth2> on network plane B
Feb 19 09:23:23 SC-2 kernel: [161335.919704] tipc: Lost contact with <1.1.4>
Feb 19 09:23:23 SC-2 kernel: [161335.919708] tipc: Lost link <1.1.2:eth2-1.1.3:eth1> on network plane B
Feb 19 09:23:23 SC-2 kernel: [161335.919710] tipc: Lost contact with <1.1.3>
Feb 19 09:23:23 SC-2 kernel: [161335.919752] tipc: Left network mode
Feb 19 09:23:23 SC-2 kernel: [161335.919914] NET: Unregistered protocol family 30
Feb 19 09:23:23 SC-2 kernel: [161335.919920] tipc: Deactivated
Feb 19 09:23:23 SC-2 opensafd: Starting OpenSAF failed
===============================================================================================================

-AVM

On 2/4/2016 3:11 PM, Hung Nguyen wrote:
> Hi Zoran,
>
> Please find my comment inline.
>
> BR,
>
> Hung Nguyen - DEK Technologies
>
>
> --------------------------------------------------------------------------------
> From: Zoran Milinkovic zoran.milinko...@ericsson.com
> Sent: Tuesday, December 22, 2015 9:14PM
> To: Neelakanta Reddy
> reddy.neelaka...@oracle.com
> Cc: Opensaf-devel
> opensaf-devel@lists.sourceforge.net
> Subject: [devel] [PATCH 4 of 5] imm: add IMMND support for cloud resilience
> feature [#1625]
>
>
> osaf/services/saf/immsv/immnd/ImmModel.cc | 115 ++++++++++++++++++++
> osaf/services/saf/immsv/immnd/ImmModel.hh | 9 +-
> osaf/services/saf/immsv/immnd/immnd_cb.h | 11 +-
> osaf/services/saf/immsv/immnd/immnd_evt.c | 166
> ++++++++++++++++++++++++----
> osaf/services/saf/immsv/immnd/immnd_init.h | 13 ++-
> osaf/services/saf/immsv/immnd/immnd_main.c | 7 +
> osaf/services/saf/immsv/immnd/immnd_proc.c | 120 ++++++++++++++++----
> 7 files changed, 381 insertions(+), 60 deletions(-)
>
>
> The patch contains IMMND code that is needed for supporting cloud resilience
> feature.
>
> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc
> b/osaf/services/saf/immsv/immnd/ImmModel.cc
> --- a/osaf/services/saf/immsv/immnd/ImmModel.cc
> +++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
> @@ -446,6 +446,7 @@ static const std::string immPbeBSlaveNam
> static const std::string immLongDnsAllowed(OPENSAF_IMM_LONG_DNS_ALLOWED);
> static const std::string
> immAccessControlMode(OPENSAF_IMM_ACCESS_CONTROL_MODE);
> static const std::string immAuthorizedGroup(OPENSAF_IMM_AUTHORIZED_GROUP);
> +static const std::string immScAbsenceAllowed(OPENSAF_IMM_SC_ABSENCE_ALLOWED);
>
> static const std::string immMngtClass("SaImmMngt");
> static const std::string
> immManagementDn("safRdn=immManagement,safApp=safImmService");
> @@ -492,6 +493,17 @@ struct CcbIdIs
> };
>
>
> +void
> +immModel_setScAbsenceAllowed(IMMND_CB *cb)
> +{
> + if(cb->mCanBeCoord == 4) {
> + osafassert(cb->mScAbsenceAllowed > 0);
> + } else {
> + osafassert(cb->mScAbsenceAllowed == 0);
> + }
> +
> ImmModel::instance(&cb->immModel)->setScAbsenceAllowed(cb->mScAbsenceAllowed);
> +}
> +
> SaAisErrorT
> immModel_ccbResult(IMMND_CB *cb, SaUint32T ccbId)
> {
> @@ -511,6 +523,32 @@ immModel_abortSync(IMMND_CB *cb)
> }
>
> void
> +immModel_isolateThisNode(IMMND_CB *cb)
> +{
> + ImmModel::instance(&cb->immModel)->isolateThisNode(cb->node_id,
> cb->mIsCoord);
> +}
> +
> +void
> +immModel_abortNonCriticalCcbs(IMMND_CB *cb)
> +{
> + SaUint32T arrSize;
> + SaUint32T* implConnArr = NULL;
> + SaUint32T client;
> + SaClmNodeIdT pbeNodeId;
> + SaUint32T nodeId;
> + CcbVector::iterator i3 = sCcbVector.begin();
> + for(; i3!=sCcbVector.end(); ++i3) {
> + if((*i3)->mState < IMM_CCB_CRITICAL) {
> + osafassert(immModel_ccbAbort(cb, (*i3)->mId, &arrSize,
> &implConnArr, &client, &nodeId, &pbeNodeId));
> + osafassert(immModel_ccbFinalize(cb, (*i3)->mId) == SA_AIS_OK);
> + if (arrSize) {
> + free(implConnArr);
> + }
> + }
> + }
> +}
> +
> +void
> immModel_pbePrtoPurgeMutations(IMMND_CB *cb, SaUint32T nodeId, SaUint32T
> *reqArrSize,
> SaUint32T **reqConnArr)
> {
> @@ -17171,6 +17209,27 @@ ImmModel::getParentDn(std::string& paren
> TRACE_LEAVE();
> }
>
> +void
> +ImmModel::setScAbsenceAllowed(SaUint16T scAbsenceAllowed)
> +{
> + ObjectMap::iterator oi = sObjectMap.find(immObjectDn);
> + osafassert(oi != sObjectMap.end());
> + ObjectInfo* immObject = oi->second;
> + ImmAttrValueMap::iterator avi =
> + immObject->mAttrValueMap.find(immScAbsenceAllowed);
> + if(avi == immObject->mAttrValueMap.end()) {
> + LOG_WA("Attribue '%s' does not exist in object '%s'",
> + immScAbsenceAllowed.c_str(), immObjectDn.c_str());
> + return;
> + }
> +
> + osafassert(!(avi->second->isMultiValued()));
> + ImmAttrValue* valuep = (ImmAttrValue *) avi->second;
> + valuep->setValue_int(scAbsenceAllowed);
> +
> + LOG_NO("ABT ImmModel received scAbsenceAllowed %u", scAbsenceAllowed);
> +}
> +
> SaAisErrorT
> ImmModel::finalizeSync(ImmsvOmFinalizeSync* req, bool isCoord,
> bool isSyncClient)
> @@ -18067,3 +18126,59 @@ ImmModel::finalizeSync(ImmsvOmFinalizeSy
> return err;
> }
>
> +void
> +ImmModel::isolateThisNode(unsigned int thisNode, bool isAtCoord)
> +{
> + /* Move this logic up to immModel_isolate... No need for this extra
> level.
> + But need to abort and terminate ccbs.
> + */
> + ImplementerVector::iterator i;
> + AdminOwnerVector::iterator i2;
> + CcbVector::iterator i3;
> + unsigned int otherNode;
> +
> + if((sImmNodeState != IMM_NODE_FULLY_AVAILABLE) && (sImmNodeState !=
> IMM_NODE_R_AVAILABLE)) {
> + LOG_NO("SC abscence interrupted sync of this IMMND - exiting");
> + exit(0);
> + }
> +
> + i = sImplementerVector.begin();
> + while(i != sImplementerVector.end()) {
> + IdVector cv, gv;
> + ImplementerInfo* info = (*i);
> + otherNode = info->mNodeId;
> + if(otherNode == thisNode || otherNode == 0) {
> + i++;
> + } else {
> + info = NULL;
> + this->discardNode(otherNode, cv, gv, isAtCoord);
> + LOG_NO("Impl Discarded node %x", otherNode);
> + /* Discard ccbs. */
> +
> + i = sImplementerVector.begin(); /* restart iteration. */
> + }
> + }
> +
> + i2 = sOwnerVector.begin();
> + while(i2 != sOwnerVector.end()) {
> + IdVector cv, gv;
> + AdminOwnerInfo* ainfo = (*i2);
> + otherNode = ainfo->mNodeId;
> + if(otherNode == thisNode || otherNode == 0) {
> + /* ??? (otherNode == 0) is that really correct ??? */
> + i2++;
> + } else {
> + ainfo = NULL;
> + this->discardNode(otherNode, cv, gv, isAtCoord);
> + LOG_NO("Admo Discarded node %x", otherNode);
> + /* Discard ccbs */
> +
> + i2 = sOwnerVector.begin(); /* restart iteration. */
> + }
> + }
> +
> + /* Verify that all noncritical CCBs are aborted.
> + Ccbs where client resided at this node chould already have been
> handled in
> + immnd_proc_discard_other_nodes() that calls
> immnd_proc_imma_discard_connection()
> + */
> +}
> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.hh
> b/osaf/services/saf/immsv/immnd/ImmModel.hh
> --- a/osaf/services/saf/immsv/immnd/ImmModel.hh
> +++ b/osaf/services/saf/immsv/immnd/ImmModel.hh
> @@ -145,12 +145,6 @@ public:
> const immsv_octet_string*
> clName,
> ImmsvOmClassDescr* res);
>
> - SaAisErrorT classSerialize(
> - const char* className,
> - char** data,
> - size_t* size);
> -
> -
> SaAisErrorT attrCreate(
> ClassInfo* classInfo,
> const ImmsvAttrDefinition* attr,
> @@ -480,6 +474,8 @@ public:
> const struct
> ImmsvAdminOperationParam *reqparams,
> struct ImmsvAdminOperationParam
> **rparams,
> SaUint64T searchcount);
> +
> + void setScAbsenceAllowed(SaUint16T scAbsenceAllowed);
>
> SaAisErrorT objectSync(const ImmsvOmObjectSync* req);
> bool fetchRtUpdate(ImmsvOmObjectSync* syncReq,
> @@ -517,6 +513,7 @@ public:
> void recognizedIsolated();
> bool syncComplete(bool isJoining);
> void abortSync();
> + void isolateThisNode(unsigned int thisNode, bool isAtCoord);
> void pbePrtoPurgeMutations(unsigned int nodeId,
> ConnVector& connVector);
> SaAisErrorT ccbResult(SaUint32T ccbId);
> ImmsvAttrNameList * ccbGrabErrStrings(SaUint32T ccbId);
> diff --git a/osaf/services/saf/immsv/immnd/immnd_cb.h
> b/osaf/services/saf/immsv/immnd/immnd_cb.h
> --- a/osaf/services/saf/immsv/immnd/immnd_cb.h
> +++ b/osaf/services/saf/immsv/immnd/immnd_cb.h
> @@ -113,13 +113,17 @@ typedef struct immnd_cb_tag {
> SaUint32T mMyEpoch; //Epoch counter, used in synch of immnds
> SaUint32T mMyPid; //Is this needed ??
> SaUint32T mRulingEpoch;
> - uint8_t mAccepted; //Should all fevs messages be processed?
> + SaUint32T mLatestAdmoId;
> + SaUint32T mLatestImplId;
> + SaUint32T mLatestCcbId;
> +
> + uint8_t mAccepted; //If=!0 Fevs messages can be processed. 2=>IMMD
> re-introduce.
> uint8_t mIntroduced; //Ack received on introduce message
> uint8_t mSyncRequested; //true=> I am coord, other req sync
> uint8_t mPendSync; //1=>sync announced but not received.
> uint8_t mSyncFinalizing; //1=>finalizeSync sent but not received.
> uint8_t mSync; //true => this node is being synced (client).
> - uint8_t mCanBeCoord; //If!=0 then SC, if 2 the 2pbe arbitration.
> + uint8_t mCanBeCoord; //If!=0 then SC, 2 => 2pbe arbitration, 4 =>
> absentScAllowed.
> uint8_t mIsCoord;
> uint8_t mLostNodes; //Detached & not syncreq => delay sync start
> uint8_t mBlockPbeEnable; //Current PBE has not completed shutdown yet.
> @@ -128,6 +132,8 @@ typedef struct immnd_cb_tag {
> bool mIsOtherScUp; //If set & this is an SC then other SC is up(2pbe).
> //False=> *allow* 1safe 2pbe. May err conservatively (true)
> bool mForceClean; //true => Force cleanTheHouse to run once *now*.
> + SaUint16T mScAbsenceAllowed; /* Non zero if "headless Hydra" allowed
> (loss of both IMMDs/SCs).
> + Value is number of seconds of SC absence
> tolerated. */
>
> /* Information about the IMMD */
> MDS_DEST immd_mdest_id;
> @@ -161,6 +167,7 @@ typedef struct immnd_cb_tag {
> uint8_t mPbeVeteran; //false => regenerate. true => re-attach
> db-file
> uint8_t mPbeVeteranB; //false => regenerate. true => re-attach
> db-file
> uint8_t mPbeOldVeteranB; //false => restarted, true => stable. (only
> to reduce logging).
> + uint8_t mPbeUsesSharedFs; //false => not use SFS, true => use SFS
>
> SaAmfHAStateT ha_state; // present AMF HA state of the component
> EDU_HDL immnd_edu_hdl; // edu handle, obscurely needed by mds.
> diff --git a/osaf/services/saf/immsv/immnd/immnd_evt.c
> b/osaf/services/saf/immsv/immnd/immnd_evt.c
> --- a/osaf/services/saf/immsv/immnd/immnd_evt.c
> +++ b/osaf/services/saf/immsv/immnd/immnd_evt.c
> @@ -75,9 +75,9 @@ static void immnd_evt_proc_admo_finalize
> IMMND_EVT *evt,
> SaBoolT originatedAtThisNd,
> SaImmHandleT clnt_hdl, MDS_DEST reply_dest);
>
> -static void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb,
> - IMMND_EVT *evt,
> - SaBoolT originatedAtThisNd,
> SaImmHandleT clnt_hdl, MDS_DEST reply_dest);
> +//static void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb,
> +// IMMND_EVT *evt,
> +// SaBoolT originatedAtThisNd,
> SaImmHandleT clnt_hdl, MDS_DEST reply_dest);
>
> static void immnd_evt_proc_admo_set(IMMND_CB *cb,
> IMMND_EVT *evt,
> @@ -1515,7 +1515,7 @@ static uint32_t immnd_evt_proc_search_ne
> on a previous syncronous call. Discard the
> connection and return
> BAD_HANDLE to allow client to recover and make
> progress.
> */
> - immnd_proc_imma_discard_connection(cb, cl_node);
> + immnd_proc_imma_discard_connection(cb, cl_node, false);
> rc = immnd_client_node_del(cb, cl_node);
> osafassert(rc == NCSCC_RC_SUCCESS);
> free(cl_node);
> @@ -1973,7 +1973,7 @@ static uint32_t immnd_evt_proc_imm_final
> goto agent_rsp;
> }
>
> - immnd_proc_imma_discard_connection(cb, cl_node);
> + immnd_proc_imma_discard_connection(cb, cl_node, false);
>
> rc = immnd_client_node_del(cb, cl_node);
> if (rc == NCSCC_RC_FAILURE) {
> @@ -2197,9 +2197,11 @@ static uint32_t immnd_evt_proc_imm_clien
> cl_node->mIsResurrect = 0x1;
>
> if (immnd_client_node_add(cb, cl_node) != NCSCC_RC_SUCCESS) {
> +#if 0 //CLOUD-PROTO ABT clients should be discarded !!!!
> LOG_ER("IMMND - Adding temporary imma client Failed.");
> /*free(cl_node);*/
> abort();
> +#endif
> }
>
> TRACE_2("Added client with id: %llx <node:%x, count:%u>",
> @@ -2314,7 +2316,7 @@ static uint32_t immnd_evt_proc_admowner_
> on a previous syncronous call. Discard the
> connection and return
> BAD_HANDLE to allow client to recover and make
> progress.
> */
> - immnd_proc_imma_discard_connection(cb, cl_node);
> + immnd_proc_imma_discard_connection(cb, cl_node, false);
> rc = immnd_client_node_del(cb, cl_node);
> osafassert(rc == NCSCC_RC_SUCCESS);
> free(cl_node);
> @@ -2442,7 +2444,7 @@ static uint32_t immnd_evt_proc_impl_set(
> on a previous syncronous call. Discard the
> connection and return
> BAD_HANDLE to allow client to recover and make
> progress.
> */
> - immnd_proc_imma_discard_connection(cb, cl_node);
> + immnd_proc_imma_discard_connection(cb, cl_node, false);
> rc = immnd_client_node_del(cb, cl_node);
> osafassert(rc == NCSCC_RC_SUCCESS);
> free(cl_node);
> @@ -2573,7 +2575,7 @@ static uint32_t immnd_evt_proc_ccb_init(
> on a previous syncronous call. Discard the
> connection and return
> BAD_HANDLE to allow client to recover and make
> progress.
> */
> - immnd_proc_imma_discard_connection(cb, cl_node);
> + immnd_proc_imma_discard_connection(cb, cl_node, false);
> rc = immnd_client_node_del(cb, cl_node);
> osafassert(rc == NCSCC_RC_SUCCESS);
> free(cl_node);
> @@ -2680,7 +2682,7 @@ static uint32_t immnd_evt_proc_rt_update
> on a previous syncronous call. Discard the
> connection and return
> BAD_HANDLE to allow client to recover and make
> progress.
> */
> - immnd_proc_imma_discard_connection(cb, cl_node);
> + immnd_proc_imma_discard_connection(cb, cl_node, false);
> rc = immnd_client_node_del(cb, cl_node);
> osafassert(rc == NCSCC_RC_SUCCESS);
> free(cl_node);
> @@ -2866,7 +2868,7 @@ static uint32_t immnd_evt_proc_fevs_forw
> on a previous syncronous call. Discard the
> connection and return
> BAD_HANDLE to allow client to recover and
> make progress.
> */
> - immnd_proc_imma_discard_connection(cb, cl_node);
> + immnd_proc_imma_discard_connection(cb, cl_node,
> false);
> rc = immnd_client_node_del(cb, cl_node);
> osafassert(rc == NCSCC_RC_SUCCESS);
> free(cl_node);
> @@ -8317,7 +8319,7 @@ uint32_t immnd_evt_proc_abort_sync(IMMND
> if (cb->mState == IMM_SERVER_SYNC_CLIENT ||
> cb->mState == IMM_SERVER_SYNC_PENDING) { /* Sync
> client will have to restart the sync */
> cb->mState = IMM_SERVER_LOADING_PENDING;
> - LOG_WA("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM
> SERVER LOADING PENDING (sync aborted)");
> + LOG_WA("SERVER STATE: IMM_SERVER_SYNC_CLIENT -->
> IMM_SERVER_LOADING_PENDING (sync aborted)");
> cb->mStep = 0;
> cb->mJobStart = time(NULL);
> osafassert(cb->mJobStart >= ((time_t) 0));
> @@ -8451,6 +8453,7 @@ static uint32_t immnd_evt_proc_start_syn
> with respect to the just arriving start-sync.
> Search for "ticket:#598" in immnd_proc.c
> */
> + immModel_setScAbsenceAllowed(cb);
> } else if ((cb->mState == IMM_SERVER_SYNC_CLIENT) &&
> (immnd_syncComplete(cb, SA_FALSE, cb->mStep))) {
> cb->mStep = 0;
> cb->mJobStart = time(NULL);
> @@ -8467,6 +8470,7 @@ static uint32_t immnd_evt_proc_start_syn
> with respect to the just arriving start-sync.
> Search for "ticket:#599" in immnd_proc.c
> */
> + immModel_setScAbsenceAllowed(cb);
> }
>
> cb->mRulingEpoch = evt->info.ctrl.rulingEpoch;
> @@ -8543,7 +8547,7 @@ static uint32_t immnd_evt_proc_start_syn
> static uint32_t immnd_evt_proc_reset(IMMND_CB *cb, IMMND_EVT *evt,
> IMMSV_SEND_INFO *sinfo)
> {
> TRACE_ENTER();
> - if (cb->mIntroduced) {
> + if (cb->mIntroduced==1) {
> LOG_ER("IMMND forced to restart on order from IMMD, exiting");
> if(cb->mState < IMM_SERVER_READY) {
> immnd_ackToNid(NCSCC_RC_FAILURE);
> @@ -8668,11 +8672,15 @@ static uint32_t immnd_evt_proc_intro_rsp
> evt->info.ctrl.nodeId != cb->node_id);
> cb->mNumNodes++;
> TRACE("immnd_evt_proc_intro_rsp cb->mNumNodes: %u", cb->mNumNodes);
> + LOG_IN("immnd_evt_proc_intro_rsp: epoch:%i rulingEpoch:%u",
> cb->mMyEpoch, evt->info.ctrl.rulingEpoch);
> + if(evt->info.ctrl.rulingEpoch > cb->mRulingEpoch) {
> + cb->mRulingEpoch = evt->info.ctrl.rulingEpoch;
> + }
>
> if (evt->info.ctrl.nodeId == cb->node_id) {
> /*This node was introduced to the IMM cluster */
> uint8_t oldCanBeCoord = cb->mCanBeCoord;
> - cb->mIntroduced = true;
> + cb->mIntroduced = 1;
> if(evt->info.ctrl.canBeCoord == 3) {
> cb->m2Pbe = 1;
> evt->info.ctrl.canBeCoord = 1;
> @@ -8708,6 +8716,14 @@ static uint32_t immnd_evt_proc_intro_rsp
> ((oldCanBeCoord == 2)?"load":"sync"));
> }
>
> + if(cb->mCanBeCoord == 4) {
> + osafassert(!(cb->m2Pbe));
> + cb->mScAbsenceAllowed = evt->info.ctrl.ndExecPid;
> + LOG_IN("ABT cb->mScAbsenceAllowed:%u
> evt->info.ctrl.ndExecPid:%u", cb->mScAbsenceAllowed,
> evt->info.ctrl.ndExecPid);
> + LOG_IN("SC_ABSENCE_ALLOWED (Headless Hydra) is
> configured for %u seconds. CanBeCoord:%u",
> + cb->mScAbsenceAllowed, cb->mCanBeCoord);
> + }
> +
> if (evt->info.ctrl.isCoord) {
> if (cb->mIsCoord) {
> LOG_NO("This IMMND re-elected coord
> redundantly, failover ?");
> @@ -8733,7 +8749,14 @@ static uint32_t immnd_evt_proc_intro_rsp
>
> }
> }
> - cb->mIsCoord = evt->info.ctrl.isCoord;
> + if(cb->mIsCoord) {
> + if(!(evt->info.ctrl.isCoord)) {
> + LOG_NO("ABT CLOUD PROTO avoided canceling coord
> - SHOULD NOT GET HERE");
> + }
> + } else {
> + LOG_NO("SETTING COORD TO %u CLOUD PROTO",
> evt->info.ctrl.isCoord);
> + cb->mIsCoord = evt->info.ctrl.isCoord;
> + }
> osafassert(!cb->mIsCoord || cb->mCanBeCoord);
> cb->mRulingEpoch = evt->info.ctrl.rulingEpoch;
> if (cb->mRulingEpoch) {
> @@ -8751,7 +8774,7 @@ static uint32_t immnd_evt_proc_intro_rsp
>
> */
> if(cb->mCanBeCoord && evt->info.ctrl.canBeCoord) {
> - LOG_IN("Other SC node (%x) has been introduced",
> evt->info.ctrl.nodeId);
> + LOG_IN("Other %s IMMND node (%x) has been introduced",
> (cb->mScAbsenceAllowed)?"candidate coord":"SC", evt->info.ctrl.nodeId);
> cb->mIsOtherScUp = true; /* Prevents oneSafe2PBEAllowed
> from being turned on */
> cb->other_sc_node_id = evt->info.ctrl.nodeId;
>
> @@ -9066,7 +9089,9 @@ static void immnd_evt_proc_adminit_rsp(I
> SaUint32T conn;
> SaUint32T ownerId = 0;
>
> - osafassert(evt);
> + /* Remember latest admo_id for IMMD recovery. */
> + cb->mLatestAdmoId = evt->info.adminitGlobal.globalOwnerId;
> +
> conn = m_IMMSV_UNPACK_HANDLE_HIGH(clnt_hdl);
> nodeId = m_IMMSV_UNPACK_HANDLE_LOW(clnt_hdl);
> ownerId = evt->info.adminitGlobal.globalOwnerId;
> @@ -9231,6 +9256,45 @@ static void immnd_evt_proc_finalize_sync
> /*This adjust-epoch will persistify the new epoch for:
> veterans. */
> immnd_adjustEpoch(cb, SA_TRUE); /* Will osafassert if
> immd is down. */
> }
> +
> + if(cb->mScAbsenceAllowed) {/* Coord and veteran nodes.
*/ > + IMMND_IMM_CLIENT_NODE *cl_node = NULL; > + SaImmHandleT prev_hdl; > + unsigned int count = 0; > + IMMSV_EVT send_evt; > + /* Sync completed for veteran & headless allowed => > trigger active > + resurrect. */ > + memset(&send_evt, '\0', sizeof(IMMSV_EVT)); > + send_evt.type = IMMSV_EVT_TYPE_IMMA; > + send_evt.info.imma.type = > IMMA_EVT_ND2A_PROC_STALE_CLIENTS; > + immnd_client_node_getnext(cb, 0, &cl_node); > + while (cl_node) { > + prev_hdl = cl_node->imm_app_hdl; > + if(!(cl_node->mIsResurrect)) { > + LOG_IN("Veteran node found active > client id: %llx " > + "version:%c %u %u, after sync.", > + cl_node->imm_app_hdl, > cl_node->version.releaseCode, > + cl_node->version.majorVersion, > + cl_node->version.minorVersion); > + immnd_client_node_getnext(cb, prev_hdl, > &cl_node); > + continue; > + } > + /* Send resurrect message. */ > + if (immnd_mds_msg_send(cb, cl_node->sv_id, > + cl_node->agent_mds_dest, > &send_evt)!=NCSCC_RC_SUCCESS) > + { > + LOG_WA("Failed to send active resurrect > message"); > + } > + /* Remove the temporary client node. */ > + immnd_client_node_del(cb, cl_node); > + memset(cl_node, '\0', > sizeof(IMMND_IMM_CLIENT_NODE)); > + free(cl_node); > + cl_node = NULL; > + ++count; > + immnd_client_node_getnext(cb, 0, &cl_node); > + } > + TRACE_2("Triggered %u active resurrects at veteran > node", count); > + } > } > > done: > @@ -9485,7 +9549,7 @@ static void immnd_evt_proc_admo_finalize > * is to be sent (only relevant if > * originatedAtThisNode is false). > > *****************************************************************************/ > -static void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb, > +void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb, > IMMND_EVT *evt, > SaBoolT originatedAtThisNd, > SaImmHandleT clnt_hdl, MDS_DEST reply_dest) > { > @@ -9550,6 +9614,9 @@ static void immnd_evt_proc_impl_set_rsp( > evt->info.implSet.oi_timeout = 0; > } > > + /* Remember latest impl_id for IMMD recovery. 
*/ > + cb->mLatestImplId = evt->info.implSet.impl_id; > + > err = immModel_implementerSet(cb, &(evt->info.implSet.impl_name), > (originatedAtThisNd) ? conn : 0, nodeId, implId, > reply_dest, evt->info.implSet.oi_timeout, > &discardImplementer); > @@ -9934,6 +10001,9 @@ static void immnd_evt_proc_ccbinit_rsp(I > nodeId = m_IMMSV_UNPACK_HANDLE_LOW(clnt_hdl); > ccbId = evt->info.ccbinitGlobal.globalCcbId; > > + /* Remember latest ccb_id for IMMD recovery. */ > + cb->mLatestCcbId = evt->info.ccbinitGlobal.globalCcbId; > + > err = immModel_ccbCreate(cb, > evt->info.ccbinitGlobal.i.adminOwnerId, > evt->info.ccbinitGlobal.i.ccbFlags, > @@ -10053,12 +10123,61 @@ static uint32_t immnd_evt_proc_mds_evt(I > immnd_proc_imma_down(cb, evt->info.mds_info.dest, > evt->info.mds_info.svc_id); > } else if ((evt->info.mds_info.change == NCSMDS_DOWN) && > evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMD) { > /* Cluster is going down. */ > - LOG_NO("No IMMD service => cluster restart, exiting"); > - if(cb->mState < IMM_SERVER_SYNC_SERVER) { > - immnd_ackToNid(NCSCC_RC_FAILURE); > - } > - exit(1); > - > + if(cb->mScAbsenceAllowed == 0) { > + /* Regular (non Hydra) exit on IMMD DOWN. */ > + LOG_ER("No IMMD service => cluster restart, exiting"); > + if(cb->mState < IMM_SERVER_SYNC_SERVER) { > + immnd_ackToNid(NCSCC_RC_FAILURE); > + } > + exit(1); > + } else { /* SC ABSENCE ALLOWED */ > + LOG_WA("SC Absence IS allowed:%u IMMD service is DOWN", > cb->mScAbsenceAllowed); > + if(cb->mIsCoord) { > + /* Note that normally the coord will reside at > SCs so this branch will > + only be relevant if REPEATED toal scAbsence > occurs. After SC absence > + and subsequent return of SC, the coord will > be elected at a payload. > + That coord will be active untill restart of > that payload.. > + unless we add functionality for the payload > coord to restart after > + a few minutes .. ? 
> + */ > + LOG_WA("This IMMND coord has to exit allowing > restarted IMMD to select new coord"); > + if(cb->mState < IMM_SERVER_SYNC_SERVER) { > + immnd_ackToNid(NCSCC_RC_FAILURE); > + } > + exit(1); > + } else if(cb->mState <= IMM_SERVER_LOADING_PENDING) { > + /* Reset state in payloads that had not joined. > No need to restart. */ > + LOG_IN("Resetting IMMND state from %u to > IMM_SERVER_ANONYMOUS", cb->mState); > + cb->mState = IMM_SERVER_ANONYMOUS; > + } else if(cb->mState < IMM_SERVER_READY) { > + LOG_WA("IMMND was being synced or loaded (%u), > has to restart", cb->mState); > + if(cb->mState < IMM_SERVER_SYNC_SERVER) { > + immnd_ackToNid(NCSCC_RC_FAILURE); > + } > + exit(1); > + } > + } > + cb->mIntroduced = 2; > + LOG_NO("IMMD SERVICE IS DOWN, HYDRA IS CONFIGURED => > UNREGISTERING IMMND form MDS"); > + immnd_mds_unregister(cb); > + /* Discard local clients ... */ > + immnd_proc_discard_other_nodes(cb); /* Isolate from the rest of > cluster */ > + LOG_NO("MDS unregisterede. sleeping ..."); > + sleep(1); > + LOG_NO("Sleep done registering IMMND with MDS"); > + rc = immnd_mds_register(immnd_cb); > + if(rc == NCSCC_RC_SUCCESS) { > + LOG_NO("SUCCESS IN REGISTERING IMMND WITH MDS"); > + } else { > + LOG_ER("FAILURE IN REGISTERING IMMND WITH MDS - > exiting"); > + exit(1); > + } > + } else if ((evt->info.mds_info.change == NCSMDS_UP) && > (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMD)) { > + LOG_NO("IMMD service is UP ... ScAbsenseAllowed?:%u > introduced?:%u", > + cb->mScAbsenceAllowed, cb->mIntroduced); > + if((cb->mIntroduced==2) && (immnd_introduceMe(cb) != > NCSCC_RC_SUCCESS)) { > + LOG_WA("IMMND re-introduceMe after IMMD restart failed, > will retry"); > + } > } else if ((evt->info.mds_info.change == NCSMDS_UP) && > (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMA_OM || > evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMA_OM)) { > @@ -10073,7 +10192,6 @@ static uint32_t immnd_evt_proc_mds_evt(I > TRACE_2("IMMD FAILOVER"); > /* The IMMD has failed over. 
*/
>              immnd_proc_imma_discard_stales(cb);
> -
>          } else if (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMND) {
>              LOG_NO("MDS SERVICE EVENT OF TYPE IMMND!!");
>          }
> diff --git a/osaf/services/saf/immsv/immnd/immnd_init.h b/osaf/services/saf/immsv/immnd/immnd_init.h
> --- a/osaf/services/saf/immsv/immnd/immnd_init.h
> +++ b/osaf/services/saf/immsv/immnd/immnd_init.h
> @@ -39,8 +39,10 @@ extern IMMND_CB *immnd_cb;
>
>  /* file : - immnd_proc.c */
>
> +void immnd_proc_discard_other_nodes(IMMND_CB *cb);
> +
>  void immnd_proc_imma_down(IMMND_CB *cb, MDS_DEST dest, NCSMDS_SVC_ID sv_id);
> -uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, IMMND_IMM_CLIENT_NODE *cl_node);
> +uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, IMMND_IMM_CLIENT_NODE *cl_node, bool scAbsenceAllowed);
>  void immnd_proc_imma_discard_stales(IMMND_CB *cb);
>
>  void immnd_cb_dump(void);
> @@ -75,6 +77,10 @@ extern "C" {
>
>      void immModel_abortSync(IMMND_CB *cb);
>
> +    void immModel_isolateThisNode(IMMND_CB *cb);
> +
> +    void immModel_abortNonCriticalCcbs(IMMND_CB *cb);
> +
>      void immModel_pbePrtoPurgeMutations(IMMND_CB *cb, unsigned int nodeId,
>          SaUint32T *reqArrSize,
>          SaUint32T **reqConArr);
>
> @@ -433,6 +439,8 @@ extern "C" {
>          const char *errorString,
>          ...);
>
> +    void immModel_setScAbsenceAllowed(IMMND_CB *cb);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> @@ -471,6 +479,9 @@ uint32_t immnd_mds_get_handle(IMMND_CB *
>  /* File : ---- immnd_evt.c */
>  void immnd_process_evt(void);
>  uint32_t immnd_evt_destroy(IMMSV_EVT *evt, SaBoolT onheap, uint32_t line);
> +void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb, IMMND_EVT *evt,
> +    SaBoolT originatedAtThisNd, SaImmHandleT clnt_hdl, MDS_DEST reply_dest);
> +
>  /* End : ---- immnd_evt.c */
>
>  /* File : ---- immnd_proc.c */
> diff --git a/osaf/services/saf/immsv/immnd/immnd_main.c b/osaf/services/saf/immsv/immnd/immnd_main.c
> --- a/osaf/services/saf/immsv/immnd/immnd_main.c
> +++ b/osaf/services/saf/immsv/immnd/immnd_main.c
> @@ -169,6 +169,13 @@ static uint32_t immnd_initialize(char *p
>              immnd_cb->mPbeFile);
>      }
>
> +    if ((envVar = getenv("IMMSV_USE_SHARED_FS"))) {
> +        int useSharedFs = atoi(envVar);
> +        if(useSharedFs != 0) {
> +            immnd_cb->mPbeUsesSharedFs = 1;
> +        }
> +    }
> +
>      immnd_cb->mRim = SA_IMM_INIT_FROM_FILE;
>      immnd_cb->mPbeVeteran = SA_FALSE;
>      immnd_cb->mPbeVeteranB = SA_FALSE;
> diff --git a/osaf/services/saf/immsv/immnd/immnd_proc.c b/osaf/services/saf/immsv/immnd/immnd_proc.c
> --- a/osaf/services/saf/immsv/immnd/immnd_proc.c
> +++ b/osaf/services/saf/immsv/immnd/immnd_proc.c
> @@ -34,6 +34,7 @@
>
>  #include "immnd.h"
>  #include "immsv_api.h"
> +#include "immnd_init.h"
>
>  static const char *loaderBase = "osafimmloadd";
>  static const char *pbeBase = "osafimmpbed";
> @@ -76,7 +77,7 @@ void immnd_proc_immd_down(IMMND_CB *cb)
>   * Notes : Policy used for handling immd down is to blindly cleanup
>   * :immnd_cb
>
>  ****************************************************************************/
> -uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, IMMND_IMM_CLIENT_NODE *cl_node)
> +uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, IMMND_IMM_CLIENT_NODE *cl_node, bool scAbsence)
>  {
>      SaUint32T client_id;
>      SaUint32T node_id;
> @@ -129,7 +130,8 @@ uint32_t immnd_proc_imma_discard_connect
>          send_evt.type = IMMSV_EVT_TYPE_IMMD;
>          send_evt.info.immd.type = IMMD_EVT_ND2D_DISCARD_IMPL;
>          send_evt.info.immd.info.impl_set.r.impl_id = implId;
> -        if (immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, cb->immd_mdest_id, &send_evt) != NCSCC_RC_SUCCESS) {
> +
> +        if (!scAbsence && immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, cb->immd_mdest_id, &send_evt) != NCSCC_RC_SUCCESS) {
>              if (immnd_is_immd_up(cb)) {
>                  LOG_ER("Discard implementer failed for implId:%u "
>                      "but IMMD is up !? - case not handled.
> Client will be orphanded", implId);
> @@ -142,7 +144,8 @@ uint32_t immnd_proc_imma_discard_connect
>          /*Discard the local implementer directly and redundantly to avoid
>             race conditions using this implementer (ccb's causing abort upcalls).
>           */
> -        immModel_discardImplementer(cb, implId, SA_FALSE, NULL, NULL);
> +        //immModel_discardImplementer(cb, implId, SA_FALSE, NULL, NULL);
> +        immModel_discardImplementer(cb, implId, scAbsence, NULL, NULL);
>      }
>
>      if (cl_node->mIsStale) {
> @@ -163,7 +166,7 @@ uint32_t immnd_proc_imma_discard_connect
>          for (ix = 0; ix < arrSize && !(cl_node->mIsStale); ++ix) {
>              send_evt.info.immd.info.ccbId = idArr[ix];
>              TRACE_5("Discarding Ccb id:%u originating at dead connection: %u", idArr[ix], client_id);
> -            if (immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, cb->immd_mdest_id,
> +            if (!scAbsence && immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, cb->immd_mdest_id,
> [Hung] We don't need this ...
>
>                      &send_evt) != NCSCC_RC_SUCCESS) {
>                  if (immnd_is_immd_up(cb)) {
>                      LOG_ER("Failure to broadcast discard Ccb for ccbId:%u "
> @@ -174,6 +177,8 @@ uint32_t immnd_proc_imma_discard_connect
>                          "(immd down)- will retry later", idArr[ix]);
>                  }
>                  cl_node->mIsStale = true;
> +            } else if(scAbsence) {
> +                /* ABT TODO discard local ccbs ??*/
> [Hung] ... and this. When 'scAbsence' is true, the code will not send out any message. We can simply do something like this instead, which will be faster:
>     if (!scAbsence) immModel_getCcbIdsForOrigCon(cb, client_id, &arrSize, &idArr);
> 'arrSize' is initialized with '0' so it will not enter the 'if' block.
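A minimal, self-contained sketch of the shortcut Hung describes, for illustration only. It is not OpenSAF code: `get_ccb_ids_for_conn()` is a hypothetical stub standing in for immModel_getCcbIdsForOrigCon(), and `discard_ccbs_for_conn()` is a hypothetical helper that only counts the messages that would be sent instead of calling immnd_mds_msg_send(). The point it shows is that guarding the lookup with `!scAbsence` leaves `arrSize` at 0, so the send loop is never entered and no per-message check is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stub standing in for immModel_getCcbIdsForOrigCon():
 * pretends the connection owns two CCBs and returns their ids. */
static void get_ccb_ids_for_conn(uint32_t conn, uint32_t *arrSize, uint32_t **idArr)
{
    (void)conn;
    *idArr = malloc(2 * sizeof(**idArr));
    assert(*idArr != NULL);
    (*idArr)[0] = 11;
    (*idArr)[1] = 12;
    *arrSize = 2;
}

/* Hypothetical helper: returns how many discard-CCB messages would be
 * sent to IMMD for this connection. */
static unsigned discard_ccbs_for_conn(uint32_t conn, bool scAbsence)
{
    uint32_t arrSize = 0;   /* stays 0 when the lookup is skipped */
    uint32_t *idArr = NULL;
    unsigned sent = 0;

    /* Skip the lookup entirely when SCs are absent: nothing can be sent. */
    if (!scAbsence)
        get_ccb_ids_for_conn(conn, &arrSize, &idArr);

    if (arrSize) {
        for (uint32_t ix = 0; ix < arrSize; ++ix)
            ++sent;         /* real code would send one message per id here */
        free(idArr);
    }
    return sent;
}
```

Calling `discard_ccbs_for_conn(conn, true)` does no lookup and sends nothing, while `discard_ccbs_for_conn(conn, false)` would send one message per CCB id returned by the stub.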
>
>              }
>          }
>          free(idArr);
> @@ -197,20 +202,29 @@ uint32_t immnd_proc_imma_discard_connect
>          send_evt.type = IMMSV_EVT_TYPE_IMMD;
>          send_evt.info.immd.type = IMMD_EVT_ND2D_ADMO_HARD_FINALIZE;
>          for (ix = 0; ix < arrSize && !(cl_node->mIsStale); ++ix) {
> -            send_evt.info.immd.info.admoId = idArr[ix];
>              TRACE_5("Hard finalize of AdmOwner id:%u originating at "
>                  "dead connection: %u", idArr[ix], client_id);
> -            if (immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, cb->immd_mdest_id,
> +            if (scAbsence) {
> +                SaImmHandleT clnt_hdl;
> +                MDS_DEST reply_dest;
> +                memset(&clnt_hdl, '\0', sizeof(SaImmHandleT));
> +                memset(&reply_dest, '\0', sizeof(MDS_DEST));
> +                send_evt.info.immnd.info.admFinReq.adm_owner_id = idArr[ix];
> +                immnd_evt_proc_admo_hard_finalize(cb, &send_evt.info.immnd, false, clnt_hdl, reply_dest);
> +            } else {
> +                send_evt.info.immd.info.admoId = idArr[ix];
> +                if(immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, cb->immd_mdest_id,
>                          &send_evt) != NCSCC_RC_SUCCESS) {
> -                if (immnd_is_immd_up(cb)) {
> -                    LOG_ER("Failure to broadcast discard admo0wner for ccbId:%u "
> -                        "but IMMD is up !? - case not handled. Client will "
> -                        "be orphanded", implId);
> -                } else {
> -                    LOG_WA("Failure to broadcast discard admowner for id:%u "
> -                        "(immd down)- will retry later", idArr[ix]);
> +                    if (immnd_is_immd_up(cb)) {
> +                        LOG_ER("Failure to broadcast discard admo0wner for ccbId:%u "
> +                            "but IMMD is up !? - case not handled. Client will "
> +                            "be orphanded", implId);
> +                    } else {
> +                        LOG_WA("Failure to broadcast discard admowner for id:%u "
> +                            "(immd down)- will retry later", idArr[ix]);
> +                    }
> +                    cl_node->mIsStale = true;
>                  }
> -                cl_node->mIsStale = true;
>              }
>          }
>          free(idArr);
> @@ -251,7 +265,7 @@ void immnd_proc_imma_down(IMMND_CB *cb,
>          prev_hdl = cl_node->imm_app_hdl;
>
>          if ((memcmp(&dest, &cl_node->agent_mds_dest, sizeof(MDS_DEST)) == 0) && sv_id == cl_node->sv_id) {
> -            if (immnd_proc_imma_discard_connection(cb, cl_node)) {
> +            if (immnd_proc_imma_discard_connection(cb, cl_node, false)) {
>                  TRACE_5("Removing client id:%llx sv_id:%u", cl_node->imm_app_hdl, cl_node->sv_id);
>                  immnd_client_node_del(cb, cl_node);
>                  memset(cl_node, '\0', sizeof(IMMND_IMM_CLIENT_NODE));
> @@ -300,7 +314,7 @@ void immnd_proc_imma_discard_stales(IMMN
>          prev_hdl = cl_node->imm_app_hdl;
>          if (cl_node->mIsStale) {
>              cl_node->mIsStale = false;
> -            if (immnd_proc_imma_discard_connection(cb, cl_node)) {
> +            if (immnd_proc_imma_discard_connection(cb, cl_node, false)) {
>                  TRACE_5("Removing client id:%llx sv_id:%u", cl_node->imm_app_hdl, cl_node->sv_id);
>                  immnd_client_node_del(cb, cl_node);
>                  memset(cl_node, '\0', sizeof(IMMND_IMM_CLIENT_NODE));
> @@ -422,6 +436,17 @@ uint32_t immnd_introduceMe(IMMND_CB *cb)
>              send_evt.info.immd.info.ctrl_msg.pbeEnabled,
>              send_evt.info.immd.info.ctrl_msg.dir.size);
>
> +    if(cb->mIntroduced==2) {
> +        LOG_NO("Re-introduce-me highestProcessed:%llu highestReceived:%llu",
> +            cb->highestProcessed, cb->highestReceived);
> +        send_evt.info.immd.info.ctrl_msg.refresh = 2;
> +        send_evt.info.immd.info.ctrl_msg.fevs_count = cb->highestReceived;
> +
> +        send_evt.info.immd.info.ctrl_msg.admo_id_count = cb->mLatestAdmoId;;
> +        send_evt.info.immd.info.ctrl_msg.ccb_id_count = cb->mLatestCcbId;
> +        send_evt.info.immd.info.ctrl_msg.impl_count = cb->mLatestImplId;
> +    }
> +
>      if (!immnd_is_immd_up(cb)) {
>          return NCSCC_RC_FAILURE;
>      }
> @@ -480,7 +505,7 @@ static
int32_t immnd_iAmLoader(IMMND_CB > TRACE_5("Loading is not possible, preLoader still attached"); > return (-3); > } > - > +LOG_IN("ABT CLOUD PROTO cb->mMyEpoch:%u != cb->mRulingEpoch:%u", > cb->mMyEpoch, cb->mRulingEpoch); > if (cb->mMyEpoch != cb->mRulingEpoch) { > /*We are joining the cluster, need to sync this IMMND. */ > return (-2); > @@ -536,7 +561,7 @@ static uint32_t immnd_requestSync(IMMND_ > uint32_t rc = NCSCC_RC_SUCCESS; > IMMSV_EVT send_evt; > memset(&send_evt, '\0', sizeof(IMMSV_EVT)); > - > +LOG_NO("ABT REQUESTING SYNC"); > send_evt.type = IMMSV_EVT_TYPE_IMMD; > send_evt.info.immd.type = IMMD_EVT_ND2D_REQ_SYNC; > send_evt.info.immd.info.ctrl_msg.ndExecPid = cb->mMyPid; > @@ -546,6 +571,7 @@ static uint32_t immnd_requestSync(IMMND_ > if (immnd_is_immd_up(cb)) { > rc = immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, > cb->immd_mdest_id, &send_evt); > } else { > + LOG_IN("Could not request sync because IMMD is not UP"); > rc = NCSCC_RC_FAILURE; > } > return (rc == NCSCC_RC_SUCCESS); > @@ -1571,13 +1597,19 @@ static int immnd_forkPbe(IMMND_CB *cb) > if (pid == 0) { /*child */ > /* TODO: Should close file-descriptors ... */ > /*char * const pbeArgs[5] = { (char *) execPath, "--recover", > "--pbeXX", dbFilePath, 0 };*/ > - char * pbeArgs[5]; > + char * pbeArgs[6]; > bool veteran = (cb->mIsCoord) ? 
(cb->mPbeVeteran) : (cb->m2Pbe > && cb->mPbeVeteranB); > pbeArgs[0] = (char *) execPath; > - if(veteran) { > + if(veteran && cb->mScAbsenceAllowed && !cb->mPbeUsesSharedFs) { > + pbeArgs[1] = "--recover"; > + pbeArgs[2] = "--check-objects"; > + pbeArgs[3] = > (cb->m2Pbe)?((cb->mIsCoord)?"--pbe2A":"--pbe2B"):"--pbe"; > + pbeArgs[4] = dbFilePath; > + pbeArgs[5] = 0; > + } else if(veteran) { > pbeArgs[1] = "--recover"; > pbeArgs[2] = > (cb->m2Pbe)?((cb->mIsCoord)?"--pbe2A":"--pbe2B"):"--pbe"; > - pbeArgs[3] = dbFilePath; > + pbeArgs[3] = dbFilePath; > pbeArgs[4] = 0; > } else { > pbeArgs[1] = > (cb->m2Pbe)?((cb->mIsCoord)?"--pbe2A":"--pbe2B"):"--pbe"; > @@ -1685,7 +1717,7 @@ uint32_t immnd_proc_server(uint32_t *tim > cb->mJobStart = now; > } > } else { /*We are not ready to start loading yet */ > - if(cb->mIntroduced) { > + if(cb->mIntroduced==1) { > if((cb->m2Pbe == 2) && !(cb->preLoadPid)) { > cb->preLoadPid = immnd_forkLoader(cb, > true); > } > @@ -1833,6 +1865,7 @@ uint32_t immnd_proc_server(uint32_t *tim > cb->mState = IMM_SERVER_READY; > immnd_ackToNid(NCSCC_RC_SUCCESS); > LOG_NO("SERVER STATE: IMM_SERVER_LOADING_SERVER > --> IMM_SERVER_READY"); > + immModel_setScAbsenceAllowed(cb); > cb->mJobStart = now; > if (cb->mPbeFile) {/* Pbe enabled */ > cb->mRim = > immModel_getRepositoryInitMode(cb); > @@ -1876,6 +1909,7 @@ uint32_t immnd_proc_server(uint32_t *tim > cb->mState = IMM_SERVER_READY; > cb->mJobStart = now; > LOG_NO("SERVER STATE: IMM_SERVER_LOADING_CLIENT --> > IMM_SERVER_READY"); > + immModel_setScAbsenceAllowed(cb); > if (cb->mPbeFile) {/* Pbe configured */ > cb->mRim = immModel_getRepositoryInitMode(cb); > > @@ -1896,7 +1930,9 @@ uint32_t immnd_proc_server(uint32_t *tim > cb->mJobStart = now; > cb->mState = IMM_SERVER_READY; > immnd_ackToNid(NCSCC_RC_SUCCESS); > - LOG_NO("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM > SERVER READY"); > + LOG_NO("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> > IMM_SERVER_READY"); > + immModel_setScAbsenceAllowed(cb); > + > /* 
> This code case duplicated in immnd_evt.c > Search for: "ticket:#599" > @@ -1927,7 +1963,7 @@ uint32_t immnd_proc_server(uint32_t *tim > cb->mStep = 0; > cb->mJobStart = now; > cb->mState = IMM_SERVER_READY; > - LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER > --> IMM SERVER READY"); > + LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER > --> IMM_SERVER_READY"); > } > if (!(cb->mStep % 60)) { > LOG_IN("Sync Phase-1, waiting for existing " > @@ -1944,7 +1980,7 @@ uint32_t immnd_proc_server(uint32_t *tim > cb->mStep = 0; > cb->mJobStart = now; > cb->mState = IMM_SERVER_READY; > - LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER > --> IMM SERVER READY"); > + LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER > --> IMM_SERVER_READY"); > } > > /* PBE may intentionally be restarted by sync. Catch > this here. */ > @@ -1977,7 +2013,7 @@ uint32_t immnd_proc_server(uint32_t *tim > cb->mJobStart = now; > cb->mState = IMM_SERVER_READY; > immnd_abortSync(cb); > - LOG_NO("SERVER STATE: > IMM_SERVER_SYNC_SERVER --> IMM SERVER READY"); > + LOG_NO("SERVER STATE: > IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY"); > } else { > LOG_IN("Sync Phase-2: Ccbs are > terminated, IMM in " > "read-only mode, forked sync > process pid:%u", cb->syncPid); > @@ -1991,7 +2027,7 @@ uint32_t immnd_proc_server(uint32_t *tim > cb->mStep = 0; > cb->mJobStart = now; > cb->mState = IMM_SERVER_READY; > - LOG_NO("SERVER STATE: > IMM_SERVER_SYNC_SERVER --> IMM SERVER READY"); > + LOG_NO("SERVER STATE: > IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY"); > } else if (!(cb->mSyncFinalizing)) { > int status = 0; > if (waitpid(cb->syncPid, &status, > WNOHANG) > 0) { > @@ -2031,6 +2067,11 @@ uint32_t immnd_proc_server(uint32_t *tim > } > } > > + if(cb->mIntroduced == 2) { > + immnd_introduceMe(cb); > + break; > + } > + > coord = immnd_iAmCoordinator(cb); > > if (cb->pbePid > 0) { > @@ -2275,3 +2316,28 @@ void immnd_dump_client_info(IMMND_IMM_CL > } > > #endif > + > +/* Only for scAbsenceAllowed (headless hydra) */ > +void 
immnd_proc_discard_other_nodes(IMMND_CB *cb)
> +{
> +    TRACE_ENTER();
> +    /* Discard all clients. */
> +
> +    IMMND_IMM_CLIENT_NODE *cl_node = NULL;
> +    immnd_client_node_getnext(cb, 0, &cl_node);
> +    while (cl_node) {
> +        LOG_NO("Removing client id:%llx sv_id:%u", cl_node->imm_app_hdl, cl_node->sv_id);
> +        osafassert(immnd_proc_imma_discard_connection(cb, cl_node, true));
> +        LOG_NO("ABT discard_connection OK");
> +        osafassert(immnd_client_node_del(cb, cl_node) == NCSCC_RC_SUCCESS);
> +        free(cl_node);
> +        cl_node = NULL;
> +        LOG_NO("ABT Client node REMOVED");
> +        immnd_client_node_getnext(cb, 0, &cl_node);
> +    }
> +
> +    LOG_NO("ABT DONE REMOVING CLIENTS ENTERING immModel_isolateThisNode(cb) ");
> +    immModel_isolateThisNode(cb);
> +    immModel_abortNonCriticalCcbs(cb);
> +    TRACE_LEAVE();
> +}
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Opensaf-devel mailing list
> Opensaf-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel