Hi Zoran/Neel,

It was a misconfiguration on my side, please ignore.
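
For anyone who hits the same warning: judging only from the quoted log below, the IMMND on SC-2 logged `Pbe file:imm.db` while the IMMND on SC-1 (node 2010f) had no PBE file set, and IMMD rejected SC-1 for that mismatch. Below is a rough C++ sketch of the check as the log messages suggest it behaves. `PbePolicy`, `Verdict` and `onScImmndIntro` are hypothetical names, not the IMMD source, and the state change after the first warning is only an inference from the order of the messages.

```cpp
#include <string>

// Hypothetical model of the PBE consistency check seen in the IMMD log.
// None of these names come from the OpenSAF code base.
enum class Verdict { Accept, AcceptWithWarning, Reject };

struct PbePolicy {
    // Assumption: IMMD starts out treating PBE as "not configured".
    bool pbeExpected = false;

    // pbeFile is the IMMSV_PBE_FILE value reported by an SC IMMND;
    // an empty string means the export line is commented out.
    Verdict onScImmndIntro(const std::string& pbeFile) {
        const bool hasPbe = !pbeFile.empty();
        if (hasPbe == pbeExpected) {
            return Verdict::Accept;
        }
        if (hasPbe) {
            // Matches "PBE not configured at first attached SC-immnd,
            // but Pbe is configured for immnd at ..." (warning only).
            pbeExpected = true;
            return Verdict::AcceptWithWarning;
        }
        // Matches "PBE is configured at first attached SC-immnd, but no
        // Pbe file is configured for immnd at node ... - rejecting node".
        return Verdict::Reject;
    }
};
```

With this model, an intro from 2020f carrying `imm.db` followed by an intro from 2010f carrying an empty value gives a warning and then a rejection, which is the sequence in the SC-1 log. Setting IMMSV_PBE_FILE the same way in immnd.conf on both SCs avoids the mismatch.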


-AVM

On 2/19/2016 9:40 AM, A V Mahesh wrote:
> Hi Zoran/Neel,
>
> I have the cloud resilience feature enabled, with OpenSAF RPMs built with
> PBE (--enable-imm-pbe), but the PBE configuration is not enabled
> (`#export IMMSV_PBE_FILE=imm.db` is still commented out in immnd.conf).
> With the 4-node cluster in a stable state and a CPSV application running,
> I stopped both SCs and received Return Value: SA_AIS_ERR_TRY_AGAIN for
> Finalize ckptHandle, as expected. At that moment I started both SCs, but
> neither SC joined the cluster, failing with the error
> `WA PBE is configured at first attached SC-immnd, but no Pbe file is
> configured for immnd at node 2010f - rejecting node`
>
> Is there anything missing in the configuration?
>
> ===============================================================================================================
>
> Feb 19 09:20:43 SC-1 opensafd: Starting OpenSAF Services(5.0.M0 - )
> (Using TIPC)
> Starting OpenSAF Services (Using TIPC):Feb 19 09:20:43 SC-1
> osafrded[23407]: Started
> Feb 19 09:20:43 SC-1 osafrded[23407]: NO Peer rde@2020f has no state, my
> nodeid is less => Setting Active role
> Feb 19 09:20:43 SC-1 osaffmd[23416]: Started
> Feb 19 09:20:43 SC-1 osafimmd[23426]: Started
> Feb 19 09:20:43 SC-1 osafimmd[23426]: NO ******* SC_ABSENCE_ALLOWED
> (Headless Hydra) is configured: 900 ***********
> Feb 19 09:20:43 SC-1 osafimmd[23426]: NO Waiting 3 seconds to allow
> IMMND MDS attachments to get processed.
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND
> process at node 2030f old epoch: 0  new epoch:4
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Ruling epoch changed to:4
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of fevs count from 0 to
> 2628 from 2030f.
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of admoId count from 0
> to 4 from 2030f.
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of ccbId count from 0
> to 2 from 2030f.
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Refresh of impl count from 0 to
> 14 from 2030f.
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Attached Nodes:2 Accepted
> nodes:0 KnownVeteran:1 doReply:1
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO First Veteran IMMND found
> (payload) at 2030f this IMMD at 2010f. Apparent IMMD lapse, *not* 2PBE
> => designating that IMMND as coordinator
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND
> process at node 2040f old epoch: 0  new epoch:4
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted
> nodes:1 KnownVeteran:1 doReply:1
> Feb 19 09:20:46 SC-1 osafimmnd[23437]: Started
> Feb 19 09:20:46 SC-1 osafimmnd[23437]: NO IMMD service is UP ...
> ScAbsenseAllowed?:0 introduced?:0
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO New IMMND process is on STANDBY
> Controller at 2020f
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Extended intro from node 2020f
> Feb 19 09:20:46 SC-1 osafimmd[23426]: WA PBE not configured at first
> attached SC-immnd, but Pbe is configured for immnd at 2020f - possible
> upgrade from pre 4.4
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Attached Nodes:4 Accepted
> nodes:2 KnownVeteran:0 doReply:1
> Feb 19 09:20:46 SC-1 osafimmnd[23437]: NO SERVER STATE:
> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO New IMMND process is on ACTIVE
> Controller at 2010f
> Feb 19 09:20:46 SC-1 osafimmd[23426]: WA PBE is configured at first
> attached SC-immnd, but no Pbe file is configured for immnd at node 2010f
> - rejecting node
> Feb 19 09:20:46 SC-1 osafimmnd[23437]: NO SETTING COORD TO 0 CLOUD PROTO
> Feb 19 09:20:46 SC-1 osafimmnd[23437]: ER IMMND forced to restart on
> order from IMMD, exiting
> Feb 19 09:20:46 SC-1 opensafd[23368]: ER Failed   DESC:IMMND
> Feb 19 09:20:46 SC-1 opensafd[23368]: ER Going for recovery
> Feb 19 09:20:46 SC-1 opensafd[23368]: ER Trying To RESPAWN
> /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #1
> Feb 19 09:20:46 SC-1 opensafd[23368]: ER Sending SIGKILL to IMMND, pid=23431
> Feb 19 09:20:46 SC-1 osafimmd[23426]: WA Error returned from processing
> message err:2 msg-type:2
> Feb 19 09:20:46 SC-1 osafimmd[23426]: WA IMMND on controller (not
> currently coord) requests sync
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Sc Absence Allowed is
> configured (900) => IMMND coord at payload node:2030f dest566313894805508
> Feb 19 09:20:46 SC-1 osafimmd[23426]: NO Node 2020f request sync
> sync-pid:8766 epoch:0
> Feb 19 09:20:48 SC-1 osafimmd[23426]: NO Successfully announced sync.
> New ruling epoch:5
> Feb 19 09:20:49 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND
> process at node 2030f old epoch: 4  new epoch:5
> Feb 19 09:20:49 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted
> nodes:3 KnownVeteran:0 doReply:0
> Feb 19 09:20:49 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND
> process at node 2040f old epoch: 4  new epoch:5
> Feb 19 09:20:49 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted
> nodes:3 KnownVeteran:0 doReply:0
> Feb 19 09:20:49 SC-1 osafimmd[23426]: NO ACT: New Epoch for IMMND
> process at node 2020f old epoch: 0  new epoch:5
> Feb 19 09:20:49 SC-1 osafimmd[23426]: NO Attached Nodes:3 Accepted
> nodes:3 KnownVeteran:0 doReply:0
> Feb 19 09:21:01 SC-1 osafimmnd[23462]: Started
> Feb 19 09:21:01 SC-1 osafimmnd[23462]: NO IMMD service is UP ...
> ScAbsenseAllowed?:0 introduced?:0
> Feb 19 09:21:02 SC-1 osafimmd[23426]: NO New IMMND process is on ACTIVE
> Controller at 2010f
> Feb 19 09:21:02 SC-1 osafimmd[23426]: WA PBE is configured at first
> attached SC-immnd, but no Pbe file is configured for immnd at node 2010f
> - rejecting node
> Feb 19 09:21:02 SC-1 osafimmd[23426]: WA Error returned from processing
> message err:2 msg-type:2
> Feb 19 09:21:02 SC-1 osafimmnd[23462]: NO SERVER STATE:
> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
> Feb 19 09:21:02 SC-1 osafimmnd[23462]: NO SETTING COORD TO 0 CLOUD PROTO
> Feb 19 09:21:02 SC-1 osafimmnd[23462]: ER IMMND forced to restart on
> order from IMMD, exiting
> Feb 19 09:21:02 SC-1 opensafd[23368]: ER Could Not RESPAWN IMMND
> Feb 19 09:21:02 SC-1 opensafd[23368]: ER Failed   DESC:IMMND
> Feb 19 09:21:02 SC-1 opensafd[23368]: ER Trying To RESPAWN
> /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #2
> Feb 19 09:21:02 SC-1 opensafd[23368]: ER Sending SIGKILL to IMMND, pid=23456
> Feb 19 09:21:17 SC-1 osafimmnd[23487]: Started
> Feb 19 09:21:17 SC-1 osafimmnd[23487]: NO IMMD service is UP ...
> ScAbsenseAllowed?:0 introduced?:0
> Feb 19 09:21:17 SC-1 osafimmd[23426]: NO New IMMND process is on ACTIVE
> Controller at 2010f
> Feb 19 09:21:17 SC-1 osafimmd[23426]: WA PBE is configured at first
> attached SC-immnd, but no Pbe file is configured for immnd at node 2010f
> - rejecting node
> Feb 19 09:21:17 SC-1 osafimmd[23426]: WA Error returned from processing
> message err:2 msg-type:2
> Feb 19 09:21:17 SC-1 osafimmnd[23487]: NO SERVER STATE:
> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
> Feb 19 09:21:17 SC-1 osafimmnd[23487]: NO SETTING COORD TO 0 CLOUD PROTO
> Feb 19 09:21:17 SC-1 osafimmnd[23487]: ER IMMND forced to restart on
> order from IMMD, exiting
> Feb 19 09:21:17 SC-1 opensafd[23368]: ER Could Not RESPAWN IMMND
> Feb 19 09:21:17 SC-1 opensafd[23368]: ER Failed   DESC:IMMND
> Feb 19 09:21:17 SC-1 opensafd[23368]: ER FAILED TO RESPAWN
> Feb 19 09:21:17 SC-1 osaffmd[23416]: exiting for shutdown
> Feb 19 09:21:17 SC-1 osafimmd[23426]: exiting for shutdown
> Feb 19 09:21:17 SC-1 osafrded[23407]: exiting for shutdown
> Feb 19 09:21:17 SC-1 opensafd: warning: TIPC module unloading failed
> failed
> Feb 19 09:21:17 SC-1 opensafd: Starting OpenSAF failed
>
>
> Feb 19 09:20:44 SC-2 opensafd: OpenSAF services successfully stopped
> Feb 19 09:20:44 SC-2 opensafd: Starting OpenSAF Services(5.0.M0 - )
> (Using TIPC)
> Feb 19 09:20:44 SC-2 kernel: [161176.504234] tipc: Activated (version 2.0.0)
> Feb 19 09:20:44 SC-2 kernel: [161176.504623] NET: Registered protocol
> family 30
> Feb 19 09:20:44 SC-2 kernel: [161176.504626] tipc: Started in single
> node mode
> Feb 19 09:20:44 SC-2 kernel: [161176.512492] tipc: Started in network mode
> Feb 19 09:20:44 SC-2 kernel: [161176.512497] tipc: Own node address
> <1.1.2>, network identity 7777
> Feb 19 09:20:44 SC-2 kernel: [161176.514875] tipc: Enabled bearer
> <eth:eth3>, discovery domain <1.1.0>, priority 10
> Feb 19 09:20:44 SC-2 kernel: [161176.515726] tipc: Enabled bearer
> <eth:eth2>, discovery domain <1.1.0>, priority 10
> Feb 19 09:20:44 SC-2 kernel: [161176.516587] tipc: Established link
> <1.1.2:eth2-1.1.1:eth1> on network plane B
> Feb 19 09:20:44 SC-2 kernel: [161176.516643] tipc: Established link
> <1.1.2:eth3-1.1.3:eth4> on network plane A
> Feb 19 09:20:44 SC-2 kernel: [161176.517021] tipc: Established link
> <1.1.2:eth3-1.1.4:eth0> on network plane A
> Feb 19 09:20:44 SC-2 kernel: [161176.518091] tipc: Established link
> <1.1.2:eth3-1.1.1:eth0> on network plane A
> Feb 19 09:20:44 SC-2 osafrded[8736]: Started
> Feb 19 09:20:44 SC-2 kernel: [161176.645456] tipc: Established link
> <1.1.2:eth2-1.1.4:eth2> on network plane B
> Feb 19 09:20:44 SC-2 kernel: [161176.645566] tipc: Established link
> <1.1.2:eth2-1.1.3:eth1> on network plane B
> Feb 19 09:20:46 SC-2 osafrded[8736]: NO Peer rde@2010f has no state, my
> nodeid is greater => Setting Standby role
> Feb 19 09:20:46 SC-2 osaffmd[8745]: Started
> Feb 19 09:20:46 SC-2 osafimmd[8755]: Started
> Feb 19 09:20:46 SC-2 osafimmd[8755]: NO ******* SC_ABSENCE_ALLOWED
> (Headless Hydra) is configured: 900 ***********
> Feb 19 09:20:46 SC-2 osafimmd[8755]: NO Waiting 3 seconds to allow IMMND
> MDS attachments to get processed.
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: Started
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO Persistent Back-End capability
> configured, Pbe file:imm.db (suffix may get added)
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO IMMD service is UP ...
> ScAbsenseAllowed?:0 introduced?:0
> Feb 19 09:20:49 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process
> at node 2040f old epoch: 0  new epoch:4
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SERVER STATE:
> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SETTING COORD TO 0 CLOUD PROTO
> Feb 19 09:20:49 SC-2 osafimmd[8755]: WA IMMND DOWN on active controller
> f1 detected at standby immd!! f2. Possible failover
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO Fevs count adjusted to 2628
> preLoadPid: 0
> Feb 19 09:20:49 SC-2 osafimmd[8755]: WA Message count:2629 + 1 != 2629
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2629
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: WA Error code 2 returned for
> message type 82 - ignoring
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SERVER STATE:
> IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO ABT REQUESTING SYNC
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO SERVER STATE:
> IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
> Feb 19 09:20:49 SC-2 osafimmnd[8766]: NO NODE STATE-> IMM_NODE_ISOLATED
> Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: Ruling epoch noted as:5
> Feb 19 09:20:51 SC-2 osafimmd[8755]: NO IMMND coord at 2030f
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO SERVER STATE:
> IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO NODE STATE->
> IMM_NODE_FULLY_AVAILABLE 2715
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO RepositoryInitModeT is
> SA_IMM_INIT_FROM_FILE
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: WA IMM Access Control mode is
> DISABLED!
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO Epoch set to 5 in ImmModel
> Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process
> at node 2030f old epoch: 4  new epoch:5
> Feb 19 09:20:51 SC-2 osafimmd[8755]: NO IMMND coord at 2030f
> Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process
> at node 2040f old epoch: 4  new epoch:5
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO Implementer connected: 15
> (MsgQueueService131855) <0, 2030f>
> Feb 19 09:20:51 SC-2 osafimmd[8755]: NO SBY: New Epoch for IMMND process
> at node 2020f old epoch: 0  new epoch:5
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO Implementer connected: 16
> (MsgQueueService132111) <0, 2040f>
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO SERVER STATE:
> IMM_SERVER_SYNC_CLIENT --> IMM_SERVER_READY
> Feb 19 09:20:51 SC-2 osafimmnd[8766]: NO ABT ImmModel received
> scAbsenceAllowed 900
> Feb 19 09:20:51 SC-2 osaflogd[8776]: Started
> Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LOGSV_DATA_GROUPNAME not found
> Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LOG root directory is:
> "/var/log/opensaf/saflog"
> Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LOG data group is: ""
> Feb 19 09:20:51 SC-2 osaflogd[8776]: NO LGS_MBCSV_VERSION = 5
> Feb 19 09:20:51 SC-2 osafntfd[8787]: Started
> Feb 19 09:21:01 SC-2 osafntfd[8787]: WA saLogInitialize returns try
> again, retries...
> Feb 19 09:21:04 SC-2 osafimmd[8755]: WA IMMND DOWN on active controller
> f1 detected at standby immd!! f2. Possible failover
> Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2806
> Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA Error code 2 returned for
> message type 82 - ignoring
> Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2807
> Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA Error code 2 returned for
> message type 82 - ignoring
> Feb 19 09:21:04 SC-2 osafimmnd[8766]: NO Global discard node received
> for nodeId:2010f pid:0
> Feb 19 09:21:04 SC-2 osafimmd[8755]: WA Message count:2808 + 1 != 2808
> Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2808
> Feb 19 09:21:04 SC-2 osafimmnd[8766]: WA Error code 2 returned for
> message type 82 - ignoring
> Feb 19 09:21:20 SC-2 osafimmd[8755]: WA IMMND DOWN on active controller
> f1 detected at standby immd!! f2. Possible failover
> Feb 19 09:21:20 SC-2 osafimmd[8755]: NO Skipping re-send of fevs message
> 2807 since it has recently been resent.
> Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2808
> Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA Error code 2 returned for
> message type 82 - ignoring
> Feb 19 09:21:20 SC-2 osafimmnd[8766]: NO Global discard node received
> for nodeId:2010f pid:0
> Feb 19 09:21:20 SC-2 osafimmd[8755]: WA Message count:2809 + 1 != 2809
> Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2809
> Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA Error code 2 returned for
> message type 82 - ignoring
> Feb 19 09:21:20 SC-2 osafimmd[8755]: WA IMMD lost contact with peer IMMD
> (NCSMDS_RED_DOWN)
> Feb 19 09:21:20 SC-2 osafimmd[8755]: NO Skipping re-send of fevs message
> 2808 since it has recently been resent.
> Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA DISCARD DUPLICATE FEVS message:2809
> Feb 19 09:21:20 SC-2 osafimmnd[8766]: WA Error code 2 returned for
> message type 82 - ignoring
> Feb 19 09:21:31 SC-2 opensafd[8704]: ER Timed-out for response from NTFD
> Feb 19 09:21:31 SC-2 opensafd[8704]: ER
> Feb 19 09:21:31 SC-2 opensafd[8704]: ER Going for recovery
> Feb 19 09:21:31 SC-2 opensafd[8704]: ER Trying To RESPAWN
> /usr/lib64/opensaf/clc-cli/osaf-ntfd attempt #1
> Feb 19 09:21:31 SC-2 opensafd[8704]: ER Sending SIGABRT to NTFD,
> pid=8787, (origin parent pid=8782)
> Feb 19 09:21:47 SC-2 osafntfd[8821]: Started
> Feb 19 09:21:57 SC-2 osafntfd[8821]: WA saLogInitialize returns try
> again, retries...
> Feb 19 09:22:27 SC-2 opensafd[8704]: ER Timed-out for response from NTFD
> Feb 19 09:22:27 SC-2 opensafd[8704]: ER Could Not RESPAWN NTFD
> Feb 19 09:22:27 SC-2 opensafd[8704]: ER
> Feb 19 09:22:27 SC-2 opensafd[8704]: ER Trying To RESPAWN
> /usr/lib64/opensaf/clc-cli/osaf-ntfd attempt #2
> Feb 19 09:22:27 SC-2 opensafd[8704]: ER Sending SIGABRT to NTFD,
> pid=8821, (origin parent pid=8816)
> Feb 19 09:22:42 SC-2 osafntfd[8851]: Started
> Feb 19 09:22:52 SC-2 osafntfd[8851]: WA saLogInitialize returns try
> again, retries...
> Feb 19 09:23:22 SC-2 opensafd[8704]: ER Timed-out for response from NTFD
> Feb 19 09:23:22 SC-2 opensafd[8704]: ER Could Not RESPAWN NTFD
> Feb 19 09:23:22 SC-2 opensafd[8704]: ER
> Feb 19 09:23:22 SC-2 opensafd[8704]: ER FAILED TO RESPAWN
> Feb 19 09:23:22 SC-2 osaffmd[8745]: exiting for shutdown
> Feb 19 09:23:22 SC-2 osafimmd[8755]: exiting for shutdown
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: WA SC Absence IS allowed:900 IMMD
> service is DOWN
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO IMMD SERVICE IS DOWN, HYDRA IS
> CONFIGURED => UNREGISTERING IMMND form MDS
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Removing client id:10002020f
> sv_id:27
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO ABT discard_connection OK
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO ABT Client node REMOVED
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO ABT DONE REMOVING CLIENTS
> ENTERING immModel_isolateThisNode(cb)
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Implementer disconnected 16 <0,
> 2040f(down)> (MsgQueueService132111)
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Impl Discarded node 2040f
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Implementer disconnected 15 <0,
> 2030f(down)> (MsgQueueService131855)
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO Impl Discarded node 2030f
> Feb 19 09:23:22 SC-2 osafimmnd[8766]: NO MDS unregisterede. sleeping ...
> Feb 19 09:23:23 SC-2 osafimmnd[8766]: NO Sleep done registering IMMND
> with MDS
> Feb 19 09:23:23 SC-2 osafimmnd[8766]: NO MDS: mds_register_callback:
> dest 2020f2a460010 already exist
> Feb 19 09:23:23 SC-2 osafimmnd[8766]: NO SUCCESS IN REGISTERING IMMND
> WITH MDS
> Feb 19 09:23:23 SC-2 osafimmnd[8766]: exiting for shutdown
> Feb 19 09:23:23 SC-2 osaflogd[8776]: exiting for shutdown
> Feb 19 09:23:23 SC-2 osafrded[8736]: exiting for shutdown
> Feb 19 09:23:23 SC-2 kernel: [161335.919458] tipc: Disabling bearer
> <eth:eth3>
> Feb 19 09:23:23 SC-2 kernel: [161335.919469] tipc: Lost link
> <1.1.2:eth3-1.1.3:eth4> on network plane A
> Feb 19 09:23:23 SC-2 kernel: [161335.919616] tipc: Lost link
> <1.1.2:eth3-1.1.4:eth0> on network plane A
> Feb 19 09:23:23 SC-2 kernel: [161335.919672] tipc: Lost link
> <1.1.2:eth3-1.1.1:eth0> on network plane A
> Feb 19 09:23:23 SC-2 kernel: [161335.919691] tipc: Disabling bearer
> <eth:eth2>
> Feb 19 09:23:23 SC-2 kernel: [161335.919694] tipc: Lost link
> <1.1.2:eth2-1.1.1:eth1> on network plane B
> Feb 19 09:23:23 SC-2 kernel: [161335.919697] tipc: Lost contact with <1.1.1>
> Feb 19 09:23:23 SC-2 kernel: [161335.919702] tipc: Lost link
> <1.1.2:eth2-1.1.4:eth2> on network plane B
> Feb 19 09:23:23 SC-2 kernel: [161335.919704] tipc: Lost contact with <1.1.4>
> Feb 19 09:23:23 SC-2 kernel: [161335.919708] tipc: Lost link
> <1.1.2:eth2-1.1.3:eth1> on network plane B
> Feb 19 09:23:23 SC-2 kernel: [161335.919710] tipc: Lost contact with <1.1.3>
> Feb 19 09:23:23 SC-2 kernel: [161335.919752] tipc: Left network mode
> Feb 19 09:23:23 SC-2 kernel: [161335.919914] NET: Unregistered protocol
> family 30
> Feb 19 09:23:23 SC-2 kernel: [161335.919920] tipc: Deactivated
> Feb 19 09:23:23 SC-2 opensafd: Starting OpenSAF failed
>
> ===============================================================================================================
>
> -AVM
>
>
> On 2/4/2016 3:11 PM, Hung Nguyen wrote:
>> Hi Zoran,
>>
>> Please find my comment inline.
>>
>> BR,
>>
>> Hung Nguyen - DEK Technologies
>>
>>
>> --------------------------------------------------------------------------------
>> From: Zoran Milinkovic zoran.milinko...@ericsson.com
>> Sent: Tuesday, December 22, 2015 9:14PM
>> To: Neelakanta Reddy
>>        reddy.neelaka...@oracle.com
>> Cc: Opensaf-devel
>>        opensaf-devel@lists.sourceforge.net
>> Subject: [devel] [PATCH 4 of 5] imm: add IMMND support for cloud resilience 
>> feature [#1625]
>>
>>
>>     osaf/services/saf/immsv/immnd/ImmModel.cc  |  115 ++++++++++++++++++++
>>     osaf/services/saf/immsv/immnd/ImmModel.hh  |    9 +-
>>     osaf/services/saf/immsv/immnd/immnd_cb.h   |   11 +-
>>     osaf/services/saf/immsv/immnd/immnd_evt.c  |  166 
>> ++++++++++++++++++++++++----
>>     osaf/services/saf/immsv/immnd/immnd_init.h |   13 ++-
>>     osaf/services/saf/immsv/immnd/immnd_main.c |    7 +
>>     osaf/services/saf/immsv/immnd/immnd_proc.c |  120 ++++++++++++++++----
>>     7 files changed, 381 insertions(+), 60 deletions(-)
>>
>>
>> The patch contains IMMND code that is needed for supporting cloud resilience 
>> feature.
>>
>> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.cc 
>> b/osaf/services/saf/immsv/immnd/ImmModel.cc
>> --- a/osaf/services/saf/immsv/immnd/ImmModel.cc
>> +++ b/osaf/services/saf/immsv/immnd/ImmModel.cc
>> @@ -446,6 +446,7 @@ static const std::string immPbeBSlaveNam
>>     static const std::string immLongDnsAllowed(OPENSAF_IMM_LONG_DNS_ALLOWED);
>>     static const std::string 
>> immAccessControlMode(OPENSAF_IMM_ACCESS_CONTROL_MODE);
>>     static const std::string 
>> immAuthorizedGroup(OPENSAF_IMM_AUTHORIZED_GROUP);
>> +static const std::string 
>> immScAbsenceAllowed(OPENSAF_IMM_SC_ABSENCE_ALLOWED);
>>     
>>     static const std::string immMngtClass("SaImmMngt");
>>     static const std::string 
>> immManagementDn("safRdn=immManagement,safApp=safImmService");
>> @@ -492,6 +493,17 @@ struct CcbIdIs
>>     };
>>     
>>     
>> +void
>> +immModel_setScAbsenceAllowed(IMMND_CB *cb)
>> +{
>> +    if(cb->mCanBeCoord == 4) {
>> +        osafassert(cb->mScAbsenceAllowed > 0);
>> +    } else {
>> +        osafassert(cb->mScAbsenceAllowed == 0);
>> +    }
>> +    
>> ImmModel::instance(&cb->immModel)->setScAbsenceAllowed(cb->mScAbsenceAllowed);
>> +}
>> +
>>     SaAisErrorT
>>     immModel_ccbResult(IMMND_CB *cb, SaUint32T ccbId)
>>     {
>> @@ -511,6 +523,32 @@ immModel_abortSync(IMMND_CB *cb)
>>     }
>>     
>>     void
>> +immModel_isolateThisNode(IMMND_CB *cb)
>> +{
>> +  ImmModel::instance(&cb->immModel)->isolateThisNode(cb->node_id, 
>> cb->mIsCoord);
>> +}
>> +
>> +void
>> +immModel_abortNonCriticalCcbs(IMMND_CB *cb)
>> +{
>> +    SaUint32T arrSize;
>> +    SaUint32T* implConnArr = NULL;
>> +    SaUint32T client;
>> +    SaClmNodeIdT pbeNodeId;
>> +    SaUint32T nodeId;
>> +    CcbVector::iterator i3 = sCcbVector.begin();
>> +    for(; i3!=sCcbVector.end(); ++i3) {
>> +        if((*i3)->mState < IMM_CCB_CRITICAL) {
>> +            osafassert(immModel_ccbAbort(cb, (*i3)->mId, &arrSize, 
>> &implConnArr, &client, &nodeId, &pbeNodeId));
>> +            osafassert(immModel_ccbFinalize(cb, (*i3)->mId) == SA_AIS_OK);
>> +            if (arrSize) {
>> +                free(implConnArr);
>> +            }
>> +        }
>> +    }
>> +}
>> +
>> +void
>>     immModel_pbePrtoPurgeMutations(IMMND_CB *cb, SaUint32T nodeId, SaUint32T 
>> *reqArrSize,
>>         SaUint32T **reqConnArr)
>>     {
>> @@ -17171,6 +17209,27 @@ ImmModel::getParentDn(std::string& paren
>>         TRACE_LEAVE();
>>     }
>>     
>> +void
>> +ImmModel::setScAbsenceAllowed(SaUint16T scAbsenceAllowed)
>> +{
>> +    ObjectMap::iterator oi = sObjectMap.find(immObjectDn);
>> +    osafassert(oi != sObjectMap.end());
>> +    ObjectInfo* immObject =  oi->second;
>> +    ImmAttrValueMap::iterator avi =
>> +        immObject->mAttrValueMap.find(immScAbsenceAllowed);
>> +    if(avi == immObject->mAttrValueMap.end()) {
>> +        LOG_WA("Attribue '%s' does not exist in object '%s'",
>> +            immScAbsenceAllowed.c_str(), immObjectDn.c_str());
>> +        return;
>> +    }
>> +
>> +    osafassert(!(avi->second->isMultiValued()));
>> +    ImmAttrValue* valuep = (ImmAttrValue *) avi->second;
>> +    valuep->setValue_int(scAbsenceAllowed);
>> +
>> +    LOG_NO("ABT ImmModel received scAbsenceAllowed %u", scAbsenceAllowed);
>> +}
>> +
>>     SaAisErrorT
>>     ImmModel::finalizeSync(ImmsvOmFinalizeSync* req, bool isCoord,
>>         bool isSyncClient)
>> @@ -18067,3 +18126,59 @@ ImmModel::finalizeSync(ImmsvOmFinalizeSy
>>         return err;
>>     }
>>     
>> +void
>> +ImmModel::isolateThisNode(unsigned int thisNode, bool isAtCoord)
>> +{
>> +    /* Move this logic up to immModel_isolate... No need for this extra 
>> level.
>> +       But need to abort and terminate ccbs.
>> +     */
>> +    ImplementerVector::iterator i;
>> +    AdminOwnerVector::iterator i2;
>> +    CcbVector::iterator i3;
>> +    unsigned int otherNode;
>> +
>> +    if((sImmNodeState != IMM_NODE_FULLY_AVAILABLE) && (sImmNodeState != 
>> IMM_NODE_R_AVAILABLE)) {
>> +        LOG_NO("SC abscence interrupted sync of this IMMND - exiting");
>> +        exit(0);
>> +    }
>> +
>> +    i = sImplementerVector.begin();
>> +    while(i != sImplementerVector.end()) {
>> +        IdVector cv, gv;
>> +        ImplementerInfo* info = (*i);
>> +        otherNode = info->mNodeId;
>> +        if(otherNode == thisNode || otherNode == 0) {
>> +            i++;
>> +        } else {
>> +            info = NULL;
>> +            this->discardNode(otherNode, cv, gv, isAtCoord);
>> +            LOG_NO("Impl Discarded node %x", otherNode);
>> +            /* Discard ccbs. */
>> +
>> +            i = sImplementerVector.begin(); /* restart iteration. */
>> +        }
>> +    }
>> +
>> +    i2 = sOwnerVector.begin();
>> +    while(i2 != sOwnerVector.end()) {
>> +        IdVector cv, gv;
>> +        AdminOwnerInfo* ainfo = (*i2);
>> +        otherNode = ainfo->mNodeId;
>> +        if(otherNode == thisNode || otherNode == 0) {
>> +            /* ??? (otherNode == 0) is that really correct ??? */
>> +            i2++;
>> +        } else {
>> +            ainfo = NULL;
>> +            this->discardNode(otherNode, cv, gv, isAtCoord);
>> +            LOG_NO("Admo Discarded node %x", otherNode);
>> +            /* Discard ccbs */
>> +
>> +            i2 =  sOwnerVector.begin(); /* restart iteration. */
>> +        }
>> +    }
>> +
>> +    /* Verify that all noncritical CCBs are aborted.
>> +       Ccbs where client resided at this node chould already have been 
>> handled in
>> +       immnd_proc_discard_other_nodes() that calls 
>> immnd_proc_imma_discard_connection()
>> +     */
>> +}
>> diff --git a/osaf/services/saf/immsv/immnd/ImmModel.hh 
>> b/osaf/services/saf/immsv/immnd/ImmModel.hh
>> --- a/osaf/services/saf/immsv/immnd/ImmModel.hh
>> +++ b/osaf/services/saf/immsv/immnd/ImmModel.hh
>> @@ -145,12 +145,6 @@ public:
>>                                                 const immsv_octet_string* 
>> clName,
>>                                                 ImmsvOmClassDescr* res);
>>         
>> -    SaAisErrorT         classSerialize(
>> -                                       const char* className,
>> -                                       char** data,
>> -                                       size_t* size);
>> -
>> -
>>         SaAisErrorT         attrCreate(
>>                                        ClassInfo* classInfo,
>>                                        const ImmsvAttrDefinition* attr,
>> @@ -480,6 +474,8 @@ public:
>>                                           const struct 
>> ImmsvAdminOperationParam *reqparams,
>>                                           struct ImmsvAdminOperationParam 
>> **rparams,
>>                                           SaUint64T searchcount);
>> +
>> +    void              setScAbsenceAllowed(SaUint16T scAbsenceAllowed);
>>         
>>         SaAisErrorT       objectSync(const ImmsvOmObjectSync* req);
>>         bool              fetchRtUpdate(ImmsvOmObjectSync* syncReq,
>> @@ -517,6 +513,7 @@ public:
>>         void              recognizedIsolated();
>>         bool              syncComplete(bool isJoining);
>>         void              abortSync();
>> +    void              isolateThisNode(unsigned int thisNode, bool 
>> isAtCoord);
>>         void              pbePrtoPurgeMutations(unsigned int nodeId, 
>> ConnVector& connVector);
>>         SaAisErrorT       ccbResult(SaUint32T ccbId);
>>         ImmsvAttrNameList * ccbGrabErrStrings(SaUint32T ccbId);
>> diff --git a/osaf/services/saf/immsv/immnd/immnd_cb.h 
>> b/osaf/services/saf/immsv/immnd/immnd_cb.h
>> --- a/osaf/services/saf/immsv/immnd/immnd_cb.h
>> +++ b/osaf/services/saf/immsv/immnd/immnd_cb.h
>> @@ -113,13 +113,17 @@ typedef struct immnd_cb_tag {
>>      SaUint32T mMyEpoch;     //Epoch counter, used in synch of immnds
>>      SaUint32T mMyPid;       //Is this needed ??
>>      SaUint32T mRulingEpoch;
>> -    uint8_t mAccepted;              //Should all fevs messages be processed?
>> +    SaUint32T mLatestAdmoId;
>> +    SaUint32T mLatestImplId;
>> +    SaUint32T mLatestCcbId;
>> +
>> +    uint8_t mAccepted; //If=!0 Fevs messages can be processed. 2=>IMMD 
>> re-introduce.
>>      uint8_t mIntroduced;    //Ack received on introduce message
>>      uint8_t mSyncRequested; //true=> I am coord, other req sync
>>      uint8_t mPendSync;              //1=>sync announced but not received.
>>      uint8_t mSyncFinalizing;   //1=>finalizeSync sent but not received.
>>      uint8_t mSync;          //true => this node is being synced (client).
>> -    uint8_t mCanBeCoord;    //If!=0 then SC, if 2 the 2pbe arbitration.
>> +    uint8_t mCanBeCoord;    //If!=0 then SC, 2 => 2pbe arbitration, 4 => 
>> absentScAllowed.
>>      uint8_t mIsCoord;
>>      uint8_t mLostNodes;       //Detached & not syncreq => delay sync start
>>      uint8_t mBlockPbeEnable;  //Current PBE has not completed shutdown yet.
>> @@ -128,6 +132,8 @@ typedef struct immnd_cb_tag {
>>      bool mIsOtherScUp; //If set & this is an SC then other SC is up(2pbe).
>>                 //False=> *allow* 1safe 2pbe. May err conservatively (true)
>>      bool mForceClean; //true => Force cleanTheHouse to run once *now*.
>> +    SaUint16T mScAbsenceAllowed; /* Non zero if "headless Hydra" allowed 
>> (loss of both IMMDs/SCs).
>> +                                   Value is number of seconds of SC absence 
>> tolerated. */
>>     
>>      /* Information about the IMMD */
>>      MDS_DEST immd_mdest_id;
>> @@ -161,6 +167,7 @@ typedef struct immnd_cb_tag {
>>      uint8_t mPbeVeteran;       //false => regenerate. true => re-attach 
>> db-file
>>      uint8_t mPbeVeteranB;      //false => regenerate. true => re-attach 
>> db-file
>>      uint8_t mPbeOldVeteranB;   //false => restarted,  true => stable. (only 
>> to reduce logging).
>> +    uint8_t mPbeUsesSharedFs;  //false => not use SFS, true => use SFS
>>     
>>      SaAmfHAStateT ha_state; // present AMF HA state of the component
>>      EDU_HDL immnd_edu_hdl;  // edu handle, obscurely needed by mds.
>> diff --git a/osaf/services/saf/immsv/immnd/immnd_evt.c 
>> b/osaf/services/saf/immsv/immnd/immnd_evt.c
>> --- a/osaf/services/saf/immsv/immnd/immnd_evt.c
>> +++ b/osaf/services/saf/immsv/immnd/immnd_evt.c
>> @@ -75,9 +75,9 @@ static void immnd_evt_proc_admo_finalize
>>                                       IMMND_EVT *evt,
>>                                       SaBoolT originatedAtThisNd, 
>> SaImmHandleT clnt_hdl, MDS_DEST reply_dest);
>>     
>> -static void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb,
>> -                                          IMMND_EVT *evt,
>> -                                          SaBoolT originatedAtThisNd, 
>> SaImmHandleT clnt_hdl, MDS_DEST reply_dest);
>> +//static void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb,
>> +//                                        IMMND_EVT *evt,
>> +//                                        SaBoolT originatedAtThisNd, 
>> SaImmHandleT clnt_hdl, MDS_DEST reply_dest);
>>     
>>     static void immnd_evt_proc_admo_set(IMMND_CB *cb,
>>                                  IMMND_EVT *evt,
>> @@ -1515,7 +1515,7 @@ static uint32_t immnd_evt_proc_search_ne
>>                         on a previous syncronous call. Discard the 
>> connection and return
>>                         BAD_HANDLE to allow client to recover and make 
>> progress.
>>                       */
>> -                    immnd_proc_imma_discard_connection(cb, cl_node);
>> +                    immnd_proc_imma_discard_connection(cb, cl_node, false);
>>                      rc = immnd_client_node_del(cb, cl_node);
>>                      osafassert(rc  == NCSCC_RC_SUCCESS);
>>                      free(cl_node);
>> @@ -1973,7 +1973,7 @@ static uint32_t immnd_evt_proc_imm_final
>>              goto agent_rsp;
>>      }
>>     
>> -    immnd_proc_imma_discard_connection(cb, cl_node);
>> +    immnd_proc_imma_discard_connection(cb, cl_node, false);
>>     
>>      rc = immnd_client_node_del(cb, cl_node);
>>      if (rc == NCSCC_RC_FAILURE) {
>> @@ -2197,9 +2197,11 @@ static uint32_t immnd_evt_proc_imm_clien
>>         cl_node->mIsResurrect = 0x1;
>>     
>>         if (immnd_client_node_add(cb, cl_node) != NCSCC_RC_SUCCESS) {
>> +#if 0 //CLOUD-PROTO  ABT clients should be discarded !!!!
>>          LOG_ER("IMMND - Adding temporary imma client Failed.");
>>          /*free(cl_node);*/
>>          abort();
>> +#endif
>>         }
>>     
>>         TRACE_2("Added client with id: %llx <node:%x, count:%u>",
>> @@ -2314,7 +2316,7 @@ static uint32_t immnd_evt_proc_admowner_
>>                         on a previous syncronous call. Discard the 
>> connection and return
>>                         BAD_HANDLE to allow client to recover and make 
>> progress.
>>                       */
>> -                    immnd_proc_imma_discard_connection(cb, cl_node);
>> +                    immnd_proc_imma_discard_connection(cb, cl_node, false);
>>                      rc = immnd_client_node_del(cb, cl_node);
>>                      osafassert(rc  == NCSCC_RC_SUCCESS);
>>                      free(cl_node);
>> @@ -2442,7 +2444,7 @@ static uint32_t immnd_evt_proc_impl_set(
>>                         on a previous syncronous call. Discard the 
>> connection and return
>>                         BAD_HANDLE to allow client to recover and make 
>> progress.
>>                       */
>> -                    immnd_proc_imma_discard_connection(cb, cl_node);
>> +                    immnd_proc_imma_discard_connection(cb, cl_node, false);
>>                      rc = immnd_client_node_del(cb, cl_node);
>>                      osafassert(rc  == NCSCC_RC_SUCCESS);
>>                      free(cl_node);
>> @@ -2573,7 +2575,7 @@ static uint32_t immnd_evt_proc_ccb_init(
>>                         on a previous syncronous call. Discard the 
>> connection and return
>>                         BAD_HANDLE to allow client to recover and make 
>> progress.
>>                       */
>> -                    immnd_proc_imma_discard_connection(cb, cl_node);
>> +                    immnd_proc_imma_discard_connection(cb, cl_node, false);
>>                      rc = immnd_client_node_del(cb, cl_node);
>>                      osafassert(rc  == NCSCC_RC_SUCCESS);
>>                      free(cl_node);
>> @@ -2680,7 +2682,7 @@ static uint32_t immnd_evt_proc_rt_update
>>                         on a previous syncronous call. Discard the 
>> connection and return
>>                         BAD_HANDLE to allow client to recover and make 
>> progress.
>>                       */
>> -                    immnd_proc_imma_discard_connection(cb, cl_node);
>> +                    immnd_proc_imma_discard_connection(cb, cl_node, false);
>>                      rc = immnd_client_node_del(cb, cl_node);
>>                      osafassert(rc  == NCSCC_RC_SUCCESS);
>>                      free(cl_node);
>> @@ -2866,7 +2868,7 @@ static uint32_t immnd_evt_proc_fevs_forw
>>                                 on a previous syncronous call. Discard the 
>> connection and return
>>                                 BAD_HANDLE to allow client to recover and 
>> make progress.
>>                              */
>> -                            immnd_proc_imma_discard_connection(cb, cl_node);
>> +                            immnd_proc_imma_discard_connection(cb, cl_node, 
>> false);
>>                              rc = immnd_client_node_del(cb, cl_node);
>>                              osafassert(rc  == NCSCC_RC_SUCCESS);
>>                              free(cl_node);
>> @@ -8317,7 +8319,7 @@ uint32_t immnd_evt_proc_abort_sync(IMMND
>>              if (cb->mState == IMM_SERVER_SYNC_CLIENT ||
>>                      cb->mState == IMM_SERVER_SYNC_PENDING) {        /* Sync 
>> client will have to restart the sync */
>>                      cb->mState = IMM_SERVER_LOADING_PENDING;
>> -                    LOG_WA("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM 
>> SERVER LOADING PENDING (sync aborted)");
>> +                    LOG_WA("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> 
>> IMM_SERVER_LOADING_PENDING (sync aborted)");
>>                      cb->mStep = 0;
>>                      cb->mJobStart = time(NULL);
>>                      osafassert(cb->mJobStart >= ((time_t) 0));
>> @@ -8451,6 +8453,7 @@ static uint32_t immnd_evt_proc_start_syn
>>                 with respect to the just arriving start-sync.
>>                 Search for "ticket:#598" in immnd_proc.c
>>               */
>> +            immModel_setScAbsenceAllowed(cb);
>>      } else if ((cb->mState == IMM_SERVER_SYNC_CLIENT) && 
>> (immnd_syncComplete(cb, SA_FALSE, cb->mStep))) {
>>              cb->mStep = 0;
>>              cb->mJobStart = time(NULL);
>> @@ -8467,6 +8470,7 @@ static uint32_t immnd_evt_proc_start_syn
>>                 with respect to the just arriving start-sync.
>>                 Search for "ticket:#599" in immnd_proc.c
>>               */
>> +            immModel_setScAbsenceAllowed(cb);
>>      }
>>     
>>      cb->mRulingEpoch = evt->info.ctrl.rulingEpoch;
>> @@ -8543,7 +8547,7 @@ static uint32_t immnd_evt_proc_start_syn
>>     static uint32_t immnd_evt_proc_reset(IMMND_CB *cb, IMMND_EVT *evt, 
>> IMMSV_SEND_INFO *sinfo)
>>     {
>>      TRACE_ENTER();
>> -    if (cb->mIntroduced) {
>> +    if (cb->mIntroduced==1) {
>>              LOG_ER("IMMND forced to restart on order from IMMD, exiting");
>>              if(cb->mState < IMM_SERVER_READY) {
>>                      immnd_ackToNid(NCSCC_RC_FAILURE);
>> @@ -8668,11 +8672,15 @@ static uint32_t immnd_evt_proc_intro_rsp
>>              evt->info.ctrl.nodeId != cb->node_id);
>>      cb->mNumNodes++;
>>      TRACE("immnd_evt_proc_intro_rsp cb->mNumNodes: %u", cb->mNumNodes);
>> +    LOG_IN("immnd_evt_proc_intro_rsp: epoch:%i rulingEpoch:%u", 
>> cb->mMyEpoch, evt->info.ctrl.rulingEpoch);
>> +    if(evt->info.ctrl.rulingEpoch > cb->mRulingEpoch) {
>> +            cb->mRulingEpoch = evt->info.ctrl.rulingEpoch;
>> +    }
>>     
>>      if (evt->info.ctrl.nodeId == cb->node_id) {
>>              /*This node was introduced to the IMM cluster */
>>              uint8_t oldCanBeCoord = cb->mCanBeCoord;
>> -            cb->mIntroduced = true;
>> +            cb->mIntroduced = 1;
>>              if(evt->info.ctrl.canBeCoord == 3) {
>>                      cb->m2Pbe = 1;
>>                      evt->info.ctrl.canBeCoord = 1;
>> @@ -8708,6 +8716,14 @@ static uint32_t immnd_evt_proc_intro_rsp
>>                              ((oldCanBeCoord == 2)?"load":"sync"));
>>              }
>>     
>> +            if(cb->mCanBeCoord == 4) {
>> +                    osafassert(!(cb->m2Pbe));
>> +                    cb->mScAbsenceAllowed =  evt->info.ctrl.ndExecPid;
>> +                    LOG_IN("ABT cb->mScAbsenceAllowed:%u 
>> evt->info.ctrl.ndExecPid:%u", cb->mScAbsenceAllowed, 
>> evt->info.ctrl.ndExecPid);
>> +                    LOG_IN("SC_ABSENCE_ALLOWED (Headless Hydra) is 
>> configured for %u seconds. CanBeCoord:%u",
>> +                            cb->mScAbsenceAllowed, cb->mCanBeCoord);
>> +            }
>> +
>>              if (evt->info.ctrl.isCoord) {
>>                      if (cb->mIsCoord) {
>>                              LOG_NO("This IMMND re-elected coord 
>> redundantly, failover ?");
>> @@ -8733,7 +8749,14 @@ static uint32_t immnd_evt_proc_intro_rsp
>>                              
>>                      }
>>              }
>> -            cb->mIsCoord = evt->info.ctrl.isCoord;
>> +            if(cb->mIsCoord) {
>> +                    if(!(evt->info.ctrl.isCoord)) {
>> +                            LOG_NO("ABT CLOUD PROTO avoided canceling coord 
>> - SHOULD NOT GET HERE");
>> +                    }
>> +            } else {
>> +                    LOG_NO("SETTING COORD TO %u CLOUD PROTO", 
>> evt->info.ctrl.isCoord);
>> +                    cb->mIsCoord = evt->info.ctrl.isCoord;
>> +            }
>>              osafassert(!cb->mIsCoord || cb->mCanBeCoord);
>>              cb->mRulingEpoch = evt->info.ctrl.rulingEpoch;
>>              if (cb->mRulingEpoch) {
>> @@ -8751,7 +8774,7 @@ static uint32_t immnd_evt_proc_intro_rsp
>>              
>>              */
>>              if(cb->mCanBeCoord && evt->info.ctrl.canBeCoord) {
>> -                    LOG_IN("Other SC node (%x) has been introduced", 
>> evt->info.ctrl.nodeId);
>> +                    LOG_IN("Other %s IMMND node (%x) has been introduced", 
>> (cb->mScAbsenceAllowed)?"candidate coord":"SC", evt->info.ctrl.nodeId);
>>                      cb->mIsOtherScUp = true; /* Prevents oneSafe2PBEAllowed 
>> from being turned on */
>>                      cb->other_sc_node_id = evt->info.ctrl.nodeId;
>>                      
>> @@ -9066,7 +9089,9 @@ static void immnd_evt_proc_adminit_rsp(I
>>      SaUint32T conn;
>>      SaUint32T ownerId = 0;
>>     
>> -    osafassert(evt);
>> +    /* Remember latest admo_id for IMMD recovery. */
>> +    cb->mLatestAdmoId = evt->info.adminitGlobal.globalOwnerId;
>> +
>>      conn = m_IMMSV_UNPACK_HANDLE_HIGH(clnt_hdl);
>>      nodeId = m_IMMSV_UNPACK_HANDLE_LOW(clnt_hdl);
>>      ownerId = evt->info.adminitGlobal.globalOwnerId;
>> @@ -9231,6 +9256,45 @@ static void immnd_evt_proc_finalize_sync
>>                      /*This adjust-epoch will persistify the new epoch for: 
>> veterans. */
>>                      immnd_adjustEpoch(cb, SA_TRUE); /* Will osafassert if 
>> immd is down. */
>>              }
>> +
>> +            if(cb->mScAbsenceAllowed) {/* Coord and veteran nodes. */
>> +                    IMMND_IMM_CLIENT_NODE *cl_node = NULL;
>> +                    SaImmHandleT prev_hdl;
>> +                    unsigned int count = 0;
>> +                    IMMSV_EVT send_evt;
>> +                    /* Sync completed for veteran & headless allowed => 
>> trigger active
>> +                       resurrect. */
>> +                    memset(&send_evt, '\0', sizeof(IMMSV_EVT));
>> +                    send_evt.type = IMMSV_EVT_TYPE_IMMA;
>> +                    send_evt.info.imma.type = 
>> IMMA_EVT_ND2A_PROC_STALE_CLIENTS;
>> +                    immnd_client_node_getnext(cb, 0, &cl_node);
>> +                    while (cl_node) {
>> +                            prev_hdl = cl_node->imm_app_hdl;
>> +                            if(!(cl_node->mIsResurrect)) {
>> +                                    LOG_IN("Veteran node found active 
>> client id: %llx "
>> +                                            "version:%c %u %u, after sync.",
>> +                                            cl_node->imm_app_hdl, 
>> cl_node->version.releaseCode,
>> +                                            cl_node->version.majorVersion,
>> +                                            cl_node->version.minorVersion);
>> +                                    immnd_client_node_getnext(cb, prev_hdl, 
>> &cl_node);
>> +                                    continue;
>> +                            }
>> +                            /* Send resurrect message. */
>> +                            if (immnd_mds_msg_send(cb, cl_node->sv_id,
>> +                                            cl_node->agent_mds_dest, 
>> &send_evt)!=NCSCC_RC_SUCCESS)
>> +                            {
>> +                                    LOG_WA("Failed to send active resurrect 
>> message");
>> +                            }
>> +                            /* Remove the temporary client node. */
>> +                            immnd_client_node_del(cb, cl_node);
>> +                            memset(cl_node, '\0', 
>> sizeof(IMMND_IMM_CLIENT_NODE));
>> +                            free(cl_node);
>> +                            cl_node = NULL;
>> +                            ++count;
>> +                            immnd_client_node_getnext(cb, 0, &cl_node);
>> +                    }
>> +                    TRACE_2("Triggered %u active resurrects at veteran 
>> node", count);
>> +            }
>>      }
>>     
>>      done:
>> @@ -9485,7 +9549,7 @@ static void immnd_evt_proc_admo_finalize
>>      *                                         is to be sent (only relevant 
>> if
>>      *                                         originatedAtThisNode is 
>> false).
>>      
>> *****************************************************************************/
>> -static void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb,
>> +void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb,
>>                                            IMMND_EVT *evt,
>>                                            SaBoolT originatedAtThisNd, 
>> SaImmHandleT clnt_hdl, MDS_DEST reply_dest)
>>     {
>> @@ -9550,6 +9614,9 @@ static void immnd_evt_proc_impl_set_rsp(
>>              evt->info.implSet.oi_timeout = 0;
>>      }
>>     
>> +    /* Remember latest impl_id for IMMD recovery. */
>> +    cb->mLatestImplId =  evt->info.implSet.impl_id;
>> +
>>      err = immModel_implementerSet(cb, &(evt->info.implSet.impl_name),
>>                      (originatedAtThisNd) ? conn : 0, nodeId, implId,
>>                      reply_dest, evt->info.implSet.oi_timeout, 
>> &discardImplementer);
>> @@ -9934,6 +10001,9 @@ static void immnd_evt_proc_ccbinit_rsp(I
>>      nodeId = m_IMMSV_UNPACK_HANDLE_LOW(clnt_hdl);
>>      ccbId = evt->info.ccbinitGlobal.globalCcbId;
>>     
>> +    /* Remember latest ccb_id for IMMD recovery. */
>> +    cb->mLatestCcbId =  evt->info.ccbinitGlobal.globalCcbId;
>> +
>>      err = immModel_ccbCreate(cb,
>>                               evt->info.ccbinitGlobal.i.adminOwnerId,
>>                               evt->info.ccbinitGlobal.i.ccbFlags,
>> @@ -10053,12 +10123,61 @@ static uint32_t immnd_evt_proc_mds_evt(I
>>              immnd_proc_imma_down(cb, evt->info.mds_info.dest, 
>> evt->info.mds_info.svc_id);
>>      } else if ((evt->info.mds_info.change == NCSMDS_DOWN) && 
>> evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMD) {
>>              /* Cluster is going down. */
>> -            LOG_NO("No IMMD service => cluster restart, exiting");
>> -            if(cb->mState < IMM_SERVER_SYNC_SERVER) {
>> -                    immnd_ackToNid(NCSCC_RC_FAILURE);
>> -            }
>> -            exit(1);
>> -
>> +            if(cb->mScAbsenceAllowed == 0) {
>> +                    /* Regular (non Hydra) exit on IMMD DOWN. */
>> +                    LOG_ER("No IMMD service => cluster restart, exiting");
>> +                    if(cb->mState < IMM_SERVER_SYNC_SERVER) {
>> +                            immnd_ackToNid(NCSCC_RC_FAILURE);
>> +                    }
>> +                    exit(1);
>> +            } else { /* SC ABSENCE ALLOWED */
>> +                    LOG_WA("SC Absence IS allowed:%u IMMD service is DOWN", 
>> cb->mScAbsenceAllowed);
>> +                    if(cb->mIsCoord) {
>> +                            /* Note that normally the coord will reside at 
>> SCs so this branch will
>> +                               only be relevant if REPEATED total scAbsence 
>> occurs. After SC absence
>> +                               and subsequent return of SC, the coord will 
>> be elected at a payload.
>> +                               That coord will be active until restart of 
>> that payload..
>> +                               unless we add functionality for the payload 
>> coord to restart after
>> +                               a few minutes .. ?
>> +                            */
>> +                            LOG_WA("This IMMND coord has to exit allowing 
>> restarted IMMD to select new coord");
>> +                            if(cb->mState < IMM_SERVER_SYNC_SERVER) {
>> +                                    immnd_ackToNid(NCSCC_RC_FAILURE);
>> +                            }
>> +                            exit(1);
>> +                    } else if(cb->mState <= IMM_SERVER_LOADING_PENDING) {
>> +                            /* Reset state in payloads that had not joined. 
>> No need to restart. */
>> +                            LOG_IN("Resetting IMMND state from %u to 
>> IMM_SERVER_ANONYMOUS", cb->mState);
>> +                            cb->mState = IMM_SERVER_ANONYMOUS;
>> +                    } else if(cb->mState < IMM_SERVER_READY) {
>> +                            LOG_WA("IMMND was being synced or loaded (%u), 
>> has to restart", cb->mState);
>> +                            if(cb->mState < IMM_SERVER_SYNC_SERVER) {
>> +                                    immnd_ackToNid(NCSCC_RC_FAILURE);
>> +                            }
>> +                            exit(1);
>> +                    }
>> +            }
>> +            cb->mIntroduced = 2;
>> +            LOG_NO("IMMD SERVICE IS DOWN, HYDRA IS CONFIGURED => 
>> UNREGISTERING IMMND from MDS");
>> +            immnd_mds_unregister(cb);
>> +            /* Discard local clients ...  */
>> +            immnd_proc_discard_other_nodes(cb); /* Isolate from the rest of 
>> cluster */
>> +            LOG_NO("MDS unregistered. sleeping ...");
>> +            sleep(1);
>> +            LOG_NO("Sleep done registering IMMND with MDS");
>> +            rc = immnd_mds_register(immnd_cb);
>> +            if(rc == NCSCC_RC_SUCCESS) {
>> +                    LOG_NO("SUCCESS IN REGISTERING IMMND WITH MDS");
>> +            } else {
>> +                    LOG_ER("FAILURE IN REGISTERING IMMND WITH MDS - 
>> exiting");
>> +                    exit(1);
>> +            }
>> +    } else if ((evt->info.mds_info.change == NCSMDS_UP) && 
>> (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMD)) {
>> +            LOG_NO("IMMD service is UP ... ScAbsenceAllowed?:%u 
>> introduced?:%u",
>> +                       cb->mScAbsenceAllowed, cb->mIntroduced);
>> +            if((cb->mIntroduced==2) && (immnd_introduceMe(cb) != 
>> NCSCC_RC_SUCCESS)) {
>> +                    LOG_WA("IMMND re-introduceMe after IMMD restart failed, 
>> will retry");
>> +            }
>>      } else if ((evt->info.mds_info.change == NCSMDS_UP) &&
>>                 (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMA_OM ||
>>                  evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMA_OM)) {
>> @@ -10073,7 +10192,6 @@ static uint32_t immnd_evt_proc_mds_evt(I
>>              TRACE_2("IMMD FAILOVER");
>>              /* The IMMD has failed over. */
>>              immnd_proc_imma_discard_stales(cb);
>> -
>>      } else if (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMND) {
>>              LOG_NO("MDS SERVICE EVENT OF TYPE IMMND!!");
>>      }
>> diff --git a/osaf/services/saf/immsv/immnd/immnd_init.h 
>> b/osaf/services/saf/immsv/immnd/immnd_init.h
>> --- a/osaf/services/saf/immsv/immnd/immnd_init.h
>> +++ b/osaf/services/saf/immsv/immnd/immnd_init.h
>> @@ -39,8 +39,10 @@ extern IMMND_CB *immnd_cb;
>>     
>>     /* file : -  immnd_proc.c */
>>     
>> +void immnd_proc_discard_other_nodes(IMMND_CB *cb);
>> +
>>     void immnd_proc_imma_down(IMMND_CB *cb, MDS_DEST dest, NCSMDS_SVC_ID 
>> sv_id);
>> -uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, 
>> IMMND_IMM_CLIENT_NODE *cl_node);
>> +uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, 
>> IMMND_IMM_CLIENT_NODE *cl_node, bool scAbsenceAllowed);
>>     void immnd_proc_imma_discard_stales(IMMND_CB *cb);
>>     
>>     void immnd_cb_dump(void);
>> @@ -75,6 +77,10 @@ extern "C" {
>>     
>>      void immModel_abortSync(IMMND_CB *cb);
>>     
>> +    void immModel_isolateThisNode(IMMND_CB *cb);
>> +
>> +    void immModel_abortNonCriticalCcbs(IMMND_CB *cb);
>> +
>>      void immModel_pbePrtoPurgeMutations(IMMND_CB *cb, unsigned int nodeId, 
>> SaUint32T *reqArrSize,
>>              SaUint32T **reqConArr);
>>     
>> @@ -433,6 +439,8 @@ extern "C" {
>>              const char *errorString,
>>              ...);
>>     
>> +    void immModel_setScAbsenceAllowed(IMMND_CB *cb);
>> +
>>     #ifdef __cplusplus
>>     }
>>     #endif
>> @@ -471,6 +479,9 @@ uint32_t immnd_mds_get_handle(IMMND_CB *
>>     /* File : ----  immnd_evt.c */
>>     void immnd_process_evt(void);
>>     uint32_t immnd_evt_destroy(IMMSV_EVT *evt, SaBoolT onheap, uint32_t 
>> line);
>> +void immnd_evt_proc_admo_hard_finalize(IMMND_CB *cb, IMMND_EVT *evt,
>> +    SaBoolT originatedAtThisNd, SaImmHandleT clnt_hdl, MDS_DEST reply_dest);
>> +
>>     /* End : ----  immnd_evt.c  */
>>     
>>     /* File : ----  immnd_proc.c */
>> diff --git a/osaf/services/saf/immsv/immnd/immnd_main.c 
>> b/osaf/services/saf/immsv/immnd/immnd_main.c
>> --- a/osaf/services/saf/immsv/immnd/immnd_main.c
>> +++ b/osaf/services/saf/immsv/immnd/immnd_main.c
>> @@ -169,6 +169,13 @@ static uint32_t immnd_initialize(char *p
>>                      immnd_cb->mPbeFile);
>>      }
>>     
>> +    if ((envVar = getenv("IMMSV_USE_SHARED_FS"))) {
>> +            int useSharedFs = atoi(envVar);
>> +            if(useSharedFs != 0) {
>> +                    immnd_cb->mPbeUsesSharedFs = 1;
>> +            }
>> +    }
>> +
>>      immnd_cb->mRim = SA_IMM_INIT_FROM_FILE;
>>      immnd_cb->mPbeVeteran = SA_FALSE;
>>      immnd_cb->mPbeVeteranB = SA_FALSE;
>> diff --git a/osaf/services/saf/immsv/immnd/immnd_proc.c 
>> b/osaf/services/saf/immsv/immnd/immnd_proc.c
>> --- a/osaf/services/saf/immsv/immnd/immnd_proc.c
>> +++ b/osaf/services/saf/immsv/immnd/immnd_proc.c
>> @@ -34,6 +34,7 @@
>>     
>>     #include "immnd.h"
>>     #include "immsv_api.h"
>> +#include "immnd_init.h"
>>     
>>     static const char *loaderBase = "osafimmloadd";
>>     static const char *pbeBase = "osafimmpbed";
>> @@ -76,7 +77,7 @@ void immnd_proc_immd_down(IMMND_CB *cb)
>>      * Notes         : Policy used for handling immd down is to blindly 
>> cleanup
>>      *                :immnd_cb
>>      
>> ****************************************************************************/
>> -uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, 
>> IMMND_IMM_CLIENT_NODE *cl_node)
>> +uint32_t immnd_proc_imma_discard_connection(IMMND_CB *cb, 
>> IMMND_IMM_CLIENT_NODE *cl_node, bool scAbsence)
>>     {
>>      SaUint32T client_id;
>>      SaUint32T node_id;
>> @@ -129,7 +130,8 @@ uint32_t immnd_proc_imma_discard_connect
>>              send_evt.type = IMMSV_EVT_TYPE_IMMD;
>>              send_evt.info.immd.type = IMMD_EVT_ND2D_DISCARD_IMPL;
>>              send_evt.info.immd.info.impl_set.r.impl_id = implId;
>> -            if (immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, 
>> cb->immd_mdest_id, &send_evt) != NCSCC_RC_SUCCESS) {
>> +
>> +            if (!scAbsence && immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, 
>> cb->immd_mdest_id, &send_evt) != NCSCC_RC_SUCCESS) {
>>                      if (immnd_is_immd_up(cb)) {
>>                              LOG_ER("Discard implementer failed for 
>> implId:%u "
>>                                     "but IMMD is up !? - case not handled. 
>> Client will be orphanded", implId);
>> @@ -142,7 +144,8 @@ uint32_t immnd_proc_imma_discard_connect
>>              /*Discard the local implementer directly and redundantly to 
>> avoid
>>                 race conditions using this implementer (ccb's causing abort 
>> upcalls).
>>               */
>> -            immModel_discardImplementer(cb, implId, SA_FALSE, NULL, NULL);
>> +            //immModel_discardImplementer(cb, implId, SA_FALSE, NULL, NULL);
>> +            immModel_discardImplementer(cb, implId, scAbsence, NULL, NULL);
>>      }
>>     
>>      if (cl_node->mIsStale) {
>> @@ -163,7 +166,7 @@ uint32_t immnd_proc_imma_discard_connect
>>              for (ix = 0; ix < arrSize && !(cl_node->mIsStale); ++ix) {
>>                      send_evt.info.immd.info.ccbId = idArr[ix];
>>                      TRACE_5("Discarding Ccb id:%u originating at dead 
>> connection: %u", idArr[ix], client_id);
>> -                    if (immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, 
>> cb->immd_mdest_id,
>> +                    if (!scAbsence && immnd_mds_msg_send(cb, 
>> NCSMDS_SVC_ID_IMMD, cb->immd_mdest_id,
>> [Hung] We don't need this ...
>>
>>                                             &send_evt) != NCSCC_RC_SUCCESS) {
>>                              if (immnd_is_immd_up(cb)) {
>>                                      LOG_ER("Failure to broadcast discard 
>> Ccb for ccbId:%u "
>> @@ -174,6 +177,8 @@ uint32_t immnd_proc_imma_discard_connect
>>                                             "(immd down)- will retry later", 
>> idArr[ix]);
>>                              }
>>                              cl_node->mIsStale = true;
>> +                    } else if(scAbsence) {
>> +                            /* ABT TODO discard local ccbs ??*/
>> [Hung] ... and this. When 'scAbsence' is true, the code does not send
>> any message, so we can simply skip the lookup, which is also faster:
>>     if (!scAbsence)
>>         immModel_getCcbIdsForOrigCon(cb, client_id, &arrSize, &idArr);
>> 'arrSize' is initialized to 0, so execution will not enter the 'if'
>> block.
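
The guard suggested in the comment above can be sketched in isolation as follows. This is only an illustration, not the OpenSAF code: stub_get_ccb_ids() and stub_send_discard_ccb() are hypothetical stand-ins for immModel_getCcbIdsForOrigCon() and immnd_mds_msg_send(), and the types, signatures and return values are simplified assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the model lookup: pretends the dead
 * connection owned two CCBs. Not the real OpenSAF signature. */
static void stub_get_ccb_ids(uint32_t client_id, uint32_t *arr_size,
                             uint32_t **id_arr)
{
    (void)client_id;
    *id_arr = malloc(2 * sizeof(uint32_t));
    if (*id_arr == NULL) {
        *arr_size = 0;
        return;
    }
    (*id_arr)[0] = 11;
    (*id_arr)[1] = 12;
    *arr_size = 2;
}

/* Hypothetical stand-in for sending a discard-CCB message to IMMD. */
static bool stub_send_discard_ccb(uint32_t ccb_id)
{
    (void)ccb_id;
    return true;
}

/* Returns the number of discard-CCB messages sent towards IMMD. */
unsigned discard_ccbs_for_connection(uint32_t client_id, bool sc_absence)
{
    uint32_t arr_size = 0; /* stays 0 when the lookup is skipped */
    uint32_t *id_arr = NULL;
    unsigned sent = 0;

    /* With SC absence there is no IMMD to notify, so skip the lookup. */
    if (!sc_absence) {
        stub_get_ccb_ids(client_id, &arr_size, &id_arr);
    }

    if (arr_size) {
        assert(id_arr != NULL);
        for (uint32_t ix = 0; ix < arr_size; ++ix) {
            if (stub_send_discard_ccb(id_arr[ix])) {
                ++sent;
            }
        }
        free(id_arr);
    }
    return sent;
}
```

With the guard on the lookup, the per-message '!scAbsence' check and the empty 'else if (scAbsence)' branch inside the loop are no longer needed, because the loop is never entered in the SC-absence case.
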
>>
>>                      }
>>              }
>>              free(idArr);
>> @@ -197,20 +202,29 @@ uint32_t immnd_proc_imma_discard_connect
>>              send_evt.type = IMMSV_EVT_TYPE_IMMD;
>>              send_evt.info.immd.type = IMMD_EVT_ND2D_ADMO_HARD_FINALIZE;
>>              for (ix = 0; ix < arrSize && !(cl_node->mIsStale); ++ix) {
>> -                    send_evt.info.immd.info.admoId = idArr[ix];
>>                      TRACE_5("Hard finalize of AdmOwner id:%u originating at 
>> "
>>                              "dead connection: %u", idArr[ix], client_id);
>> -                    if (immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, 
>> cb->immd_mdest_id,
>> +                    if (scAbsence) {
>> +                            SaImmHandleT clnt_hdl;
>> +                            MDS_DEST reply_dest;
>> +                            memset(&clnt_hdl, '\0', sizeof(SaImmHandleT));
>> +                            memset(&reply_dest, '\0', sizeof(MDS_DEST));
>> +                            send_evt.info.immnd.info.admFinReq.adm_owner_id 
>> = idArr[ix];
>> +                            immnd_evt_proc_admo_hard_finalize(cb, 
>> &send_evt.info.immnd, false, clnt_hdl, reply_dest);
>> +                    } else {
>> +                            send_evt.info.immd.info.admoId = idArr[ix];
>> +                            if(immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, 
>> cb->immd_mdest_id,
>>                                             &send_evt) != NCSCC_RC_SUCCESS) {
>> -                            if (immnd_is_immd_up(cb)) {
>> -                                    LOG_ER("Failure to broadcast discard 
>> admo0wner for ccbId:%u "
>> -                                           "but IMMD is up !? - case not 
>> handled. Client will "
>> -                                           "be orphanded", implId);
>> -                            } else {
>> -                                    LOG_WA("Failure to broadcast discard 
>> admowner for id:%u "
>> -                                           "(immd down)- will retry later", 
>> idArr[ix]);
>> +                                    if (immnd_is_immd_up(cb)) {
>> +                                            LOG_ER("Failure to broadcast 
>> discard admo0wner for ccbId:%u "
>> +                                                    "but IMMD is up !? - 
>> case not handled. Client will "
>> +                                                    "be orphanded", implId);
>> +                                    } else {
>> +                                            LOG_WA("Failure to broadcast 
>> discard admowner for id:%u "
>> +                                                    "(immd down)- will 
>> retry later", idArr[ix]);
>> +                                    }
>> +                                    cl_node->mIsStale = true;
>>                              }
>> -                            cl_node->mIsStale = true;
>>                      }
>>              }
>>              free(idArr);
>> @@ -251,7 +265,7 @@ void immnd_proc_imma_down(IMMND_CB *cb,
>>              prev_hdl = cl_node->imm_app_hdl;
>>     
>>              if ((memcmp(&dest, &cl_node->agent_mds_dest, sizeof(MDS_DEST)) 
>> == 0) && sv_id == cl_node->sv_id) {
>> -                    if (immnd_proc_imma_discard_connection(cb, cl_node)) {
>> +                    if (immnd_proc_imma_discard_connection(cb, cl_node, 
>> false)) {
>>                              TRACE_5("Removing client id:%llx sv_id:%u", 
>> cl_node->imm_app_hdl, cl_node->sv_id);
>>                              immnd_client_node_del(cb, cl_node);
>>                              memset(cl_node, '\0', 
>> sizeof(IMMND_IMM_CLIENT_NODE));
>> @@ -300,7 +314,7 @@ void immnd_proc_imma_discard_stales(IMMN
>>              prev_hdl = cl_node->imm_app_hdl;
>>              if (cl_node->mIsStale) {
>>                      cl_node->mIsStale = false;
>> -                    if (immnd_proc_imma_discard_connection(cb, cl_node)) {
>> +                    if (immnd_proc_imma_discard_connection(cb, cl_node, 
>> false)) {
>>                              TRACE_5("Removing client id:%llx sv_id:%u", 
>> cl_node->imm_app_hdl, cl_node->sv_id);
>>                              immnd_client_node_del(cb, cl_node);
>>                              memset(cl_node, '\0', 
>> sizeof(IMMND_IMM_CLIENT_NODE));
>> @@ -422,6 +436,17 @@ uint32_t immnd_introduceMe(IMMND_CB *cb)
>>              send_evt.info.immd.info.ctrl_msg.pbeEnabled,
>>              send_evt.info.immd.info.ctrl_msg.dir.size);
>>     
>> +    if(cb->mIntroduced==2) {
>> +            LOG_NO("Re-introduce-me highestProcessed:%llu 
>> highestReceived:%llu",
>> +                    cb->highestProcessed, cb->highestReceived);
>> +            send_evt.info.immd.info.ctrl_msg.refresh = 2;
>> +            send_evt.info.immd.info.ctrl_msg.fevs_count = 
>> cb->highestReceived;
>> +
>> +            send_evt.info.immd.info.ctrl_msg.admo_id_count = 
>> cb->mLatestAdmoId;
>> +            send_evt.info.immd.info.ctrl_msg.ccb_id_count = 
>> cb->mLatestCcbId;
>> +            send_evt.info.immd.info.ctrl_msg.impl_count = cb->mLatestImplId;
>> +    }
>> +
>>      if (!immnd_is_immd_up(cb)) {
>>              return NCSCC_RC_FAILURE;
>>      }
>> @@ -480,7 +505,7 @@ static int32_t immnd_iAmLoader(IMMND_CB
>>              TRACE_5("Loading is not possible, preLoader still attached");
>>              return (-3);
>>      }
>> -
>> +LOG_IN("ABT CLOUD PROTO cb->mMyEpoch:%u !=  cb->mRulingEpoch:%u", 
>> cb->mMyEpoch, cb->mRulingEpoch);
>>      if (cb->mMyEpoch != cb->mRulingEpoch) {
>>              /*We are joining the cluster, need to sync this IMMND. */
>>              return (-2);
>> @@ -536,7 +561,7 @@ static uint32_t immnd_requestSync(IMMND_
>>      uint32_t rc = NCSCC_RC_SUCCESS;
>>      IMMSV_EVT send_evt;
>>      memset(&send_evt, '\0', sizeof(IMMSV_EVT));
>> -
>> +LOG_NO("ABT REQUESTING SYNC");
>>      send_evt.type = IMMSV_EVT_TYPE_IMMD;
>>      send_evt.info.immd.type = IMMD_EVT_ND2D_REQ_SYNC;
>>      send_evt.info.immd.info.ctrl_msg.ndExecPid = cb->mMyPid;
>> @@ -546,6 +571,7 @@ static uint32_t immnd_requestSync(IMMND_
>>      if (immnd_is_immd_up(cb)) {
>>              rc = immnd_mds_msg_send(cb, NCSMDS_SVC_ID_IMMD, 
>> cb->immd_mdest_id, &send_evt);
>>      } else {
>> +            LOG_IN("Could not request sync because IMMD is not UP");
>>              rc = NCSCC_RC_FAILURE;
>>      }
>>      return (rc == NCSCC_RC_SUCCESS);
>> @@ -1571,13 +1597,19 @@ static int immnd_forkPbe(IMMND_CB *cb)
>>      if (pid == 0) {         /*child */
>>              /* TODO: Should close file-descriptors ... */
>>              /*char * const pbeArgs[5] = { (char *) execPath, "--recover", 
>> "--pbeXX", dbFilePath, 0 };*/
>> -            char * pbeArgs[5];
>> +            char * pbeArgs[6];
>>              bool veteran = (cb->mIsCoord) ? (cb->mPbeVeteran) : (cb->m2Pbe && cb->mPbeVeteranB);
>>              pbeArgs[0] = (char *) execPath;
>> -            if(veteran) {
>> +            if(veteran && cb->mScAbsenceAllowed && !cb->mPbeUsesSharedFs) {
>> +                    pbeArgs[1] =  "--recover";
>> +                    pbeArgs[2] =  "--check-objects";
>> +                    pbeArgs[3] = (cb->m2Pbe)?((cb->mIsCoord)?"--pbe2A":"--pbe2B"):"--pbe";
>> +                    pbeArgs[4] = dbFilePath;
>> +                    pbeArgs[5] =  0;
>> +            } else if(veteran) {
>>                      pbeArgs[1] =  "--recover";
>>                      pbeArgs[2] = (cb->m2Pbe)?((cb->mIsCoord)?"--pbe2A":"--pbe2B"):"--pbe";
>> -                    pbeArgs[3] = dbFilePath;
>> +                    pbeArgs[3] = dbFilePath;
>>                      pbeArgs[4] =  0;
>>              } else {
>>                      pbeArgs[1] = (cb->m2Pbe)?((cb->mIsCoord)?"--pbe2A":"--pbe2B"):"--pbe";
>> @@ -1685,7 +1717,7 @@ uint32_t immnd_proc_server(uint32_t *tim
>>                              cb->mJobStart = now;
>>                      }
>>              } else {        /*We are not ready to start loading yet */
>> -                    if(cb->mIntroduced) {
>> +                    if(cb->mIntroduced==1) {
>>                              if((cb->m2Pbe == 2) && !(cb->preLoadPid)) {
>>                                      cb->preLoadPid = immnd_forkLoader(cb, true);
>>                              }
>> @@ -1833,6 +1865,7 @@ uint32_t immnd_proc_server(uint32_t *tim
>>                              cb->mState = IMM_SERVER_READY;
>>                              immnd_ackToNid(NCSCC_RC_SUCCESS);
>>                              LOG_NO("SERVER STATE: IMM_SERVER_LOADING_SERVER --> IMM_SERVER_READY");
>> +                            immModel_setScAbsenceAllowed(cb);
>>                              cb->mJobStart = now;
>>                              if (cb->mPbeFile) {/* Pbe enabled */
>>                                      cb->mRim = immModel_getRepositoryInitMode(cb);
>> @@ -1876,6 +1909,7 @@ uint32_t immnd_proc_server(uint32_t *tim
>>                      cb->mState = IMM_SERVER_READY;
>>                      cb->mJobStart = now;
>>                      LOG_NO("SERVER STATE: IMM_SERVER_LOADING_CLIENT --> IMM_SERVER_READY");
>> +                    immModel_setScAbsenceAllowed(cb);
>>                      if (cb->mPbeFile) {/* Pbe configured */
>>                              cb->mRim = immModel_getRepositoryInitMode(cb);
>>     
>> @@ -1896,7 +1930,9 @@ uint32_t immnd_proc_server(uint32_t *tim
>>                      cb->mJobStart = now;
>>                      cb->mState = IMM_SERVER_READY;
>>                      immnd_ackToNid(NCSCC_RC_SUCCESS);
>> -                    LOG_NO("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM SERVER READY");
>> +                    LOG_NO("SERVER STATE: IMM_SERVER_SYNC_CLIENT --> IMM_SERVER_READY");
>> +                    immModel_setScAbsenceAllowed(cb);
>> +
>>                      /*
>>                         This code case duplicated in immnd_evt.c
>>                         Search for: "ticket:#599"
>> @@ -1927,7 +1963,7 @@ uint32_t immnd_proc_server(uint32_t *tim
>>                              cb->mStep = 0;
>>                              cb->mJobStart = now;
>>                              cb->mState = IMM_SERVER_READY;
>> -                            LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM SERVER READY");
>> +                            LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY");
>>                      }
>>                      if (!(cb->mStep % 60)) {
>>                              LOG_IN("Sync Phase-1, waiting for existing "
>> @@ -1944,7 +1980,7 @@ uint32_t immnd_proc_server(uint32_t *tim
>>                              cb->mStep = 0;
>>                              cb->mJobStart = now;
>>                              cb->mState = IMM_SERVER_READY;
>> -                            LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM SERVER READY");
>> +                            LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY");
>>                      }
>>     
>>                      /* PBE may intentionally be restarted by sync. Catch this here. */
>> @@ -1977,7 +2013,7 @@ uint32_t immnd_proc_server(uint32_t *tim
>>                                      cb->mJobStart = now;
>>                                      cb->mState = IMM_SERVER_READY;
>>                                      immnd_abortSync(cb);
>> -                                    LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM SERVER READY");
>> +                                    LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY");
>>                              } else {
>>                                      LOG_IN("Sync Phase-2: Ccbs are terminated, IMM in "
>>                                             "read-only mode, forked sync process pid:%u", cb->syncPid);
>> @@ -1991,7 +2027,7 @@ uint32_t immnd_proc_server(uint32_t *tim
>>                                      cb->mStep = 0;
>>                                      cb->mJobStart = now;
>>                                      cb->mState = IMM_SERVER_READY;
>> -                                    LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM SERVER READY");
>> +                                    LOG_NO("SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY");
>>                              } else if (!(cb->mSyncFinalizing)) {
>>                                      int status = 0;
>>                                      if (waitpid(cb->syncPid, &status, WNOHANG) > 0) {
>> @@ -2031,6 +2067,11 @@ uint32_t immnd_proc_server(uint32_t *tim
>>                      }
>>              }
>>     
>> +            if(cb->mIntroduced == 2) {
>> +                    immnd_introduceMe(cb);
>> +                    break;
>> +            }
>> +
>>              coord = immnd_iAmCoordinator(cb);
>>     
>>              if (cb->pbePid > 0) {
>> @@ -2275,3 +2316,28 @@ void immnd_dump_client_info(IMMND_IMM_CL
>>     }
>>     
>>     #endif
>> +
>> +/* Only for scAbsenceAllowed (headless hydra) */
>> +void immnd_proc_discard_other_nodes(IMMND_CB *cb)
>> +{
>> +    TRACE_ENTER();
>> +    /* Discard all clients. */
>> +
>> +    IMMND_IMM_CLIENT_NODE *cl_node = NULL;
>> +    immnd_client_node_getnext(cb, 0, &cl_node);
>> +    while (cl_node) {
>> +            LOG_NO("Removing client id:%llx sv_id:%u", cl_node->imm_app_hdl, cl_node->sv_id);
>> +            osafassert(immnd_proc_imma_discard_connection(cb, cl_node, true));
>> +            LOG_NO("ABT discard_connection OK");
>> +            osafassert(immnd_client_node_del(cb, cl_node) == NCSCC_RC_SUCCESS);
>> +            free(cl_node);
>> +            cl_node = NULL;
>> +            LOG_NO("ABT Client node REMOVED");
>> +            immnd_client_node_getnext(cb, 0, &cl_node);
>> +    }
>> +
>> +    LOG_NO("ABT DONE REMOVING CLIENTS ENTERING immModel_isolateThisNode(cb) ");
>> +    immModel_isolateThisNode(cb);
>> +    immModel_abortNonCriticalCcbs(cb);
>> +    TRACE_LEAVE();
>> +}
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Opensaf-devel mailing list
>> Opensaf-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel