First of all, please double-check that you have applied the immtools patch, since "999" was a magic number in immxml-nodegen. Secondly, could you re-run your test with some smaller node IDs, for example PL-298 & PL-299 instead of PL-998 & PL-999? OpenSAF has scalability problems when you have more than approximately 300 nodes. From what I have seen, these problems appear to some extent even with just a few nodes if the slot numbers are not in sequence (as in your example, where you use slot IDs 998 & 999). TIPC seems to scale better than TCP.
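
For reference, here is a minimal standalone sketch of how the flat-addressing conversion in the quoted patch appears to map slot numbers to the node IDs that show up in the logs further down (2010f, 2020f, 2e60c, 2e70c). The two constants are assumptions inferred from those node IDs and from MDS_TIPC_NODE_ID_MIN in the patch, not copied from the OpenSAF headers, and ncs_from_slot() is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: inferred from node ID 2010f in the logs and from
 * MDS_TIPC_NODE_ID_MIN (0x01001001) in the patch. */
#define NCS_CHASSIS_ID 0x00020000u
#define TIPC_COMMON_ID 0x01001000u

/* Same bit manipulation as the patched
 * m_MDS_GET_NCS_NODE_ID_FROM_TIPC_NODE_ID(): the low 8 bits become the
 * slot ID, bits 8..11 are complemented and become the subslot ID. */
static uint32_t ncs_from_tipc(uint32_t node) {
  return NCS_CHASSIS_ID | ((node & 0xffu) << 8) |
         (((node & 0xf00u) >> 8) ^ 0xfu);
}

/* Inverse direction, as in the patched
 * m_MDS_GET_TIPC_NODE_ID_FROM_NCS_NODE_ID(). */
static uint32_t tipc_from_ncs(uint32_t node) {
  return TIPC_COMMON_ID | ((node & 0xff00u) >> 8) |
         (((node & 0xfu) ^ 0xfu) << 8);
}

/* Hypothetical helper: flat slot number (1..4095) to NCS node ID. */
static uint32_t ncs_from_slot(uint32_t slot) {
  return ncs_from_tipc(TIPC_COMMON_ID | (slot & 0xfffu));
}

/* Spot checks against the node IDs reported in the logs. */
static void check_examples(void) {
  assert(ncs_from_slot(1) == 0x2010fu);   /* SC-1 */
  assert(ncs_from_slot(998) == 0x2e60cu); /* PL-998 */
  assert(tipc_from_ncs(0x2e60cu) == 0x010013e6u);
}
```

With these assumed constants, slots 1 and 2 still give 2010f and 2020f (subslot field 0xf, as before the patch), while slots 998 and 999 give 2e60c and 2e70c. Taking the ones' complement of the upper four bits is what keeps the existing IDs unchanged.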
This patch only addresses the limit on the address space, which was previously eight bits and thus limited to at most 255 nodes. The scalability issues are a separate problem.

regards,
Anders Widell

On 03/29/2016 10:21 AM, A V Mahesh wrote:
> Hi,
>
> Sometimes the standby never joins the cluster (see below).
> We will test TIPC and report whether the behavior is consistent across
> the transports.
>
> ============================================
> # /etc/init.d/opensafd restart
> Mar 29 13:45:07 SC-2 opensafd: Stopping OpenSAF Services
> Stopping OpenSAF Services: Mar 29 13:45:07 SC-2 opensafd: OpenSAF
> services successfully stopped
> done
> Mar 29 13:45:07 SC-2 opensafd: Starting OpenSAF Services(5.0.M0 - )
> (Using TCP)
> Starting OpenSAF Services (Using TCP):Mar 29 13:45:07 SC-2
> osafdtmd[17441]: Started
> Mar 29 13:45:07 SC-2 osafrded[17458]: Started
> Mar 29 13:45:07 SC-2 osafdtmd[17441]: NO Established contact with 'SC-1'
> Mar 29 13:45:07 SC-2 osafrded[17458]: NO Peer rde@2010f has active
> state => Assigning Standby role to this node
> Mar 29 13:45:07 SC-2 osaffmd[17468]: Started
> Mar 29 13:45:07 SC-2 osafimmd[17478]: Started
> Mar 29 13:45:07 SC-2 osafimmnd[17489]: Started
> Mar 29 13:45:07 SC-2 osafimmnd[17489]: NO IMMD service is UP ...
> ScAbsenseAllowed?:0 introduced?:0
> Mar 29 13:45:07 SC-2 osafimmnd[17489]: NO SERVER STATE:
> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
> Mar 29 13:45:08 SC-2 osafimmnd[17489]: NO SETTING COORD TO 0 CLOUD PROTO
> Mar 29 13:45:08 SC-2 osafimmnd[17489]: NO SERVER STATE:
> IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
> Mar 29 13:45:08 SC-2 osafimmnd[17489]: NO SERVER STATE:
> IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
> Mar 29 13:45:08 SC-2 osafimmnd[17489]: NO NODE STATE-> IMM_NODE_ISOLATED
> Mar 29 13:45:08 SC-2 osafdtmd[17441]: NO Established contact with 'PL-998'
> Mar 29 13:45:09 SC-2 osafimmd[17478]: NO SBY: Ruling epoch noted as:4
> Mar 29 13:45:09 SC-2 osafimmd[17478]: NO IMMND coord at 2010f
> Mar 29 13:45:09 SC-2 osafimmnd[17489]: NO NODE STATE->
> IMM_NODE_W_AVAILABLE
> Mar 29 13:45:09 SC-2 osafimmnd[17489]: NO SERVER STATE:
> IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
> Mar 29 13:45:09 SC-2 osafdtmd[17441]: NO Established contact with 'PL-999'
> Mar 29 13:45:16 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
> process at node 2010f old epoch: 3 new epoch:4
> Mar 29 13:45:16 SC-2 osafimmd[17478]: NO IMMND coord at 2010f
> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO NODE STATE->
> IMM_NODE_FULLY_AVAILABLE 2729
> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO RepositoryInitModeT is
> SA_IMM_INIT_FROM_FILE
> Mar 29 13:45:16 SC-2 osafimmnd[17489]: WA IMM Access Control mode is
> DISABLED!
> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO Epoch set to 4 in ImmModel
> Mar 29 13:45:16 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
> process at node 2020f old epoch: 0 new epoch:4
> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO SERVER STATE:
> IMM_SERVER_SYNC_CLIENT --> IMM_SERVER_READY
> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO ImmModel received
> scAbsenceAllowed 0
> Mar 29 13:45:16 SC-2 osaflogd[17507]: Started
> Mar 29 13:45:16 SC-2 osaflogd[17507]: NO LOGSV_DATA_GROUPNAME not found
> Mar 29 13:45:16 SC-2 osaflogd[17507]: NO LOG root directory is:
> "/var/log/opensaf/saflog"
> Mar 29 13:45:16 SC-2 osaflogd[17507]: NO LOG data group is: ""
> Mar 29 13:45:16 SC-2 osaflogd[17507]: NO LGS_MBCSV_VERSION = 5
> Mar 29 13:45:16 SC-2 osafntfd[17518]: Started
> Mar 29 13:45:16 SC-2 osafclmd[17532]: Started
> Mar 29 13:45:16 SC-2 osafclmna[17542]: Started
> Mar 29 13:45:16 SC-2 osafclmna[17542]: NO
> safNode=SC-2,safCluster=myClmCluster Joined cluster, nodeid=2020f
> Mar 29 13:45:16 SC-2 osafamfd[17551]: Started
> Mar 29 13:45:17 SC-2 osafimmnd[17489]: NO Implementer (applier)
> connected: 27 (@OpenSafImmReplicatorB) <6, 2020f>
> Mar 29 13:45:17 SC-2 osafntfimcnd[17524]: NO Started
> Mar 29 13:45:20 SC-2 osafimmd[17478]: NO SBY: Ruling epoch noted as:5
> Mar 29 13:45:20 SC-2 osafimmd[17478]: NO IMMND coord at 2010f
> Mar 29 13:45:20 SC-2 osafimmnd[17489]: NO NODE STATE->
> IMM_NODE_R_AVAILABLE
> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
> process at node 2010f old epoch: 4 new epoch:5
> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO IMMND coord at 2010f
> Mar 29 13:45:28 SC-2 osafimmnd[17489]: NO NODE STATE->
> IMM_NODE_FULLY_AVAILABLE 18093
> Mar 29 13:45:28 SC-2 osafimmnd[17489]: NO Epoch set to 5 in ImmModel
> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
> process at node 2020f old epoch: 4 new epoch:5
> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
> process at node 2e60c old epoch: 0 new epoch:5
> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
> process at node 2e70c old epoch: 0 new epoch:5
> Mar 29 13:45:29 SC-2 osafimmnd[17489]: NO Implementer connected: 28
> (MsgQueueService190220) <0, 2e70c>
> Mar 29 13:45:29 SC-2 osafimmnd[17489]: NO Implementer connected: 29
> (MsgQueueService189964) <0, 2e60c>
> ============================================
>
> On 3/29/2016 12:24 PM, A V Mahesh wrote:
>>
>> My configuration is:
>>
>> slot 1 with 'SC-1'
>> slot 2 with 'SC-2'
>> slot 998 with 'PL-998'
>> slot 999 with 'PL-999'
>>
>> -AVM
>>
>> On 3/29/2016 12:17 PM, A V Mahesh wrote:
>>>
>>> Hi Anders Widell,
>>>
>>> I am not able to bring up the cluster with TCP (I started with TCP
>>> itself because it touches different code flows in MDS), and I
>>> observed different behavior at different times.
>>>
>>> Is this [#1613] tested with TCP? Or am I missing anything?
>>>
>>> My configuration is:
>>>
>>> slot 1 with 'SC-1'
>>> slot 1 with 'SC-2'
>>> slot 998 with 'PL-998'
>>> slot 999 with 'PL-999'
>>>
>>> =============================================================================================
>>> Mar 29 11:58:57 SC-1 opensafd: Stopping OpenSAF Services
>>> Mar 29 11:58:58 SC-1 opensafd: OpenSAF services successfully stopped
>>> Mar 29 11:58:58 SC-1 opensafd: Starting OpenSAF Services(5.0.M0 - )
>>> (Using TCP)
>>> Mar 29 11:58:58 SC-1 osafdtmd[3253]: Started
>>> Mar 29 11:58:58 SC-1 osafrded[3270]: Started
>>> Mar 29 11:59:00 SC-1 osafrded[3270]: NO No peer available => Setting
>>> Active role for this node
>>> Mar 29 11:59:00 SC-1 osaffmd[3282]: Started
>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: Started
>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: Started
>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: NO IMMD service is UP ...
>>> ScAbsenseAllowed?:0 introduced?:0
>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO New IMMND process is on
>>> ACTIVE Controller at 2010f
>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO First SC IMMND (OpenSAF 4.4
>>> or later) attached 2010f
>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO First IMMND at SC to attach
>>> is NOT configured for PBE
>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO Attached Nodes:1 Accepted
>>> nodes:0 KnownVeteran:0 doReply:1
>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO First IMMND on SC found at
>>> 2010f this IMMD at 2010f. Cluster is loading, *not* 2PBE =>
>>> designating that IMMND as coordinator
>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: NO This IMMND is now the NEW Coord
>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: NO SETTING COORD TO 1 CLOUD PROTO
>>> Mar 29 11:59:03 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>> IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
>>> Mar 29 11:59:03 SC-1 osafimmd[3292]: NO Successfully announced
>>> loading. New ruling epoch:1
>>> Mar 29 11:59:03 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>> IMM_SERVER_LOADING_PENDING --> IMM_SERVER_LOADING_SERVER
>>> Mar 29 11:59:03 SC-1 osafimmnd[3303]: NO NODE STATE-> IMM_NODE_LOADING
>>> Mar 29 11:59:03 SC-1 osafimmloadd: NO Load starting
>>> Mar 29 11:59:03 SC-1 osafimmloadd: NO ***** Loading from XML file
>>> imm.xml at /etc/opensaf *****
>>> Mar 29 11:59:03 SC-1 osafimmloadd: NO The class OpensafImm has been
>>> created since it was missing from the imm.xml load file
>>> Mar 29 11:59:03 SC-1 osafimmloadd: IN Class OsafImmPbeRt created
>>> Mar 29 11:59:03 SC-1 osafimmloadd: NO The class OsafImmPbeRt has
>>> been created since it was missing from the imm.xml load file
>>> Mar 29 11:59:08 SC-1 osafdtmd[3253]: NO Established contact with 'SC-2'
>>> Mar 29 11:59:08 SC-1 osafimmd[3292]: NO New IMMND process is on
>>> STANDBY Controller at 2020f
>>> Mar 29 11:59:08 SC-1 osafimmd[3292]: WA IMMND on controller (not
>>> currently coord) requests sync
>>> Mar 29 11:59:08 SC-1 osafimmd[3292]: WA No coordinator IMMND known
>>> (case B) - ignoring sync request
>>> Mar 29 11:59:08 SC-1 osafimmd[3292]: NO Node 2020f request sync
>>> sync-pid:9043 epoch:0
>>> Mar 29 11:59:12 SC-1 osafimmloadd: NO The
>>> opensafImm=opensafImm,safApp=safImmService object of class
>>> OpensafImm has been created since it was missing from the imm.xml
>>> load file
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Ccb 1 COMMITTED (IMMLOADER)
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Closing admin owner
>>> IMMLOADER id(1), loading of IMM done
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO NODE STATE->
>>> IMM_NODE_FULLY_AVAILABLE 2729
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO RepositoryInitModeT is
>>> SA_IMM_INIT_FROM_FILE
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: WA IMM Access Control mode is
>>> DISABLED!
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO opensafImmNostdFlags
>>> changed to: 0xf6
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Epoch set to 2 in ImmModel
>>> Mar 29 11:59:12 SC-1 osafimmd[3292]: NO ACT: New Epoch for IMMND
>>> process at node 2010f old epoch: 1 new epoch:2
>>> Mar 29 11:59:12 SC-1 osafimmd[3292]: NO Ruling epoch changed to:2
>>> Mar 29 11:59:12 SC-1 osafimmloadd: NO Load ending normally
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>> IMM_SERVER_LOADING_SERVER --> IMM_SERVER_READY
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO ImmModel received
>>> scAbsenceAllowed 0
>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: Started
>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: NO LOGSV_DATA_GROUPNAME not found
>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: NO LOG root directory is:
>>> "/var/log/opensaf/saflog"
>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: NO LOG data group is: ""
>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: NO LGS_MBCSV_VERSION = 5
>>> Mar 29 11:59:12 SC-1 osafdtmd[3253]: NO Established contact with
>>> 'PL-998'
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Implementer connected: 1
>>> (safLogService) <2, 2010f>
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO implementer for class
>>> 'OpenSafLogConfig' is safLogService => class extent is safe.
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO implementer for class
>>> 'SaLogStreamConfig' is safLogService => class extent is safe.
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Implementer (applier)
>>> connected: 2 (@safLogService_appl) <11, 2010f>
>>> Mar 29 11:59:12 SC-1 osafntfd[3340]: Started
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Implementer (applier)
>>> connected: 3 (@OpenSafImmReplicatorA) <17, 2010f>
>>> Mar 29 11:59:12 SC-1 osafntfimcnd[3348]: NO Started
>>> Mar 29 11:59:12 SC-1 osafclmd[3354]: Started
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Implementer connected: 4
>>> (safClmService) <19, 2010f>
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO implementer for class
>>> 'SaClmNode' is safClmService => class extent is safe.
>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO implementer for class
>>> 'SaClmCluster' is safClmService => class extent is safe.
>>> Mar 29 11:59:12 SC-1 osafimmd[3292]: WA No coordinator IMMND known
>>> (case B) - ignoring sync request
>>> Mar 29 11:59:12 SC-1 osafimmd[3292]: NO Node 2e60c request sync
>>> sync-pid:7734 epoch:0
>>> Mar 29 11:59:12 SC-1 osafclmna[3364]: Started
>>> Mar 29 11:59:12 SC-1 osafclmna[3364]: NO
>>> safNode=SC-1,safCluster=myClmCluster Joined cluster, nodeid=2010f
>>> Mar 29 11:59:12 SC-1 osafamfd[3373]: Started
>>> Mar 29 11:59:16 SC-1 osafdtmd[3253]: NO Established contact with
>>> 'PL-999'
>>> Mar 29 11:59:16 SC-1 osafimmd[3292]: NO Extended intro from node 2e70c
>>> Mar 29 11:59:16 SC-1 osafimmd[3292]: WA PBE not configured at first
>>> attached SC-immnd, but Pbe is configured for immnd at 2e70c -
>>> possible upgrade from pre 4.4
>>> Mar 29 11:59:16 SC-1 osafimmd[3292]: WA No coordinator IMMND known
>>> (case B) - ignoring sync request
>>> Mar 29 11:59:16 SC-1 osafimmd[3292]: NO Node 2e70c request sync
>>> sync-pid:7615 epoch:0
>>> Mar 29 11:59:17 SC-1 osafimmnd[3303]: NO Announce sync, epoch:3
>>> Mar 29 11:59:17 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>> IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
>>> Mar 29 11:59:17 SC-1 osafimmd[3292]: NO Successfully announced sync.
>>> New ruling epoch:3
>>> Mar 29 11:59:17 SC-1 osafimmnd[3303]: NO NODE STATE->
>>> IMM_NODE_R_AVAILABLE
>>> Mar 29 11:59:17 SC-1 osafimmloadd: NO Sync starting
>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: WA STOPPING sync process pid
>>> 3386 after five minutes
>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: ER SYNC APPARENTLY FAILED status:0
>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: NO -SERVER STATE:
>>> IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY
>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: NO NODE STATE->
>>> IMM_NODE_FULLY_AVAILABLE (2624)
>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: NO Epoch set to 3 in ImmModel
>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: NO ACT: New Epoch for IMMND
>>> process at node 2010f old epoch: 2 new epoch:3
>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: NO Coord broadcasting
>>> ABORT_SYNC, epoch:3
>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: WA IMMND - Client Node Get
>>> Failed for cli_hdl:708669735183
>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA Successfully aborted sync.
>>> Epoch:3
>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA IMMND on controller (not
>>> currently coord) requests sync
>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA No coordinator IMMND known
>>> (case B) - ignoring sync request
>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: NO Node 2020f request sync
>>> sync-pid:9043 epoch:0
>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA No coordinator IMMND known
>>> (case B) - ignoring sync request
>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: NO Node 2e60c request sync
>>> sync-pid:7734 epoch:0
>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA No coordinator IMMND known
>>> (case B) - ignoring sync request
>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: NO Node 2e70c request sync
>>> sync-pid:7615 epoch:0
>>> Mar 29 12:04:19 SC-1 osafimmnd[3303]: WA MDS Send Failed
>>> Mar 29 12:04:19 SC-1 osafimmnd[3303]: WA Error code 2 returned for
>>> message type 17 - ignoring
>>> Mar 29 12:04:19 SC-1 osafimmnd[3303]: WA Global ABORT SYNC received
>>> for epoch 3
>>> Mar 29 12:04:22 SC-1 osafimmnd[3303]: NO Announce sync, epoch:4
>>> Mar 29 12:04:22 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>> IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
>>> Mar 29 12:04:22 SC-1 osafimmd[3292]: NO Successfully announced sync.
>>> New ruling epoch:4
>>> Mar 29 12:04:22 SC-1 osafimmnd[3303]: NO NODE STATE->
>>> IMM_NODE_R_AVAILABLE
>>> Mar 29 12:04:22 SC-1 osafimmloadd: NO Sync starting
>>> Mar 29 12:07:08 SC-1 osafimmnd[3303]: NO Global discard node
>>> received for nodeId:2020f pid:9043
>>> Mar 29 12:07:12 SC-1 osafimmnd[3303]: NO Global discard node
>>> received for nodeId:2e60c pid:7734
>>> Mar 29 12:07:16 SC-1 osafimmnd[3303]: NO Global discard node
>>> received for nodeId:2e70c pid:7615
>>> Mar 29 12:07:23 SC-1 osafimmd[3292]: NO New IMMND process is on
>>> STANDBY Controller at 2020f
>>> Mar 29 12:07:23 SC-1 osafimmd[3292]: WA PBE is configured at first
>>> attached SC-immnd, but no Pbe file is configured for immnd at node
>>> 2020f - rejecting node
>>> Mar 29 12:07:23 SC-1 osafimmd[3292]: WA Error returned from
>>> processing message err:2 msg-type:2
>>> Mar 29 12:07:23 SC-1 osafimmnd[3303]: NO Global discard node
>>> received for nodeId:2020f pid:9670
>>> Mar 29 12:07:27 SC-1 osafimmd[3292]: WA PBE is configured at first
>>> attached SC-immnd, but no Pbe file is configured for immnd at node
>>> 2e60c - rejecting node
>>> Mar 29 12:07:27 SC-1 osafimmd[3292]: WA Error returned from
>>> processing message err:2 msg-type:2
>>> Mar 29 12:07:27 SC-1 osafimmnd[3303]: NO Global discard node
>>> received for nodeId:2e60c pid:8359
>>> Mar 29 12:07:29 SC-1 osafimmloadd: IN Synced 33170 objects in total
>>> Mar 29 12:07:29 SC-1 osafimmnd[3303]: NO NODE STATE->
>>> IMM_NODE_FULLY_AVAILABLE 17518
>>> Mar 29 12:07:29 SC-1 osafimmloadd: NO Sync ending normally
>>> Mar 29 12:07:29 SC-1 osafimmnd[3303]: NO Epoch set to 4 in ImmModel
>>> Mar 29 12:07:29 SC-1 osafimmd[3292]: NO ACT: New Epoch for IMMND
>>> process at node 2010f old epoch: 3 new epoch:4
>>> Mar 29 12:07:29 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>> IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY
>>> Mar 29 12:07:31 SC-1 osafimmd[3292]: NO Extended intro from node 2e70c
>>> Mar 29 12:07:31 SC-1 osafimmd[3292]: WA No coordinator IMMND known
>>> (case B) - ignoring sync request
>>> Mar 29 12:07:31 SC-1 osafimmd[3292]: NO Node 2e70c request sync
>>> sync-pid:8240 epoch:0
>>> Mar 29 12:07:33 SC-1 osafimmnd[3303]: NO Announce sync, epoch:5
>>> Mar 29 12:07:33 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>> IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
>>> Mar 29 12:07:33 SC-1 osafimmd[3292]: NO Successfully announced sync.
>>> New ruling epoch:5
>>> Mar 29 12:07:33 SC-1 osafimmnd[3303]: NO NODE STATE->
>>> IMM_NODE_R_AVAILABLE
>>> Mar 29 12:07:33 SC-1 osafimmloadd: NO Sync starting
>>> Mar 29 12:07:38 SC-1 osafimmd[3292]: NO New IMMND process is on
>>> STANDBY Controller at 2020f
>>> Mar 29 12:07:38 SC-1 osafimmd[3292]: WA PBE is configured at first
>>> attached SC-immnd, but no Pbe file is configured for immnd at node
>>> 2020f - rejecting node
>>> Mar 29 12:07:38 SC-1 osafimmd[3292]: WA Error returned from
>>> processing message err:2 msg-type:2
>>> Mar 29 12:07:38 SC-1 osafimmnd[3303]: NO Global discard node
>>> received for nodeId:2020f pid:9709
>>> Mar 29 12:07:38 SC-1 osafimmd[3292]: WA IMMD lost contact with peer
>>> IMMD (NCSMDS_RED_DOWN)
>>> Mar 29 12:07:39 SC-1 osafdtmd[3253]: NO Lost contact with 'SC-2'
>>> Mar 29 12:07:39 SC-1 osaffmd[3282]: NO Node Down event for node id
>>> 2020f:
>>> Mar 29 12:07:39 SC-1 osaffmd[3282]: NO Current role: ACTIVE
>>> Mar 29 12:07:39 SC-1 osaffmd[3282]: Rebooting OpenSAF NodeId = 0 EE
>>> Name = No EE Mapped, Reason: Failover occurred, but this node is not
>>> yet ready, OwnNodeId = 131343, SupervisionTime = 60
>>> Mar 29 12:07:39 SC-1 opensaf_reboot: Rebooting local node; timeout=60
>>> =============================================================================================
>>>
>>> -AVM
>>>
>>> On 3/18/2016 9:38 PM, Anders Widell wrote:
>>>> osaf/libs/core/mds/include/mds_dt.h | 26 +++++++++++++++++---------
>>>> osaf/libs/core/mds/mds_c_db.c | 26 ++++++++++++--------------
>>>> 2 files changed, 29 insertions(+), 23 deletions(-)
>>>>
>>>>
>>>> Support up to 4095 nodes in the flat addressing scheme for TIPC, by
>>>> encoding the
>>>> slot ID in the lower eight bits and the ones' complement of the subslot ID
>>>> in
>>>> bits 8 to 11 in the node identifier of the TIPC address. The reason for
>>>> taking
>>>> the ones' complement of the subslot ID is backwards compatibility with
>>>> existing
>>>> installations, so that this enhancement can be upgraded in-service.
>>>>
>>>> diff --git a/osaf/libs/core/mds/include/mds_dt.h
>>>> b/osaf/libs/core/mds/include/mds_dt.h
>>>> --- a/osaf/libs/core/mds/include/mds_dt.h
>>>> +++ b/osaf/libs/core/mds/include/mds_dt.h
>>>> @@ -237,7 +237,8 @@ bool mdtm_mailbox_mbx_cleanup(NCSCONTEXT
>>>>
>>>> /*
>>>> * In the default flat addressing scheme, TIPC node addresses looks like
>>>> - * 1.1.1, 1.1.2 etc.
>>>> + * 1.1.1, 1.1.2 etc. The ones' complement of the subslot ID is shifted 8
>>>> + * bits up and the slot ID is added in the 8 LSB.
>>>> * In the non flat (old/legacy) addressing scheme TIPC addresses looks
>>>> like
>>>> * 1.1.31, 1.1.47. The slot ID is shifted 4 bits up and subslot ID is
>>>> added
>>>> * in the 4 LSB.
>>>> @@ -248,13 +249,20 @@ bool mdtm_mailbox_mbx_cleanup(NCSCONTEXT
>>>>
>>>> #if (MDS_USE_SUBSLOT_ID == 0)
>>>> #define MDS_TIPC_NODE_ID_MIN 0x01001001
>>>> -#define MDS_TIPC_NODE_ID_MAX 0x010010ff
>>>> -#define MDS_NCS_NODE_ID_MIN (MDS_NCS_CHASSIS_ID|0x0000010f)
>>>> -#define MDS_NCS_NODE_ID_MAX (MDS_NCS_CHASSIS_ID|0x0000ff0f)
>>>> -#define m_MDS_GET_NCS_NODE_ID_FROM_TIPC_NODE_ID(node) \
>>>> - (NODE_ID)( MDS_NCS_CHASSIS_ID | (((node)&0xff)<<8) | (0xf))
>>>> -#define m_MDS_GET_TIPC_NODE_ID_FROM_NCS_NODE_ID(node) \
>>>> - (NODE_ID)( MDS_TIPC_COMMON_ID | (((node)&0xff00)>>8) )
>>>> +#define MDS_TIPC_NODE_ID_MAX 0x01001fff
>>>> +static inline NODE_ID m_MDS_GET_NCS_NODE_ID_FROM_TIPC_NODE_ID(NODE_ID
>>>> node) {
>>>> + return MDS_NCS_CHASSIS_ID | ((node & 0xff) << 8) | (((node &
>>>> 0xf00) >> 8) ^ 0xf);
>>>> +}
>>>> +static inline NODE_ID m_MDS_GET_TIPC_NODE_ID_FROM_NCS_NODE_ID(NODE_ID
>>>> node) {
>>>> + return MDS_TIPC_COMMON_ID | ((node & 0xff00) >> 8) | (((node &
>>>> 0xf) ^ 0xf) << 8);
>>>> +}
>>>> +static inline uint32_t m_MDS_CHECK_TIPC_NODE_ID_RANGE(NODE_ID node) {
>>>> + return node < MDS_TIPC_NODE_ID_MIN || node > MDS_TIPC_NODE_ID_MAX ?
>>>> + NCSCC_RC_FAILURE : NCSCC_RC_SUCCESS;
>>>> +}
>>>> +static inline uint32_t m_MDS_CHECK_NCS_NODE_ID_RANGE(NODE_ID node) {
>>>> + return
>>>> m_MDS_CHECK_TIPC_NODE_ID_RANGE(m_MDS_GET_TIPC_NODE_ID_FROM_NCS_NODE_ID(node));
>>>> +}
>>>> #else
>>>> #define MDS_TIPC_NODE_ID_MIN 0x01001001
>>>> #define MDS_TIPC_NODE_ID_MAX 0x0100110f
>>>> @@ -264,10 +272,10 @@ bool mdtm_mailbox_mbx_cleanup(NCSCONTEXT
>>>> (NODE_ID)( MDS_NCS_CHASSIS_ID | ((node)&0xf) |
>>>> (((node)&0xff0)<<4))
>>>> #define m_MDS_GET_TIPC_NODE_ID_FROM_NCS_NODE_ID(node) \
>>>> (NODE_ID)( MDS_TIPC_COMMON_ID | (((node)&0xff00)>>4) |
>>>> ((node)&0xf) )
>>>> -#endif
>>>>
>>>> #define m_MDS_CHECK_TIPC_NODE_ID_RANGE(node)
>>>> (((((node)<MDS_TIPC_NODE_ID_MIN)||((node)>MDS_TIPC_NODE_ID_MAX))?NCSCC_RC_FAILURE:NCSCC_RC_SUCCESS))
>>>> #define m_MDS_CHECK_NCS_NODE_ID_RANGE(node)
>>>> (((((node)<MDS_NCS_NODE_ID_MIN)||((node)>MDS_NCS_NODE_ID_MAX))?NCSCC_RC_FAILURE:NCSCC_RC_SUCCESS))
>>>> +#endif
>>>>
>>>> /* ******************************************** */
>>>> /* ******************************************** */
>>>> diff --git a/osaf/libs/core/mds/mds_c_db.c b/osaf/libs/core/mds/mds_c_db.c
>>>> --- a/osaf/libs/core/mds/mds_c_db.c
>>>> +++ b/osaf/libs/core/mds/mds_c_db.c
>>>> @@ -37,14 +37,13 @@ void get_adest_details(MDS_DEST adest, c
>>>> char *token, *saveptr;
>>>> struct stat s;
>>>> uint32_t process_id = 0;
>>>> - NCS_PHY_SLOT_ID phy_slot;
>>>> - NCS_SUB_SLOT_ID sub_slot;
>>>> + SlotSubslotId slot_subslot_id;
>>>> char pid_path[1024];
>>>> char *pid_name = NULL;
>>>> char process_name[MDS_MAX_PROCESS_NAME_LEN];
>>>> bool remote = false;
>>>>
>>>> - m_NCS_GET_PHYINFO_FROM_NODE_ID(m_NCS_NODE_ID_FROM_MDS_DEST(adest),
>>>> NULL, &phy_slot, &sub_slot);
>>>> + slot_subslot_id =
>>>> GetSlotSubslotIdFromNodeId(m_NCS_NODE_ID_FROM_MDS_DEST(adest));
>>>>
>>>> if (!tipc_mode_enabled) {
>>>> process_id = m_MDS_GET_PROCESS_ID_FROM_ADEST(adest);
>>>> @@ -111,11 +110,11 @@ void get_adest_details(MDS_DEST adest, c
>>>> }
>>>>
>>>> if (remote == true)
>>>> - snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>> "<rem_nodeid[%d]:%s>",
>>>> - phy_slot, process_name);
>>>> + snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>> "<rem_nodeid[%u]:%s>",
>>>> + slot_subslot_id, process_name);
>>>> else
>>>> - snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>> "<nodeid[%d]:%s>",
>>>> - phy_slot, process_name);
>>>> + snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>> "<nodeid[%u]:%s>",
>>>> + slot_subslot_id, process_name);
>>>>
>>>> m_MDS_LOG_DBG("MDS:DB: adest_details: %s ", adest_details);
>>>> m_MDS_LEAVE();
>>>> @@ -129,8 +128,7 @@ void get_adest_details(MDS_DEST adest, c
>>>> void get_subtn_adest_details(MDS_PWE_HDL pwe_hdl, MDS_SVC_ID svc_id,
>>>> MDS_DEST adest, char* adest_details)
>>>> {
>>>> uint32_t process_id = 0;
>>>> - NCS_PHY_SLOT_ID phy_slot;
>>>> - NCS_SUB_SLOT_ID sub_slot;
>>>> + SlotSubslotId slot_subslot_id;
>>>> char process_name[MDS_MAX_PROCESS_NAME_LEN];
>>>> bool remote = false;
>>>> MDS_SVC_INFO *svc_info = NULL;
>>>> @@ -139,7 +137,7 @@ void get_subtn_adest_details(MDS_PWE_HDL
>>>> char *pid_name = NULL;
>>>> struct stat s;
>>>>
>>>> - m_NCS_GET_PHYINFO_FROM_NODE_ID(m_NCS_NODE_ID_FROM_MDS_DEST(adest),
>>>> NULL, &phy_slot, &sub_slot);
>>>> + slot_subslot_id =
>>>> GetSlotSubslotIdFromNodeId(m_NCS_NODE_ID_FROM_MDS_DEST(adest));
>>>> process_id = m_MDS_GET_PROCESS_ID_FROM_ADEST(adest);
>>>>
>>>> if (NCSCC_RC_SUCCESS == mds_mcm_check_intranode(adest)) {
>>>> @@ -185,11 +183,11 @@ void get_subtn_adest_details(MDS_PWE_HDL
>>>> }
>>>>
>>>> if (remote == true)
>>>> - snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>> "<rem_node[%d]:%s>",
>>>> - phy_slot, process_name);
>>>> + snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>> "<rem_node[%u]:%s>",
>>>> + slot_subslot_id, process_name);
>>>> else
>>>> - snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>> "<node[%d]:%s>",
>>>> - phy_slot, process_name);
>>>> + snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>> "<node[%u]:%s>",
>>>> + slot_subslot_id, process_name);
>>>> done:
>>>> m_MDS_LOG_DBG("MDS:DB: adest_details: %s ", adest_details);
>>>> m_MDS_LEAVE();
>>>
>>
>
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel