How will we go about generating tuned TIME-OUT values in
/etc/opensaf/nodeinit.conf.controller & /etc/opensaf/nodeinit.conf.payload,
based on the number of nodes, at build time?
-AVM
On 3/30/2016 3:48 PM, Anders Widell wrote:
> Ok, good that you got it working. I see that you actually generate an
> IMM configuration with 1000 nodes. Then I am not so surprised that you
> bump into scalability problems. It should work smoother if the
> configuration only contains a few nodes. I.e. use the IMM XML tools to
> generate a cluster with two controllers and two payloads. Then rename
> PL-3 to PL-998 and PL-4 to PL-999 in the generated imm.xml file.
>
> regards,
> Anders Widell
>
> On 03/30/2016 11:20 AM, A V Mahesh wrote:
>> Hi Anders Widell,
>>
>> Please find my comments
>>
>> On 3/29/2016 2:49 PM, Anders Widell wrote:
>>> First of all, please double-check that you have applied the immtools
>>> patch, since "999" was a magic number in immxml-nodegen. Secondly,
>>> could you re-run your test with some smaller node-ids, like for
>>> example PL-298 & PL-299 instead of PL-998 & PL-999? There are
>>> scalability problems in OpenSAF when you have more than
>>> approximately 300 nodes, and from what I have seen these scalability
>>> problems seem to be there to some extent even if you just have a few
>>> nodes, but the slot numbers are not in sequence (as in your example,
>>> when you use slot id 998 & 999). TIPC seems to scale better than TCP.
>> [AVM] I am able to bring up payloads with slot numbers up to "999"
>> without any issue on both TCP & TIPC, just by tuning the osaf-amfnd,
>> osaf-amfd, osaf-immnd and osaf-immd TIME-OUT values in the clc-cli
>> start-up scripts (/etc/opensaf/nodeinit.conf.controller &
>> /etc/opensaf/nodeinit.conf.payload).
>>
>> configuration :
>>
>> 1) 4 node setup : 2 Controllers & 2 payloads
>> 2) xml configuration : ./immxml-clustersize -s 2 -p 999
>> 3) Controller slot number : SC-1
>> 4) Controller slot number : SC-2
>> 5) payload slot number : PL-998
>> 6) payload slot number : PL-999
>>
>>>
>>> This patch is only addressing the limit for the address space, which
>>> was previously eight bits and thus limited to at most 255 nodes. The
>>> scalability issues are a separate problem.
>>
>> [AVM] I brought up the cluster and did some switchovers & fail-overs;
>> I haven't seen any problem with slot numbers up to "999" for payloads
>> on both the TCP & TIPC transports.
>> So along with this patch we also need to provide a new
>> configuration option at build time (something like ./configure
>> [OPTION]... [VAR=VALUE]... --numnodes=<777>).
>> Based on the number of nodes, we need to generate the
>> tuned TIME-OUT values in the clc-cli start-up scripts
>> (/etc/opensaf/nodeinit.conf.controller &
>> /etc/opensaf/nodeinit.conf.payload).
>>
>>
>> -AVM
>>>
>>> regards,
>>> Anders Widell
>>>
>>> On 03/29/2016 10:21 AM, A V Mahesh wrote:
>>>> Hi,
>>>>
>>>> Sometimes the Standby never joins the cluster (see below).
>>>> We will test with TIPC and report whether the behavior is
>>>> consistent across the transports.
>>>>
>>>> ============================================
>>>> # /etc/init.d/opensafd restart
>>>> Mar 29 13:45:07 SC-2 opensafd: Stopping OpenSAF Services
>>>> Stopping OpenSAF Services: Mar 29 13:45:07 SC-2 opensafd: OpenSAF
>>>> services successfully stopped
>>>> done
>>>> Mar 29 13:45:07 SC-2 opensafd: Starting OpenSAF Services(5.0.M0 - )
>>>> (Using TCP)
>>>> Starting OpenSAF Services (Using TCP):Mar 29 13:45:07 SC-2
>>>> osafdtmd[17441]: Started
>>>> Mar 29 13:45:07 SC-2 osafrded[17458]: Started
>>>> Mar 29 13:45:07 SC-2 osafdtmd[17441]: NO Established contact with
>>>> 'SC-1'
>>>> Mar 29 13:45:07 SC-2 osafrded[17458]: NO Peer rde@2010f has active
>>>> state => Assigning Standby role to this node
>>>> Mar 29 13:45:07 SC-2 osaffmd[17468]: Started
>>>> Mar 29 13:45:07 SC-2 osafimmd[17478]: Started
>>>> Mar 29 13:45:07 SC-2 osafimmnd[17489]: Started
>>>> Mar 29 13:45:07 SC-2 osafimmnd[17489]: NO IMMD service is UP ...
>>>> ScAbsenseAllowed?:0 introduced?:0
>>>> Mar 29 13:45:07 SC-2 osafimmnd[17489]: NO SERVER STATE:
>>>> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
>>>> Mar 29 13:45:08 SC-2 osafimmnd[17489]: NO SETTING COORD TO 0 CLOUD
>>>> PROTO
>>>> Mar 29 13:45:08 SC-2 osafimmnd[17489]: NO SERVER STATE:
>>>> IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
>>>> Mar 29 13:45:08 SC-2 osafimmnd[17489]: NO SERVER STATE:
>>>> IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
>>>> Mar 29 13:45:08 SC-2 osafimmnd[17489]: NO NODE STATE->
>>>> IMM_NODE_ISOLATED
>>>> Mar 29 13:45:08 SC-2 osafdtmd[17441]: NO Established contact with
>>>> 'PL-998'
>>>> Mar 29 13:45:09 SC-2 osafimmd[17478]: NO SBY: Ruling epoch noted as:4
>>>> Mar 29 13:45:09 SC-2 osafimmd[17478]: NO IMMND coord at 2010f
>>>> Mar 29 13:45:09 SC-2 osafimmnd[17489]: NO NODE STATE->
>>>> IMM_NODE_W_AVAILABLE
>>>> Mar 29 13:45:09 SC-2 osafimmnd[17489]: NO SERVER STATE:
>>>> IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
>>>> Mar 29 13:45:09 SC-2 osafdtmd[17441]: NO Established contact with
>>>> 'PL-999'
>>>> Mar 29 13:45:16 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
>>>> process at node 2010f old epoch: 3 new epoch:4
>>>> Mar 29 13:45:16 SC-2 osafimmd[17478]: NO IMMND coord at 2010f
>>>> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO NODE STATE->
>>>> IMM_NODE_FULLY_AVAILABLE 2729
>>>> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO RepositoryInitModeT is
>>>> SA_IMM_INIT_FROM_FILE
>>>> Mar 29 13:45:16 SC-2 osafimmnd[17489]: WA IMM Access Control mode
>>>> is DISABLED!
>>>> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO Epoch set to 4 in ImmModel
>>>> Mar 29 13:45:16 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
>>>> process at node 2020f old epoch: 0 new epoch:4
>>>> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO SERVER STATE:
>>>> IMM_SERVER_SYNC_CLIENT --> IMM_SERVER_READY
>>>> Mar 29 13:45:16 SC-2 osafimmnd[17489]: NO ImmModel received
>>>> scAbsenceAllowed 0
>>>> Mar 29 13:45:16 SC-2 osaflogd[17507]: Started
>>>> Mar 29 13:45:16 SC-2 osaflogd[17507]: NO LOGSV_DATA_GROUPNAME not found
>>>> Mar 29 13:45:16 SC-2 osaflogd[17507]: NO LOG root directory is:
>>>> "/var/log/opensaf/saflog"
>>>> Mar 29 13:45:16 SC-2 osaflogd[17507]: NO LOG data group is: ""
>>>> Mar 29 13:45:16 SC-2 osaflogd[17507]: NO LGS_MBCSV_VERSION = 5
>>>> Mar 29 13:45:16 SC-2 osafntfd[17518]: Started
>>>> Mar 29 13:45:16 SC-2 osafclmd[17532]: Started
>>>> Mar 29 13:45:16 SC-2 osafclmna[17542]: Started
>>>> Mar 29 13:45:16 SC-2 osafclmna[17542]: NO
>>>> safNode=SC-2,safCluster=myClmCluster Joined cluster, nodeid=2020f
>>>> Mar 29 13:45:16 SC-2 osafamfd[17551]: Started
>>>> Mar 29 13:45:17 SC-2 osafimmnd[17489]: NO Implementer (applier)
>>>> connected: 27 (@OpenSafImmReplicatorB) <6, 2020f>
>>>> Mar 29 13:45:17 SC-2 osafntfimcnd[17524]: NO Started
>>>> Mar 29 13:45:20 SC-2 osafimmd[17478]: NO SBY: Ruling epoch noted as:5
>>>> Mar 29 13:45:20 SC-2 osafimmd[17478]: NO IMMND coord at 2010f
>>>> Mar 29 13:45:20 SC-2 osafimmnd[17489]: NO NODE STATE->
>>>> IMM_NODE_R_AVAILABLE
>>>> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
>>>> process at node 2010f old epoch: 4 new epoch:5
>>>> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO IMMND coord at 2010f
>>>> Mar 29 13:45:28 SC-2 osafimmnd[17489]: NO NODE STATE->
>>>> IMM_NODE_FULLY_AVAILABLE 18093
>>>> Mar 29 13:45:28 SC-2 osafimmnd[17489]: NO Epoch set to 5 in ImmModel
>>>> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
>>>> process at node 2020f old epoch: 4 new epoch:5
>>>> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
>>>> process at node 2e60c old epoch: 0 new epoch:5
>>>> Mar 29 13:45:28 SC-2 osafimmd[17478]: NO SBY: New Epoch for IMMND
>>>> process at node 2e70c old epoch: 0 new epoch:5
>>>> Mar 29 13:45:29 SC-2 osafimmnd[17489]: NO Implementer connected: 28
>>>> (MsgQueueService190220) <0, 2e70c>
>>>> Mar 29 13:45:29 SC-2 osafimmnd[17489]: NO Implementer connected: 29
>>>> (MsgQueueService189964) <0, 2e60c>
>>>> ============================================
>>>>
>>>> On 3/29/2016 12:24 PM, A V Mahesh wrote:
>>>>>
>>>>> My configuration is:
>>>>>
>>>>> slot 1 with 'SC-1'
>>>>> slot 2 with 'SC-2'
>>>>> slot 998 with 'PL-998'
>>>>> slot 999 with 'PL-999'
>>>>>
>>>>> -AVM
>>>>>
>>>>> On 3/29/2016 12:17 PM, A V Mahesh wrote:
>>>>>>
>>>>>> Hi Anders Widell,
>>>>>>
>>>>>> I am not able to bring up the cluster with TCP (I started with TCP
>>>>>> itself because it touches different code flows in MDS),
>>>>>> and I observed different behavior at different times.
>>>>>>
>>>>>> Is this [#1613] tested with TCP? Or am I missing anything?
>>>>>>
>>>>>> My configuration is:
>>>>>>
>>>>>> slot 1 with 'SC-1'
>>>>>> slot 1 with 'SC-2'
>>>>>> slot 998 with 'PL-998'
>>>>>> slot 999 with 'PL-999'
>>>>>>
>>>>>> =============================================================================================
>>>>>> Mar 29 11:58:57 SC-1 opensafd: Stopping OpenSAF Services
>>>>>> Mar 29 11:58:58 SC-1 opensafd: OpenSAF services successfully stopped
>>>>>> Mar 29 11:58:58 SC-1 opensafd: Starting OpenSAF Services(5.0.M0 -
>>>>>> ) (Using TCP)
>>>>>> Mar 29 11:58:58 SC-1 osafdtmd[3253]: Started
>>>>>> Mar 29 11:58:58 SC-1 osafrded[3270]: Started
>>>>>> Mar 29 11:59:00 SC-1 osafrded[3270]: NO No peer available =>
>>>>>> Setting Active role for this node
>>>>>> Mar 29 11:59:00 SC-1 osaffmd[3282]: Started
>>>>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: Started
>>>>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: Started
>>>>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: NO IMMD service is UP ...
>>>>>> ScAbsenseAllowed?:0 introduced?:0
>>>>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO New IMMND process is on
>>>>>> ACTIVE Controller at 2010f
>>>>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO First SC IMMND (OpenSAF
>>>>>> 4.4 or later) attached 2010f
>>>>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO First IMMND at SC to
>>>>>> attach is NOT configured for PBE
>>>>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO Attached Nodes:1 Accepted
>>>>>> nodes:0 KnownVeteran:0 doReply:1
>>>>>> Mar 29 11:59:00 SC-1 osafimmd[3292]: NO First IMMND on SC found
>>>>>> at 2010f this IMMD at 2010f. Cluster is loading, *not* 2PBE =>
>>>>>> designating that IMMND as coordinator
>>>>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>>>>> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
>>>>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: NO This IMMND is now the
>>>>>> NEW Coord
>>>>>> Mar 29 11:59:00 SC-1 osafimmnd[3303]: NO SETTING COORD TO 1 CLOUD
>>>>>> PROTO
>>>>>> Mar 29 11:59:03 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>>>>> IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
>>>>>> Mar 29 11:59:03 SC-1 osafimmd[3292]: NO Successfully announced
>>>>>> loading. New ruling epoch:1
>>>>>> Mar 29 11:59:03 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>>>>> IMM_SERVER_LOADING_PENDING --> IMM_SERVER_LOADING_SERVER
>>>>>> Mar 29 11:59:03 SC-1 osafimmnd[3303]: NO NODE STATE->
>>>>>> IMM_NODE_LOADING
>>>>>> Mar 29 11:59:03 SC-1 osafimmloadd: NO Load starting
>>>>>> Mar 29 11:59:03 SC-1 osafimmloadd: NO ***** Loading from XML file
>>>>>> imm.xml at /etc/opensaf *****
>>>>>> Mar 29 11:59:03 SC-1 osafimmloadd: NO The class OpensafImm has
>>>>>> been created since it was missing from the imm.xml load file
>>>>>> Mar 29 11:59:03 SC-1 osafimmloadd: IN Class OsafImmPbeRt created
>>>>>> Mar 29 11:59:03 SC-1 osafimmloadd: NO The class OsafImmPbeRt has
>>>>>> been created since it was missing from the imm.xml load file
>>>>>> Mar 29 11:59:08 SC-1 osafdtmd[3253]: NO Established contact with
>>>>>> 'SC-2'
>>>>>> Mar 29 11:59:08 SC-1 osafimmd[3292]: NO New IMMND process is on
>>>>>> STANDBY Controller at 2020f
>>>>>> Mar 29 11:59:08 SC-1 osafimmd[3292]: WA IMMND on controller (not
>>>>>> currently coord) requests sync
>>>>>> Mar 29 11:59:08 SC-1 osafimmd[3292]: WA No coordinator IMMND
>>>>>> known (case B) - ignoring sync request
>>>>>> Mar 29 11:59:08 SC-1 osafimmd[3292]: NO Node 2020f request sync
>>>>>> sync-pid:9043 epoch:0
>>>>>> Mar 29 11:59:12 SC-1 osafimmloadd: NO The
>>>>>> opensafImm=opensafImm,safApp=safImmService object of class
>>>>>> OpensafImm has been created since it was missing from the imm.xml
>>>>>> load file
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Ccb 1 COMMITTED (IMMLOADER)
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Closing admin owner
>>>>>> IMMLOADER id(1), loading of IMM done
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO NODE STATE->
>>>>>> IMM_NODE_FULLY_AVAILABLE 2729
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO RepositoryInitModeT is
>>>>>> SA_IMM_INIT_FROM_FILE
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: WA IMM Access Control mode
>>>>>> is DISABLED!
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO opensafImmNostdFlags
>>>>>> changed to: 0xf6
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Epoch set to 2 in ImmModel
>>>>>> Mar 29 11:59:12 SC-1 osafimmd[3292]: NO ACT: New Epoch for IMMND
>>>>>> process at node 2010f old epoch: 1 new epoch:2
>>>>>> Mar 29 11:59:12 SC-1 osafimmd[3292]: NO Ruling epoch changed to:2
>>>>>> Mar 29 11:59:12 SC-1 osafimmloadd: NO Load ending normally
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>>>>> IMM_SERVER_LOADING_SERVER --> IMM_SERVER_READY
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO ImmModel received
>>>>>> scAbsenceAllowed 0
>>>>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: Started
>>>>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: NO LOGSV_DATA_GROUPNAME not
>>>>>> found
>>>>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: NO LOG root directory is:
>>>>>> "/var/log/opensaf/saflog"
>>>>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: NO LOG data group is: ""
>>>>>> Mar 29 11:59:12 SC-1 osaflogd[3327]: NO LGS_MBCSV_VERSION = 5
>>>>>> Mar 29 11:59:12 SC-1 osafdtmd[3253]: NO Established contact with
>>>>>> 'PL-998'
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Implementer connected: 1
>>>>>> (safLogService) <2, 2010f>
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO implementer for class
>>>>>> 'OpenSafLogConfig' is safLogService => class extent is safe.
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO implementer for class
>>>>>> 'SaLogStreamConfig' is safLogService => class extent is safe.
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Implementer (applier)
>>>>>> connected: 2 (@safLogService_appl) <11, 2010f>
>>>>>> Mar 29 11:59:12 SC-1 osafntfd[3340]: Started
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Implementer (applier)
>>>>>> connected: 3 (@OpenSafImmReplicatorA) <17, 2010f>
>>>>>> Mar 29 11:59:12 SC-1 osafntfimcnd[3348]: NO Started
>>>>>> Mar 29 11:59:12 SC-1 osafclmd[3354]: Started
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO Implementer connected: 4
>>>>>> (safClmService) <19, 2010f>
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO implementer for class
>>>>>> 'SaClmNode' is safClmService => class extent is safe.
>>>>>> Mar 29 11:59:12 SC-1 osafimmnd[3303]: NO implementer for class
>>>>>> 'SaClmCluster' is safClmService => class extent is safe.
>>>>>> Mar 29 11:59:12 SC-1 osafimmd[3292]: WA No coordinator IMMND
>>>>>> known (case B) - ignoring sync request
>>>>>> Mar 29 11:59:12 SC-1 osafimmd[3292]: NO Node 2e60c request sync
>>>>>> sync-pid:7734 epoch:0
>>>>>> Mar 29 11:59:12 SC-1 osafclmna[3364]: Started
>>>>>> Mar 29 11:59:12 SC-1 osafclmna[3364]: NO
>>>>>> safNode=SC-1,safCluster=myClmCluster Joined cluster, nodeid=2010f
>>>>>> Mar 29 11:59:12 SC-1 osafamfd[3373]: Started
>>>>>> Mar 29 11:59:16 SC-1 osafdtmd[3253]: NO Established contact with
>>>>>> 'PL-999'
>>>>>> Mar 29 11:59:16 SC-1 osafimmd[3292]: NO Extended intro from node
>>>>>> 2e70c
>>>>>> Mar 29 11:59:16 SC-1 osafimmd[3292]: WA PBE not configured at
>>>>>> first attached SC-immnd, but Pbe is configured for immnd at 2e70c
>>>>>> - possible upgrade from pre 4.4
>>>>>> Mar 29 11:59:16 SC-1 osafimmd[3292]: WA No coordinator IMMND
>>>>>> known (case B) - ignoring sync request
>>>>>> Mar 29 11:59:16 SC-1 osafimmd[3292]: NO Node 2e70c request sync
>>>>>> sync-pid:7615 epoch:0
>>>>>> Mar 29 11:59:17 SC-1 osafimmnd[3303]: NO Announce sync, epoch:3
>>>>>> Mar 29 11:59:17 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>>>>> IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
>>>>>> Mar 29 11:59:17 SC-1 osafimmd[3292]: NO Successfully announced
>>>>>> sync. New ruling epoch:3
>>>>>> Mar 29 11:59:17 SC-1 osafimmnd[3303]: NO NODE STATE->
>>>>>> IMM_NODE_R_AVAILABLE
>>>>>> Mar 29 11:59:17 SC-1 osafimmloadd: NO Sync starting
>>>>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: WA STOPPING sync process
>>>>>> pid 3386 after five minutes
>>>>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: ER SYNC APPARENTLY FAILED
>>>>>> status:0
>>>>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: NO -SERVER STATE:
>>>>>> IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY
>>>>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: NO NODE STATE->
>>>>>> IMM_NODE_FULLY_AVAILABLE (2624)
>>>>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: NO Epoch set to 3 in ImmModel
>>>>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: NO ACT: New Epoch for IMMND
>>>>>> process at node 2010f old epoch: 2 new epoch:3
>>>>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: NO Coord broadcasting
>>>>>> ABORT_SYNC, epoch:3
>>>>>> Mar 29 12:04:18 SC-1 osafimmnd[3303]: WA IMMND - Client Node Get
>>>>>> Failed for cli_hdl:708669735183
>>>>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA Successfully aborted
>>>>>> sync. Epoch:3
>>>>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA IMMND on controller (not
>>>>>> currently coord) requests sync
>>>>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA No coordinator IMMND
>>>>>> known (case B) - ignoring sync request
>>>>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: NO Node 2020f request sync
>>>>>> sync-pid:9043 epoch:0
>>>>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA No coordinator IMMND
>>>>>> known (case B) - ignoring sync request
>>>>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: NO Node 2e60c request sync
>>>>>> sync-pid:7734 epoch:0
>>>>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: WA No coordinator IMMND
>>>>>> known (case B) - ignoring sync request
>>>>>> Mar 29 12:04:18 SC-1 osafimmd[3292]: NO Node 2e70c request sync
>>>>>> sync-pid:7615 epoch:0
>>>>>> Mar 29 12:04:19 SC-1 osafimmnd[3303]: WA MDS Send Failed
>>>>>> Mar 29 12:04:19 SC-1 osafimmnd[3303]: WA Error code 2 returned
>>>>>> for message type 17 - ignoring
>>>>>> Mar 29 12:04:19 SC-1 osafimmnd[3303]: WA Global ABORT SYNC
>>>>>> received for epoch 3
>>>>>> Mar 29 12:04:22 SC-1 osafimmnd[3303]: NO Announce sync, epoch:4
>>>>>> Mar 29 12:04:22 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>>>>> IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
>>>>>> Mar 29 12:04:22 SC-1 osafimmd[3292]: NO Successfully announced
>>>>>> sync. New ruling epoch:4
>>>>>> Mar 29 12:04:22 SC-1 osafimmnd[3303]: NO NODE STATE->
>>>>>> IMM_NODE_R_AVAILABLE
>>>>>> Mar 29 12:04:22 SC-1 osafimmloadd: NO Sync starting
>>>>>> Mar 29 12:07:08 SC-1 osafimmnd[3303]: NO Global discard node
>>>>>> received for nodeId:2020f pid:9043
>>>>>> Mar 29 12:07:12 SC-1 osafimmnd[3303]: NO Global discard node
>>>>>> received for nodeId:2e60c pid:7734
>>>>>> Mar 29 12:07:16 SC-1 osafimmnd[3303]: NO Global discard node
>>>>>> received for nodeId:2e70c pid:7615
>>>>>> Mar 29 12:07:23 SC-1 osafimmd[3292]: NO New IMMND process is on
>>>>>> STANDBY Controller at 2020f
>>>>>> Mar 29 12:07:23 SC-1 osafimmd[3292]: WA PBE is configured at
>>>>>> first attached SC-immnd, but no Pbe file is configured for immnd
>>>>>> at node 2020f - rejecting node
>>>>>> Mar 29 12:07:23 SC-1 osafimmd[3292]: WA Error returned from
>>>>>> processing message err:2 msg-type:2
>>>>>> Mar 29 12:07:23 SC-1 osafimmnd[3303]: NO Global discard node
>>>>>> received for nodeId:2020f pid:9670
>>>>>> Mar 29 12:07:27 SC-1 osafimmd[3292]: WA PBE is configured at
>>>>>> first attached SC-immnd, but no Pbe file is configured for immnd
>>>>>> at node 2e60c - rejecting node
>>>>>> Mar 29 12:07:27 SC-1 osafimmd[3292]: WA Error returned from
>>>>>> processing message err:2 msg-type:2
>>>>>> Mar 29 12:07:27 SC-1 osafimmnd[3303]: NO Global discard node
>>>>>> received for nodeId:2e60c pid:8359
>>>>>> Mar 29 12:07:29 SC-1 osafimmloadd: IN Synced 33170 objects in total
>>>>>> Mar 29 12:07:29 SC-1 osafimmnd[3303]: NO NODE STATE->
>>>>>> IMM_NODE_FULLY_AVAILABLE 17518
>>>>>> Mar 29 12:07:29 SC-1 osafimmloadd: NO Sync ending normally
>>>>>> Mar 29 12:07:29 SC-1 osafimmnd[3303]: NO Epoch set to 4 in ImmModel
>>>>>> Mar 29 12:07:29 SC-1 osafimmd[3292]: NO ACT: New Epoch for IMMND
>>>>>> process at node 2010f old epoch: 3 new epoch:4
>>>>>> Mar 29 12:07:29 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>>>>> IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY
>>>>>> Mar 29 12:07:31 SC-1 osafimmd[3292]: NO Extended intro from node
>>>>>> 2e70c
>>>>>> Mar 29 12:07:31 SC-1 osafimmd[3292]: WA No coordinator IMMND
>>>>>> known (case B) - ignoring sync request
>>>>>> Mar 29 12:07:31 SC-1 osafimmd[3292]: NO Node 2e70c request sync
>>>>>> sync-pid:8240 epoch:0
>>>>>> Mar 29 12:07:33 SC-1 osafimmnd[3303]: NO Announce sync, epoch:5
>>>>>> Mar 29 12:07:33 SC-1 osafimmnd[3303]: NO SERVER STATE:
>>>>>> IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
>>>>>> Mar 29 12:07:33 SC-1 osafimmd[3292]: NO Successfully announced
>>>>>> sync. New ruling epoch:5
>>>>>> Mar 29 12:07:33 SC-1 osafimmnd[3303]: NO NODE STATE->
>>>>>> IMM_NODE_R_AVAILABLE
>>>>>> Mar 29 12:07:33 SC-1 osafimmloadd: NO Sync starting
>>>>>> Mar 29 12:07:38 SC-1 osafimmd[3292]: NO New IMMND process is on
>>>>>> STANDBY Controller at 2020f
>>>>>> Mar 29 12:07:38 SC-1 osafimmd[3292]: WA PBE is configured at
>>>>>> first attached SC-immnd, but no Pbe file is configured for immnd
>>>>>> at node 2020f - rejecting node
>>>>>> Mar 29 12:07:38 SC-1 osafimmd[3292]: WA Error returned from
>>>>>> processing message err:2 msg-type:2
>>>>>> Mar 29 12:07:38 SC-1 osafimmnd[3303]: NO Global discard node
>>>>>> received for nodeId:2020f pid:9709
>>>>>> Mar 29 12:07:38 SC-1 osafimmd[3292]: WA IMMD lost contact with
>>>>>> peer IMMD (NCSMDS_RED_DOWN)
>>>>>> Mar 29 12:07:39 SC-1 osafdtmd[3253]: NO Lost contact with 'SC-2'
>>>>>> Mar 29 12:07:39 SC-1 osaffmd[3282]: NO Node Down event for node
>>>>>> id 2020f:
>>>>>> Mar 29 12:07:39 SC-1 osaffmd[3282]: NO Current role: ACTIVE
>>>>>> Mar 29 12:07:39 SC-1 osaffmd[3282]: Rebooting OpenSAF NodeId = 0
>>>>>> EE Name = No EE Mapped, Reason: Failover occurred, but this node
>>>>>> is not yet ready, OwnNodeId = 131343, SupervisionTime = 60
>>>>>> Mar 29 12:07:39 SC-1 opensaf_reboot: Rebooting local node; timeout=60
>>>>>> =============================================================================================
>>>>>>
>>>>>> -AVM
>>>>>>
>>>>>> On 3/18/2016 9:38 PM, Anders Widell wrote:
>>>>>>> osaf/libs/core/mds/include/mds_dt.h | 26 +++++++++++++++++---------
>>>>>>> osaf/libs/core/mds/mds_c_db.c | 26 ++++++++++++--------------
>>>>>>> 2 files changed, 29 insertions(+), 23 deletions(-)
>>>>>>>
>>>>>>>
>>>>>>> Support up to 4095 nodes in the flat addressing scheme for TIPC, by
>>>>>>> encoding the
>>>>>>> slot ID in the lower eight bits and the ones' complement of the subslot
>>>>>>> ID in
>>>>>>> bits 8 to 11 in the node identifier of the TIPC address. The reason for
>>>>>>> taking
>>>>>>> the ones' complement of the subslot ID is backwards compatibility with
>>>>>>> existing
>>>>>>> installations, so that this enhancement can be upgraded in-service.
>>>>>>>
>>>>>>> diff --git a/osaf/libs/core/mds/include/mds_dt.h
>>>>>>> b/osaf/libs/core/mds/include/mds_dt.h
>>>>>>> --- a/osaf/libs/core/mds/include/mds_dt.h
>>>>>>> +++ b/osaf/libs/core/mds/include/mds_dt.h
>>>>>>> @@ -237,7 +237,8 @@ bool mdtm_mailbox_mbx_cleanup(NCSCONTEXT
>>>>>>>
>>>>>>> /*
>>>>>>> * In the default flat addressing scheme, TIPC node addresses looks
>>>>>>> like
>>>>>>> - * 1.1.1, 1.1.2 etc.
>>>>>>> + * 1.1.1, 1.1.2 etc. The ones' complement of the subslot ID is shifted
>>>>>>> 8
>>>>>>> + * bits up and the slot ID is added in the 8 LSB.
>>>>>>> * In the non flat (old/legacy) addressing scheme TIPC addresses
>>>>>>> looks like
>>>>>>> * 1.1.31, 1.1.47. The slot ID is shifted 4 bits up and subslot ID is
>>>>>>> added
>>>>>>> * in the 4 LSB.
>>>>>>> @@ -248,13 +249,20 @@ bool mdtm_mailbox_mbx_cleanup(NCSCONTEXT
>>>>>>>
>>>>>>> #if (MDS_USE_SUBSLOT_ID == 0)
>>>>>>> #define MDS_TIPC_NODE_ID_MIN 0x01001001
>>>>>>> -#define MDS_TIPC_NODE_ID_MAX 0x010010ff
>>>>>>> -#define MDS_NCS_NODE_ID_MIN (MDS_NCS_CHASSIS_ID|0x0000010f)
>>>>>>> -#define MDS_NCS_NODE_ID_MAX (MDS_NCS_CHASSIS_ID|0x0000ff0f)
>>>>>>> -#define m_MDS_GET_NCS_NODE_ID_FROM_TIPC_NODE_ID(node) \
>>>>>>> - (NODE_ID)( MDS_NCS_CHASSIS_ID | (((node)&0xff)<<8) | (0xf))
>>>>>>> -#define m_MDS_GET_TIPC_NODE_ID_FROM_NCS_NODE_ID(node) \
>>>>>>> - (NODE_ID)( MDS_TIPC_COMMON_ID | (((node)&0xff00)>>8) )
>>>>>>> +#define MDS_TIPC_NODE_ID_MAX 0x01001fff
>>>>>>> +static inline NODE_ID m_MDS_GET_NCS_NODE_ID_FROM_TIPC_NODE_ID(NODE_ID
>>>>>>> node) {
>>>>>>> + return MDS_NCS_CHASSIS_ID | ((node & 0xff) << 8) | (((node &
>>>>>>> 0xf00) >> 8) ^ 0xf);
>>>>>>> +}
>>>>>>> +static inline NODE_ID m_MDS_GET_TIPC_NODE_ID_FROM_NCS_NODE_ID(NODE_ID
>>>>>>> node) {
>>>>>>> + return MDS_TIPC_COMMON_ID | ((node & 0xff00) >> 8) | (((node &
>>>>>>> 0xf) ^ 0xf) << 8);
>>>>>>> +}
>>>>>>> +static inline uint32_t m_MDS_CHECK_TIPC_NODE_ID_RANGE(NODE_ID node) {
>>>>>>> + return node < MDS_TIPC_NODE_ID_MIN || node >
>>>>>>> MDS_TIPC_NODE_ID_MAX ?
>>>>>>> + NCSCC_RC_FAILURE : NCSCC_RC_SUCCESS;
>>>>>>> +}
>>>>>>> +static inline uint32_t m_MDS_CHECK_NCS_NODE_ID_RANGE(NODE_ID node) {
>>>>>>> + return
>>>>>>> m_MDS_CHECK_TIPC_NODE_ID_RANGE(m_MDS_GET_TIPC_NODE_ID_FROM_NCS_NODE_ID(node));
>>>>>>> +}
>>>>>>> #else
>>>>>>> #define MDS_TIPC_NODE_ID_MIN 0x01001001
>>>>>>> #define MDS_TIPC_NODE_ID_MAX 0x0100110f
>>>>>>> @@ -264,10 +272,10 @@ bool mdtm_mailbox_mbx_cleanup(NCSCONTEXT
>>>>>>> (NODE_ID)( MDS_NCS_CHASSIS_ID | ((node)&0xf) |
>>>>>>> (((node)&0xff0)<<4))
>>>>>>> #define m_MDS_GET_TIPC_NODE_ID_FROM_NCS_NODE_ID(node) \
>>>>>>> (NODE_ID)( MDS_TIPC_COMMON_ID | (((node)&0xff00)>>4) |
>>>>>>> ((node)&0xf) )
>>>>>>> -#endif
>>>>>>>
>>>>>>> #define m_MDS_CHECK_TIPC_NODE_ID_RANGE(node)
>>>>>>> (((((node)<MDS_TIPC_NODE_ID_MIN)||((node)>MDS_TIPC_NODE_ID_MAX))?NCSCC_RC_FAILURE:NCSCC_RC_SUCCESS))
>>>>>>> #define m_MDS_CHECK_NCS_NODE_ID_RANGE(node)
>>>>>>> (((((node)<MDS_NCS_NODE_ID_MIN)||((node)>MDS_NCS_NODE_ID_MAX))?NCSCC_RC_FAILURE:NCSCC_RC_SUCCESS))
>>>>>>> +#endif
>>>>>>>
>>>>>>> /* ******************************************** */
>>>>>>> /* ******************************************** */
>>>>>>> diff --git a/osaf/libs/core/mds/mds_c_db.c
>>>>>>> b/osaf/libs/core/mds/mds_c_db.c
>>>>>>> --- a/osaf/libs/core/mds/mds_c_db.c
>>>>>>> +++ b/osaf/libs/core/mds/mds_c_db.c
>>>>>>> @@ -37,14 +37,13 @@ void get_adest_details(MDS_DEST adest, c
>>>>>>> char *token, *saveptr;
>>>>>>> struct stat s;
>>>>>>> uint32_t process_id = 0;
>>>>>>> - NCS_PHY_SLOT_ID phy_slot;
>>>>>>> - NCS_SUB_SLOT_ID sub_slot;
>>>>>>> + SlotSubslotId slot_subslot_id;
>>>>>>> char pid_path[1024];
>>>>>>> char *pid_name = NULL;
>>>>>>> char process_name[MDS_MAX_PROCESS_NAME_LEN];
>>>>>>> bool remote = false;
>>>>>>>
>>>>>>> -
>>>>>>> m_NCS_GET_PHYINFO_FROM_NODE_ID(m_NCS_NODE_ID_FROM_MDS_DEST(adest),
>>>>>>> NULL, &phy_slot, &sub_slot);
>>>>>>> + slot_subslot_id =
>>>>>>> GetSlotSubslotIdFromNodeId(m_NCS_NODE_ID_FROM_MDS_DEST(adest));
>>>>>>>
>>>>>>> if (!tipc_mode_enabled) {
>>>>>>> process_id = m_MDS_GET_PROCESS_ID_FROM_ADEST(adest);
>>>>>>> @@ -111,11 +110,11 @@ void get_adest_details(MDS_DEST adest, c
>>>>>>> }
>>>>>>>
>>>>>>> if (remote == true)
>>>>>>> - snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>>>>> "<rem_nodeid[%d]:%s>",
>>>>>>> - phy_slot, process_name);
>>>>>>> + snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>>>>> "<rem_nodeid[%u]:%s>",
>>>>>>> + slot_subslot_id, process_name);
>>>>>>> else
>>>>>>> - snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>>>>> "<nodeid[%d]:%s>",
>>>>>>> - phy_slot, process_name);
>>>>>>> + snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>>>>> "<nodeid[%u]:%s>",
>>>>>>> + slot_subslot_id, process_name);
>>>>>>>
>>>>>>> m_MDS_LOG_DBG("MDS:DB: adest_details: %s ", adest_details);
>>>>>>> m_MDS_LEAVE();
>>>>>>> @@ -129,8 +128,7 @@ void get_adest_details(MDS_DEST adest, c
>>>>>>> void get_subtn_adest_details(MDS_PWE_HDL pwe_hdl, MDS_SVC_ID svc_id,
>>>>>>> MDS_DEST adest, char* adest_details)
>>>>>>> {
>>>>>>> uint32_t process_id = 0;
>>>>>>> - NCS_PHY_SLOT_ID phy_slot;
>>>>>>> - NCS_SUB_SLOT_ID sub_slot;
>>>>>>> + SlotSubslotId slot_subslot_id;
>>>>>>> char process_name[MDS_MAX_PROCESS_NAME_LEN];
>>>>>>> bool remote = false;
>>>>>>> MDS_SVC_INFO *svc_info = NULL;
>>>>>>> @@ -139,7 +137,7 @@ void get_subtn_adest_details(MDS_PWE_HDL
>>>>>>> char *pid_name = NULL;
>>>>>>> struct stat s;
>>>>>>>
>>>>>>> -
>>>>>>> m_NCS_GET_PHYINFO_FROM_NODE_ID(m_NCS_NODE_ID_FROM_MDS_DEST(adest),
>>>>>>> NULL, &phy_slot, &sub_slot);
>>>>>>> + slot_subslot_id =
>>>>>>> GetSlotSubslotIdFromNodeId(m_NCS_NODE_ID_FROM_MDS_DEST(adest));
>>>>>>> process_id = m_MDS_GET_PROCESS_ID_FROM_ADEST(adest);
>>>>>>>
>>>>>>> if (NCSCC_RC_SUCCESS == mds_mcm_check_intranode(adest)) {
>>>>>>> @@ -185,11 +183,11 @@ void get_subtn_adest_details(MDS_PWE_HDL
>>>>>>> }
>>>>>>>
>>>>>>> if (remote == true)
>>>>>>> - snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>>>>> "<rem_node[%d]:%s>",
>>>>>>> - phy_slot, process_name);
>>>>>>> + snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>>>>> "<rem_node[%u]:%s>",
>>>>>>> + slot_subslot_id, process_name);
>>>>>>> else
>>>>>>> - snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>>>>> "<node[%d]:%s>",
>>>>>>> - phy_slot, process_name);
>>>>>>> + snprintf(adest_details, MDS_MAX_PROCESS_NAME_LEN,
>>>>>>> "<node[%u]:%s>",
>>>>>>> + slot_subslot_id, process_name);
>>>>>>> done:
>>>>>>> m_MDS_LOG_DBG("MDS:DB: adest_details: %s ", adest_details);
>>>>>>> m_MDS_LEAVE();
>>>>>>
>>>>>
>>>>
>>>
>>
>
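
For reference, the slot/subslot encoding described in the quoted patch
(slot ID in the lower eight bits, ones' complement of the subslot ID in
bits 8 to 11) can be sketched as a small standalone C fragment. This is
only an illustration, not the OpenSAF code itself: the two conversion
expressions and the upper slot bound follow the patch, but CHASSIS_ID and
TIPC_COMMON are assumed values inferred from the node IDs in the quoted
logs (2010f) and from MDS_TIPC_NODE_ID_MIN, and tipc_from_slot() is a
hypothetical helper that does not exist in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, inferred from the quoted logs and from
 * MDS_TIPC_NODE_ID_MIN (0x01001001); not copied from the headers. */
#define CHASSIS_ID  0x00020000u
#define TIPC_COMMON 0x01001000u

/* TIPC node ID -> NCS node ID, same expression as in the patch:
 * the low 8 bits become the slot byte, bits 8..11 are complemented
 * and become the subslot nibble. */
static inline uint32_t ncs_from_tipc(uint32_t tipc)
{
    return CHASSIS_ID | ((tipc & 0xffu) << 8) | (((tipc & 0xf00u) >> 8) ^ 0xfu);
}

/* NCS node ID -> TIPC node ID, the inverse mapping from the patch. */
static inline uint32_t tipc_from_ncs(uint32_t ncs)
{
    return TIPC_COMMON | ((ncs & 0xff00u) >> 8) | (((ncs & 0xfu) ^ 0xfu) << 8);
}

/* Hypothetical helper: TIPC node ID for a flat slot number 1..4095. */
static inline uint32_t tipc_from_slot(uint32_t slot)
{
    assert(slot >= 1 && slot <= 0xfffu);
    return TIPC_COMMON | slot;
}
```

Under these assumptions, slots 1, 2, 998 and 999 map to NCS node IDs
2010f, 2020f, 2e60c and 2e70c, which are the node IDs that appear in the
quoted logs. Slots up to 255 keep subslot nibble 0xf, which is why the
scheme stays compatible with existing installations.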
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel