Hi Ravi Sekhar,
Attached is a smaller log file containing just the standby controller's amfnd
log. I previously sent multiple logs; however, they were too big and were
quarantined.
I issued "opensafd stop" on the active controller at 20:31:01, just after the
"Cold sync complete!" entry in the log.
Thanks,
Chris
-----Original Message-----
From: Bisirri, Christopher D (US)
Sent: Tuesday, September 05, 2017 4:02 PM
To: 'Ravi Sekhar Reddy Konda' <[email protected]>; Carroll, James R
(US) <[email protected]>
Cc: [email protected]
Subject: RE: EXTERNAL: Re: [users] Open-SAF 5.2 - question on TIPC
usage/failovers
Hi Ravi Sekhar,
Here are our logs, which capture the issue Jim described earlier. This is a tar
file that I renamed to ".allow" in case your system strips attachments.
You'll need to rename the file to logs.tar to view the logs.
Just to give a little background: for this scenario we have 2 nodes defined as
controllers, amcontroller1 (active) and amcontroller2 (standby). We performed
an OpenSAF shutdown on amcontroller1, expecting amcontroller2 to become the
active controller. We shut down amcontroller1 by issuing the following command:
"/etc/init.d/opensafd stop".
Thanks for taking a look.
-Chris
-----Original Message-----
From: Ravi Sekhar Reddy Konda [mailto:[email protected]]
Sent: Monday, September 04, 2017 4:50 AM
To: Carroll, James R (US) <[email protected]>
Cc: [email protected]
Subject: EXTERNAL: Re: [users] Open-SAF 5.2 - question on TIPC usage/failovers
Hi Jim,
I just tried the scenario that you explained with OpenSAF 5.2 and did not see
any issue.
When you say reconfigure, I assume that you brought down the new Active
and then reconfigured OpenSAF on both nodes to use TIPC.
Can you share the AMF logs from when you face the issue? We can then analyze
the cause.
Regards,
Ravi Sekhar
----- Original Message -----
From: [email protected]
To: [email protected]
Sent: Friday, September 1, 2017 9:48:55 PM GMT +05:30 Chennai, Kolkata, Mumbai,
New Delhi
Subject: Re: [users] Open-SAF 5.2 - question on TIPC usage/failovers
Hi,
We are running OpenSAF 5.2. We have successfully run the following
scenario using TCP:
1) Two SC nodes, configured as active and standby: we use opensafd stop to
perform a graceful shutdown of the active controller, and the standby takes
over as expected.
When we reconfigure OpenSAF to use TIPC, we see problems with controller
failovers:
1) Two SC nodes: we use opensafd stop to perform a graceful shutdown of the
active controller. The standby does not take over, the immnd eventually
restarts, and we are left in a bad state that requires a node reboot.
Is there any information available regarding this behavior, or are there other
settings we need to adjust to configure TIPC properly?
Additionally, we stepped back to OpenSAF 5.0 and confirmed the same behavior.
When using TCP, the failover performs as expected. But when configured to use
TIPC, the same erratic behavior occurs.
Thanks.
Jim
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most engaging tech
sites, Slashdot.org!
http://sdm.link/slashdot
_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users
Sep 6 20:30:21 amcontroller2 opensafd: Starting OpenSAF Services(5.2.0 -
8760:28d2dc4b3cb8:default) (Using TIPC)
Sep 6 20:30:21 amcontroller2 osaftransportd[5338]: Started
Sep 6 20:30:21 amcontroller2 opensafd[5298]: NO Monitoring of TRANSPORT started
Sep 6 20:30:21 amcontroller2 osafclmna[5342]: Started
Sep 6 20:30:21 amcontroller2 opensafd[5298]: NO Monitoring of CLMNA started
Sep 6 20:30:21 amcontroller2 osafclmna[5342]: NO
safNode=amcontroller2,safCluster=smamTestClmCluster Joined cluster, nodeid=1020f
Sep 6 20:30:21 amcontroller2 osafrded[5356]: Started
Sep 6 20:30:21 amcontroller2 osaffmd[5370]: Started
Sep 6 20:30:21 amcontroller2 osaffmd[5370]: NO Remote fencing is disabled
Sep 6 20:30:21 amcontroller2 opensafd[5298]: NO Monitoring of HLFM started
Sep 6 20:30:21 amcontroller2 osafimmd[5385]: Started
Sep 6 20:30:21 amcontroller2 opensafd[5298]: NO Monitoring of IMMD started
Sep 6 20:30:21 amcontroller2 osafimmnd[5401]: Started
Sep 6 20:30:21 amcontroller2 osafimmnd[5401]: NO IMMD service is UP ...
ScAbsenseAllowed?:0 introduced?:0
Sep 6 20:30:21 amcontroller2 osafimmnd[5401]: NO SERVER STATE:
IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Sep 6 20:30:21 amcontroller2 osafimmnd[5401]: NO SERVER STATE:
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Sep 6 20:30:21 amcontroller2 osafimmnd[5401]: NO SERVER STATE:
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Sep 6 20:30:21 amcontroller2 osafimmnd[5401]: NO NODE STATE-> IMM_NODE_ISOLATED
Sep 6 20:30:22 amcontroller2 osafimmnd[5401]: NO NODE STATE->
IMM_NODE_W_AVAILABLE
Sep 6 20:30:22 amcontroller2 osafimmnd[5401]: NO SERVER STATE:
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Sep 6 20:30:22 amcontroller2 osafimmnd[5401]: NO NODE STATE->
IMM_NODE_FULLY_AVAILABLE 2714
Sep 6 20:30:22 amcontroller2 osafimmnd[5401]: NO RepositoryInitModeT is
SA_IMM_INIT_FROM_FILE
Sep 6 20:30:22 amcontroller2 osafimmnd[5401]: WA IMM Access Control mode is
DISABLED!
Sep 6 20:30:22 amcontroller2 osafimmnd[5401]: NO Epoch set to 3 in ImmModel
Sep 6 20:30:22 amcontroller2 osafimmnd[5401]: NO SERVER STATE:
IMM_SERVER_SYNC_CLIENT --> IMM_SERVER_READY
Sep 6 20:30:22 amcontroller2 osafimmnd[5401]: NO ImmModel received
scAbsenceAllowed 0
Sep 6 20:30:22 amcontroller2 opensafd[5298]: NO Monitoring of IMMND started
Sep 6 20:30:22 amcontroller2 osaflogd[5416]: Started
Sep 6 20:30:22 amcontroller2 opensafd[5298]: NO Monitoring of LOGD started
Sep 6 20:30:22 amcontroller2 osafntfd[5431]: Started
Sep 6 20:30:22 amcontroller2 opensafd[5298]: NO Monitoring of NTFD started
Sep 6 20:30:22 amcontroller2 osafclmd[5446]: Started
Sep 6 20:30:22 amcontroller2 opensafd[5298]: NO Monitoring of CLMD started
Sep 6 20:30:22 amcontroller2 osafamfd[5461]: Started
Sep 6 20:30:22 amcontroller2 opensafd[5298]: NO Monitoring of AMFD started
Sep 6 20:30:22 amcontroller2 osafamfnd[5476]: Started
Sep 6 20:30:22 amcontroller2 osafamfnd[5476]: NO Start monitoring AMFD using
/var/lib/opensaf/osafamfd.fifo
Sep 6 20:30:22 amcontroller2 osafamfnd[5476]: NO Sending node up due to
NCSMDS_UP
Sep 6 20:30:22 amcontroller2 osafamfnd[5476]: NO
'safSu=SC-2,safSg=2N,safApp=OpenSAF' Presence State UNINSTANTIATED =>
INSTANTIATING
Sep 6 20:30:22 amcontroller2 osafamfnd[5476]: NO
'safSu=SC-2,safSg=NoRed,safApp=OpenSAF' Presence State UNINSTANTIATED =>
INSTANTIATING
Sep 6 20:30:22 amcontroller2 osafamfwd[5498]: Started
Sep 6 20:30:22 amcontroller2 osafckptd[5506]: Started
Sep 6 20:30:22 amcontroller2 osafckptnd[5526]: Started
Sep 6 20:30:22 amcontroller2 osafevtd[5536]: Started
Sep 6 20:30:22 amcontroller2 osaflcknd[5556]: Started
Sep 6 20:30:23 amcontroller2 osaflckd[5588]: Started
Sep 6 20:30:23 amcontroller2 osafmsgnd[5608]: Started
Sep 6 20:30:23 amcontroller2 osafimmnd[5401]: NO Implementer connected: 13
(MsgQueueService66063) <124, 1020f>
Sep 6 20:30:23 amcontroller2 osafsmfnd[5639]: Started
Sep 6 20:30:23 amcontroller2 osafmsgd[5663]: Started
Sep 6 20:30:23 amcontroller2 osafamfnd[5476]: NO
'safSu=SC-2,safSg=NoRed,safApp=OpenSAF' Presence State INSTANTIATING =>
INSTANTIATED
Sep 6 20:30:23 amcontroller2 osafamfnd[5476]: NO Assigning
'safSi=NoRed2,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=NoRed,safApp=OpenSAF'
Sep 6 20:30:23 amcontroller2 osafamfnd[5476]: NO Assigned
'safSi=NoRed2,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=NoRed,safApp=OpenSAF'
Sep 6 20:30:23 amcontroller2 osafsmfd[5693]: Started
Sep 6 20:30:23 amcontroller2 osafamfnd[5476]: NO
'safSu=SC-2,safSg=2N,safApp=OpenSAF' Presence State INSTANTIATING =>
INSTANTIATED
Sep 6 20:30:23 amcontroller2 osafamfnd[5476]: NO Assigning
'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 6 20:30:23 amcontroller2 osafrded[5356]: NO RDE role set to STANDBY
Sep 6 20:30:23 amcontroller2 osafimmd[5385]: NO MDS event from svc_id 24
(change:3, dest:13)
Sep 6 20:30:23 amcontroller2 osafimmd[5385]: NO MDS event from svc_id 24
(change:5, dest:13)
Sep 6 20:30:23 amcontroller2 osafimmd[5385]: NO MDS event from svc_id 24
(change:5, dest:13)
Sep 6 20:30:23 amcontroller2 osafimmd[5385]: NO MDS event from svc_id 25
(change:3, dest:283740098879511)
Sep 6 20:30:23 amcontroller2 osafimmd[5385]: NO MDS event from svc_id 25
(change:3, dest:282640437534734)
Sep 6 20:30:23 amcontroller2 osafrded[5356]: NO Peer up on node 0x1010f
Sep 6 20:30:23 amcontroller2 osafrded[5356]: NO Got peer info request from
node 0x1010f with role ACTIVE
Sep 6 20:30:23 amcontroller2 osafrded[5356]: NO Got peer info response from
node 0x1010f with role ACTIVE
Sep 6 20:30:23 amcontroller2 osaflogd[5416]: NO LOGSV_DATA_GROUPNAME not found
Sep 6 20:30:23 amcontroller2 osaflogd[5416]: NO LOG root directory is:
"/var/log/opensaf/saflog"
Sep 6 20:30:23 amcontroller2 osaflogd[5416]: NO LOG data group is: ""
Sep 6 20:30:23 amcontroller2 osafimmnd[5401]: NO Implementer (applier)
connected: 14 (@safAmfService1020f) <129, 1020f>
Sep 6 20:30:23 amcontroller2 osaflogd[5416]: NO LGS_MBCSV_VERSION = 6
Sep 6 20:30:23 amcontroller2 osafamfnd[5476]: NO Assigned
'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 6 20:30:23 amcontroller2 opensafd: OpenSAF(5.2.0 -
8760:28d2dc4b3cb8:default) services successfully started
Sep 6 20:30:25 amcontroller2 osafamfd[5461]: NO Cold sync complete!
Sep 6 20:31:01 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 3
<0, 1010f> (safClmService)
Sep 6 20:31:01 amcontroller2 osafrded[5356]: NO Peer down on node 0x1010f
Sep 6 20:31:01 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 5
<0, 1010f> (MsgQueueService65807)
Sep 6 20:31:01 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 8
<0, 1010f> (safMsgGrpService)
Sep 6 20:31:01 amcontroller2 osafimmd[5385]: NO MDS event from svc_id 24
(change:1, dest:13)
Sep 6 20:31:01 amcontroller2 osafimmd[5385]: NO MDS event from svc_id 24
(change:6, dest:13)
Sep 6 20:31:01 amcontroller2 osaffmd[5370]: NO IMMD down on: 1010f
Sep 6 20:31:01 amcontroller2 osafimmd[5385]: WA IMMD lost contact with peer
IMMD (NCSMDS_RED_DOWN)
Sep 6 20:31:01 amcontroller2 osafimmnd[5401]: WA DISCARD DUPLICATE FEVS
message:1371
Sep 6 20:31:01 amcontroller2 osafimmnd[5401]: WA Error code 2 returned for
message type 82 - ignoring
Sep 6 20:31:01 amcontroller2 osafimmnd[5401]: WA DISCARD DUPLICATE FEVS
message:1372
Sep 6 20:31:01 amcontroller2 osafimmnd[5401]: WA Error code 2 returned for
message type 82 - ignoring
Sep 6 20:31:02 amcontroller2 osaffmd[5370]: NO FM down on: 1010f
Sep 6 20:31:02 amcontroller2 osafimmd[5385]: NO MDS event from svc_id 25
(change:4, dest:282640437534734)
Sep 6 20:31:02 amcontroller2 osaffmd[5370]: NO IMMND down on: 1010f
Sep 6 20:31:02 amcontroller2 osafimmd[5385]: WA IMMND DOWN on active
controller 1 detected at standby immd!! 2. Possible failover
Sep 6 20:31:02 amcontroller2 osafimmd[5385]: NO Skipping re-send of fevs
message 1371 since it has recently been resent.
Sep 6 20:31:02 amcontroller2 osafimmd[5385]: NO Skipping re-send of fevs
message 1372 since it has recently been resent.
Sep 6 20:31:02 amcontroller2 osafimmnd[5401]: NO Global discard node received
for nodeId:1010f pid:15277
Sep 6 20:31:02 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 12
<0, 1010f(down)> (safCheckPointService)
Sep 6 20:31:02 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 10
<0, 1010f(down)> (@safSmf_applier1)
Sep 6 20:31:02 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 9
<0, 1010f(down)> (safSmfService)
Sep 6 20:31:02 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 7
<0, 1010f(down)> (safLckService)
Sep 6 20:31:02 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 6
<0, 1010f(down)> (safEvtService)
Sep 6 20:31:02 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 4
<0, 1010f(down)> (safAmfService)
Sep 6 20:31:02 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 2
<0, 1010f(down)> (@safLogService_appl)
Sep 6 20:31:02 amcontroller2 osafimmnd[5401]: NO Implementer disconnected 1
<0, 1010f(down)> (safLogService)
Sep 6 20:31:02 amcontroller2 osaffmd[5370]: NO AMFND down on: 1010f
Sep 6 20:31:02 amcontroller2 osaffmd[5370]: NO AVD down on: 1010f
Sep 6 20:31:02 amcontroller2 osaffmd[5370]: NO Core services went down on
node_id: 1010f
Sep 6 20:31:25 amcontroller2 opensafd: Stopping OpenSAF Services
Sep 6 20:31:25 amcontroller2 osafamfnd[5476]: NO Shutdown initiated
Sep 6 20:31:25 amcontroller2 osafamfnd[5476]: NO Terminating all AMF components
Sep 6 20:31:25 amcontroller2 osaflcknd[5556]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafckptnd[5526]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafsmfnd[5639]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osaflogd[5416]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafevtd[5536]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafclmna[5342]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafsmfd[5693]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafclmd[5446]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafmsgnd[5608]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafimmnd[5401]: NO Implementer locally
disconnected. Marking it as doomed 13 <124, 1020f> (MsgQueueService66063)
Sep 6 20:31:26 amcontroller2 osafckptd[5506]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osaflckd[5588]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafrded[5356]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafmsgd[5663]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafamfwd[5498]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafntfd[5431]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafimmd[5385]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafimmnd[5401]: ER No IMMD service => cluster
restart, exiting
Sep 6 20:31:26 amcontroller2 osafamfd[5461]: NO Re-initializing with IMM
Sep 6 20:31:26 amcontroller2 osaffmd[5370]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafamfnd[5476]: NO Terminated all AMF components
Sep 6 20:31:26 amcontroller2 osafamfnd[5476]: NO Shutdown completed, exiting
Sep 6 20:31:26 amcontroller2 osafamfnd[5476]: exiting for shutdown
Sep 6 20:31:26 amcontroller2 osafamfd[5461]: exiting for shutdown
Sep 6 20:31:27 amcontroller2 osaftransportd[5338]: exiting for shutdown
Sep 6 20:31:27 amcontroller2 opensafd: OpenSAF services successfully stopped
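
For anyone reading the log above: the mail client has hard-wrapped long syslog
entries, which makes it awkward to pick out the warnings and errors around the
20:31:01 shutdown. Below is a minimal Python sketch, not part of the original
thread, that rejoins wrapped lines and lists the WA/ER entries from a given
time onward. The function names and the header pattern are assumptions based
only on the line layout visible in this log, and the time filter assumes all
entries fall on the same day.

```python
import re

# Header layout as seen in this log: "Sep 6 20:31:01 host proc[pid]: message".
# Some opensafd lines carry no "[pid]", so that part is optional.
HEADER = re.compile(
    r"^(?P<month>[A-Z][a-z]{2}) +(?P<day>\d{1,2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<proc>[\w.-]+)(?:\[(?P<pid>\d+)\])?: (?P<msg>.*)$"
)


def join_wrapped(lines):
    """Merge hard-wrapped continuation lines back into their syslog entry."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if HEADER.match(line) or not entries:
            # A new entry starts here (or this is the very first line).
            entries.append(line)
        else:
            # No syslog header: treat it as the tail of the previous entry.
            entries[-1] += " " + line.strip()
    return entries


def warnings_and_errors(lines, since="00:00:00"):
    """Yield (time, process, message) for WA/ER entries at or after `since`.

    Times are compared as HH:MM:SS strings, which only works within one day.
    """
    for entry in join_wrapped(lines):
        m = HEADER.match(entry)
        if not m:
            continue
        msg = m.group("msg")
        if msg.startswith(("WA ", "ER ")) and m.group("time") >= since:
            yield m.group("time"), m.group("proc"), msg
```

To use it, pass any iterable of log lines (for example an open file object)
together with since="20:31:01" to see only what the standby logged after the
active controller was stopped.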