If opensafd starts successfully on the standby, the standby node should be
ready to take the active role.
A failover was performed after the standby had joined the cluster successfully,
but the standby node could not take the active role, and an entire
*CLUSTER RESET* happened because the cluster had no active controller.
On the active controller ::
May 25 11:18:03 CONTROLLER-1 osafimmnd[2281]: NO SERVER STATE:
IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY
May 25 11:18:03 CONTROLLER-1 osafamfd[2342]: NO Received node_up from 2020f:
msg_id 1
May 25 11:18:04 CONTROLLER-1 osafamfd[2342]: NO Node 'SC-2' joined the cluster
May 25 11:18:04 CONTROLLER-1 osafimmnd[2281]: NO Implementer connected: 19
(MsgQueueService131599) <0, 2020f>
May 25 11:18:04 CONTROLLER-1 osafrded[2249]: NO Peer up on node 0x2020f
May 25 11:18:04 CONTROLLER-1 osafrded[2249]: NO Got peer info request from node
0x2020f with role STANDBY
May 25 11:18:04 CONTROLLER-1 osafrded[2249]: NO Got peer info response from
node 0x2020f with role STANDBY
May 25 11:18:04 CONTROLLER-1 osafimmnd[2281]: NO Implementer (applier)
connected: 20 (@safAmfService2020f) <0, 2020f>
May 25 11:18:05 CONTROLLER-1 osafimmnd[2281]: NO Implementer (applier)
connected: 21 (@OpenSafImmReplicatorB) <0, 2020f>
May 25 11:18:05 CONTROLLER-1 osafamfnd[2353]: NO
'safComp=CPD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' :
Recovery is 'nodeFailfast'
On the standby controller ::
May 25 11:18:04 CONTROLLER-2 osafrded[4212]: NO Got peer info response from
node 0x2010f with role ACTIVE
May 25 11:18:04 CONTROLLER-2 osafimmd[4231]: IN AMF HA STANDBY request
May 25 11:18:04 CONTROLLER-2 osafimmd[4231]: IN node with dest ADDED
564114611150864
May 25 11:18:04 CONTROLLER-2 osafamfnd[4292]: NO Assigned
'safSi=SC-2N,safApp=OpenSAF' STANDBY to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
May 25 11:18:04 CONTROLLER-2 osafimmd[4231]: IN node with dest ADDED
565214191280144
May 25 11:18:04 CONTROLLER-2 opensafd: OpenSAF(5.0.0 - ) services successfully
started
May 25 11:18:04 CONTROLLER-2 osafimmd[4231]: IN node with dest ADDED
567412731609092
May 25 11:18:04 CONTROLLER-2 osafimmd[4231]: IN node with dest ADDED
566316589850628
done
CONTROLLER-2:~ # May 25 11:18:04 CONTROLLER-2 osafimmnd[4242]: NO Implementer
(applier) connected: 20 (@safAmfService2020f) <139, 2020f>
May 25 11:18:04 CONTROLLER-2 osafimmnd[4242]: NO Implementer (applier)
connected: 21 (@OpenSafImmReplicatorB) <147, 2020f>
May 25 11:18:04 CONTROLLER-2 osafntfimcnd[4446]: NO Started
May 25 11:18:05 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 12
<0, 2010f> (safCheckPointService)
May 25 11:18:10 CONTROLLER-2 osaffmd[4221]: NO Node Down event for node id
2010f:
May 25 11:18:10 CONTROLLER-2 osaffmd[4221]: NO Current role: STANDBY
May 25 11:18:10 CONTROLLER-2 osaffmd[4221]: Rebooting OpenSAF NodeId = 131343
EE Name = , Reason: Received Node Down for peer controller, OwnNodeId = 131599,
SupervisionTime = 60
May 25 11:18:10 CONTROLLER-2 kernel: [ 2246.200249] TIPC: Resetting link
<1.1.2:eth3-1.1.1:eth0>, peer not responding
May 25 11:18:10 CONTROLLER-2 kernel: [ 2246.200263] TIPC: Lost link
<1.1.2:eth3-1.1.1:eth0> on network plane A
May 25 11:18:10 CONTROLLER-2 kernel: [ 2246.200272] TIPC: Lost contact with
<1.1.1>
May 25 11:18:10 CONTROLLER-2 osafrded[4212]: NO Peer down on node 0x2010f
May 25 11:18:10 CONTROLLER-2 osafimmd[4231]: WA IMMD lost contact with peer
IMMD (NCSMDS_RED_DOWN)
May 25 11:18:10 CONTROLLER-2 osafimmd[4231]: IN Resend of fevs message 52769,
will not mbcp to peer IMMD
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: WA DISCARD DUPLICATE FEVS
message:52769
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: WA Error code 2 returned for
message type 82 - ignoring
May 25 11:18:10 CONTROLLER-2 osafimmd[4231]: IN Resend of fevs message 52770,
will not mbcp to peer IMMD
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: WA DISCARD DUPLICATE FEVS
message:52770
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: WA Error code 2 returned for
message type 82 - ignoring
May 25 11:18:10 CONTROLLER-2 osafimmd[4231]: WA IMMND DOWN on active controller
1 detected at standby immd!! 2. Possible failover
May 25 11:18:10 CONTROLLER-2 osafimmd[4231]: NO Skipping re-send of fevs
message 52769 since it has recently been resent.
May 25 11:18:10 CONTROLLER-2 osafimmd[4231]: NO Skipping re-send of fevs
message 52770 since it has recently been resent.
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Global discard node received
for nodeId:2010f pid:2281
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 13
<0, 2010f(down)> (OpenSafImmPBE)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 10
<0, 2010f(down)> (safSmfService)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 9 <0,
2010f(down)> (safLckService)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 8 <0,
2010f(down)> (safEvtService)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 7 <0,
2010f(down)> (safMsgGrpService)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 6 <0,
2010f(down)> (MsgQueueService131343)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 5 <0,
2010f(down)> (safAmfService)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 4 <0,
2010f(down)> (safClmService)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 3 <0,
2010f(down)> (@OpenSafImmReplicatorA)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 2 <0,
2010f(down)> (@safLogService_appl)
May 25 11:18:10 CONTROLLER-2 osafimmnd[4242]: NO Implementer disconnected 1 <0,
2010f(down)> (safLogService)
May 25 11:18:10 CONTROLLER-2 opensaf_reboot: Rebooting remote node in the
absence of PLM is outside the scope of OpenSAF
May 25 11:18:10 CONTROLLER-2 osaffmd[4221]: NO Controller Failover: Setting
role to ACTIVE
May 25 11:18:10 CONTROLLER-2 osafrded[4212]: NO RDE role set to ACTIVE
May 25 11:18:10 CONTROLLER-2 osafrded[4212]: NO Running
'/usr/lib64/opensaf/opensaf_sc_active' with 0 argument(s)
May 25 11:18:10 CONTROLLER-2 osafimmd[4231]: NO ACTIVE request
May 25 11:18:10 CONTROLLER-2 osaflogd[4252]: NO ACTIVE request
May 25 11:18:10 CONTROLLER-2 osafntfd[4262]: NO ACTIVE request
May 25 11:18:10 CONTROLLER-2 osafclmd[4272]: NO ACTIVE request
May 25 11:18:10 CONTROLLER-2 osafamfd[4282]: NO FAILOVER StandBy --> Active
May 25 11:18:10 CONTROLLER-2 osafamfd[4282]: ER FAILOVER StandBy --> Active
FAILED, Standby OUT OF SYNC
May 25 11:18:10 CONTROLLER-2 osafamfd[4282]: Rebooting OpenSAF NodeId = 0 EE
Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 131599,
SupervisionTime = 60
If cold sync is still in progress in the background at this point, opensafd on
the standby is not completely up. Reporting a successful opensafd start on the
standby therefore gives the user a false claim.
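
To check other logs for the same pattern, a small helper along these lines may be useful. It is only a sketch: the function name `start_claim_contradicted` is made up for illustration, it expects each syslog entry already joined onto one line, and the two search strings are taken verbatim from the standby log above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of OpenSAF.
// Returns true when the log first reports a successful opensafd start and
// later shows AMFD failing the failover because the standby was out of sync.
bool start_claim_contradicted(const std::vector<std::string>& lines) {
  bool started = false;
  for (const std::string& line : lines) {
    if (line.find("services successfully started") != std::string::npos) {
      started = true;  // opensafd claimed a successful start
    } else if (started &&
               line.find("Standby OUT OF SYNC") != std::string::npos) {
      return true;  // the claim was contradicted by a later failover failure
    }
  }
  return false;
}
```

Passing the standby log above (one entry per string) to this function would flag it, since the "services successfully started" entry at 11:18:04 precedes the "Standby OUT OF SYNC" entry at 11:18:10.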
---
** [tickets:#1842] rde: standby amfd notifies to NID early.**
**Status:** invalid
**Milestone:** never
**Created:** Fri May 20, 2016 09:25 AM UTC by Praveen
**Last Updated:** Fri May 20, 2016 07:19 PM UTC
**Owner:** nobody
From 5.0, the RDE API rda_get_role() returns the quiesced role on any
controller other than the active one.
Because the API returns the quiesced role, AMFD notifies NID in
initialize_for_assignment() even before cold sync completes. This leads to
assignment of MW components while the standby AMFD is still undergoing cold
sync. This reopens the fixed issue #1334.
AMFD gets the standby role through the RDE callback, then calls
initialize_for_assignment() again and initializes its interfaces. Note also
that the RDE callback does not come for the quiesced role on a spare controller.
This problem could apply to other directors as well, or at least notifying NID
before getting the role on the standby may need some investigation.
One possible solution could be to deliver the callback for the quiesced role as
well. In that case the call to rda_get_role() together with
initialize_for_assignment() can be removed, and initialize_for_assignment()
called only in the RDE callback.
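
The control flow of that proposal could look roughly like the sketch below. This is not AMFD code: `DirectorSketch`, `Role`, `on_role_callback()` and `on_cold_sync_complete()` are hypothetical names for illustration, and delaying the NID notification on the standby until cold sync completes is an assumption about the intended fix, not something confirmed in this ticket.

```cpp
#include <cassert>

// Hypothetical role values; the real OpenSAF enums and numbers may differ.
enum class Role { Active, Standby, Quiesced };

// Hypothetical stand-in for a director such as AMFD.
class DirectorSketch {
 public:
  // The only place where initialization is triggered: the role callback.
  // There is no rda_get_role() + initialize_for_assignment() at startup.
  void on_role_callback(Role role) {
    role_ = role;
    if (role == Role::Quiesced) {
      return;  // role recorded only; nothing initialized, NID not notified
    }
    initialize_for_assignment();
    if (role == Role::Active) {
      notify_nid();  // the active has no cold sync to wait for
    }
  }

  // Assumed hook: called once cold sync from the active peer has finished.
  void on_cold_sync_complete() {
    if (initialized_ && role_ == Role::Standby) {
      notify_nid();
    }
  }

  bool initialized() const { return initialized_; }
  bool nid_notified() const { return nid_notified_; }

 private:
  void initialize_for_assignment() { initialized_ = true; }

  void notify_nid() {
    assert(initialized_);  // never report readiness before initialization
    nid_notified_ = true;
  }

  Role role_ = Role::Quiesced;  // assumed starting point for this sketch
  bool initialized_ = false;
  bool nid_notified_ = false;
};
```

With this shape, a quiesced callback on a spare leaves the director uninitialized, and a standby only reports to NID after `on_cold_sync_complete()`, so opensafd would not report a successful start while cold sync is still running.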
Active AMFD:
May 20 14:51:26.012760 osafamfd [485:getenv.cc:0124] TR
OSAF_AMF_MIN_CLUSTER_SIZE is not set; using default value 2
May 20 14:51:26.014155 osafamfd [485:role.cc:0176] >>
initialize_for_assignment: ha_state = 1
May 20 14:51:26.014646 osafamfd [485:mds.cc:0108] >> avd_mds_init
May 20 14:51:26.026272 osafamfd [485:mds.cc:0136] TR vdest created
May 20 14:51:26.030009 osafamfd [485:mds.cc:0160] TR mds install vdest
Standby AMFD:
May 20 14:51:32.911753 osafamfd [481:getenv.cc:0124] TR
OSAF_AMF_MIN_CLUSTER_SIZE is not set; using default value 2
May 20 14:51:32.916912 osafamfd [481:role.cc:0176] >>
initialize_for_assignment: ha_state = 3
May 20 14:51:32.921865 osafamfd [481:role.cc:0243] <<
initialize_for_assignment: rc = 1
May 20 14:51:32.921912 osafamfd [481:main.cc:0587] << initialize
May 20 14:51:40.391058 osafamfd [481:main.cc:0456] >> rda_cb
May 20 14:51:40.391266 osafamfd [481:main.cc:0478] << rda_cb
May 20 14:51:40.392132 osafamfd [481:main.cc:0757] >> process_event:
evt->rcv_evt 23
May 20 14:51:40.392200 osafamfd [481:role.cc:0078] >> avd_role_change_evh:
cause=1, role=2, current_role=3
May 20 14:51:40.392230 osafamfd [481:role.cc:0176] >>
initialize_for_assignment: ha_state = 2
May 20 14:51:40.392256 osafamfd [481:mds.cc:0108] >> avd_mds_init
Spare AMFD:
May 20 14:51:33.030099 osafamfd [482:getenv.cc:0124] TR
OSAF_AMF_MIN_CLUSTER_SIZE is not set; using default value 2
May 20 14:51:33.030828 osafamfd [482:role.cc:0176] >>
initialize_for_assignment: ha_state = 3
May 20 14:51:33.031877 osafamfd [482:role.cc:0243] <<
initialize_for_assignment: rc = 1
May 20 14:51:33.031877 osafamfd [482:main.cc:0587] << initialize