- **Comment**:
changeset: 6400:6637c54a41e7
branch: opensaf-4.5.x
parent: 6393:025368efd626
user: Nagendra Kumar <[email protected]>
date: Fri Mar 27 19:44:43 2015 +0530
summary: amfd: avoid failover in completed step of clm track when plm is enabled [#1264]

changeset: 6401:538a8894aa06
branch: opensaf-4.6.x
parent: 6397:c18c06bcbc73
user: Nagendra Kumar <[email protected]>
date: Fri Mar 27 19:44:43 2015 +0530
summary: amfd: avoid failover in completed step of clm track when plm is enabled [#1264]

changeset: 6402:ee847b078451
tag: tip
parent: 6399:729a89ab59f9
user: Nagendra Kumar <[email protected]>
date: Fri Mar 27 19:44:43 2015 +0530
summary: amfd: avoid failover in completed step of clm track when plm is enabled [#1264]
---
**[tickets:#1264] hard resetting active controller reboots standby too**
**Status:** review
**Milestone:** 4.6.RC1
**Created:** Thu Mar 12, 2015 04:35 PM UTC by Alex Jones
**Last Updated:** Mon Mar 16, 2015 05:13 PM UTC
**Owner:** Alex Jones
If I hard reset the active controller, the standby reboots too. The node down
message arrives after amfd on the active has initiated a switchover, so the
newly active OpenSAF services fail in saImmOiClassImplementerSet because the
node down has not been seen yet. The bug appears to be in how AMF, CLM, and PLM
interwork.
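
To make the ordering problem concrete, here is a toy model in Python. It is only a sketch and not OpenSAF code: `ToyImm`, `replay`, and the node names `SC-1`/`SC-2` are hypothetical, and it assumes that an implementer registered by the old active is only released once node down is processed. The return value 14 is `SA_AIS_ERR_EXIST` in `SaAisErrorT`.

```python
from dataclasses import dataclass, field

SA_AIS_OK = 1
SA_AIS_ERR_EXIST = 14


@dataclass
class ToyImm:
    """Hypothetical stand-in for an implementer table; not the real IMM."""
    owners: dict = field(default_factory=dict)  # implementer name -> owning node

    def implementer_set(self, name, node):
        # Refuse the request while another node still holds the implementer.
        holder = self.owners.get(name)
        if holder is not None and holder != node:
            return SA_AIS_ERR_EXIST
        self.owners[name] = node
        return SA_AIS_OK

    def node_down(self, node):
        # Assumption: node down releases everything the failed node owned.
        self.owners = {n: o for n, o in self.owners.items() if o != node}


def replay(events):
    """Replay events on the standby; return the result of its implementer set."""
    imm = ToyImm()
    imm.implementer_set("safLogService", "SC-1")  # old active owns it first
    rc = None
    for event in events:
        if event == "node_down":
            imm.node_down("SC-1")
        elif event == "switchover":
            rc = imm.implementer_set("safLogService", "SC-2")
    return rc


if __name__ == "__main__":
    print(replay(["switchover", "node_down"]))  # order described in this ticket
    print(replay(["node_down", "switchover"]))  # node down seen first
```

In this model the implementer set only succeeds when node down is processed before the switchover, which matches the failure described above.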
Here is a log snippet from the standby controller when the active is hard reset.
Mar 11 12:53:30 q50-s2 osafmsgd[6447]: ER Standby Processing for DEREG message Success
Mar 11 12:53:30 q50-s2 osafmsgd[6447]: ER ERR_NOT_EXIST: Track list associated with the object not found
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Mar 11 12:53:30 q50-s2 osafimmd[6257]: WA IMMD not re-electing coord for switch-over (si-swap) coord at (2010f)
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: NO Assigning 'safSi=Management-2N,safApp=ManagementApp' ACTIVE to 'safSu=Management-SU2,safSg=Management-2N,safApp=ManagementApp'
Mar 11 12:53:30 q50-s2 osafmsgd[6447]: ER mqd_imm_declare_implementer failed: err = 14
Mar 11 12:53:30 q50-s2 osaflogd[6275]: ER saImmOiClassImplementerSet (safLogService) failed: 14
Mar 11 12:53:30 q50-s2 osafckptd[6367]: ER cpd immOiImplmenterSet failed with err = 14
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: NO 'safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: ER safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: Rebooting OpenSAF NodeId = 131599 EE Name = safEE=Linux_os_hosting_clm_node,safHE=Stirling_Blade_slot_2,safDomain=Q50chassis, Reason: Component faulted: recovery is node failfast, OwnNodeId = 131599, SupervisionTime = 60
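
For anyone triaging similar logs, a small Python sketch like the following can pull out the failing calls and translate the numeric code. It assumes each syslog entry has been joined onto one line and that the numbers printed by these daemons are `SaAisErrorT` values (14 is `SA_AIS_ERR_EXIST`). The helper `failed_calls` and the regular expression are illustrative, not part of OpenSAF, and the table lists only a few codes.

```python
import re

# Partial SaAisErrorT table; only a few values are listed here.
SA_AIS_ERROR_NAMES = {
    1: "SA_AIS_OK",
    5: "SA_AIS_ERR_TIMEOUT",
    6: "SA_AIS_ERR_TRY_AGAIN",
    12: "SA_AIS_ERR_NOT_EXIST",
    14: "SA_AIS_ERR_EXIST",
}

# Matches "failed: 14", "failed: err = 14" and "failed with err = 14".
FAILED_RE = re.compile(
    r"(?P<daemon>osaf\w+)\[\d+\]: ER .*?failed(?: with)?\s*:?\s*"
    r"(?:err\s*=\s*)?(?P<code>\d+)\s*$"
)

SAMPLE = [
    "Mar 11 12:53:30 q50-s2 osafmsgd[6447]: ER Standby Processing for DEREG message Success",
    "Mar 11 12:53:30 q50-s2 osafmsgd[6447]: ER mqd_imm_declare_implementer failed: err = 14",
    "Mar 11 12:53:30 q50-s2 osaflogd[6275]: ER saImmOiClassImplementerSet (safLogService) failed: 14",
    "Mar 11 12:53:30 q50-s2 osafckptd[6367]: ER cpd immOiImplmenterSet failed with err = 14",
]


def failed_calls(lines):
    """Return (daemon, code, name) for every ER line ending in a failure code."""
    results = []
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            code = int(match.group("code"))
            name = SA_AIS_ERROR_NAMES.get(code, "unknown")
            results.append((match.group("daemon"), code, name))
    return results


if __name__ == "__main__":
    for daemon, code, name in failed_calls(SAMPLE):
        print(f"{daemon}: {code} ({name})")
```

On the lines above, all three failures map to `SA_AIS_ERR_EXIST`, which is consistent with the old active's implementers still being registered when the standby tries to take over.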
_______________________________________________
Opensaf-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets