- **status**: assigned --> review


---

**[tickets:#1264] hard resetting active controller reboots standby too**

**Status:** review
**Milestone:** 4.6.FC
**Created:** Thu Mar 12, 2015 04:35 PM UTC by Alex Jones
**Last Updated:** Thu Mar 12, 2015 04:35 PM UTC
**Owner:** Alex Jones

If I hard reset the active controller, the standby reboots too. This happens
because the node down message arrives after amfd on the active has initiated
a switchover, so the newly active OpenSAF services fail in
saImmOiClassImplementerSet with error 14 (SA_AIS_ERR_EXIST) because node down
has not been seen yet. The bug appears to be in how AMF, CLM, and PLM
interwork.

Here is a log snippet from the standby controller when the active is hard reset.

Mar 11 12:53:30 q50-s2 osafmsgd[6447]: ER Standby Processing for DEREG message Success
Mar 11 12:53:30 q50-s2 osafmsgd[6447]: ER ERR_NOT_EXIST: Track list associated with the object not found
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Mar 11 12:53:30 q50-s2 osafimmd[6257]: WA IMMD not re-electing coord for switch-over (si-swap) coord at (2010f)
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: NO Assigning 'safSi=Management-2N,safApp=ManagementApp' ACTIVE to 'safSu=Management-SU2,safSg=Management-2N,safApp=ManagementApp'
Mar 11 12:53:30 q50-s2 osafmsgd[6447]: ER mqd_imm_declare_implementer failed: err = 14
Mar 11 12:53:30 q50-s2 osaflogd[6275]: ER saImmOiClassImplementerSet (safLogService) failed: 14
Mar 11 12:53:30 q50-s2 osafckptd[6367]: ER cpd immOiImplmenterSet failed with err = 14
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: NO 'safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: ER safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
Mar 11 12:53:30 q50-s2 osafamfnd[6354]: Rebooting OpenSAF NodeId = 131599 EE Name = safEE=Linux_os_hosting_clm_node,safHE=Stirling_Blade_slot_2,safDomain=Q50chassis, Reason: Component faulted: recovery is node failfast, OwnNodeId = 131599, SupervisionTime = 60
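
To make the failure sequence easier to spot in a longer syslog, here is a minimal Python sketch that lists the daemons that logged an IMM implementer-set failure with error 14. It is a triage aid only, not part of OpenSAF: the regular expressions and the names `LINE_RE` and `implementer_failures` are hypothetical, and it assumes each log entry has been joined back onto a single line. The sample lines are copied from the snippet above.

```python
import re

# Assumed layout of the syslog lines quoted in this ticket:
# "<Mon> <day> <time> <host> <proc>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+"
    r"(?P<proc>\w+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)

# SA_AIS_ERR_EXIST has the value 14 in the SaAisErrorT enumeration.
SA_AIS_ERR_EXIST = 14


def implementer_failures(lines):
    """Return the process names that logged an implementer-set failure with error 14."""
    failed = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a syslog line in the assumed layout
        msg = m.group("msg")
        # The daemons word the error differently ("failed: 14", "err = 14"),
        # so look for a trailing numeric code after either marker.
        code = re.search(r"(?:failed:|err =)\s*(\d+)\s*$", msg)
        # "impl" also matches the misspelled "immOiImplmenterSet" in the cpd message.
        if code and int(code.group(1)) == SA_AIS_ERR_EXIST and "impl" in msg.lower():
            failed.append(m.group("proc"))
    return failed


SAMPLE = [
    "Mar 11 12:53:30 q50-s2 osafmsgd[6447]: ER mqd_imm_declare_implementer failed: err = 14",
    "Mar 11 12:53:30 q50-s2 osaflogd[6275]: ER saImmOiClassImplementerSet (safLogService) failed: 14",
    "Mar 11 12:53:30 q50-s2 osafckptd[6367]: ER cpd immOiImplmenterSet failed with err = 14",
    "Mar 11 12:53:30 q50-s2 osafamfnd[6354]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'",
]

print(implementer_failures(SAMPLE))  # ['osafmsgd', 'osaflogd', 'osafckptd']
```

All three failures carry the same timestamp as the ACTIVE assignment, which is consistent with the services trying to become implementers before the old active's node down has been processed.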


