- **status**: assigned --> not-reproducible
- **Comment**:
I was not able to reproduce the problem with changeset 6775.
TCP transport was configured with IPv6 link-local addresses.
Steps followed to reproduce the problem:
SC-1:# Download AppConfig-2N-68.xml to the /tmp/ directory on SC-1 and issue the following commands
SC-1:# immcfg -f /tmp/AppConfig-2N-68.xml
SC-1:# amf-adm unlock-in safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
SC-1:# immlist safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
Name                         Type         Value(s)
========================================================================
safSu                        SA_STRING_T  safSu=SU1
saAmfSUType                  SA_NAME_T    safVersion=1,safSuType=AmfDemo1 (31)
saAmfSURestartCount          SA_UINT32_T  0 (0x0)
saAmfSUReadinessState        SA_UINT32_T  1 (0x1)
saAmfSURank                  SA_UINT32_T  1 (0x1)
saAmfSUPresenceState         SA_UINT32_T  3 (0x3)
saAmfSUPreInstantiable       SA_UINT32_T  1 (0x1)
saAmfSUOperState             SA_UINT32_T  1 (0x1)
saAmfSUNumCurrStandbySIs     SA_UINT32_T  0 (0x0)
saAmfSUNumCurrActiveSIs      SA_UINT32_T  0 (0x0)
saAmfSUMaintenanceCampaign   SA_NAME_T    <Empty>
saAmfSUHostedByNode          SA_NAME_T    safAmfNode=SC-1,safAmfCluster=myAmfCluster (42)
saAmfSUHostNodeOrNodeGroup   SA_NAME_T    safAmfNode=SC-1,safAmfCluster=myAmfCluster (42)
saAmfSUFailover              SA_UINT32_T  <Empty>
saAmfSUAssignedSIs           SA_NAME_T    <Empty>
saAmfSUAdminState            SA_UINT32_T  2 (0x2)
SaImmAttrImplementerName     SA_STRING_T  safAmfService
SaImmAttrClassName           SA_STRING_T  SaAmfSU
SaImmAttrAdminOwnerName      SA_STRING_T  <Empty>
SC-1:# ps -ef | grep opt
root 5658 1 0 10:10 ? 00:00:00 /opt/amf_demo/amf_demo
root 5666 1 0 10:10 ? 00:00:00 /opt/amf_demo/amf_demo
root 5674 1 0 10:10 ? 00:00:00 /opt/amf_demo/amf_demo
root 5743 25800 0 10:11 pts/0 00:00:00 grep opt
SC-1:# immfind safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
safComp=AmfDemo1,safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
safComp=AmfDemo2,safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
safComp=AmfDemo3,safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
safSupportedCsType=safVersion=1\,safCSType=AmfDemo1,safComp=AmfDemo1,safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
safSupportedCsType=safVersion=1\,safCSType=AmfDemo1,safComp=AmfDemo2,safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
safSupportedCsType=safVersion=1\,safCSType=AmfDemo1,safComp=AmfDemo3,safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
SC-1:# pkill amf_demo
SC-1:# pkill amf_demo
SC-1:# pkill amf_demo
SC-1:# pkill amf_demo
SC-1:# pkill amf_demo
SC-1:# pkill amf_demo
SC-1:# pkill amf_demo
SC-1:# ..........................
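
The numeric SU states in the `immlist` output above are easier to check when decoded into names. Below is a minimal sketch that does this for the four state attributes. The enum values are the ones I recall from the SA Forum AMF header `saAmf.h` and should be verified against the headers in your build; the function name `decode_su_states` is hypothetical and not part of any OpenSAF tool.

```python
import re

# Enum values assumed from the SA Forum AMF header (saAmf.h); verify before relying on them.
STATE_NAMES = {
    "saAmfSUAdminState": {
        1: "UNLOCKED", 2: "LOCKED", 3: "LOCKED_INSTANTIATION", 4: "SHUTTING_DOWN",
    },
    "saAmfSUOperState": {1: "ENABLED", 2: "DISABLED"},
    "saAmfSUReadinessState": {1: "OUT_OF_SERVICE", 2: "IN_SERVICE", 3: "STOPPING"},
    "saAmfSUPresenceState": {
        1: "UNINSTANTIATED", 2: "INSTANTIATING", 3: "INSTANTIATED", 4: "TERMINATING",
        5: "RESTARTING", 6: "INSTANTIATION_FAILED", 7: "TERMINATION_FAILED",
    },
}

# Matches immlist rows such as: "saAmfSUPresenceState SA_UINT32_T 3 (0x3)"
LINE_RE = re.compile(r"^(\S+)\s+SA_UINT32_T\s+(\d+)\s+\(0x[0-9a-fA-F]+\)$")


def decode_su_states(immlist_output):
    """Return {attribute: state name} for the known SU state attributes."""
    decoded = {}
    for line in immlist_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(1) in STATE_NAMES:
            attr, value = match.group(1), int(match.group(2))
            decoded[attr] = STATE_NAMES[attr].get(value, "UNKNOWN(%d)" % value)
    return decoded


# State rows copied from the immlist output in this comment.
SAMPLE = """\
saAmfSUReadinessState SA_UINT32_T 1 (0x1)
saAmfSUPresenceState SA_UINT32_T 3 (0x3)
saAmfSUOperState SA_UINT32_T 1 (0x1)
saAmfSUAdminState SA_UINT32_T 2 (0x2)
"""

if __name__ == "__main__":
    for attr, name in decode_su_states(SAMPLE).items():
        print(attr, "=", name)
```

If those enum values are right, the SU was LOCKED, ENABLED, INSTANTIATED and OUT_OF_SERVICE after `unlock-in`, which matches the three running `amf_demo` processes shown by `ps`.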
---
**[tickets:#68] failover did not succeed and cluster got reset due to MDS problems.**
**Status:** not-reproducible
**Milestone:** 4.5.2
**Created:** Sat May 11, 2013 05:22 PM UTC by surender khetavath
**Last Updated:** Mon Sep 07, 2015 10:30 AM UTC
**Owner:** A V Mahesh (AVM)
**Attachments:**
- [logs.tgz](https://sourceforge.net/p/opensaf/tickets/68/attachment/logs.tgz) (16.2 MB; application/x-compressed-tar)
- [AppConfig-2N-68.xml](https://sourceforge.net/p/opensaf/tickets/68/attachment/AppConfig-2N-68.xml) (23.1 kB; text/xml)
Changeset: 4241 with the 2794 and 3117 patches
Model: TwoN
Configuration: 1 App, 1 SG, 4 SUs with 3 comps each, and 5 SIs with 3 CSIs each
Transport: TCP/ipv6-linklocal
PBE enabled.
Scenario:
sc-1 was active and sc-2 was standby.
The active SU on sc-1 was shut down and a component was made to reject the
quiescing assignment. The component was restarted 10 times (compRestartMax=10),
after which recovery escalated to suFailover and then to node failover.
sc-2 did not become active and eventually rebooted, causing a cluster reset.
syslog on sc-1:
--------------
May 11 21:24:49 sc-1 osafimmnd[4683]: WA Error code 2 returned for message type 21 - ignoring
May 11 21:24:49 sc-1 osafamfnd[4790]: NO Received reboot order, ordering reboot now!
May 11 21:24:49 sc-1 osafamfnd[4790]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: Received reboot order
May 11 21:24:49 sc-1 opensaf_reboot: Rebooting local node
May 11 21:24:49 sc-1 osafimmnd[4683]: WA MESSAGE:5319 OUT OF ORDER my highest processed:5317, exiting
May 11 21:24:49 sc-1 osafimmpbed: WA PBE lost contact with parent IMMND - Exiting
May 11 21:24:49 sc-1 osafntfimcnd[4734]: ER saImmOiDispatch() Fail SA_AIS_ERR_BAD_HANDLE (9)
May 11 21:24:49 sc-1 osafimmd[4668]: WA IMMND coordinator at 2010f apparently crashed => electing new coord
May 11 21:24:49 sc-1 osafimmd[4668]: ER Failed to find candidate for new IMMND coordinator
May 11 21:24:49 sc-1 osafimmd[4668]: ER Active IMMD has to restart the IMMSv. All IMMNDs will restart
May 11 21:24:49 sc-1 osafimmd[4668]: ER IMM RELOAD => ensure cluster restart by IMMD exit at both SCs, exiting
syslog on sc-2:
----------------
May 11 21:24:49 sc-2 osafimmd[3894]: WA IMMD not re-electing coord for switch-over (si-swap) coord at (2010f)
May 11 21:24:49 sc-2 osafntfimcnd[3969]: NO exiting on signal 15
May 11 21:24:49 sc-2 osafsmfd[4052]: ER amf_active_state_handler oi activate FAILED
May 11 21:24:49 sc-2 osafamfnd[4023]: NO 'safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'csiSetcallbackFailed' : Recovery is 'nodeFailfast'
May 11 21:24:49 sc-2 osafamfnd[4023]: ER safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:csiSetcallbackFailed Recovery is:nodeFailfast
May 11 21:24:49 sc-2 osafamfnd[4023]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Component faulted: recovery is node failfast
May 11 21:24:49 sc-2 osafmsgd[4216]: ER mqd_imm_declare_implementer failed: err = 14
May 11 21:24:49 sc-2 osafckptd[4202]: ER cpd immOiImplmenterSet failed with err = 14
May 11 21:24:49 sc-2 opensaf_reboot: Rebooting local node
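
The IMMD lines name the IMMND coordinator by a hexadecimal ID (2010f), while the osafamfnd reboot lines print NodeId in decimal. Assuming both refer to the same node identifier in different bases, a small conversion shows which controller hosted the coordinator. This is only a sketch for reading the logs; `node_ids_hex` is a hypothetical helper, not an OpenSAF utility.

```python
import re

# Matches the osafamfnd reboot lines and captures hostname and decimal NodeId.
REBOOT_RE = re.compile(r"(\S+) osafamfnd\[\d+\]: Rebooting OpenSAF NodeId = (\d+)")


def node_ids_hex(syslog_text):
    """Map hostname -> node ID rendered in lowercase hex (no 0x prefix)."""
    return {
        host: format(int(node_id), "x")
        for host, node_id in REBOOT_RE.findall(syslog_text)
    }


# Reboot lines copied from the sc-1 and sc-2 syslogs above.
SAMPLE = (
    "May 11 21:24:49 sc-1 osafamfnd[4790]: Rebooting OpenSAF NodeId = 131343 "
    "EE Name = , Reason: Received reboot order\n"
    "May 11 21:24:49 sc-2 osafamfnd[4023]: Rebooting OpenSAF NodeId = 131599 "
    "EE Name = , Reason: Component faulted: recovery is node failfast\n"
)

if __name__ == "__main__":
    print(node_ids_hex(SAMPLE))  # {'sc-1': '2010f', 'sc-2': '2020f'}
```

Under that assumption, the coordinator at 2010f was the IMMND on sc-1 (131343 = 0x2010f), the node that had just been ordered to reboot, while sc-2 is 0x2020f.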