- **status**: unassigned --> duplicate
- **assigned_to**: Minh Hon Chau
- **Milestone**: 5.2.RC1 --> future
- **Comment**:
It's a bit hard to debug because the timestamps are not synced between the SC and the PL:
~~~
Oct 5 12:34:23 SYSTEST-PLD-1 osafamfnd[2626]: NO Assigned
'safSi=TestApp_SI3,safApp=TestApp_TwoN' ACTIVE to
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 5 12:34:23 SYSTEST-PLD-1 osafamfnd[2626]: NO Assigned
'safSi=TestApp_SI4,safApp=TestApp_TwoN' ACTIVE to
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 5 12:34:23 SYSTEST-PLD-1 osafamfnd[2626]: NO Assigned
'safSi=TestApp_SI1,safApp=TestApp_TwoN' ACTIVE to
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
~~~
~~~
Oct 5 12:34:05.923981 osafamfd [2188:sgproc.cc:1056] >> avd_su_si_assign_evh:
id:126, node:2030f, act:2,
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN',
'safSi=TestApp_SI3,safApp=TestApp_TwoN', ha:1, err:1, single:0
Oct 5 12:34:05.936734 osafamfd [2188:sgproc.cc:1056] >> avd_su_si_assign_evh:
id:127, node:2030f, act:2,
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN',
'safSi=TestApp_SI4,safApp=TestApp_TwoN', ha:1, err:1, single:0
Oct 5 12:34:05.944304 osafamfd [2188:sgproc.cc:1056] >> avd_su_si_assign_evh:
id:128, node:2030f, act:2,
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN',
'safSi=TestApp_SI1,safApp=TestApp_TwoN', ha:1, err:1, single:0
~~~
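As a rough aid for lining up the two logs, the apparent clock offset can be estimated from one pair of lines. This is only a sketch: it assumes that the amfnd syslog entry for SI3 and the amfd `avd_su_si_assign_evh` trace for SI3 belong to the same assignment, it takes the year from the ticket date because neither log format carries one, and `parse_log_time` is a hypothetical helper, not part of OpenSAF. Syslog has one-second resolution and the message delay between the nodes is ignored, so the result is approximate.

```python
from datetime import datetime


def parse_log_time(stamp: str, year: int = 2016) -> datetime:
    """Parse 'Oct 5 12:34:23' or 'Oct 5 12:34:05.923981'.

    Hypothetical helper; the year is an assumption because the logs omit it.
    """
    fmt = "%Y %b %d %H:%M:%S.%f" if "." in stamp else "%Y %b %d %H:%M:%S"
    return datetime.strptime(f"{year} {stamp}", fmt)


# PL syslog: amfnd reports SI3 assigned ACTIVE to SU1.
pl_syslog = parse_log_time("Oct 5 12:34:23")
# SC trace: amfd enters avd_su_si_assign_evh for SI3 (id:126).
sc_trace = parse_log_time("Oct 5 12:34:05.923981")

# Positive value means the PL clock reads later than the SC clock.
skew = (pl_syslog - sc_trace).total_seconds()
print(f"apparent PL - SC skew: {skew:.3f} s")
```

Under those assumptions the PL clock reads roughly 17 seconds ahead of the SC clock, which is worth keeping in mind when reading the PL timestamps below against the SC traces.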
COMP2 has responded, but that happened just before the SC went down:
~~~
Oct 5 12:34:33 SYSTEST-PLD-1 osafamfnd[2626]: NO Restarting a component of
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' (comp restart count:
1)
Oct 5 12:34:33 SYSTEST-PLD-1 osafamfnd[2626]: NO
'safComp=COMP2,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' faulted
due to 'csiSetcallbackTimeout' : Recovery is 'componentRestart'
Oct 5 12:34:33 SYSTEST-PLD-1 osafamfnd[2626]: NO Assigned
'safSi=TestApp_SI2,safApp=TestApp_TwoN' ACTIVE to
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct 5 12:34:33 SYSTEST-PLD-1 osafamfnd[2626]: WA AMF director unexpectedly
crashed
~~~
The problem appears to be similar to #2105.
Ticket #2105 is reported in a general way for all AMF entities, so I am marking
#2096 as a duplicate of #2105.
---
** [tickets:#2096] AMF : SG in unstable state for fault in component during
admin unlock (headless)**
**Status:** duplicate
**Milestone:** future
**Created:** Wed Oct 05, 2016 08:08 AM UTC by Srikanth R
**Last Updated:** Wed Mar 01, 2017 04:24 AM UTC
**Owner:** Minh Hon Chau
**Attachments:**
- [2096.tgz](https://sourceforge.net/p/opensaf/tickets/2096/attachment/2096.tgz) (4.6 MB; application/x-compressed-tar)
Environment :
-----------------
Changeset: 7997 5.1.FC
Setup : 5-node setup with 2 controllers, headless feature enabled and PBE
disabled.
Application : 2N application with 2 SUs and 4 SIs without SI-SI dependencies.
Steps performed :
----------------------
The SG moved to an unstable state after a component fault when an admin unlock
operation was performed on the SG and the headless state was invoked. Below are
the steps performed.
-> The application is brought up initially and the SIs are fully assigned.
-> Lock, lock-in, unlock-in and unlock operations are then performed on the SG
with a sufficient time gap between them.
-> During the unlock operation on the SG, component 2 of SU1 did not respond to
the active assignment, and the headless scenario is invoked.
3148 12:34:05 10/05/2016 NO safApp=safAmfService "Admin op "UNLOCK"
initiated for 'safSg=TestApp_SG1,safApp=TestApp_TwoN', invocation:
1683627180042"
3149 12:34:05 10/05/2016 NO safApp=safAmfService
"safSg=TestApp_SG1,safApp=TestApp_TwoN AdmState LOCKED => UNLOCKED"
-> After the headless state is reached, component 2 faulted with a CSI set
callback timeout.
Oct 5 12:34:33 SYSTEST-PLD-1 osafamfnd[2626]: NO
'safComp=COMP2,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' faulted
due to 'csiSetcallbackTimeout' : Recovery is 'componentRestart'
-> After the controllers rejoined the cluster, SU2 did not get any assignments.
--> Further operations on the SG failed because the SG was in an UNSTABLE state.
3202 12:40:59 10/05/2016 NO safApp=safAmfService "Admin op "LOCK"
initiated for 'safSg=TestApp_SG1,safApp=TestApp_TwoN', invocation:
1696512081921"
3203 12:40:59 10/05/2016 NO safApp=safAmfService "Admin op invocation:
1696512081921, err: 'SG not in STABLE state
(safSg=TestApp_SG1,safApp=TestApp_TwoN)'"
3204 12:40:59 10/05/2016 NO safApp=safAmfService "Admin op done for
invocation: 1696512081921, result 6"
Logs :
The traces of SC-1 (active controller before and after headless) and PL-3
(hosting SU1) are attached.