- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

**[tickets:#722] payloads did not go for reboot when both the controllers rebooted**

**Status:** assigned
**Milestone:** future
**Created:** Thu Jan 16, 2014 07:36 AM UTC by Sirisha Alla
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody
**Attachments:**

- [payloadnoreboot.tar.bz2](https://sourceforge.net/p/opensaf/tickets/722/attachment/payloadnoreboot.tar.bz2) (765.1 kB; application/x-bzip)


The issue is seen on changeset 4733 plus the CLM patches corresponding to the
changesets of #220. Continuous failovers were being triggered while API
invocations from an IMM application were in progress. The IMMD asserted on the
new active controller, which is reported in ticket #721.

When both controllers rebooted, the payloads did not reboot. Instead, the
OpenSAF services on them stayed up and running. CLM shows that neither payload
is part of the cluster. When the payloads were restarted manually, they joined
the cluster.

PL-3 syslog:

Jan 15 18:23:09 SLES-64BIT-SLOT3 osafimmnd[3550]: NO implementer for class 'testMA_verifyObjApplNoResponseModCallback_101' is released => class extent is UNSAFE
Jan 15 18:23:59 SLES-64BIT-SLOT3 logger: Invoking failover from invoke_failover.sh
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA DISCARD DUPLICATE FEVS message:92993
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA Error code 2 returned for message type 57 - ignoring
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA DISCARD DUPLICATE FEVS message:92994
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA Error code 2 returned for message type 57 - ignoring
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: WA Director Service in NOACTIVE state - fevs replies pending:1 fevs highest processed:92994
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: NO No IMMD service => cluster restart
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafamfnd[3572]: NO 'safComp=IMMND,safSu=PL-3,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'componentRestart'
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[6827]: Started
Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[6827]: NO Persistent Back-End capability configured, Pbe file:imm.db (suffix may get added)
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.176901] TIPC: Resetting link <1.1.3:eth0-1.1.2:eth0>, peer not responding
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.176911] TIPC: Lost link <1.1.3:eth0-1.1.2:eth0> on network plane A
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.176918] TIPC: Lost contact with <1.1.2>
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.256091] TIPC: Resetting link <1.1.3:eth0-1.1.1:eth0>, peer not responding
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.256100] TIPC: Lost link <1.1.3:eth0-1.1.1:eth0> on network plane A
Jan 15 18:24:07 SLES-64BIT-SLOT3 kernel: [ 6343.256106] TIPC: Lost contact with <1.1.1>
Jan 15 18:24:25 SLES-64BIT-SLOT3 kernel: [ 6361.425537] TIPC: Established link <1.1.3:eth0-1.1.2:eth0> on network plane A
Jan 15 18:24:27 SLES-64BIT-SLOT3 osafimmnd[6827]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Jan 15 18:24:27 SLES-64BIT-SLOT3 osafimmnd[6827]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Jan 15 18:24:27 SLES-64BIT-SLOT3 osafimmnd[6827]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_LOADING_CLIENT
Jan 15 18:24:29 SLES-64BIT-SLOT3 osafimmnd[6827]: NO ERR_BAD_HANDLE: Admin owner 1 does not exist
Jan 15 18:24:36 SLES-64BIT-SLOT3 kernel: [ 6372.473240] TIPC: Established link <1.1.3:eth0-1.1.1:eth0> on network plane A
Jan 15 18:24:39 SLES-64BIT-SLOT3 osafimmnd[6827]: NO ERR_BAD_HANDLE: Admin owner 2 does not exist
Jan 15 18:24:39 SLES-64BIT-SLOT3 osafimmnd[6827]: NO NODE STATE-> IMM_NODE_LOADING
Jan 15 18:24:45 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM is:5000
Jan 15 18:24:46 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM is:6000
Jan 15 18:24:47 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM is:7000
Jan 15 18:24:48 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM is:8000
Jan 15 18:24:49 SLES-64BIT-SLOT3 osafimmnd[6827]: WA Number of objects in IMM is:9000
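
For anyone triaging similar logs, the sequence in the PL-3 syslog (IMMND logs "No IMMD service => cluster restart", then a new osafimmnd PID logs "Started" without a node reboot) can be spotted mechanically. The following is only a rough sketch, not part of OpenSAF: the regular expression assumes the plain syslog layout seen above with each entry on a single line, and the function names are made up for illustration.

```python
import re

# Assumed layout (taken from the log above): "Mon DD HH:MM:SS host proc[pid]: message"
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<proc>[^\[\s:]+)(?:\[(?P<pid>\d+)\])?:\s(?P<msg>.*)$"
)


def parse(lines):
    """Yield a dict per line that matches the assumed syslog layout."""
    for line in lines:
        m = SYSLOG_RE.match(line.strip())
        if m:
            yield m.groupdict()


def immnd_restarted_after_immd_loss(lines):
    """Hypothetical helper: True if osafimmnd reported loss of IMMD and a
    different osafimmnd PID later logged 'Started' (component restart)."""
    lost_pid = None
    for rec in parse(lines):
        if rec["proc"] != "osafimmnd":
            continue
        if "No IMMD service => cluster restart" in rec["msg"]:
            lost_pid = rec["pid"]
        elif lost_pid is not None and rec["msg"] == "Started" and rec["pid"] != lost_pid:
            return True
    return False


sample = [
    "Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[3550]: NO No IMMD service => cluster restart",
    "Jan 15 18:24:01 SLES-64BIT-SLOT3 osafamfnd[3572]: NO 'safComp=IMMND,safSu=PL-3,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'componentRestart'",
    "Jan 15 18:24:01 SLES-64BIT-SLOT3 osafimmnd[6827]: Started",
]
print(immnd_restarted_after_immd_loss(sample))
```

The check only looks at osafimmnd entries; it says nothing about why the node was not rebooted, which is what the attached AMF traces are for.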

After both controllers came back up, the status is as follows:

SLES-64BIT-SLOT1:~ # immlist safNode=PL-3,safCluster=myClmCluster
Name                                               Type         Value(s)
========================================================================
safNode                                            SA_STRING_T  safNode=PL-3
saClmNodeLockCallbackTimeout                       SA_TIME_T    50000000000 (0xba43b7400, Thu Jan  1 05:30:50 1970)
saClmNodeIsMember                                  SA_UINT32_T  <Empty>
saClmNodeInitialViewNumber                         SA_UINT64_T  <Empty>
saClmNodeID                                        SA_UINT32_T  <Empty>
saClmNodeEE                                        SA_NAME_T    <Empty>
saClmNodeDisableReboot                             SA_UINT32_T  0 (0x0)
saClmNodeCurrAddressFamily                         SA_UINT32_T  <Empty>
saClmNodeCurrAddress                               SA_STRING_T  <Empty>
saClmNodeBootTimeStamp                             SA_TIME_T    <Empty>
saClmNodeAdminState                                SA_UINT32_T  1 (0x1)
saClmNodeAddressFamily                             SA_UINT32_T  <Empty>
saClmNodeAddress                                   SA_STRING_T  <Empty>
SaImmAttrImplementerName                           SA_STRING_T  safClmService
SaImmAttrClassName                                 SA_STRING_T  SaClmNode
SaImmAttrAdminOwnerName                            SA_STRING_T  IMMLOADER

SLES-64BIT-SLOT1:~ # immlist safAmfNode=PL-3,safAmfCluster=myAmfCluster
Name                                               Type         Value(s)
========================================================================
safAmfNode                                         SA_STRING_T  safAmfNode=PL-3
saAmfNodeSuFailoverMax                             SA_UINT32_T  2 (0x2)
saAmfNodeSuFailOverProb                            SA_TIME_T    1200000000000 (0x1176592e000, Thu Jan  1 05:50:00 1970)
saAmfNodeOperState                                 SA_UINT32_T  2 (0x2)
saAmfNodeFailfastOnTerminationFailure              SA_UINT32_T  0 (0x0)
saAmfNodeFailfastOnInstantiationFailure            SA_UINT32_T  0 (0x0)
saAmfNodeClmNode                                   SA_NAME_T    safNode=PL-3,safCluster=myClmCluster (36)
saAmfNodeCapacity                                  SA_STRING_T  <Empty>
saAmfNodeAutoRepair                                SA_UINT32_T  1 (0x1)
saAmfNodeAdminState                                SA_UINT32_T  1 (0x1)
SaImmAttrImplementerName                           SA_STRING_T  safAmfService
SaImmAttrClassName                                 SA_STRING_T  SaAmfNode
SaImmAttrAdminOwnerName                            SA_STRING_T  IMMLOADER
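
The `immlist` output for `safNode=PL-3,safCluster=myClmCluster` above shows the CLM runtime attributes as `<Empty>`, which matches the observation that the payload is not part of the cluster. A small sketch of how that check could be scripted is shown below. It is illustrative only: it assumes the three-column `Name / Type / Value(s)` layout printed above, and treating non-empty `saClmNodeIsMember` and `saClmNodeID` as the sign of membership is an assumption of this sketch, not a documented rule.

```python
def parse_immlist(text):
    """Map attribute name -> value string, or None where immlist printed <Empty>.

    Assumes rows of the form "NAME  SA_<TYPE>_T  VALUE" as in the output above.
    """
    attrs = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Skip the header, the separator and wrapped continuation lines.
        if len(parts) < 2 or not (parts[1].startswith("SA_") and parts[1].endswith("_T")):
            continue
        value = parts[2].strip() if len(parts) == 3 else ""
        attrs[parts[0]] = None if value == "<Empty>" else value
    return attrs


def looks_like_clm_member(attrs):
    """Heuristic (assumption): a member node has both attributes populated."""
    return (attrs.get("saClmNodeIsMember") is not None
            and attrs.get("saClmNodeID") is not None)


sample = """\
Name                                               Type         Value(s)
========================================================================
safNode                                            SA_STRING_T  safNode=PL-3
saClmNodeIsMember                                  SA_UINT32_T  <Empty>
saClmNodeID                                        SA_UINT32_T  <Empty>
saClmNodeAdminState                                SA_UINT32_T  1 (0x1)
"""

attrs = parse_immlist(sample)
print(looks_like_clm_member(attrs))
```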

SLES-64BIT-SLOT3:/opt/goahead/tetware/opensaffire # /etc/init.d/opensafd status
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
SLES-64BIT-SLOT3:/opt/goahead/tetware/opensaffire # cat /etc/opensaf/node_name
PL-3
SLES-64BIT-SLOT3:/opt/goahead/tetware/opensaffire # ps -ef | grep saf
root      3538     1  0 17:16 ?        00:00:00 /bin/sh /usr/lib64/opensaf/clc-cli/osaf-transport-monitor
root      3563     1  0 17:16 ?        00:00:00 /usr/lib64/opensaf/osafclmna --tracemask=0xffffffff
root      3572     1  0 17:16 ?        00:00:00 /usr/lib64/opensaf/osafamfnd --tracemask=0xffffffff
root      3582     1  0 17:16 ?        00:00:00 /usr/lib64/opensaf/osafsmfnd
root      3591     1  0 17:16 ?        00:00:00 /usr/lib64/opensaf/osafmsgnd
root      3608     1  0 17:16 ?        00:00:00 /usr/lib64/opensaf/osaflcknd
root      3617     1  0 17:16 ?        00:00:00 /usr/lib64/opensaf/osafckptnd
root      3626     1  0 17:16 ?        00:00:00 /usr/lib64/opensaf/osafamfwd
root      6827     1  1 18:24 ?        00:00:13 /usr/lib64/opensaf/osafimmnd --tracemask=0xffffffff
root      7490  3073  0 18:42 pts/0    00:00:00 grep saf


The same is seen on PL-4. The AMF traces are attached.


