Hi AnderBj,
I have tested in the same setup, that has reported #724 with the attached patch
(The patch removes the checking of adminowner each time in same CCB)
Since this is CCB abort, and each CCB is only associated with one admin owner,
then having multiple admin owner scenario may not be considered in this case.
The rejection(Aborting ) the CCB with 200k taken only 2 seconds.
Jan 30 16:38:21 SLES-SLOT4 osafimmpbed: WA Re-inserting Ccb record 2 in
ccb-completed upcall => PBE recovery,
Jan 30 16:38:21 SLES-SLOT4 osafimmpbed: WA Failed to find CCB object for 2/2 -
checking DB file for outcome
Jan 30 16:38:21 SLES-SLOT4 osafimmpbed: NO getCcbOutcomeFromPbe: Could not find
ccb 2 presume ABORT
Jan 30 16:38:21 SLES-SLOT4 osafimmnd[11877]: NO Ccb 2 ABORTED (<released>)
Jan 30 16:38:21 SLES-SLOT4 osafimmpbed: WA Failed to find CCB object for 2/2
Jan 30 16:38:21 SLES-SLOT4 osafimmnd[11877]: WA Admin owner id 8 does not exist
Jan 30 16:38:23 SLES-SLOT4 osafimmnd[11877]: NO Announce sync, epoch:8
The latest logs and patch are attached in latest_logs.tgz
/Neel.
---
** [tickets:#724] imm: sync with payload node resulted in controller reboots**
**Status:** accepted
**Created:** Thu Jan 16, 2014 11:22 AM UTC by surender khetavath
**Last Updated:** Thu Jan 30, 2014 10:58 AM UTC
**Owner:** Anders Bjornerstedt
changeset: 4733
setup: 2 controllers
Test:
Brought up 2 controllers and addes 2Lakh objects(200000). Now started pl-3
Opensaf start on pl3 was not successful. After some time both controllers
rebooted but pl-3 didnot go for reboot though the controllers are not
available.
syslog on sc-1:
Jan 16 16:26:43 SLES-SLOT4 osafamfnd[4536]: NO
'safComp=IMMND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF' faulted due to
'healthCheckcallbackTimeout' : Recovery is 'componentRestart'
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4464]: WA Admin owner id 8 does not exist
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4464]: WA Admin owner id 8 does not exist
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4464]: WA Admin owner id 8 does not exist
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4464]: WA Admin owner id 8 does not exist
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4464]: WA Admin owner id 8 does not exist
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4464]: WA Admin owner id 8 does not exist
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4464]: WA Admin owner id 8 does not exist
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4464]: WA Admin owner id 8 does not exist
Jan 16 16:26:43 SLES-SLOT4 osafimmpbed: NO PBE received SIG_TERM, closing db
handle
Jan 16 16:26:43 SLES-SLOT4 osafimmd[4454]: WA IMMND coordinator at 2010f
apparently crashed => electing new coord
Jan 16 16:26:43 SLES-SLOT4 osafntfimcnd[4496]: ER saImmOiDispatch() Fail
SA_AIS_ERR_BAD_HANDLE (9)
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4980]: Started
Jan 16 16:26:43 SLES-SLOT4 osafimmpbed: WA PBE lost contact with parent IMMND -
Exiting
Jan 16 16:26:43 SLES-SLOT4 osafimmd[4454]: NO New coord elected, resides at
2020f
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4980]: NO Persistent Back-End capability
configured, Pbe file:imm.db (suffix may get added)
Jan 16 16:26:43 SLES-SLOT4 osafimmnd[4980]: NO SERVER STATE:
IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Jan 16 16:26:44 SLES-SLOT4 osafimmd[4454]: NO New IMMND process is on ACTIVE
Controller at 2010f
Jan 16 16:26:44 SLES-SLOT4 osafimmd[4454]: NO Extended intro from node 2010f
Jan 16 16:26:44 SLES-SLOT4 osafimmnd[4980]: NO Fevs count adjusted to 201407
preLoadPid: 0
Jan 16 16:26:44 SLES-SLOT4 osafimmnd[4980]: NO SERVER STATE:
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Jan 16 16:26:44 SLES-SLOT4 osafimmnd[4980]: NO SERVER STATE:
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Jan 16 16:26:44 SLES-SLOT4 osafimmd[4454]: WA IMMND on controller (not
currently coord) requests sync
Jan 16 16:26:44 SLES-SLOT4 osafimmd[4454]: NO Node 2010f request sync
sync-pid:4980 epoch:0
Jan 16 16:26:44 SLES-SLOT4 osafimmnd[4980]: NO NODE STATE-> IMM_NODE_ISOLATED
Jan 16 16:26:53 SLES-SLOT4 osafamfd[4526]: NO Re-initializing with IMM
Jan 16 16:27:08 SLES-SLOT4 osafimmd[4454]: WA IMMND coordinator at 2020f
apparently crashed => electing new coord
Jan 16 16:27:08 SLES-SLOT4 osafimmd[4454]: ER Failed to find candidate for new
IMMND coordinator
Jan 16 16:27:08 SLES-SLOT4 osafimmd[4454]: ER Active IMMD has to restart the
IMMSv. All IMMNDs will restart
Jan 16 16:27:09 SLES-SLOT4 osafimmd[4454]: ER IMM RELOAD => ensure cluster
restart by IMMD exit at both SCs, exiting
Jan 16 16:27:09 SLES-SLOT4 osafimmnd[4980]: ER IMMND forced to restart on order
from IMMD, exiting
Jan 16 16:27:09 SLES-SLOT4 osafamfnd[4536]: NO
'safComp=IMMND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown'
: Recovery is 'componentRestart'
Jan 16 16:27:09 SLES-SLOT4 osafamfnd[4536]: NO
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' :
Recovery is 'nodeFailfast'
Jan 16 16:27:09 SLES-SLOT4 osafamfnd[4536]: ER
safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery
is:nodeFailfast
Jan 16 16:27:09 SLES-SLOT4 osafamfnd[4536]: Rebooting OpenSAF NodeId = 131343
EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId =
131343, SupervisionTime = 60
Jan 16 16:27:09 SLES-SLOT4 opensaf_reboot: Rebooting local node; timeout=60
Jan 16 16:27:11 SLES-SLOT4 kernel: [ 1435.892099] md: stopping all md devices.
Read from remote host 172.1.1.4: Connection reset by peer
console output on pl-3
ps -ef| grep saf
root 16523 1 0 16:22 ? 00:00:00 /bin/sh
/usr/lib64/opensaf/clc-cli/osaf-transport-monitor
root 16629 1 0 16:27 ? 00:00:00 /usr/lib64/opensaf/osafimmnd
--tracemask=0xffffffff
root 16860 10365 0 16:39 pts/0 00:00:00 grep saf
---
Sent from sourceforge.net because [email protected] is
subscribed to https://sourceforge.net/p/opensaf/tickets/
To unsubscribe from further messages, a project admin can change settings at
https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a
mailing list, you can unsubscribe from the mailing list.
------------------------------------------------------------------------------
WatchGuard Dimension instantly turns raw network data into actionable
security intelligence. It gives you real-time visual feedback on key
security issues and trends. Skip the complicated setup - simply import
a virtual appliance and go from zero to informed in seconds.
http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk
_______________________________________________
Opensaf-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets