Hello. I currently have an issue with a cluster. It's running RHEL 6 (I believe it's 6.7), with the normal Ricci, Luci, etc.
The cluster has been working fine for over a year, until it suddenly didn't work fine anymore.

Basically, it has three Oracle databases running, from the same Oracle home, but with different data directories. Both the Oracle software and datafiles are on shared storage (SAN), using clvm (so it's an active/passive setup, only one of the two nodes can have the disk, filesystem etc.). There are three databases, and two of them will run fine on either node. The problem is the third instance, which will only run on one of the nodes.

The short story is: I get told "ORA-01081: cannot start already-running ORACLE - shut it down first" on the non-working database. I'm kind of hoping someone will have seen this or something similar before and be able to give me a friendly nudge in the right direction :)

Here's the extract from cluster.conf:

From resources:

    <oralistener home="/u01/app/oracle/product/11.2" name="Listener" user="oracle"/>
    <orainstance home="/u01/app/oracle/product/11.2" name="EDB" user="oracle"/>
    <orainstance home="/u01/app/oracle/product/11.2" name="BB" user="oracle"/>
    <orainstance home="/u01/app/oracle/product/11.2" name="MYDB" user="oracle"/>

From service block:

    <oralistener ref="Listener"/>
    <orainstance ref="EDB"/>
    <orainstance ref="BB"/>
    <orainstance ref="MYDB"/>

In /var/log/cluster/rgmanager.log I see the following:

    Feb 25 01:06:47 rgmanager [oralistener] Validating configuration for Listener
    Feb 25 01:06:49 rgmanager [orainstance] Validating configuration for EDB
    Feb 25 01:07:07 rgmanager [orainstance] Validating configuration for BB
    Feb 25 01:07:15 rgmanager [orainstance] Validating configuration for MYDB
    Feb 25 01:07:16 rgmanager start on orainstance "MYDB" returned 1 (generic error)
    Feb 25 01:07:16 rgmanager #68: Failed to start service:my-cluster-01-db; return value: 1
    Feb 25 01:07:16 rgmanager Stopping service service:my-cluster-01-db

I can also see Oracle processes running, for the listener and the two other databases, so that part is OK.
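Since ORA-01081 means Oracle believes an instance with that SID is already up on the node, one thing worth checking is whether any process for MYDB is present alongside the other two. The following is only a rough sketch of that check, not a fix. It assumes each running instance shows up in the process list as ora_pmon_<SID> (the usual name of Oracle's process monitor), and the helper names running_sids and current_ps_output are made up for illustration.

```python
import re
import subprocess

# Assumption: a running instance has a background process named ora_pmon_<SID>.
PMON_RE = re.compile(r"^ora_pmon_(\S+)$")


def running_sids(ps_output):
    """Return the sorted SIDs that have an ora_pmon_<SID> line in the given ps output."""
    sids = set()
    for line in ps_output.splitlines():
        match = PMON_RE.match(line.strip())
        if match:
            sids.add(match.group(1))
    return sorted(sids)


def current_ps_output():
    """Return the command line of every process on this node, one per line."""
    proc = subprocess.Popen(
        ["ps", "-eo", "args"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.communicate()[0]


if __name__ == "__main__":
    # Instance names taken from the cluster.conf extract above.
    expected = ["EDB", "BB", "MYDB"]
    found = running_sids(current_ps_output())
    for sid in expected:
        state = "pmon process present" if sid in found else "no pmon process"
        print("%s: %s" % (sid, state))
```

If MYDB is reported as present on the failing node before the cluster tries to start it, that would point at a leftover instance; if it is absent, the cause is presumably elsewhere and this check only rules one thing out.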

I've added extra syslog debugging, and this is what I see in /var/log/messages:

    Feb 25 01:07:15 dbserv02 rgmanager[50083]: [orainstance] Validating configuration for MYDB
    Feb 25 01:07:15 dbserv02 logger[50127]: Validating configuration for MYDB
    Feb 25 01:07:15 dbserv02 logger[50135]: Validation checks for MYDB succeeded
    Feb 25 01:07:15 dbserv02 logger[50136]: Starting service MYDB
    Feb 25 01:07:15 dbserv02 logger[50137]: Starting Oracle DB MYDB
    Feb 25 01:07:16 dbserv02 logger[50167]: [MYDB] [0] sent set heading off;\nstartup;\nquit;\n
    Feb 25 01:07:16 dbserv02 logger[50168]: [MYDB] [0] got ORA-01081: cannot start already-running ORACLE - shut it down first
    Feb 25 01:07:16 dbserv02 logger[50172]: Starting Oracle DB MYDB failed, found errors in stdout
    Feb 25 01:07:16 dbserv02 logger[50173]: Starting service MYDB failed
    Feb 25 01:07:16 dbserv02 rgmanager[7467]: start on orainstance "MYDB" returned 1 (generic error)
    Feb 25 01:07:16 dbserv02 rgmanager[7467]: #68: Failed to start service:my-cluster-01-db; return value: 1
    Feb 25 01:07:16 dbserv02 rgmanager[7467]: Stopping service service:my-cluster-01-db

I know the non-working database instance has previously been running fine on the node we now see the problem with. I guess something must have changed, but I'm currently not sure where I should look.

Oh, one problem we did see: initially none of the databases would run on this node because someone had decided to remove the user "oracle" from the "dba" group.

Regards
Eivind Olsen

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster