Thanks for your suggestion. I checked that article but am still unable to solve the problem. At the same time the next day, another instance in the cluster died for the same reason: ORA-29740, again with reason 2. The cluster had been running quite stably for the past month (the patchset was installed only about 30 days ago).

When I checked the Linux /var/log/messages, I found that syslogd restarted on both nodes at exactly the same time on both days, when the RAC instance died. Could there be some relation between them? The machines did not reboot; I checked the uptime value.

From the trace file, I found that the dead instance failed to transmit its heartbeat. First day, from the surviving instance rac1:

*** 2002-12-21 04:01:54.227
kjxgrnbrisalive: (1, 2) not beating, HB: 479418910, 479418910
*** 2002-12-21 04:01:54.239
kjxgrnbrdead: Detected death of 1, initiating reconfig
kjxgrrcfgchk: Initiating reconfig, reason 2
*** 2002-12-21 04:01:59.256
kjxgmrcfg: Reconfiguration started, reason 2
kjxgmcs: Setting state to 6 0.
*** 2002-12-21 04:01:59.258
Name Service frozen
kjxgmcs: Setting state to 6 1.

Second day, from the trace file of the surviving instance rac2:

*** 2002-12-22 04:01:56.457
kjxgrnbrisalive: (0, 1) not beating, HB: 479438832, 479438832
*** 2002-12-22 04:01:56.457
kjxgrnbrdead: Detected death of 0, initiating reconfig
kjxgrrcfgchk: Initiating reconfig, reason 2
*** 2002-12-22 04:02:01.486
kjxgmrcfg: Reconfiguration started, reason 2
kjxgmcs: Setting state to 9 0.
*** 2002-12-22 04:02:01.495
Name Service frozen

I wonder if anyone here has experience dealing with RAC systems. What should I check to find out why the RAC instance failed to update the controlfile? I have already enabled the event

event="29740 trace name errorstack level 3"

in one instance. Should I also set the undocumented parameter _imr_active=false on the system?

> Hi Chao:
>
> The Instance 2 in your Cluster (rac2) was dead during
> the fast reconfiguration (Check the reason in the alert log
> file.. which says reason 2). You generally do a reconfig
> (or fast reconfig) when you add/remove instances from the
> Cluster setup, which is not (I hope) in your case.
>
> There are some kernel events to trace the reconfigurations,
> and an underscore parameter (I think it is _imr_active !)
> to disable the 29740, usually not recommended.
>
> For investigation, review the check point, LMON trace files
> and check the OS log files.
>
> Best Regards,
> K Gopalakrishnan

--
Author: chao_ping
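
For anyone who wants to line these LMON trace excerpts up against OS log timestamps, here is a minimal Python sketch that pulls the missed-heartbeat time and the reconfiguration start time out of a trace excerpt and reports the gap between them. The function and field names (summarize, hb_unchanged, delay_s) are made up for illustration, the regular expressions are based only on the lines quoted above, and reading the two HB numbers as "counter did not advance" is an assumption, not documented Oracle behaviour.

```python
import re
from datetime import datetime

# Timestamp header as it appears in the excerpts: "*** 2002-12-21 04:01:54.227"
STAMP = r"\*\*\* (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"

# \s+ lets the patterns match whether the message follows the timestamp
# on the same line or on the next one.
MISSED = re.compile(
    STAMP + r"\s+kjxgrnbrisalive: \((\d+), (\d+)\) not beating, HB: (\d+), (\d+)"
)
RECONFIG = re.compile(STAMP + r"\s+kjxgmrcfg: Reconfiguration started, reason (\d+)")


def parse_stamp(text):
    """Convert a trace timestamp string into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def summarize(trace_text):
    """Return the first missed heartbeat and reconfig start, or None if absent."""
    missed = MISSED.search(trace_text)
    reconfig = RECONFIG.search(trace_text)
    if missed is None or reconfig is None:
        return None
    t_missed = parse_stamp(missed.group(1))
    t_reconfig = parse_stamp(reconfig.group(1))
    return {
        "missed_at": t_missed,
        # Assumption: equal HB values mean the counter did not advance.
        "hb_unchanged": missed.group(4) == missed.group(5),
        "reason": int(reconfig.group(2)),
        "delay_s": (t_reconfig - t_missed).total_seconds(),
    }


# First-day excerpt from the rac1 trace quoted in this post.
SAMPLE = """*** 2002-12-21 04:01:54.227
kjxgrnbrisalive: (1, 2) not beating, HB: 479418910, 479418910
*** 2002-12-21 04:01:54.239
kjxgrnbrdead: Detected death of 1, initiating reconfig
kjxgrrcfgchk: Initiating reconfig, reason 2
*** 2002-12-21 04:01:59.256
kjxgmrcfg: Reconfiguration started, reason 2
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The missed_at value can then be compared by hand with the syslogd restart time in /var/log/messages. On both days quoted above, the reconfiguration starts about five seconds after the missed heartbeat is reported.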
