RE: Raid Arrays and Power Loss
Ian,

Thanks for sharing (seriously).

Igor Neyman, OCP DBA
[EMAIL PROTECTED]

-----Original Message-----
From: MacGregor, Ian A.
Sent: Monday, September 15, 2003 11:34 PM
To: Multiple recipients of list ORACLE-L

Last Friday was hot here, and rumor has it our 230 kV power line sagged and touched some tree branches. The local power company shut it off, leaving our systems to depend on UPS. About 30 minutes afterwards one system produced these errors. This was just before the system went dead.

Fri Sep 12 12:58:40 2003
Errors in file /opt/oracle/admin/BBRO/bdump/bbro_ckpt_1420.trc:
ORA-00206: error in writing (block 3, # blocks 1) of controlfile
ORA-00202: controlfile: '/u1/oradata/BBRO/BBROcntrl01.ctl'
ORA-27063: skgfospo: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 8192
Fri Sep 12 12:58:42 2003
Errors in file /opt/oracle/admin/BBRO/bdump/bbro_ckpt_1420.trc:
ORA-00221: error on write to controlfile
ORA-00206: error in writing (block 3, # blocks 1) of controlfile
ORA-00202: controlfile: '/u1/oradata/BBRO/BBROcntrl01.ctl'
ORA-27063: skgfospo: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 8192
Fri Sep 12 12:58:42 2003
CKPT: terminating instance due to error 221
Instance terminated by CKPT, pid = 1420

Things look pretty shaky here. When things were restarted the following error was produced.

Fri Sep 12 13:32:01 2003
ORA-00204: error in reading (block 1, # blocks 1) of controlfile
ORA-00202: controlfile: '/u1/oradata/BBRO/BBROcntrl01.ctl'
ORA-27091: skgfqio: unable to queue I/O
SVR4 Error: 6: No such device or address
Additional information: 1

The RAID array had not been powered on. However:

Fri Sep 12 15:33:08 2003
ORA-00202: controlfile: '/u1/oradata/BBRO/BBROcntrl01.ctl'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Fri Sep 12 15:33:11 2003
ORA-205 signalled during: alter database mount...

Now the file system is available, but the file itself has disappeared. It was not corrupted, just disappeared. We duplex a copy to an internal disk, so recovery was easy. However, once this was fixed:

Fri Sep 12 16:18:58 2003
Thread recovery: start rolling forward thread 1
Fri Sep 12 16:18:58 2003
Errors in file /opt/oracle/admin/BBRO/udump/bbro_ora_1804.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u2/oradata/BBRO/redo0301.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
ORA-313 signalled during: ALTER DATABASE OPEN...

These files are on a RAID 1 LUN. Both copies of the file are gone. Again, not corrupted but gone. I don't know if using duplexing rather than RAID 1 would have mattered here, but I am changing things so that one group of redo logs is on internal disk and written via the duplexing method.

Ian MacGregor
Stanford Linear Accelerator Center
[EMAIL PROTECTED]

--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: MacGregor, Ian A.
INET: [EMAIL PROTECTED]
Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
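The change Ian describes, putting redo log copies on internal disk as well as on the array, comes down to making sure no log group has all of its members on one device. A minimal Python sketch of that check follows. It assumes the member paths have already been collected (for example from V$LOGFILE) into a dictionary, and that the first path component (/u1, /u2, ...) stands for the mount point. The function name and every path except redo0301.log are hypothetical.

```python
from pathlib import PurePosixPath


def groups_on_single_mount(members_by_group):
    """Return the log groups whose members all share one top-level directory.

    members_by_group maps a group number to a list of member file paths.
    Treating the top-level directory (e.g. u2) as the mount point is an
    assumption about the layout, not something Oracle reports.
    """
    at_risk = {}
    for group, members in members_by_group.items():
        # parts[0] is "/", parts[1] is the top-level directory name.
        mounts = {PurePosixPath(m).parts[1] for m in members}
        if len(mounts) < 2:
            at_risk[group] = sorted(mounts)
    return at_risk


# Hypothetical layout: group 1 is duplexed, group 3 lives only on /u2.
layout = {
    1: ["/u2/oradata/BBRO/redo0101.log", "/internal/oradata/BBRO/redo0102.log"],
    3: ["/u2/oradata/BBRO/redo0301.log"],
}
print(groups_on_single_mount(layout))
```

Any group the function reports would be lost outright if that one file system dropped its files, which is what happened to log group 3 here.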
RE: Raid Arrays and Power Loss
For the curious, what brand/model RAID 1 are you using? Size?

Rich

Rich Jesse
System/Database Administrator
[EMAIL PROTECTED]
Quad/Tech Inc, Sussex, WI USA

-----Original Message-----
From: MacGregor, Ian A. [mailto:[EMAIL PROTECTED]]
Sent: Monday, September 15, 2003 11:34 PM
To: Multiple recipients of list ORACLE-L
Subject: Raid Arrays and Power Loss

[snip]
Re: Raid Arrays and Power Loss
Hi, what is your OS and filesystem?

Regards
zhu chao
msn:[EMAIL PROTECTED]
www.cnoug.org

----- Original Message -----
To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED]
Sent: Tuesday, September 16, 2003 12:34 PM

[snip]
RE: Re: Raid Arrays and Power Loss
The OS is Solaris 8 (SunOS 5.8). The file system is Veritas.

Ian MacGregor
Stanford Linear Accelerator Center
[EMAIL PROTECTED]

-----Original Message-----
Sent: Tuesday, September 16, 2003 8:25 AM
To: Multiple recipients of list ORACLE-L

Hi, what is your OS and filesystem?

Regards
zhu chao

[snip]
RE: Raid Arrays and Power Loss
Okay, core questions:

- As someone asked, what's the make/model of storage?
- Has your RAID array lost its config? In other words, is the storage there, just with an empty VTOC/volume table/partition table (insert your particular OS nomenclature)?
- Is the filesystem good, just empty? When you say the file is gone, is the /u1 directory empty, or is the filesystem structure there and just that file is gone?

Okay, I just saw your message that shows it's Solaris 8 + Veritas. Here's what probably happened. The box was powered on without the RAID array powered on, and consequently Veritas doesn't see the disk groups/volumes that are on the RAID array. Have you tried doing (as root):

    vxconfigd -km enable

This will cause a rescan of the existing volume groups. Afterwards, what does a

    vxprint -hrt

look like?

In general, power loss to a RAID array will not produce the results you describe. I think it's far more likely that a system-array interaction is preventing proper access to your storage.

Thanks,
Matt

--
Matthew Zito
GridApp Systems
Email: [EMAIL PROTECTED]
Cell: 646-220-3551
Phone: 212-358-8211 x 359
http://www.gridapp.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of MacGregor, Ian A.
Sent: Tuesday, September 16, 2003 12:34 AM
To: Multiple recipients of list ORACLE-L
Subject: Raid Arrays and Power Loss

[snip]
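Matt's question of whether the directory is empty or only the one file is gone can be answered mechanically from the paths named in the alert log. The following Python sketch is one way to do that, not a tool anyone in the thread used. It assumes the alert-log text is available as a string; the regular expression and both function names are hypothetical, and only the ORA-00202 and ORA-00312 message shapes shown in this thread are handled.

```python
import os
import re

# ORA-00202 names a controlfile, ORA-00312 names an online log member.
# Both quote the path in single quotes, as in the excerpts in this thread.
PATH_RE = re.compile(r"ORA-00(?:202|312):.*?'([^']+)'")


def paths_in_log(text):
    """Return the distinct quoted file paths, in order of first appearance."""
    seen = []
    for path in PATH_RE.findall(text):
        if path not in seen:
            seen.append(path)
    return seen


def classify(path):
    """Say whether the file, or only its directory, is present."""
    if os.path.exists(path):
        return "present"
    if os.path.isdir(os.path.dirname(path)):
        return "file missing, directory present"
    return "directory missing (file system not mounted?)"


sample = """\
ORA-00202: controlfile: '/u1/oradata/BBRO/BBROcntrl01.ctl'
ORA-27037: unable to obtain file status
ORA-00312: online log 3 thread 1: '/u2/oradata/BBRO/redo0301.log'
ORA-00202: controlfile: '/u1/oradata/BBRO/BBROcntrl01.ctl'
"""

for p in paths_in_log(sample):
    print(p, "->", classify(p))
```

A result of "file missing, directory present" would match what Ian reports, while "directory missing" would point toward the unmounted-volume scenario Matt suspects.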
RE: Raid Arrays and Power Loss
The RAID array is a Sun A1000. I'm not sure of the vintage, but the disks are 18 GB. The RAID array did not lose its configuration. The storage is still there. Neither affected file system was ever empty, but a couple of files were lost, one on each file system. The box is located at one of our interaction regions (IRs).

Some additional information [results truncated]:

[EMAIL PROTECTED] $ last reboot
reboot    system boot    Fri Sep 12 15:32
reboot    system boot    Mon Aug 25 14:24

When this error occurred, the RAID box was off:

Fri Sep 12 13:32:01 2003
ORA-00204: error in reading (block 1, # blocks 1) of controlfile
ORA-00202: controlfile: '/u1/oradata/BBRO/BBROcntrl01.ctl'
ORA-27091: skgfqio: unable to queue I/O
SVR4 Error: 6: No such device or address
Additional information: 1

I had thought that the Unix box had already been rebooted, but that turns out to be false. After the box was rebooted with the RAID array on:

Fri Sep 12 15:33:08 2003
ORA-00202: controlfile: '/u1/oradata/BBRO/BBROcntrl01.ctl'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Fri Sep 12 15:33:11 2003

The other files on /u1 were fine. Also, concerning the other error:

Fri Sep 12 16:18:58 2003
Thread recovery: start rolling forward thread 1
Fri Sep 12 16:18:58 2003
Errors in file /opt/oracle/admin/BBRO/udump/bbro_ora_1804.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u2/oradata/BBRO/redo0301.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3

The other files on /u2 were fine. The files in question just disappeared. I know this is not normal and RAID boxes do not normally lose files, but it's hard to argue against the empirical evidence here that they can. It may be that either I or the folks down at IR-2 induced the problems. But files were indeed lost on two different LUNs.

My current thinking is that the two files were being written when the power was turned off on the RAID array, or there was not enough power to keep the disks spinning because the UPS had been drained. The battery for the cache was reporting low, but that is based on the number of hours in operation. Should it not have maintained the cache?

Ian MacGregor
Stanford Linear Accelerator Center
[EMAIL PROTECTED]

-----Original Message-----
Sent: Tuesday, September 16, 2003 10:55 AM
To: Multiple recipients of list ORACLE-L

[snip]
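The three SVR4 error numbers in these excerpts trace the sequence of events: 5 while the array was losing power, 6 while it was off, and 2 once the file system was back but the file was not. A small Python sketch for pulling those numbers out of alert-log text and naming them is shown below. It assumes the machine running the script uses the same numbering as Solaris for these low error values, which holds on common Unix-like systems; the function names are hypothetical.

```python
import errno
import re

SVR4_RE = re.compile(r"SVR4 Error: (\d+):")


def svr4_codes(text):
    """Return the SVR4 error numbers in the order they appear."""
    return [int(n) for n in SVR4_RE.findall(text)]


def errno_name(code):
    """Map an error number to its symbolic name, e.g. 5 -> 'EIO'."""
    return errno.errorcode.get(code, "unknown")


# Lines copied from the three stages of the incident described above.
sample = """\
Fri Sep 12 12:58:40 2003
SVR4 Error: 5: I/O error
Fri Sep 12 13:32:01 2003
SVR4 Error: 6: No such device or address
Fri Sep 12 15:33:08 2003
SVR4 Error: 2: No such file or directory
"""

for code in svr4_codes(sample):
    print(code, errno_name(code))
```

Seeing ENXIO give way to ENOENT is what separates "the device is not there" from "the device is there but the file is not", which is the distinction the thread keeps returning to.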
RE: Raid Arrays and Power Loss
My Veritas-trained co-worker says they ran into the same situation in the class, and fsck was able to find the missing inodes and repair the damage. We were thinking that Solaris not flushing the writes could be your problem. I was warned about that for HP-UX's syncer during training and am told there's a similar function on Solaris.

I'm just the messenger...

Rich

Rich Jesse
System/Database Administrator
[EMAIL PROTECTED]
Quad/Tech Inc, Sussex, WI USA

-----Original Message-----
From: MacGregor, Ian A. [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, September 16, 2003 2:05 PM
To: Multiple recipients of list ORACLE-L
Subject: RE: Raid Arrays and Power Loss

[snip]
RE: Raid Arrays and Power Loss
Thanks. I'll keep this in mind if it happens again.

-----Original Message-----
Sent: Tuesday, September 16, 2003 1:55 PM
To: Multiple recipients of list ORACLE-L

[snip]