RE: Root Cause Analysis White Papers
Peter - I can see your desires going in two directions:

1. Operational procedures - things like "how do I know I'm running out of disk space before I've run out?"
2. Change management - most failures come from changes to the software. For example, somebody adds some programs and that affects other programs, or somebody changes a table and that affects another program.

For both of these, I wouldn't limit myself to sources specific to Oracle. Good standard I.S. procedures apply.

I haven't read it, but Oracle 24x7 Tips and Techniques by Venkat S. Devraj and Ravi Balwada looks as if it has some chapters that may be pertinent. Here is a link that might work, but will probably get chopped.
http://shop.barnesandnoble.com/booksearch/isbnInquiry.asp?userid=1G60ZMKA1J; mscssid=DDX7X88RWH4S9NNU2QH8344EP6QJ4VX7isbn=0072119993

Dennis Williams
DBA
Lifetouch, Inc.
[EMAIL PROTECTED]

-----Original Message-----
Sent: Tuesday, April 30, 2002 11:11 AM
To: Multiple recipients of list ORACLE-L

We have been having some heavy discussions about system failures, root cause analysis and developing some proactive metrics.

Generally, our problems revolve around frequent late nights for the On Call DBA because something out of our control goes wrong. The damagement folks want to fix the immediate problem and consider the job done. The DBAs are asking for an approach that will allow us to identify potential problems before something breaks at 3:00 a.m.

Does anyone know of a source of white papers or other data that has been generated for systems, storage or databases? We can always roll our own, but why recreate someone else's work?

=====
Pete Barnett
Lead Database Administrator
The Regence Group
[EMAIL PROTECTED]

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: Peter Barnett
  INET: [EMAIL PROTECTED]

Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051
San Diego, California -- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
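Dennis's first direction, knowing that disk space is running out before it has run out, lends itself to a small scheduled check. The following is only a minimal sketch in Python, not anything proposed in the thread: the function names, the 80% threshold and the list of paths are all assumptions made for illustration, and a real site would feed the result into whatever paging or mail system it already has.

```python
import shutil


def used_fraction(total_bytes, used_bytes):
    """Fraction of the filesystem in use; 0.0 when the total size is unknown."""
    if total_bytes <= 0:
        return 0.0
    return used_bytes / total_bytes


def disk_space_alerts(paths, warn_fraction=0.80):
    """Return (path, fraction_used) for every path at or above warn_fraction."""
    alerts = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        fraction = used_fraction(usage.total, usage.used)
        if fraction >= warn_fraction:
            alerts.append((path, fraction))
    return alerts


if __name__ == "__main__":
    # Hypothetical path list; replace with the filesystems that matter.
    for path, fraction in disk_space_alerts(["."]):
        print(f"WARNING: {path} is {fraction:.0%} full")
```

Run from a scheduler during working hours, a check like this turns a 3:00 a.m. failure into a daytime warning. The threshold is a judgment call and would need tuning per filesystem.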
Re: Root Cause Analysis White Papers
See the USENIX docs on auditing and sys-admin skills for examples. LISA papers have also covered audit procedures.

What you are really asking for is a system audit. The results from a full audit would be a good place to start looking at what is done on the systems, what goes wrong with them and where to look further for root causes.

Audit results also give you the facts you'll need in convincing the manglement that something really is wrong and that it needs fixing.

--
Steven Lembark
Workhorse Computing
2930 W. Palmer
Chicago, IL 60647
+1 800 762 1582
Re: Root Cause Analysis White Papers
Hello Peter

A good source is the calls themselves. After each call, analyze the root cause and start checking for it. This way you will be able to eliminate a lot of problems in a short time.

Yechiel Adar
Mehish
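Yechiel's suggestion, recording the root cause after each call and then checking for it, could start as nothing more than a tally of causes so the most frequent one gets a proactive check first. This is a rough sketch in Python under stated assumptions: the function name and the log entries below are hypothetical examples invented for illustration, not data from the original poster's site.

```python
from collections import Counter


def rank_root_causes(incidents):
    """Count incidents per root cause, most frequent first."""
    counts = Counter(cause for _date, cause in incidents)
    return counts.most_common()


# Hypothetical on-call log: (date, root cause recorded after the call).
sample_log = [
    ("2002-04-02", "tablespace full"),
    ("2002-04-09", "unannounced schema change"),
    ("2002-04-15", "tablespace full"),
    ("2002-04-23", "archive log destination full"),
    ("2002-04-28", "tablespace full"),
]

for cause, count in rank_root_causes(sample_log):
    print(f"{count:2d}  {cause}")
```

A ranked list like this is also the kind of fact that helps when arguing that the immediate fix is not the whole job.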
Re: Root Cause Analysis White Papers
Also have a look around TechRepublic http://www.techrepublic.com. They usually have articles about things like this and may even have some templates that you can use as a guide to get started. Their approach would be very IT generalist.

Cheers

--
=================================================
Peter McLarty
Technical Consultant
APAC Technical Services
Brisbane, Australia
E-mail: [EMAIL PROTECTED]
WWW: http://www.mincom.com
Phone: +61 (0)7 3303 3461
Mobile: +61 (0)402 094 238
Facsimile: +61 (0)7 3303 3048
=================================================
A great pleasure in life is doing what people say you cannot do.
- Walter Bagehot (1826-1877 British Economist)
=================================================
Mincom "The People, The Experience, The Vision"
=================================================