Re: SAN Comparison Question
I would go for the one that promises to have a RAID-5 implementation that doesn't suffer from the usual RAID-5 problems. Heh-heh.

Mogens

Jay Hostetter wrote:

One of our hardware guys is seeking an opinion on SANs. He is comparing the Hitachi Thunder 9500 to the HP EVA 5000. Does anybody have any pros or cons to offer for either one? Good or bad experiences?

Thank you,
Jay

**DISCLAIMER This e-mail message and any files transmitted with it are intended for the use of the individual or entity to which they are addressed and may contain information that is privileged, proprietary and confidential. If you are not the intended recipient, you may not use, copy or disclose to anyone the message or any information contained in the message. If you have received this communication in error, please notify the sender and delete this e-mail message. The contents do not represent the opinion of DE except to the extent that it relates to their official business.

--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Mogens Nørgaard
INET: [EMAIL PROTECTED]
Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
RE: SAN configuration for Banner
Hi

I *think*, though I may have entirely misread the comment, that point 4 is just wrong. It sounds like you only have one RAID 5 set (which is shared between 2 machines). If this is the case then the clustering has no effect on the fault tolerance/performance of the RAID 5 set. Or am I just on crack here?

Point 5: writing to the datafiles *may* be acceptable on a RAID 5 set. It just depends on the volume of write activity. Does anyone have any idea what the IO requirements will be? If so, do the calculations. Writing to redo/archive almost certainly won't be acceptable. I'd hold out for RAID 1 for the logs and maybe compromise on RAID 5 for the data.

As for the last point, I'd also like to see the justification for "RAID 5 is 3 times more likely to suffer data loss". I'm afraid that a) I don't think I believe it and b) you've got the logs anyway :( OK, just a) really.

Corrections and clarifications welcome. You may also wish to look at James Morle's SANE SAN paper at www.oaktable.net; it would appear to be pertinent (and he does know whereof he speaks).

Niall

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sam Bootsma
Sent: 19 November 2003 15:55
To: Multiple recipients of list ORACLE-L
Subject: FW: SAN configuration for Banner

Hi List,

We are approaching the cusp of a decision on how to store Oracle data files on our SAN. We don't have the SAN yet, but it is due to arrive any week (if not any day). I passed Cary's "Is RAID 5 Really a Bargain?" paper to our Sys Admin, which he read and succinctly summarized for the Technical Manager here. I have also read through a couple of papers referenced on the BAARF site.

The Sys Admin's comments were:

Dell would like to know what RAID mode we want configured on the SAN for the B80 and 6C4 computers. Sam has told me that, in the Oracle community, mirroring (RAID1) is preferred over RAID 5 for various reasons (RAID5 is: more costly for write-intensive applications, 3 times more likely to incur data loss, suffers from massive performance degradation during partial outages). RAID1 will be more costly per unit of usable storage. Mirroring seems to be the best choice. Let me know what you think.

Here is the Manager's response:

Any suggestions on how I can counter points 4 and 5 and the last point before his Thanks line? Currently we have two B80s (AIX 4.3.3) set up in an HA configuration. They share an external disk array. So if a hardware component in the primary box fails, it will automatically fail over to the secondary box (and at the same time, the secondary box takes control of the external disk array). I think the clustering term in point (4) is referring to this setup.

Thanks for any suggestions.
Sam.

Sent: November 18, 2003 5:08 PM
Subject: RE: SAN configuration for Banner

All the points are valid... however, my thought processes were as follows:

1. The System Core Application disks are resident on the disks within the CPU and mirrored (everyone OK with that, I think)
2. The databases are resident on the SAN
3. The SAN disks are RAID 5 as they provide more usable space for the cost as compared to mirroring
4. The IBM systems (B80s, 6C4s) are clustered, which effectively mirrors the RAID 5 arrays, mitigating the issues Sam raises re performance degradation (which will only ever arise in the event of a failed disk/automated rebuild, which is usually configurable to address performance degradation)
5. Write to Disk/Commit to Database should be a background process (although I recognise this is a transaction/write-intensive system)

This is a standard model that all servers are being deployed with, and unless there are any specific technical reasons why this will not work it is the way I would like to see the systems implemented. Remember, with the SAN, reconfiguration of disks is not a large issue anymore if required in the future.

Although not an AIX/Oracle guy, I disagree with the statement that RAID5 is 3 times more susceptible to data loss. RAID 5 is a proven technology.

Thanks.
Andrew

-Original Message-
From: Carl Nowak
Sent: Tue 18/11/2003 2:56 PM
To: Andrew Riem
Subject: SAN configuration for Banner

Dell would like to know what RAID mode we want configured on the SAN for the B80 and 6C4 computers. Sam has told me that, in the Oracle community, mirroring (RAID1) is preferred over RAID 5 for various reasons (RAID5 is: more costly for write-intensive applications, 3 times more likely to incur data loss, suffers from massive performance degradation during partial outages). RAID1 will be more costly per unit
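For anyone wanting to "do the calculations" Niall asks for, here is a minimal sketch in Python of the usual back-of-envelope estimate. The write penalties (2 physical I/Os per logical write for RAID 1, 4 for a small RAID 5 write) are the common textbook rule of thumb, not figures from this thread or from any vendor, and the 600 reads/s and 400 writes/s workload is purely hypothetical. A large controller write cache can change the picture considerably.

```python
# Rule-of-thumb write penalties (assumed textbook values, not vendor figures):
#   RAID 1: one logical write becomes 2 physical writes (one per mirror side).
#   RAID 5: one small logical write becomes 4 physical I/Os
#           (read old data, read old parity, write new data, write new parity).
WRITE_PENALTY = {"raid1": 2, "raid5": 4}


def backend_iops(read_iops, write_iops, level):
    """Physical I/Os per second the disks must absorb for a given host workload."""
    return read_iops + write_iops * WRITE_PENALTY[level]


# Hypothetical workload: 600 reads/s and 400 writes/s issued by the database host.
for level in ("raid1", "raid5"):
    print(level, backend_iops(600, 400, level))
```

Dividing the result by what one spindle can sustain gives a rough spindle count for each RAID level, which is the number to put next to the cost-per-usable-gigabyte argument.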
RE: SAN configuration for Banner
Hi

I agree with you on point 4, and the SA here also confirms that the Manager has his facts wrong on this point. I haven't seen much here in terms of quantitative measurement of IO; I can ask around. I'd like to do DBA work, but they have me working on a Data Architecture project (which is great experience), so I have no time to be a DBA.

Cary Millsap's paper "Is RAID 5 Really a Bargain?" states that RAID 5 is three times more likely to incur data loss than a RAID 1 array. The paper is dated Jan. 3, 2000, so almost 4 years ago. I don't know if this is still true today or not, but I would also be interested in more details on this.

Thanks!
Sam.

-Original Message-
From: Niall Litchfield [mailto:[EMAIL PROTECTED]
Sent: November 20, 2003 7:44 AM
To: Multiple recipients of list ORACLE-L
Subject: RE: SAN configuration for Banner

Hi

I *think*, though I may have entirely misread the comment, that point 4 is just wrong. It sounds like you only have one RAID 5 set (which is shared between 2 machines). If this is the case then the clustering has no effect on the fault tolerance/performance of the RAID 5 set. Or am I just on crack here?

Point 5: writing to the datafiles *may* be acceptable on a RAID 5 set. It just depends on the volume of write activity. Does anyone have any idea what the IO requirements will be? If so, do the calculations. Writing to redo/archive almost certainly won't be acceptable. I'd hold out for RAID 1 for the logs and maybe compromise on RAID 5 for the data.

As for the last point, I'd also like to see the justification for "RAID 5 is 3 times more likely to suffer data loss". I'm afraid that a) I don't think I believe it and b) you've got the logs anyway :( OK, just a) really.

Corrections and clarifications welcome. You may also wish to look at James Morle's SANE SAN paper at www.oaktable.net; it would appear to be pertinent (and he does know whereof he speaks).

Niall
Re: SAN-Eva3000 experiences
Check out www.baarf.com regarding RAID levels. Check especially the Sane SAN paper by James Morle for do's and don'ts. EVA stuff is expensive, and some of our customers have had to spend much time and money on spare parts and consultants. Others have been happy with it.

Mogens

Jeroen van Sluisdam wrote:

Hi,

We're in the middle of buying a storage solution that will probably be an EVA3000. Because I'm new to this kind of storage and I will get implementation advice from consultants, I would like to have some background on experiences in implementing an Oracle database on an EVA3000. Any do's and don'ts? Any advice on RAID levels to use?

Thanks in advance,
Jeroen
re SAN, Heterogenous environments, Backups
John,

Re your suggestion to ask the vendors about the Veritas File System: both Sun and Hitachi said that the Veritas VxFS filesystem is not portable across servers. Thus a Sun server cannot read a VxFS file system on an HP server. We cannot, therefore, back up a SAN shadow-image of a Veritas Volume holding VxFS for HP to a Sun backup server, as the Sun server cannot mount the file system.

Hemant K Chitale
http://hkchital.tripod.com
RE: SAN, Heterogenous environments, Backups
John,

I forwarded your suggestion, together with more information on the Veritas Foundation Suite, to the team negotiating the hardware solution with the vendors [my involvement came in as I was allowed to ask about how database backups in a SAN were to be done].

Hemant

At 09:29 AM 07-02-03 -0800, you wrote:

Hemant,

Could you check what happens if all the filesystems that were 'shadow copied' were VxFS filesystems? I would assume that Veritas maintains the same formats across *nix versions.

Btw, I haven't implemented Heterogenous implementations on a SAN, but am just suggesting an alternative... Let us know what they say!

John Kanagaraj
Oracle Applications DBA
DBSoft Inc
(W): 408-970-7002

Hemant K Chitale
My web site page is : http://hkchital.tripod.com
RE: SAN, Heterogenous environments, Backups
Hemant,

Could you check what happens if all the filesystems that were 'shadow copied' were VxFS filesystems? I would assume that Veritas maintains the same formats across *nix versions.

Btw, I haven't implemented Heterogenous implementations on a SAN, but am just suggesting an alternative... Let us know what they say!

John Kanagaraj
Oracle Applications DBA
DBSoft Inc
(W): 408-970-7002

I don't know what the future holds for me, but I do know who holds my future!

** The opinions and statements above are entirely my own and not those of my employer or clients **

-Original Message-
From: Hemant K Chitale [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 05, 2003 5:34 PM
To: Multiple recipients of list ORACLE-L
Subject: SAN, Heterogenous environments, Backups

We are considering implementing a SAN, using Sun StorEdge SS9970 [OEM from Hitachi], where database servers will be a mix of Sun Solaris and HP HPUX running Oracle 8i, 9i, 9iRAC [in phase-2]. As the database files will be on File Systems [Sun or HP], the solutions provider says that backups of the HP file systems cannot be done by the backup server running Solaris.

The proposal is to use shadow copies. The Sun Engineer says that the shadow copy must be mounted as a file system on the backup server, and as the backup server is a Sun server it wouldn't be able to mount the HP file system.

How are Heterogenous implementations on a SAN done? How are backups done?

Hemant K Chitale
http://hkchital.tripod.com
Re: SAN
The ADABAS/Natural/Predict list server that I know about: http://www.uark.edu/sag-l/faq.html

You can also contact me directly and I will help if I can.

Yechiel Adar
Mehish

- Original Message -
To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED]
Sent: Friday, August 16, 2002 6:03 PM

So if I get a problem with my Mainframe Adabas databases, Natural or Predict environments, where do I post?
RE: SAN
So if I get a problem with my Mainframe Adabas databases, Natural or Predict environments, where do I post?

To come back to the SAN side of things: we have a number of them. Basically the expensive one works extremely well with Oracle databases. Split out the database, the log files and the redo log files as you would normally. Get the SAN configured so these go to different LUNs, or whatever your SAN vendor calls them. Set the block size to the same as the Oracle database, normally 8k.

What we have found on the Unix side is that the disk drives in the SAN are much faster than the drives we still have attached to the Unix servers. Also, we have people who actually tune the SAN. Within the SAN we have different disk drives with different characteristics: cheap, slow and big; expensive, fast and small. The SAN tuning basically moves the clients who have a lot of traffic to the faster drives transparently.

When reading, cache hit ratios of 95% in Oracle, plus the hardware cache on the disk controller, plus the hardware cache on the drive itself, mean that roughly only 2 or 3 transactions in 100 actually make it all the way to the hard drive. The ideal configuration is then to have as many spindles as possible so that the heads don't have to move too much. RAID 5 works very well for this. Some applications will make heavy read and write use of a temp database, so watch out for this.

When writing, each transaction has to be written to the hard drive when committed, plus the redo log files etc., so the work on the hard drives is much heavier. The same goes for rebuilding indexes. RAID 5 does not work well here, since to write a block on one drive you have to read the appropriate blocks on the other drives and write both the data drive and the error checking drive. Mirrored drives are best. Your SAN people should understand this.

Our expensive SAN has something like 64 gigs of memory in its cache and we have actually seen performance increases of 30 fold on some databases.
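The "2 or 3 in 100" figure above can be reproduced with a small Python sketch that treats the caches as layers, each one seeing only the misses of the layer before it. The 95% Oracle hit ratio is from the post; the 50% combined hit ratio for the controller and drive caches is an assumption picked for illustration, since the post gives no number for them.

```python
def fraction_reaching_disk(hit_ratios):
    """Fraction of read requests that miss every cache layer in turn."""
    remaining = 1.0
    for hit in hit_ratios:
        remaining *= 1.0 - hit  # only the misses fall through to the next layer
    return remaining


# 0.95 = Oracle buffer cache hit ratio quoted in the post.
# 0.50 = assumed combined hit ratio of controller and drive caches (hypothetical).
per_hundred = 100 * fraction_reaching_disk([0.95, 0.50])
print(round(per_hundred, 1))
```

Note that this only models reads. Committed writes and redo still have to reach stable storage, which is the point made in the paragraph on writing.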
Just a final comment: currently we have a UNIX server running an Oracle database that is managing to have hardware problems on the disk drive that holds the redo log files. Down time means unhappy clients. I'd love to see them move to the SAN.

A second UNIX client had a disk controller failure such that it reported it was writing the data to the RAID 5 disk system; unfortunately it wasn't. Took us a while to spot that one.

I have another client who resisted the idea of the central SAN on cost grounds. He's running on a Compaq server but has a brand X disk subsystem that runs RAID 5. He has a large load that takes 15 hours, writing all the time, including building indexes, which really hammers the disk drives. Compaq actually specially design their drives to run cool even if they are writing all the time; Brand X does not seem to be quite so concerned.

If you get a hardware read error on a disk, normally the server will try to reread it to get the data off, and you just see a delay. Nice systems note which drives are having errors and signal "please replace this drive before it fails". Brand X just struggles on. If it gets a really hard error it calculates the value from the other drives. The only trouble is you get to the point where the technician tells you, "You've so many errors on this RAID system I suggest you back it up immediately and I'll reformat the drives to keep it going." It takes five hours to back that server up. Needless to say the biggest file, 15 gigs, wouldn't back up: there were hard errors across the drives. That file of course was the large database file. Fortunately this took place on a Friday; it took us most of the weekend to clean it up.

So for me reliability is important. At the moment the SAN reliability is good.

Cheerio
John

-Original Message-
Sent: August 13, 2002 3:29 PM
To: Multiple recipients of list ORACLE-L

Nice to see someone who knows what ADABAS is.

Ron S.
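John's remarks that a RAID 5 system "calculates the value from the other drives" after a hard error, and that a write has to touch both the data drive and the error checking drive, both come down to XOR parity. Below is a minimal Python sketch, assuming a single stripe of three data blocks plus one parity block; the two-byte block contents are made up, and real arrays also rotate parity across drives, which is not shown.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks (made-up contents) and their parity block.
d0, d1, d2 = b"\x11\x22", b"\x0f\xf0", b"\xaa\x55"
parity = xor_blocks(d0, d1, d2)

# Hard error on the drive holding d1: rebuild it from the survivors plus parity.
rebuilt_d1 = xor_blocks(d0, d2, parity)
print(rebuilt_d1 == d1)

# Small write to d1: the parity must change too, which is why RAID 5 writes cost more.
new_d1 = b"\x01\x02"
new_parity = xor_blocks(parity, d1, new_d1)  # old parity XOR old data XOR new data
print(new_parity == xor_blocks(d0, new_d1, d2))
```

The rebuild only works while every other block in the stripe is readable, which is why hard errors spread across several drives left that 15 gig file unrecoverable.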
-Original Message-
Sent: Tuesday, August 13, 2002 1:50 PM
To: Multiple recipients of list ORACLE-L

You wrote:

Will the smarter algorithm look inside the contents of a file before reading it? If it does not, then how will it be able to intelligently read ahead what data Oracle wants from inside its datafile? If it does, how does it decipher Oracle's way of storage?

How about using a variation of the algorithm ADABAS (database) is using for sequential user reads:

1) Return the block requested.
2) If the next block is requested in a short period, read the next 5 blocks to the controller cache.
3) If those are read in order and a request comes for the next block, read 10 blocks each time.

This is just off the top of my head but will assist greatly for full table scans of big tables. Also remember that in Oracle before 9i, blocks used in FTS are put at the end of the LRU, so they are great candidates to be overwritten and you will read them again and again.

Yechiel Adar
Mehish

- Original Message -
To: Multiple recipients of list ORACLE-L
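The three-step read-ahead idea above can be sketched as a small simulation in Python. This is one reading of the description, not ADABAS's or any controller's actual code: the "short period" timing test is ignored, the read-ahead window is assumed to start at the block that missed, and a non-sequential request resets the pattern. The test layout (two non-adjacent extents of 27 blocks each, starting at hypothetical block addresses 0 and 1000) is the worked case from the follow-up message in this thread.

```python
def count_physical_reads(extents, first_window=5, later_window=10):
    """Count physical reads for a sequential scan over (start_block, length) extents."""
    cached = set()      # blocks currently held in the controller cache
    last = None         # previously requested block
    stage = 0           # 0 = no pattern yet, 1 = first read-ahead, 2+ = steady state
    physical_reads = 0
    for start, length in extents:
        for block in range(start, start + length):
            if block not in cached:
                physical_reads += 1
                if last is not None and block == last + 1:
                    # Sequential miss: read ahead 5 blocks first, then 10 at a time.
                    stage += 1
                    window = first_window if stage == 1 else later_window
                else:
                    # First request, or a jump to another extent: single block only.
                    stage = 0
                    window = 1
                cached.update(range(block, block + window))
            last = block
    return physical_reads


table = [(0, 27), (1000, 27)]                  # hypothetical extent start addresses
print(count_physical_reads(table))             # with read-ahead
print(count_physical_reads(table, 1, 1))       # no read-ahead: one read per block
```

Under these assumptions each extent costs five physical reads (1 + 5 + 10 + 10 blocks, then a final 10-block read that overshoots the extent by nine blocks), which agrees with the figures quoted in the follow-up.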
Re: SAN
I want to stress again that I am not an I/O expert.

As to your question: assume a table with 54 blocks divided into two extents of 27 blocks (I deliberately took a worst case). With direct reads you get 54 physical I/Os to the disks. With my algorithm you get 10 physical I/Os to the disk, with nine extra blocks read into the controller memory while the computer is doing other jobs. Since tables usually have extents bigger than 27 blocks, the saving can be much bigger.

Yechiel Adar
Mehish

- Original Message -
To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED]
Sent: Wednesday, August 14, 2002 12:13 AM

But inside an Oracle datafile a table may not lie in contiguous blocks. So if you are doing a full table scan, 2 extents can lie next to each other and then the remaining 2 can sit at the end of the datafile. Will the non-Oracle process be able to decipher this and do a read-ahead of those two extents? Probably not...

Babu

Yechiel Adar [EMAIL PROTECTED]@fatcity.com on 08/13/2002 01:50:15 PM
Please respond to [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED]
cc:

You wrote:

Will the smarter algorithm look inside the contents of a file before reading it? If it does not, then how will it be able to intelligently read ahead what data Oracle wants from inside its datafile? If it does, how does it decipher Oracle's way of storage?

How about using a variation of the algorithm ADABAS (database) is using for sequential user reads:

1) Return the block requested.
2) If the next block is requested in a short period, read the next 5 blocks to the controller cache.
3) If those are read in order and a request comes for the next block, read 10 blocks each time.

This is just off the top of my head but will assist greatly for full table scans of big tables. Also remember that in Oracle before 9i, blocks used in FTS are put at the end of the LRU, so they are great candidates to be overwritten and you will read them again and again.
Yechiel Adar
Mehish

- Original Message -
To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED]
Sent: Tuesday, August 13, 2002 5:30 PM

Tim, Jared and Kirti

Jared, Kirti: Thanks a lot for your input. And yes, I read the Sane SAN paper.

Tim: Many thanks to you for pointing out some of the big misassumptions I had made. I have corrected most of the stuff you had mentioned except for these.

* I'm less clear on whether SANs themselves perform read-ahead and the conditions under which they do so. I'm pretty sure that they are smarter about it than what you describe; usually read-ahead mechanisms are triggered by detected patterns of usage, not algorithms as simple as described...

Will the smarter algorithm look inside the contents of a file before reading it? If it does not, then how will it be able to intelligently read ahead what data Oracle wants from inside its datafile? If it does, how does it decipher Oracle's way of storage?

* your example about read-ahead conflicts makes some invalid assumptions, namely about space being allocated in blocks not extents (when does *that* ever happen?) and about read-ahead being set to 3 blocks (again, when does *that* ever happen?).

It does not happen. However, I am going to be talking to a bunch of non-Oracle folks and management, so I want to keep it as simple as possible.

Altogether, empirical evidence (i.e. many successful SAN implementations under Oracle over several years) does not lend credence to your basic assertion that SAN and Oracle don't go well together. It is a fact that they do...

I am not trying to make a statement that SAN and Oracle don't go well together. I am trying to convince my management that buying a SAN does not mean that we never need to worry about IO any more. Even a SAN needs to be configured. Currently they are under the impression that there are no IO problems, but my database IO waits are 50% of the total response time.
All my index and table data are scattered all over the disks, many on the same disk, and the answer I get is "No, we are not tasking the SAN at all. There are no IO issues."

Thanks a lot
Babu

Tim Gorman [EMAIL PROTECTED]@fatcity.com on 08/13/2002 07:58:29 AM
Please respond to [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED]
cc:

Babu,

Is it possible that you are confusing the term SAN with the term NAS? As I read through your email, I couldn't help thinking that you were discussing network-attached storage rather than storage-area networks. If so, some of my comments below might change slightly, but not much...

---

Most of your major assumptions are correct, but there are important errors...

* DBWR only does RW, never RR or SR. Mostly Oracle
Re: SAN
Babu,

Is it possible that you are confusing the term SAN with the term NAS? As I read through your email, I couldn't help thinking that you were discussing network-attached storage rather than storage-area networks. If so, some of my comments below might change slightly, but not much...

---

Most of your major assumptions are correct, but there are important errors...

* DBWR only does RW, never RR or SR. Mostly Oracle server processes do RR and SR, but ARCH also does SR, as do backup processes (whatever they are); everyone always forgets to add backup processes to the mix...

* SW is characteristic of LGWR and ARCH, but also of processes performing sorting (i.e. the "direct writes" wait-event). I think you'll agree that databases generating a lot of redo (and archived redo) and performing lots of sorting are not necessarily misconfigured. The amount of redo generated is really a characteristic of the application itself, not the database configuration. High amounts of sorting can possibly be tuned, but that too is more a characteristic of the application and users' usage of it than of database configuration...

* I'm not sure what your conclusions regarding RAID5 chunks or RAID0 stripes are, but I suspect they are incorrect. Oracle DB_BLOCK_SIZE should not come close to matching RAID5 chunksize or RAID0 stripesize; even (DB_BLOCK_SIZE * DB_FILE_MULTIBLOCK_READ_COUNT), which denotes the largest I/O requests (for full-table scans) generated by Oracle, should be much smaller than RAID5 chunksize or RAID0 stripesize, for most databases. So, whatever conclusions you are drawing from that point about sizes are likely incorrect...

* I'm less clear on whether SANs themselves perform read-ahead and the conditions under which they do so. I'm pretty sure that they are smarter about it than what you describe; usually read-ahead mechanisms are triggered by detected patterns of usage, not algorithms as simple as described...
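To put rough numbers on the sizing point above, here is a small Python sketch. The 8K block size echoes the "normally 8k" mentioned elsewhere in this archive; the multiblock read count of 16 and the 1 MiB chunk size are hypothetical values chosen for illustration, and `chunks_touched` is a helper invented here, not an Oracle or array function.

```python
KIB = 1024


def largest_multiblock_read(db_block_size, multiblock_read_count):
    """Largest single read request, in bytes, issued for a full-table scan."""
    return db_block_size * multiblock_read_count


def chunks_touched(offset, size, chunk_size):
    """Number of RAID chunks a request of `size` bytes at byte `offset` spans."""
    return (offset + size - 1) // chunk_size - offset // chunk_size + 1


request = largest_multiblock_read(8 * KIB, 16)   # hypothetical settings: 128 KiB
chunk = 1024 * KIB                               # hypothetical 1 MiB chunk

print(request, request < chunk)
print(chunks_touched(0, request, chunk))                # request sits inside one chunk
print(chunks_touched(chunk - 8 * KIB, request, chunk))  # request straddles a boundary
```

With a chunk much larger than the biggest request, most requests land on a single drive and only the occasional one straddles a boundary, which is one way to read the advice that the two sizes should not be close.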
Anyway, based on your mistaken assumptions, your list of conflicts between SAN and Oracle is quite mistaken as well...

* The difference between the stripe width and DB_BLOCK_SIZE is not excess I/O at the SAN level; the disk drives do not necessarily read the entire stripe or chunk; they merely *store* data in those extents on the device. They don't have to read/write in those increments...

* Your example about read-ahead conflicts makes some invalid assumptions, namely about space being allocated in blocks not extents (when does *that* ever happen?) and about read-ahead being set to 3 blocks (again, when does *that* ever happen?). You do have some of the basic ideas right, but please remember that your assumptions may be overly simplistic or just unlikely. Moreover, remember that some of your basic assumptions (especially regarding SW and database configuration) are just plain wrong...

* Your points about caching are mostly correct, except for DBWR doing reads again. Also, even though LGWR uses something called the log buffer, please be aware that this data structure is not a cache. A buffer is a data structure into which data is written once and read only once; a cache is a data structure into which data is (hopefully) written once and (hopefully) read many times. So, I/O in the LGWR stream is *not* cached by Oracle at all. The buffer mechanism is there purely to facilitate concurrency and the multiplexing of multiple server processes generating redo into the single LGWR process performing the write to online redo log files. Lastly, your comment that "SAN's buffer can never really provide to Oracle the data it reads most - it's already there in Oracle" is just plain incorrect. Please remember the distinction between a buffer and a cache, first of all. Second, remember that not all I/O is cached by Oracle (i.e. redo). Third, please remember that database performance health is not guaranteed by high BCHR in Oracle anyway...
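The buffer-versus-cache distinction above can be illustrated with a toy model. This is only a sketch of the two access patterns; `RedoBuffer` and `BlockCache` are hypothetical names and do not reflect how Oracle implements the log buffer or the buffer cache.

```python
from collections import deque


class RedoBuffer:
    """Toy buffer: each entry is written once and read once, when flushed."""

    def __init__(self):
        self._entries = deque()

    def write(self, entry):
        self._entries.append(entry)

    def flush(self):
        # The single writer drains everything; entries are never re-read.
        drained = list(self._entries)
        self._entries.clear()
        return drained


class BlockCache:
    """Toy cache: a block is loaded once and then (hopefully) read many times."""

    def __init__(self, loader):
        self._loader = loader      # function that performs the physical read
        self._blocks = {}
        self.physical_reads = 0

    def get(self, block_id):
        if block_id not in self._blocks:
            self.physical_reads += 1
            self._blocks[block_id] = self._loader(block_id)
        return self._blocks[block_id]


buf = RedoBuffer()
for change in ("redo-1", "redo-2", "redo-3"):
    buf.write(change)
flushed = buf.flush()        # read exactly once
second_flush = buf.flush()   # nothing left to read

cache = BlockCache(loader=lambda block_id: f"contents of block {block_id}")
for _ in range(3):
    cache.get(42)            # one physical read, then two cache hits
```

In this model several producers could call `write` while one consumer calls `flush`, which is the multiplexing role described above; nothing is ever served from the buffer a second time.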
---

Many of the concepts discussed here are not characteristic only of SANs; they also pertain to file-systems, logical volume managers, JBOD, and NAS. Please rethink some of the concepts you are thinking about...

Altogether, empirical evidence (i.e. many successful SAN implementations under Oracle over several years) does not lend credence to your basic assertion that SAN and Oracle don't go well together. It is a fact that they do...

---

If my original supposition that you are confusing SAN with NAS is correct, then I would agree with you that NAS and Oracle don't go well together in most situations, especially those involving high volumes of I/O. NAS is great for non-DBMS uses (i.e. file serving) and for uses with low volumes of I/O from DBMSs (i.e. DEV environment), but there is lots of empirical evidence out there indicating that NAS stinks for high volumes of I/O from DBMSs. Of course, that opinion is based on the current state of affairs -- many NAS vendors have significant advances in technology in the
Re: SAN
Tim, Jared and Kirti,

Jared, Kirti: Thanks a lot for your input. And yes, I read the Sane SAN paper.

Tim: Many thanks to you for pointing out some of the big misassumptions I had made. I have corrected most of the stuff you had mentioned except for these.

"* I'm less clear on whether SANs themselves perform read-ahead and the conditions under which they do so. I'm pretty sure that they are smarter about it than what you describe; usually read-ahead mechanisms are triggered by detected patterns of usage, not algorithms as simple as described..."

Will the smarter algorithm look inside the contents of a file before reading it? If it does not, then how will it be able to intelligently read ahead what data Oracle wants from inside its datafile? If it does, how does it decipher Oracle's way of storage?

"* your example about read-ahead conflicts makes some invalid assumptions, namely about space being allocated in blocks not extents (when does *that* ever happen?) and about read-ahead being set to 3 blocks (again, when does *that* ever happen?)."

It does not happen. However, I am going to be talking to a bunch of non-Oracle folks and management, so I want to keep it as simple as possible.

"Altogether, empirical evidence (i.e. many successful SAN implementations under Oracle over several years) does not lend credence to your basic assertion that SAN and Oracle don't go well together. It is a fact that they do..."

I am not trying to make a statement that SAN and Oracle don't go well together. I am trying to convince my management that buying a SAN does not mean that we never need to worry about IO any more. Even a SAN needs to be configured. Currently they are under the impression that there are no IO problems, but my database IO waits are 50% of the total response time. All my index and table data are scattered all over the disks - many on the same disk - and the answer I get is "No, we are not tasking the SAN at all. There are no IO issues."

Thanks a lot,
Babu
Re: SAN
You wrote:

"Will the smarter algorithm look inside the contents of a file before reading it? If it does not, then how will it be able to intelligently read ahead what data Oracle wants from inside its datafile? If it does, how does it decipher the Oracle's way of storage?"

How about using a variation of the algorithm that ADABAS (the database) uses for sequential user reads:

1) Return the block requested.
2) If the next block is requested within a short period, read the next 5 blocks into the controller cache.
3) If those are read in order and a request comes for the next block, read 10 blocks each time.

This is just off the top of my head, but it will assist greatly with full table scans of big tables. Also remember that in Oracle before 9i, blocks used in an FTS are put at the end of the LRU, so they are great candidates to be overwritten and you will read them again and again.

Yechiel Adar
Mehish
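The three steps Yechiel lists can be sketched in a few lines. This is a toy model under stated assumptions, not ADABAS's or any controller's actual code: the class name and window parameters are hypothetical, only a single request stream is modelled, and the "short period" timing test is left out, so any request for the immediately following block counts as sequential. Note that it works purely on block addresses and never looks inside the file.

```python
class SequentialReadAhead:
    """Toy controller cache that prefetches only when it sees sequential access."""

    def __init__(self, first_window=5, steady_window=10):
        self.first_window = first_window      # step 2: blocks prefetched at first
        self.steady_window = steady_window    # step 3: blocks prefetched afterwards
        self.cache = set()
        self.last_block = None
        self.streak = 0                       # consecutive sequential requests
        self.disk_reads = []                  # (start_block, block_count) per disk I/O

    def read(self, block):
        sequential = self.last_block is not None and block == self.last_block + 1
        self.streak = self.streak + 1 if sequential else 0
        self.last_block = block

        if block in self.cache:
            self.cache.discard(block)
            return "cache"

        # Cache miss: step 1 always fetches the requested block itself.
        if not sequential:
            prefetch = 0
        elif self.streak == 1:
            prefetch = self.first_window
        else:
            prefetch = self.steady_window
        self.disk_reads.append((block, 1 + prefetch))
        self.cache.update(range(block + 1, block + 1 + prefetch))
        return "disk"


scan = SequentialReadAhead()
scan_results = [scan.read(b) for b in range(20)]     # a 20-block sequential scan

lookup = SequentialReadAhead()
lookup_results = [lookup.read(b) for b in (50, 3, 97)]  # scattered single-block reads
```

In this model the 20-block scan needs four disk I/Os, while the scattered reads trigger no prefetch at all, so random access pays no read-ahead penalty.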
RE: SAN
Nice to see someone who knows what ADABAS is.

Ron S.
Re: SAN
But inside an Oracle datafile a table may not lie in contiguous blocks. So if you are doing a full table scan, 2 extents can lie next to each other and then the remaining 2 can sit at the end of the datafile. Will the non-Oracle process be able to decipher this and do a read-ahead of those two extents? Probably not...

Babu
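One way to size the concern about non-contiguous extents is to count how many block reads in a scan simply follow the previous block. The sketch below uses a made-up extent map (four 128-block extents, two at the start of the file and two near the end) and assumes datafile blocks map linearly onto device addresses, which striping or a file system may not preserve. The function name and all numbers are hypothetical.

```python
def sequential_fraction(extents):
    """Fraction of block reads that directly follow the previously read block.

    extents: list of (start_block, block_count) tuples in scan order.
    """
    total_blocks = sum(count for _, count in extents)
    runs = 0
    prev_end = None
    for start, count in extents:
        if start != prev_end:
            runs += 1          # a jump: the first block of this run is not sequential
        prev_end = start + count
    return (total_blocks - runs) / total_blocks


# Hypothetical table: two adjacent extents, then two more near the end of the file.
table_extents = [(0, 128), (128, 128), (9000, 128), (9128, 128)]
fraction = sequential_fraction(table_extents)
print(f"{fraction:.1%} of block reads continue a sequential run")
```

Under these assumptions a pattern-based read-ahead would miss only at the two jumps; it cannot predict where the next extent starts, but it does not need to in order to help with the blocks inside each extent.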
Re: SAN issues
Babu,

Nice comprehensive list of things to consider with a SAN. Just a couple of thoughts.

* Oracle requests DBWR-1 for IDX1 and waits. DBWR-1 makes a Unix IO call and waits for Unix to return data. Unix talks to SAN and SAN starts reading from the disk. Assume that it takes 3 seconds to read the entire IDX1. SAN starts returning data in chunks to Unix and Unix gives it back to Oracle.

Data is read from disk by server processes, not by DBWR.

* Now a slightly bigger picture. There are 6 processes trying to read the data from six different tables.

This occurs regardless of the type of storage system, so I'm not sure it really belongs in a list of SAN-specific concerns.

* Lets forget all this buffering, caches etc. Assume we have 10 disks in two LUNs. Both the LUNs share the 10 disks. Each of this LUN is made visible to Unix as a mountpoint. The DBA uses one mountpoint for indexes and one mountpoint for tables.

You can have this same kind of configuration problem with any disk storage manager.

Don't forget the management issue with SANs. SAs love them because they greatly reduce the amount of work they must do to manage storage. They can be properly configured from a database point of view, at least as far as distributing IO is concerned; you just need to make it known that you would like some input on its configuration.

Jared

[EMAIL PROTECTED] Sent by: [EMAIL PROTECTED] 08/12/2002 01:38 PM Please respond to ORACLE-L To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED] cc: Subject: SAN issues

All,

I am trying to get our management to understand the issues related to SAN. These are my thoughts. Let me know what you think about it... (PS: Apologies if you receive this twice. I posted it but I never saw it come through the list and so I posted again.)

Babu

SAN Issues

SAN and Oracle - Conflicting IO behavior

* There are four types of IO in Oracle:
  1. Random Reads (RR) - DBWR - Using indexes
  2. Sequential Reads (SR) - DBWR - Full table scans
  3. Random Writes (RW) - DBWR - Writing dirty blocks
  4. Sequential Writes (SW) - LGWR, Arch - Writing redo logs and Redo Archival + Control files
* Bulk of any Oracle database's IO is done in RR, SR and RW. If SW is very high it denotes configuration problems.
* SAN (or for that matter any RAID device) is configured for writing or reading large chunks at a time. The stripe size on most SANs and RAID devices is 256K or more. Compare this to the Oracle block size of 4K/8K in most databases (going up to 32K in data warehouses).
* SANs do Read Ahead. If one block is requested, they read more than one block while at the disk, hoping that the same process will request the other blocks some time soon.

Here is the conflict.

* Whenever Oracle does a RR, SR or RW it reads or writes randomly and not sequentially. It will read/write a particular block at a time in case of RR and RW and 'x' blocks (where x = db_file_multiblock_read_count) in case of SR. Therefore only during SR will Oracle use the entire stripe width. In all other cases, the difference between the stripe width and db_block_size will be excess IO.
* Why read ahead will cause a conflict:
* The internal structure of a datafile could be as follows. The file consists of 10 blocks. These are occupied by 3 tables. The blocks shown below are numbered using table_name.block_number

|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
| 1.1 | 1.2 | 2.1 | 3.1 | 3.2 | 3.3 | 2.2 | 1.3 | 2.3 | 3.4 |
|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|

* The first block on the datafile is the first block of table 1, the second block is the second block of table 1, the third block is the first block of table 2 and so on. (For simplicity's sake, I am assuming Oracle will allocate space in blocks and not in extents.)
* Now assume Oracle requests the first block of table 1. Assume read ahead is set to three blocks (three blocks will be read instead of 2 blocks). In this case the SAN will read 2.1, 3.1, 3.2.
* The blocks 3.1 and 3.2 will be entirely useless as Oracle is never going to read them. SAN cannot tell that the block 2.2 that Oracle might possibly request next is the 7th block in the datafile and so it can never read ahead intelligently.

Why the buffer of SAN has very little impact w.r.t. Oracle read performance?

* Oracle has its own buffering for all IO types
* DBWR reads and writes use the DB Buffer Cache
* LGWR uses the Log buffer
* Db buffer Cache is managed by a LRU
RE: SAN issues
Babu,

If you have not already done so, please also review a paper by James Morle: Sane SAN
http://www.scaleabilities.com/whitepapers.shtml

- Kirti
Re: SAN - Oracle - Pitfalls - Adv/Disadvantages??
James Morle wrote a paper about this called Sane SAN. You can download it from www.oraperf.com.

Anjo.

Mandar A. Ghosalkar wrote:

Hello Guys, any guys here who have SAN. We are inviting a SAN vendor for possible solutions for our enterprise. i am unaware about how SAN would affect me as DBA. Also we are thinking about how we can use OS level block replication between two database servers located in different cities (SF and LA). any suggestion about pitfalls? TIA Mandar

-- Author: Anjo Kolk
Re: SAN - Oracle - Pitfalls - Adv/Disadvantages??
I like SANs for a lot of reasons, and if you are using them in a traditional non-replicated environment, they work well. Just don't buy into the sales hype that you don't need to do physical database design because "we have such big caches", etc. Those big caches are great for improving write performance, but don't help that much on heavy random read activity. I have seen very poor performance on very large EMC arrays due to this mentality, so do your homework.

The remainder of this e-mail deals with doing block (or track) level replication between SANs. The only method of real-time block-level replication supported by Oracle is synchronous. So you need to worry about the impact synchronous replication will have on your transaction stream (especially log buffer writes). EMC only ships entire tracks, so this really slows down log writes, as you end up shipping the same track over and over again synchronously to the remote EMC box's memory. Also, there are limits on the number of hops between the two locations and on transport delay (speed of light is a factor). About the only way it makes sense is if you can pull dark fiber between the two sites.

If you can afford to lose some transactions, then you can use a split mirror approach (EMC has another product called TimeFinder which is useful here). This approach does not impact your online transaction stream and is less demanding from a network perspective. Both of these solutions are cool black box solutions that a DBA doesn't need to worry about.

My own opinion is that after reality sets in and you realize that the synchronous approach will not work, why not just go with DataGuard? Same benefit at a fraction of the cost. Bill

--- Mandar A. Ghosalkar [EMAIL PROTECTED] wrote: Hello guys, does anyone here have a SAN? We are inviting a SAN vendor in to discuss possible solutions for our enterprise. I am unaware of how a SAN would affect me as a DBA.
Also, we are thinking about how we can use OS-level block replication between two database servers located in different cities (SF and LA). Any suggestions about pitfalls? TIA Mandar

-- Author: Bill Pass
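As a rough back-of-the-envelope check on the "speed of light is a factor" remark, the sketch below estimates the propagation delay that a synchronous write between SF and LA would pay. Everything numeric here is an assumption for illustration, not a figure from the thread: a straight-line distance of about 560 km (a real fiber route would be longer), a refractive index of 1.5 for the fiber, and no allowance for switch hops, protocol round trips or array service time. The function names are hypothetical.

```python
C_VACUUM_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.5      # assumed; light in fiber travels at roughly c / 1.5
SF_TO_LA_KM = 560.0               # assumed straight-line distance; real routes are longer


def round_trip_ms(distance_km, refractive_index=FIBER_REFRACTIVE_INDEX):
    """Propagation delay only: out to the remote array and back, in milliseconds."""
    speed_km_per_s = C_VACUUM_KM_PER_S / refractive_index
    return 2.0 * distance_km / speed_km_per_s * 1000.0


def max_serial_writes_per_s(distance_km):
    """Upper bound on strictly serialized synchronous writes per second,
    if each write must wait for one full round trip before the next starts."""
    return 1000.0 / round_trip_ms(distance_km)


if __name__ == "__main__":
    rtt = round_trip_ms(SF_TO_LA_KM)
    print(f"round trip: {rtt:.1f} ms")
    print(f"serialized writes/s (upper bound): {max_serial_writes_per_s(SF_TO_LA_KM):.0f}")
```

Under these assumptions the round trip alone is on the order of five to six milliseconds per write, before any of the extra delays listed above, which is why the log writer feels synchronous replication first.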
Re: SAN Implementation
I personally love James Morle's Sane SAN paper, available at, for instance, www.OraPerf.com. James is cool and knows what he's talking about. I used to say with pride that James is the guy who wrote the book which is placed in my bathroom. Turns out that while we've been away for the Database Forum here in Sydney, Bjorn Engsig has replaced it with Jonathan's book. I should add that there's been a lot of protest over that from the OakTable members. They want both of the books there... possibly along with Tom Kyte's book, of course... But seriously: get James' paper. If you have questions about it, email him. He responds.

Nikunj Gupta wrote: Hello Group / Gurus, does anyone have a white paper on SANs and their implementation, especially with Oracle? Any thoughts or personal experience with pros and cons will be highly appreciated. TIA. Enlighten me.

-- Author: Mogens Nørgaard