Re: SAN issues

Jared . Still Mon, 12 Aug 2002 16:23:04 -0700

Babu,

Nice comprehensive list of things to consider with a SAN,

Just a couple of thoughts.

> Ž Oracle requests DBWR-1 for IDX1 and waits. DBWR-1 makes a Unix IO
> call and waits for Unix to return data. Unix talks to SAN and SAN starts
> reading from the disk. Assume that it takes 3 seconds to read the entire
> IDX1. SAN starts returning data in chunks to Unix and Unix gives it back
to
> Oracle.

Data is read from Disk by server processes, not by DBWR.

> Ž Now a slightly bigger picture. There are 6 processes trying to
read
> the data from six different tables.

This occurs regardless of the type of storage system, so I'm not sure it
really belongs in a list of SAN specific concerns.

> Ž Lets forget all this buffering, caches etc. Assume we have 10
disks
> in two LUNs. Both the LUNs share the 10 disks. Each of this LUN is made
> visible to Unix as a mountpoint. The DBA uses one mountpoint for indexes
> and one mountpoint for tables.

You can have this same kind of configuration problem with any disk
storage manager.

Don't forget the management issue with SANs. SA's love them because it
greatly reduces the amount of work they must do to manage storage. They
can be properly configured from a database point of view, at least as
far as distribuing IO is concerned, you just need to make it known that
you would like some input on it's configuration.

Jared

[EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
08/12/2002 01:38 PM
Please respond to ORACLE-L

To: Multiple recipients of list ORACLE-L <[EMAIL PROTECTED]>
cc:
Subject: SAN issues

All

I an trying to get our management understand the issues related to SAN.
These are my thoughts. Let me know what you think about it...
(PS : Apologies if you recv this twice. I posted it but I never saw it
come
through the list and so I posted again)

Babu

SAN Issues

SAN and Oracle ? Conflicting IO behavior
Ž There are four types of IO in Oracle
1. Random Reads (RR) ? DBWR - Using indexes
2. Sequential Reads (SR) ? DBWR - Full table scans
3. Random Writes (RW) ? DBWR ? Writing dirty blocks
4. Sequential Writes (SW) ? LGWR, Arch ? Writing redo logs and Redo
Archival + Control files
Ž Bulk of any Oracle database's IO is done in RR, SR and RW. If SW is
very high it denotes configuration problems.

Ž SAN (or for that matter any RAID device) is configured for writing
or
reading large chunks at a time. The stripe size on most SANs and RAID
devices are 256K or more. Compare this to the Oracle block size of 4k/8k
in
most databases (going upto 32K in datawarehouses)
Ž SANs do "Read Ahead". If one block is requested, they read more than
one blocks "while at the disk" hoping that the same process will request
the other blocks some time soon.

Here is the conflict.
Ž When ever Oracle does a RR, SR or RW it writes randomly and not
sequentially. It will read/write a particular block at a time in case of
RR and RW and 'x' blocks (where x = dbfile_multi_block_read_count) in case
of SR. Therefore only during SR will Oracle use the entire stripe width.
In
all other cases, The difference in the stripe width and db_block_size will
be excess IO.
Ž Why "read ahead" will cause a conflict :
Ž The internal structure of a datafile could be as follows. The
file consists of 10 blocks. These are occupied by 3 tables. The
blocks shown below are numbered using table_name.block_number

|---------+---------+---------+---------+---------+---------+---------+---------+---------+---------|
| | | | | | | | |
| |
| 1.1 | 1.2 | 2.1 | 3.1 | 3.2 | 3.3 | 2.2 |
1.3 | 2.3 | 3.4 |
| | | | | | | | |
| |

|---------+---------+---------+---------+---------+---------+---------+---------+---------+---------|

Ž The first block on the datafile is the first block of table 1,
second block is the second block of table 1, the third block is the
first block of table 2 and so on.. (For simplicity sake, I am
assuming Oracle will allocate space in blocks and not in extents)
Ž Now assume Oracle requests the first block of table 1. Assume
read ahead is set to three blocks (three blocks will be read instead
of 2 blocks). In this case the SAN will read 2.1, 3.1,3.2.
Ž The blocks 3.1 and 3.2 will be entirely useless as Oracle is
never going to read it. SAN cannot tell that the block 2.2 that
Oracle might possible request next is the 7th block in the datafile
and so it can never "read ahead" intelligently.

Why the buffer of SAN has very little impact w.r.t Oracle read
performance?
Ž Oracle has its own buffering for all IO types
Ž DBWR reads and writes uses the DB Buffer Cache
Ž LGWR uses the Log buffer
Ž Db buffer Cache is managed by a LRU Algorithm (Touchcount from 9I).
Ž Bulk of the IO done by Oracle is Logical IO (LIO) and not Physical
IO
(PIO).
Ž Assume the buffer cache hit ratio is 80%. This means that only 20%
of
the IO calls are PIO. Only 20% of the calls ever hit the SAN's cache.
Since
this 20% is probably the least requested/never requested data (going by
Oracle's LRU algorithm) , its quite likely that the SAN's buffers don't
have this either.
Ž Given that Oracle is going to cache even this 20% in its buffers,
the
next PIO call is going to be for something totally different ? which is
not
there in the SAN's buffer.
Ž Couple this with the read-ahead (discussed earlier), Our SAN's
buffer
is now populated with lots of data that Oracle might never use a PIO to
retrieve.
Ž Thus the SAN's buffer can never really provide to Oracle the data it
reads most ? Its already there in Oracle.
Ž To be fair, SAN's huge buffers will come as a boon to small
databases
? where the entire database can be cached in the SAN's buffers.

SAN or no SAN ? Why will performance be affected if we have indexes and
tables on the same disk

Ž Lets forget all this buffering, caches etc. Assume we have 10 disks
in two LUNs. Both the LUNs share the 10 disks. Each of this LUN is made
visible to Unix as a mountpoint. The DBA uses one mountpoint for indexes
and one mountpoint for tables.
Ž Since we have used all the 10 disks for both the LUNs, the structure
of one disk can look like this. The first two blocks are a chunk of LUN1
where we have the table TAB1. The next three blocks are a chunk of LUN2
where we have indexes IDX1, IDX2 and IDX3(for tables TAB1, TAB2 and TAB3).
|------------------+------------------+------------------+------------------+-----------------------------------------------|
| | | | | |
| 1.TAB1 | 1.TAB1 | 2.IDX1 | 2.IDX2 |
2.IDX3 |
| | | | | |
|------------------+------------------+------------------+------------------+-----------------------------------------------|

Ž Assume when Oracle is reading TAB1 using IDX1. Also assume there are
no
caches and all IO is physical.
Ž Oracle requests DBWR-1 for IDX1 and waits. DBWR-1 makes a Unix IO
call and waits for Unix to return data. Unix talks to SAN and SAN starts
reading from the disk. Assume that it takes 3 seconds to read the entire
IDX1. SAN starts returning data in chunks to Unix and Unix gives it back
to
Oracle.
Ž Oracle uses that data to request TAB1 using DBWR-2. Now DBWR-2 goes
to Unix which in turn goes to the SAN. SAN now goes to the disk but the
disk has only one head ? which can read or write. In this case that head
is
already busy reading IDX1.
Ž Now two things can happen
Ž Either the disk head starts running between TAB1 and IDX1 increasing
the latency time
Ž Or TAB1's read waits till IDX1's reads complete.
Ž Either way the parallel operation is now serialized leading to poor
response time.

How Oracle performs an IO and Why can Oracle suffer from an IO problem
while Unix/SAN statistics don't show any?

Ž DBWR performs nearly all of the IO for Oracle. LGWR, Arch etc do
some
IO but DBWR beats them all by sheer volume.
Ž There can be multiple DBWR processes. Lets assume that there are two
DBWR processes (or DBWR_IO_SLAVES if you prefer).
Ž Lets assume that it takes one second for each DBWR to perform a
single IO call. There are no waits on the Unix or IO Subsystem.
Ž Consider a routine scenario ? Oracle is doing a table read via a
index. Oracle sends DBWR1 to read the index block and it returns the data
after 1 second. Oracle then uses that data to ask DBWR2 read the data from
the table. In this case the process that requested the data had to wait 2
seconds before getting it ? a 2 sec IO wait.
Ž Now a slightly bigger picture. There are 6 processes trying to read
the data from six different tables. Since we have only 2 DBWRs they have
to
service all this IO. Each process would post the DBWR to do their read and
wait on "dbfile sequential read". However the DBWRs can service only one
request at a time. So while DBWRs 1 and 2 service Processes 1 and 2, the
Processes 3-7 wait for IO.
Ž This internal queue is never visible to the UNIX or to the SAN. So
the SAN/Unix never shows a IO queue but there is a IO issue that makes
processes wait.

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author:
INET: [EMAIL PROTECTED]

Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051
San Diego, California -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).

Re: SAN issues

Reply via email to