Hi Bill & list,

The main function of a "read ahead algorithm" is to
anticipate the nature of I/O requests on a given track
of a disk's platter and decide whether it is beneficial
to "pre-fetch" some of the blocks, so that subsequent
requests can be serviced from either the controller's
or the file system's cache, without having to "go to
disk" multiple times.

The OS (or the sub-system) should normally return only
the number of blocks requested. But if there are
multiple "read requests" from Oracle that are
physically contiguous on disk and they occur in
rapid succession, the OS or the I/O sub-system (as the
case may be) "second guesses" the requestor's intent
and assumes that more of the other blocks in the same
track will also be requested in the near future.

For a real "sequential scan", as in a full-table
scan or an index fast-full scan, this is beneficial.
But in the case of a range scan where only "a few
contiguous blocks" are requested, pre-fetching 128K or
256K worth of data is a wasteful use of a system's I/O
resources, because not all of the pre-fetched blocks
will be consumed.
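
To make the difference between the two access patterns concrete, here is a
toy model of a read-ahead trigger. It is only a sketch: the trigger threshold
(2 contiguous requests) and the pre-fetch window (16 blocks) are made-up
values for illustration, and real OS or controller algorithms are more
involved than this.

```python
def simulate(requests, trigger=2, prefetch_blocks=16):
    """Toy read-ahead model (hypothetical parameters, not a real OS policy).

    requests: block numbers in the order they are requested.
    Returns (blocks_read_from_disk, distinct_blocks_requested).
    """
    cache = set()
    disk_reads = 0
    run = 0          # length of the current run of contiguous requests
    prev = None
    for blk in requests:
        run = run + 1 if prev is not None and blk == prev + 1 else 1
        prev = blk
        if blk not in cache:
            cache.add(blk)
            disk_reads += 1
        if run >= trigger:
            # The sub-system "second guesses" a sequential scan and
            # pre-fetches the blocks that follow the current one.
            for nxt in range(blk + 1, blk + 1 + prefetch_blocks):
                if nxt not in cache:
                    cache.add(nxt)
                    disk_reads += 1
    return disk_reads, len(set(requests))


range_scan = simulate([100, 101, 102, 103])   # a few contiguous blocks
full_scan = simulate(list(range(64)))         # a real sequential scan
print("range scan:", range_scan)
print("full scan: ", full_scan)
```

In this model the 4-block range scan causes 20 blocks to be read (16 of them
never used), while the 64-block scan reads 80 blocks, so the overshoot is a
much smaller fraction of the work.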

The issue with an 8K DB_BLOCK_SIZE and, say, a 512-byte
File System (or OS) Block Size is that there is a
1-is-to-16 ratio between logical and physical blocks.
So, for example, if 4 Oracle blocks are requested, they
translate into 64 FS (or OS) blocks. If these blocks
are contiguous (and chances are good that leaf blocks
in an index are contiguous), it becomes an "ideal
condition" for the read-ahead algorithm to engage. So
instead of reading just the 32K of data requested, the
sub-system retrieves 128K or 256K worth of data.
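
The arithmetic above can be checked with a few lines of Python. The helper
names are mine, purely for illustration; the numbers are the ones from the
example (8K DB block, 512-byte FS block, 4 blocks requested, 128K or 256K
pre-fetch).

```python
KB = 1024

def fs_blocks_per_db_block(db_block_size, fs_block_size):
    """How many FS (or OS) blocks make up one Oracle block."""
    return db_block_size // fs_block_size

def read_ahead_waste(db_blocks_requested, db_block_size, prefetch_bytes):
    """Bytes pre-fetched beyond what was actually requested."""
    requested = db_blocks_requested * db_block_size
    return max(prefetch_bytes - requested, 0)

ratio = fs_blocks_per_db_block(8 * KB, 512)          # 16
fs_blocks = 4 * ratio                                # 64
requested_kb = 4 * 8                                 # 32K
waste_128 = read_ahead_waste(4, 8 * KB, 128 * KB)    # 96K unused
waste_256 = read_ahead_waste(4, 8 * KB, 256 * KB)    # 224K unused

print(ratio, fs_blocks, requested_kb, waste_128 // KB, waste_256 // KB)
```

So in the 128K case three quarters of what is read is never consumed, and in
the 256K case it is seven eighths.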

And, even if you have a 1-is-to-2 ratio between
logical and physical blocks (DB_BLOCK_SIZE is 8K and
FS Block Size is 4K), under the "right conditions",
the read-ahead algorithm will engage and pre-fetch in
a wasteful manner. So the bottom line is as follows:

Keep DB_BLOCK_SIZE = FS (or OS) Block Size

This way, if Oracle requests a few blocks in a
track, the OS does not pre-fetch all of the blocks in
the track. As mentioned before, in the case of a "real
sequential scan", the pre-fetch stands you in good stead.
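
If you want a quick way to compare the two sizes on a Unix box, something
like the following may help. It is a minimal sketch: it uses Python's
os.statvfs, and which of f_frsize or f_bsize corresponds to what I have been
calling the "FS Block Size" varies by platform and file system, so treat the
result as a hint and confirm it with your file system's own tools. The path
and the 8K value are placeholders; take DB_BLOCK_SIZE from your own instance.

```python
import os

def fs_block_size(path):
    """Block size reported by the file system holding `path` (Unix only).

    Prefers f_frsize (fundamental block size) and falls back to f_bsize.
    Which field matters for read-ahead is platform-dependent (assumption).
    """
    st = os.statvfs(path)
    return st.f_frsize or st.f_bsize

def blocks_match(db_block_size, path):
    """True if DB_BLOCK_SIZE equals the reported FS block size."""
    return db_block_size == fs_block_size(path)

datafile_dir = "/"          # placeholder: use the mount point of your datafiles
db_block_size = 8 * 1024    # placeholder: your instance's DB_BLOCK_SIZE

print("FS block size:", fs_block_size(datafile_dir))
print("matches DB_BLOCK_SIZE:", blocks_match(db_block_size, datafile_dir))
```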

Hope that helps,

Gaja

--- Bill Buchan <[EMAIL PROTECTED]> wrote:
> 
> Sorry, I'm a bit non-clued up on this "read ahead algorithm".  Could I be a
> pain and ask for more details?  Does the OS return one OS block if exactly
> one is requested, but if 2 are requested it thinks "aha! sequential scan"
> and goes and gets 4 or 8 or something?
>
> The follow on is, does this mean you should use a (minimal) 2k block size
> on UFS, 512 bytes blocks, or is this read-ahead overhead a smaller
> performance hit than that of using a database block size which is too small
> for the application?
> 
> Thanks
> - Bill.
> 
> 
> At 08:48 26/04/02 -0800, you wrote:
> >All,
> >
> >You always want to ensure that your DB_BLOCK_SIZE =
> >File System Block Size. This is to avoid wasted I/O
> >and also the case where the "read ahead algorithm" is
> >triggered accidentally, when 1 Database Block results
> >in multiple file system blocks being read from disk.
> >
> >If your application performs range scans, there is a
> >high possibility that multiple "single database block"
> >read requests to a set of contiguous blocks, may
> >result in the "read ahead algorithm" performing 128K
> >or 256K pre-fetches, even though your application may
> >have not required all 128K or 256K.
> >
> >This problem is rampant on ufs file systems where the
> >default block size is 512 bytes, and with a 8K
> >DB_BLOCK_SIZE, it takes 16 file system blocks to store
> >1 DB block on disk. However, even if you have advanced
> >file systems and have a 1-is-to-2 ratio of DB block
> >is-to FS blocks, you are still in danger of
> >overloading your I/O sub-system, "under the right
> >conditions".
> 
> -- 
> Please see the official ORACLE-L FAQ:
> http://www.orafaq.com
> -- 
> Author: Bill Buchan
>   INET: [EMAIL PROTECTED]
> 
> Fat City Network Services    -- (858) 538-5051  FAX:
> (858) 538-5051
> San Diego, California        -- Public Internet
> access / Mailing Lists
>
--------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an
> E-Mail message
> to: [EMAIL PROTECTED] (note EXACT spelling of
> 'ListGuru') and in
> the message BODY, include a line containing: UNSUB
> ORACLE-L
> (or the name of mailing list you want to be removed
> from).  You may
> also send the HELP command for other information
> (like subscribing).


=====
Gaja Krishna Vaidyanatha
Director, Storage Management Products,
Quest Software, Inc.
Co-author - Oracle Performance Tuning 101
http://www.osborne.com/database_erp/0072131454/0072131454.shtml

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Gaja Krishna Vaidyanatha
  INET: [EMAIL PROTECTED]
