Hi Gaja,

Once again, I have not tested this, but I have some questions about your comments on "physically contiguous" and "Keep DB_BLOCK_SIZE = FS(or OS) Block Size".
 

"physically contiguous":

We know that disk sectors are read and then transferred to the bus. Transferring a sector to the bus takes time, and that delay falls after the current sector has been read but before the next one is read. Because the disk keeps rotating during the transfer, some or all of the next physically contiguous sector may pass under the head and be missed. If logically contiguous sectors really were physically contiguous, the OS would have to wait for the next rotation of the disk to read the whole of the next sector. So the OS does not place logically contiguous sectors physically next to each other. Instead, they are spread over the disk according to the rotation speed and the transfer speed to the bus. An optimized disk management system arranges for the next logical sector to be found immediately after the current sector has been transferred to the bus. This may be done by leaving gaps between logically contiguous sectors, and these gaps may be used for other data. Of course, implementations may differ, but there will always be a delay to the bus, and so the physically next sector(s) will always be missed.
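A minimal sketch of the gap idea above, assuming a classic fixed interleave factor. The function name `interleaved_layout` and its parameters are made up for illustration; real drives and controllers may lay out sectors differently.

```python
def interleaved_layout(sectors_per_track, interleave):
    """Map physical slots on one track to logical sector numbers.

    Index = physical slot in rotation order, value = logical sector.
    interleave=1 means logically contiguous sectors are also physically
    contiguous; interleave=2 leaves one slot between them, and so on.
    (Hypothetical helper, not any real OS or drive API.)
    """
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(sectors_per_track):
        # If the target slot is already used, take the next free one.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        # Leave (interleave - 1) slots as a gap before the next logical sector.
        slot = (slot + interleave) % sectors_per_track
    return layout


print(interleaved_layout(8, 1))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(interleaved_layout(8, 2))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

With an interleave of 2, the slot skipped while logical sector 0 is being transferred holds logical sector 4, so the gap is still used for other data.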

I have not tested this, but if Oracle's sequential data were stored as physically contiguous, it would be a real problem for the I/O subsystem. I guess it is only logically contiguous.

"Keep DB_BLOCK_SIZE = FS(or OS) Block Size":

As I remember (???), Oracle passes byte counts as parameters in its I/O system calls. Let's say we created a database with DB_BLOCK_SIZE = FS/OS block size. Is it guaranteed that each new Oracle block will be written to a new OS block? Every file is identified by a file handle at the OS level, and there should also be a specific value in a register which points to the last offset of the file. I mean, the next insert may be appended to the current OS block if it has free space, and new block(s) may be allocated for the remainder. Here is an example:

- OS block size = db block size = 2k
- 1K of the last OS block is free and we would like to insert 4K.

1K is appended to the last OS block, a new OS block is allocated for the next 2K, and another new OS block is allocated for the remaining 1K. In this example, the 4K is scattered over 3 blocks, not 2.
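The arithmetic of this example can be checked with a small sketch. It only counts which fixed-size blocks a byte range overlaps; the helper name `blocks_touched` and the block numbers are hypothetical, and it says nothing about how Oracle or any file system actually allocates space.

```python
K = 1024


def blocks_touched(offset, length, block_size):
    """Return the indexes of the fixed-size blocks overlapped by a write
    of `length` bytes starting at byte `offset` (hypothetical helper)."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


block_size = 2 * K

# Unaligned case from the example: the file ends 1K into block 5,
# so a 4K write starts at offset 5 * 2K + 1K.
print(blocks_touched(5 * block_size + 1 * K, 4 * K, block_size))  # [5, 6, 7]

# Aligned case: the same 4K write starting on a block boundary.
print(blocks_touched(6 * block_size, 4 * K, block_size))  # [6, 7]
```

The extra block appears only when the write does not start on a block boundary.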

I think this will not be a problem for Oracle, because Oracle uses its own block format. The check between the block header and tail will protect data that is scattered across physically different blocks against corruption.
 

I have not tested any of this, so I may be wrong. I look forward to hearing a confirmation.
 

regards....
 
 
 
 

Gaja Krishna Vaidyanatha wrote:

Hi Bill & list,

The main function of a "read ahead algorithm" is to
anticipate the nature of I/O requests on a given track
of a disk's platter and see whether it is beneficial
to "pre-fetch" some of the blocks, so that subsequent
requests can be serviced from either the
controller's or the file system's cache, without
having to "go to disk" multiple times.

The OS (or the sub-system) should normally return only
the same number of blocks as requested. But, if there
are multiple "read requests" from Oracle that are
physically contiguous on disk and they also occur in a
rapid succession, the OS or the I/O sub-system (as the
case may be), "second guesses" the requestor's intent
and assumes that more of the other blocks in the same
track will also be requested, in the near future.

For a real "sequential scan", like in a full-table
scan or an index fast-full scan, this is beneficial.
But in the case of a range scan where only "a few
contiguous blocks" are requested, pre-fetching 128K or
256K worth of data is a wasteful use of a system's I/O
resources. This is because not all the blocks that are
pre-fetched will be consumed.
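A toy model of the behaviour described above, for illustration only. The class `ToyReadAhead`, its `trigger` and `window` parameters, and the detection rule are all assumptions; real operating systems and controllers use their own heuristics, and this sketch ignores caching entirely.

```python
class ToyReadAhead:
    """Toy sequential-access detector (hypothetical, not a real OS API)."""

    def __init__(self, trigger=2, window=16):
        self.trigger = trigger      # contiguous follow-up requests needed to engage
        self.window = window        # blocks fetched once read-ahead engages
        self.next_expected = None   # block that would continue the current run
        self.streak = 0             # contiguous follow-up requests seen so far

    def read(self, start, count):
        """Return how many blocks are fetched from disk for this request."""
        if start == self.next_expected:
            self.streak += 1
        else:
            self.streak = 0
        self.next_expected = start + count
        if self.streak >= self.trigger:
            return max(count, self.window)
        return count


ra = ToyReadAhead(trigger=2, window=16)
print([ra.read(b, 1) for b in (100, 101, 102)])  # [1, 1, 16]

ra = ToyReadAhead(trigger=2, window=16)
print([ra.read(b, 1) for b in (100, 500, 37)])   # [1, 1, 1]
```

In the first run, three single-block requests to contiguous blocks are enough to fetch 16 blocks, even if the caller never asks for the other 15.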

The issue of an 8K DB_BLOCK_SIZE with say a 512-byte
File System (or OS) Block Size, is that there is a
1-is-to-16 ratio between logical and physical blocks.
So, for example if 4 Oracle blocks are requested, they
translate into 64 FS (or OS) blocks. If these blocks
are contiguous (and chances are good that leaf blocks
in an index can be contiguous), it becomes an "ideal
condition" for the read-ahead algorithm to engage. So
instead of servicing 32K of data, the sub-system
retrieves 128K or 256K worth of data.
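The numbers in this paragraph can be reproduced with a short calculation. This is only the arithmetic; the function name `prefetch_waste` is hypothetical, and it assumes that only the requested blocks are ever consumed.

```python
KB = 1024


def prefetch_waste(db_block_size, fs_block_size, db_blocks_requested, prefetch_bytes):
    """Return (FS blocks per DB block, FS blocks requested, bytes requested,
    bytes fetched but never used), assuming only the requested data is consumed."""
    ratio = db_block_size // fs_block_size
    fs_blocks = db_blocks_requested * ratio
    requested_bytes = db_blocks_requested * db_block_size
    unused_bytes = max(prefetch_bytes - requested_bytes, 0)
    return ratio, fs_blocks, requested_bytes, unused_bytes


for prefetch in (128 * KB, 256 * KB):
    ratio, fs_blocks, requested, unused = prefetch_waste(8 * KB, 512, 4, prefetch)
    print(
        f"ratio 1:{ratio}, {fs_blocks} FS blocks, "
        f"{requested // KB}K requested, {unused // KB}K of {prefetch // KB}K unused"
    )
# ratio 1:16, 64 FS blocks, 32K requested, 96K of 128K unused
# ratio 1:16, 64 FS blocks, 32K requested, 224K of 256K unused
```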

And, even if you have a 1-is-to-2 ratio between
logical and physical blocks (DB_BLOCK_SIZE is 8K and
FS Block Size is 4K), under the "right conditions",
the read-ahead algorithm will engage and pre-fetch in
a wasteful manner. So the bottom line is as follows:

Keep DB_BLOCK_SIZE = FS(or OS) Block Size

This way, if Oracle requests a few blocks in a
track, the OS does not pre-fetch all of the blocks in
the track. As mentioned before, in case of a "real
sequential scan", the pre-fetch stands you in good stead.

Hope that helps,

Gaja

--- Bill Buchan <[EMAIL PROTECTED]> wrote:
>
> Sorry, I'm a bit non-clued up on this "read ahead algorithm".  Could I be a
> pain and ask for more details?  Does the OS return one OS block if exactly
> one is requested, but if 2 are requested it thinks "aha! sequential scan"
> and goes and gets 4 or 8 or something?
>
> The follow on is, does this mean you should use a (minimal) 2k block size
> on UFS, 512 bytes blocks, or is this read-ahead overhead a smaller
> performance hit than that of using a database block size which is too small
> for the application?
>
> Thanks
> - Bill.
>
>
> At 08:48 26/04/02 -0800, you wrote:
> >All,
> >
> >You always want to ensure that your DB_BLOCK_SIZE =
> >File System Block Size. This is to avoid wasted I/O
> >and also the case where the "read ahead algorithm" is
> >triggered accidentally, when 1 Database Block results
> >in multiple file system blocks being read from disk.
> >
> >If your application performs range scans, there is a
> >high possibility that multiple "single database block"
> >read requests to a set of contiguous blocks, may
> >result in the "read ahead algorithm" performing 128K
> >or 256K pre-fetches, even though your application may
> >have not required all 128K or 256K.
> >
> >This problem is rampant on ufs file systems where the
> >default block size is 512 bytes, and with a 8K
> >DB_BLOCK_SIZE, it takes 16 file system blocks to store
> >1 DB block on disk. However, even if you have advanced
> >file systems and have a 1-is-to-2 ratio of DB block
> >is-to FS blocks, you are still in danger of
> >overloading your I/O sub-system, "under the right
> >conditions".
>
> --
> Please see the official ORACLE-L FAQ:
> http://www.orafaq.com
> --
> Author: Bill Buchan
>   INET: [EMAIL PROTECTED]
>
> Fat City Network Services    -- (858) 538-5051  FAX:
> (858) 538-5051
> San Diego, California        -- Public Internet
> access / Mailing Lists
>
--------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an
> E-Mail message
> to: [EMAIL PROTECTED] (note EXACT spelling of
> 'ListGuru') and in
> the message BODY, include a line containing: UNSUB
> ORACLE-L
> (or the name of mailing list you want to be removed
> from).  You may
> also send the HELP command for other information
> (like subscribing).

=====
Gaja Krishna Vaidyanatha
Director, Storage Management Products,
Quest Software, Inc.
Co-author - Oracle Performance Tuning 101
http://www.osborne.com/database_erp/0072131454/0072131454.shtml

Author: Gaja Krishna Vaidyanatha
  INET: [EMAIL PROTECTED]


--
Danisment Gazi Unal
http://www.ubTools.com
 
