Hi, First off, let me say that this a wonderfully explained set of formulas and analysis.
Isn't this getting a little off the mark though? Last I checked, almost every database on the planet (yes even pervasive sql) when allocating pages/extents, left an amount of space unutilized at the end. In fact, if you do a "reorg" in SQL server, it specifically asks how much space you want to remain free in each page. Now why would you want that? So that when you add a row to a table with a clustered index (ie. A primary key, where the table is physically ordered the same as the index) the database does not have to add an extent at the end of the space to house the new row. This cuts down on logical fragmentation which is a far larger killer of databases than the fragmentation that these formulas show. By these formuls, every signle one of my SQL database is 25% fragmented (why, because every Sunday they do online reorgs to fix their logical fragmentation). Logical fragmentation turns large sequential reads into large random reads. These are principles from Oracle and SQL server and may not apply to the TSM database, but as a relational database, I don't know why they wouldn't. Oh, and I don't know of a utility that can give you the info that I was talking about for TSM, but the equivalent can be obtained in SQL by doing a (dbcc showcontig (tablename)). My $0.02 Michael Wheelock Integris Health -----Original Message----- From: Zlatko Krastev [mailto:[EMAIL PROTECTED] Sent: Monday, October 20, 2003 2:12 PM To: [EMAIL PROTECTED] Subject: Database fragmentation formula (was Re: Online DB Reorg) ***** ATTENTION ***** Those who do not like the mathematics, skip to the end (search for word "select"). ********************* >From my mathematical background this query is not showing very good results. The formula looks like ( MAX_REDUCTION_MB * 256 ) 100 - --------------------------------- (USABLE_PAGES - USED_PAGES) * 100 Thus closer we are to fully used database, less accurate the formula would be. Moreover if we fill the DB at 100% the result will be division by 0 while the database might be both fragmented and not fragmented. One of our goals is to fully utilize our resources. In that exact moment the query would be useless. Also the formula is of no use if there is no legend how to interpret the numbers. For example our test server DB is giving PERCENT_FRAG=26.92 while being nearly unfragmented. So I would dare to recommend another formula (in pages): used - needed fragm_p = --------------- x 100 used i.e. what space is wasted from all used (in %). The needed space (in pages) can be found if we multiply PCT_UTILIZED by USABLE_PAGES and divide by 100 to remove percentages: needed = PCT_UTILIZED x USABLE_PAGES / 100 while the used space (in pages) is readable from USED_PAGES column. Therefore my final formula would be (the lines may be split by mailers): USED_PAGES - PCT_UTILIZED x USABLE_PAGES / 100 fragm_p = ------------------------------------------------ x 100 USED_PAGES and the final query would be: select cast(100 * (USED_PAGES - PCT_UTILIZED * USABLE_PAGES / 100) - / USED_PAGES as decimal(9,5)) as "Unused page parts [pages]" from db Now the percentage shows the percentage of wasted space vs. used space. 0% would mean database is fully populated with no holes, 100% are impossible (as completely empty pages would not be counted, and 99+% mean each page is filled with something small just to allocate it. PART 2. Beyond how much space is wasted inside pages we would be also interested in how many empty pages we are losing due to partition-allocation scheme. Again the math first. Same formula (but now in MB): used - needed fragm_p = --------------- x 100 used Now needed space is derived from CAPACITY_MB field: needed = PCT_UTILIZED x CAPACITY_MB / 100 while actual usage is the size to which we can reduce the DB: used = CAPACITY_MB - MAX_REDUCTION_MB the resulting formula would look like (the lines may be split by mailers): (CAPACITY_MB - MAX_REDUCTION_MB) - PCT_UTILIZED x CAPACITY_MB / 100 fragm_p = --------------------------------------------------------------------- x 100 CAPACITY_MB - MAX_REDUCTION_MB Division by zero cannot happen as TSM server does not allow us to reduce the DB under one partition. Now the query for this percentage would be: select cast(((CAPACITY_MB - MAX_REDUCTION_MB) - - (PCT_UTILIZED * CAPACITY_MB / 100) ) - / (CAPACITY_MB - MAX_REDUCTION_MB) * 100 - as decimal(9,5)) as "Allocation waste [%]" from db And the final big-big query would look like: select cast(USED_PAGES - PCT_UTILIZED * USABLE_PAGES / 100 - as decimal (20,3)) as "Unused page parts [pages]", - cast(100 * (USED_PAGES - PCT_UTILIZED * USABLE_PAGES / 100) - / USED_PAGES as decimal(9,5)) as "Page fragmentation [%]", - cast( (CAPACITY_MB - MAX_REDUCTION_MB) - - (PCT_UTILIZED * CAPACITY_MB / 100) as decimal (10,2)) - as "Overallocated space [MB]", - cast(((CAPACITY_MB - MAX_REDUCTION_MB) - - (PCT_UTILIZED * CAPACITY_MB / 100) ) - / (CAPACITY_MB - MAX_REDUCTION_MB) * 100 - as decimal(9,5)) as "Allocation waste [%]" from db If someone already invented these formulae, I would congratulate him/her. Even if I am the first dared to do this hard work, there is no Nobel price for mathematics :-)) Zlatko Krastev IT Consultant Remco Post <[EMAIL PROTECTED]> Sent by: "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]> 20.10.2003 13:30 Please respond to "ADSM: Dist Stor Manager" To: [EMAIL PROTECTED] cc: Subject: Re: Online DB Reorg On Sat, 18 Oct 2003 14:35:09 -0400 "Talafous, John G." <[EMAIL PROTECTED]> wrote: > Remco, > Would you be willing to share your SQL query that reports on DB > fragmentation? > I was allready looking at Eric (he probably saved my thingy somewhere usefull, I just saved it in my sent-mail folder), here it is... select cast((100 - ( cast(MAX_REDUCTION_MB as float) * 256 ) / - (cast(USABLE_PAGES as float) - cast(USED_PAGES as float) ) * 100) as - decimal(4,2)) as percent_frag from db Note that I still think this is one of the more useless queries I've ever build... > Thanks to all, > John G. Talafous IS Technical Principal > The Timken Company Global Software Support > P.O. Box 6927 Data Management > 1835 Dueber Ave. S.W. Phone: (330)-471-3390 > Canton, Ohio USA 44706-0927 Fax : (330)-471-4034 > [EMAIL PROTECTED] http://www.timken.com -- Met vriendelijke groeten, Remco Post SARA - Reken- en Netwerkdiensten http://www.sara.nl High Performance Computing Tel. +31 20 592 8008 Fax. +31 20 668 3167 "I really didn't foresee the Internet. But then, neither did the computer industry. Not that that tells us very much of course - the computer industry didn't even foresee that the century was going to end." -- Douglas Adams This e-mail may contain identifiable health information that is subject to protection under state and federal law. This information is intended to be for the use of the individual named above. If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of the contents of this information is prohibited and may be punishable by law. If you have received this electronic transmission in error, please notify us immediately by electronic mail (reply).
