RE: Tricky SQL Question -- Solved
Title: RE: Tricky SQL Question -- Solved Jonathan, Thanks for the tips ... let me see how I incorporate this ... things to do ... 1. write a *clever* routine to look at sys.mon_mods$ with dbms_stats.flush_database_monitoring_info to decide which tables to analyze in the next session. 2. Split the groups automatically between two instances ... if once instance is down, other will pick up all streams .. *this is easy to do in pl/sql). This allows me to have only two scripts to setup analysis on my 10 production and 24 other instances without a major maintenance. Now I am able to *predict* how much time it takes ... based on last analysis ... here is a sample input based on yesterday's data ... [EMAIL PROTECTED] . oraenv ORACLE_SID = [CSI2] ? ABC [EMAIL PROTECTED] sys SQL*Plus: Release 9.2.0.2.0 - Production on Fri Mar 7 08:36:56 2003 Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected. SQL set serveroutput on SQL exec system.dbpk_statistics.refresh_rank; Ranking based on analyze time is complete. Select from view V_ANALYSIS_INFO for an *estimate*. PL/SQL procedure successfully completed. SQL set line 200 SQL select * from system.v_analysis_info; Analysis Information - Group 01 includes 2160 tables, analysis should take approx 00134.10 seconds. Group 02 includes 413 tables, analysis should take approx 00134.03 seconds. Group 03 includes 33 tables, analysis should take approx 00130.91 seconds. Group 04 includes 11 tables, analysis should take approx 00128.63 seconds. Group 05 includes 4 tables, analysis should take approx 00113.87 seconds. Group 06 includes 4 tables, analysis should take approx 00152.77 seconds. Group 07 includes 2 tables, analysis should take approx 00098.63 seconds. Group 08 includes 2 tables, analysis should take approx 00180.71 seconds. 8 rows selected. I have configured it to make 8 parallel streams .. Here are some things that I am doing ... 1. The information is stored in a index organized table 2. As soon as the script loads, it loads (bulk collect) the lost of tables belonging to the group specified into an array. 3. Start executing dbms_stats on the tables in the array based on their parameters, capture elapsed times for analysis 4. If any analysis errors out, it also captures error message 5. Update the index organized table with a. last analyzed timestamp b. time it took to analyze the table c. error message if any d. uses dbms_stats to get latest rowcount 6. Exits The package has procedures to performs the set-up (tables/view/procedure/package creation). It also does two types of ranks, first time when analysis times are not available, it groups them by row count. After first analysis, it re-ranks them based on analysis time which is more accurate than row count. Thanks once again for all the ideas ... Raj - Rajendra dot Jamadagni at espn dot com Any views expressed here are strictly personal. QOTD: Any clod can have facts, having an opinion is an art !! *This e-mail message is confidential, intended only for the named recipient(s) above and may contain information that is privileged, attorney work product or exempt from disclosure under applicable law. If you have received this message in error, or are not the named recipient(s), please immediately notify corporate MIS at (860) 766-2000 and delete this e-mail message from your computer, Thank you.*1
RE: Tricky SQL Question -- Solved
Title: RE: Tricky SQL Question -- Solved Okay, I cracked it ... if you are interested, read on ... it is not very optimal, but close to what I want. To me 8 streams is standard, so you'd see 8 as hardcoded. Also I found that select sum(obj_last_analyze_time)/8 from statistics_info / was about 8425 (i.e. ~ 85 seconds). So I wrote this not-so-dynamic sql select group_id, sum(tm1), count(*) from( SELECT obj_owner, obj_name, tm1, case when roll_sum = 8400*1 then 1 else case when roll_sum = 8400*2 then 2 else case when roll_sum = 8400*3 then 3 else case when roll_sum = 8400*4 then 4 else case when roll_sum = 8400*5 then 5 else case when roll_sum = 8400*6 then 6 else case when roll_sum = 8400*7 then 7 else 8 end end end end end end end group_id FROM (SELECT rnum, obj_owner, obj_name, tm1, SUM (tm1) OVER (ORDER BY rnum RANGE UNBOUNDED PRECEDING) roll_sum FROM (SELECT ROWNUM rnum, obj_owner, obj_name, tm1 FROM (SELECT obj_owner ,obj_name,obj_last_analyze_time tm1 FROM statistics_info ORDER BY obj_last_analyze_time))) ) group by group_id / The output is as follows ... GROUP_ID TOT_TIME TOT_TABLES -- 1 8397 1755 2 8387 667 3 8204 135 4 7984 20 5 8954 7 6 6928 3 7 7113 2 8 11438 1 I'll probably make it dynamic enough ... inside my package ... Cheers Raj - Rajendra dot Jamadagni at espn dot com Any views expressed here are strictly personal. QOTD: Any clod can have facts, having an opinion is an art !! -Original Message- From: Jamadagni, Rajendra Sent: Thursday, March 06, 2003 1:16 PM To: '[EMAIL PROTECTED]' Subject: RE: Tricky SQL Question - Original Message - To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED] Sent: 06 March 2003 16:49 Hi all, I have a tricky situation ... I have a table columns are owner varchar2(), name varchar2(), ana_tm number ana_tm represents how much time it took to perform statistics collection for owner.name value. the number ranges from 0 to about 12000 right now, and is subject to change. and say sum(ana_tm) over the table is say X. What I'd like to have is split this data into say N groups (Let's say 8), so that each group ends up having a sum(ana_tm) ~ X/N (i.e. X/8 in this example). What I need is a way in SQL to splice the table list in eight groups so that when I run a parallel 8 stream analyze, they all roughly take same amount of time. I tried width_bucket() and it doesn't give me things that I need. It assumes a linear distribution, which I do not have. Is this possible to do in SQL only? Thanks in advance, yes, you can go crazy with syntax, it is 9202. Raj *This e-mail message is confidential, intended only for the named recipient(s) above and may contain information that is privileged, attorney work product or exempt from disclosure under applicable law. If you have received this message in error, or are not the named recipient(s), please immediately notify corporate MIS at (860) 766-2000 and delete this e-mail message from your computer, Thank you.*1
Re: Tricky SQL Question -- Solved
Very clever ! Can I make a couple of suggestions: You've got a very large number of tables in one group - and the startup time for the analyze might have a big impact on this group - so how about adding in (say) one second to the analyze type in order to cater for startup. Also - how about taking out any tables which individually take up more than the sum(all_times)/count(streams) before running the query on the rest. You might try randomising the ordering for the rest of the tables instead of ordering them by analyze time (since you have a large number and a lot use very small times) - I suspect this would help to flatten out the peaks in the timing, and make the number of tables per stream much more even - so reducing the effect of startup times. I have a very simple-minded (sub-optimal) procedural solution, but I'm trying to work out a way of expressing it non-procedurally. If I succeed I'll let you know. Regards Jonathan Lewis http://www.jlcomp.demon.co.uk Coming soon one-day tutorials: Cost Based Optimisation Trouble-shooting and Tuning Indexing Strategies (see http://www.jlcomp.demon.co.uk/tutorial.html ) UK___March 19th UK___April 8th UK___April 22nd USA_(FL)_May 2nd Next Seminar dates: (see http://www.jlcomp.demon.co.uk/seminar.html ) USA_(CA, TX)_August The Co-operative Oracle Users' FAQ http://www.jlcomp.demon.co.uk/faq/ind_faq.html - Original Message - To: Multiple recipients of list ORACLE-L [EMAIL PROTECTED] Sent: 06 March 2003 19:38 Okay, I cracked it ... if you are interested, read on ... it is not very optimal, but close to what I want. To me 8 streams is standard, so you'd see 8 as hardcoded. Also I found that select sum(obj_last_analyze_time)/8 from statistics_info / was about 8425 (i.e. ~ 85 seconds). So I wrote this not-so-dynamic sql select group_id, sum(tm1), count(*) from( SELECT obj_owner, obj_name, tm1, case when roll_sum = 8400*1 then 1 else case when roll_sum = 8400*2 then 2 else case when roll_sum = 8400*3 then 3 else case when roll_sum = 8400*4 then 4 else case when roll_sum = 8400*5 then 5 else case when roll_sum = 8400*6 then 6 else case when roll_sum = 8400*7 then 7 else 8 end end end end end end end group_id FROM (SELECT rnum, obj_owner, obj_name, tm1, SUM (tm1) OVER (ORDER BY rnum RANGE UNBOUNDED PRECEDING) roll_sum FROM (SELECT ROWNUM rnum, obj_owner, obj_name, tm1 FROM (SELECT obj_owner ,obj_name,obj_last_analyze_time tm1 FROM statistics_info ORDER BY obj_last_analyze_time))) ) group by group_id / The output is as follows ... GROUP_ID TOT_TIME TOT_TABLES -- 1 8397 1755 2 8387 667 3 8204 135 4 7984 20 5 8954 7 6 6928 3 7 7113 2 8 11438 1 I'll probably make it dynamic enough ... inside my package ... Cheers Raj - Rajendra dot Jamadagni at espn dot com Any views expressed here are strictly personal. QOTD: Any clod can have facts, having an opinion is an art !! -- Please see the official ORACLE-L FAQ: http://www.orafaq.net -- Author: Jonathan Lewis INET: [EMAIL PROTECTED] Fat City Network Services-- 858-538-5051 http://www.fatcity.com San Diego, California-- Mailing list and web hosting services - To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).