Thanks Jonathan.

The lack of decent docs and my inexperience with analytics led
me to the (partition by 1).  Someone mentioned that Tom Kyte's book
has a good chapter on them, so I'll go look that up.

You're right, the group by is unnecessary in that query; it's an artifact
of an earlier incarnation of the query.  Removing it seemed to make
little difference in performance.

Larry Elkins pointed out that the SQL could be greatly simplified via
the ratio_to_report() function. It appears below.

The standard SQL approach is more appropriate for this, I agree.
I'm just trying to use the newer functionality as much as possible so
as to become familiar with it.

select
   low_optimal_size/1024 low_optimal_size_kb
   , (high_optimal_size+1)/1024 high_optimal_size_kb
   , optimal_executions
   , onepass_executions
   , multipasses_executions
   , total_executions
   , optimal_executions / total_executions * 100 pct_optimal_executions
   , onepass_executions / total_executions * 100 pct_onepass_executions
   , multipasses_executions / total_executions * 100 pct_multipasses_executions
   , ratio_to_report(optimal_executions) over ( ) * 100 pct_total_optimal_executions
from v$sql_workarea_histogram
where total_executions != 0
order by low_optimal_size_kb
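For anyone who wants to see the arithmetic behind ratio_to_report() without an Oracle instance handy: ratio_to_report(x) over () is shorthand for x / sum(x) over (). Here is a small sketch emulating it in Python's sqlite3 (SQLite 3.25+ has standard window functions but no ratio_to_report); the table and bucket values are made up for illustration, not taken from v$sql_workarea_histogram.

```python
import sqlite3

# ratio_to_report(x) OVER () is Oracle shorthand for x / SUM(x) OVER ().
# SQLite lacks ratio_to_report, so emulate it with a window SUM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workarea (bucket TEXT, optimal_executions INTEGER)")
conn.executemany(
    "INSERT INTO workarea VALUES (?, ?)",
    [("8KB", 600), ("16KB", 300), ("32KB", 100)],  # invented sample data
)

rows = conn.execute("""
    SELECT bucket,
           optimal_executions * 100.0
             / SUM(optimal_executions) OVER () AS pct_total_optimal
    FROM workarea
    ORDER BY bucket
""").fetchall()
```

The window SUM scans the result set once, so the percentage comes out of a single pass instead of a self-join back to the table.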

Here is the latest run_stats output from it - quite an improvement with ratio_to_report() (RUN1).

NAME                                           RUN1       RUN2       DIFF
---------------------------------------- ---------- ---------- ----------
LATCH.Consistent RBA                              1          0         -1
LATCH.lgwr LWN SCN                                1          0         -1
LATCH.mostly latch-free SCN                       1          0         -1
LATCH.redo allocation                            19         18         -1
STAT...calls to kcmgcs                            7          6         -1
STAT...consistent gets                            4          5          1
STAT...cursor authentications                     0          1          1
STAT...deferred (CURRENT) block cleanout          4          3         -1
 applications

STAT...redo entries                              17         18          1
STAT...enqueue requests                           1          0         -1
STAT...db block gets                             28         29          1
STAT...consistent gets - examination              0          1          1
STAT...cleanout - number of ktugct calls          0          1          1
STAT...calls to get snapshot scn: kcmgss       1006       1005         -1
STAT...active txn count during cleanout           0          1          1
LATCH.undo global data                            1          0         -1
LATCH.library cache pin                        2012       2010         -2
STAT...session logical reads                     32         34          2
LATCH.redo writing                                2          0         -2
LATCH.cache buffers chains                      102        105          3
STAT...recursive cpu usage                       32         29         -3
LATCH.library cache pin allocation                4          8          4
LATCH.messages                                    6          0         -6
LATCH.shared pool                              1001       1008          7
LATCH.library cache                            2014       2022          8
STAT...redo size                              27084      27496        412
LATCH.SQL memory manager workarea list l          0       2000       2000
atch


27 rows selected.


Jared




"Jonathan Lewis" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]

 12/05/2003 01:14 AM
 Please respond to ORACLE-L

       
        To:        Multiple recipients of list ORACLE-L <[EMAIL PROTECTED]>
        cc:
        Subject:        Re: Expense of 'over ... partition by'

Jared,

I think what you've discovered is just a repeat
of the fact that different functionality is appropriate
in different circumstances.

Imagine replacing your v$sql_workarea_histogram
with a chunky SQL statement that crunched through
a massive table producing a small result set.

In those circumstances, your analytic approach would
sort a small set twice having done one big crunch. With
the group by approach, you would have to crunch the big
data set twice.  I know which option would be cheaper.
(You then have to wonder whether you could produce
the small result set using subquery factoring 'with subquery'
as another possible optimisation strategy).
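The subquery-factoring idea is easy to sketch: materialise the expensive result set once in a WITH clause, then take both the detail rows and the total from that, rather than crunching the base data twice. A toy version in Python's sqlite3 (SQLite supports WITH); the hist table and its columns are invented stand-ins for the expensive query, not real Oracle objects.

```python
import sqlite3

# Factor the "big crunch" into a WITH clause (h), then read both the
# detail and the grand total from h instead of hitting the base data twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hist (low_kb INTEGER, optimal_executions INTEGER)")
conn.executemany(
    "INSERT INTO hist VALUES (?, ?)",
    [(2, 500), (64, 400), (1024, 100)],  # invented sample data
)

rows = conn.execute("""
    WITH h AS (
        SELECT low_kb, optimal_executions
        FROM hist
        WHERE optimal_executions != 0
    )
    SELECT low_kb,
           optimal_executions * 100.0
             / (SELECT SUM(optimal_executions) FROM h) AS pct
    FROM h
    ORDER BY low_kb
""").fetchall()
```

Whether the optimizer actually materialises h once (versus inlining it) is engine- and version-dependent, which is exactly the kind of trade-off being discussed.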

BTW - did you notice how Oracle didn't do a sort for
the order by in the GROUP BY example, because
the optimizer could infer that the data had already been
ordered by the GROUP BY ?  That's the reason why
your GROUP BY example did less sorting.  (I'm not
sure you need the GROUP BY, though I may be missing
something).

BTW-2:  in the analytic clause, the (partition by 1) is not
necessary, you can write:
>                 , sum(optimal_executions) over ( )
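For what it's worth, the equivalence is easy to check on any engine with standard window functions: an empty over () and over (partition by 1) both define a single partition covering the whole result set. A quick sketch in Python's sqlite3 (SQLite 3.25+), using a made-up table t:

```python
import sqlite3

# OVER () and OVER (PARTITION BY 1) both put every row in one partition,
# so the window sums agree on every row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

rows = conn.execute("""
    SELECT SUM(x) OVER ()               AS whole_set,
           SUM(x) OVER (PARTITION BY 1) AS partition_by_1
    FROM t
""").fetchall()
```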


Regards

Jonathan Lewis
http://www.jlcomp.demon.co.uk

 The educated person is not the person
 who can answer the questions, but the
 person who can question the answers -- T. Schick Jr


One-day tutorials:
http://www.jlcomp.demon.co.uk/tutorial.html


Three-day seminar:
see http://www.jlcomp.demon.co.uk/seminar.html
UK: November


The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html


----- Original Message -----
To: "Multiple recipients of list ORACLE-L" <[EMAIL PROTECTED]>
Sent: Friday, December 05, 2003 12:24 AM


> While working on some scripts to monitor PGA usage on 9i, I came across
> something interesting while experimenting with different forms of SQL.
>
> I have recently been forcing myself to make use of 'OVER..PARTITION BY' in
> SQL so as to be more comfortable in using it.  Can't add new tools to the
> box until I know how to use them.  :)  Yes, I know I should have been
> using them long ago.
>
> Anyway, I thought it might be interesting to compare the forms of SQL with
> and
> without the use of OVER...PARTITION BY.
>
> This SQL can be run on any instance that has PGA_AGGREGATE_TARGET set.
>
> Here is the SQL using OVER:
>
> select
>         low_optimal_size_kb
>         , high_optimal_size_kb
>         , optimal_executions
>         , onepass_executions
>         , multipasses_executions
>         , total_executions
>         , optimal_executions / sum_optimal_executions * 100 pct_optimal_executions
> from (
>         select
>                 low_optimal_size/1024 low_optimal_size_kb
>                 , (high_optimal_size+1)/1024 high_optimal_size_kb
>                 , optimal_executions
>                 , onepass_executions
>                 , multipasses_executions
>                 , total_executions
>                 , sum(optimal_executions) over ( partition by 1 ) sum_optimal_executions
>         from v$sql_workarea_histogram
>         where total_executions != 0
> ) a
> order by low_optimal_size_kb
> /
>
> and here is the SQL using good old GROUP BY
>
> select
>         low_optimal_size_kb
>         , high_optimal_size_kb
>         , optimal_executions
>         , onepass_executions
>         , multipasses_executions
>         , total_executions
>         , optimal_executions / sum_optimal_executions * 100 pct_optimal_executions
> from (
>         select
>                 h.low_optimal_size/1024 low_optimal_size_kb
>                 , (h.high_optimal_size+1)/1024 high_optimal_size_kb
>                 , h.optimal_executions
>                 , h.onepass_executions
>                 , h.multipasses_executions
>                 , h.total_executions
>                 , hs.sum_optimal_executions
>         from v$sql_workarea_histogram h,
>                 (
>                         select sum(optimal_executions) sum_optimal_executions
>                         from v$sql_workarea_histogram
>                 ) hs
>         where h.total_executions != 0
>         group by h.low_optimal_size/1024
>                 ,(h.high_optimal_size+1)/1024
>                 , h.optimal_executions
>                 , h.onepass_executions
>                 , h.multipasses_executions
>                 , h.total_executions
>                 , hs.sum_optimal_executions
> ) a
> order by low_optimal_size_kb
> /
>
>
> The new version is significantly simpler.
>
> It then seemed that it might be interesting to compare the performance and
> scalability of the two methods.
>
> This is where it gets interesting.
>
>
> 16:10:47 rsysdevdb.radisys.com - [EMAIL PROTECTED] SQL> @run_stats
>
> NAME                                           RUN1       RUN2       DIFF
> ---------------------------------------- ---------- ---------- ----------
> LATCH.lgwr LWN SCN                                1          0         -1
> LATCH.mostly latch-free SCN                       1          0         -1
> LATCH.undo global data                            1          0         -1
> STAT...active txn count during cleanout           1          0         -1
> STAT...consistent gets                            5          4         -1
> STAT...db block gets                             28         29          1
> STAT...enqueue requests                           1          0         -1
> STAT...redo entries                              17         18          1
> STAT...deferred (CURRENT) block cleanout          4          3         -1
>  applications
>
> STAT...consistent gets - examination              1          0         -1
> STAT...cleanout - number of ktugct calls          1          0         -1
> STAT...calls to kcmgcs                            7          6         -1
> STAT...calls to get snapshot scn: kcmgss       1006       1005         -1
> LATCH.Consistent RBA                              2          0         -2
> STAT...recursive cpu usage                       29         31          2
> LATCH.redo allocation                            20         18         -2
> LATCH.cache buffers chains                      102        105          3
> LATCH.redo writing                                4          0         -4
> LATCH.library cache                            2014       2008         -6
> LATCH.library cache pin                        2012       2006         -6
> LATCH.messages                                    8          0         -8
> STAT...redo size                              27096      27508        412
> STAT...sorts (memory)                          1004       2004       1000
> LATCH.SQL memory manager workarea list l          0       2000       2000
> atch
>
> STAT...workarea executions - optimal           2008       4008       2000
> STAT...sorts (rows)                            6112      10112       4000
>
> 26 rows selected.
>
> RUN1 is the GROUP BY SQL
> RUN2 is the OVER...PARTITION BY SQL
>
> The OVER version of the SQL is significantly more expensive in terms of
> sorting and latching.
>
> Has anyone else noticed this?
>
> Or perhaps my use of OVER..PARTITION BY needs some optimization, which is
> clearly
> in the realm of possibility.  :)
>
> Jared
>
>
>

--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Jonathan Lewis
 INET: [EMAIL PROTECTED]
