I ran the code from Oracle SQL 101 which Jared posted modifying it to find the median
of the sal column on that table
SQL> select
2 rownum,
3 sal
4 from (
5 select sal
6 from scott.emp
7 where sal is not null
8 union
9 select 1 from dual where 1=2
10 )
11 group by sal, rownum
12 having rownum >= (
13 select decode( mod(total_freq,2),
14 1,trunc(total_freq/2 + 1),
15 0,trunc(total_freq/2)
16 )
17 from (
18 select count(*) total_freq
19 from scott.emp
20 where sal is not null
21 )
22 )
23 and rownum <= (
24 select decode( mod(total_freq,2),
25 1,trunc(total_freq/2 + 1),
26 0,trunc(total_freq/2 + 1)
27 )
28 from (
29 select count(*) total_freq
30 from scott.emp
31 where sal is not null
32 )
33 )
34 /
values
averaged
ROWNUM in median
---------- ----------
7 1600
8 2450
----------
Median 2025
--------------------------------------------------------------------------------------
This answer is different from the result of the code I posted which uses the new
analytical functions.
select
case
when mod(number_salaried,2) = 0 then
(select sum(sal)/2 from(select sal, row_number()
over ( order by sal) as salrank
from scott.emp)
where salrank = number_salaried/2
or salrank = number_salaried/2 +1)
else
(select sal from(select sal, row_number()
over ( order by sal) as salrank
from scott.emp)
where salrank = ceil(number_salaried/2))
end median
from (select sal,rank() over (order by sal) as rk from scott.emp),
(select count(sal) number_salaried from scott.emp)
where rk = 1
/
MEDIAN
---------
1550
------------------------------------------------------------------------------------------------
Emp is a 14 row table . The median should be the average of the seventh and eighth
values.
I cleared the computes and columns and ran the first part of the SQL 101 code
clear computes
utes cleared
select
rownum,
sal
from (
select sal
from scott.emp
where sal is not null
union
select 1 from dual where 1=2
)
group by sal, rownum
/
ROWNUM SAL
------ ----------
1 800
2 950
3 1100
4 1250
5 1300
6 1500
7 1600
8 2450
9 2850
10 2975
11 3000
ROWNUM SAL
------ ----------
12 5000
and also ran the part of my code which corresponded. I changed my code slightly so
the salrank column would print.
SQL> select salrank, sal from(select sal, row_number()
2 over ( order by sal) as salrank
3 from scott.emp)
4 /
SALRANK SAL
---------- ----------
1 800
2 950
3 1100
4 1250
5 1250
6 1300
7 1500
8 1600
9 2450
10 2850
11 2975
SALRANK SAL
---------- ----------
12 3000
13 3000
14 5000
14 rows selected.
----------------------------------------------------------------------------------------------
The reason for the different answers is now apparent. The SQL 101 code is tossing
duplicate
records. It's been a long time since my stats classes, but I'm about 99.999999%
confident you don't purge duplicates when computing a median. But even if I'm wrong
about this, the SQL 101 code has reduced the set to 12 members, but it is still
computing the median as if there were 14 members;
that is, it is taking the average of the 7th and 8th values and not the average of the
6th and 7th.
I hope there was a caveat in SQL 101 book stating the code only worked against columns
with unique values, not including nulls.
Ian MacGregor
Stanford Linear Accelerator Center
[EMAIL PROTECTED]
--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: MacGregor, Ian A.
INET: [EMAIL PROTECTED]
Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051
San Diego, California -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).