Hello,

I wrote article about statistical function - when I tested Joe Celko's
method, I found some problems on not unique dataset:

on distinct dataset is rule so rows here is max(hi), then there is min(lo):

create table x1 (a integer);

insert into x1 select generate_series(1,10);

postgres=# select row_number() over (order by a), row_number() over
(order by a desc) from x1;
 row_number | row_number
------------+------------
         10 |          1
          9 |          2
          8 |          3
          7 |          4
          6 |          5
          5 |          6
          4 |          7
          3 |          8
          2 |          9
          1 |         10
(10 rows)

but on other set I got

truncate table x1;
insert into x1 
values(2),(2),(3),(3),(4),(4),(5),(5),(6),(6),(6),(8),(9),(9),(10),(10);

postgres=# select row_number() over (order by a), row_number() over
(order by a desc) from x1;
 row_number | row_number
------------+------------
         16 |          1
         15 |          2
         14 |          3
         11 |          4
         13 |          5
         12 |          6
          9 |          7
         10 |          8
          7 |          9
          8 |         10
          5 |         11
          6 |         12
          4 |         13
          3 |         14
          1 |         15
          2 |         16
(16 rows)

I am not sure, is this correct? When this solution is correct, then
Joe Celko's method for median calculation is buggy.

Regards
Pavel Stehule

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to