On Thu, Jul 22, 2010 at 11:31:23AM +0000, Tim Landscheidt wrote:
> Richard Huxton <d...@archonet.com> wrote:
> 
> >>> What I want to get is the values grouped by "subset", where a subset is a 
> >>> set of rows with identical column until the colum changes.
> >>> Is there a way to get
> 
> >>> | 2 | B |
> >>> | 4 | C |
> >>> | 4 | B |
> >>> | 3 | D |
> 
> >>> by SQL only?
> 
> >> I think, the problem is that there are 2 identical groups. I think, you
> >> can write a pl/pgsql-proc, selecting all ordered by the date-field and
> >> walking through the result to do the grouping, checking if the 2nd
> >> column is different from the previous.
> 
> >> With plain SQL it's maybe possible too, but i don't know how ...
> 
> > It should be do-able in 8.4 onwards, look into windowing
> > functions. In particular the lag() function:
> 
> > SELECT
> >     mycode,
> >     mydate,
> >     lag(mycode) OVER (ORDER BY mydate) AS prev_code
> > FROM
> >     mytable
> > ORDER BY mydate;
> 
> > It should be possible to use that as a subquery with an
> > outer query that compares mycode=prev_code to get a run
> > length.
> 
> Hmmm. Can the outer query be done without using "WITH
> RECURSIVE"?

How about this:

select
    a, b, c, d, sum
from (
    select
        a, b, c, d, new_partition,
        sum(e) over (partition by partition_num) 
    from (
        select
            a, b, c, d, e,
            case when
                lag(a, 1, null) over (order by d) is null or
                lag(a, 1, null) over (order by d) != a or
                lag(b, 1, null) over (order by d) is null or
                lag(b, 1, null) over (order by d) != b or
                lag(c, 1, null) over (order by d) is null or
                lag(c, 1, null) over (order by d) != c
                    then nextval('a')
                else currval('a')
            end as partition_num,
            case when
                lag(a, 1, null) over (order by d) is null or
                lag(a, 1, null) over (order by d) != a or
                lag(b, 1, null) over (order by d) is null or
                lag(b, 1, null) over (order by d) != b or
                lag(c, 1, null) over (order by d) is null or
                lag(c, 1, null) over (order by d) != c
                    then 'T'::boolean
                else 'f'::boolean
            end as new_partition
        from foo
    ) bar
) baz
where
    new_partition;

Here's my test table:
5432 j...@postgres# select * from foo;
 a | b | c |           d            | e 
---+---+---+------------------------+---
 1 | 9 | 1 | 2007-01-01 00:00:05-07 | 8
 9 | 2 | 1 | 2007-01-01 00:00:10-07 | 4
 9 | 2 | 1 | 2007-01-01 00:00:15-07 | 6
 9 | 2 | 1 | 2007-01-01 00:00:20-07 | 2
 6 | 5 | 7 | 2007-01-01 00:00:25-07 | 3
 4 | 9 | 0 | 2007-01-01 00:00:30-07 | 0
 4 | 9 | 0 | 2007-01-01 00:00:35-07 | 7
 5 | 2 | 7 | 2007-01-01 00:01:25-07 | 7
 5 | 2 | 7 | 2007-01-01 00:01:30-07 | 7
 5 | 2 | 7 | 2007-01-01 00:01:35-07 | 9
 5 | 2 | 7 | 2007-01-01 00:01:40-07 | 2
 5 | 2 | 7 | 2007-01-01 00:01:45-07 | 5
 5 | 2 | 7 | 2007-01-01 00:01:50-07 | 8
 5 | 2 | 7 | 2007-01-01 00:01:55-07 | 5
 5 | 2 | 7 | 2007-01-01 00:02:00-07 | 9
 7 | 8 | 8 | 2007-01-01 00:02:05-07 | 7
 7 | 8 | 8 | 2007-01-01 00:02:10-07 | 8
 9 | 3 | 0 | 2007-01-01 00:02:15-07 | 0
 9 | 3 | 0 | 2007-01-01 00:02:20-07 | 8
 9 | 2 | 1 | 2007-01-01 00:02:25-07 | 3
 9 | 2 | 1 | 2007-01-01 00:02:30-07 | 3
(21 rows)

...and these results...
 a | b | c |           d            | sum 
---+---+---+------------------------+-----
 1 | 9 | 1 | 2007-01-01 00:00:05-07 |   8
 9 | 2 | 1 | 2007-01-01 00:00:10-07 |  12
 6 | 5 | 7 | 2007-01-01 00:00:25-07 |   3
 4 | 9 | 0 | 2007-01-01 00:00:30-07 |   7
 5 | 2 | 7 | 2007-01-01 00:01:25-07 |  52
 7 | 8 | 8 | 2007-01-01 00:02:05-07 |  15
 9 | 3 | 0 | 2007-01-01 00:02:15-07 |   8
 9 | 2 | 1 | 2007-01-01 00:02:25-07 |   6
(8 rows)

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

Attachment: signature.asc
Description: Digital signature

Reply via email to