On Tue, Nov 30, 2010 at 7:47 PM, Samuel Gendler <sgend...@ideasculptor.com>wrote:
> I have a fact table with a number of foreign keys to dimension data and > some measure columns. Let's assume I've got dimension columns a,b,c,d,e, > and f and measure columns x, and y. > > I need to be able to find the value of f, x, and y for the row with > min(x/y) when e in (1,2) and max(x/y) when e not in (1,2) with grouping by > a,b,c, and d. > > A query which would identify the row in question would look like this: > > select min(x/y), a, b, c, d, e from my_table group by a,b,c,d,e order by > a,b,c,d > > But I don't know how to go about also extracting the value of x, y, and f > for the row which is matched - I suppose I could join to a subquery, but > that strikes me as probably not being the most efficient way to accomplish > it. > > Actually, it occurs to me that I've potentially got more than one row for > any given a,b,c,d,e,f tuple, but I want a min for all rows aggregated by f > > Basically, for each (a,b,c,d,e) I want to return sum(x), sum(y), and f (as > well as a,b,c,d,e) for the f with min(sum(x)/sum(y)) > > I solved this, but I'm not sure if my solution is ideal -- select distinct causes multiple rows that tie for rank = 1 -- to condense down to a single row. Outer query also strips -- any rows from windows with less than 3 entries. select distinct count, total_ms, time, fk_e, fk_a, fk_b, fk_c from ( -- grab count and total_ms for each fk_g, grouped by -- a,b,c, time, and e. Use rank() window function -- to rank each row in the window, ordered by total_ms/count -- and use count(*) as window function to get total rows in -- the window. select sum(f.x) as count, sum(f.y) as total_ms, f.time as time, f.fk_a, f.fk_b, f.fk_c, f.fk_e, f.fk_f, rank() over (w_agg order by sum(f.x)/sum(f.y)) as rank, count(*) over (w_agg) as rows from fact_table f group by 3,4,5,6,7,8 window w_agg as (partition by time, fk_a, fk_b, fk_c, fk_e) ) q where q.rank = 1 and q.rows >= 3 I will need to run this twice - once with rank() ordered descending and the inner query actually limited by fk_e in (4,14) and once with rank() ordered ascending for fk_e not in (4,14). It is possible I can put a case statement in the inner select in order to handle that. I haven't tried it yet. Is there a better solution than the outer 'select distinct' in order to only see rows that have rank = 1 and to never have duplicate rows?