On Fri, Feb 23, 2018 at 6:42 AM, Viktor Fougstedt <vik...@chalmers.se>

> Hi.
> This answer is perhaps useful if I understand your problem correctly. But
> I might have interpreted it wrongly. :-)
> I would probably start with merging intervals so that overlapping and
> adjacent intervals become single continuous intervals, then select from
> those merged intervals.
> We have an application with a lot of interval handling in PostgreSQL, and
> we use many variants of algorithms based on one by Itzik Ben Gan which he
> calls “packing intervals”. The post we started with was an old reader’s
> challenge from SQL Magazine. It has since been updated when MS-SQL started
> supporting window functions better (Itzik Ben Gan is a MS-SQL-guy).
> Basically, it is a few CTE:s which convert the intervals into “start” (+1)
> and “stop” (-1) events, then keeps a running sum of these, and finally
> creates new intervals by noting that the merged (or “packed”) intervals
> starts with events that had sum=0 before them and stops with events which
> have sum=0 after them.
> It involves both CTE:s and window functions and is quite a beautiful
> example of SQL, IMHO.
> I think it’s best to google it, he does a good job of explaining how it
> works.
> Hope that helps a bit at least.
> /Viktor
> Hi.  Thanks for the many helpful responses!  Although I didn't end up with
exactly what I was looking for, I think I got to something that works good
enough for now.  I did it with CTEs, in a way I think similar to what you
were suggesting (my printer is out of ink, so I didn't actually get to look
at that book yet!)  I ended up having to do 4 passes:

1) Identify the starts and ends of continuous ranges
2) Eliminate the middle-point records (non stops/starts)
3) Merge the stop date in with the starts
4) Eliminate the stops

I couldn't see how to do it in less steps, but if there's a way...

I posted the query along with some sample data below.

Thanks again!



CREATE TEMP TABLE sample_data (
client_id INTEGER,
start_date DATE,
end_date DATE,
episode INTEGER -- I won't actually have this info; just a label I added
for clarity and testing




WITH peek3 AS (
WITH peek2 AS (
WITH peek AS (
  daterange(start_date,end_date,'[]') AS range,
  COALESCE(daterange(start_date,end_date+1,'[]') &&
  lead(daterange(start_date,end_date,'[]')) OVER (PARTITION BY client_id
ORDER BY start_date),false) AS continues,
  COALESCE(daterange(start_date,end_date,'[]') &&
  lag(daterange(start_date,end_date+1,'[]')) OVER (PARTITION BY client_id
ORDER BY start_date),false) AS is_continued
FROM peek
  WHERE NOT (is_continued AND continues)
SELECT client_id,episode,range,
  CASE WHEN lead(is_continued) OVER (PARTITION BY client_id ORDER BY range)
  lead(upper(range)) OVER (PARTITION BY client_id ORDER BY range)
  ELSE upper(range)
  END) AS full_range
FROM peek2
WHERE NOT is_continued

AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/>*
(253) 245-3801

Subscribe to the mailing list
<agency-general-requ...@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

Reply via email to