2014-12-27 6:43 GMT+01:00 Merlin Moncure <mmonc...@gmail.com>:

> On Fri, Dec 26, 2014 at 11:41 PM, Merlin Moncure <mmonc...@gmail.com>
> wrote:
> > On Fri, Dec 26, 2014 at 1:19 PM, Dan S <strd...@gmail.com> wrote:
> >> Well I'm trying to implement a window-function that works on
> range_types and
> >> produces 'atomic ranges' for each input range.
> >> Let's say I have a set of ranges some overlapping some not, and I want
> to
> >> split each range at every boundary of every overlapping range and return
> >> those.
> >> So for each range r I want to return an array of ranges that consists of
> >> range r split at every overlapping range boundary.
> >> I need to consider both upper and lower boundaries and to be able to do
> this
> >> in one pass over the data I need to sort both the upper and lower
> boundaries
> >> and also sort the original ranges so I can loop over the ranges and
> >> boundaries in lockstep to produce the arrays.
> >> As a last step I sort back the arrays of atomic ranges in the original
> order
> >> in the window so I can read out the tuplesort in that order and return
> each
> >> array to its corresponding range r.
> >>
> >> (The result can be used to answer questions like what are the maximum
> number
> >> of simultaneously overlapping ranges and at which ranges the maximum
> occurs)
>
> shoot -- sorry for empty mail -- misclick.
>
> anyways, can you give some precise examples of what you want to occur?
>  for example, a set of sql taking actual inputs and the output you
> would like to see...
>
> merlin
>

Say I have theses ranges (in a table or as output of a sql statement):

[2000-01-01,2000-05-01)
[2000-02-01,2003-07-01)
[2002-06-01,2003-03-01)
[2003-04-01,2004-08-01)

then for the first range [2000-01-01,2000-05-01) I want the array output :
{[2000-01-01,2000-02-01),[2000-02-01,2000-05-01)}

for the next range [2000-02-01,2003-07-01) I want the array output:
{[2000-02-01,2000-05-01),[2000-05-01,2002-06-01),[2002-06-01,2003-03-01),[2003-03-01,2003-04-01),[2003-04-01,2003-07-01)}

and for the third range [2002-06-01,2003-03-01) I want the output:
{[2002-06-01,2003-03-01)}

and for the last range [2003-04-01,2004-08-01)  the output should be:
{[2003-04-01,2003-07-01),[2003-07-01,2004-08-01)}

As you can see each input range is split up at each boundary where it is
overlapped by the other ranges in the set.
I have written a window function which does this.
On the first call into the window function it calculates all the split up
ranges and stores the resulting arrays into a tuplesort and sort these back
into the original order in the window function.
On subsequent calls into the window function it pulls one array of ranges
at a time from the tuplesort and returns that as output.
Then on the last call into the window function I need to release the
tuplesort so to not leak resources.

The problem is that I can't determine which is the last call into the
function so I can't release the tuplesort at the appropriate time.
I need a way to determine when to release the tuplesort, for example a way
to register a callback at appropriate time.

Sql wise the table can look like this and say I have the 4 ranges above in
the table.

create table t1 (id int, dr daterange);

and the query with my window function: select id, dr,
window_range_split(dr) over () from t1
this query will exhaust the whole window and I can determine when to
release the tuplesort, I release it when I pull out the last tuple in the
window.

However this query won't work: select id, dr, window_range_split(dr) over
() from t1 limit 2
the number of tuples in the window is 4 and I have no idea that there are
only going to be 2 calls into the window function and so I fail release the
tuplesort on the second call.

I should also mention that it's not specifically dateranges I'm targeting.
the window function takes anyrange as argument type and returns anyarray.

Best Regards
Dan S

Reply via email to