Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

Peter Moser Mon, 20 Feb 2017 08:39:11 -0800

On Wed, Feb 15, 2017 at 9:33 PM, David G. Johnston
<[email protected]> wrote:
> On Wed, Feb 15, 2017 at 12:24 PM, Robert Haas <[email protected]> wrote:
>>
>> So it seems like an ALIGN or NORMALIZE option is kind of like a JOIN,
>> except apparently there's no join type and the optimizer can never
>> reorder these operations with each other or with other joins.  Is that
>> right?  The optimizer changes in this patch seem fairly minimal, so
>> I'm guessing it can't be doing anything very complex here.
>>
>> + * INPUT:
>> + *             (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c
>> + *             where q can be any join qualifier, and r.ts, r.te, s.ts,
>> and s.t
>> e
>> + *             can be any column name.
>> + *
>> + * OUTPUT:
>> + *             (
>> + *             SELECT r.*, GREATEST(r.ts, s.ts) P1, LEAST(r.te, s.te) P2
>> + *      FROM
>> + *      (
>> + *             SELECT *, row_id() OVER () rn FROM r
>> + *      ) r
>> + *      LEFT OUTER JOIN
>> + *      s
>> + *      ON q AND r.ts < s.te AND r.te > s.ts
>> + *      ORDER BY rn, P1, P2
>> + *      ) c
>>
>> It's hard to see what's going on here.  What's ts?  What's te?  If you
>> used longer names for these things, it might be a bit more
>> self-documenting.
>
>
> Just reasoning out loud here...
>
> ISTM ts and te are "temporal [range] start" and "temporal [range] end" (or
> probably just the common "timestamp start/end")
>
> From what I can see it is affecting an intersection of the two ranges and,
> furthermore, splitting the LEFT range into sub-ranges that match up with the
> sub-ranges found on the right side.  From the example above this seems like
> it should be acting on self-normalized ranges - but I may be missing
> something by evaluating this out of context.
>
> r1 [1, 6] [ts, te] [time period start, time period end]
> s1 [2, 3]
> s2 [3, 4]
> s3 [5, 7]
>
> r LEFT JOIN s ON (r.ts < s.te AND r.te > s.ts)
>
> r1[1, 6],s1[2, 3] => [max(r.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[2, 3]
> r1[1, 6],s2[3, 4] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[3, 4]
> r1[1, 6],s3[5, 7] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[5, 6]
>
> Thus the intersection is [2,6] but since s1 has three ranges that begin
> between 2 and 6 (i.e., 2, 3, and 5) there are three output records that
> correspond to those sub-ranges.


Yes, this is what the internal rewriting produces for r1.
Note that till now we only support half-open ranges, i.e., [), but for
visibility I will continue this example using closed ranges [].
The executor function (ExecTemporalAdjustment) gets this (the output above) as
the input and will then produce:

r1[1, 1]
r1[2, 3]
r1[3, 4]
r1[5, 6]

Which means also for the ALIGN the non-overlapping parts are retained.

>
> The description in the OP basically distinguishes between NORMALIZE and
> ALIGN in that ALIGN, as described above, affects an INTERSECTION on the two
> ranges - discarding the non-overlapping data - while NORMALIZE performs the
> alignment while also retaining the non-overlapping data.

Also for ALIGN we retain the non-overlapping part.
Intersections are symmetric/commutative, so a subsequent outer join can then use
equality on the ranges
to produce join matches (for overlapping) as well as null-extend the
produced non-overlapping parts.

The difference between ALIGN and NORMALIZE is how they split, while ALIGN
produces intersections between pairs of tuples (used for joins) and the
non-overlapping parts, NORMALIZE produces intersections between groups of tuples
(used for aggregation, so that all tuples with the same group have equal
ranges) and the non-overlapping parts.

For instance, the NORMALIZE between r1, s1, s2, and s3 in your example above
would give the following:

r1[1, 1]
r1[2, 2]
r1[3, 3]
r1[4, 4]
r1[5, 6]

>
> The rest of the syntax seems to deal with selecting subsets of range records
> based upon attribute data.

Yes, exactly!


Best regards,
Anton, Johann, Michael, Peter


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

Reply via email to