Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-11-11 Thread Robert Haas
On Fri, Oct 6, 2017 at 3:22 PM, Mike Rylander  wrote:
> I've also been following this feature with great interest, and would
> definitely throw whatever tiny weight I have, sitting out here in the
> the peanut gallery, behind accepting the ALIGN and NORMALIZE syntax.
> I estimate that about a third of the non-trivial queries in the
> primary project I work on (and have, on Postgres, for the last 13+
> years) would be simpler with support of the proposed syntax, and some
> of the most complex business logic would be simplified nearly to the
> point of triviality.

This is really good input.  If the feature weren't useful, then it
wouldn't make sense to try to figure out how to integrate it, but if
it is, then we should try.

I don't think that implementing a feature like this by SQL
transformation can work.  It's certainly got the advantage of
simplicity of implemention, but there are quite a few things that seem
like they won't always work correctly.  For instance:

+ * INPUT:
+ * (r ALIGN s ON q WITH (r.t, s.t)) c
+ *
+ *  where r and s are input relations, q can be any
+ *  join qualifier, and r.t, s.t can be any column name. The latter
+ *  represent the valid time intervals, that is time point start,
+ *  and time point end of each tuple for each input relation. These
+ *  are two half-open, i.e., [), range typed values.
+ *
+ * OUTPUT:
+ *  (
+ * SELECT r.*, GREATEST(LOWER(r.t), LOWER(s.t)) P1,
+ * LEAST(UPPER(r.t), UPPER(s.t)) P2
+ *  FROM
+ *  (
+ *  SELECT *, row_id() OVER () rn FROM r
+ *  ) r
+ *  LEFT OUTER JOIN
+ *  s
+ *  ON q AND r.t && s.t
+ *  ORDER BY rn, P1, P2
+ *  ) c

One problem with this is that we end up looking up functions in
pg_catalog by name: LOWER, UPPER, LEAST, GREATEST.  In particular,
when we do this...

+fcUpperRarg = makeFuncCall(SystemFuncName("upper"),
+   list_make1(crRargTs),
+   UNKNOWN_LOCATION);

...we're hoping and praying that we're going to latch onto the first of these:

rhaas=# \df upper
  List of functions
   Schema   | Name  | Result data type | Argument data types |  Type
+---+--+-+
 pg_catalog | upper | anyelement   | anyrange| normal
 pg_catalog | upper | text | text| normal
(2 rows)

But that's only true as long as there isn't another function in
pg_catalog with a match to the specific range type that is being used
here, and there's nothing to stop a user from creating one, and then
their query, which does not anywhere in its query text mention the
name of that function, will start failing.  We're not going to accept
that limitation.  Looking up functions by name rather than by OID or
using an opclass or something is pretty much a death sentence for a
core feature, and this patch does a lot of it.

A related problem is that, because all of this transformation is being
done in the parser, when you use this temporal syntax to create a
view, and then run pg_dump to dump that view, you are going to (I
think) get the transformed version, not the original.  Things like
transformTemporalClause are going to have user-visible effects: the
renaming you do there will (I think) be reflected in the deparsed
output of views.  That's not good.  Users have a right to expect that
what comes out of deparsing will at least resemble what they put in.

Error reporting might be a problem too: makeTemporalQuerySkeleton is
creating parse nodes that have no location, so if an error develops at
that point, how will the user correlate that with what they typed in?

I suspect there are also problems with plan invalidation.  Any
decisions we make at parse time are fixed forever.  DDL changes can
force a re-plan, but not a re-parse.

Overall, I think that the whole approach here probably needs to be
scrapped and rethought.  The stuff this patch is doing really belongs
in the optimizer, not the parser, I think.  It could possibly happen
at a relatively early stage in the optimizer so that the rest of the
optimizer can see the results of the transformation and, well,
optimize.  But parse time is way too early.

Unrelated to the above, this patch introduces various kinds of helper
functions which are general-purpose in function but dumped in with the
temporal support because it happens to use them.  For instance:

+static Form_pg_type
+typeGet(Oid id)
+{
+HeapTupletp;
+Form_pg_type typtup;
+
+tp = SearchSysCache1(TYPEOID, ObjectIdGetDatum(id));
+if (!HeapTupleIsValid(tp))
+ereport(ERROR,
+(errcode(ERROR),
+ errmsg("cache lookup failed for type %u", id)));
+
+typtup = (Form_pg_type) GETSTRUCT(tp);
+ReleaseSysCache(tp);
+return typtup;
+}

A function as general as typeGet() certainly does not belong in

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-10-06 Thread Mike Rylander
On Fri, Oct 6, 2017 at 1:22 PM, Paul A Jungwirth
 wrote:
> On Fri, Jul 22, 2016 at 4:15 AM, Anton Dignös  wrote:
>> We would like to contribute to PostgreSQL a solution that supports the query
>> processing of "at each time point". The basic idea is to offer two new
>> operators, NORMALIZE and ALIGN, whose purpose is to adjust (or split) the
>> ranges of tuples so that subsequent queries can use the usual grouping and
>> equality conditions to get the intended results.
>
> I just wanted to chime in and say that the work these people have done
> is *amazing*. I read two of their papers yesterday [1, 2], and if you
> are interested in temporal data, I encourage you to read them too. The
> first one is only 12 pages and quite readable. After that the second
> is easy because it covers a lot of the same ground but adds "scaling"
> of values when a tuple is split, and some other interesting points.
> Their contributions could be used to implement SQL:2011 syntax but go
> way beyond that.
>

I've also been following this feature with great interest, and would
definitely throw whatever tiny weight I have, sitting out here in the
the peanut gallery, behind accepting the ALIGN and NORMALIZE syntax.
I estimate that about a third of the non-trivial queries in the
primary project I work on (and have, on Postgres, for the last 13+
years) would be simpler with support of the proposed syntax, and some
of the most complex business logic would be simplified nearly to the
point of triviality.

Anyway, that's my $0.02.

Thank you, Anton and Peter!

-- Mike


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-10-06 Thread Paul A Jungwirth
On Fri, Jul 22, 2016 at 4:15 AM, Anton Dignös  wrote:
> We would like to contribute to PostgreSQL a solution that supports the query
> processing of "at each time point". The basic idea is to offer two new
> operators, NORMALIZE and ALIGN, whose purpose is to adjust (or split) the
> ranges of tuples so that subsequent queries can use the usual grouping and
> equality conditions to get the intended results.

I just wanted to chime in and say that the work these people have done
is *amazing*. I read two of their papers yesterday [1, 2], and if you
are interested in temporal data, I encourage you to read them too. The
first one is only 12 pages and quite readable. After that the second
is easy because it covers a lot of the same ground but adds "scaling"
of values when a tuple is split, and some other interesting points.
Their contributions could be used to implement SQL:2011 syntax but go
way beyond that.

Almost every project I work on could use temporal database support,
but there is nothing available in the Open Source world. The
temporal_tables extension [3] offers transaction-time support, which
is great for auditing, but it has no valid-time support (aka
application-time or state-time). Same with Magnus Hagander's TARDIS
approach [4], Chronomodel [5] (an extension to the Rails ORM), or any
other project I've seen. But valid-time is the more valuable
dimension, because it tells you the history of the thing itself (not
just when the database was changed). Also nothing is even attempting
full bitemporal support.

The ideas behind temporal data are covered extensively in Snodgrass's
1999 book [6], which shows how valuable it is to handle temporal data
in a principled way, rather than ad hoc. But that book also
demonstrates how complex the queries become to do things like temporal
foreign key constraints and temporal joins. I was sad to learn that
his proposed TSQL2 was rejected as a standard back in the 90s,
although the critiques by C. J. Date [7] have some merit. In
particular, since TSQL2 used *statement* modifiers, some of the
behavior was unclear or bad when using subqueries, views, and
set-returning functions. It makes more sense to have temporal
*operators*, so alongside inner join you have temporal inner join, and
likewise with temporal left outer join, temporal
union/intersection/difference, temporal aggregation, etc. (I think the
drawbacks of TSQL2 came from pursuing an unachievable goal, which was
to enable seamlessly converting existing non-temporal tables to
temporal without breaking any queries.)

Another unsatisfactory approach at historical data, from the industry
rather than academia, is in chapter 4 and elsewhere of Ralph Kimball's
*Data Warehouse Toolkit* [8]. His first suggestion (Type 1 Dimensions)
is to ignore the problem and overwrite old data with new. His Type 2
approach (make a new row) is better but loses the continuity between
the old row and the new. Type 3 fixes that but supports only one
change, not several. And anyway his ideas are tailored to star-schema
designs so are not as broadly useful. Workarounds like bridge tables
and "put the data in the fact table" are even more wedded to a
star-schema approach. But I think his efforts do show how valuable
historical data is, and how hard it is to handle without built-in
support.

As far as I can tell SQL:2011 avoids the statement modifier problem
(I'm not 100% sure), but it is quite limited, mostly covering
transaction-time semantics and not giving any way to do valid-time
outer joins or aggregations. It is clearly an early first step.
Unfortunately the syntax feels (to me) crippled by over-specificity,
like it will have a hard time growing to support all the things you'd
want to do.

The research by Dignös et al shows how you can define temporal
variants for every operator in the relational algebra, and then
implement them by using just two transformations (ALIGN and NORMALIZE)
combined with the existing non-temporal operators. It has a strong
theoretical basis and avoids the TSQL2 problems with composability.
And unlike SQL:2011 it has a great elegance and completeness I haven't
seen anywhere else.

I believe with range types the approach was to build up useful
primitives rather than jumping straight to a less-factored full
implementation of temporal features. (This in spite of SQL:2011
choosing to model begin/end times as separate columns, not as ranges.
:-) It seems to me the Dignös work follows the same philosophy. Their
ALIGN and NORMALIZE could be used to implement SQL:2011 features, but
they are also useful for much more. In their papers they actually
suggest that these transformations need not be exposed to end-users,
although it was convenient to have access to them for their own
research. I think it'd be great if Postgres's SQL dialect supported
them though, since SQL:2011 leaves out so much.

Anyway, I wanted to thank them for their excellent work, their
generosity, and also their 

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-09-22 Thread Peter Moser
2017-09-22 10:21 GMT+02:00 Pavel Stehule :
> Currently Postgres has zero support for SQL:2011 temporal tables. Isn't
> better start with already standard features than appends some without
> standard? The standard has some concept and if we start out of this concept,
> then the result will be far to standard probably.

We will focus for now on the Range Merge Join algorithm by Jeff Davis,
which implements a temporal join with overlap predicates.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-09-22 Thread Pavel Stehule
2017-09-22 10:15 GMT+02:00 Peter Moser :

> 2017-09-22 10:06 GMT+02:00 Pavel Stehule :
> > ANSI SQL 2011 has temporal data support
> >
> > https://www.slideshare.net/CraigBaumunk/temporal-
> extensions-tosql20112012010438
>
> As operations it only supports temporal inner joins using the overlap
> predicate.
> Temporal aggregation, temporal outer joins, temporal duplicate
> elimination, and temporal set operations are not supported in
> SQL:2011.
> Please see [1] Section 2.5 Future directions.
>
> Best regards,
> Anton, Johann, Michael, Peter
>
>
> [1] https://cs.ulb.ac.be/public/_media/teaching/infoh415/
> tempfeaturessql2011.pdf


Thank you for info.

Currently Postgres has zero support for SQL:2011 temporal tables. Isn't
better start with already standard features than appends some without
standard? The standard has some concept and if we start out of this
concept, then the result will be far to standard probably.

Regards

Pavel


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-09-22 Thread Peter Moser
2017-09-22 10:06 GMT+02:00 Pavel Stehule :
> ANSI SQL 2011 has temporal data support
>
> https://www.slideshare.net/CraigBaumunk/temporal-extensions-tosql20112012010438

As operations it only supports temporal inner joins using the overlap predicate.
Temporal aggregation, temporal outer joins, temporal duplicate
elimination, and temporal set operations are not supported in
SQL:2011.
Please see [1] Section 2.5 Future directions.

Best regards,
Anton, Johann, Michael, Peter


[1] https://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-09-22 Thread Pavel Stehule
2017-09-22 9:59 GMT+02:00 Peter Moser :

> 2017-09-12 16:33 GMT+02:00 Simon Riggs :
> > PostgreSQL tries really very hard to implement the SQL Standard and
> > just the standard. ISTM that the feedback you should have been given
> > is that this is very interesting but will not be committed in its
> > current form; I am surprised to see nobody has said that, though you
> > can see the truth of that since nobody is actively looking to review
> > or commit this. Obviously if the standard were changed to support
> > these things we'd suddenly be interested...
>
> Ok, we understand that PostgreSQL wants to strictly follow the SQL
> standard, which is not yet defined for temporal databases. In this
> context we understand your comment and agree on your position.
>

ANSI SQL 2011 has temporal data support

https://www.slideshare.net/CraigBaumunk/temporal-extensions-tosql20112012010438

Regards

Pavel Stehule


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-09-22 Thread Peter Moser
2017-09-12 16:33 GMT+02:00 Simon Riggs :
> PostgreSQL tries really very hard to implement the SQL Standard and
> just the standard. ISTM that the feedback you should have been given
> is that this is very interesting but will not be committed in its
> current form; I am surprised to see nobody has said that, though you
> can see the truth of that since nobody is actively looking to review
> or commit this. Obviously if the standard were changed to support
> these things we'd suddenly be interested...

Ok, we understand that PostgreSQL wants to strictly follow the SQL
standard, which is not yet defined for temporal databases. In this
context we understand your comment and agree on your position.

Our approach with the two temporal primitives is more far-reaching and
comprehensive: it supports all operators of a temporal relational
algebra by systematically transforming the temporal operators to the
nontemporal counterparts, thereby taking advantage of all features of
the underlying DBMS. This requires of course also new syntax.

> What I think I'm lacking is a clear statement of why we need to have
> new syntax to solve the problem ...

The newly introduced syntax of the two primitives comes from our
decision to divide a bigger patch into two parts: an primitives-only
patch, and a temporal query rewrite patch. We thought of discussing
and providing a second patch in the future, which would then
automatically rewrite temporal queries into their non-temporal
counterparts using these two primitives.

> ... and why the problem itself is
> important.

The main idea about these temporal primitives is to have only two new
operators to provide the whole range of temporal queries, that is,
temporal joins, temporal set operations, temporal duplicate
elimination, and temporal aggregation. The benefit of the primitives
is that it is minimal invasive to the postgres kernel due to the reuse of all
standard operators after the temporal split (or normalization). It can
therefore also use existing optimizations already implemented.

An alternative approach would be to implement for each operator a
separate algorithm.  For instance, Jeff Davis is implementing a temporal
join into the existing Merge Join Executor (see [1]). Note that a
temporal join is the only operator that can be implemented without
introducing new syntax due to the overlap predicate.  For all other
temporal operators a discussion about new syntax is necessary anyway,
independent of the implementation approach.

> PostgreSQL supports the ability to produce Set Returning Functions and
> various other extensions. Would it be possible to change this so that
> we don't add new syntax to the parser but rather we do this as a set
> of functions?

Set Returning Functions would indeed be a possibility to implement
temporal query processing without new syntax, though it has some serious
drawbacks: the user has to specify the schema of the query results; the
performance might be a problem, since functions are treated as
black-boxes for the optimizer, loosing selection-pushdown and similar
optimizations.

> An alternative might be for us to implement a pluggable parser, so
> that you can have an "alternate syntax" plugin. If we did that, you
> can then maintain the plugin outside of core until the time when SQL
> Standard is updated and we can implement directly. We already support
> the ability to invent new plan nodes, so this could be considered as
> part of the plugin.

This is an interesting idea to look into when it is ready some day.



With all these clarifications in mind, we thought to focus meanwhile
on improving the performance of temporal query processing in special
cases (eg., joins), similar to the pending Range Merge Join patch by
Jeff Davis in [1]. Hereby, we like to contribute to it as reviewers
and hopefully add some improvements or valuable ideas from our
research area.


Best regards,
Anton, Johann, Michael, Peter


[1] 
https://www.postgresql.org/message-id/flat/camp0ubfwaffw3o_ngkqprpmm56m4wetexjprb2gp_nrdaec...@mail.gmail.com#camp0ubfwaffw3o_ngkqprpmm56m4wetexjprb2gp_nrdaec...@mail.gmail.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-09-12 Thread Simon Riggs
On 30 March 2017 at 13:11, Peter Moser  wrote:
> 2017-03-01 10:56 GMT+01:00 Peter Moser :
>> A similar walkthrough for ALIGN will follow soon.
>>
>> We are thankful for any suggestion or ideas, to be used to write a
>> good SGML documentation.
>
> The attached README explains the ALIGN operation step-by-step with a
> TEMPORAL LEFT OUTER JOIN example. That is, we start from a query
> input, show how we rewrite it during parser stage, and show how the
> final execution generates result tuples.

Thanks for this contribution. I know what it takes to do significant
contributions and I know it can be frustrating when things languish
for months.

I am starting to look at temporal queries myself so I will begin an interest.

PostgreSQL tries really very hard to implement the SQL Standard and
just the standard. ISTM that the feedback you should have been given
is that this is very interesting but will not be committed in its
current form; I am surprised to see nobody has said that, though you
can see the truth of that since nobody is actively looking to review
or commit this. Obviously if the standard were changed to support
these things we'd suddenly be interested...

What I think I'm lacking is a clear statement of why we need to have
new syntax to solve the problem and why the problem itself is
important.

PostgreSQL supports the ability to produce Set Returning Functions and
various other extensions. Would it be possible to change this so that
we don't add new syntax to the parser but rather we do this as a set
of functions?

An alternative might be for us to implement a pluggable parser, so
that you can have an "alternate syntax" plugin. If we did that, you
can then maintain the plugin outside of core until the time when SQL
Standard is updated and we can implement directly. We already support
the ability to invent new plan nodes, so this could be considered as
part of the plugin.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-08-31 Thread Thomas Munro
On Tue, Aug 1, 2017 at 12:53 AM, Peter Moser  wrote:
> On 06.04.2017 01:24, Andres Freund wrote:
>>
>> Unfortunately I don't think this patch has received sufficient design
>> and implementation to consider merging it into v10.  As code freeze is
>> in two days, I think we'll have to move this to the next commitfest.
>
>
> We rebased our patch on top of commit
> 393d47ed0f5b764341c7733ef60e8442d3e9bdc2
> from "Mon Jul 31 11:24:51 2017 +0900".

Hi Peter,

This patch still applies, but no longer compiles:

nodeTemporalAdjustment.c: In function ‘ExecTemporalAdjustment’:
nodeTemporalAdjustment.c:286:21: error: incompatible types when
assigning to type ‘Form_pg_attribute’ from type
‘FormData_pg_attribute’
   node->datumFormat = curr->tts_tupleDescriptor->attrs[tc->attNumP1 - 1];
 ^

After commits 2cd70845 and c6293249 you need to change expressions of
that format to, for example:

   node->datumFormat = TupleDescAttr(curr->tts_tupleDescriptor,
tc->attNumP1 - 1);

Thanks!

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-07-31 Thread Peter Moser

On 06.04.2017 01:24, Andres Freund wrote:

Unfortunately I don't think this patch has received sufficient design
and implementation to consider merging it into v10.  As code freeze is
in two days, I think we'll have to move this to the next commitfest.


We rebased our patch on top of commit 
393d47ed0f5b764341c7733ef60e8442d3e9bdc2

from "Mon Jul 31 11:24:51 2017 +0900".

Best regards,
Anton, Johann, Michael, Peter
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7648201..a373358 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -919,6 +919,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		case T_SeqScan:
 			pname = sname = "Seq Scan";
 			break;
+		case T_TemporalAdjustment:
+			if(((TemporalAdjustment *) plan)->temporalCl->temporalType == TEMPORAL_TYPE_ALIGNER)
+pname = sname = "Adjustment(for ALIGN)";
+			else
+pname = sname = "Adjustment(for NORMALIZE)";
+			break;
 		case T_SampleScan:
 			pname = sname = "Sample Scan";
 			break;
diff --git a/src/backend/executor/Makefile b/src/backend/executor/Makefile
index 083b20f..b0d6d15 100644
--- a/src/backend/executor/Makefile
+++ b/src/backend/executor/Makefile
@@ -29,6 +29,6 @@ OBJS = execAmi.o execCurrent.o execExpr.o execExprInterp.o \
nodeCtescan.o nodeNamedtuplestorescan.o nodeWorktablescan.o \
nodeGroup.o nodeSubplan.o nodeSubqueryscan.o nodeTidscan.o \
nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o \
-   nodeTableFuncscan.o
+   nodeTableFuncscan.o nodeTemporalAdjustment.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 396920c..7dd7474 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -113,6 +113,7 @@
 #include "executor/nodeValuesscan.h"
 #include "executor/nodeWindowAgg.h"
 #include "executor/nodeWorktablescan.h"
+#include "executor/nodeTemporalAdjustment.h"
 #include "nodes/nodeFuncs.h"
 #include "miscadmin.h"
 
@@ -364,6 +365,11 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
  estate, eflags);
 			break;
 
+		case T_TemporalAdjustment:
+			result = (PlanState *) ExecInitTemporalAdjustment((TemporalAdjustment *) node,
+ estate, eflags);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;		/* keep compiler quiet */
@@ -711,6 +717,10 @@ ExecEndNode(PlanState *node)
 			ExecEndLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			ExecEndTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
diff --git a/src/backend/executor/nodeTemporalAdjustment.c b/src/backend/executor/nodeTemporalAdjustment.c
new file mode 100644
index 000..ff2aa85
--- /dev/null
+++ b/src/backend/executor/nodeTemporalAdjustment.c
@@ -0,0 +1,571 @@
+#include "postgres.h"
+#include "executor/executor.h"
+#include "executor/nodeTemporalAdjustment.h"
+#include "utils/memutils.h"
+#include "access/htup_details.h"/* for heap_getattr */
+#include "utils/lsyscache.h"
+#include "nodes/print.h"		/* for print_slot */
+#include "utils/datum.h"		/* for datumCopy */
+#include "utils/rangetypes.h"
+
+/*
+ * #define TEMPORAL_DEBUG
+ * XXX PEMOSER Maybe we should use execdebug.h stuff here?
+ */
+#ifdef TEMPORAL_DEBUG
+static char*
+datumToString(Oid typeinfo, Datum attr)
+{
+	Oid			typoutput;
+	bool		typisvarlena;
+	getTypeOutputInfo(typeinfo, , );
+	return OidOutputFunctionCall(typoutput, attr);
+}
+
+#define TPGdebug(...) 	{ printf(__VA_ARGS__); printf("\n"); fflush(stdout); }
+#define TPGdebugDatum(attr, typeinfo) 	TPGdebug("%s = %s %ld\n", #attr, datumToString(typeinfo, attr), attr)
+#define TPGdebugSlot(slot) { printf("Printing Slot '%s'\n", #slot); print_slot(slot); fflush(stdout); }
+
+#else
+#define datumToString(typeinfo, attr)
+#define TPGdebug(...)
+#define TPGdebugDatum(attr, typeinfo)
+#define TPGdebugSlot(slot)
+#endif
+
+/*
+ * isLessThan
+ *		We must check if the sweepline is before a timepoint, or if a timepoint
+ *		is smaller than another. We initialize the function call info during
+ *		ExecInit phase.
+ */
+static bool
+isLessThan(Datum a, Datum b, TemporalAdjustmentState* node)
+{
+	node->ltFuncCallInfo.arg[0] = a;
+	node->ltFuncCallInfo.arg[1] = b;
+	node->ltFuncCallInfo.argnull[0] = false;
+	node->ltFuncCallInfo.argnull[1] = false;
+
+	/* Return value is never null, due to the pre-defined sub-query output */
+	return DatumGetBool(FunctionCallInvoke(>ltFuncCallInfo));
+}
+
+/*
+ * isEqual
+ *		We must check if two timepoints are equal. We initialize the function
+ *		call info during ExecInit phase.
+ */
+static bool
+isEqual(Datum a, Datum b, TemporalAdjustmentState* node)
+{
+	node->eqFuncCallInfo.arg[0] = a;
+	node->eqFuncCallInfo.arg[1] = b;
+	

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-04-05 Thread Andres Freund
Hi,

On 2017-03-30 14:11:28 +0200, Peter Moser wrote:
> 2017-03-01 10:56 GMT+01:00 Peter Moser :
> > A similar walkthrough for ALIGN will follow soon.
> >
> > We are thankful for any suggestion or ideas, to be used to write a
> > good SGML documentation.
> 
> The attached README explains the ALIGN operation step-by-step with a
> TEMPORAL LEFT OUTER JOIN example. That is, we start from a query
> input, show how we rewrite it during parser stage, and show how the
> final execution generates result tuples.

Unfortunately I don't think this patch has received sufficient design
and implementation to consider merging it into v10.  As code freeze is
in two days, I think we'll have to move this to the next commitfest.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-03-30 Thread Peter Moser
2017-03-01 10:56 GMT+01:00 Peter Moser :
> A similar walkthrough for ALIGN will follow soon.
>
> We are thankful for any suggestion or ideas, to be used to write a
> good SGML documentation.

The attached README explains the ALIGN operation step-by-step with a
TEMPORAL LEFT OUTER JOIN example. That is, we start from a query
input, show how we rewrite it during parser stage, and show how the
final execution generates result tuples.


Best regards,
Anton, Michael, Johann, Peter

ALIGN


  This is an exemplary walkthrough of a temporal ALIGN operation, from the
  query input, over the query rewrite, until the result tuple outputs during
  execution.



EXAMPLE


  These tables represent employees and projects in a company. An employee works
  for a department over a certain time period. Different projects get developed
  in a department. The start and end time points of each project is stored in
  the period column.

CREATE TABLE emp (empname VARCHAR, dept VARCHAR, period INT4RANGE);
INSERT INTO emp VALUES
('Ann', 'DB', '[1,5)'),
('Bea', 'DB', '[3,8)'),
('Ron', 'AI', '[6,9)');

CREATE TABLE prj (prjname VARCHAR, dept VARCHAR, period INT4RANGE);
INSERT INTO prj VALUES
('PR1', 'DB', '[3,7)'),
('PR2', 'DB', '[1,5)'),
('PR3', 'HW', '[2,8)');


Timeline representation
---

EMPNAME  DEPT  PERIOD
Ann  DB[1,5)    <- Tuple 1, valid from 1 to 5 excl.
Bea  DB[3,8)-   <- Tuple 2
Ron  AI[6,9)   ---  <- Tuple 3
  123456789 <- Timeline

PRJNAME  DEPT  PERIOD
PR1  DB[3,7)<- Tuple 1, valid from 3 to 7 excl.
PR2  DB[1,5)    <- Tuple 2
PR3  HW[2,8)   --   <- Tuple 3
  123456789 <- Timeline


TEMPORAL LEFT OUTER JOIN query
--

  Query: At each time point, to which projects is an  employee assigned, and
 when does an employee not have an assigned project?


WITH emp AS (SELECT period u, * FROM emp),
 prj AS (SELECT period v, * FROM prj)
SELECT empname, prjname, dept, period FROM (
emp ALIGN prj ON emp.dept = prj.dept WITH (emp.period, prj.period)
) emp_aligned
LEFT OUTER JOIN (
prj ALIGN emp ON emp.dept = prj.dept WITH (prj.period, emp.period)
) prj_aligned
USING(dept, period)
WHERE period = u * v OR u IS NULL OR v IS NULL;


Result
--

 empname | prjname | dept | period
-+-+--+
 Ann | PR2 | DB   | [1,5)   
 Ann | PR1 | DB   | [3,5) --
 Bea | PR1 | DB   | [3,7) 
 Bea | PR2 | DB   | [3,5) --
 Bea | | DB   | [7,8) -
 Ron | | AI   | [6,9)---
123456789  <- Timeline



STEP-BY-STEP EXPLANATION


  In this chapter we describe first the rewrite and processing of the two ALIGN
  clauses, that are,

( emp ALIGN prj ON emp.dept = prj.dept WITH (emp.period, prj.period)
) emp_aligned

  ...and...

( prj ALIGN emp ON emp.dept = prj.dept WITH (prj.period, emp.period)
) prj_aligned.

  Secondly, we explain what the above query does in general, and why we need two
  ALIGNs, the WHERE-clause and CTEs (WITH clauses).


ALIGN subquery processing
-

  After receiving the ALIGN query (see above) as input, we rewrite it into
  the following query:

SELECT emp.*, GREATEST(LOWER(emp.period), LOWER(prj.period)) p1,
 LEAST(UPPER(emp.period), UPPER(prj.period)) p2
FROM
(SELECT *, row_id() OVER () rn FROM emp) emp
LEFT OUTER JOIN
prj
ON emp.dept = prj.dept AND emp.period && prj.period
ORDER BY rn, p1, p2;

  Then, we rewrite the second ALIGN subquery analoguously.


Intermediate results: These are the inputs of our ALIGN execution nodes
---
  (See Appendix [1] for details)

  For emp ALIGN prj...

TUPIDempname | dept | period | rn | p1 | p2
-+--++++
  A  Ann | DB   | [1,5)  |  1 |  1 |  5
  B  Ann | DB   | [1,5)  |  1 |  3 |  5
  C  Bea | DB   | [3,8)  |  2 |  3 |  5
  D  Bea | DB   | [3,8)  |  2 |  3 |  7
  E  Ron | AI   | [6,9)  |  3 |  6 |  9


  For prj ALIGN emp...


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-03-01 Thread Peter Moser
2017-02-15 20:24 GMT+01:00 Robert Haas :
> There's no documentation in this patch.  I'm not sure you want to go
> to the trouble of writing SGML documentation until this has been
> reviewed enough that it has a real chance of getting committed, but on
> the other hand we're obviously all struggling to understand what it
> does, so I think if not SGML documentation it at least needs a real
> clear explanation of what the syntax is and does in a README or
> something, even just for initial review.

The attached README explains the NORMALIZE operation step-by-step with
an example. That is, we start from a query input, show how we rewrite
it during parser stage, and show how the final execution generates
result tuples. A similar walkthrough for ALIGN will follow soon.

We are thankful for any suggestion or ideas, to be used to write a
good SGML documentation.

Best regards,
Anton, Michael, Johann, Peter

NORMALIZE


This is an exemplary walkthrough of a temporal NORMALIZE operation, from the
query input, over the query rewrite, until the result tuple outputs during
execution.



EXAMPLE


Question: How many employees work in a department at each time point?

CREATE TABLE empl (name VARCHAR, dept VARCHAR, time INT4RANGE);
INSERT INTO empl VALUES
('Ann', 'DB', '[1,5)'),
('Ann', 'AI', '[3,8)'),
('Bea', 'DB', '[3,9)');


Timeline representation:

NAME  DEPT  TIME
Ann   DB[1,5)   <- Tuple 1, valid from 1 to 5 excl.
Ann   AI[3,8) - <- Tuple 2
Bea   DB[3,9) --<- Tuple 3
123456789   <- Timeline


NORMALIZE query:

SELECT count(*), dept, time
FROM (
empl a NORMALIZE empl b USING(dept) WITH (a.time, b.time)
) c
GROUP BY dept, time;


Result:

COUNT  DEPT TIME
1  DB   [1,3)   --
1  AI   [3,8) -
2  DB   [3,5) --
1  DB   [5,9)   
123456789  <- Timeline



STEP-BY-STEP EXPLANATION


After receiving the NORMALIZE query (see above) as input, we rewrite it into
the following query:

SELECT a.*, P1
FROM (
SELECT *, row_id() OVER () rn FROM empl
) a
LEFT OUTER JOIN (
SELECT dept, LOWER(time) P1 FROM empl
UNION
SELECT dept, UPPER(time) P1 FROM empl
) b
ON a.dept = b.dept AND P1 <@ a.time
ORDER BY rn, P1;


Intermediate result: This is the input of our NORMALIZE execution node...
(See Appendix [1] for details)

 TUPID   name | dept | time  | rn | p1
--+--+---++
   A Ann  | DB   | [1,5) |  1 |  1
   B Ann  | DB   | [1,5) |  1 |  3
   C Ann  | AI   | [3,8) |  2 |  3
   D Bea  | DB   | [3,9) |  3 |  3
   E Bea  | DB   | [3,9) |  3 |  5



We have three possibilities inside NORMALIZE during execution:
(See Appendix [2] for details)

1) CURR == PREV
   a) S < CURR.P1   --> generate (CURR, [S, CURR.P1)) and set S = CURR.P1
   b) otherwise --> fetch next tuple
  if fetched tuple is NULL --> goto(2) forcing CURR != PREV to be true
2) CURR != PREV
   a) S < PREV.te   --> generate (PREV, [S, PREV.te))
   ... set S = CURR.ts and goto(1) forcing CURR == PREV to be true



Executor steps for our example query:

1) First call of ExecTemporalAdjustment: Fetch tuple A, which is now CURR.
   Set sweepline S = 1 (A.ts), and copy tuple A also into PREV. Goto(1)
   forcing CURR == PREV to be true.

2) (1): CURR == PREV, that is, tuple A is equal to itself.
   (1b): Fetch tuple B; CURR = B, PREV = A

3) (1): A == B
   (1a): S < B.P1 (1 < 3), hence we generate a result tuple:
(Ann, DB, [1,3))
   ...where the interval is [S, CURR.P1) and all other attributes are
   copied from the current tuple (omitting helper columns: RN and P1)
   We update the sweepline: S = 3 (B.P1).

4) (1): A == B.
   (1b): Fetch tuple C; CURR = C, PREV = B

5) (2): B != C.
   (2a): S < B.te (3 < 5), hence we generate a result tuple:
(Ann, DB, [3, 5))
   We update the sweepline: S = 3 (C.ts).

6) (1): B == C (forced).
   (1b): Fetch tuple D; CURR = D, PREV = C

7) (2): C != D.
   (2a): S < C.te (3 < 8), hence we generate a result tuple:
(Ann, AI, [3,8))
   We update the sweepline: S = 3 (D.ts).

8) (1): C == D (forced).
   (1b): Fetch tuple E; CURR = E, PREV = D

9) (1): D == E.
   (1a): S < E.P1 (3 < 5), hence we generate a result tuple:
(Bea, DB, [3, 5))
   We update the sweepline: S = 5 (E.P1).

10) (1): D == E.
(1b): 

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-27 Thread Peter Moser
2017-02-24 21:25 GMT+01:00 Jim Nasby :
> On 2/24/17 6:40 AM, Peter Moser wrote:
>>
>> Do you think it is better to remove the syntax for ranges expressed in
>> different columns?
>
>
> It's not that hard to construct a range type on-the-fly from 2 columns, so
> (without having looked at the patch or really followed the thread) I would
> think the answer is yes.

We discussed and decided to remove the syntax for separate columns.
The patch with "range-types-only" syntax will be send soon.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-24 Thread Jim Nasby

On 2/24/17 6:40 AM, Peter Moser wrote:

Do you think it is better to remove the syntax for ranges expressed in
different columns?


It's not that hard to construct a range type on-the-fly from 2 columns, 
so (without having looked at the patch or really followed the thread) I 
would think the answer is yes.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-24 Thread Peter Moser
2017-02-22 19:43 GMT+01:00 Peter Eisentraut :
> On 2/16/17 07:41, Robert Haas wrote:
>> Also, it sounds like all of this is intended to work with ranges that
>> are stored in different columns rather than with PostgreSQL's built-in
>> range types.
>
> Yeah, that should certainly be changed.

Our syntax supports PostgreSQL's built-in range types and ranges that
are stored in different columns.

For instance, for range types an ALIGN query would look like this:
  SELECT * FROM (r ALIGN s ON q WITH (r.t, s.t)) c

... and for ranges in different columns like this:
  SELECT * FROM (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c

... where r and s are input relations, q can be any join qualifier, and
r.t, s.t, r.ts, r.te, s.ts, and s.te can be any column name. The
latter represent the valid time intervals, that is time point start,
and time point end of each tuple for each input relation. These can
be defined as four scalars, or two half-open, i.e., [), range typed
values.




It would reduce the size of our patch and simplify the overall structure,
if we would remove the possibility to express valid time start points and end
points in different columns.

Do you think it is better to remove the syntax for ranges expressed in
different columns?

However, internally we still need to split the
range types into two separate points, because NORMALIZE does not make a
distinction between start and end timepoints while grouping, therefore we
have only one timepoint attribute there (i.e., P1), which is the union of
start and end timepoints (see executor/nodeTemporalAdjustment.c). A second
constraint is, that we support currently only half-open intervals, that is,
any interval definition (open/closed/half-open) different from the PostgreSQL's
default, i.e., [), leads to undefined results.

Best regards,
Anton, Johann, Michael, Peter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-22 Thread Peter Eisentraut
On 2/16/17 07:41, Robert Haas wrote:
> Also, it sounds like all of this is intended to work with ranges that
> are stored in different columns rather than with PostgreSQL's built-in
> range types.

Yeah, that should certainly be changed.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-20 Thread Peter Moser
On Wed, Feb 15, 2017 at 9:33 PM, David G. Johnston
 wrote:
> On Wed, Feb 15, 2017 at 12:24 PM, Robert Haas  wrote:
>>
>> So it seems like an ALIGN or NORMALIZE option is kind of like a JOIN,
>> except apparently there's no join type and the optimizer can never
>> reorder these operations with each other or with other joins.  Is that
>> right?  The optimizer changes in this patch seem fairly minimal, so
>> I'm guessing it can't be doing anything very complex here.
>>
>> + * INPUT:
>> + * (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c
>> + * where q can be any join qualifier, and r.ts, r.te, s.ts,
>> and s.t
>> e
>> + * can be any column name.
>> + *
>> + * OUTPUT:
>> + * (
>> + * SELECT r.*, GREATEST(r.ts, s.ts) P1, LEAST(r.te, s.te) P2
>> + *  FROM
>> + *  (
>> + * SELECT *, row_id() OVER () rn FROM r
>> + *  ) r
>> + *  LEFT OUTER JOIN
>> + *  s
>> + *  ON q AND r.ts < s.te AND r.te > s.ts
>> + *  ORDER BY rn, P1, P2
>> + *  ) c
>>
>> It's hard to see what's going on here.  What's ts?  What's te?  If you
>> used longer names for these things, it might be a bit more
>> self-documenting.
>
>
> Just reasoning out loud here...
>
> ISTM ts and te are "temporal [range] start" and "temporal [range] end" (or
> probably just the common "timestamp start/end")
>
> From what I can see it is affecting an intersection of the two ranges and,
> furthermore, splitting the LEFT range into sub-ranges that match up with the
> sub-ranges found on the right side.  From the example above this seems like
> it should be acting on self-normalized ranges - but I may be missing
> something by evaluating this out of context.
>
> r1 [1, 6] [ts, te] [time period start, time period end]
> s1 [2, 3]
> s2 [3, 4]
> s3 [5, 7]
>
> r LEFT JOIN s ON (r.ts < s.te AND r.te > s.ts)
>
> r1[1, 6],s1[2, 3] => [max(r.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[2, 3]
> r1[1, 6],s2[3, 4] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[3, 4]
> r1[1, 6],s3[5, 7] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[5, 6]
>
> Thus the intersection is [2,6] but since s1 has three ranges that begin
> between 2 and 6 (i.e., 2, 3, and 5) there are three output records that
> correspond to those sub-ranges.

Yes, this is what the internal rewriting produces for r1.
Note that till now we only support half-open ranges, i.e., [), but for
visibility I will continue this example using closed ranges [].
The executor function (ExecTemporalAdjustment) gets this (the output above) as
the input and will then produce:

r1[1, 1]
r1[2, 3]
r1[3, 4]
r1[5, 6]

Which means also for the ALIGN the non-overlapping parts are retained.

>
> The description in the OP basically distinguishes between NORMALIZE and
> ALIGN in that ALIGN, as described above, affects an INTERSECTION on the two
> ranges - discarding the non-overlapping data - while NORMALIZE performs the
> alignment while also retaining the non-overlapping data.

Also for ALIGN we retain the non-overlapping part.
Intersections are symmetric/commutative, so a subsequent outer join can then use
equality on the ranges
to produce join matches (for overlapping) as well as null-extend the
produced non-overlapping parts.

The difference between ALIGN and NORMALIZE is how they split, while ALIGN
produces intersections between pairs of tuples (used for joins) and the
non-overlapping parts, NORMALIZE produces intersections between groups of tuples
(used for aggregation, so that all tuples with the same group have equal
ranges) and the non-overlapping parts.

For instance, the NORMALIZE between r1, s1, s2, and s3 in your example above
would give the following:

r1[1, 1]
r1[2, 2]
r1[3, 3]
r1[4, 4]
r1[5, 6]

>
> The rest of the syntax seems to deal with selecting subsets of range records
> based upon attribute data.

Yes, exactly!


Best regards,
Anton, Johann, Michael, Peter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-16 Thread Robert Haas
On Wed, Feb 15, 2017 at 3:33 PM, David G. Johnston
 wrote:
> The description in the OP basically distinguishes between NORMALIZE and
> ALIGN in that ALIGN, as described above, affects an INTERSECTION on the two
> ranges - discarding the non-overlapping data - while NORMALIZE performs the
> alignment while also retaining the non-overlapping data.

Hmm.  So is ALIGN like an inner join and NORMALIZE like a full outer
join?  Couldn't you have left and right outer joins, too, like you
null-extend the parts of the lefthand range that don't match but throw
away parts of the righthand range that don't match?

Also, it sounds like all of this is intended to work with ranges that
are stored in different columns rather than with PostgreSQL's built-in
range types.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-15 Thread David G. Johnston
On Wed, Feb 15, 2017 at 12:24 PM, Robert Haas  wrote:

> On Tue, Jan 24, 2017 at 4:32 AM, Peter Moser  wrote:
> >> Using common terms such as ALIGN and NORMALIZE for such a specific
> >> functionality seems a bit wrong.
> >
> > Would ALIGN RANGES/RANGE ALIGN and NORMALIZE RANGES/RANGE NORMALIZE be
> better
> > options? We are also thankful for any suggestion or comments about the
> syntax.
>
> So it seems like an ALIGN or NORMALIZE option is kind of like a JOIN,
> except apparently there's no join type and the optimizer can never
> reorder these operations with each other or with other joins.  Is that
> right?  The optimizer changes in this patch seem fairly minimal, so
> I'm guessing it can't be doing anything very complex here.
>
> + * INPUT:
> + * (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c
> + * where q can be any join qualifier, and r.ts, r.te, s.ts,
> and s.t
> e
> + * can be any column name.
> + *
> + * OUTPUT:
> + * (
> + * SELECT r.*, GREATEST(r.ts, s.ts) P1, LEAST(r.te, s.te) P2
> + *  FROM
> + *  (
> + * SELECT *, row_id() OVER () rn FROM r
> + *  ) r
> + *  LEFT OUTER JOIN
> + *  s
> + *  ON q AND r.ts < s.te AND r.te > s.ts
> + *  ORDER BY rn, P1, P2
> + *  ) c
>
> It's hard to see what's going on here.  What's ts?  What's te?  If you
> used longer names for these things, it might be a bit more
> self-documenting.
>

Just reasoning out loud here...​

ISTM ts and te are "temporal [range] start" and "temporal [range] end"​ (or
probably just the common "timestamp start/end")

​From what I can see it is affecting an intersection of the two ranges and,
furthermore, splitting the LEFT range into sub-ranges that match up with
the sub-ranges found on the right side.  From the example above this seems
like it should be acting on self-normalized ranges - but I may be missing
something by evaluating this out of context.

r1 [1, 6] [ts, te] [time period start, time period end]
s1 [2, 3]
s2 [3, 4]
s3 [5, 7]

r LEFT JOIN s ON (r.ts < s.te AND r.te > s.ts)

r1[1, 6],s1[2, 3] => [max(r.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[2, 3]
r1[1, 6],s2[3, 4] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[3, 4]
r1[1, 6],s3[5, 7] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[5, 6]

Thus the intersection is [2,6] but since s1 has three ranges that begin
between 2 and 6 (i.e., 2, 3, and 5) there are three output records that
correspond to those sub-ranges.

The description in the OP basically distinguishes between NORMALIZE and
ALIGN in that ALIGN, as described above, affects an INTERSECTION on the two
ranges - discarding the non-overlapping data - while NORMALIZE performs the
alignment while also retaining the non-overlapping data.

The rest of the syntax seems to deal with selecting subsets of range
records based upon attribute data.

David J.


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-15 Thread Robert Haas
On Tue, Jan 24, 2017 at 4:32 AM, Peter Moser  wrote:
>> Using common terms such as ALIGN and NORMALIZE for such a specific
>> functionality seems a bit wrong.
>
> Would ALIGN RANGES/RANGE ALIGN and NORMALIZE RANGES/RANGE NORMALIZE be better
> options? We are also thankful for any suggestion or comments about the syntax.

So it seems like an ALIGN or NORMALIZE option is kind of like a JOIN,
except apparently there's no join type and the optimizer can never
reorder these operations with each other or with other joins.  Is that
right?  The optimizer changes in this patch seem fairly minimal, so
I'm guessing it can't be doing anything very complex here.

What happens if you perform the ALIGN or NORMALIZE operation using
something other than an equality operator, like, say, less-than?  Or
an arbitrary user-defined operator.

There's no documentation in this patch.  I'm not sure you want to go
to the trouble of writing SGML documentation until this has been
reviewed enough that it has a real chance of getting committed, but on
the other hand we're obviously all struggling to understand what it
does, so I think if not SGML documentation it at least needs a real
clear explanation of what the syntax is and does in a README or
something, even just for initial review.

We don't have anything quite like this in PostgreSQL today.  An
ordinary join just matches up things in relation A and relation B and
outputs the matching rows, and something like a SRF takes a set of
input rows and returns a set of output rows.  This is a hybrid - it
takes in two sets of rows, one from each relation being joined, and
produces a derived set of output rows that takes into account both
inputs.

+ * INPUT:
+ * (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c
+ * where q can be any join qualifier, and r.ts, r.te, s.ts, and s.t
e
+ * can be any column name.
+ *
+ * OUTPUT:
+ * (
+ * SELECT r.*, GREATEST(r.ts, s.ts) P1, LEAST(r.te, s.te) P2
+ *  FROM
+ *  (
+ * SELECT *, row_id() OVER () rn FROM r
+ *  ) r
+ *  LEFT OUTER JOIN
+ *  s
+ *  ON q AND r.ts < s.te AND r.te > s.ts
+ *  ORDER BY rn, P1, P2
+ *  ) c

It's hard to see what's going on here.  What's ts?  What's te?  If you
used longer names for these things, it might be a bit more
self-documenting.

If we are going to transform an ALIGN operator in to a left outer
join, why do we also have an executor node for it?

+   fcLowerLarg = makeFuncCall(SystemFuncName("lower"),
+
list_make1(crLargTs),
+
UNKNOWN_LOCATION);
+   fcLowerRarg = makeFuncCall(SystemFuncName("lower"),
+
list_make1(crRargTs),
+
UNKNOWN_LOCATION);
+   fcUpperLarg = makeFuncCall(SystemFuncName("upper"),
+
list_make1(crLargTs),
+
UNKNOWN_LOCATION);
+   fcUpperRarg = makeFuncCall(SystemFuncName("upper"),
+
list_make1(crRargTs),
+
UNKNOWN_LOCATION);

Why is a temporal operator calling functions that upper-case and
lower-case strings?  In one sense this whole function (and much of the
nearby code) is very straightforward code and you can see exactly why
it's doing it.  In another sense it's totally inscrutable: WHY is it
doing any of that stuff?

-   char   *strategy;   /* partitioning strategy
('list' or 'range') */
-   List   *partParams; /* List of PartitionElems */
-   int location;   /* token
location, or -1 if unknown */
+   char   *strategy;   /* partitioning strategy ('list' or 'range') */
+   List   *partParams; /* List of PartitionElems */
+   int location;   /* token location, or
-1 if unknown */

I think this is some kind of mistake on your end while generating the
patch.  It looks like you patched one version of the source code, and
diffed against another.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-15 Thread Robert Haas
On Tue, Jan 24, 2017 at 4:32 AM, Peter Moser  wrote:
> NORMALIZE: splits all the ranges of one relation according to all the range
> boundaries of another (but possibly the same) relation whenever some equality
> condition over some given attributes is satisfied.
>
> When the two relations are the same, all ranges with the given equal 
> attributes
> are either equal or disjoint. After this, the traditional GROUP BY or DISTINCT
> can be applied. The attributes given to NORMALIZE for a (temporal) GROUP BY 
> are
> the grouping attributes and for a (temporal) DISTINCT the target list
> attributes.
>
> When the two relations are different, but they each contain disjoint ranges
> for the same attributes (as the current limitation for the set operations is)
> we perform a symmetric NORMALIZE on each of them. Then we have a similar
> situation as before, i.e., in both relations ranges with the same attributes
> are either equal or disjoint and a traditional set operation
> (EXCEPT/INTERSECT/UNION) can be applied. The attributes given to NORMALIZE for
> a (temporal) EXCEPT/INTERSECT/UNION are the target list attributes.
>
>
> ALIGN: splits all the ranges of one relation according to all the range
> intersections of another relation, i.e., it produces all intersections and
> non-overlapping parts, whenever some condition is satisfied.
>
> We perform a symmetric ALIGN on each relation, after which a traditional inner
> or outer join can be applied using equality on the ranges to calculate the
> overlap. The condition given to a (temporal) inner or outer join is the
> join condition without overlap.

I don't quite understand the difference between NORMALIZE and ALIGN.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-06 Thread Peter Eisentraut
On 2/2/17 12:43 PM, Peter Moser wrote:
> Hereby, we used the following commands to create both patches:
> git diff --no-prefix origin/master -- src/ ':!src/test/*' >
> tpg_primitives_out_v4.patch
> 
> git diff --no-prefix origin/master -- src/test/ >
> tpg_primitives_out_tests_v2.patch
> 
> We have also tested our patches on the current HEAD with the command:
> patch -p0 < patch-file
> 
> Both worked without problems or warnings on our Linux machine.
> Could you please explain, which problems occurred while you tried to
> apply our patches?

Your patches apply OK for me.

In the future, just use git diff without any options or git
format-patch, and put both the code and the tests in one patch.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-01-31 Thread Michael Paquier
On Tue, Jan 24, 2017 at 6:32 PM, Peter Moser  wrote:
> [reviews and discussions]

The patch proposed has rotten. Please provide a rebase. By the way, I
am having a hard time applying your patches with patch or any other
methods... I am moving it to CF 2017-03 because of the lack of
reviews.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-01-24 Thread Peter Moser
2017-01-18 3:57 GMT+01:00 Peter Eisentraut :
>
> On 1/13/17 9:22 AM, Peter Moser wrote:
> > The goal of temporal aligners and normalizers is to split ranges to allow a
> > reduction from temporal queries to their non-temporal counterparts.
> > Splitting
> > ranges is necessary for temporal query processing. Temporal aligners and
> > normalizer may then be used as building-blocks for any temporal query
> > construct.
>
> I would need to see the exact definitions of these constructs.  Please
> send some documentation.
>
> > We have published two papers, that contain formal definitions and
> > related work
> > for the temporal aligner and normalizer. Please see [1] and [2].
>
> I don't have access to those.

The papers can be freely downloaded from
http://www.inf.unibz.it/~dignoes/publications.html using the "Author-ize link".

> >> I think there are probably many interesting applications for normalizing
> >> or otherwise adjusting ranges.  I'd like to see an overview and
> >> consideration of other applications.
> >
> > Please see the attached file adjustment.sql for some interesting
> > applications.
>
> That's surely interesting, but without knowing what these operations are
> supposed to do, I can only reverse engineer and guess.

Intuitively what they do is as follows:


NORMALIZE: splits all the ranges of one relation according to all the range
boundaries of another (but possibly the same) relation whenever some equality
condition over some given attributes is satisfied.

When the two relations are the same, all ranges with the given equal attributes
are either equal or disjoint. After this, the traditional GROUP BY or DISTINCT
can be applied. The attributes given to NORMALIZE for a (temporal) GROUP BY are
the grouping attributes and for a (temporal) DISTINCT the target list
attributes.

When the two relations are different, but they each contain disjoint ranges
for the same attributes (as the current limitation for the set operations is)
we perform a symmetric NORMALIZE on each of them. Then we have a similar
situation as before, i.e., in both relations ranges with the same attributes
are either equal or disjoint and a traditional set operation
(EXCEPT/INTERSECT/UNION) can be applied. The attributes given to NORMALIZE for
a (temporal) EXCEPT/INTERSECT/UNION are the target list attributes.


ALIGN: splits all the ranges of one relation according to all the range
intersections of another relation, i.e., it produces all intersections and
non-overlapping parts, whenever some condition is satisfied.

We perform a symmetric ALIGN on each relation, after which a traditional inner
or outer join can be applied using equality on the ranges to calculate the
overlap. The condition given to a (temporal) inner or outer join is the
join condition without overlap.

> >> Ideally, I'd like to see these things implemented as some kind of
> >> user-space construct, like an operator or function.  I think we'd need a
> >> clearer definition of what it is they do before we can evaluate that.
> >
> > Can you please explain what you mean by "user-space construct" in this case.
>
> Implement them using the extensibility features, such as a user-defined
> operator.  I don't know if it's possible, but it's something to consider.

We experimented with user-defined operators and set-returning functions. The
main issue with these is that ALIGN and NORMALIZE rely on comparison and
processing of one tuple with many tuples at a time that is not possible with
operators, and for set-returning functions there is no clean way of passing
tables and/or conditions.

> Using common terms such as ALIGN and NORMALIZE for such a specific
> functionality seems a bit wrong.

Would ALIGN RANGES/RANGE ALIGN and NORMALIZE RANGES/RANGE NORMALIZE be better
options? We are also thankful for any suggestion or comments about the syntax.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-01-17 Thread Peter Eisentraut
On 1/13/17 9:22 AM, Peter Moser wrote:
> The goal of temporal aligners and normalizers is to split ranges to allow a
> reduction from temporal queries to their non-temporal counterparts.
> Splitting
> ranges is necessary for temporal query processing. Temporal aligners and
> normalizer may then be used as building-blocks for any temporal query
> construct.

I would need to see the exact definitions of these constructs.  Please
send some documentation.

> We have published two papers, that contain formal definitions and
> related work
> for the temporal aligner and normalizer. Please see [1] and [2].

I don't have access to those.

>> I think there are probably many interesting applications for normalizing
>> or otherwise adjusting ranges.  I'd like to see an overview and
>> consideration of other applications.
> 
> Please see the attached file adjustment.sql for some interesting
> applications.

That's surely interesting, but without knowing what these operations are
supposed to do, I can only reverse engineer and guess.

>> Ideally, I'd like to see these things implemented as some kind of
>> user-space construct, like an operator or function.  I think we'd need a
>> clearer definition of what it is they do before we can evaluate that.
> 
> Can you please explain what you mean by "user-space construct" in this case.

Implement them using the extensibility features, such as a user-defined
operator.  I don't know if it's possible, but it's something to consider.

Using common terms such as ALIGN and NORMALIZE for such a specific
functionality seems a bit wrong.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-01-13 Thread Peter Moser
> What this patch does is to add two new clauses for FROM-list items,
> NORMALIZE and ALIGN, which reshuffle a set of ranges into a new list
> that can then be aggregated more easily.  From the original message:
>
> > For NORMALIZE the tuples' ranges need to be split into all sub-ranges
> > according to all matching ranges of the second relation. For this we
> > create a subquery that first joins one relation with the range
> > boundaries of the other and then sorts the result. The executor
> > function splits the ranges in a sweep-line based manner.
> >
> > For ALIGN the tuples' ranges must be split into all intersections and
> > differences with the other relation according to the join condition.
> > For this we create a subquery that first joins the two relations and
> > then sorts the result. The executor function splits the ranges
> > accordingly in a sweep-line based manner.
>
> So there isn't really temporal query processing as such here, only some
> helpers that can make it easier.

The goal of temporal aligners and normalizers is to split ranges to allow a
reduction from temporal queries to their non-temporal counterparts.
Splitting
ranges is necessary for temporal query processing. Temporal aligners and
normalizer may then be used as building-blocks for any temporal query
construct.

> I can see how those operations can be useful, but it would help if there
> were a more formal definition to be able to check that further.

We have published two papers, that contain formal definitions and related
work
for the temporal aligner and normalizer. Please see [1] and [2].

> What I'm missing here is some references: existing implementations,
> standards, documentation, research papers, alternative ideas, rejected
> alternatives, etc.

A good overview of existing implementations in DBMSs, SQL standard, and
history
is given in [3].

> Also, the submission is missing documentation and test cases.  There are
> technical terms used in the code that I don't understand.

We added a second patch with test cases and expected results. We are now
writing the documentation in sgml-format.

> I think there are probably many interesting applications for normalizing
> or otherwise adjusting ranges.  I'd like to see an overview and
> consideration of other applications.

Please see the attached file adjustment.sql for some interesting
applications.

> Ideally, I'd like to see these things implemented as some kind of
> user-space construct, like an operator or function.  I think we'd need a
> clearer definition of what it is they do before we can evaluate that.

Can you please explain what you mean by "user-space construct" in this case.



Best regards,
Anton, Johann, Michael, Peter


[1] Anton Dignös, Michael H. Böhlen, Johann Gamper:
Temporal alignment. SIGMOD Conference 2012: 433-444
http://doi.acm.org/10.1145/2213836.2213886
[2] Anton Dignös, Michael H. Böhlen, Johann Gamper, Christian S. Jensen:
Extending the Kernel of a Relational DBMS with Comprehensive Support for
Sequenced Temporal Queries. ACM Trans. Database Syst. 41(4): 26:1-26:46
(2016)
http://doi.acm.org/10.1145/2967608
[3] https://www2.cs.arizona.edu/people/rts/sql3.html and
https://www2.cs.arizona.edu/people/rts/tsql2.html


adjustment.sql
Description: application/sql
diff --git src/test/regress/expected/temporal_primitives.out src/test/regress/expected/temporal_primitives.out
new file mode 100644
index 000..6e4cc0d
--- /dev/null
+++ src/test/regress/expected/temporal_primitives.out
@@ -0,0 +1,841 @@
+--
+-- TEMPORAL PRIMITIVES
+--
+SET datestyle TO ymd;
+CREATE COLLATION "de_DE.utf8" (LC_COLLATE = "de_DE.utf8",
+   LC_CTYPE = "de_DE.utf8" );
+CREATE TEMP TABLE tpg_table1 (a char, b char, ts int, te int);
+CREATE TEMP TABLE tpg_table2 (c int, d char, ts int, te int);
+INSERT INTO tpg_table1 VALUES
+('a','B',1,7),
+('b','B',3,9),
+('c','G',8,10);
+INSERT INTO tpg_table2 VALUES
+(1,'B',2,5),
+(2,'B',3,4),
+(3,'B',7,9);
+-- VALID TIME columns (i.e., ts and te) are no longer at the end of the
+-- targetlist.
+CREATE TEMP TABLE tpg_table3 AS
+	SELECT a, ts, te, b FROM tpg_table1;
+CREATE TEMP TABLE tpg_table4 AS
+	SELECT c, ts, d, te FROM tpg_table2;
+-- VALID TIME columns represented as range type
+CREATE TEMP TABLE tpg_table5 AS
+	SELECT int4range(ts, te) t, a, b FROM tpg_table1;
+CREATE TEMP TABLE tpg_table6 AS
+	SELECT int4range(ts, te) t, c a, d b FROM tpg_table2;
+-- VALID TIME columns as VARCHARs
+CREATE TEMP TABLE tpg_table7 (a int, ts varchar, te varchar);
+CREATE TEMP TABLE tpg_table8 (a int,
+			  ts varchar COLLATE "de_DE.utf8",
+			  te varchar COLLATE "POSIX");
+INSERT INTO tpg_table7 VALUES
+(0, 'A', 'D'),
+(1, 'C', 'X'),
+(0, 'ABC', 'BCD'),
+(0, 'xABC', 'xBCD'),
+(0, 'BAA', 'BBB');
+INSERT INTO tpg_table8 VALUES
+(0, 'A', 'D'),
+(1, 'C', 'X');
+-- Tables to check different data types, and corner cases
+CREATE TEMP TABLE tpg_table9 (a int, ts timestamp, te timestamp);
+CREATE TEMP TABLE tpg_table10 

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-01-04 Thread Peter Eisentraut
So I'm looking at this patch in the commit fest.  I have only a general
understanding of temporal query processing.

What this patch does is to add two new clauses for FROM-list items,
NORMALIZE and ALIGN, which reshuffle a set of ranges into a new list
that can then be aggregated more easily.  From the original message:

> For NORMALIZE the tuples' ranges need to be split into all sub-ranges
> according to all matching ranges of the second relation. For this we
> create a subquery that first joins one relation with the range
> boundaries of the other and then sorts the result. The executor
> function splits the ranges in a sweep-line based manner.
>
> For ALIGN the tuples' ranges must be split into all intersections and
> differences with the other relation according to the join condition.
> For this we create a subquery that first joins the two relations and
> then sorts the result. The executor function splits the ranges
> accordingly in a sweep-line based manner.

So there isn't really temporal query processing as such here, only some
helpers that can make it easier.

I can see how those operations can be useful, but it would help if there
were a more formal definition to be able to check that further.

What I'm missing here is some references: existing implementations,
standards, documentation, research papers, alternative ideas, rejected
alternatives, etc.

Also, the submission is missing documentation and test cases.  There are
technical terms used in the code that I don't understand.

I think there are probably many interesting applications for normalizing
or otherwise adjusting ranges.  I'd like to see an overview and
consideration of other applications.

Ideally, I'd like to see these things implemented as some kind of
user-space construct, like an operator or function.  I think we'd need a
clearer definition of what it is they do before we can evaluate that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-12-16 Thread Peter Moser

Am 16.12.2016 um 07:17 schrieb David Fetter:

On Wed, Dec 07, 2016 at 03:57:33PM +0100, Peter Moser wrote:

Am 05.12.2016 um 06:11 schrieb Haribabu Kommi:



On Tue, Oct 25, 2016 at 8:44 PM, Peter Moser > wrote:


We decided to follow your recommendation and add the patch to the
commitfest.


Path is not applying properly to HEAD.
Moved to next CF with "waiting on author" status.



We updated our patch. We tested it with the latest
commit dfe530a09226a9de80f2b4c3d5f667bf51481c49.


This looks neat, but it no longer applies to master.  Is a rebase in
the offing?


We rebased our patch on top of HEAD, that is, commit 
93513d1b6559b2d0805f0b02d312ee550e3d010b.



Best regards,
Anton, Johann, Michael, Peter
diff --git src/backend/commands/explain.c src/backend/commands/explain.c
index 0a669d9..09406d4 100644
--- src/backend/commands/explain.c
+++ src/backend/commands/explain.c
@@ -875,6 +875,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		case T_SeqScan:
 			pname = sname = "Seq Scan";
 			break;
+		case T_TemporalAdjustment:
+			if(((TemporalAdjustment *) plan)->temporalCl->temporalType == TEMPORAL_TYPE_ALIGNER)
+pname = sname = "Adjustment(for ALIGN)";
+			else
+pname = sname = "Adjustment(for NORMALIZE)";
+			break;
 		case T_SampleScan:
 			pname = sname = "Sample Scan";
 			break;
diff --git src/backend/executor/Makefile src/backend/executor/Makefile
index 51edd4c..42801d3 100644
--- src/backend/executor/Makefile
+++ src/backend/executor/Makefile
@@ -25,6 +25,8 @@ OBJS = execAmi.o execCurrent.o execGrouping.o execIndexing.o execJunk.o \
nodeSamplescan.o nodeSeqscan.o nodeSetOp.o nodeSort.o nodeUnique.o \
nodeValuesscan.o nodeCtescan.o nodeWorktablescan.o \
nodeGroup.o nodeSubplan.o nodeSubqueryscan.o nodeTidscan.o \
-   nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o
+   nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o \
+   nodeTemporalAdjustment.o
+
 
 include $(top_srcdir)/src/backend/common.mk
diff --git src/backend/executor/execProcnode.c src/backend/executor/execProcnode.c
index 554244f..610d753 100644
--- src/backend/executor/execProcnode.c
+++ src/backend/executor/execProcnode.c
@@ -114,6 +114,7 @@
 #include "executor/nodeValuesscan.h"
 #include "executor/nodeWindowAgg.h"
 #include "executor/nodeWorktablescan.h"
+#include "executor/nodeTemporalAdjustment.h"
 #include "nodes/nodeFuncs.h"
 #include "miscadmin.h"
 
@@ -334,6 +335,11 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
  estate, eflags);
 			break;
 
+		case T_TemporalAdjustment:
+			result = (PlanState *) ExecInitTemporalAdjustment((TemporalAdjustment *) node,
+ estate, eflags);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;		/* keep compiler quiet */
@@ -531,6 +537,10 @@ ExecProcNode(PlanState *node)
 			result = ExecLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			result = ExecTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;
@@ -779,6 +789,10 @@ ExecEndNode(PlanState *node)
 			ExecEndLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			ExecEndTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
@@ -812,3 +826,4 @@ ExecShutdownNode(PlanState *node)
 
 	return planstate_tree_walker(node, ExecShutdownNode, NULL);
 }
+
diff --git src/backend/executor/nodeTemporalAdjustment.c src/backend/executor/nodeTemporalAdjustment.c
new file mode 100644
index 000..95d58a3
--- /dev/null
+++ src/backend/executor/nodeTemporalAdjustment.c
@@ -0,0 +1,537 @@
+#include "postgres.h"
+#include "executor/executor.h"
+#include "executor/nodeTemporalAdjustment.h"
+#include "utils/memutils.h"
+#include "access/htup_details.h"/* for heap_getattr */
+#include "utils/lsyscache.h"
+#include "nodes/print.h"		/* for print_slot */
+#include "utils/datum.h"		/* for datumCopy */
+
+/*
+ * #define TEMPORAL_DEBUG
+ * XXX PEMOSER Maybe we should use execdebug.h stuff here?
+ */
+#ifdef TEMPORAL_DEBUG
+static char*
+datumToString(Oid typeinfo, Datum attr)
+{
+	Oid			typoutput;
+	bool		typisvarlena;
+	getTypeOutputInfo(typeinfo, , );
+	return OidOutputFunctionCall(typoutput, attr);
+}
+
+#define TPGdebug(...) 	{ printf(__VA_ARGS__); printf("\n"); fflush(stdout); }
+#define TPGdebugDatum(attr, typeinfo) 	TPGdebug("%s = %s %ld\n", #attr, datumToString(typeinfo, attr), attr)
+#define TPGdebugSlot(slot) { printf("Printing Slot '%s'\n", #slot); print_slot(slot); fflush(stdout); }
+
+#else
+#define datumToString(typeinfo, attr)
+#define TPGdebug(...)
+#define TPGdebugDatum(attr, typeinfo)
+#define TPGdebugSlot(slot)
+#endif
+
+/*
+ * isLessThan

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-12-15 Thread David Fetter
On Wed, Dec 07, 2016 at 03:57:33PM +0100, Peter Moser wrote:
> Am 05.12.2016 um 06:11 schrieb Haribabu Kommi:
> > 
> > 
> > On Tue, Oct 25, 2016 at 8:44 PM, Peter Moser  > > wrote:
> > 
> > 
> > We decided to follow your recommendation and add the patch to the
> > commitfest.
> > 
> > 
> > Path is not applying properly to HEAD.
> > Moved to next CF with "waiting on author" status.
> > 
> 
> We updated our patch. We tested it with the latest
> commit dfe530a09226a9de80f2b4c3d5f667bf51481c49.

This looks neat, but it no longer applies to master.  Is a rebase in
the offing?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-12-07 Thread Peter Moser

Am 05.12.2016 um 06:11 schrieb Haribabu Kommi:



On Tue, Oct 25, 2016 at 8:44 PM, Peter Moser > wrote:


We decided to follow your recommendation and add the patch to the
commitfest.


Path is not applying properly to HEAD.
Moved to next CF with "waiting on author" status.



We updated our patch. We tested it with the latest
commit dfe530a09226a9de80f2b4c3d5f667bf51481c49.



Regards,
Hari Babu
Fujitsu Australia


Best regards,
Anton, Johann, Michael, Peter
diff --git src/backend/commands/explain.c src/backend/commands/explain.c
index 0a669d9..09406d4 100644
--- src/backend/commands/explain.c
+++ src/backend/commands/explain.c
@@ -875,6 +875,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		case T_SeqScan:
 			pname = sname = "Seq Scan";
 			break;
+		case T_TemporalAdjustment:
+			if(((TemporalAdjustment *) plan)->temporalCl->temporalType == TEMPORAL_TYPE_ALIGNER)
+pname = sname = "Adjustment(for ALIGN)";
+			else
+pname = sname = "Adjustment(for NORMALIZE)";
+			break;
 		case T_SampleScan:
 			pname = sname = "Sample Scan";
 			break;
diff --git src/backend/executor/Makefile src/backend/executor/Makefile
index 51edd4c..42801d3 100644
--- src/backend/executor/Makefile
+++ src/backend/executor/Makefile
@@ -25,6 +25,8 @@ OBJS = execAmi.o execCurrent.o execGrouping.o execIndexing.o execJunk.o \
nodeSamplescan.o nodeSeqscan.o nodeSetOp.o nodeSort.o nodeUnique.o \
nodeValuesscan.o nodeCtescan.o nodeWorktablescan.o \
nodeGroup.o nodeSubplan.o nodeSubqueryscan.o nodeTidscan.o \
-   nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o
+   nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o \
+   nodeTemporalAdjustment.o
+
 
 include $(top_srcdir)/src/backend/common.mk
diff --git src/backend/executor/execProcnode.c src/backend/executor/execProcnode.c
index 554244f..610d753 100644
--- src/backend/executor/execProcnode.c
+++ src/backend/executor/execProcnode.c
@@ -114,6 +114,7 @@
 #include "executor/nodeValuesscan.h"
 #include "executor/nodeWindowAgg.h"
 #include "executor/nodeWorktablescan.h"
+#include "executor/nodeTemporalAdjustment.h"
 #include "nodes/nodeFuncs.h"
 #include "miscadmin.h"
 
@@ -334,6 +335,11 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
  estate, eflags);
 			break;
 
+		case T_TemporalAdjustment:
+			result = (PlanState *) ExecInitTemporalAdjustment((TemporalAdjustment *) node,
+ estate, eflags);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;		/* keep compiler quiet */
@@ -531,6 +537,10 @@ ExecProcNode(PlanState *node)
 			result = ExecLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			result = ExecTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;
@@ -779,6 +789,10 @@ ExecEndNode(PlanState *node)
 			ExecEndLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			ExecEndTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
@@ -812,3 +826,4 @@ ExecShutdownNode(PlanState *node)
 
 	return planstate_tree_walker(node, ExecShutdownNode, NULL);
 }
+
diff --git src/backend/executor/nodeTemporalAdjustment.c src/backend/executor/nodeTemporalAdjustment.c
new file mode 100644
index 000..e45ec03
--- /dev/null
+++ src/backend/executor/nodeTemporalAdjustment.c
@@ -0,0 +1,528 @@
+#include "postgres.h"
+#include "executor/executor.h"
+#include "executor/nodeTemporalAdjustment.h"
+#include "utils/memutils.h"
+#include "access/htup_details.h"/* for heap_getattr */
+#include "utils/lsyscache.h"
+#include "nodes/print.h"		/* for print_slot */
+#include "utils/datum.h"		/* for datumCopy */
+
+/*
+ * #define TEMPORAL_DEBUG
+ * XXX PEMOSER Maybe we should use execdebug.h stuff here?
+ */
+#ifdef TEMPORAL_DEBUG
+static char*
+datumToString(Oid typeinfo, Datum attr)
+{
+	Oid			typoutput;
+	bool		typisvarlena;
+	getTypeOutputInfo(typeinfo, , );
+	return OidOutputFunctionCall(typoutput, attr);
+}
+
+#define TPGdebug(...) 	{ printf(__VA_ARGS__); printf("\n"); fflush(stdout); }
+#define TPGdebugDatum(attr, typeinfo) 	TPGdebug("%s = %s %ld\n", #attr, datumToString(typeinfo, attr), attr)
+#define TPGdebugSlot(slot) { printf("Printing Slot '%s'\n", #slot); print_slot(slot); fflush(stdout); }
+
+#else
+#define datumToString(typeinfo, attr)
+#define TPGdebug(...)
+#define TPGdebugDatum(attr, typeinfo)
+#define TPGdebugSlot(slot)
+#endif
+
+/*
+ * isLessThan
+ *		We must check if the sweepline is before a timepoint, or if a timepoint
+ *		is smaller than another. We initialize the function call info during
+ *		ExecInit phase.
+ */
+static bool
+isLessThan(Datum a, Datum b, TemporalAdjustmentState* node)

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-12-04 Thread Haribabu Kommi
On Tue, Oct 25, 2016 at 8:44 PM, Peter Moser  wrote:

>
> We decided to follow your recommendation and add the patch to the
> commitfest.
>
>
Path is not applying properly to HEAD.
Moved to next CF with "waiting on author" status.


Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-10-25 Thread Peter Moser

On 27.07.2016 at 16:09 Robert Haas wrote:

On Fri, Jul 22, 2016 at 7:15 AM, Anton Dignös  wrote:

We would like to contribute to PostgreSQL a solution that supports the query
processing of "at each time point". The basic idea is to offer two new
operators, NORMALIZE and ALIGN, whose purpose is to adjust (or split) the
ranges of tuples so that subsequent queries can use the usual grouping and
equality conditions to get the intended results.


I think that it is great that you want to contribute your work to
PostgreSQL.  I don't know whether there will be a consensus that this
is generally useful functionality that we should accept, but I commend
the effort anyhow.  Assuming there is, getting this into a state that
we consider committable will probably take quite a bit of additional
work on your part; no one will do it for you.



Hi hackers,

thank you for your feedback.

We are aware that contributing to PostgreSQL is a long way with a lot
of work.  We are committed to go all the way and do the work as
discussed in the community.

We had some internal discussions about the project, looking also at
some other patches to better understand whether the patch is 
work-in-progress or ready for commitfest.




If you're still
interested in proceeding given those caveats, please add your patch
here so that it gets reviewed:

https://commitfest.postgresql.org/action/commitfest_view/open



We decided to follow your recommendation and add the patch to the 
commitfest.



Looking forward for your feedback,
Anton, Johann, Michael, Peter


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-07-27 Thread Robert Haas
On Fri, Jul 22, 2016 at 7:15 AM, Anton Dignös  wrote:
> We would like to contribute to PostgreSQL a solution that supports the query
> processing of "at each time point". The basic idea is to offer two new
> operators, NORMALIZE and ALIGN, whose purpose is to adjust (or split) the
> ranges of tuples so that subsequent queries can use the usual grouping and
> equality conditions to get the intended results.

I think that it is great that you want to contribute your work to
PostgreSQL.  I don't know whether there will be a consensus that this
is generally useful functionality that we should accept, but I commend
the effort anyhow.  Assuming there is, getting this into a state that
we consider committable will probably take quite a bit of additional
work on your part; no one will do it for you.  If you're still
interested in proceeding given those caveats, please add your patch
here so that it gets reviewed:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-07-23 Thread Anton Dignös
On Sat, Jul 23, 2016 at 12:01 AM, David Fetter  wrote:
> On Fri, Jul 22, 2016 at 01:15:17PM +0200, Anton Dignös wrote:
>> Hi hackers,
>>
>> we are a group of researches that work on temporal databases.  Our
>> main focus is the processing of data with time intervals, such as
>> the range types in PostgreSQL.
>
> Thanks for your hard work so far!
>
> [Explanation and examples elided]
>
> To what extent, if any, are you attempting to follow the SQL:2011
> standard?
>
> http://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf

The querying in the SQL:2011 standard is based on simple SQL range restrictions
and period predicates (OVERLAP, PRECEDES, FOR SYSTEM_TIME AS OF, etc) that
functionality-wise in PostgreSQL are already covered by the operators and
functions on range types.

Operations such as aggregation, outer joins, set-operations on ranges
(mentioned in
Section 2.5 "Future directions" in the above paper) are not yet part of the
standard. These are the operations that require the adjustment (or splitting) of
ranges.

Best,

Anton


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-07-22 Thread David Fetter
On Fri, Jul 22, 2016 at 01:15:17PM +0200, Anton Dignös wrote:
> Hi hackers,
> 
> we are a group of researches that work on temporal databases.  Our
> main focus is the processing of data with time intervals, such as
> the range types in PostgreSQL.

Thanks for your hard work so far!

[Explanation and examples elided]

To what extent, if any, are you attempting to follow the SQL:2011
standard?

http://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers