Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-09-22 Thread Peter Moser
2017-09-22 10:21 GMT+02:00 Pavel Stehule :
> Currently Postgres has zero support for SQL:2011 temporal tables. Isn't
> better start with already standard features than appends some without
> standard? The standard has some concept and if we start out of this concept,
> then the result will be far to standard probably.

We will focus for now on the Range Merge Join algorithm by Jeff Davis,
which implements a temporal join with overlap predicates.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-09-22 Thread Peter Moser
2017-09-22 10:06 GMT+02:00 Pavel Stehule :
> ANSI SQL 2011 has temporal data support
>
> https://www.slideshare.net/CraigBaumunk/temporal-extensions-tosql20112012010438

As operations it only supports temporal inner joins using the overlap predicate.
Temporal aggregation, temporal outer joins, temporal duplicate
elimination, and temporal set operations are not supported in
SQL:2011.
Please see [1] Section 2.5 Future directions.

Best regards,
Anton, Johann, Michael, Peter


[1] https://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-09-22 Thread Peter Moser
2017-09-12 16:33 GMT+02:00 Simon Riggs :
> PostgreSQL tries really very hard to implement the SQL Standard and
> just the standard. ISTM that the feedback you should have been given
> is that this is very interesting but will not be committed in its
> current form; I am surprised to see nobody has said that, though you
> can see the truth of that since nobody is actively looking to review
> or commit this. Obviously if the standard were changed to support
> these things we'd suddenly be interested...

Ok, we understand that PostgreSQL wants to strictly follow the SQL
standard, which is not yet defined for temporal databases. In this
context we understand your comment and agree on your position.

Our approach with the two temporal primitives is more far-reaching and
comprehensive: it supports all operators of a temporal relational
algebra by systematically transforming the temporal operators to the
nontemporal counterparts, thereby taking advantage of all features of
the underlying DBMS. This requires of course also new syntax.

> What I think I'm lacking is a clear statement of why we need to have
> new syntax to solve the problem ...

The newly introduced syntax of the two primitives comes from our
decision to divide a bigger patch into two parts: an primitives-only
patch, and a temporal query rewrite patch. We thought of discussing
and providing a second patch in the future, which would then
automatically rewrite temporal queries into their non-temporal
counterparts using these two primitives.

> ... and why the problem itself is
> important.

The main idea about these temporal primitives is to have only two new
operators to provide the whole range of temporal queries, that is,
temporal joins, temporal set operations, temporal duplicate
elimination, and temporal aggregation. The benefit of the primitives
is that it is minimal invasive to the postgres kernel due to the reuse of all
standard operators after the temporal split (or normalization). It can
therefore also use existing optimizations already implemented.

An alternative approach would be to implement for each operator a
separate algorithm.  For instance, Jeff Davis is implementing a temporal
join into the existing Merge Join Executor (see [1]). Note that a
temporal join is the only operator that can be implemented without
introducing new syntax due to the overlap predicate.  For all other
temporal operators a discussion about new syntax is necessary anyway,
independent of the implementation approach.

> PostgreSQL supports the ability to produce Set Returning Functions and
> various other extensions. Would it be possible to change this so that
> we don't add new syntax to the parser but rather we do this as a set
> of functions?

Set Returning Functions would indeed be a possibility to implement
temporal query processing without new syntax, though it has some serious
drawbacks: the user has to specify the schema of the query results; the
performance might be a problem, since functions are treated as
black-boxes for the optimizer, loosing selection-pushdown and similar
optimizations.

> An alternative might be for us to implement a pluggable parser, so
> that you can have an "alternate syntax" plugin. If we did that, you
> can then maintain the plugin outside of core until the time when SQL
> Standard is updated and we can implement directly. We already support
> the ability to invent new plan nodes, so this could be considered as
> part of the plugin.

This is an interesting idea to look into when it is ready some day.



With all these clarifications in mind, we thought to focus meanwhile
on improving the performance of temporal query processing in special
cases (eg., joins), similar to the pending Range Merge Join patch by
Jeff Davis in [1]. Hereby, we like to contribute to it as reviewers
and hopefully add some improvements or valuable ideas from our
research area.


Best regards,
Anton, Johann, Michael, Peter


[1] 
https://www.postgresql.org/message-id/flat/camp0ubfwaffw3o_ngkqprpmm56m4wetexjprb2gp_nrdaec...@mail.gmail.com#camp0ubfwaffw3o_ngkqprpmm56m4wetexjprb2gp_nrdaec...@mail.gmail.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-07-31 Thread Peter Moser

On 06.04.2017 01:24, Andres Freund wrote:

Unfortunately I don't think this patch has received sufficient design
and implementation to consider merging it into v10.  As code freeze is
in two days, I think we'll have to move this to the next commitfest.


We rebased our patch on top of commit 
393d47ed0f5b764341c7733ef60e8442d3e9bdc2

from "Mon Jul 31 11:24:51 2017 +0900".

Best regards,
Anton, Johann, Michael, Peter
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7648201..a373358 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -919,6 +919,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		case T_SeqScan:
 			pname = sname = "Seq Scan";
 			break;
+		case T_TemporalAdjustment:
+			if(((TemporalAdjustment *) plan)->temporalCl->temporalType == TEMPORAL_TYPE_ALIGNER)
+pname = sname = "Adjustment(for ALIGN)";
+			else
+pname = sname = "Adjustment(for NORMALIZE)";
+			break;
 		case T_SampleScan:
 			pname = sname = "Sample Scan";
 			break;
diff --git a/src/backend/executor/Makefile b/src/backend/executor/Makefile
index 083b20f..b0d6d15 100644
--- a/src/backend/executor/Makefile
+++ b/src/backend/executor/Makefile
@@ -29,6 +29,6 @@ OBJS = execAmi.o execCurrent.o execExpr.o execExprInterp.o \
nodeCtescan.o nodeNamedtuplestorescan.o nodeWorktablescan.o \
nodeGroup.o nodeSubplan.o nodeSubqueryscan.o nodeTidscan.o \
nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o \
-   nodeTableFuncscan.o
+   nodeTableFuncscan.o nodeTemporalAdjustment.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 396920c..7dd7474 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -113,6 +113,7 @@
 #include "executor/nodeValuesscan.h"
 #include "executor/nodeWindowAgg.h"
 #include "executor/nodeWorktablescan.h"
+#include "executor/nodeTemporalAdjustment.h"
 #include "nodes/nodeFuncs.h"
 #include "miscadmin.h"
 
@@ -364,6 +365,11 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
  estate, eflags);
 			break;
 
+		case T_TemporalAdjustment:
+			result = (PlanState *) ExecInitTemporalAdjustment((TemporalAdjustment *) node,
+ estate, eflags);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;		/* keep compiler quiet */
@@ -711,6 +717,10 @@ ExecEndNode(PlanState *node)
 			ExecEndLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			ExecEndTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
diff --git a/src/backend/executor/nodeTemporalAdjustment.c b/src/backend/executor/nodeTemporalAdjustment.c
new file mode 100644
index 000..ff2aa85
--- /dev/null
+++ b/src/backend/executor/nodeTemporalAdjustment.c
@@ -0,0 +1,571 @@
+#include "postgres.h"
+#include "executor/executor.h"
+#include "executor/nodeTemporalAdjustment.h"
+#include "utils/memutils.h"
+#include "access/htup_details.h"/* for heap_getattr */
+#include "utils/lsyscache.h"
+#include "nodes/print.h"		/* for print_slot */
+#include "utils/datum.h"		/* for datumCopy */
+#include "utils/rangetypes.h"
+
+/*
+ * #define TEMPORAL_DEBUG
+ * XXX PEMOSER Maybe we should use execdebug.h stuff here?
+ */
+#ifdef TEMPORAL_DEBUG
+static char*
+datumToString(Oid typeinfo, Datum attr)
+{
+	Oid			typoutput;
+	bool		typisvarlena;
+	getTypeOutputInfo(typeinfo, , );
+	return OidOutputFunctionCall(typoutput, attr);
+}
+
+#define TPGdebug(...) 	{ printf(__VA_ARGS__); printf("\n"); fflush(stdout); }
+#define TPGdebugDatum(attr, typeinfo) 	TPGdebug("%s = %s %ld\n", #attr, datumToString(typeinfo, attr), attr)
+#define TPGdebugSlot(slot) { printf("Printing Slot '%s'\n", #slot); print_slot(slot); fflush(stdout); }
+
+#else
+#define datumToString(typeinfo, attr)
+#define TPGdebug(...)
+#define TPGdebugDatum(attr, typeinfo)
+#define TPGdebugSlot(slot)
+#endif
+
+/*
+ * isLessThan
+ *		We must check if the sweepline is before a timepoint, or if a timepoint
+ *		is smaller than another. We initialize the function call info during
+ *		ExecInit phase.
+ */
+static bool
+isLessThan(Datum a, Datum b, TemporalAdjustmentState* node)
+{
+	node->ltFuncCallInfo.arg[0] = a;
+	node->ltFuncCallInfo.arg[1] = b;
+	node->ltFuncCallInfo.argnull[0] = false;
+	node->ltFuncCallInfo.argnull[1] = false;
+
+	/* Return value is never null, due to the pre-defined sub-query output */
+	return DatumGetBool(FunctionCallInvoke(>ltFuncCallInfo));
+}
+
+/*
+ * isEqual
+ *		We must check if two timepoints are equal. We initialize the function
+ *		call info during ExecInit phase.
+ */
+static bool
+isEqual(Datum a, Datum b, TemporalAdjustmentState* node)
+{
+	node->eqFuncCallInfo.arg[0] = a;
+	node->eqFuncCallInfo.arg[1] = b;
+	

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-03-30 Thread Peter Moser
2017-03-01 10:56 GMT+01:00 Peter Moser <pitiz...@gmail.com>:
> A similar walkthrough for ALIGN will follow soon.
>
> We are thankful for any suggestion or ideas, to be used to write a
> good SGML documentation.

The attached README explains the ALIGN operation step-by-step with a
TEMPORAL LEFT OUTER JOIN example. That is, we start from a query
input, show how we rewrite it during parser stage, and show how the
final execution generates result tuples.


Best regards,
Anton, Michael, Johann, Peter

ALIGN


  This is an exemplary walkthrough of a temporal ALIGN operation, from the
  query input, over the query rewrite, until the result tuple outputs during
  execution.



EXAMPLE


  These tables represent employees and projects in a company. An employee works
  for a department over a certain time period. Different projects get developed
  in a department. The start and end time points of each project is stored in
  the period column.

CREATE TABLE emp (empname VARCHAR, dept VARCHAR, period INT4RANGE);
INSERT INTO emp VALUES
('Ann', 'DB', '[1,5)'),
('Bea', 'DB', '[3,8)'),
('Ron', 'AI', '[6,9)');

CREATE TABLE prj (prjname VARCHAR, dept VARCHAR, period INT4RANGE);
INSERT INTO prj VALUES
('PR1', 'DB', '[3,7)'),
('PR2', 'DB', '[1,5)'),
('PR3', 'HW', '[2,8)');


Timeline representation
---

EMPNAME  DEPT  PERIOD
Ann  DB[1,5)    <- Tuple 1, valid from 1 to 5 excl.
Bea  DB[3,8)-   <- Tuple 2
Ron  AI[6,9)   ---  <- Tuple 3
  123456789 <- Timeline

PRJNAME  DEPT  PERIOD
PR1  DB[3,7)<- Tuple 1, valid from 3 to 7 excl.
PR2  DB[1,5)    <- Tuple 2
PR3  HW[2,8)   --   <- Tuple 3
  123456789 <- Timeline


TEMPORAL LEFT OUTER JOIN query
--

  Query: At each time point, to which projects is an  employee assigned, and
 when does an employee not have an assigned project?


WITH emp AS (SELECT period u, * FROM emp),
 prj AS (SELECT period v, * FROM prj)
SELECT empname, prjname, dept, period FROM (
emp ALIGN prj ON emp.dept = prj.dept WITH (emp.period, prj.period)
) emp_aligned
LEFT OUTER JOIN (
prj ALIGN emp ON emp.dept = prj.dept WITH (prj.period, emp.period)
) prj_aligned
USING(dept, period)
WHERE period = u * v OR u IS NULL OR v IS NULL;


Result
--

 empname | prjname | dept | period
-+-+--+
 Ann | PR2 | DB   | [1,5)   
 Ann | PR1 | DB   | [3,5) --
 Bea | PR1 | DB   | [3,7) 
 Bea | PR2 | DB   | [3,5) --
 Bea | | DB   | [7,8) -
 Ron | | AI   | [6,9)---
123456789  <- Timeline



STEP-BY-STEP EXPLANATION


  In this chapter we describe first the rewrite and processing of the two ALIGN
  clauses, that are,

( emp ALIGN prj ON emp.dept = prj.dept WITH (emp.period, prj.period)
) emp_aligned

  ...and...

( prj ALIGN emp ON emp.dept = prj.dept WITH (prj.period, emp.period)
) prj_aligned.

  Secondly, we explain what the above query does in general, and why we need two
  ALIGNs, the WHERE-clause and CTEs (WITH clauses).


ALIGN subquery processing
-

  After receiving the ALIGN query (see above) as input, we rewrite it into
  the following query:

SELECT emp.*, GREATEST(LOWER(emp.period), LOWER(prj.period)) p1,
 LEAST(UPPER(emp.period), UPPER(prj.period)) p2
FROM
(SELECT *, row_id() OVER () rn FROM emp) emp
LEFT OUTER JOIN
prj
ON emp.dept = prj.dept AND emp.period && prj.period
ORDER BY rn, p1, p2;

  Then, we rewrite the second ALIGN subquery analoguously.


Intermediate results: These are the inputs of our ALIGN execution nodes
---
  (See Appendix [1] for details)

  For emp ALIGN prj...

TUPIDempname | dept | period | rn | p1 | p2
-+--++++
  A  Ann | DB   | [1,5)  |  1 |  1 |  5
  B  Ann | DB   | [1,5)  |  1 |  3 |  5
  C  Bea | DB   | [3,8)  |  2 |  3 |  5
  D  Bea | DB   | [3,8)  |  2 |  3 |  7
  E  Ron

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-03-01 Thread Peter Moser
2017-02-15 20:24 GMT+01:00 Robert Haas :
> There's no documentation in this patch.  I'm not sure you want to go
> to the trouble of writing SGML documentation until this has been
> reviewed enough that it has a real chance of getting committed, but on
> the other hand we're obviously all struggling to understand what it
> does, so I think if not SGML documentation it at least needs a real
> clear explanation of what the syntax is and does in a README or
> something, even just for initial review.

The attached README explains the NORMALIZE operation step-by-step with
an example. That is, we start from a query input, show how we rewrite
it during parser stage, and show how the final execution generates
result tuples. A similar walkthrough for ALIGN will follow soon.

We are thankful for any suggestion or ideas, to be used to write a
good SGML documentation.

Best regards,
Anton, Michael, Johann, Peter

NORMALIZE


This is an exemplary walkthrough of a temporal NORMALIZE operation, from the
query input, over the query rewrite, until the result tuple outputs during
execution.



EXAMPLE


Question: How many employees work in a department at each time point?

CREATE TABLE empl (name VARCHAR, dept VARCHAR, time INT4RANGE);
INSERT INTO empl VALUES
('Ann', 'DB', '[1,5)'),
('Ann', 'AI', '[3,8)'),
('Bea', 'DB', '[3,9)');


Timeline representation:

NAME  DEPT  TIME
Ann   DB[1,5)   <- Tuple 1, valid from 1 to 5 excl.
Ann   AI[3,8) - <- Tuple 2
Bea   DB[3,9) --<- Tuple 3
123456789   <- Timeline


NORMALIZE query:

SELECT count(*), dept, time
FROM (
empl a NORMALIZE empl b USING(dept) WITH (a.time, b.time)
) c
GROUP BY dept, time;


Result:

COUNT  DEPT TIME
1  DB   [1,3)   --
1  AI   [3,8) -
2  DB   [3,5) --
1  DB   [5,9)   
123456789  <- Timeline



STEP-BY-STEP EXPLANATION


After receiving the NORMALIZE query (see above) as input, we rewrite it into
the following query:

SELECT a.*, P1
FROM (
SELECT *, row_id() OVER () rn FROM empl
) a
LEFT OUTER JOIN (
SELECT dept, LOWER(time) P1 FROM empl
UNION
SELECT dept, UPPER(time) P1 FROM empl
) b
ON a.dept = b.dept AND P1 <@ a.time
ORDER BY rn, P1;


Intermediate result: This is the input of our NORMALIZE execution node...
(See Appendix [1] for details)

 TUPID   name | dept | time  | rn | p1
--+--+---++
   A Ann  | DB   | [1,5) |  1 |  1
   B Ann  | DB   | [1,5) |  1 |  3
   C Ann  | AI   | [3,8) |  2 |  3
   D Bea  | DB   | [3,9) |  3 |  3
   E Bea  | DB   | [3,9) |  3 |  5



We have three possibilities inside NORMALIZE during execution:
(See Appendix [2] for details)

1) CURR == PREV
   a) S < CURR.P1   --> generate (CURR, [S, CURR.P1)) and set S = CURR.P1
   b) otherwise --> fetch next tuple
  if fetched tuple is NULL --> goto(2) forcing CURR != PREV to be true
2) CURR != PREV
   a) S < PREV.te   --> generate (PREV, [S, PREV.te))
   ... set S = CURR.ts and goto(1) forcing CURR == PREV to be true



Executor steps for our example query:

1) First call of ExecTemporalAdjustment: Fetch tuple A, which is now CURR.
   Set sweepline S = 1 (A.ts), and copy tuple A also into PREV. Goto(1)
   forcing CURR == PREV to be true.

2) (1): CURR == PREV, that is, tuple A is equal to itself.
   (1b): Fetch tuple B; CURR = B, PREV = A

3) (1): A == B
   (1a): S < B.P1 (1 < 3), hence we generate a result tuple:
(Ann, DB, [1,3))
   ...where the interval is [S, CURR.P1) and all other attributes are
   copied from the current tuple (omitting helper columns: RN and P1)
   We update the sweepline: S = 3 (B.P1).

4) (1): A == B.
   (1b): Fetch tuple C; CURR = C, PREV = B

5) (2): B != C.
   (2a): S < B.te (3 < 5), hence we generate a result tuple:
(Ann, DB, [3, 5))
   We update the sweepline: S = 3 (C.ts).

6) (1): B == C (forced).
   (1b): Fetch tuple D; CURR = D, PREV = C

7) (2): C != D.
   (2a): S < C.te (3 < 8), hence we generate a result tuple:
(Ann, AI, [3,8))
   We update the sweepline: S = 3 (D.ts).

8) (1): C == D (forced).
   (1b): Fetch tuple E; CURR = E, PREV = D

9) (1): D == E.
   (1a): S < E.P1 (3 < 5), hence we generate a result tuple:
(Bea, DB, [3, 5))
   We update the sweepline: S = 5 (E.P1).

10) (1): D == E.
(1b): 

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-27 Thread Peter Moser
2017-02-24 21:25 GMT+01:00 Jim Nasby <jim.na...@bluetreble.com>:
> On 2/24/17 6:40 AM, Peter Moser wrote:
>>
>> Do you think it is better to remove the syntax for ranges expressed in
>> different columns?
>
>
> It's not that hard to construct a range type on-the-fly from 2 columns, so
> (without having looked at the patch or really followed the thread) I would
> think the answer is yes.

We discussed and decided to remove the syntax for separate columns.
The patch with "range-types-only" syntax will be send soon.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-24 Thread Peter Moser
2017-02-22 19:43 GMT+01:00 Peter Eisentraut :
> On 2/16/17 07:41, Robert Haas wrote:
>> Also, it sounds like all of this is intended to work with ranges that
>> are stored in different columns rather than with PostgreSQL's built-in
>> range types.
>
> Yeah, that should certainly be changed.

Our syntax supports PostgreSQL's built-in range types and ranges that
are stored in different columns.

For instance, for range types an ALIGN query would look like this:
  SELECT * FROM (r ALIGN s ON q WITH (r.t, s.t)) c

... and for ranges in different columns like this:
  SELECT * FROM (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c

... where r and s are input relations, q can be any join qualifier, and
r.t, s.t, r.ts, r.te, s.ts, and s.te can be any column name. The
latter represent the valid time intervals, that is time point start,
and time point end of each tuple for each input relation. These can
be defined as four scalars, or two half-open, i.e., [), range typed
values.




It would reduce the size of our patch and simplify the overall structure,
if we would remove the possibility to express valid time start points and end
points in different columns.

Do you think it is better to remove the syntax for ranges expressed in
different columns?

However, internally we still need to split the
range types into two separate points, because NORMALIZE does not make a
distinction between start and end timepoints while grouping, therefore we
have only one timepoint attribute there (i.e., P1), which is the union of
start and end timepoints (see executor/nodeTemporalAdjustment.c). A second
constraint is, that we support currently only half-open intervals, that is,
any interval definition (open/closed/half-open) different from the PostgreSQL's
default, i.e., [), leads to undefined results.

Best regards,
Anton, Johann, Michael, Peter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-02-20 Thread Peter Moser
On Wed, Feb 15, 2017 at 9:33 PM, David G. Johnston
 wrote:
> On Wed, Feb 15, 2017 at 12:24 PM, Robert Haas  wrote:
>>
>> So it seems like an ALIGN or NORMALIZE option is kind of like a JOIN,
>> except apparently there's no join type and the optimizer can never
>> reorder these operations with each other or with other joins.  Is that
>> right?  The optimizer changes in this patch seem fairly minimal, so
>> I'm guessing it can't be doing anything very complex here.
>>
>> + * INPUT:
>> + * (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c
>> + * where q can be any join qualifier, and r.ts, r.te, s.ts,
>> and s.t
>> e
>> + * can be any column name.
>> + *
>> + * OUTPUT:
>> + * (
>> + * SELECT r.*, GREATEST(r.ts, s.ts) P1, LEAST(r.te, s.te) P2
>> + *  FROM
>> + *  (
>> + * SELECT *, row_id() OVER () rn FROM r
>> + *  ) r
>> + *  LEFT OUTER JOIN
>> + *  s
>> + *  ON q AND r.ts < s.te AND r.te > s.ts
>> + *  ORDER BY rn, P1, P2
>> + *  ) c
>>
>> It's hard to see what's going on here.  What's ts?  What's te?  If you
>> used longer names for these things, it might be a bit more
>> self-documenting.
>
>
> Just reasoning out loud here...
>
> ISTM ts and te are "temporal [range] start" and "temporal [range] end" (or
> probably just the common "timestamp start/end")
>
> From what I can see it is affecting an intersection of the two ranges and,
> furthermore, splitting the LEFT range into sub-ranges that match up with the
> sub-ranges found on the right side.  From the example above this seems like
> it should be acting on self-normalized ranges - but I may be missing
> something by evaluating this out of context.
>
> r1 [1, 6] [ts, te] [time period start, time period end]
> s1 [2, 3]
> s2 [3, 4]
> s3 [5, 7]
>
> r LEFT JOIN s ON (r.ts < s.te AND r.te > s.ts)
>
> r1[1, 6],s1[2, 3] => [max(r.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[2, 3]
> r1[1, 6],s2[3, 4] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[3, 4]
> r1[1, 6],s3[5, 7] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[5, 6]
>
> Thus the intersection is [2,6] but since s1 has three ranges that begin
> between 2 and 6 (i.e., 2, 3, and 5) there are three output records that
> correspond to those sub-ranges.

Yes, this is what the internal rewriting produces for r1.
Note that till now we only support half-open ranges, i.e., [), but for
visibility I will continue this example using closed ranges [].
The executor function (ExecTemporalAdjustment) gets this (the output above) as
the input and will then produce:

r1[1, 1]
r1[2, 3]
r1[3, 4]
r1[5, 6]

Which means also for the ALIGN the non-overlapping parts are retained.

>
> The description in the OP basically distinguishes between NORMALIZE and
> ALIGN in that ALIGN, as described above, affects an INTERSECTION on the two
> ranges - discarding the non-overlapping data - while NORMALIZE performs the
> alignment while also retaining the non-overlapping data.

Also for ALIGN we retain the non-overlapping part.
Intersections are symmetric/commutative, so a subsequent outer join can then use
equality on the ranges
to produce join matches (for overlapping) as well as null-extend the
produced non-overlapping parts.

The difference between ALIGN and NORMALIZE is how they split, while ALIGN
produces intersections between pairs of tuples (used for joins) and the
non-overlapping parts, NORMALIZE produces intersections between groups of tuples
(used for aggregation, so that all tuples with the same group have equal
ranges) and the non-overlapping parts.

For instance, the NORMALIZE between r1, s1, s2, and s3 in your example above
would give the following:

r1[1, 1]
r1[2, 2]
r1[3, 3]
r1[4, 4]
r1[5, 6]

>
> The rest of the syntax seems to deal with selecting subsets of range records
> based upon attribute data.

Yes, exactly!


Best regards,
Anton, Johann, Michael, Peter


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-01-24 Thread Peter Moser
2017-01-18 3:57 GMT+01:00 Peter Eisentraut <peter.eisentr...@2ndquadrant.com>:
>
> On 1/13/17 9:22 AM, Peter Moser wrote:
> > The goal of temporal aligners and normalizers is to split ranges to allow a
> > reduction from temporal queries to their non-temporal counterparts.
> > Splitting
> > ranges is necessary for temporal query processing. Temporal aligners and
> > normalizer may then be used as building-blocks for any temporal query
> > construct.
>
> I would need to see the exact definitions of these constructs.  Please
> send some documentation.
>
> > We have published two papers, that contain formal definitions and
> > related work
> > for the temporal aligner and normalizer. Please see [1] and [2].
>
> I don't have access to those.

The papers can be freely downloaded from
http://www.inf.unibz.it/~dignoes/publications.html using the "Author-ize link".

> >> I think there are probably many interesting applications for normalizing
> >> or otherwise adjusting ranges.  I'd like to see an overview and
> >> consideration of other applications.
> >
> > Please see the attached file adjustment.sql for some interesting
> > applications.
>
> That's surely interesting, but without knowing what these operations are
> supposed to do, I can only reverse engineer and guess.

Intuitively what they do is as follows:


NORMALIZE: splits all the ranges of one relation according to all the range
boundaries of another (but possibly the same) relation whenever some equality
condition over some given attributes is satisfied.

When the two relations are the same, all ranges with the given equal attributes
are either equal or disjoint. After this, the traditional GROUP BY or DISTINCT
can be applied. The attributes given to NORMALIZE for a (temporal) GROUP BY are
the grouping attributes and for a (temporal) DISTINCT the target list
attributes.

When the two relations are different, but they each contain disjoint ranges
for the same attributes (as the current limitation for the set operations is)
we perform a symmetric NORMALIZE on each of them. Then we have a similar
situation as before, i.e., in both relations ranges with the same attributes
are either equal or disjoint and a traditional set operation
(EXCEPT/INTERSECT/UNION) can be applied. The attributes given to NORMALIZE for
a (temporal) EXCEPT/INTERSECT/UNION are the target list attributes.


ALIGN: splits all the ranges of one relation according to all the range
intersections of another relation, i.e., it produces all intersections and
non-overlapping parts, whenever some condition is satisfied.

We perform a symmetric ALIGN on each relation, after which a traditional inner
or outer join can be applied using equality on the ranges to calculate the
overlap. The condition given to a (temporal) inner or outer join is the
join condition without overlap.

> >> Ideally, I'd like to see these things implemented as some kind of
> >> user-space construct, like an operator or function.  I think we'd need a
> >> clearer definition of what it is they do before we can evaluate that.
> >
> > Can you please explain what you mean by "user-space construct" in this case.
>
> Implement them using the extensibility features, such as a user-defined
> operator.  I don't know if it's possible, but it's something to consider.

We experimented with user-defined operators and set-returning functions. The
main issue with these is that ALIGN and NORMALIZE rely on comparison and
processing of one tuple with many tuples at a time that is not possible with
operators, and for set-returning functions there is no clean way of passing
tables and/or conditions.

> Using common terms such as ALIGN and NORMALIZE for such a specific
> functionality seems a bit wrong.

Would ALIGN RANGES/RANGE ALIGN and NORMALIZE RANGES/RANGE NORMALIZE be better
options? We are also thankful for any suggestion or comments about the syntax.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-01-13 Thread Peter Moser
> What this patch does is to add two new clauses for FROM-list items,
> NORMALIZE and ALIGN, which reshuffle a set of ranges into a new list
> that can then be aggregated more easily.  From the original message:
>
> > For NORMALIZE the tuples' ranges need to be split into all sub-ranges
> > according to all matching ranges of the second relation. For this we
> > create a subquery that first joins one relation with the range
> > boundaries of the other and then sorts the result. The executor
> > function splits the ranges in a sweep-line based manner.
> >
> > For ALIGN the tuples' ranges must be split into all intersections and
> > differences with the other relation according to the join condition.
> > For this we create a subquery that first joins the two relations and
> > then sorts the result. The executor function splits the ranges
> > accordingly in a sweep-line based manner.
>
> So there isn't really temporal query processing as such here, only some
> helpers that can make it easier.

The goal of temporal aligners and normalizers is to split ranges to allow a
reduction from temporal queries to their non-temporal counterparts.
Splitting
ranges is necessary for temporal query processing. Temporal aligners and
normalizer may then be used as building-blocks for any temporal query
construct.

> I can see how those operations can be useful, but it would help if there
> were a more formal definition to be able to check that further.

We have published two papers, that contain formal definitions and related
work
for the temporal aligner and normalizer. Please see [1] and [2].

> What I'm missing here is some references: existing implementations,
> standards, documentation, research papers, alternative ideas, rejected
> alternatives, etc.

A good overview of existing implementations in DBMSs, SQL standard, and
history
is given in [3].

> Also, the submission is missing documentation and test cases.  There are
> technical terms used in the code that I don't understand.

We added a second patch with test cases and expected results. We are now
writing the documentation in sgml-format.

> I think there are probably many interesting applications for normalizing
> or otherwise adjusting ranges.  I'd like to see an overview and
> consideration of other applications.

Please see the attached file adjustment.sql for some interesting
applications.

> Ideally, I'd like to see these things implemented as some kind of
> user-space construct, like an operator or function.  I think we'd need a
> clearer definition of what it is they do before we can evaluate that.

Can you please explain what you mean by "user-space construct" in this case.



Best regards,
Anton, Johann, Michael, Peter


[1] Anton Dignös, Michael H. Böhlen, Johann Gamper:
Temporal alignment. SIGMOD Conference 2012: 433-444
http://doi.acm.org/10.1145/2213836.2213886
[2] Anton Dignös, Michael H. Böhlen, Johann Gamper, Christian S. Jensen:
Extending the Kernel of a Relational DBMS with Comprehensive Support for
Sequenced Temporal Queries. ACM Trans. Database Syst. 41(4): 26:1-26:46
(2016)
http://doi.acm.org/10.1145/2967608
[3] https://www2.cs.arizona.edu/people/rts/sql3.html and
https://www2.cs.arizona.edu/people/rts/tsql2.html


adjustment.sql
Description: application/sql
diff --git src/test/regress/expected/temporal_primitives.out src/test/regress/expected/temporal_primitives.out
new file mode 100644
index 000..6e4cc0d
--- /dev/null
+++ src/test/regress/expected/temporal_primitives.out
@@ -0,0 +1,841 @@
+--
+-- TEMPORAL PRIMITIVES
+--
+SET datestyle TO ymd;
+CREATE COLLATION "de_DE.utf8" (LC_COLLATE = "de_DE.utf8",
+   LC_CTYPE = "de_DE.utf8" );
+CREATE TEMP TABLE tpg_table1 (a char, b char, ts int, te int);
+CREATE TEMP TABLE tpg_table2 (c int, d char, ts int, te int);
+INSERT INTO tpg_table1 VALUES
+('a','B',1,7),
+('b','B',3,9),
+('c','G',8,10);
+INSERT INTO tpg_table2 VALUES
+(1,'B',2,5),
+(2,'B',3,4),
+(3,'B',7,9);
+-- VALID TIME columns (i.e., ts and te) are no longer at the end of the
+-- targetlist.
+CREATE TEMP TABLE tpg_table3 AS
+	SELECT a, ts, te, b FROM tpg_table1;
+CREATE TEMP TABLE tpg_table4 AS
+	SELECT c, ts, d, te FROM tpg_table2;
+-- VALID TIME columns represented as range type
+CREATE TEMP TABLE tpg_table5 AS
+	SELECT int4range(ts, te) t, a, b FROM tpg_table1;
+CREATE TEMP TABLE tpg_table6 AS
+	SELECT int4range(ts, te) t, c a, d b FROM tpg_table2;
+-- VALID TIME columns as VARCHARs
+CREATE TEMP TABLE tpg_table7 (a int, ts varchar, te varchar);
+CREATE TEMP TABLE tpg_table8 (a int,
+			  ts varchar COLLATE "de_DE.utf8",
+			  te varchar COLLATE "POSIX");
+INSERT INTO tpg_table7 VALUES
+(0, 'A', 'D'),
+(1, 'C', 'X'),
+(0, 'ABC', 'BCD'),
+(0, 'xABC', 'xBCD'),
+(0, 'BAA', 'BBB');
+INSERT INTO tpg_table8 VALUES
+(0, 'A', 'D'),
+(1, 'C', 'X');
+-- Tables to check different data types, and corner cases
+CREATE TEMP TABLE tpg_table9 (a int, ts timestamp, te timestamp);
+CREATE TEMP TABLE tpg_table10 

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-12-16 Thread Peter Moser

Am 16.12.2016 um 07:17 schrieb David Fetter:

On Wed, Dec 07, 2016 at 03:57:33PM +0100, Peter Moser wrote:

Am 05.12.2016 um 06:11 schrieb Haribabu Kommi:



On Tue, Oct 25, 2016 at 8:44 PM, Peter Moser <pitiz...@gmail.com
<mailto:pitiz...@gmail.com>> wrote:


We decided to follow your recommendation and add the patch to the
commitfest.


Path is not applying properly to HEAD.
Moved to next CF with "waiting on author" status.



We updated our patch. We tested it with the latest
commit dfe530a09226a9de80f2b4c3d5f667bf51481c49.


This looks neat, but it no longer applies to master.  Is a rebase in
the offing?


We rebased our patch on top of HEAD, that is, commit 
93513d1b6559b2d0805f0b02d312ee550e3d010b.



Best regards,
Anton, Johann, Michael, Peter
diff --git src/backend/commands/explain.c src/backend/commands/explain.c
index 0a669d9..09406d4 100644
--- src/backend/commands/explain.c
+++ src/backend/commands/explain.c
@@ -875,6 +875,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		case T_SeqScan:
 			pname = sname = "Seq Scan";
 			break;
+		case T_TemporalAdjustment:
+			if(((TemporalAdjustment *) plan)->temporalCl->temporalType == TEMPORAL_TYPE_ALIGNER)
+pname = sname = "Adjustment(for ALIGN)";
+			else
+pname = sname = "Adjustment(for NORMALIZE)";
+			break;
 		case T_SampleScan:
 			pname = sname = "Sample Scan";
 			break;
diff --git src/backend/executor/Makefile src/backend/executor/Makefile
index 51edd4c..42801d3 100644
--- src/backend/executor/Makefile
+++ src/backend/executor/Makefile
@@ -25,6 +25,8 @@ OBJS = execAmi.o execCurrent.o execGrouping.o execIndexing.o execJunk.o \
nodeSamplescan.o nodeSeqscan.o nodeSetOp.o nodeSort.o nodeUnique.o \
nodeValuesscan.o nodeCtescan.o nodeWorktablescan.o \
nodeGroup.o nodeSubplan.o nodeSubqueryscan.o nodeTidscan.o \
-   nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o
+   nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o \
+   nodeTemporalAdjustment.o
+
 
 include $(top_srcdir)/src/backend/common.mk
diff --git src/backend/executor/execProcnode.c src/backend/executor/execProcnode.c
index 554244f..610d753 100644
--- src/backend/executor/execProcnode.c
+++ src/backend/executor/execProcnode.c
@@ -114,6 +114,7 @@
 #include "executor/nodeValuesscan.h"
 #include "executor/nodeWindowAgg.h"
 #include "executor/nodeWorktablescan.h"
+#include "executor/nodeTemporalAdjustment.h"
 #include "nodes/nodeFuncs.h"
 #include "miscadmin.h"
 
@@ -334,6 +335,11 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
  estate, eflags);
 			break;
 
+		case T_TemporalAdjustment:
+			result = (PlanState *) ExecInitTemporalAdjustment((TemporalAdjustment *) node,
+ estate, eflags);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;		/* keep compiler quiet */
@@ -531,6 +537,10 @@ ExecProcNode(PlanState *node)
 			result = ExecLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			result = ExecTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;
@@ -779,6 +789,10 @@ ExecEndNode(PlanState *node)
 			ExecEndLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			ExecEndTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
@@ -812,3 +826,4 @@ ExecShutdownNode(PlanState *node)
 
 	return planstate_tree_walker(node, ExecShutdownNode, NULL);
 }
+
diff --git src/backend/executor/nodeTemporalAdjustment.c src/backend/executor/nodeTemporalAdjustment.c
new file mode 100644
index 000..95d58a3
--- /dev/null
+++ src/backend/executor/nodeTemporalAdjustment.c
@@ -0,0 +1,537 @@
+#include "postgres.h"
+#include "executor/executor.h"
+#include "executor/nodeTemporalAdjustment.h"
+#include "utils/memutils.h"
+#include "access/htup_details.h"/* for heap_getattr */
+#include "utils/lsyscache.h"
+#include "nodes/print.h"		/* for print_slot */
+#include "utils/datum.h"		/* for datumCopy */
+
+/*
+ * #define TEMPORAL_DEBUG
+ * XXX PEMOSER Maybe we should use execdebug.h stuff here?
+ */
+#ifdef TEMPORAL_DEBUG
+static char*
+datumToString(Oid typeinfo, Datum attr)
+{
+	Oid			typoutput;
+	bool		typisvarlena;
+	getTypeOutputInfo(typeinfo, , );
+	return OidOutputFunctionCall(typoutput, attr);
+}
+
+#define TPGdebug(...) 	{ printf(__VA_ARGS__); printf("\n"); fflush(stdout); }
+#define TPGdebugDatum(attr, typeinfo) 	TPGdebug("%s = %s %ld\n", #attr, datumToString(typeinfo, attr), attr)
+#define TPGdebug

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-12-07 Thread Peter Moser

Am 05.12.2016 um 06:11 schrieb Haribabu Kommi:



On Tue, Oct 25, 2016 at 8:44 PM, Peter Moser <pitiz...@gmail.com
<mailto:pitiz...@gmail.com>> wrote:


We decided to follow your recommendation and add the patch to the
commitfest.


Path is not applying properly to HEAD.
Moved to next CF with "waiting on author" status.



We updated our patch. We tested it with the latest
commit dfe530a09226a9de80f2b4c3d5f667bf51481c49.



Regards,
Hari Babu
Fujitsu Australia


Best regards,
Anton, Johann, Michael, Peter
diff --git src/backend/commands/explain.c src/backend/commands/explain.c
index 0a669d9..09406d4 100644
--- src/backend/commands/explain.c
+++ src/backend/commands/explain.c
@@ -875,6 +875,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		case T_SeqScan:
 			pname = sname = "Seq Scan";
 			break;
+		case T_TemporalAdjustment:
+			if(((TemporalAdjustment *) plan)->temporalCl->temporalType == TEMPORAL_TYPE_ALIGNER)
+pname = sname = "Adjustment(for ALIGN)";
+			else
+pname = sname = "Adjustment(for NORMALIZE)";
+			break;
 		case T_SampleScan:
 			pname = sname = "Sample Scan";
 			break;
diff --git src/backend/executor/Makefile src/backend/executor/Makefile
index 51edd4c..42801d3 100644
--- src/backend/executor/Makefile
+++ src/backend/executor/Makefile
@@ -25,6 +25,8 @@ OBJS = execAmi.o execCurrent.o execGrouping.o execIndexing.o execJunk.o \
nodeSamplescan.o nodeSeqscan.o nodeSetOp.o nodeSort.o nodeUnique.o \
nodeValuesscan.o nodeCtescan.o nodeWorktablescan.o \
nodeGroup.o nodeSubplan.o nodeSubqueryscan.o nodeTidscan.o \
-   nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o
+   nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o tqueue.o spi.o \
+   nodeTemporalAdjustment.o
+
 
 include $(top_srcdir)/src/backend/common.mk
diff --git src/backend/executor/execProcnode.c src/backend/executor/execProcnode.c
index 554244f..610d753 100644
--- src/backend/executor/execProcnode.c
+++ src/backend/executor/execProcnode.c
@@ -114,6 +114,7 @@
 #include "executor/nodeValuesscan.h"
 #include "executor/nodeWindowAgg.h"
 #include "executor/nodeWorktablescan.h"
+#include "executor/nodeTemporalAdjustment.h"
 #include "nodes/nodeFuncs.h"
 #include "miscadmin.h"
 
@@ -334,6 +335,11 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
  estate, eflags);
 			break;
 
+		case T_TemporalAdjustment:
+			result = (PlanState *) ExecInitTemporalAdjustment((TemporalAdjustment *) node,
+ estate, eflags);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;		/* keep compiler quiet */
@@ -531,6 +537,10 @@ ExecProcNode(PlanState *node)
 			result = ExecLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			result = ExecTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			result = NULL;
@@ -779,6 +789,10 @@ ExecEndNode(PlanState *node)
 			ExecEndLimit((LimitState *) node);
 			break;
 
+		case T_TemporalAdjustmentState:
+			ExecEndTemporalAdjustment((TemporalAdjustmentState *) node);
+			break;
+
 		default:
 			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
 			break;
@@ -812,3 +826,4 @@ ExecShutdownNode(PlanState *node)
 
 	return planstate_tree_walker(node, ExecShutdownNode, NULL);
 }
+
diff --git src/backend/executor/nodeTemporalAdjustment.c src/backend/executor/nodeTemporalAdjustment.c
new file mode 100644
index 000..e45ec03
--- /dev/null
+++ src/backend/executor/nodeTemporalAdjustment.c
@@ -0,0 +1,528 @@
+#include "postgres.h"
+#include "executor/executor.h"
+#include "executor/nodeTemporalAdjustment.h"
+#include "utils/memutils.h"
+#include "access/htup_details.h"/* for heap_getattr */
+#include "utils/lsyscache.h"
+#include "nodes/print.h"		/* for print_slot */
+#include "utils/datum.h"		/* for datumCopy */
+
+/*
+ * #define TEMPORAL_DEBUG
+ * XXX PEMOSER Maybe we should use execdebug.h stuff here?
+ */
+#ifdef TEMPORAL_DEBUG
+static char*
+datumToString(Oid typeinfo, Datum attr)
+{
+	Oid			typoutput;
+	bool		typisvarlena;
+	getTypeOutputInfo(typeinfo, , );
+	return OidOutputFunctionCall(typoutput, attr);
+}
+
+#define TPGdebug(...) 	{ printf(__VA_ARGS__); printf("\n"); fflush(stdout); }
+#define TPGdebugDatum(attr, typeinfo) 	TPGdebug("%s = %s %ld\n", #attr, datumToString(typeinfo, attr), attr)
+#define TPGdebugSlot(slot) { printf("Printing Slot '%s'\n", #slot); print_slot(slot); fflush(stdout); }
+
+#else
+#define datumToString(typeinfo, attr)
+#define TPGdebug(...)
+#define TPGdebugDatum(attr, typeinfo)
+#define TPGdebugSlot(slot)
+#endif
+

Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2016-10-25 Thread Peter Moser

On 27.07.2016 at 16:09 Robert Haas wrote:

On Fri, Jul 22, 2016 at 7:15 AM, Anton Dignös  wrote:

We would like to contribute to PostgreSQL a solution that supports the query
processing of "at each time point". The basic idea is to offer two new
operators, NORMALIZE and ALIGN, whose purpose is to adjust (or split) the
ranges of tuples so that subsequent queries can use the usual grouping and
equality conditions to get the intended results.


I think that it is great that you want to contribute your work to
PostgreSQL.  I don't know whether there will be a consensus that this
is generally useful functionality that we should accept, but I commend
the effort anyhow.  Assuming there is, getting this into a state that
we consider committable will probably take quite a bit of additional
work on your part; no one will do it for you.



Hi hackers,

thank you for your feedback.

We are aware that contributing to PostgreSQL is a long way with a lot
of work.  We are committed to go all the way and do the work as
discussed in the community.

We had some internal discussions about the project, looking also at
some other patches to better understand whether the patch is 
work-in-progress or ready for commitfest.




If you're still
interested in proceeding given those caveats, please add your patch
here so that it gets reviewed:

https://commitfest.postgresql.org/action/commitfest_view/open



We decided to follow your recommendation and add the patch to the 
commitfest.



Looking forward for your feedback,
Anton, Johann, Michael, Peter


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Development with Eclipse - Wrong error messages in IDE

2016-02-08 Thread Peter Moser

On 05.02.2016 um 18:40 Jason Petersen wrote:

On Feb 3, 2016, at 2:38 AM, Peter Moser <pitiz...@gmail.com> wrote:

Does anyone had similar problems? Do I have to configure Eclipse to understand 
the PG_RMGR macro or is there another possibility to teach Eclipse these macros?


Hi,



I just built 9.6 under Eclipse CDT to try this out and was able to open e.g. 
heapam.c without any error markers.

I added PostgreSQL as a “Makefile Project with Existing Code” after running 
./configure from the command-line. After that, I built the project from within 
Eclipse by adding the ‘all’ make target and running it.


I imported PG the same way, configured from terminal with

export CFLAGS="-g0"

./configure \
--prefix="/home/p/pg/build" \
--enable-debug \
--enable-depend \
--enable-cassert

I built the project from command-line, not from within Eclipse. First I 
thought that this may have caused all this problems, but no...




One setting I usually change: right-click the project, pick Properties, then drill 
down through C/C++ General -> Preprocessor Include Paths. In the Provider pane, 
there is an entry for “CDT GCC Build Output Parser”. I’m not sure if this is 
strictly necessary, but I set my “Container to keep discovered entries” setting to 
“File”.

Basically, Eclipse scans the make output for -I flags, then notes all the 
includes used to build each file, so the static analyzer, etc. can have more 
accurate information (it is crucial that the “Compiler command pattern” in this 
window be a regex that will match the compiler binary you use, so if you have 
/usr/local/bin/gcc, and “gcc” is the pattern, you are in for trouble).

After running the build, Eclipse should now know what includes are used for each file 
and stop whining. If it ever seems to have problems, you can kick it by running a 
clean target, then all, then picking “Project -> C/C++ Index -> Rebuild” (I 
think).


Thanks for all your suggestions, tried all of them, but it made no 
difference. What finally solved my problems was to delete the BUILD and 
cluster DATA folders that I had created, re-configured the system. 
Deleted the project in Eclipse, and imported it again after configure. 
All wrong markers disappeared. Maybe the C-file index of Eclipse was 
corrupted or so. Also rebuilding it didn't work. The only difference 
from my previous attempt was the CFLAGS thing (see above), but I do not 
know if this changes anything related to markers.



Peter



--
Jason Petersen
Software Engineer | Citus Data
303.736.9255
ja...@citusdata.com




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Development with Eclipse - Wrong error messages in IDE

2016-02-05 Thread Peter Moser

Peter Moser wrote:


I have some strange error message inside Eclipse, that some symbols cannot
be found. I work with version 9.6 currently. For instance,

Symbol 'RM_HEAP_ID' could not be resolved
src/backend/access/heap/heapam.c

It affects all occurrences of symbols that are defined in
src/include/access/rmgrlist.h. Eclipse just says "Syntax error" here.

However, the source code compiles and runs without any compile-time error or
warning. It is just an IDE problem I think, but it distracts me from finding
real bugs.


Disclaimer: I've never used eclipse.

The problem is some perhaps-too-clever stuff we do to avoid repetitive
declarations of things.  The rmgr stuff uses a PG_RMGR macro, which is
defined differently in src/backend/access/transam/rmgr.c and
src/include/access/rmgr.h; the latter contains the actual enum
definition.  On the other hand Eclipse is trying to be too clever by
processing the C files, but not actually getting it completely right
(which is understandable, really).  We have other similar cases, such as
grammar keywords (kwlist.h)

I'm afraid that you'd have to teach Eclipse to deal with such things
(which might be tricky) or live with it.



Ok,
thank you for the comment.

I think, I can live with it.

Perhaps, when I have some spare time I give it a try to solve this 
"non-issue"...


Cheers,
Peter


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Development with Eclipse - Wrong error messages in IDE

2016-02-03 Thread Peter Moser

Good morning hackers,
I have some strange error message inside Eclipse, that some symbols 
cannot be found. I work with version 9.6 currently. For instance,


Symbol 'RM_HEAP_ID' could not be resolved
src/backend/access/heap/heapam.c

It affects all occurrences of symbols that are defined in
src/include/access/rmgrlist.h. Eclipse just says "Syntax error" here.

However, the source code compiles and runs without any compile-time 
error or warning. It is just an IDE problem I think, but it distracts me 
from finding real bugs.


Does anyone had similar problems? Do I have to configure Eclipse to 
understand the PG_RMGR macro or is there another possibility to teach 
Eclipse these macros?


Thanks for any help or comment.

Best regards,
Peter


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Creating unique or "internal-use-only" column names (ColumnRef)

2015-09-08 Thread Peter Moser



Am 07.09.2015 um 16:40 schrieb Tom Lane:

Andrew Dunstan  writes:

On 09/07/2015 09:28 AM, Alvaro Herrera wrote:

This seems pretty much the same as a junk attribute, if I understand you
correctly.  I suggest given a look at how those work.



Is that actually documented anywhere much?


I don't think there's much besides a code comment here and there.
Grepping for functions that touch the "resjunk" field of TargetListEntries
should give you the lay of the land.

regards, tom lane



I have marked them as resjunk already. The problem is that the subquery 
I build contains another subquery. As SQL it looks something like the 
following:


select *
from
(select *, row_number() over () rn from r) r
left outer join
(select *, ts p1 from r union all select *, te p1 from r) s
on p1 >= r.ts and p1 < r.te
order by rn, p1;

I set then the output columns of the outer select to resjunk for rn and 
p1, like this...


i = list_length(qry->targetList);
get_tle_by_resno(qry->targetList, i)->resjunk = true;
get_tle_by_resno(qry->targetList, --i)->resjunk = true;


However, I cannot do that inside, because I need them above... or do I 
miss something here?


~peter


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Creating unique or "internal-use-only" column names (ColumnRef)

2015-09-07 Thread Peter Moser

Good afternoon,
is it possible to create unique column names or to give column names 
inside the parser phase that do not interfer with the outside world, 
i.e. with column names from tables or aliases given by the user.


Some short background:
I created my own from-clause-item, that gets rewritten into a sub-query. 
I do this because I want to re-use existing code as much as possible. 
The rewritten sub-query gets transformed with "transformRangeSubselect"...
Within this sub-query I need 3 columns that shouldn't interfer with 
columns from the input. We refer to them from a JOIN-ON clause and an 
ORDER-BY clause.


Example code:

ColumnRef   *ref;
ref = makeNode(ColumnRef);
ref->fields = list_make1(makeString("some_unique_name"));
ref->location = -1; /* Unknown location */

...

sb1 = makeNode(SortBy);
sb1->node = ref;

...

ssResult = makeNode(SelectStmt);
ssResult->withClause = NULL;
ssResult->fromClause = list_make1(joinExpr);
ssResult->targetList = list_make1(rtAStarWithR);  /* input = r.* */
ssResult->sortClause = list_make2(sb1, sb2);

Is there a possibility for such column names?

Thanks for your help,
Peter



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to compare different datums within from a tuple?

2015-08-12 Thread Peter Moser



Am 11.08.2015 um 21:03 schrieb Peter Eisentraut:

On 8/10/15 12:36 PM, Peter Moser wrote:

Can someone tell me, how I can compare two datum fields, when I do not
know the data type in advance inside an executor function?

For example, x less than y where x and y are of various types that
form intervals. I have found the method ExecTuplesMatch, but it is only
for equality comparison, I think. Another one is ApplySortComparator...
maybe that's the correct way to go?

Some code to make things clearer...

Datum x = heap_getattr(out-tts_tuple,
 node-xpos,
 out-tts_tupleDescriptor,
 isNull1);
Datum y = slot_getattr(curr, node-ypos, isNull2);

if (compareDatumWithCorrectMethod(x,y)  0)
{
  /* do something */
}


The tuple descriptor will contain the data type of the datum, so you can
use that to look up the default btree operator class and call the
respective operators in there.  But note that there is no single notion
of comparison in the system.  Comparison depends on operator class,
access method, possibly collation, null value treatment.  Some types
don't support comparison beyond equality.  A robust patch would need to
take that into account.



Ok, thank you.
Now I have a first solution. I am just wondering if this is robust, or 
do I miss something? Thanks for any comments...


My executor consumes rows from my own rewritten sub-query. From this 
sub-query I extract one sortGroupClause and from that the eqop and 
sortop during planning.


sgc = (SortGroupClause *) llast(sortClause);
node-eqOperator = sgc-eqop;
node-ltOperator = sgc-sortop;

The last sort clause uses the same types as the executor needs to 
compare later.


The executor initializes the methods with:

state-ltFunctionInfo = (FmgrInfo *) palloc(sizeof(FmgrInfo));
ltOperatorId = get_opcode(node-ltOperator);
fmgr_info(ltOperatorId, state-ltFunctionInfo);

state-eqFunctionInfo = (FmgrInfo *) palloc(sizeof(FmgrInfo));
eqOperatorId = get_opcode(node-eqOperator);
fmgr_info(eqOperatorId, state-eqFunctionInfo);

Finally I use them in this way...

static bool
isLessThan(Datum a, Datum b, FmgrInfo *ltFunctionInfo)
{
return DatumGetBool(FunctionCall2(ltFunctionInfo, a, b));
}

static bool
isEqual(Datum a, Datum b, FmgrInfo *eqFunctionInfo)
{
return DatumGetBool(FunctionCall2(eqFunctionInfo, a, b));
}




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to compare different datums within from a tuple?

2015-08-12 Thread Peter Moser



Am 11.08.2015 um 13:41 schrieb Anastasia Lubennikova:

Can someone tell me, how I can compare two datum fields, when I do
not know the data type in advance inside an executor function?


In a nutshell, there is no way to compare Datums.
Datum is an abstact data type. It's the backend internal representation
of a single value of any SQL data type.
The code using Datum has to know which type it is, since the Datum
itself doesn't contain that information.

For example, x less than y where x and y are of various types that
form intervals. I have found the method ExecTuplesMatch, but it is
only for equality comparison, I think. Another one is
ApplySortComparator... maybe that's the correct way to go?

Some code to make things clearer...

Datum x = heap_getattr(out-tts_tuple,
 node-xpos,
 out-tts_tupleDescriptor,
 isNull1);
Datum y = slot_getattr(curr, node-ypos, isNull2);

if (compareDatumWithCorrectMethod(x,y)  0)
{
  /* do something */
}

Thx,
Peter


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org
mailto:pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Maybe you can use DatumGetXXX function to extract value. For example,
DatumGetInt32.
http://doxygen.postgresql.org/postgres_8h.html#aacbc8a3ac6d52d85feaf0b7ac1b1160c
--
Best regards,
Lubennikova Anastasia


I did this with another column, that has always the result of 
row_number() as content, which I think will be fine when I use 
DatumGetInt32.


Thanks,
Peter


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] How to compare different datums within from a tuple?

2015-08-10 Thread Peter Moser

Hello,
I try to write my first patch. It is too early to tell more about it, 
but I am just fiddling around with some prototypes.


Can someone tell me, how I can compare two datum fields, when I do not 
know the data type in advance inside an executor function?


For example, x less than y where x and y are of various types that 
form intervals. I have found the method ExecTuplesMatch, but it is only 
for equality comparison, I think. Another one is ApplySortComparator... 
maybe that's the correct way to go?


Some code to make things clearer...

Datum x = heap_getattr(out-tts_tuple,
node-xpos,
out-tts_tupleDescriptor,
isNull1);
Datum y = slot_getattr(curr, node-ypos, isNull2);

if (compareDatumWithCorrectMethod(x,y)  0)
{
 /* do something */
}

Thx,
Peter


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers