Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-22 Thread Arseny Sher
Masahiko Sawada <sawada.m...@gmail.com> writes:

> On Sun, Sep 17, 2017 at 2:01 AM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:

>> +  
>> +   Since dropping a replication slot is not transactional, we cannot run
>> +   DROP SUBSCRIPTION inside a transaction block if 
>> dropping
>> +   the replication slot. Also, DROP SUBSCRIPTOIN stops 
>> the
>> +   workers if the subscription got disabled in a transaction block even if
>> +   the transaction rolls back.
>> +  
>
> Hmm, isn't there necessary to care and mention about this kind of
> inconsistent behavior in docs?

Well, docs already say that dropping sub with replication slot is
non-transactional, see 'Description' section of DROP SUBSCRIPTION. As
for the second sentence, normally launcher will restart the worker
immediately after ABORT: old worker wakes up the launcher on exit, the
launcher waits on pg_subscription lock and restarts the worker instantly
after the rollback. If we tell users that the workers are stopped after
DROP SUBSCRIPTION rollback, we should also say that they will be
restarted soon.

--
Arseny Sher


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-15 Thread Arseny Sher
Peter Eisentraut <peter.eisentr...@2ndquadrant.com> writes:

> Here is a simple patch that fixes this, based on my original proposal
> point #4.

I checked, it passes the tests and solves the problem. However, isn't
this

+   if (slotname || !subenabled)

is a truism? Is it possible that subscription has no slot but still
enabled?

Besides, we can avoid stopping the workers if subscription has no
associated replication origin, though this probably means that
subscription was broken by user and is not worth it.

--
Arseny Sher


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-14 Thread Arseny Sher
Peter Eisentraut <peter.eisentr...@2ndquadrant.com> writes:

> On 9/13/17 07:00, Arseny Sher wrote:
>> On the other hand, forbidding to execute disable sub and drop sub in one
>> transaction makes it impossible e.g. to drop subscription in trigger
>
> Disabling a subscription before dropping it isn't very useful, is it?
> So I'm not sure this is an important use case we need to optimize.  We
> just need to prevent it from hanging.

Not quite so. Currently it is forbidded to set subscription's slot to
NONE on enabled sub, and the only way to drop sub without touching the
slot is to run ALTER SUBSCRIPTION SET (SLOT_NAME = NONE) beforehand.
Practically this means that we have to go through disable sub -> alter
sub -> drop sub pipeline if we want to left the slot alone or just would
like to drop sub in transaction block -- with replication slot set the
latter is impossible.

--
Arseny Sher


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-13 Thread Arseny Sher
Peter Eisentraut <peter.eisentr...@2ndquadrant.com> writes:
> We can break this in any number of ways:
>
> - (your patch) Kill workers right away after ALTER SUBSCRIPTION DISABLE,
> thus breaking the appearance of transactional DDL somewhat.
> ...
> - Have DROP SUBSCRIPTION attempt to kill workers if the subscription is
> disabled (and possibly, was changed in the same transaction), which
> would address this scenario very narrowly.

Actually, my patch is closer to the last variant. I proposed to kill the
workers in DROP SUBSCRIPTION, and only if we are dropping replication
origin (which is probably always the case, though). I agree that it is
somewhat narrow and still slightly violates transactionality of DROP
SUBSCRIPTION, meaning that its changes (stopped workers) are seen before
the commit.

However, do we care much about that? Is there any chance that users will
rely on living apply workers after DROP SUBSCRIPTION, but before the
transaction commit? In which situation this might be useful?

On the other hand, forbidding to execute disable sub and drop sub in one
transaction makes it impossible e.g. to drop subscription in trigger as
long as Postgres doesn't have autonomous transactions.


Tom Lane <t...@sss.pgh.pa.us> writes:
> ISTM the second of those (refuse to drop an in-use subscription) is
> by far the least surprising behavior.  However, I wonder if there aren't
> race conditions involved here.  What if we haven't yet committed a
> DROP SUBSCRIPTION, and some new worker starts up after we look for
> workers?

We hold a lock on subscription till the end of transaction, so workers
won't start.

> If there aren't variants of that that will break all four options,
> it's not very obvious why not.

I see it this way:
* We want effect of drop sub invisible till commit, so we can't stop
  workers before commit.
* Drop of replication origin needs to be executed in one transaction with
  drop sub, it writes to WAL and so must be executed before commit.
* Apply worker needs RO for its work, it owns origin for the whole
  lifetime.

Something should be given up here. One more idea that was not yet
mentioned is to abandon attempts to drop subs and ROs simultenously and
just garbage-collect replication origins periodically, but that doesn't
sound as an elegant solution.


Masahiko Sawada <sawada.m...@gmail.com> writes:

>> I don't think this is reliable -- what if worker suddenly dies without
>> accomplishing the job?
>
> The apply worker will be launched by the launcher later. If DROP
> SUBSCRIPTION is issued before the apply worker launches again, DROP
> SUBSCRIPTION itself can remove the replication origin.

Why launcher would restart the worker if we already destroyed the
subscription? Consider the sequence of actions:

* We check in DROP SUBSCRIPTION that worker alive and don't remove RO.
* DROP SUBSCRIPTION commits.
* Worker is killed by some villain before it had the chance to drop RO.
  It might be killed even before drop sub commit, but after the check,
  we are left with orphan RO anyway.

--
Arseny Sher


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-12 Thread Arseny Sher
Masahiko Sawada <sawada.m...@gmail.com> writes:

> FWIW, perhaps we can change the replication origin management so that
> DROP SUBSCRIPTION doesn't drop the replication origin and the apply
> worker itself removes it when exit. When an apply worker exits it
> removes the replication origin if the corresponding subscription had
> been removed.

I don't think this is reliable -- what if worker suddenly dies without
accomplishing the job? It is enough to kill the worker during execution
of (possibly long) transaction with DROP SUBSCRIPTION to meet such
situation. Dropping replication origin in one transaction with DROP
SUBSCRIPTION seems natural to me.

--
Arseny Sher


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-05 Thread Arseny Sher
Arseny Sher <a.s...@postgrespro.ru> writes:

> Attached patch fixes this by stopping workers before RO drop, as
> already done in case when we drop replication slot.

Sorry, here is the patch.
>From 008d54dfe2e8ba610bb7c897cfbb4ee7a700aa2e Mon Sep 17 00:00:00 2001
From: Arseny Sher <sher-...@yandex.ru>
Date: Mon, 4 Sep 2017 16:55:08 +0300
Subject: [PATCH] Stop LR workers before dropping replication origin.

Previously, executing ALTER SUBSCRIPTION DISABLE and DROP SUBSCRIPTION in one
transaction would hang forever, because worker didn't stop until transaction is
commited, and transaction couldn't commit before worker exit since the latter
occupies ReplicationState.
---
 src/backend/commands/subscriptioncmds.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 77620dfd42..5d608855a1 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -886,6 +886,10 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
 	else
 		slotname = NULL;
 
+	/* Get originid */
+	snprintf(originname, sizeof(originname), "pg_%u", subid);
+	originid = replorigin_by_name(originname, true);
+
 	/*
 	 * Since dropping a replication slot is not transactional, the replication
 	 * slot stays dropped even if the transaction rolls back.  So we cannot
@@ -909,9 +913,10 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
 	ReleaseSysCache(tup);
 
 	/*
-	 * If we are dropping the replication slot, stop all the subscription
-	 * workers immediately, so that the slot becomes accessible.  Otherwise
-	 * just schedule the stopping for the end of the transaction.
+	 * We stop all subscription workers immediately in two cases:
+	 * - We are dropping the replication slot, to make the slot accessible;
+	 * - We are dropping repl origin tracking, to release ReplicationState;
+	 * Otherwise just schedule the stopping for the end of the transaction.
 	 *
 	 * New workers won't be started because we hold an exclusive lock on the
 	 * subscription till the end of the transaction.
@@ -923,7 +928,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
 	{
 		LogicalRepWorker *w = (LogicalRepWorker *) lfirst(lc);
 
-		if (slotname)
+		if (slotname || originid != InvalidRepOriginId)
 			logicalrep_worker_stop(w->subid, w->relid);
 		else
 			logicalrep_worker_stop_at_commit(w->subid, w->relid);
@@ -937,8 +942,6 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
 	RemoveSubscriptionRel(subid, InvalidOid);
 
 	/* Remove the origin tracking if exists. */
-	snprintf(originname, sizeof(originname), "pg_%u", subid);
-	originid = replorigin_by_name(originname, true);
 	if (originid != InvalidRepOriginId)
 		replorigin_drop(originid, false);
 
-- 
2.11.0



--
Arseny Sher
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-05 Thread Arseny Sher
Hello,

Currently, DROP SUBSCRIPTION on REL_10_STABLE would block forever in the
following setup:

node 1:
create table t (i int);
create publication p for table t;

node 2:
create table t (i int);
create subscription s CONNECTION 'port=5432' publication p;
begin;
alter subscription s disable ;
alter subscription s set (slot_name = none);
drop subscription s;
end;

It hangs in replorigin_drop because we wait until ReplicationState is
released. This should happen on exit of worker, but worker will not exit
until transaction commit because he doesn't see that the sub was
disabled.

Attached patch fixes this by stopping workers before RO drop, as
already done in case when we drop replication slot. I am not sure
whether this is a proper solution, but for now I don't see any problems
with early stop of workers: even if transaction later aborts, launcher
will happily restart them, isn't it?


--
Arseny Sher
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expand_dbname in postgres_fdw

2017-07-27 Thread Arseny Sher
Tom Lane <t...@sss.pgh.pa.us> writes:

> I really don't see anything wrong with the FDW's documentation.  To claim
> that it's not clear, you have to suppose that a connstring's dbname field
> is allowed to recursively contain a connstring.  However, if you've got a
> concrete suggestion about improving the wording, let's see it.
>
> Now on the other hand, libpq's documentation seems a little confusing
> on this point independently of the FDW: so far as I can see, what "certain
> contexts" means is not clearly defined anywhere, and for that matter
> "checked for extended formats" is a masterpiece of unhelpful obfuscation.
>
>   regards, tom lane

I have to admit that you are right: strictly speaking, FDW doc says it
accepts the same options as libpq connstring => libpq connstring itself
can't contain another connstring => expansion is not allowed. However,
we might probably save the readers a few mental cycles if we avoid the
words 'connection strings' and just say that recognized options are the
same as of libpq ones, with (probably, see below) an explicit addition
that dbname is not expanded.

Regarding "checking for extended formats" phrase, I see three solutions:
1) In the description of 'dbname' parameter mention all the cases where
  it is possibly expanded, which doesn't sound as a good idea;
2) Specify whether dbname is expanded in every reference to the list of
  libpq options, which is arguably better.
3) We can also replace "In certain contexts" with "Where explicitly
   mentioned" in the desciption of 'dbname', and, while referencing
   options, never say anything about dbname if it is not expanded.

Besides, why not substitute "checked for extended formats" with "might
be recognized as a connection string" -- are there any other formats
than connstr?

The changes with point 3 chosen might look as in attached file.

>From 96d74f050724a8373c04e48205510a5a7e80d447 Mon Sep 17 00:00:00 2001
From: Arseny Sher <sher-...@yandex.ru>
Date: Thu, 27 Jul 2017 22:45:04 +0300
Subject: [PATCH] libpq docs: dbname param is expanded only when explicitly
 mentioned.

Also, avoid mentioning connstring in the desciption of parameters accepted by
postgres_fdw server.
---
 doc/src/sgml/libpq.sgml| 7 +++
 doc/src/sgml/postgres-fdw.sgml | 6 +++---
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index ad5e9b95b4..dde6647fc5 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1040,10 +1040,9 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   dbname
   
   
-   The database name.  Defaults to be the same as the user name.
-   In certain contexts, the value is checked for extended
-   formats; see  for more details on
-   those.
+   The database name.  Defaults to be the same as the user name. Where
+   explicitly said, the value can be recognized as a connection string;
+   see  for more details on those.
   
   
  
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index d83fc9e52b..e3c41bfd97 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -100,9 +100,9 @@
 

 A foreign server using the postgres_fdw foreign data wrapper
-can have the same options that libpq accepts in
-connection strings, as described in ,
-except that these options are not allowed:
+can have the same options that are recognized by libpq, as
+described in , except that these
+options are not allowed:
 
 
  
-- 
2.11.0


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expand_dbname in postgres_fdw

2017-07-27 Thread Arseny Sher
On Thu, Jul 27, 2017 at 12:21 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The main problem to my mind is that a connection string could possibly
> override items meant to be specified elsewhere.  In particular it ought
> not be allowed to specify the remote username or password, because those
> are supposed to come from the user mapping object not the server object.
> I suspect you could break things by trying to specify client_encoding
> there, as well.

Attached patch allows dbname expansion and makes sure that it doesn't
contain any invalid options. Whether you decide to commit it or not
(at the moment I don't see any security implications, at least not more than
in usual dbname expansion usage, e.g. in psql, but who knows), it seems
to me that the documentation should be updated since currently it is not
clear on the subject, as the beginning of this thread proves.
>From 09e3924501d219331b8b8fca12a9ea35a689ba10 Mon Sep 17 00:00:00 2001
From: Arseny Sher <sher-...@yandex.ru>
Date: Thu, 27 Jul 2017 12:41:17 +0300
Subject: [PATCH] Allow 'dbname' option contain full-fledged connstring in
 postgres_fdw

And if this is the case, validate that it doesn't contain any invalid options
such as client_encoding or user.
---
 contrib/postgres_fdw/connection.c |  2 +-
 contrib/postgres_fdw/option.c | 80 ---
 2 files changed, 68 insertions(+), 14 deletions(-)

diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index be4ec07cf9..bfcd9ed2e3 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -263,7 +263,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
 		/* verify connection parameters and make connection */
 		check_conn_params(keywords, values);
 
-		conn = PQconnectdbParams(keywords, values, false);
+		conn = PQconnectdbParams(keywords, values, true);
 		if (!conn || PQstatus(conn) != CONNECTION_OK)
 			ereport(ERROR,
 	(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 67e1c59951..26c11e5179 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -50,8 +50,10 @@ static PQconninfoOption *libpq_options;
  * Helper functions
  */
 static void InitPgFdwOptions(void);
+static void validate_dbname(const char *dbname);
+static void get_opts_hint(StringInfo buf, Oid context, bool libpq_options);
 static bool is_valid_option(const char *keyword, Oid context);
-static bool is_libpq_option(const char *keyword);
+static bool is_libpq_option(const char *keyword, Oid context);
 
 
 /*
@@ -86,16 +88,8 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
 			 * Unknown option specified, complain about it. Provide a hint
 			 * with list of valid options for the object.
 			 */
-			PgFdwOption *opt;
 			StringInfoData buf;
-
-			initStringInfo();
-			for (opt = postgres_fdw_options; opt->keyword; opt++)
-			{
-if (catalog == opt->optcontext)
-	appendStringInfo(, "%s%s", (buf.len > 0) ? ", " : "",
-	 opt->keyword);
-			}
+			get_opts_hint(, catalog, false);
 
 			ereport(ERROR,
 	(errcode(ERRCODE_FDW_INVALID_OPTION_NAME),
@@ -143,6 +137,10 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
 		 errmsg("%s requires a non-negative integer value",
 def->defname)));
 		}
+		else if (strcmp(def->defname, "dbname") == 0)
+		{
+			validate_dbname(defGetString(def));
+		}
 	}
 
 	PG_RETURN_VOID();
@@ -250,6 +248,59 @@ InitPgFdwOptions(void)
 }
 
 /*
+ * If dbname param specified, make sure it doesn't carry invalid
+ * options if it is a connstring.
+ */
+static void
+validate_dbname(const char *dbname)
+{
+	PQconninfoOption *lopts;
+	PQconninfoOption *lopt;
+
+	/* If it is not a connstring, just skip the check */
+	if ((lopts = PQconninfoParse(dbname, NULL)) != NULL)
+	{
+		for (lopt = lopts; lopt->keyword; lopt++)
+		{
+			if (lopt->val != NULL && !is_libpq_option(lopt->keyword,
+	  ForeignServerRelationId))
+			{
+StringInfoData buf;
+get_opts_hint(, ForeignServerRelationId, true);
+
+ereport(ERROR,
+		(errcode(ERRCODE_FDW_INVALID_OPTION_NAME),
+		 errmsg("invalid option in dbname connstring \"%s\"",
+lopt->keyword),
+		 errhint("Valid options in this context are: %s",
+ buf.data)));
+			}
+		}
+		PQconninfoFree(lopts);
+	}
+}
+
+/*
+ * Put to 'buf' a hint with list of valid options for the object 'context'. If
+ * libpq_options is true, only libpq options are provided. 'buf' must point to
+ * existing StringInfoData struct when called.
+ */
+static void
+get_opts_hint(StringInfo buf, Oid context, bool libpq_options)
+{
+	PgFdwOption *opt;
+
+	initStringInfo(buf);
+	for (opt = postgres_fdw_options; opt->keyword; opt++)
+	{
+		if (context == opt->optcontext &&
+			

[HACKERS] expand_dbname in postgres_fdw

2017-07-25 Thread Arseny Sher
Hi,

Is there any particular reason why postgres_fdw forbids usage of
connection strings by passing expand_dbname = false to
PQconnectdbParams? Attached patch shows where this happens.

>From 6bf3741976b833379f5bb370aa41f73a51b99afd Mon Sep 17 00:00:00 2001
From: Arseny Sher <sher-...@yandex.ru>
Date: Tue, 25 Jul 2017 21:36:45 +0300
Subject: [PATCH] expand_dbname=true in postgres_fdw

---
 contrib/postgres_fdw/connection.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index be4ec07cf9..bfcd9ed2e3 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -263,7 +263,7 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
 		/* verify connection parameters and make connection */
 		check_conn_params(keywords, values);
 
-		conn = PQconnectdbParams(keywords, values, false);
+		conn = PQconnectdbParams(keywords, values, true);
 		if (!conn || PQstatus(conn) != CONNECTION_OK)
 			ereport(ERROR,
 	(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
-- 
2.11.0


--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-04-03 Thread Arseny Sher
Time is short, student's application deadline is on 3rd April. I decided
to reformulate the project scope myself. Here is the proposal:

https://docs.google.com/document/d/1dvBETE6IJA9AcXd11XJNPsF_VPcDhSjy7rlsxj262l8/edit?usp=sharing

The main idea is that now there is a formalized goal of the project,
"partial support of all TPC-H queries".

I am also CC'ing people who was mentioned in "Potential Mentors" section
on GSoC wiki page.


--
Arseny Sher


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSoC] Push-based query executor discussion

2017-03-22 Thread Arseny Sher
> While I admire your fearlessness, I think the chances of you being
> able to bring a project of this type to a successful conclusion are
> remote.  Here is what I said about this topic previously:
>
> http://postgr.es/m/CA+Tgmoa=kzhj+twxyq+vku21nk3prkrjsdbhjubn7qvc8uk...@mail.gmail.com

Well, as I said, I don't pretend that I will support full functionality:
>> instead, we should decide which part of this work (if any) is
>> going to be done in the course of GSoC. Probably, all TPC-H queries
>> with and without index support is a good initial target, but this
>> needs to be discussed.

I think that successfull completion of this project should be a clear
and justified answer to the question "Is this idea is good enough to
work on merging it into the master?", not the production-ready patches
themselves. Nevertheless, of course project success criterion must be
reasonably formalized -- e.g. implement nodes X with features Y, etc.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [GSoC] Push-based query executor discussion

2017-03-06 Thread Arseny Sher
patch ready to be merged into Postgres master straight away, because
it is rather radical change; and it seems that supporting both
executors simultaneously is also a bad idea because many code would be
duplicated in this case.


*Related work*
There are other works aimed for improving executor performance. I can
mention at least four approaches:
* JITing things [6][10][17]
* batched and/or vectorized execution [7][8][9]
* expression evaluation optimizations [10][17]
* slot deforming optimizations [10]

The latter two are orthogonal to the proposed project.
Batched execution and JIT can be applied together, and some study [10]
shows benefits of such combined approach.

While batched execution and push-based execution model can be applied
together too, they seemingly solve more or less same problems -- code
and data locality, avoiding reloading node's state and better use of
modern CPU features.  However, batched execution requires massive
changes to the current code and seems harder to implement; IIRC I have
seen some patches on this at mailing lists, but I am not aware which
work is the most complete as of now. It is not easy to compare these
approaches theoretically; probably, the best way to estimate their
effect is to implement them and run benchmarks.

Relationship between JIT-compilation and push-based execution model is
interesting: paper [5] shows that HyPer system with JIT + push runs 4x
times faster on some queries than JIT + pull. It should be noted that
HyPer uses column-wise storage though.
Full query compiler developed at ISP RAS [6] speeds up query
processing 2-5 times on TPC-H queries and exploits push-based
execution model. However, supporting it requires implementing executor
logic twice: in plain C for usual interpreter and in LLVM API for JIT
compiler. Ideally we would like to write the code once and be able to
use it either with and without JIT compilation.
There is an ongoing work at ISP RAS to make this possible using
automatic runtime code specialization; however, experiments have
showed that specialization of original Postgres C code doesn't give
significant improvement because of Volcano-style model. We expect that
after making a switch to push-based model in Postgres code we will
achieve speedup comparable to full-query JIT using runtime
specialization.


*About me*
My name is Arseny Sher. Currently, I am studying master's degree at
CMC MSU [12] and working at ISP RAS [13]. Earlier I got bachelor's
degree at the same faculty. I started working with Postgres at the end
of October and I love its extensibility and excellent quality of
code. My previous work was mainly connected with big graphs
computation (keywords are Spark, GraphX, Scala, GraphLab). I also did
some scala.js coding for Russian philologists and participated in
project for IMDG performance comparison, doing mostly devops stuff
(Docker, Ansible, Python). Here are my stackoverflow [14] & github
accounts [15].

The idea of this project was born when my colleagues working on JITing
Postgres realized that runtime specialization of C code written in
push-based architecture should be much more efficient than
specializing existing code (see 'Related work' section), and at the
time I’ve decided that I want my thesis to be connected with
PostgreSQL.

I am ready to work full-time this summer and I think that push-based
execution of all TPC-H queries is quite an achievable goal; however, I
haven't yet studied all required nodes in detail and I will make more
exact estimations if the community supports this project.

P.S. Should letters like this go to hackers or students mailing list?
The latter seems more suitable, but it looks rather dead...


References

[1] 
https://wiki.postgresql.org/wiki/GSoC_2017#Implementing_push-based_query_executor
[2] 
https://www.postgresql.org/message-id/CAFRJ5K3sDGSn%3DpKgnsobYQX4CMTOU%3D0uJ-vt2kF3t1FsVnTCRQ%40mail.gmail.com
[4] Efficiently Compiling Efficient Query Plans for Modern Hardware,
http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
[5] Compiling Database Queries into Machine Code,
http://sites.computer.org/debull/A14mar/p3.pdf
[6] LLVM Cauldron, slides
http://llvm.org/devmtg/2016-09/slides/Melnik-PostgreSQLLLVM.pdf
[7] 
https://www.postgresql.org/message-id/flat/CA%2BTgmobx8su_bYtAa3DgrqB%2BR7xZG6kHRj0ccMUUshKAQVftww%40mail.gmail.com#ca+tgmobx8su_bytaa3dgrqb+r7xzg6khrj0ccmuushkaqvf...@mail.gmail.com
[8] 
https://www.postgresql.org/message-id/flat/20160624232953.beub22r6yqux4gcp%40alap3.anarazel.de#20160624232953.beub22r6yqux4...@alap3.anarazel.de
[9] 
https://www.postgresql.org/message-id/flat/50877c0c-fb88-b601-3115-55a8c70d693e%40postgrespro.ru#50877c0c-fb88-b601-3115-55a8c70d6...@postgrespro.ru
[10] 
https://www.postgresql.org/message-id/flat/20161206034955.bh33paeralxbtluv%40alap3.anarazel.de#20161206034955.bh33paeralxbt...@alap3.anarazel.de
[11] Vectorization vs. Compilation in Query Execution