Re: PATCH: pgbench - break out timing data for initialization phases

2018-01-29 Thread Fabien COELHO


Hello Doug,


With patch and ‘-I dtgvpf’ options:
pgrun pgbench -i -s 2000 -F 90 -q -I dtgvpf
dropping old tables...
creating tables...
generating data...
…
2 of 2 tuples (100%) done (elapsed 168.76 s, remaining 0.00 s)
vacuuming...
creating primary keys...
creating foreign keys...
total time: 353.52 s (drop 1.67 s, tables 0.11 s, insert 168.82 s, commit 0.46 
s, primary 92.32 s, foreign 40.11 s, vacuum 50.03 s)
done.


I'm in favor of such a feature.

However, I think that the durations should be shown in the order in which 
the initialization is performed.


I would suggest to:

- move the time measurement into the initialization loop, instead of doing it
 in each function, so that it is done in just one place.

- maybe store the actions in some array/list data structure, eg:
  "{ char * phase; double duration; }", so that they can be kept
  in order and eventually repeated.
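
As a rough sketch of that idea (names invented here, not taken from the
patch), using the instr_time machinery pgbench already has available:

  typedef struct InitPhase
  {
      const char *phase;     /* e.g. "drop", "tables", "insert" */
      double      duration;  /* elapsed seconds for this phase */
  } InitPhase;

  instr_time  start, end;

  /* in the initialization loop, around each step: */
  INSTR_TIME_SET_CURRENT(start);
  run_phase(i);              /* hypothetical per-phase dispatch */
  INSTR_TIME_SET_CURRENT(end);
  INSTR_TIME_SUBTRACT(end, start);
  phases[n].phase = phase_name(i);   /* hypothetical */
  phases[n].duration = INSTR_TIME_GET_DOUBLE(end);
  n++;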

In order to extract the commit time, I'd say that explicit begin and 
commit should be separate instructions triggered by '(' and ')'.
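
(For example, with such a syntax "-I dt(g)vpf" might wrap only the data
generation step in an explicit transaction, so that its commit time is
reported separately. The syntax is hypothetical, just to illustrate the
idea.)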


Also, I'm not sure about the one-line display; maybe it could be done while 
it is in progress, i.e. something like:

  dropping table...
  table drop: 1.67 s
  creating table...
  table creation: 0.11 s
  ...
In that case there is no need to store the actions and their 
durations; only the running total is needed.


--
Fabien.

Re: [HACKERS] [PATCH] Lockable views

2018-01-29 Thread Thomas Munro
On Tue, Jan 30, 2018 at 6:48 PM, Tatsuo Ishii  wrote:
>>> You need to DROP VIEW lock_view4 and lock_view5 in the regression
>>> test as well.
>>
>> Thank you for reviewing the patch.
>>
>> I fixed this and attached an updated patch v6.
>
> Looks good to me. If there's no objection, especially from Thomas
> Munro, I will mark this as "ready for committer".

About the idea:  it makes some kind of sense to me that we should lock
the underlying table, in all the same cases that you could do DML on
the view automatically.  I wonder if this is a problem for the
soundness:  "Tables appearing in a subquery are ignored and not
locked."  I can imagine using this for making backwards-compatible
schema changes, in which case the LOCK-based transaction isolation
techniques might benefit from this behaviour.  I'd be interested to
hear about the ideal use case you have in mind.

About the patch:  I didn't study it in detail.  It builds, has
documentation and passes all tests.  Would it be a good idea to add an
isolation test to show that the underlying relation is actually
locked?
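
Roughly along these lines, say (object names invented; just a sketch of
the sessions an isolation spec would encode):

  -- session 1
  BEGIN;
  LOCK TABLE lock_view1 IN ACCESS EXCLUSIVE MODE;

  -- session 2: the view's base table should now show up as locked
  SELECT c.relname, l.mode, l.granted
    FROM pg_locks l JOIN pg_class c ON c.oid = l.relation
   WHERE c.relname = 'lock_tbl1';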

Typo:

+ /* Check permissions with the view owner's priviledge. */

s/priviledge/privilege/

Grammar:

+/*
+ * Check whether the view is lockable.
+ *
+ * Currently, only auto-updatable views can be locked, that is,
+ * views whose definition are simple and that doesn't have
+ * instead of rules or triggers are lockable.

s/definition are simple and that doesn't/definition is simple and that don't/
s/instead of/INSTEAD OF/ ?
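
With both fixes applied the comment would read:

  /*
   * Check whether the view is lockable.
   *
   * Currently, only auto-updatable views can be locked, that is,
   * views whose definition is simple and that don't have
   * INSTEAD OF rules or triggers are lockable.
   */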

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] [PATCH] Lockable views

2018-01-29 Thread Tatsuo Ishii
>> You need to DROP VIEW lock_view4 and lock_view5 in the regression
>> test as well.
> 
> Thank you for reviewing the patch.
> 
> I fixed this and attached an updated patch v6.

Looks good to me. If there's no objection, especially from Thomas
Munro, I will mark this as "ready for committer".

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



RE: Temporary tables prevent autovacuum, leading to XID wraparound

2018-01-29 Thread Tsunakawa, Takayuki
From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
> On Mon, Jan 29, 2018 at 3:33 PM, Tsunakawa, Takayuki
>  wrote:
> > I can understand your concern.  On the other hand, it's unfair that one
> database could monopolize all workers, because other databases might also
> be facing wraparound risk.
> 
> On third thought, we can change the policy of launching workers so that
> the launcher dispatches workers evenly to wraparound-risky databases
> instead of choosing only one most wraparound-risky database.

+1

Regards
Takayuki Tsunakawa



Re: [HACKERS] [PATCH] Lockable views

2018-01-29 Thread Yugo Nagata
On Tue, 30 Jan 2018 13:58:30 +0900 (JST)
Tatsuo Ishii  wrote:

> > Attached is the updated patch v5, including SGML fixes and a rebase to HEAD.
> 
> You need to DROP VIEW lock_view4 and lock_view5 in the regression
> test as well.

Thank you for reviewing the patch.

I fixed this and attached an updated patch v6.

Regards,

> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp


-- 
Yugo Nagata 
diff --git a/doc/src/sgml/ref/lock.sgml b/doc/src/sgml/ref/lock.sgml
index b2c7203..6ddd128 100644
--- a/doc/src/sgml/ref/lock.sgml
+++ b/doc/src/sgml/ref/lock.sgml
@@ -46,6 +46,14 @@ LOCK [ TABLE ] [ ONLY ] name [ * ]
   
 
   
+   Automatically updatable views (see )
+   that do not have INSTEAD OF triggers or
+   INSTEAD rules are also lockable. When a view is
+   locked, its base relations are also locked recursively with the same
+   lock mode. Tables appearing in a subquery are ignored and not locked.
+  
+
+  
When acquiring locks automatically for commands that reference
tables, PostgreSQL always uses the least
restrictive lock mode possible. LOCK TABLE
diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
index 6479dcb..7ae3a84 100644
--- a/src/backend/commands/lockcmds.c
+++ b/src/backend/commands/lockcmds.c
@@ -19,15 +19,20 @@
 #include "commands/lockcmds.h"
 #include "miscadmin.h"
 #include "parser/parse_clause.h"
+#include "parser/parsetree.h"
 #include "storage/lmgr.h"
 #include "utils/acl.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
+#include "rewrite/rewriteHandler.h"
+#include "access/heapam.h"
 
-static void LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait);
-static AclResult LockTableAclCheck(Oid relid, LOCKMODE lockmode);
+static void LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait, Oid userid);
+static AclResult LockTableAclCheck(Oid relid, LOCKMODE lockmode, Oid userid);
 static void RangeVarCallbackForLockTable(const RangeVar *rv, Oid relid,
 			 Oid oldrelid, void *arg);
+static void LockViewRecurse(Oid reloid, Oid root_reloid, LOCKMODE lockmode, bool nowait);
+static void LockViewCheck(Relation view, Query *viewquery);
 
 /*
  * LOCK TABLE
@@ -62,8 +67,10 @@ LockTableCommand(LockStmt *lockstmt)
 		  RangeVarCallbackForLockTable,
		  (void *) &lockstmt->mode);
 
-		if (recurse)
-			LockTableRecurse(reloid, lockstmt->mode, lockstmt->nowait);
+		if (get_rel_relkind(reloid) == RELKIND_VIEW)
+			LockViewRecurse(reloid, reloid, lockstmt->mode, lockstmt->nowait);
+		else if (recurse)
+			LockTableRecurse(reloid, lockstmt->mode, lockstmt->nowait, GetUserId());
 	}
 }
 
@@ -86,15 +93,17 @@ RangeVarCallbackForLockTable(const RangeVar *rv, Oid relid, Oid oldrelid,
 		return;	/* woops, concurrently dropped; no permissions
  * check */
 
-	/* Currently, we only allow plain tables to be locked */
-	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
+
+	/* Currently, we only allow plain tables or auto-updatable views to be locked */
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE &&
+		relkind != RELKIND_VIEW)
 		ereport(ERROR,
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("\"%s\" is not a table",
+ errmsg("\"%s\" is not a table or view",
 		rv->relname)));
 
 	/* Check permissions. */
-	aclresult = LockTableAclCheck(relid, lockmode);
+	aclresult = LockTableAclCheck(relid, lockmode, GetUserId());
 	if (aclresult != ACLCHECK_OK)
 		aclcheck_error(aclresult, get_relkind_objtype(get_rel_relkind(relid)), rv->relname);
 }
@@ -107,7 +116,7 @@ RangeVarCallbackForLockTable(const RangeVar *rv, Oid relid, Oid oldrelid,
  * multiply-inheriting children more than once, but that's no problem.
  */
 static void
-LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait)
+LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait, Oid userid)
 {
 	List	   *children;
 	ListCell   *lc;
@@ -120,7 +129,7 @@ LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait)
 		AclResult	aclresult;
 
 		/* Check permissions before acquiring the lock. */
-		aclresult = LockTableAclCheck(childreloid, lockmode);
+		aclresult = LockTableAclCheck(childreloid, lockmode, userid);
 		if (aclresult != ACLCHECK_OK)
 		{
 			char	   *relname = get_rel_name(childreloid);
@@ -157,15 +166,82 @@ LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait)
 			continue;
 		}
 
-		LockTableRecurse(childreloid, lockmode, nowait);
+		LockTableRecurse(childreloid, lockmode, nowait, userid);
 	}
 }
 
 /*
+ * Apply LOCK TABLE recursively over a view
+ */
+static void
+LockViewRecurse(Oid reloid, Oid root_reloid, LOCKMODE lockmode, bool nowait)
+{
+	Relation		 view;
+	Query			*viewquery;
+	RangeTblRef		*rtr;
+	RangeTblEntry	*base_rte;
+	Oid baseoid;
+	AclResult		 aclresult;
+	char			*relname;
+	char			 relkind;
+
+	view = heap_open(reloid, NoLock);
+	

Re: Enhance pg_stat_wal_receiver view to display connected host

2018-01-29 Thread Michael Paquier
On Tue, Jan 30, 2018 at 03:10:12PM +1100, Haribabu Kommi wrote:
> Ok, understood. As libpq gives preference to the hostaddr connection
> parameter over host while connecting, how about going with one column,
> "remote_host", that displays either the hostaddr (if it exists) or the
> hostname? That way a single column displays the actual remote host it
> connected to.
>
> Note: The one-column approach for both host and hostaddr will depend on
> how we go with the PQhostaddr() function.

Yeah, we don't want to begin an open battle over that.  Using one column as
a first step would still be useful anyway.  If the discussion about
PQhostaddr() comes to a result at some point, then it could make sense
to integrate that with pg_stat_wal_receiver.  The way PQhost strangely
handles multiple host values is a separate inconsistency, though.

> OK. I will move the patch to next commitfest.

Thanks.  Let's see what others think on all those threads.
--
Michael




Re: [HACKERS] [PATCH] Lockable views

2018-01-29 Thread Tatsuo Ishii
> Attached is the updated patch v5, including SGML fixes and a rebase to HEAD.

You need to DROP VIEW lock_view4 and lock_view5 in the regression
test as well.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: [HACKERS] GnuTLS support

2018-01-29 Thread Michael Paquier
On Mon, Jan 29, 2018 at 09:16:56PM -0500, Peter Eisentraut wrote:
> On 1/28/18 23:43, Michael Paquier wrote:
>> The comment at the top of PQinitSSL mentions OpenSSL, you may want to
>> get rid of it as well.
> 
> I think that whole business is actually obsolete as of OpenSSL 1.1.0 and
> not applicable to GnuTLS, so we might just leave it to die with some
> appropriate documentation updates.

Yes, that would be fine as well.

>> To be honest, I find this refactoring confusing. As things stand on
>> HEAD, fe-secure.c depends on the contents of fe-secure-openssl.c, and
>> the dependency goes only in one direction.  With your patch, you also
>> make fe-secure-openssl.c call things within fe-secure.c.  It seems to me
>> that a cleaner split would be to introduce a common file for all the
>> low-level routines like the two ones you are introducing here, say
>> fe-secure-common.c, and pq_verify_peer_name_matches_certificate_name and
>> pq_verify_peer_name_matches_certificate should be moved to that.
> 
> I like that.  Updated patch attached.

Thanks for the new version. This looks sane to me.

 ifeq ($(with_openssl),yes)
-OBJS += fe-secure-openssl.o sha2_openssl.o
+OBJS += fe-secure-openssl.o fe-secure-common.o sha2_openssl.o
 else
As fe-secure-common.c will only be built with SSL builds, you also need
to tweak Mkvcbuild.pm and remove it from libpq's compilation.

+   /* If string does not end in pattern (minus the wildcard), we don't
+* match */
Such comment blocks are not Postgres-like.
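
For reference, the usual style is either a single line or a block with its
own opening and closing lines:

	/*
	 * If string does not end in pattern (minus the wildcard), we
	 * don't match.
	 */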

By the way, is src/interfaces/libpq/nls.mk updated automatically with
translations?
--
Michael




Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Jeff Davis
Hi,

On Mon, Jan 29, 2018 at 10:40 AM, Andres Freund  wrote:
> Hi,
>
> On 2018-01-29 10:28:18 -0800, Jeff Davis wrote:
>> OK. How about this: are you open to changes that move us in the
>> direction of extensibility later? (By this I do *not* mean imposing a
>> bunch of requirements on you... either small changes to your patches
>> or something part of another commit.)
>
> I'm good with that.
>
>
>> Or are you determined that this always should be a part of core?

> I'm strongly against there not being an in-core JIT. I'm not at all
> against adding APIs that allow to do different JIT implementations out
> of core.

I can live with that.

I recommend that you discuss with packagers and a few others, to
reduce the chance of disagreement later.

> Well, the source would require an actual compiler around. And the
> inlining *just* for the function code itself isn't actually that
> interesting, you e.g. want to also be able to

I think you hit enter too quickly... what's the rest of that sentence?

Regards,
  Jeff Davis



Re: Temporary tables prevent autovacuum, leading to XID wraparound

2018-01-29 Thread Masahiko Sawada
On Mon, Jan 29, 2018 at 3:33 PM, Tsunakawa, Takayuki
 wrote:
> From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
>> What I thought is that a worker reports these two values after scanning
>> pg_class and after freezing a table. The launcher decides to launch a new
>> worker if the number of tables requiring anti-wraparound vacuum is greater
>> than the number of workers running on the database. Similarly, the
>> autovacuum launcher doesn't launch a new worker if the two values are equal,
>> which means all tables requiring an anti-wraparound vacuum are being vacuumed.
>> There is a chance that a new relation is added while a worker is running
>> on the last table that requires anti-wraparound vacuum, and the launcher
>> launches a new worker on the database. I think it's no problem because the
>> new worker would update those two values and exit soon.
>
> I got it.  Currently, the launcher assigns all workers to one database even 
> if that database has only one table in danger of wraparound.  With your 
> suggestion, the launcher assigns as many workers as there are tables to be
> frozen, and uses the remaining workers for the other databases.
>
>
>> > How about just modifying do_start_worker(), so that the launcher chooses
>> a database in the following order?
>> >
>> > 1. wraparound-risky database not being vacuumed by any worker 2.
>> > non-wraparound-risky database  not being vacuumed by any worker 3.
>> > wraparound-risky database being vacuumed by any worker 4.
>> > non-wraparound-risky database being vacuumed by any worker
>> >
>>
>> IMO limiting the number of workers on a database to 1 seems risky.
>> If a database has many tables that require an anti-wraparound vacuum, it
>> takes a long time to freeze all of these tables. In the current
>> implementation, as I mentioned above, the launcher can launch multiple
>> workers on one database.
>
> I can understand your concern.  On the other hand, it's unfair that one 
> database could monopolize all workers, because other databases might also be 
> facing wraparound risk.

On third thought, we can change the policy of launching workers so
that the launcher dispatches workers evenly to wraparound-risky
databases instead of choosing only one most wraparound-risky database.
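
As a rough sketch of what "evenly" could mean here (hypothetical type and
fields, not the actual autovacuum structures):

  /* pick the wraparound-risky database with the fewest active workers */
  avw_db     *best = NULL;
  ListCell   *lc;

  foreach(lc, risky_dbs)
  {
      avw_db *db = (avw_db *) lfirst(lc);

      if (best == NULL || db->nworkers < best->nworkers)
          best = db;
  }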

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: Enhance pg_stat_wal_receiver view to display connected host

2018-01-29 Thread Haribabu Kommi
On Mon, Jan 29, 2018 at 7:06 PM, Michael Paquier 
wrote:

> On Tue, Jan 16, 2018 at 05:56:22PM +1100, Haribabu Kommi wrote:
> > Without PQhostaddr() function, for the connections where the host is not
> > specified, it will be difficult to find out to remote server.
>
> That's true as well, but hostaddr should be used with host only to save
> IP lookups... There are recent threads on the matter, like this one:
> https://www.postgresql.org/message-id/15728.1493654814%40sss.pgh.pa.us
> See particularly the commits cited in this message. PQhostaddr has been
> already introduced, and reverted in the tree.
>

Thanks for the link, I checked it. The main reason for the revert was
the use of host to return the hostaddr even though the host is not
specified. But it may be fine to return the hostaddr with a different
function. The following sequence of commits changed the behavior.

commit 274bb2b3 introduced support for the PQhost() function to
return the connected host (internally, the hostaddr as well).

commit 11003eb5 added a check to the PQhost() function so that it does
not return the host if the connection type is CHT_HOST_ADDRESS.

This is because an earlier change to the connectOptions2() function
replaces the host with the hostaddr; as a result, PQhost() would return
the hostaddr, which is not expected from PQhost() when the host is not
specified.

if (conn->pghostaddr != NULL && conn->pghostaddr[0] != '\0')
{
    conn->connhost[0].host = strdup(conn->pghostaddr);
    if (conn->connhost[0].host == NULL)
        goto oom_error;
    conn->connhost[0].type = CHT_HOST_ADDRESS;
}


But the above part of the code was changed in commit 7b02ba62 to
support "Allow multiple hostaddrs to go with multiple hostnames".
With this commit, the check from commit 11003eb5 that blocks host names
of the CHT_HOST_ADDRESS connection type is no longer required.



> This may lead to some confusion as well. Take for example this
> connection string:
> 'port=,5432 hostaddr=127.0.0.1,127.0.0.1 host=/tmp,/tmp'
> =# select remote_hostname, remote_hostaddr, remote_port from
>pg_stat_wal_receiver;
>  remote_hostname | remote_hostaddr | remote_port
> -+-+-
>  /tmp| 127.0.0.1   |5432
> (1 row)
> The documentation states that in this case the IP connection is used,
> though this can be confusing for users to show both.  I'll bet that we
> would get complains about that, without at least proper documentation.
>

Ok, understood. As libpq gives preference to the hostaddr connection
parameter over host while connecting, how about going with one column,
"remote_host", that displays either the hostaddr (if it exists) or the
hostname? That way a single column displays the actual remote host it
connected to.

Note: The one-column approach for both host and hostaddr will depend on
how we go with the PQhostaddr() function.
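
Hypothetically, with a single "remote_host" column the example above would
show only the value actually used for the connection, e.g.:

 remote_host | remote_port
-------------+-------------
 127.0.0.1   |5432
(1 row)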


> So my take would be to really just use PQhost and PQport, as this does
> not remove any usefulness of this feature.  If you want to use IP
> addresses, there is nothing preventing you from using them in host as well,
> and those would show up properly.  The commit fest is coming to an end,
> so my recommendation would be to move it to the next CF and get feedback
> on https://www.postgresql.org/message-id/CAJrrPGdrC4JTJQ4d7PT1Bi7K8nW91XPMPQ5kJ3GWK3ts%2BW-35g%40mail.gmail.com
> before concluding on this feature.  The problem with PQhost and multiple
> hosts is quite different than the 1st thread I am referring in this
> email, so let's wait and see for Robert's input.
>

OK. I will move the patch to next commitfest.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] GnuTLS support

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-29 22:41:53 -0500, Tom Lane wrote:
> But I think a big part of the value here is to verify that we've
> cleaned up our internal APIs to the point where a different SSL/TLS
> implementation *could* be rolled underneath.

Yea, I completely agree with that.

> As part of that, we certainly want to look at gnutls.  There might be
> more practical value (at least in the short term) in porting to the
> macOS or Windows native TLS stacks.  But the more different libraries
> we look at in the process, the less likely we are to paint ourselves
> into a corner.

That's true. But any further development in the area is already going to
be painful with three libraries (openssl, native windows, native osx),
adding support for a fourth that doesn't buy us anything just seems to
make the situation worse.

Anyway, I'm only -0.5 on it, and I've said my spiel...

Greetings,

Andres Freund



Re: [HACKERS] GnuTLS support

2018-01-29 Thread Tom Lane
Andres Freund  writes:
> FWIW, I'm -0.5 on adding gnutls support. I've not seen any non-vague
> arguments for it, and having debugged gnutls using applications in the
> past, I'm not convinced we're not primarily increasing our workload by
> adding support. If gnutls would improve our windows or OSX situation,
> I'd think differently, but afaics it doesn't.

That's a fair point.  But I think a big part of the value here is to
verify that we've cleaned up our internal APIs to the point where a
different SSL/TLS implementation *could* be rolled underneath.  As part
of that, we certainly want to look at gnutls.  There might be more
practical value (at least in the short term) in porting to the macOS or
Windows native TLS stacks.  But the more different libraries we look at
in the process, the less likely we are to paint ourselves into a corner.

regards, tom lane



Re: [HACKERS] GnuTLS support

2018-01-29 Thread Michael Paquier
On Mon, Jan 29, 2018 at 07:39:33PM -0500, Peter Eisentraut wrote:
> I think most users actually still think about the whole topic as "SSL",
> and the leading library is called "OpenSSL", so I think we're fine.

Yes, that's my impression on the matter as well.  While renaming the
client-side parameters does not sound really plausible, the server-side
parameters could be renamed with an implementation-related prefix if an
implementation other than OpenSSL is used.  Until that happens, any
server-side renaming does not justify the breakage in my opinion.
--
Michael




Re: [HACKERS] GnuTLS support

2018-01-29 Thread Michael Paquier
On Mon, Jan 29, 2018 at 06:24:18PM -0800, Andres Freund wrote:
> FWIW, I'm -0.5 on adding gnutls support. I've not seen any non-vague
> arguments for it, and having debugged gnutls using applications in the
> past, I'm not convinced we're not primarily increasing our workload by
> adding support. If gnutls would improve our windows or OSX situation,
> I'd think differently, but afaics it doesn't.

That's an interesting point.  The last patch set presented by Peter
improves the pluggability situation for Windows and OSX as well, so
those are a good addition anyway.
--
Michael




Re: [HACKERS] GnuTLS support

2018-01-29 Thread Andres Freund
On 2018-01-17 12:30:16 -0500, Peter Eisentraut wrote:
> On 1/2/18 10:35, Peter Eisentraut wrote:
> > On 11/26/17 20:05, Andreas Karlsson wrote:
> >> I have now implemented this in the attached patch (plus added support 
> >> for channel binding and rebased it) but I ran into one issue which I 
> >> have not yet solved. The script for the windows version takes the 
> >> --with-openssl= switch so that cannot just be translated to a 
> >> single --with-ssl switch. Should we have both --with-openssl and 
> >> --with-gnutls or --with-ssl=(openssl|gnutls) and --with-ssl-path=? 
> >> I also do not know the Windows build code very well (or really at all).
> > 
> > This patch appears to work well.
> 
> Seeing that Andres is apparently currently not available, I have started
> to dig through this patch myself and made some adjustments.

I presume you meant Andreas, right?


FWIW, I'm -0.5 on adding gnutls support. I've not seen any non-vague
arguments for it, and having debugged gnutls using applications in the
past, I'm not convinced we're not primarily increasing our workload by
adding support. If gnutls would improve our windows or OSX situation,
I'd think differently, but afaics it doesn't.

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.1

2018-01-29 Thread Craig Ringer
On 29 January 2018 at 22:53, Andres Freund  wrote:

> Hi,
>
> On 2018-01-23 23:20:38 -0800, Andres Freund wrote:
> > == Code ==
> >
> > As the patchset is large (500kb) and I'm still quickly evolving it, I do
> > not yet want to attach it. The git tree is at
> >   https://git.postgresql.org/git/users/andresfreund/postgres.git
> > in the jit branch
> >   https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit
>
> I've just pushed an updated and rebased version of the tree:
> - Split the large "jit infrastructure" commits into a number of smaller
>   commits
> - Split the C++ file
> - Dropped some of the performance stuff done to heaptuple.c - that was
>   mostly to make performance comparisons a bit more interesting, but
>   doesn't seem important enough to deal with.
> - Added a commit renaming datetime.h symbols so they don't conflict with
>   LLVM variables anymore, removing ugly #undef PM/#define PM dance
>   around includes. Will post separately.
> - Reduced the number of pointer constants in the generated LLVM IR, by
>   doing more getelementptr accesses (stem from before the time types
>   were automatically synced)
> - Increased number of comments a bit
>
> There's a jit-before-rebase-2018-01-29 tag, for the state of the tree
> before the rebase.


If you submit the C++ support separately I'd like to sign up as a reviewer
and get that in. It's non-intrusive and just makes our existing C++
compilation support actually work properly. Your patch is a more complete
version of the C++ support I hacked up during linux.conf.au - I should've
thought to look in your tree.

The only part I had to add that I don't see in yours is a workaround for
mismatched throw() annotations on our redefinition of inet_net_ntop:


src/include/port.h:

@@ -421,7 +425,7 @@ extern int   pg_codepage_to_encoding(UINT cp);

 /* port/inet_net_ntop.c */
 extern char *inet_net_ntop(int af, const void *src, int bits,
-  char *dst, size_t size);
+  char *dst, size_t size) __THROW;


src/include/c.h:

@@ -1131,6 +1131,16 @@ extern intfdatasync(int fildes);
 #define NON_EXEC_STATIC static
 #endif

+/*
+ * glibc uses __THROW when compiling with the c++ compiler, but port.h
+ * redeclares inet_net_ntop. If we don't annotate it the same way as the
+ * prototype in <arpa/inet.h> we'll upset g++, so we must use __THROW from
+ * <sys/cdefs.h>. If we're not on glibc, we need to define it away.
+ */
+#ifndef __GNU_LIBRARY__
+#define __THROW
+#endif
+
 /* /port compatibility functions */
 #include "port.h"


This might be better solved by renaming it to pg_inet_net_ntop so we don't
conflict with a standard name.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: PATCH: Exclude unlogged tables from base backups

2018-01-29 Thread David Steele
On 1/29/18 8:10 PM, Masahiko Sawada wrote:
> On Tue, Jan 30, 2018 at 5:45 AM, Adam Brightwell
>  wrote:
>> On Mon, Jan 29, 2018 at 1:17 PM, David Steele  wrote:
>>>
>>> Whoops, my bad.  Temp relations are stored in the db directories with a
>>> "t" prefix.  Looks like we can take care of those easily enough but I
>>> think it should be a separate patch.
>>>
>>> I'll plan to submit that for CF 2018-03.
> 
> +1
> 
>>
>> I agree, I believe this should be a separate patch.
>>
>> As for the latest patch above, I have reviewed, applied and tested it.
>>
>> It looks good to me. As well, it applies cleanly against master at
>> (97d4445a03). All tests passed when running 'check-world'.
>>
>> If it is agreed that the temp file exclusion should be submitted as a
>> separate patch, then I will mark 'ready for committer'.
> 
> Agreed, please mark this patch as "Ready for Committer".

I marked it just in case some enterprising committer from another time
zone swoops in and picks it up.  Fingers crossed!

-- 
-David
da...@pgmasters.net



Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

2018-01-29 Thread David G. Johnston
On Mon, Jan 29, 2018 at 2:55 AM, David Rowley 
wrote:

> On 28 January 2018 at 12:00, Tomas Vondra 
> wrote:
> > On 01/27/2018 10:45 PM, Tom Lane wrote:
> >> David Rowley  writes:
> >>> I'd offer to put it back to the order of the enum, but I want to
> >>> minimise the invasiveness of the patch. I'm not sure yet if it should
> >>> be classed as a bug fix or a new feature.
> >>
> >> FWIW, I'd call it a new feature.
> >>
> >
> > I'm not sure what exactly the feature would be? I mean "keep statistics
> > even if you only ask for indexes" does not seem particularly helpful to
> > me. So I see it more like a bug - I certainly think it should have been
> > handled differently in 10.
>
> Now I'll ask; On me doing so, would anyone have pushed back on that
> request and said that what I'm asking is a separate feature?
>
> If the answer to that is "no", then this is a bug that should be fixed
> and back-patched to v10.


It's a bug of omission (I'm going with no one saying no to your
proposition) - though that still doesn't automatically allow us to
back-patch it.

This bug has an obvious if annoying work-around and fixing the bug will
likely cause people's code, that uses said work-around, to fail.  Breaking
people's working code in stable release branches is generally a no-no.

However, the fact that this was discovered 4 months after the feature was
released suggests to me that we are justified, and community-serving, in
back-patching this.  Put more bluntly, we can ask for more leeway in the first
few patch releases of a new feature since more people will benefit from 5
years of a fully-baked feature than may be harmed by said change.  We
shouldn't abuse that but an obvious new feature bug/oversight like this
seems reasonable.

David J.


Re: PATCH: Exclude unlogged tables from base backups

2018-01-29 Thread Masahiko Sawada
On Tue, Jan 30, 2018 at 5:45 AM, Adam Brightwell
 wrote:
> On Mon, Jan 29, 2018 at 1:17 PM, David Steele  wrote:
>> On 1/29/18 9:13 AM, David Steele wrote:
>>> On 1/29/18 5:28 AM, Masahiko Sawada wrote:
>>>> But I
>>>> have a question; can we exclude temp tables as well? The pg_basebackup
>>>> includes even temp tables. But I don't think that it's necessary for
>>>> backups
>>> Thank you for having another look at the patch.
>>>
>>> Temp tables should be excluded by this code which is already in
>>> basebackup.c:
>>>
>>> /* Skip temporary files */
>>> if (strncmp(de->d_name,
>>>             PG_TEMP_FILE_PREFIX,
>>>             strlen(PG_TEMP_FILE_PREFIX)) == 0)
>>>     continue;
>>>
>>> This looks right to me.
>>
>>
>> Whoops, my bad.  Temp relations are stored in the db directories with a
>> "t" prefix.  Looks like we can take care of those easily enough but I
>> think it should be a separate patch.
>>
>> I'll plan to submit that for CF 2018-03.

+1

>
> I agree, I believe this should be a separate patch.
>
> As for the latest patch above, I have reviewed, applied and tested it.
>
> It looks good to me. As well, it applies cleanly against master at
> (97d4445a03). All tests passed when running 'check-world'.
>
> If it is agreed that the temp file exclusion should be submitted as a
> separate patch, then I will mark 'ready for committer'.

Agreed, please mark this patch as "Ready for Committer".

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] path toward faster partition pruning

2018-01-29 Thread Amit Langote
Hi Jesper.

On 2018/01/29 22:13, Jesper Pedersen wrote:
> Hi Amit,
> 
> On 01/26/2018 04:17 AM, Amit Langote wrote:
>> I made a few of those myself in the updated patches.
>>
>> Thanks a lot again for your work on this.
>>
> 
> This needs a rebase.

AFAICS, v22 cleanly applies to HEAD (c12693d8f3 [1]), compiles, and make
check passes.

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c12693d8f3




Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Michael Paquier
On Mon, Jan 29, 2018 at 04:34:48PM +, Simon Riggs wrote:
> In terms of timing of commits, I have marked the patch Ready For
> Committer. To me that signifies that it is ready for review by a
> Committer prior to commit.

My understanding of this meaning is different than yours.  It should not
be the author's role to mark his own patch as ready for committer, but
the role of one or more people who have reviewed in-depth the proposed
patch and feature concepts.  If you can get a committer-level individual
to review your patch, then good for you.  But review basics need to
happen first.  And based on my rough lookup of this thread this has not
happened yet.  Other people on this thread are pointing out that as
well.
--
Michael




Re: Security leak with trigger functions?

2018-01-29 Thread Peter Eisentraut
On 1/22/18 16:04, Chapman Flack wrote:
>> PostgreSQL only allows a trigger action of "call this function", so in 
>> the SQL standard context that would mean we'd need to check the EXECUTE 
>> privilege of the owner of the trigger.  The trick is figuring out who 
>> the owner is.  If it's the owner of the table, then TRIGGER privilege 
>> is effectively total control over the owner of the table.  If it's 
>> whoever created the trigger, it might be useful, but I don't see how 
>> that is compatible with the intent of the SQL standard.
> 
> Hmm, it's been not quite a dozen years, have there been later threads
> that followed up on this discussion?

No, I don't think anything has changed here.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-30 00:16:46 +0100, Tomas Vondra wrote:
> FWIW I've installed llvm 5.0.1 from distribution package, and now
> everything builds fine (I don't even need the configure tweak).
> 
> I think I had to build the other binaries because there was no 5.x llvm
> back then, but it's too far back so I don't remember.
> 
> Anyway, seems I'm fine for now.

Phew, I'm relieved.  I'd guess you built a 5.0 version while 5.0 was
still in development, so not all 5.0 functionality was available. Hence
the inconsistent-looking result.  While I think we can support 4.0
without too much problem, there's obviously no point in trying to
support old between-releases versions...

> Sorry for the noise.

No worries.

- Andres



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Tomas Vondra


On 01/29/2018 11:49 PM, Tomas Vondra wrote:
> 
> ...
>
> and that indeed changes the failure to this:
> 
> Writing postgres.bki
> Writing schemapg.h
> Writing postgres.description
> Writing postgres.shdescription
> llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
> llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
> a member of ‘llvm’
>    llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
>    ^~~~
> llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
> llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>    llvm::remove_bad_alloc_error_handler();
>    ^~~~
> llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
> llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>    llvm::remove_bad_alloc_error_handler();
>    ^~~~
> make[3]: *** [: llvmjit_error.o] Error 1
> make[2]: *** [common.mk:45: lib-recursive] Error 2
> make[2]: *** Waiting for unfinished jobs
> make[1]: *** [Makefile:38: all-backend-recurse] Error 2
> make: *** [GNUmakefile:11: all-src-recurse] Error 2
> 
> 
> I'm not sure what that means, though ... maybe my system really is
> broken in some strange way.
> 

FWIW I've installed llvm 5.0.1 from distribution package, and now
everything builds fine (I don't even need the configure tweak).

I think I had to build the other binaries because there was no 5.x llvm
back then, but it's too far back so I don't remember.

Anyway, seems I'm fine for now. Sorry for the noise.


-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-29 23:49:14 +0100, Tomas Vondra wrote:
> On 01/29/2018 11:17 PM, Andres Freund wrote:
> > On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
> >> $ llvm-config --version
> >> 5.0.0svn
> >
> > Is that llvm-config the one in /usr/local/include/ referenced by the
> > error message above?
>
> I don't see it referenced anywhere, but it comes from here:
>
> $ which llvm-config
> /usr/local/bin/llvm-config
>
> > Or is it possible that llvm-config is from a different version than
> > the one the compiler picks the headers up from?
> >
>
> I don't think so. I don't have any other llvm versions installed, AFAICS.

Hm.


> > could you go to src/backend/lib, rm llvmjit.o, and show the full output
> > of make llvmjit.o?
> >
>
> Attached.
>
> > I wonder whether the issue is that my configure patch does
> > -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
> > rather than
> > -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
> > and that it thus picks up the wrong header first?
> >
>
> I've tried this configure tweak:
>
>    if test -n "$LLVM_CONFIG"; then
>      for pgac_option in `$LLVM_CONFIG --cflags`; do
>        case $pgac_option in
> -        -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
> +        -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
>        esac
>      done
>
> and that indeed changes the failure to this:

Err, huh?  I don't understand how that can change anything if you
actually have only one version of LLVM installed. Perhaps the
effect was just an ordering-related artifact of [parallel] make?
I.e. just a question of what failed first?


> Writing postgres.bki
> Writing schemapg.h
> Writing postgres.description
> Writing postgres.shdescription
> llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
> llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
> a member of ‘llvm’
>    llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
>    ^~~~
> llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
> llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>    llvm::remove_bad_alloc_error_handler();
>    ^~~~
> llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
> llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
> member of ‘llvm’
>    llvm::remove_bad_alloc_error_handler();
>    ^~~~

It's a bit hard to interpret this without the actual compiler
invocation. But I've just checked both manually by inspecting 5.0 source
and by compiling against 5.0 that that function definition definitely
exists:

andres@alap4:~/src/llvm-5$ git branch
  master
* release_50
andres@alap4:~/src/llvm-5$ ack remove_bad_alloc_error_handler
lib/Support/ErrorHandling.cpp
139:void llvm::remove_bad_alloc_error_handler() {

include/llvm/Support/ErrorHandling.h
101:void remove_bad_alloc_error_handler();

So does my system llvm 5:
$ ack remove_bad_alloc_error_handler /usr/include/llvm-5.0/
/usr/include/llvm-5.0/llvm/Support/ErrorHandling.h
101:void remove_bad_alloc_error_handler();

But not in 4.0:
$ ack remove_bad_alloc_error_handler /usr/include/llvm-4.0/


> gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
> -Wendif-labels -Wmissing-format-attribute -Wformat-security 
> -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
> -fno-omit-frame-pointer -O2 -I../../../src/include  -D_GNU_SOURCE 
> -I/usr/local/include -DNDEBUG -DLLVM_BUILD_GLOBAL_ISEL -D_GNU_SOURCE 
> -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS   -c -o 
> llvmjit.o llvmjit.c
> llvmjit.c: In function ‘llvm_get_function’:
> llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ 
> from incompatible pointer type [-Wincompatible-pointer-types]
>   if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
>  ^
> In file included from llvmjit.c:45:0:
> /usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
> but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
>  LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
>   ^~~

To me this looks like those headers are from llvm 4, rather than 5:
$ grep -A2 -B3 LLVMOrcGetSymbolAddress ~/src/llvm-4/include/llvm-c/OrcBindings.h
/**
 * Get symbol address from JIT instance.
 */
LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
 const char *SymbolName);

$ grep -A3 -B3 LLVMOrcGetSymbolAddress ~/src/llvm-5/include/llvm-c/OrcBindings.h
/**
 * Get symbol address from JIT instance.
 */
LLVMOrcErrorCode LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
 LLVMOrcTargetAddress *RetAddr,
 const char *SymbolName);

So it does appear that your llvm-config and the actually installed llvm
don't quite agree. How did you install llvm?


Re: Add RANGE with values and exclusions clauses to the Window Functions

2018-01-29 Thread Oliver Ford
On Monday, 29 January 2018, Tom Lane  wrote:

> Oliver Ford  writes:
> > On Monday, 29 January 2018, Tom Lane  wrote:
> >> I've started to go through this in some detail, and I'm wondering why
> >> you invented a FRAMEOPTION_EXCLUDE_NO_OTHERS option bit rather than
> >> just representing that choice as default (0).
>
> > My guess is that it's a little like putting "ORDER BY x ASC" when ASC is
> > usually default behavior - it adds some documentation, perhaps for people
> > new to SQL or to make your intention more explicit. That's the only
> reason
> > I can think of as to why the standards committee included it.
>
> Yeah, they like to do that.  And "ORDER BY x ASC" is actually a precise
> precedent, because we don't print ASC either, cf get_rule_orderby().
>
> regards, tom lane
>

I would strongly suggest taking it out entirely then. There really doesn't
seem to be a point in adding a new keyword and a new condition in the
grammar if it is going to do absolutely nothing.

If anyone thinks it's useful to have I can just take it out of ruleutils
and remove its define. But personally I would remove it entirely as it's
really just clutter.


Re: Add RANGE with values and exclusions clauses to the Window Functions

2018-01-29 Thread Tom Lane
Oliver Ford  writes:
> On Monday, 29 January 2018, Tom Lane  wrote:
>> I've started to go through this in some detail, and I'm wondering why
>> you invented a FRAMEOPTION_EXCLUDE_NO_OTHERS option bit rather than
>> just representing that choice as default (0).

> My guess is that it's a little like putting "ORDER BY x ASC" when ASC is
> usually default behavior - it adds some documentation, perhaps for people
> new to SQL or to make your intention more explicit. That's the only reason
> I can think of as to why the standards committee included it.

Yeah, they like to do that.  And "ORDER BY x ASC" is actually a precise
precedent, because we don't print ASC either, cf get_rule_orderby().

regards, tom lane



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Tomas Vondra


On 01/29/2018 11:17 PM, Andres Freund wrote:
> On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
>> On 01/29/2018 10:57 PM, Andres Freund wrote:
>>> Hi,
>>>
>>> On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
>>>> Hi, I wanted to look at this, but my attempts to build the jit branch
>>>> fail with some compile-time warnings (uninitialized variables) and
>>>> errors (unknown types, incorrect number of arguments). See the file
>>>> attached.
>>>
>>> Which git hash are you building?  What llvm version is this building
>>> against?  If you didn't specify LLVM_CONFIG=... what does llvm-config
>>> --version return?
>>>
>>
>> I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
>> HEAD in the jit branch, AFAICS).
> 
> The warnings come from an incomplete patch I probably shouldn't have
> pushed (Heavily-WIP: JIT hashing.). They should largely be irrelevant
> (although will cause a handful of "ERROR: hm" regression failures),
> but I'll definitely pop that commit on the next rebase.  If you want you
> can just reset --hard to its parent.
> 

OK

> 
> That errors are weird however:
> 
>> ...  ^
> 
>> I'm building like this:
>>
>> $ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
>>   --with-llvm --prefix=/home/postgres/pg-llvm
>>
>> $ make -s -j4 install
>>
>> and llvm-config --version says this:
>>
>> $ llvm-config --version
>> 5.0.0svn
> 
> Is that llvm-config the one in /usr/local/include/ referenced by the
> error message above?

I don't see it referenced anywhere, but it comes from here:

$ which llvm-config
/usr/local/bin/llvm-config

> Or is it possible that llvm-config is from a different version than
> the one the compiler picks the headers up from?
> 

I don't think so. I don't have any other llvm versions installed, AFAICS.

> could you go to src/backend/lib, rm llvmjit.o, and show the full output
> of make llvmjit.o?
> 

Attached.

> I wonder whether the issue is that my configure patch does
> -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
> rather than
> -I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
> and that it thus picks up the wrong header first?
> 

I've tried this configure tweak:

   if test -n "$LLVM_CONFIG"; then
 for pgac_option in `$LLVM_CONFIG --cflags`; do
   case $pgac_option in
--I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
+-I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
   esac
 done

and that indeed changes the failure to this:

Writing postgres.bki
Writing schemapg.h
Writing postgres.description
Writing postgres.shdescription
llvmjit_error.cpp: In function ‘void llvm_enter_fatal_on_oom()’:
llvmjit_error.cpp:61:3: error: ‘install_bad_alloc_error_handler’ is not
a member of ‘llvm’
   llvm::install_bad_alloc_error_handler(fatal_llvm_new_handler);
   ^~~~
llvmjit_error.cpp: In function ‘void llvm_leave_fatal_on_oom()’:
llvmjit_error.cpp:77:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
   llvm::remove_bad_alloc_error_handler();
   ^~~~
llvmjit_error.cpp: In function ‘void llvm_reset_fatal_on_oom()’:
llvmjit_error.cpp:92:3: error: ‘remove_bad_alloc_error_handler’ is not a
member of ‘llvm’
   llvm::remove_bad_alloc_error_handler();
   ^~~~
make[3]: *** [: llvmjit_error.o] Error 1
make[2]: *** [common.mk:45: lib-recursive] Error 2
make[2]: *** Waiting for unfinished jobs
make[1]: *** [Makefile:38: all-backend-recurse] Error 2
make: *** [GNUmakefile:11: all-src-recurse] Error 2


I'm not sure what that means, though ... maybe my system really is
broken in some strange way.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Wendif-labels -Wmissing-format-attribute -Wformat-security 
-fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-fno-omit-frame-pointer -O2 -I../../../src/include  -D_GNU_SOURCE 
-I/usr/local/include -DNDEBUG -DLLVM_BUILD_GLOBAL_ISEL -D_GNU_SOURCE 
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS   -c -o 
llvmjit.o llvmjit.c
llvmjit.c: In function ‘llvm_get_function’:
llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from 
incompatible pointer type [-Wincompatible-pointer-types]
  if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
 ^
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c:239:6: error: too many arguments to function ‘LLVMOrcGetSymbolAddress’
  if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
  ^~~
In file included from llvmjit.c:45:0:

Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Peter Geoghegan
On Mon, Jan 29, 2018 at 1:34 PM, Simon Riggs  wrote:
>> The way that routines like ExecUpdate() interact with MERGE for WHEN
>> qual + EPQ handling seems kind of convoluted. I hope for something
>> cleaner in the next revision.
>
> Cleaner?

Yeah, cleaner. The fact that WHEN quals kind of participate in EPQ
evaluation without that logic being in execMain.c seems like it could be
a lot cleaner. I don't have more specifics than that right now.

>> I don't think that ERRCODE_T_R_SERIALIZATION_FAILURE is ever okay in
>> READ COMMITTED mode.
>
> We use that code already in Hot Standby in READ COMMITTED mode.
>
> What code should it be? It needs to be a retryable errcode.

I don't think that there should be any error, so I can't say.

>> Anyway, I wonder why we shouldn't just go ahead
>> an do the WHEN NOT MATCHED INSERT on the basis of information which is
>> not visible to our snapshot. We choose to not UPDATE/DELETE on the
>> basis of information from the future already, within EPQ. My first
>> feeling is that this is a distinction without a difference, and you
>> should actually go and INSERT at this point, though I reserve the
>> right to change my mind about that.
>>
>> Yeah, EPQ semantics are icky, but would actually inserting here really
>> be any worse than what we do?
>
> I argued that was possible and desirable, but you argued it was not
> and got everybody else to agree with you. I'm surprised to see you
> change your mind on that.

You're mistaken. Nothing I've said here is inconsistent with my
previous remarks on how we deal with concurrency.

> At your request I have specifically written the logic to avoid that.
> I'm happy with that, for now, since what we have is correct, even if
> we can do better later.
>
> We can have fun with looping, live locks and weird errors another day.
> SQL Standard says this area is implementation defined.

Who said anything about a loop? Where is the loop here?

I will rephrase what I said, to make it top-down rather than
bottom-up, which may make my intent clearer:

According to your documentation, "MERGE provides a single SQL
statement that can conditionally INSERT, UPDATE or DELETE rows, a task
that would otherwise require multiple procedural language statements".
But you're introducing a behavior/error that would not occur in
equivalent procedural client code consisting of an UPDATE followed by
a (conditionally executed) INSERT statement when run in READ COMMITTED
mode. You actually get exactly the concurrency issue that you cite as
unacceptable in justifying your serialization error with such
procedural code (when the UPDATE didn't affect any rows, only
following EPQ walking the UPDATE chain from the snapshot-visible
tuple, making the client code decide to do an INSERT on the basis of
information "from the future").

I think that it isn't this patch's job to make READ COMMITTED mode any
safer than it is in that existing scenario. A scenario that doesn't
involve ON CONFLICT at all.

>> * The SQL standard explicitly imposes this limitation:
>>
>> postgres=# explain merge into cities a using cities b on a.city =
>> b.city when matched and (select 1=1) then update set country =
>> b.country;
>> ERROR:  0A000: cannot use subquery in WHEN AND condition
>> LINE 1: ...sing cities b on a.city = b.city when matched and (select 1=...
>>  ^
>> LOCATION:  transformSubLink, parse_expr.c:1865
>>
>> Why do you, though? Do we really need it? I'm just curious about your
>> thoughts on it. To be clear, I'm not asserting that you're wrong.
>
> Which limitation? Not allowing sub-selects? They are not supported, as
> the message says.

I'm simply asking you to explain why you think that it would be
problematic or even impossible to support it. The question is asked
without any agenda. I'm verifying my own understanding, as much as
anything else. I've acknowledged that the standard has something to
say on this that supports your position, which has real weight.

>> * Subselect handling is buggy:
>>
>> postgres=# merge into cities a using cities b on a.city = b.city when
>> matched and a.city = 'Straffan' then update set country = (select
>> 'Some country');
>> ERROR:  XX000: unrecognized node type: 114
>> LOCATION:  ExecInitExprRec, execExpr.c:2114
>
> Not buggy, subselects are not supported in WHEN AND clauses because
> they are not part of the planned query, nor can they be if we want to
> handle the WHEN clause logic per spec.

I'm not asking about WHEN AND here (that was my last question). I'm
asking about a subselect that appears in the targetlist.

(In any case, "unrecognized node type: 114" seems buggy to me in any context.)

>> * Why no CTE support? SQL Server has this.
>
> The SQL Standard doesn't require CTEs or RETURNING syntax, but they
> could in time be supported.

No time like the present. Is there some reason why it would be
difficult with our implementation of CTEs? I can't think why it 

Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
On 2018-01-29 23:01:14 +0100, Tomas Vondra wrote:
> On 01/29/2018 10:57 PM, Andres Freund wrote:
> > Hi,
> > 
> > On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
> >> Hi, I wanted to look at this, but my attempts to build the jit branch
> >> fail with some compile-time warnings (uninitialized variables) and
> >> errors (unknown types, incorrect number of arguments). See the file
> >> attached.
> > 
> > Which git hash are you building?  What llvm version is this building
> > against?  If you didn't specify LLVM_CONFIG=... what does llvm-config
> > --version return?
> > 
> 
> I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
> HEAD in the jit branch, AFAICS).

The warnings come from an incomplete patch I probably shouldn't have
pushed (Heavily-WIP: JIT hashing.). They should largely be irrelevant
(although will cause a handful of "ERROR: hm" regression failures),
but I'll definitely pop that commit on the next rebase.  If you want you
can just reset --hard to its parent.


That errors are weird however:

> llvmjit.c: In function ‘llvm_get_function’:
> llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ 
> from incompatible pointer type [-Wincompatible-pointer-types]
>   if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
>  ^
> In file included from llvmjit.c:45:0:
> /usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
> but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
>  LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
>   ^~~
> llvmjit.c:239:6: error: too many arguments to function 
> ‘LLVMOrcGetSymbolAddress’
>   if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
>   ^~~
> In file included from llvmjit.c:45:0:
> /usr/local/include/llvm-c/OrcBindings.h:129:22: note: declared here
>  LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
>   ^~~
> llvmjit.c:243:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ 
> from incompatible pointer type [-Wincompatible-pointer-types]
>   if (LLVMOrcGetSymbolAddress(llvm_opt3_orc, , mangled))
>  ^

> I'm building like this:
> 
> $ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
>   --with-llvm --prefix=/home/postgres/pg-llvm
> 
> $ make -s -j4 install
> 
> and llvm-config --version says this:
> 
> $ llvm-config --version
> 5.0.0svn

Is that llvm-config the one in /usr/local/include/ referenced by the
error message above?  Or is it possible that llvm-config is from a
different version than the one the compiler picks the headers up from?

could you go to src/backend/lib, rm llvmjit.o, and show the full output
of make llvmjit.o?

I wonder whether the issue is that my configure patch does
-I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
rather than
-I*|-D*) CPPFLAGS="$pgac_option $CPPFLAGS";;
and that it thus picks up the wrong header first?

Greetings,

Andres Freund



Re: Add RANGE with values and exclusions clauses to the Window Functions

2018-01-29 Thread Oliver Ford
On Monday, 29 January 2018, Tom Lane  wrote:

> Oliver Ford  writes:
> > [ 0001-window-frame-v9.patch ]
>
> I've started to go through this in some detail, and I'm wondering why
> you invented a FRAMEOPTION_EXCLUDE_NO_OTHERS option bit rather than
> just representing that choice as default (0).  As you have it, a
> window definition that explicitly specifies EXCLUDE NO OTHERS will be
> considered unequal to one that just defaults to that behavior, in
> e.g. transformWindowFuncCall().  That seems undesirable, if not
> outright wrong.  Against that, having the bit allows ruleutils.c
> to print "EXCLUDE NO OTHERS" when the input included that, but that
> seems like a wash if not an anti-feature.  We've never been big on
> making ruleutils output distinguish explicit from implicit selection
> of a default setting, and in this case it could possibly lead to
> outputting a query in a form that is not backwards-compatible to
> older PG versions, when there's no need to be incompatible.


Exclude No Others does seem pretty pointless to me, but it's in the
standard so I included it as an option that can be printed by ruleutils. I
can't imagine it being much used, but if people do want to document that
they are not excluding other rows they can do so.

My guess is that it's a little like putting "ORDER BY x ASC" when ASC is
already the default behavior - it adds some documentation, perhaps for people
new to SQL or to make your intention more explicit. That's the only reason
I can think of as to why the standards committee included it.
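
For example (sketch only, using the patch's syntax on a made-up table t):

    -- The two frames below are equivalent; EXCLUDE NO OTHERS merely
    -- spells out the default, much like an explicit ASC:
    SELECT sum(x) OVER (ORDER BY x ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
    FROM t;
    SELECT sum(x) OVER (ORDER BY x ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
                        EXCLUDE NO OTHERS)
    FROM t;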


> If there's some other consideration I'm missing, please say what;
> otherwise I'll change it.
>
> BTW, I generally recommend not including a catversion change in
> submitted patches.  It causes merge problems any time some other
> catversion-bumping patch gets committed, and it can't possibly be
> the right value for the final commit since you aren't likely to
> be able to predict that date in advance.  It surely doesn't hurt
> to remind the committer that a catversion bump is needed, but just
> do that in the submission message.
>

Ok won't do that again.


Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Tomas Vondra
On 01/29/2018 10:57 PM, Andres Freund wrote:
> Hi,
> 
> On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
>> Hi, I wanted to look at this, but my attempts to build the jit branch
>> fail with some compile-time warnings (uninitialized variables) and
>> errors (unknown types, incorrect number of arguments). See the file
>> attached.
> 
> Which git hash are you building?  What llvm version is this building
> against?  If you didn't specify LLVM_CONFIG=... what does llvm-config
> --version return?
> 

I'm building against fdc6c7a6dddbd6df63717f2375637660bcd00fc6 (current
HEAD in the jit branch, AFAICS).

I'm building like this:

$ ./configure --enable-debug CFLAGS="-fno-omit-frame-pointer -O2" \
  --with-llvm --prefix=/home/postgres/pg-llvm

$ make -s -j4 install

and llvm-config --version says this:

$ llvm-config --version
5.0.0svn


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] why not parallel seq scan for slow functions

2018-01-29 Thread Robert Haas
On Sun, Jan 28, 2018 at 10:13 PM, Amit Kapila  wrote:
> If we want to get rid of Gather (Merge) checks in
> apply_projection_to_path(), then we need some way to add a projection
> path to the subpath of gather node even if that is capable of
> projection as we do now.  I think changing the order of applying
> scan/join target won't address that unless we decide to do it for
> every partial path.  Another way could be that we handle that in
> generate_gather_paths, but I think that won't be the ideal place to add
> projection.
>
> If we want, we can compute the parallel-safety of scan/join target
> once in grouping_planner and then pass it in apply_projection_to_path
> to address your main concern.

I spent some time today hacking on this; see attached.  It needs more
work, but you can see what I have in mind.  It's not quite the same as
what I outlined before because that turned out to not quite work, but
it does remove most of the logic from apply_projection_to_path().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


parallel-paths-tlist-cost-rmh.patch
Description: Binary data


Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-29 22:51:38 +0100, Tomas Vondra wrote:
> Hi, I wanted to look at this, but my attempts to build the jit branch
> fail with some compile-time warnings (uninitialized variables) and
> errors (unknown types, incorrect number of arguments). See the file
> attached.

Which git hash are you building?  What llvm version is this building
against?  If you didn't specify LLVM_CONFIG=... what does llvm-config
--version return?

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Tomas Vondra


On 01/24/2018 08:20 AM, Andres Freund wrote:
> Hi,
> 
> I've spent the last weeks working on my LLVM compilation patchset. In
> the course of that I *heavily* revised it. While still a good bit away
> from committable, it's IMO definitely not a prototype anymore.
> 
> There's too many small changes, so I'm only going to list the major
> things. A good bit of that is new. The actual LLVM IR emissions itself
> hasn't changed that drastically.  Since I've not described them in
> detail before I'll describe from scratch in a few cases, even if things
> haven't fully changed.
> 

Hi, I wanted to look at this, but my attempts to build the jit branch
fail with some compile-time warnings (uninitialized variables) and
errors (unknown types, incorrect number of arguments). See the file
attached.

I wonder if I'm doing something wrong, or if there's something wrong
with my environment. I do have this:

$ clang -v
clang version 5.0.0 (trunk 299717)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Writing fmgroids.h
Writing fmgrprotos.h
Writing fmgrtab.c
Writing postgres.bki
Writing schemapg.h
Writing postgres.description
Writing postgres.shdescription
../../../src/include/lib/simplehash.h: In function ‘tuplehash_insert’:
execGrouping.c:428:28: warning: ‘slot’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
  econtext->ecxt_innertuple = slot;
  ~~^~
execGrouping.c:402:18: note: ‘slot’ was declared here
  TupleTableSlot *slot;
  ^~~~
../../../src/include/lib/simplehash.h: In function ‘tuplehash_lookup’:
execGrouping.c:428:28: warning: ‘slot’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
  econtext->ecxt_innertuple = slot;
  ~~^~
execGrouping.c:402:18: note: ‘slot’ was declared here
  TupleTableSlot *slot;
  ^~~~
../../../src/include/lib/simplehash.h: In function ‘tuplehash_delete’:
execGrouping.c:428:28: warning: ‘slot’ may be used uninitialized in this 
function [-Wmaybe-uninitialized]
  econtext->ecxt_innertuple = slot;
  ~~^~
execGrouping.c:402:18: note: ‘slot’ was declared here
  TupleTableSlot *slot;
  ^~~~
llvmjit.c: In function ‘llvm_get_function’:
llvmjit.c:239:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from 
incompatible pointer type [-Wincompatible-pointer-types]
  if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
 ^
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c:239:6: error: too many arguments to function ‘LLVMOrcGetSymbolAddress’
  if (LLVMOrcGetSymbolAddress(llvm_opt0_orc, , mangled))
  ^~~
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: declared here
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c:243:45: warning: passing argument 2 of ‘LLVMOrcGetSymbolAddress’ from 
incompatible pointer type [-Wincompatible-pointer-types]
  if (LLVMOrcGetSymbolAddress(llvm_opt3_orc, , mangled))
 ^
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: expected ‘const char *’ 
but argument is of type ‘LLVMOrcTargetAddress * {aka long unsigned int *}’
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c:243:6: error: too many arguments to function ‘LLVMOrcGetSymbolAddress’
  if (LLVMOrcGetSymbolAddress(llvm_opt3_orc, , mangled))
  ^~~
In file included from llvmjit.c:45:0:
/usr/local/include/llvm-c/OrcBindings.h:129:22: note: declared here
 LLVMOrcTargetAddress LLVMOrcGetSymbolAddress(LLVMOrcJITStackRef JITStack,
  ^~~
llvmjit.c: In function ‘llvm_compile_module’:
llvmjit.c:383:3: error: unknown type name ‘LLVMSharedModuleRef’
   LLVMSharedModuleRef smod;
   ^~~
llvmjit.c:388:10: warning: implicit declaration of function 
‘LLVMOrcMakeSharedModule’ [-Wimplicit-function-declaration]
   smod = LLVMOrcMakeSharedModule(context->module);
  ^~~
llvmjit.c:389:48: warning: passing argument 2 of ‘LLVMOrcAddEagerlyCompiledIR’ 
from incompatible pointer type [-Wincompatible-pointer-types]
   if 

VALUES nodes and expression initialization

2018-01-29 Thread Andres Freund
Hi,

In contrast to most other nodes, nodeValuesscan.c does expression
initialization at "runtime" rather than at initialization time:

    /*
     * Get rid of any prior cycle's leftovers.  We use ReScanExprContext
     * not just ResetExprContext because we want any registered shutdown
     * callbacks to be called.
     */
    ReScanExprContext(econtext);

    /*
     * Build the expression eval state in the econtext's per-tuple memory.
     * This is a tad unusual, but we want to delete the eval state again
     * when we move to the next row, to avoid growth of memory
     * requirements over a long values list.
     */
    oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);

this does make a good bit of sense for things like
INSERT ... VALUES (...), (...), ...
or even just a plain long
VALUES (...), (...), ...
statement (albeit the latter being a bit pointless).

but when *joining* something with a VALUES(), that's far less
beneficial. That can lead to the same VALUES() node being scanned over
and over, doing a lot of redundant initialization.

I noticed this when JITing the regression tests, and forcing every
single expression to be JITed. So maybe, just maybe, not the most
relevant case.

But it's not particularly hard to think of scenarios where that leads to
spending a good chunk of time in expression initialization, even leaving
JIT aside.

The JIT case is easy enough to fix / work around, I'm however wondering
if we should do something about the repeated scan case. It's a
reasonably common pattern to join to VALUES to e.g. add additional
information to the results of an aggregate.
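
For instance, something along these lines (illustration only; table and
column names are made up):

    SELECT v.label, s.n
    FROM (SELECT status, count(*) AS n FROM events GROUP BY status) s
    JOIN (VALUES (1, 'open'), (2, 'closed')) AS v(code, label)
      ON s.status = v.code;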


The case that triggered me looking at this is:
regression[17061][1]=# explain select count(*) from tenk1 a,  tenk1 b join lateral (values(a.unique1),(-1)) ss(x) on b.unique2 = ss.x;
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                 QUERY PLAN                                                 │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=1437.69..1437.70 rows=1 width=8)                                                          │
│   ->  Hash Join  (cost=394.91..1387.81 rows=19952 width=0)                                                 │
│         Hash Cond: ("*VALUES*".column1 = b.unique2)                                                        │
│         ->  Nested Loop  (cost=0.29..718.84 rows=19952 width=4)                                            │
│               ->  Index Only Scan using tenk1_unique1 on tenk1 a  (cost=0.29..269.93 rows=9976 width=4)    │
│               ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=4)                              │
│         ->  Hash  (cost=269.93..269.93 rows=9976 width=4)                                                  │
│               ->  Index Only Scan using tenk1_unique2 on tenk1 b  (cost=0.29..269.93 rows=9976 width=4)    │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(8 rows)

Here materialization doesn't help, because there's a lateral dependency
preventing VALUES from materializing.

Does anybody think such cases are common enough that we should do
something about the constant re-initialization?

Greetings,

Andres Freund



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 20:41, Peter Geoghegan  wrote:
> On Mon, Jan 29, 2018 at 6:11 AM, Simon Riggs  wrote:
>> New patch attached that correctly handles all known concurrency cases,
>> with expected test output.
>
> This revision, v13, seems much improved in this area.
>
>> This means MERGE will work just fine for "normal" UPDATEs, but it will
>> often fail (deterministically) in concurrent tests with mixed
>> insert/deletes or UPDATEs that touch the PK, as requested.
>
> * While looking at how you're handling concurrency/EPQ now, I noticed this code:
>
>> +   /*
>> +* Test condition, if any
>> +*
>> +* In the absence of a condition we perform the 
>> action
>> +* unconditionally (no need to check separately since
>> +* ExecQual() will return true if there are no
>> +* conditions to evaluate).
>> +*/
>> +   if (!ExecQual(action->whenqual, econtext))
>> +   {
>> +   if (BufferIsValid(buffer))
>> +   ReleaseBuffer(buffer);
>> +   continue;
>> +   }
>
> (As well as its interactions with ExecUpdate() + ExecDelete(), which
> are significant.)
>
> The way that routines like ExecUpdate() interact with MERGE for WHEN
> qual + EPQ handling seems kind of convoluted. I hope for something
> cleaner in the next revision.

Cleaner?

> * This also stood out (not sure if you were referring/alluding to this
> in the quoted text):
>
>> +   /*
>> +* If EvalPlanQual did not return a tuple, 
>> it means we
>> +* have seen a concurrent delete, or a 
>> concurrent update
>> +* where the row has moved to another 
>> partition.
>> +*
>> +* UPDATE ignores this case and continues.
>> +*
>> +* If MERGE has a WHEN NOT MATCHED clause we 
>> know that the
>> +* user would like to INSERT something in 
>> this case, yet
>> +* we can't see the delete with our 
>> snapshot, so take the
>> +* safe choice and throw an ERROR. If the 
>> user didn't care
>> +* about WHEN NOT MATCHED INSERT then 
>> neither do we.
>> +*
>> +* XXX We might consider setting matched = 
>> false and loop
>> +* back to lmerge though we'd need to do 
>> something like
>> +* EvalPlanQual, but not quite.
>> +*/
>> +   else if (epqstate->epqresult == 
>> EPQ_TUPLE_IS_NULL &&
>> +node->mt_merge_subcommands & 
>> ACL_INSERT)
>> +   {
>> +   /*
>> +* We need to throw a retryable ERROR 
>> because of the
>> +* concurrent update which we can't 
>> handle.
>> +*/
>> +   ereport(ERROR,
>> +   
>> (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
>> +errmsg("could not serialize 
>> access due to concurrent update")));
>> +   }
>
> I don't think that ERRCODE_T_R_SERIALIZATION_FAILURE is ever okay in
> READ COMMITTED mode.

We use that code already in Hot Standby in READ COMMITTED mode.

What code should it be? It needs to be a retryable errcode.

> Anyway, I wonder why we shouldn't just go ahead
> and do the WHEN NOT MATCHED INSERT on the basis of information which is
> not visible to our snapshot. We choose to not UPDATE/DELETE on the
> basis of information from the future already, within EPQ. My first
> feeling is that this is a distinction without a difference, and you
> should actually go and INSERT at this point, though I reserve the
> right to change my mind about that.
>
> Yeah, EPQ semantics are icky, but would actually inserting here really
> be any worse than what we do?

I argued that was possible and desirable, but you argued it was not
and got everybody else to agree with you. I'm surprised to see you
change your mind on that.

At your request I have specifically written the logic to avoid that.
I'm happy with that, for now, since what we have is correct, even if
we can do better later.

We can have fun with looping, livelocks and weird errors another day.
SQL Standard says this area is implementation defined.


Re: PATCH: Configurable file mode mask

2018-01-29 Thread David Steele
On 1/19/18 4:43 PM, Peter Eisentraut wrote:
> On 1/19/18 14:07, David Steele wrote:
>> I have yet to add tests for pg_resetwal and pg_upgrade.  pg_resetwal
>> doesn't *have* any tests as far as I can tell and pg_upgrade has tests
>> in a shell script -- it's not clear how I would extend it or reuse the
>> Perl code for perm testing.
>>
>> Does anyone have suggestions on tests for pg_resetwal and pg_upgrade?
>> Should I start from scratch?
> 
> A test suite for pg_resetwal would be nice.

Agreed.

> TAP tests for pg_upgrade will create problems with the build farm.
> There was a recent thread about that.

OK, that being the case I have piggy-backed on the current pg_upgrade
tests in the same way that --wal-segsize did.

There are now three patches:

1) 01-pgresetwal-test

Adds a *very* basic test suite for pg_resetwal.  I was able to make this
utility core dump (floating point errors) pretty easily with empty or
malformed pg_control files so I focused on WAL reset functionality plus
the basic help/command tests that every utility has.

2) 02-mkdir

Converts mkdir() calls to MakeDirectoryDefaultPerm() if the original
call used default permissions (see the sketch after this list).

3) 03-group

Allow group access on PGDATA.  This includes backend changes, utility
changes (initdb, pg_ctl, pg_upgrade, etc.) and tests for each utility.
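
For reference, the 02-mkdir wrapper amounts to something like this
(sketch only; the actual patch may differ, and pg_dir_create_mode stands
in for whatever default-mode variable the patch introduces):

    /* Sketch: replaces hard-coded mkdir(dir, S_IRWXU) call sites */
    int
    MakeDirectoryDefaultPerm(const char *directoryName)
    {
        return mkdir(directoryName, pg_dir_create_mode);
    }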

Regards,
-- 
-David
da...@pgmasters.net
diff --git a/src/bin/pg_resetwal/Makefile b/src/bin/pg_resetwal/Makefile
index 5ab7ad33e0..13c9ca6279 100644
--- a/src/bin/pg_resetwal/Makefile
+++ b/src/bin/pg_resetwal/Makefile
@@ -33,3 +33,9 @@ uninstall:
 
 clean distclean maintainer-clean:
rm -f pg_resetwal$(X) $(OBJS)
+
+check:
+   $(prove_check)
+
+installcheck:
+   $(prove_installcheck)
diff --git a/src/bin/pg_resetwal/t/001_basic.pl b/src/bin/pg_resetwal/t/001_basic.pl
new file mode 100644
index 00..234bd70303
--- /dev/null
+++ b/src/bin/pg_resetwal/t/001_basic.pl
@@ -0,0 +1,53 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 13;
+
+my $tempdir   = TestLib::tempdir;
+my $tempdir_short = TestLib::tempdir_short;
+
+program_help_ok('pg_resetwal');
+program_version_ok('pg_resetwal');
+program_options_handling_ok('pg_resetwal');
+
+# Initialize node without replication settings
+my $node = get_new_node('main');
+my $pgdata = $node->data_dir;
+my $pgwal = "$pgdata/pg_wal";
+$node->init;
+$node->start;
+
+# Remove WAL from pg_wal and make sure it gets rebuilt
+$node->stop;
+
+unlink("$pgwal/000000010000000000000001") == 1
+   or die("unable to remove 000000010000000000000001");
+
+is_deeply(
+   [sort(slurp_dir($pgwal))], [sort(qw(. .. archive_status))], 'no WAL');
+
+$node->command_ok(['pg_resetwal', '-D', $pgdata], 'recreate pg_wal');
+
+is_deeply(
+   [sort(slurp_dir($pgwal))],
+   [sort(qw(. .. archive_status 000000010000000000000002))],
+   'WAL recreated');
+
+$node->start;
+
+# Reset to specific WAL segment
+$node->stop;
+
+$node->command_ok(
+   ['pg_resetwal', '-l', '000000070000000700000007', '-D', $pgdata],
+   'set to specific WAL');
+
+is_deeply(
+   [sort(slurp_dir($pgwal))],
+   [sort(qw(. .. archive_status 000000070000000700000007))],
+   'WAL recreated');
+
+$node->start;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e42b828edf..9814ac6d3e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4086,7 +4086,7 @@ ValidateXLOGDirectoryStructure(void)
{
ereport(LOG,
 				(errmsg("creating missing WAL directory \"%s\"", path)));
-		if (mkdir(path, S_IRWXU) < 0)
+		if (MakeDirectoryDefaultPerm(path) < 0)
 			ereport(FATAL,
 					(errmsg("could not create missing directory \"%s\": %m",
 							path)));
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 5c450caa4e..746b8c121b 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -151,7 +151,7 @@ TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
else
{
/* Directory creation failed? */
-   if (mkdir(dir, S_IRWXU) < 0)
+   if (MakeDirectoryDefaultPerm(dir) < 0)
{
char   *parentdir;
 
@@ -173,7 +173,7 @@ TablespaceCreateDbspace(Oid spcNode, Oid dbNode, bool isRedo)
get_parent_directory(parentdir);
get_parent_directory(parentdir);
 			/* Can't create parent and it doesn't already exist? */
-			if (mkdir(parentdir, S_IRWXU) < 0 && errno != 

Re: Add RANGE with values and exclusions clauses to the Window Functions

2018-01-29 Thread Tom Lane
Oliver Ford  writes:
> [ 0001-window-frame-v9.patch ]

I've started to go through this in some detail, and I'm wondering why
you invented a FRAMEOPTION_EXCLUDE_NO_OTHERS option bit rather than
just representing that choice as default (0).  As you have it, a
window definition that explicitly specifies EXCLUDE NO OTHERS will be
considered unequal to one that just defaults to that behavior, in
e.g. transformWindowFuncCall().  That seems undesirable, if not
outright wrong.  Against that, having the bit allows ruleutils.c
to print "EXCLUDE NO OTHERS" when the input included that, but that
seems like a wash if not an anti-feature.  We've never been big on
making ruleutils output distinguish explicit from implicit selection
of a default setting, and in this case it could possibly lead to
outputting a query in a form that is not backwards-compatible to
older PG versions, when there's no need to be incompatible.

If there's some other consideration I'm missing, please say what;
otherwise I'll change it.

BTW, I generally recommend not including a catversion change in
submitted patches.  It causes merge problems any time some other
catversion-bumping patch gets committed, and it can't possibly be
the right value for the final commit since you aren't likely to
be able to predict that date in advance.  It surely doesn't hurt
to remind the committer that a catversion bump is needed, but just
do that in the submission message.

regards, tom lane



Re: [HACKERS] datetime.h defines like PM conflict with external libraries

2018-01-29 Thread Andres Freund
Hi,

On 2017-10-04 11:36:56 +0200, Alvaro Herrera wrote:
> Andrew Dunstan wrote:
> > On 10/03/2017 04:43 PM, Tom Lane wrote:
> > > I like the new-header-file idea because it will result in minimal code
> > > churn and thus minimal back-patching hazards.

I'm not sure it's that little code churn, and the insulation isn't
great. Based on my WIP patch adding a DT_ prefix it would affect at
least:

 contrib/adminpack/adminpack.c  |   8 +-
 src/backend/parser/gram.y  |  52 -
 src/backend/utils/adt/date.c   |  50 
 src/backend/utils/adt/datetime.c   | 614 +-
 src/backend/utils/adt/formatting.c |  10 +-
 src/backend/utils/adt/json.c   |   8 +-
 src/backend/utils/adt/nabstime.c   |  32 +++---
 src/backend/utils/adt/timestamp.c  | 196 +++
 src/backend/utils/adt/xml.c|   6 +-
 src/backend/utils/misc/tzparser.c  |   4 +-
 src/bin/pg_waldump/compat.c|   6 +-
 src/include/utils/datetime.h   | 216 +--

so I'm not quite convinced this that well isolated pieces of code.  I
wonder if just moving the defines around won't primarily increase pain.


I have however, for now, worked around the need to deal with this
problem (by moving more stuff into .c files that are careful about their
includes). So this is more about historical reasons; I do not have an
urgent need to work on this.


> > > I do *not* like "PG_PM".  For our own purposes that adds no uniqueness
> > > at all.  If we're to touch these symbols then I'd go for names like
> > > "DATETIME_PM".  Or maybe "DT_PM" ... there's a little bit of precedent
> > > for the DT_ prefix already.
> > 
> > Yeah. If we use a prefix +1 for DT_. If we do that then I think they
> > should *all* be prefixed, not just the ones we know of conflicts for.

Attached is a WIP patch doing exactly this conversion for
datetime.h. Note that we'd want to do something about ecpg's dt.h if we
were to go for that approach.

While the changes are fairly verbose, they're also mechanical, so I
suspect the issues around backpatching - not that that code changes that
much - wouldn't be too hard to resolve.


> Maybe it'd be good idea to unify some of that stuff so that ecpg can use
> it, too?  Having a second copy of the same stuff in
> src/interfaces/ecpg/pgtypeslib/dt.h is pretty terrible.  Even if not,
> let's make sure they don't diverge.

I agree that that would be quite an advantage. It's more than just
datetime.h that'd need to be usable by ecpg. Luckily timestamp.h would
probably be easy,
commit a7801b62f21bd051444bd1119cd3745ecc8e14ec
Author: Tom Lane 
Date:   2011-09-09 13:23:41 -0400

Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h.
provides the basics.  I suspect we'd want to do something very similar
for datetime?


I however wonder if even that would be really going far enough - we'd
still end up with a lot of copied functions:

int     DecodeInterval(char **, int *, int, int *, struct tm *, fsec_t *);
int     DecodeTime(char *, int *, struct tm *, fsec_t *);
void    EncodeDateTime(struct tm *tm, fsec_t fsec, bool print_tz, int tz, const char *tzn, int style, char *str, bool EuroDates);
void    EncodeInterval(struct tm *tm, fsec_t fsec, int style, char *str);
int     tm2timestamp(struct tm *, fsec_t, int *, timestamp *);
int     DecodeUnits(int field, char *lowtoken, int *val);
bool    CheckDateTokenTables(void);
void    EncodeDateOnly(struct tm *tm, int style, char *str, bool EuroDates);
int     GetEpochTime(struct tm *);
int     ParseDateTime(char *, char *, char **, int *, int *, char **);
int     DecodeDateTime(char **, int *, int, int *, struct tm *, fsec_t *, bool);
void    j2date(int, int *, int *, int *);
void    GetCurrentDateTime(struct tm *);
int     date2j(int, int, int);
void    TrimTrailingZeros(char *);
void    dt2time(double, int *, int *, int *, fsec_t *);

I suspect starting to implement infrastructure to deal with that would be
a bit bigger a task than I can chew on right now though.  Medium term, it
seems to me, we should start actually moving a lot of the adt code into a
library that can be included (or possibly just compiled?) both by
frontend and backend code.  Which kinda seems to imply we'd need
compatible elog support for frontend code, which I'd wished for many
times.

Michael, is there any problem including datatype/* headers in ecpg that
are frontend clean? I see no such usage so far, that's why I'm asking.


Greetings,

Andres Freund
>From eb8fa449c7d06763128c2c33fd1bb972f9b719d1 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 25 Jan 2018 22:06:40 -0800
Subject: [PATCH] WIP: Deconflict 

Re: PATCH: Exclude unlogged tables from base backups

2018-01-29 Thread Adam Brightwell
On Mon, Jan 29, 2018 at 1:17 PM, David Steele  wrote:
> On 1/29/18 9:13 AM, David Steele wrote:
>> On 1/29/18 5:28 AM, Masahiko Sawada wrote:
>>> But I
>>> have a question; can we exclude temp tables as well? The pg_basebackup
>>> includes even temp tables. But I don't think that it's necessary for
>>> backups
>> Thank you for having another look at the patch.
>>
>> Temp tables should be excluded by this code which is already in
>> basebackup.c:
>>
>> /* Skip temporary files */
>> if (strncmp(de->d_name,
>> PG_TEMP_FILE_PREFIX,
>> strlen(PG_TEMP_FILE_PREFIX)) == 0)
>> continue;
>>
>> This looks right to me.
>
>
> Whoops, my bad.  Temp relations are stored in the db directories with a
> "t" prefix.  Looks like we can take care of those easily enough but I
> think it should be a separate patch.
>
> I'll plan to submit that for CF 2018-03.

I agree, I believe this should be a separate patch.

As for the latest patch above, I have reviewed, applied and tested it.

It looks good to me. As well, it applies cleanly against master at
(97d4445a03). All tests passed when running 'check-world'.

If it is agreed that the temp file exclusion should be submitted as a
separate patch, then I will mark 'ready for committer'.

-Adam



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Peter Geoghegan
On Mon, Jan 29, 2018 at 6:11 AM, Simon Riggs  wrote:
> New patch attached that correctly handles all known concurrency cases,
> with expected test output.

This revision, v13, seems much improved in this area.

> This means MERGE will work just fine for "normal" UPDATEs, but it will
> often fail (deterministically) in concurrent tests with mixed
> insert/deletes or UPDATEs that touch the PK, as requested.

* While looking at how you're handling concurrency/EPQ now, I noticed this code:

> +   /*
> +    * Test condition, if any
> +    *
> +    * In the absence of a condition we perform the action
> +    * unconditionally (no need to check separately since
> +    * ExecQual() will return true if there are no
> +    * conditions to evaluate).
> +    */
> +   if (!ExecQual(action->whenqual, econtext))
> +   {
> +       if (BufferIsValid(buffer))
> +           ReleaseBuffer(buffer);
> +       continue;
> +   }

(As well as its interactions with ExecUpdate() + ExecDelete(), which
are significant.)

The way that routines like ExecUpdate() interact with MERGE for WHEN
qual + EPQ handling seems kind of convoluted. I hope for something
cleaner in the next revision.

* This also stood out (not sure if you were referring/alluding to this
in the quoted text):

> +   /*
> +    * If EvalPlanQual did not return a tuple, it means we
> +    * have seen a concurrent delete, or a concurrent update
> +    * where the row has moved to another partition.
> +    *
> +    * UPDATE ignores this case and continues.
> +    *
> +    * If MERGE has a WHEN NOT MATCHED clause we know that the
> +    * user would like to INSERT something in this case, yet
> +    * we can't see the delete with our snapshot, so take the
> +    * safe choice and throw an ERROR. If the user didn't care
> +    * about WHEN NOT MATCHED INSERT then neither do we.
> +    *
> +    * XXX We might consider setting matched = false and loop
> +    * back to lmerge though we'd need to do something like
> +    * EvalPlanQual, but not quite.
> +    */
> +   else if (epqstate->epqresult == EPQ_TUPLE_IS_NULL &&
> +            node->mt_merge_subcommands & ACL_INSERT)
> +   {
> +       /*
> +        * We need to throw a retryable ERROR because of the
> +        * concurrent update which we can't handle.
> +        */
> +       ereport(ERROR,
> +               (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
> +                errmsg("could not serialize access due to concurrent update")));
> +   }

I don't think that ERRCODE_T_R_SERIALIZATION_FAILURE is ever okay in
READ COMMITTED mode. Anyway, I wonder why we shouldn't just go ahead
and do the WHEN NOT MATCHED INSERT on the basis of information which is
not visible to our snapshot. We choose to not UPDATE/DELETE on the
basis of information from the future already, within EPQ. My first
feeling is that this is a distinction without a difference, and you
should actually go and INSERT at this point, though I reserve the
right to change my mind about that.

Yeah, EPQ semantics are icky, but would actually inserting here really
be any worse than what we do?

Other things I noticed (not related to concurrency) following a fairly
quick pass:

* Basic tab completion support would be nice.

* The SQL standard explicitly imposes this limitation:

postgres=# explain merge into cities a using cities b on a.city =
b.city when matched and (select 1=1) then update set country =
b.country;
ERROR:  0A000: cannot use subquery in WHEN AND condition
LINE 1: ...sing cities b on a.city = b.city when matched and (select 1=...
 ^
LOCATION:  transformSubLink, parse_expr.c:1865

Why do you, though? Do we really need it? I'm just curious about your
thoughts on it. To be clear, I'm not asserting that you're wrong.

* ISTM that this patch should have inserted/updated/deleted rows, in

Re: FOR EACH ROW triggers on partitioned tables

2018-01-29 Thread Peter Eisentraut
On 1/23/18 17:10, Alvaro Herrera wrote:
> The main question is this: when running the trigger function, it is
> going to look as it is running in the context of the partition, not in
> the context of the parent partitioned table (TG_RELNAME etc).  That
> seems mildly ugly: some users may be expecting that the partitioning
> stuff is invisible to the rest of the system, so if you have triggers on
> a regular table and later on decide to partition that table, the
> behavior of triggers will change, which is maybe unexpected.  Maybe this
> is not really a problem, but I'm not sure and would like further
> opinions.

One could go either way on this, but I think reporting the actual table
partition is acceptable and preferable.  If you are writing a generic
trigger function, maybe to dump out all columns, you want to know the
physical table and its actual columns.  It's easy[citation needed] to
get the partition root for a given table, if the trigger code needs
that.  The other way around is not possible.
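
For instance, a one-level lookup along these lines ('my_partition' is a
made-up name; nested partitioning would need a recursive query):

    SELECT inhparent::regclass
    FROM pg_inherits
    WHERE inhrelid = 'my_partition'::regclass;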

Some other comments from reading the patch:

It seems to generally follow the patterns of the partitioned indexes
patch, which is good.

I think WHEN clauses on partition triggers should be OK.  I don't see a
reason to disallow them.

Similarly, transition tables should be OK.  You only get the current
partition to look at, of course.

The function name CloneRowTriggersOnPartition() confused me.  A more
accurate phrasing might be CloneRowTriggersToPartition(), or maybe
reword altogether.

New CommandCounterIncrement() call in AttachPartitionEnsureIndexes()
should be explained.  Or maybe it belongs in ATExecAttachPartition()
between the calls to AttachPartitionEnsureIndexes() and
CloneRowTriggersOnPartition()?

Prohibition against constraint triggers is unclear.  The subsequent
foreign-key patches mess with that further.  It's not clear to me why
constraint triggers shouldn't be allowed like normal triggers.

Obvious missing things: documentation, pg_dump, psql updates

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Built-in connection pooling

2018-01-29 Thread Vladimir Sitnikov
Bruce>Well, we could have the connection pooler disconnect those, right?

I agree. Do you think we could rely on all the applications being
configured in a sane way?
A fallback configuration at DB level could still be useful to ensure the DB
keeps running in case multiple applications access it. It might be
non-trivial to ensure proper configurations across all the apps.

What I do like is that the behaviour of dropping connections should already
be handled by existing applications, so it should fit naturally with the
existing apps.

Alternative approach might be to dump to disk relevant resources for
inactive sessions, so the session could be recreated in case the connection
is requested again after a long pause (e.g. reprepare all the statements),
however it sounds scary.

Vladimir


Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-29 10:28:18 -0800, Jeff Davis wrote:
> OK. How about this: are you open to changes that move us in the
> direction of extensibility later? (By this I do *not* mean imposing a
> bunch of requirements on you... either small changes to your patches
> or something part of another commit.)

I'm good with that.


> Or are you determined that this always should be a part of core?

I do think JIT compilation should be in core, yes. And after quite some
looking around that currently means either using LLVM or building our
own from scratch, and the latter doesn't seem attractive. But that
doesn't mean there can't *also* be extensibility. If somebody wants to
experiment with a more advanced version of JIT compilation, develop a
gcc backed version (which can't be in core due to licensing), ... - I'm
happy to provide hooks that only require a reasonable effort and don't
affect the overall stability of the system (i.e. no callback from
PostgresMain()'s sigsetjmp() block).


> I don't want to stand in your way, but I am also hesitant to dive head
> first into LLVM and not look back. Postgres has always been lean, fast
> building, and with few dependencies.

It's an optional dependency, and it doesn't increase build time that
much... If we were to move the llvm interfacing code to a .so, there'd
not even be a packaging issue, you can just package that .so separately
and get errors if somebody tries to enable LLVM without that .so being
installed.


> In other words, are you "strongly against [extensibility being a
> requirement for the first commit]" or "strongly against [extensible
> JIT]"?

I'm strongly against there not being an in-core JIT. I'm not at all
against adding APIs that allow to do different JIT implementations out
of core.


> If the source for functions is in the catalog, we could build the
> bitcode at runtime and still do the inlining. We wouldn't need to do
> anything at build time. (Again, this would be "cool stuff for the
> future", I am not asking you for it now.)

Well, the source would require an actual compiler around. And the
inlining *just* for the function code itself isn't actually that
interesting, you e.g. want to also be able to

Greetings,

Andres Freund



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Jeff Davis
On Mon, Jan 29, 2018 at 1:36 AM, Andres Freund  wrote:
> There's already a *lot* of integration points in the patchseries. Error
> handling needs to happen in parts of code we do not want to make
> extensible, the defintion of expression steps has to exactly match, the
> core code needs to emit the right types for syncing, the core code needs
> to define the right FIELDNO accessors, there needs to be planner
> integrations.  Many of those aren't doable with even remotely the same
> effort, both initial and continual, from non-core code

OK. How about this: are you open to changes that move us in the
direction of extensibility later? (By this I do *not* mean imposing a
bunch of requirements on you... either small changes to your patches
or something part of another commit.) Or are you determined that this
always should be a part of core?

I don't want to stand in your way, but I am also hesitant to dive head
first into LLVM and not look back. Postgres has always been lean, fast
building, and with few dependencies. Who knows what LLVM will do in
the future and how that will affect postgres? Especially when, on day
one, we already know that it causes a few annoyances?

In other words, are you "strongly against [extensibility being a
requirement for the first commit]" or "strongly against [extensible
JIT]"?

>> > Well, but doing this outside of core would pretty much prohibit doing so
>> > forever, no?
>>
>> First of all, building .bc files at build time is much less invasive
>> than linking to the LLVM library.
>
> Could you expand on that, I don't understand why that'd be the case?

Building the .bc files at build time depends on LLVM, but is not very
version-dependent and has no impact on the resulting binary. That's
less invasive than a dependency on a library with an unstable API that
doesn't entirely work with our error reporting facility.

>> Third, there's lots of cool stuff we can do here:
>>   * put the source in the catalog
>>   * an extension could have its own catalog and build the source into
>> bitcode and cache it there
>>   * the source for functions would flow to replicas, etc.
>>   * security-conscious environments might even choose to run some of
>> the C code in a safe C interpreter rather than machine code
>
> I agree, but what does that have to do with the llvmjit stuff being an
> extension or not?

If the source for functions is in the catalog, we could build the
bitcode at runtime and still do the inlining. We wouldn't need to do
anything at build time. (Again, this would be "cool stuff for the
future", I am not asking you for it now.)

Regards,
 Jeff Davis



Re: PATCH: Exclude unlogged tables from base backups

2018-01-29 Thread David Steele
On 1/29/18 9:13 AM, David Steele wrote:
> On 1/29/18 5:28 AM, Masahiko Sawada wrote:
>> But I
>> have a question; can we exclude temp tables as well? The pg_basebackup
>> includes even temp tables. But I don't think that it's necessary for
>> backups
> Thank you for having another look at the patch.
> 
> Temp tables should be excluded by this code which is already in
> basebackup.c:
> 
> /* Skip temporary files */
> if (strncmp(de->d_name,
> PG_TEMP_FILE_PREFIX,
> strlen(PG_TEMP_FILE_PREFIX)) == 0)
> continue;
> 
> This looks right to me.


Whoops, my bad.  Temp relations are stored in the db directories with a
"t" prefix.  Looks like we can take care of those easily enough but I
think it should be a separate patch.
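
For the record, temp relation files are named like
t<backendID>_<relfilenode> (e.g. "t3_16384"), so the follow-up patch
could use a check along these lines (sketch only):

    /* Sketch: skip temp relation files, named t<backendID>_<relfilenode> */
    if (de->d_name[0] == 't' && isdigit((unsigned char) de->d_name[1]))
        continue;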

I'll plan to submit that for CF 2018-03.

Thanks!
-- 
-David
da...@pgmasters.net



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 17:35, Peter Geoghegan  wrote:
> On Mon, Jan 29, 2018 at 8:51 AM, Simon Riggs  wrote:
>> On 29 January 2018 at 16:44, Bruce Momjian  wrote:
>>
>>> I think the question is how does it handle cases it doesn't support?
>>> Does it give wrong answers?  Does it give a helpful error message?  Can
>>> you summarize that?
>>
>> I'm happy to report that it gives correct answers to every known MERGE
>> test, except
>>
>> * where it hits a concurrency issue and throws SQLCODE =
>> ERRCODE_T_R_SERIALIZATION_FAILURE and the standard text for that
>>
>> * where it hits an unsupported feature and throws SQLCODE =
>> ERRCODE_FEATURE_NOT_SUPPORTED, with appropriate text
>
> What specific features does it not work with already? A list would be helpful.

Yes, I added that to the docs as a result of your review comments.

I also mentioned them here last week in your review in answer to your
specific questions.


The current list of features that return ERRCODE_FEATURE_NOT_SUPPORTED is
* Tables with Row Security enabled
* Partitioning & Inheritance
* Foreign Tables

Rules are ignored, as they are with COPY.

If people have concerns or find problems following review, I will be
happy to update this list and/or fix issues, as normal.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Peter Geoghegan
On Mon, Jan 29, 2018 at 8:51 AM, Simon Riggs  wrote:
> On 29 January 2018 at 16:44, Bruce Momjian  wrote:
>
>> I think the question is how does it handle cases it doesn't support?
>> Does it give wrong answers?  Does it give a helpful error message?  Can
>> you summarize that?
>
> I'm happy to report that it gives correct answers to every known MERGE
> test, except
>
> * where it hits a concurrency issue and throws SQLCODE =
> ERRCODE_T_R_SERIALIZATION_FAILURE and the standard text for that
>
> * where it hits an unsupported feature and throws SQLCODE =
> ERRCODE_FEATURE_NOT_SUPPORTED, with appropriate text

What specific features does it not work with already? A list would be helpful.

-- 
Peter Geoghegan



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Peter Geoghegan
On Mon, Jan 29, 2018 at 8:44 AM, Bruce Momjian  wrote:
> On Mon, Jan 29, 2018 at 04:34:48PM +, Simon Riggs wrote:
>> The only discussion would be about the word "unfinished". I'm not
>> clear why this patch, which has current caveats all clearly indicated
>> in the docs, differs substantially from other projects that have
>> committed their work ahead of having everything everybody wants, such
>> as replication, materialized views, parallel query, partitioning,
>> logical decoding etc.. All of those features had caveats in the first
>> release in which they were included and many of them were committed
>> prior to the last CF. We are working now to remove those caveats. Why
>> is this different? It shouldn't be. If unfinished means it has caveats
>> that is different to unfinished meaning crappy, risky, contentious
>> etc..
>
> I think the question is how does it handle cases it doesn't support?
> Does it give wrong answers?  Does it give a helpful error message?  Can
> you summarize that?

+1

ON CONFLICT had support for logical decoding, updatable views, and RLS
in the first commit. ON CONFLICT was committed in a form that worked
seamlessly with any other feature in the system you can name. Making
ON CONFLICT play nice with all adjacent features was a great deal of
work, and I definitely needed Andres' help for the logical decoding
part, but we got it done. (Logical decoding for MERGE should be quite
a lot easier, though.)

I'm willing to talk about why MERGE is different to ON CONFLICT, and
why it may not need to tick all of the same boxes. DML statements are
supposed to be highly composable things, though. That's the premise
you should start from IMV.

-- 
Peter Geoghegan



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Pavel Stehule
2018-01-29 18:08 GMT+01:00 Simon Riggs :

> On 29 January 2018 at 16:23, Chapman Flack  wrote:
> > On 01/29/2018 11:13 AM, Simon Riggs wrote:
> >> On 29 January 2018 at 15:44, Bruce Momjian  wrote:
> >>> Uh, if we know we are going to get question on this, the patch had
> >>> better have an explanation of when to use it.  Pushing the problem to
> >>> later doesn't seem helpful.
> >>
> >> What problem are you referring to?
> >>
> >> INSERT ON CONFLICT UPDATE does ...
> >>
> >> MERGE allows you to ...
> > In my reading of Pavel and Bruce, the only 'problem' being suggested
> > is that the patch hasn't added a bit of documentation somewhere that
> > lays out the relationship between these two things, more or less as
> > you just did.
>
> I am happy to write docs as requested.
>
> There are currently no docs saying when INSERT ON CONFLICT UPDATE
> should be used other than the ref page for that command. There is no
> mention of it in the "Data Manipulation" section of the docs.
>
> I've included docs for MERGE so it is mentioned in concurrency and
> reference sections, so it follows the same model.
>
> Where would people like me to put these docs?
>

Depends on size - a small note can be placed in the MERGE docs and linked
from INSERT ON CONFLICT DO.
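
Such a note might contrast the two along these lines (sketch only,
made-up tables):

    -- INSERT ... ON CONFLICT: concurrency-safe upsert on a unique key
    INSERT INTO cities (city, country) VALUES ('Paris', 'FR')
    ON CONFLICT (city) DO UPDATE SET country = EXCLUDED.country;

    -- MERGE: general conditional INSERT/UPDATE/DELETE driven by a join
    MERGE INTO cities a USING new_cities b ON a.city = b.city
    WHEN MATCHED THEN UPDATE SET country = b.country
    WHEN NOT MATCHED THEN INSERT (city, country) VALUES (b.city, b.country);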

Regards

Pavel



>
> --
> Simon Riggshttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>


Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 16:23, Chapman Flack  wrote:
> On 01/29/2018 11:13 AM, Simon Riggs wrote:
>> On 29 January 2018 at 15:44, Bruce Momjian  wrote:
>>> Uh, if we know we are going to get question on this, the patch had
>>> better have an explanation of when to use it.  Pushing the problem to
>>> later doesn't seem helpful.
>>
>> What problem are you referring to?
>>
>> INSERT ON CONFLICT UPDATE does ...
>>
>> MERGE allows you to ...
> In my reading of Pavel and Bruce, the only 'problem' being suggested
> is that the patch hasn't added a bit of documentation somewhere that
> lays out the relationship between these two things, more or less as
> you just did.

I am happy to write docs as requested.

There are currently no docs saying when INSERT ON CONFLICT UPDATE
should be used other than the ref page for that command. There is no
mention of it in the "Data Manipulation" section of the docs.

I've included docs for MERGE so it is mentioned in concurrency and
reference sections, so it follows the same model.

Where would people like me to put these docs?

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Logical Decoding and HeapTupleSatisfiesVacuum assumptions

2018-01-29 Thread Tomas Vondra


On 01/29/2018 03:17 PM, Simon Riggs wrote:
> On 29 January 2018 at 14:13, Tomas Vondra  
> wrote:
> 
>> 4) inspect the new row (which we still have in reorderbuffer)
>>
>> 5) Kabooom! The row has column "c" which we don't see in the catalog.
> 
> We don't use caches? Why does a cache miss cause it to explode?
> 

We do use caches (and we invalidate them), of course.

But the problem is that by the time we get to look up the row, it may be
either removed by VACUUM (because the catalog cleanup is more
aggressive) or not reachable using an index (which is the HOT issue
pointed out by Robert earlier in this thread).


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 16:50, Tom Lane  wrote:
> Bruce Momjian  writes:
>> On Mon, Jan 29, 2018 at 04:34:48PM +, Simon Riggs wrote:
>>> ... If unfinished means it has caveats
>>> that is different to unfinished meaning crappy, risky, contentious
>>> etc..
>
>> I think the question is how does it handle cases it doesn't support?
>> Does it give wrong answers?  Does it give a helpful error message?  Can
>> you summarize that?
>
> What I was reacting to was the comments just upthread that it doesn't
> yet handle partitions or RLS.  Those things don't seem optional to me.
> Maybe they're small additions, but if so why aren't they done already?

Phasing and risk.

Partitioning doesn't look too bad, so that looks comfortable for PG11,
assuming it doesn't hit some unhandled complexity.

Including RLS in the first commit/release turns this into a high risk
patch. Few people use it, but if they do, they don't want me putting a
hole in their battleship (literally) should we discover some weird
unhandled logic in a complex new command.

My recommendation would be to support that later for those that use
it. For those that don't, it doesn't matter so can also be done later.

> Also, as far as phased development goes: Simon's drawing analogies
> to things like parallel query, which we all understood had to be
> done over multiple dev cycles because they were too big to finish
> in one cycle.  I don't think MERGE qualifies: there seems no good
> reason why it can't be done, full stop, in the first release where
> it appears.

That remains the plan, barring delays.

If you want to include RLS, then I would appreciate an early review.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Andres Freund
Hi,

On 2018-01-29 15:45:56 +0300, Konstantin Knizhnik wrote:
> On 26.01.2018 22:38, Andres Freund wrote:
> > > And without it perf is not able to unwind stack trace for generated
> > > code.
> > You can work around that by using --call-graph lbr with a sufficiently
> > new perf. That'll not know function names et al, but at least the parent
> > will be associated correctly.
> 
> With --call-graph lbr result is ... slightly different (see attached
> profile) but still there is "unknown" bar.

Right. All that allows is to attribute the cost below the parent in the
perf report --children case. For it to be attributed to proper symbols
you need my llvm patch to support perf.
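
For example (illustration; <backend pid> to be filled in):

    perf record --call-graph lbr -p <backend pid> -- sleep 30
    perf report --children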



> Actually I am trying to find an answer to the question of why your version of JIT
> provides ~2x speedup on Q1, while the ISPRAS version
> (https://www.pgcon.org/2017/schedule/attachments/467_PGCon%202017-05-26%2015-00%20ISPRAS%20Dynamic%20Compilation%20of%20SQL%20Queries%20in%20PostgreSQL%20Using%20LLVM%20JIT.pdf)
> speeds up Q1 by 5.5x.
> Maybe it is because they are using the double type to calculate aggregates
> while, as far as I understand, you are using standard Postgres aggregate
> functions?
> Or maybe because the ISPRAS version is not checking for NULL values...

All of those together, yes. Added to that, I'm aiming to work
incrementally towards core inclusion, rather than getting the best
results.  There's a *lot* that can be done to improve the generated code
- after e.g. hacking together an improvement to the argument passing (by
allocating isnull / nargs / arg[], argnull[] as a separate on-stack
allocation from FunctionCallInfoData), I get another 1.8x.  Eliminating
redundant float overflow checks gives another 1.2x. And so on.

Greetings,

Andres Freund



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 16:44, Bruce Momjian  wrote:

> I think the question is how does it handle cases it doesn't support?
> Does it give wrong answers?  Does it give a helpful error message?  Can
> you summarize that?

I'm happy to report that it gives correct answers to every known MERGE
test, except

* where it hits a concurrency issue and throws SQLCODE =
ERRCODE_T_R_SERIALIZATION_FAILURE and the standard text for that

* where it hits an unsupported feature and throws SQLCODE =
ERRCODE_FEATURE_NOT_SUPPORTED, with appropriate text

but of course Robert is correct and everything benefits from further review.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Tom Lane
Bruce Momjian  writes:
> On Mon, Jan 29, 2018 at 04:34:48PM +, Simon Riggs wrote:
>> ... If unfinished means it has caveats
>> that is different to unfinished meaning crappy, risky, contentious
>> etc..

> I think the question is how does it handle cases it doesn't support? 
> Does it give wrong answers?  Does it give a helpful error message?  Can
> you summarize that?

What I was reacting to was the comments just upthread that it doesn't
yet handle partitions or RLS.  Those things don't seem optional to me.
Maybe they're small additions, but if so why aren't they done already?

Also, as far as phased development goes: Simon's drawing analogies
to things like parallel query, which we all understood had to be
done over multiple dev cycles because they were too big to finish
in one cycle.  I don't think MERGE qualifies: there seems no good
reason why it can't be done, full stop, in the first release where
it appears.

regards, tom lane



Re: [HACKERS] Secondary index access optimizations

2018-01-29 Thread Konstantin Knizhnik



On 29.01.2018 16:24, Konstantin Knizhnik wrote:

On 29.01.2018 07:34, Thomas Munro wrote:

On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik
 wrote:

On 19.01.2018 16:14, Antonin Houska wrote:

you should test the operator B-tree strategy: BTLessStrategyNumber,
BTLessEqualStrategyNumber, etc. The operator string alone does not tell
enough about the operator semantics.

The strategy can be found in the pg_amop catalog.
get_op_btree_interpretation() function may be useful, but there may be
something better in utils/cache/lsyscache.c.


Thank you very much.
Shame on me that I didn't notice such a solution myself - such checking
of the B-tree strategy is done in the same predtest.c file!
Now checking of predicate clause compatibility is done in a much more
elegant and general way.
Attached please find the new version of the patch.

I wonder if you should create a new index and SysCache entry for
looking up range types by subtype.  I'll be interested to see what
others have to say about this range type-based technique -- it seems
clever to me, but I'm not familiar enough with this stuff to say if
there is some other approach you should be using instead.

I think that it is a good idea to add caching for the range type lookup.
If the community thinks it will be useful I can try to add such a
mechanism. But it seems to be not so trivial, especially handling
invalidations properly.




I have added caching of the element_type -> range_type mapping to typcache.c.
But it is not currently invalidated if the pg_range relation is changed.
Also, if there is more than one range type for the specified element
type, the first of them is used.
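
For illustration, the lookup the cache has to answer is essentially
(sketch only):

    SELECT rngtypid::regtype
    FROM pg_range
    WHERE rngsubtype = 'int4'::regtype;

which can return more than one row, hence taking the first.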


--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bce3348..6a7e7fb 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 1df1e3a..c421530 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 44f6b03..c7bf118 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -345,6 +345,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+remove_restrictions_implied_by_constraints(root, rel, rte);
 if (rte->relkind == RELKIND_FOREIGN_TABLE)
 {
 	/* Foreign table */
@@ -1130,6 +1131,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			set_dummy_rel_pathlist(childrel);
 			continue;
 		}
+		remove_restrictions_implied_by_constraints(root, childrel, childRTE);
 
 		/* CE failed, so finish copying/modifying join quals. */
 		childrel->joininfo = 

Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Bruce Momjian
On Mon, Jan 29, 2018 at 04:34:48PM +, Simon Riggs wrote:
> I agree with all of the above.
> 
> In terms of timing of commits, I have marked the patch Ready For
> Committer. To me that signifies that it is ready for review by a
> Committer prior to commit.
> 
> In case of doubt, I would not even suggest committing this if it had
> any concurrency issues. That would be clearly unacceptable.
> 
> The only discussion would be about the word "unfinished". I'm not
> clear why this patch, which has current caveats all clearly indicated
> in the docs, differs substantially from other projects that have
> committed their work ahead of having everything everybody wants, such
> as replication, materialized views, parallel query, partitioning,
> logical decoding etc.. All of those features had caveats in the first
> release in which they were included and many of them were committed
> prior to the last CF. We are working now to remove those caveats. Why
> is this different? It shouldn't be. If unfinished means it has caveats
> that is different to unfinished meaning crappy, risky, contentious
> etc..

I think the question is how does it handle cases it doesn't support? 
Does it give wrong answers?  Does it give a helpful error message?  Can
you summarize that?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 16:06, Tom Lane  wrote:
> Simon Riggs  writes:
>> On 29 January 2018 at 15:07, Robert Haas  wrote:
>>> An argument could be made that this patch is already too late for PG
>>> 11, because it's a major feature that was not submitted in relatively
>>> complete form before the beginning of the penultimate CommitFest.
>
>> Overall, I'm following the style of development process you have
>> yourself used a number of times now. Waiting for mega-patches to be
>> complete is not as useful as phased development.
>
> An important part of that style is starting at an appropriate time in the
> release cycle.  As things stand, you are proposing to commit an unfinished
> feature to v11, and then we have to see if the missing parts show up on
> time (ie before 1 March) and with adequate quality.  Otherwise we'll be
> having a debate on whether to revert the feature or not ... and if it
> comes to that, my vote will be for reverting.
>
> I'd be much happier about committing this with some essential parts
> missing if it were done at the start of a devel cycle rather than near
> the end.

I agree with all of the above.

In terms of timing of commits, I have marked the patch Ready For
Committer. To me that signifies that it is ready for review by a
Committer prior to commit.

In case of doubt, I would not even suggest committing this if it had
any concurrency issues. That would be clearly unacceptable.

The only discussion would be about the word "unfinished". I'm not
clear why this patch, which has current caveats all clearly indicated
in the docs, differs substantially from other projects that have
committed their work ahead of having everything everybody wants, such
as replication, materialized views, parallel query, partitioning,
logical decoding etc.. All of those features had caveats in the first
release in which they were included and many of them were committed
prior to the last CF. We are working now to remove those caveats. Why
is this different? It shouldn't be. If unfinished means it has caveats
that is different to unfinished meaning crappy, risky, contentious
etc..


Anyway, reviews welcome, but few people know anything about
targetlists and column handling.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Chapman Flack
On 01/29/2018 11:13 AM, Simon Riggs wrote:
> On 29 January 2018 at 15:44, Bruce Momjian  wrote:
>> Uh, if we know we are going to get question on this, the patch had
>> better have an explanation of when to use it.  Pushing the problem to
>> later doesn't seem helpful.
> 
> What problem are you referring to?
> 
> INSERT ON CONFLICT UPDATE does ...
> 
> MERGE allows you to ...
In my reading of Pavel and Bruce, the only 'problem' being suggested
is that the patch hasn't added a bit of documentation somewhere that
lays out the relationship between these two things, more or less as
you just did.

-Chap



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 15:44, Bruce Momjian  wrote:
> On Mon, Jan 29, 2018 at 03:12:23PM +, Simon Riggs wrote:
>> On 29 January 2018 at 14:55, Pavel Stehule  wrote:
>>
>> > My note was not against MERGE or INSERT ON CONFLICT. If I understand
>> > this topic correctly, I agree that these commands should be implemented
>> > separately. But if we have two commands with some intersection, it would
>> > be nice to have documentation about recommended use cases. It will
>> > probably be a very frequent question.
>>
>> That is more qualitative assessment of each, which I think I will defer on.
>>
>> This patch is about implementing the SQL Standard compliant MERGE
>> command which is widely used in other databases and by various tools.
>
> Uh, if we know we are going to get question on this, the patch had
> better have an explanation of when to use it.  Pushing the problem to
> later doesn't seem helpful.

What problem are you referring to? MERGE is not being implemented as
some kind of rival to existing functionality; it does things we cannot
yet do.

The info below is for interest only; it is unrelated to this patch:

INSERT ON CONFLICT UPDATE does only INSERT and UPDATE and has various
restrictions. It violates MVCC when needed, to allow it to succeed
more frequently in updating a concurrently inserted row. It is not SQL
Standard.

MERGE allows you to make INSERTs, UPDATEs and DELETEs against a single
target table using complex conditionals. It follows the SQL Standard;
many developers from other databases, much existing code and many
tools know it.
e.g.

MERGE INTO target t
USING source s
ON t.tid = s.sid
WHEN MATCHED AND balance > delta THEN
  UPDATE SET balance = balance - delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED THEN
  INSERT (balance, tid) VALUES (balance + delta, sid);
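
For contrast, the simple upsert part of the above would be written with
the existing non-standard syntax roughly like this (a sketch only,
assuming a unique constraint on tid):

INSERT INTO target (tid, balance) VALUES (1, 100)
ON CONFLICT (tid) DO UPDATE
  SET balance = target.balance + EXCLUDED.balance;

Note that there is no way to express the DELETE arm in that form.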

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Built-in connection pooling

2018-01-29 Thread Bruce Momjian
On Mon, Jan 29, 2018 at 04:02:22PM +, Vladimir Sitnikov wrote:
> Bruce>Yes, it would impact applications and you are right most applications
> could not handle that cleanly.
> 
> I would disagree here.
> We are discussing applications that produce "lots of idle" connections, aren't
> we? That typically comes from an application-level connection pool.
> Most of the connection pools have a setting that would "validate" connection 
> in
> case it was not used for a certain period of time.
> 
> That plays nicely in case server drops "idle, not in a transaction" 
> connection.

Well, we could have the connection pooler disconnect those, right?

> Of course, there are cases when application just grabs a connection from a 
> pool
> and uses it in a non-transacted way (e.g. does some action once an hour and
> commits immediately). However that kind of application would already face
> firewalls, etc. I mean the application should already be prepared to handle
> "network issues".
> 
> Bruce> It is probably better to look into
> Bruce>freeing resources for idle connections instead and keep the socket open.
> 
> The application might expect the session-specific data to be present, so
> it might be even worse if the database deallocates everything but the TCP
> connection.
> 
> For instance: the application might expect the server-prepared statements
> to be there. Would you deallocate server-prepared statements for those
> "idle" connections? The app would just break. There's no way (currently)
> for the application to know that the statement expired unexpectedly.

I don't know what we would deallocate yet.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Tom Lane
Simon Riggs  writes:
> On 29 January 2018 at 15:07, Robert Haas  wrote:
>> An argument could be made that this patch is already too late for PG
>> 11, because it's a major feature that was not submitted in relatively
>> complete form before the beginning of the penultimate CommitFest.

> Overall, I'm following the style of development process you have
> yourself used a number of times now. Waiting for mega-patches to be
> complete is not as useful as phased development.

An important part of that style is starting at an appropriate time in the
release cycle.  As things stand, you are proposing to commit an unfinished
feature to v11, and then we have to see if the missing parts show up on
time (ie before 1 March) and with adequate quality.  Otherwise we'll be
having a debate on whether to revert the feature or not ... and if it
comes to that, my vote will be for reverting.

I'd be much happier about committing this with some essential parts
missing if it were done at the start of a devel cycle rather than near
the end.

regards, tom lane



Re: [HACKERS] Custom compression methods

2018-01-29 Thread Ildus Kurbangaliev
On Mon, 29 Jan 2018 17:29:29 +0300
Ildar Musin  wrote:

> 
> Patch applies cleanly, builds without any warnings, documentation
> builds ok, all tests pass.
> 
> A remark for the committers. The patch is quite big, so I really wish 
> more reviewers looked into it for a more comprehensive review. Also a 
> native English speaker should check the documentation and comments. 
> Another thing is that tests don't cover the cmdrop method because the 
> built-in pglz compression doesn't use it (I know there is a jsonbd 
> extension [1] based on this patch and which should benefit from
> cmdrop method, but it doesn't test it either yet).
> 
> I think I did what I could and so passing this patch to committers
> for the review. Changed status to "Ready for committer".
> 
> 
> [1] https://github.com/postgrespro/jsonbd
> 

Thank you!

About cmdrop, I checked manually that it is called, but I am going to
check it thoroughly in my extension.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: Built-in connection pooling

2018-01-29 Thread Vladimir Sitnikov
Bruce>Yes, it would impact applications and you are right most applications
could not handle that cleanly.

I would disagree here.
We are discussing applications that produce "lots of idle" connections,
aren't we? That typically comes from an application-level connection pool.
Most of the connection pools have a setting that would "validate"
connection in case it was not used for a certain period of time.

That plays nicely in case server drops "idle, not in a transaction"
connection.

Of course, there are cases when application just grabs a connection from a
pool and uses it in a non-transacted way (e.g. does some action once an
hour and commits immediately). However that kind of application would
already face firewalls, etc. I mean the application should already be
prepared to handle "network issues".

Bruce> It is probably better to look into
Bruce>freeing resources for idle connections instead and keep the socket
open.

The application might expect the session-specific data to be present,
so it might be even worse if the database deallocates everything but the
TCP connection.

For instance: the application might expect the server-prepared statements
to be there. Would you deallocate server-prepared statements for those
"idle" connections? The app would just break. There's no way (currently)
for the application to know that the statement expired unexpectedly.
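
For example (a sketch; the table name is illustrative), a client may
rely on session state created much earlier:

PREPARE get_user (int) AS SELECT * FROM users WHERE id = $1;
-- ... a long idle period may pass here ...
EXECUTE get_user(42);  -- breaks if the server silently dropped the statement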

Vladimir


Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Bruce Momjian
On Mon, Jan 29, 2018 at 03:12:23PM +, Simon Riggs wrote:
> On 29 January 2018 at 14:55, Pavel Stehule  wrote:
> 
> > My note was not against MERGE or INSERT ON CONFLICT. If I understand this
> > topic correctly, I agree that these commands should be implemented
> > separately. But if we have two commands with some intersection, it would be
> > nice to have documentation about recommended use cases. It will probably be
> > a very frequent question.
> 
> That is more qualitative assessment of each, which I think I will defer on.
> 
> This patch is about implementing the SQL Standard compliant MERGE
> command which is widely used in other databases and by various tools.

Uh, if we know we are going to get question on this, the patch had
better have an explanation of when to use it.  Pushing the problem to
later doesn't seem helpful.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: A Generic Question about Generic type subscripting

2018-01-29 Thread Tom Lane
Hannu Krosing  writes:
> I started looking at the thread about "Generic type subscripting" and am
> wondering, why does it take the approach of modifying pg_type and
> modifying lots of internal functions, when instead it could be defined
> in a much lighter and less intrusive way as an operator, probably by
> reserving a dedicated operator name

It's pretty hard to see how that would extend to allowing extensions to
support either array slices ("arr[lo:hi]") or multi-dimensional arrays.
Or at least, by the time you get done with allowing those cases, plus
assignments to them, it's not so lightweight anymore.
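
That is, the cases that go beyond a simple one-subscript fetch (arr and
tbl are illustrative names):

SELECT arr[2:5] FROM tbl;                 -- slice
SELECT arr[1][3] FROM tbl;                -- multi-dimensional
UPDATE tbl SET arr[2:5] = '{1,2,3,4}';    -- assignment to a slice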

You could make the argument that it's okay to blow all those options off
and say that extension types only get to define the simplest form of
one-subscript subscripting.  But this patch has higher ambition than
that, and I think that's good.

regards, tom lane



Re: Built-in connection pooling

2018-01-29 Thread Bruce Momjian
On Mon, Jan 29, 2018 at 11:57:36AM +0300, Konstantin Knizhnik wrote:
> Right now, if you hit max_connections, we start rejecting new
> connections.  Would it make sense to allow an option to exit idle
> connections when this happens so new users can connect?
> 
> It will require changes in client applications, won't it? They should be
> ready for the connection to be dropped by the server at any moment.
> I do not know whether it is possible to drop an idle connection and hide
> this fact from the client. In my implementation each session keeps the
> minimal information required for interaction with the client (session context).  It
> includes socket, struct Port and session memory context which should be used
> instead of TopMemoryContext for session specific data.

Yes, it would impact applications and you are right most applications
could not handle that cleanly.  It is probably better to look into
freeing resources for idle connections instead and keep the socket open.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 15:07, Robert Haas  wrote:

> Moreover, the patch should have had meaningful review from people not
> involved in writing it, and that is a process that generally takes a
> few months or at least several weeks, not a few days.

The code is about 1200 lines and has extensive docs, comments and tests.

There are no contentious infrastructure changes, so the debate around
concurrency is probably the main one. So it looks to me like
meaningful review has taken place, though I know Andrew and Pavan have
also looked at it in detail.

But having said that, I'm not rushing to commit and further detailed
review is welcome, hence the CF status.

> An argument could be made that this patch is already too late for PG
> 11, because it's a major feature that was not submitted in relatively
> complete form before the beginning of the penultimate CommitFest.  I'm
> not going to make that argument, because I believe this patch is
> probably sufficiently low-risk that it can be committed between now
> and feature freeze without great risk of destabilizing the release.
> But committing it without some in-depth review is not the way to get
> there.

The patch was substantially complete at that time (was v9d). Later
work has changed isolated areas.

I agree that this is low-risk. If I suggest committing it sooner
rather than later it is because that is more likely to throw up bugs
that will increase the eventual quality.

Overall, I'm following the style of development process you have
yourself used a number of times now. Waiting for mega-patches to be
complete is not as useful as phased development.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 14:55, Pavel Stehule  wrote:

> My note was not against MERGE or INSERT ON CONFLICT. If I understand this
> topic correctly, I agree that these commands should be implemented
> separately. But if we have two commands with some intersection, it would be
> nice to have documentation about recommended use cases. It will probably be
> a very frequent question.

That is more qualitative assessment of each, which I think I will defer on.

This patch is about implementing the SQL Standard compliant MERGE
command which is widely used in other databases and by various tools.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Robert Haas
On Tue, Jan 23, 2018 at 8:51 AM, Simon Riggs  wrote:
> This is complete and pretty clean now. 1200 lines of code, plus docs and 
> tests.
>
> I'm expecting to commit this and then come back for the Partitioning &
> RLS later, but will wait a few days for comments and other reviews.

I agree with Peter: that's unacceptable.  You're proposing to commit a
patch that is not only has had only a very limited amount of review
yet but by your own admission is not even complete.  Partitioning and
RLS shouldn't be afterthoughts; they should be in the original patch.
Moreover, the patch should have had meaningful review from people not
involved in writing it, and that is a process that generally takes a
few months or at least several weeks, not a few days.

An argument could be made that this patch is already too late for PG
11, because it's a major feature that was not submitted in relatively
complete form before the beginning of the penultimate CommitFest.  I'm
not going to make that argument, because I believe this patch is
probably sufficiently low-risk that it can be committed between now
and feature freeze without great risk of destabilizing the release.
But committing it without some in-depth review is not the way to get
there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: \describe*

2018-01-29 Thread David Fetter
On Mon, Jan 29, 2018 at 02:51:53PM +, Ryan Murphy wrote:
> > > What I propose is in fact a server command, which at least three of
> > > the other popular RDBMSs already have.
>
> Well to actually implement it, it would probably be a client command,
> because that's what \d* are.

Why should this command be silo'ed off to the psql client?  If it's a
server command, it's available to all clients, not just psql.

> We would most likely want them implemented the same, to avoid
> needless complexity.

We could certainly have \d call DESCRIBE for later versions of the
server.  \ commands which call different SQL depending on server
version have long been a standard practice.

> I think people are more ok with \describe (with the backslash), which seems
> like what you're suggesting anyway.  I read Vik's "hard pass" as being on
> having DESCRIBE which looks like an SQL command but would actually be
> implemented on the client.  This seems simpler at first but could cause
> deep confusion later.

If we implement \d as DESCRIBE for server versions as of when DESCRIBE
is actually implemented, we've got wins all around.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Pavel Stehule
2018-01-29 15:40 GMT+01:00 Simon Riggs :

> On 29 January 2018 at 14:19, Pavel Stehule 
> wrote:
>
> >> The concurrency rules are very simple:
> >> If a MATCHED row is concurrently updated/deleted
> >> 1. We run EvalPlanQual
> >> 2. If the updated row is gone EPQ returns NULL slot or EPQ returns a
> >> row with NULL values, then
> >> {
> >>if NOT MATCHED action exists, then raise ERROR
> >>else continue to next row
> >> }
> >> else
> >>   re-check all MATCHED AND conditions and execute the first action
> >> whose WHEN Condition evaluates to TRUE
> >>
> >>
> >> This means MERGE will work just fine for "normal" UPDATEs, but it will
> >> often fail (deterministically) in concurrent tests with mixed
> >> insert/deletes or UPDATEs that touch the PK, as requested.
> >
> >
> > It would be nice to have a part about the differences between MERGE and
> > INSERT ON CONFLICT DO.
>
> We've agreed not to attempt to make it do anything like INSERT ON
> CONFLICT, so we don't need to discuss that here anymore.
>

My note was not against MERGE or INSERT ON CONFLICT. If I understand this
topic correctly, I agree that these commands should be implemented
separately. But if we have two commands with some intersection, it would be
nice to have documentation about recommended use cases. It will probably be
a very frequent question.

Regards

Pavel


>
> MERGE can be semantically equivalent to an UPDATE join or a DELETE
> join, and in those cases, MERGE behaves the same. It handles much more
> complex cases also.
>
> MERGE as submitted here follows all MVCC rules similar to an UPDATE
> join. If it hits a problem with concurent activity it throws
>ERROR:  could not serialize access due to concurrent update
> to make sure there is no ambiguity (as described directly above).
>
> As we discussed earlier, removing some of those ERRORs and making it
> do something useful instead may be possible later.
>
> --
> Simon Riggshttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: Wait for parallel workers to attach

2018-01-29 Thread Robert Haas
On Sat, Jan 27, 2018 at 3:14 AM, Amit Kapila  wrote:
> During the recent development of parallel operation (parallel create
> index)[1], a need has arisen for $SUBJECT.  The idea is to allow
> leader backend to rely on number of workers that are successfully
> started.  This API allows leader to wait for all the workers to start
> or fail even if one of the workers fails to attach.  We consider
> workers started/attached once they are attached to error queue.  This
> will ensure that any error after the workers are attached won't be
> silently ignored by leader.

known_started_workers looks a lot like any_message_received.  Perhaps
any_message_received should be renamed to known_started_workers and
reused here.  After all, if we know that a worker was started, there's
no need for WaitForParallelWorkersToFinish to again call
GetBackgroundWorkerPid() for it.

I think that you shouldn't need the 10ms delay loop; waiting forever
should work.  If a worker fails to start, the postmaster should send
SIGUSR1 which should set our latch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Cancelling parallel query leads to segfault

2018-01-29 Thread Robert Haas
On Sat, Jan 27, 2018 at 10:45 PM, Andres Freund  wrote:
> Hi Peter (and others who mucked around with related code),
>
> While testing another patch I found that cancelling a parallel query on
> master segfaults the leader in an interesting manner:
...
> It's clearly not OK to recurse back into the executing query after a
> failure, even less so if resources like the underlying DSM segment have
> already been freed.
>
> I suspect this is largely the fault of
>
> commit 8561e4840c81f7e345be2df170839846814fa004
> Author: Peter Eisentraut 
> Date:   2018-01-22 08:30:16 -0500
>
> Transaction control in PL procedures

Yeah, I think this must be the fault of that commit, and I think your
diagnosis is accurate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: Aggregation push-down

2018-01-29 Thread Robert Haas
On Mon, Jan 29, 2018 at 3:32 AM, Antonin Houska  wrote:
> I think of a variant of this: implement a universal function that tests the
> binary values for equality (besides the actual arguments, caller would have to
> pass info like typlen, typalign, typbyval for each argument, and cache these
> for repeated calls) and set pg_proc(oprcode) to 0 wherever this function is
> sufficient. Thus the problematic cases like numeric, citext, etc. would be
> detected by their non-zero oprcode.

I don't think that's a good option, because we don't want int4eq to
have to go through a code path that has branches to support varlenas
and cstrings.  Andres is busy trying to speed up expression evaluation
by removing unnecessary code branches; adding new ones would be
undoing that valuable work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 14:19, Pavel Stehule  wrote:

>> The concurrency rules are very simple:
>> If a MATCHED row is concurrently updated/deleted
>> 1. We run EvalPlanQual
>> 2. If the updated row is gone EPQ returns NULL slot or EPQ returns a
>> row with NULL values, then
>> {
>>if NOT MATCHED action exists, then raise ERROR
>>else continue to next row
>> }
>> else
>>   re-check all MATCHED AND conditions and execute the first action
>> whose WHEN Condition evaluates to TRUE
>>
>>
>> This means MERGE will work just fine for "normal" UPDATEs, but it will
>> often fail (deterministically) in concurrent tests with mixed
>> insert/deletes or UPDATEs that touch the PK, as requested.
>
>
> It would be nice to have a part about the differences between MERGE and
> INSERT ON CONFLICT DO.

We've agreed not to attempt to make it do anything like INSERT ON
CONFLICT, so we don't need to discuss that here anymore.

MERGE can be semantically equivalent to an UPDATE join or a DELETE
join, and in those cases, MERGE behaves the same. It handles much more
complex cases also.
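
For instance (a sketch with illustrative names), this MERGE:

MERGE INTO target t USING source s ON t.tid = s.sid
WHEN MATCHED THEN UPDATE SET balance = s.balance;

behaves like the equivalent UPDATE join (leaving aside the handling of
duplicate matches):

UPDATE target t SET balance = s.balance FROM source s WHERE t.tid = s.sid;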

MERGE as submitted here follows all MVCC rules similar to an UPDATE
join. If it hits a problem with concurrent activity it throws
   ERROR:  could not serialize access due to concurrent update
to make sure there is no ambiguity (as described directly above).

As we discussed earlier, removing some of those ERRORs and making it
do something useful instead may be possible later.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently

2018-01-29 Thread Claudio Freire
On Mon, Jan 29, 2018 at 4:12 AM, Masahiko Sawada  wrote:
> On Sat, Jul 29, 2017 at 9:42 AM, Claudio Freire  
> wrote:
>> Introduce a tree pruning threshold to FreeSpaceMapVacuum that avoids
>> recursing into branches that already contain enough free space, to
>> avoid having to traverse the whole FSM and thus induce quadratic
>> costs. Intermediate FSM vacuums are only supposed to make enough
>> free space visible to avoid extension until the final (non-partial)
>> FSM vacuum.
>
> Hmm, I think this resolves a part of the issue. How about calling
> AutoVacuumRequestWork() in PG_CATCH() if VACOPT_VACUUM is specified
> and give the relid that we were vacuuming but could not complete as a
> new autovacuum work-item? The new autovacuum work-item makes the
> worker vacuum FSMs of the given relation and its indices.

Well, I tried that in fact, as I mentioned in the OP.

I abandoned it due to the conjunction of the 2 main blockers I found
and mentioned there. In essence, those issues defeat the purpose of
the patch (to get the free space visible ASAP).

Don't forget, this is aimed at cases where autovacuum of a single
relation takes a very long time. That is, very big relations. Maybe
days, like in my case. A whole autovacuum cycle can take weeks, so
delaying FSM vacuum that much is not good, and using work items still
cause those delays, not to mention the segfaults.

> That way, we
> can ensure that FSM gets vacuumed by the cancelled autovacuum process
> or other autovacuum processes. Since a work-item can be handled by
> other autovacuum process I think 256 work-item limit would not be a
> problem.

Why do you think it wouldn't? In particular if you take into account
the above. If you have more than 256 relations in the cluster, it
could very well happen that you've queued the maximum amount and no
autovac worker has had a chance to take a look at them, because
they're all stuck vacuuming huge relations.

Not to mention the need to de-duplicate work items. We wouldn't want
to request repeated FSM vacuums, or worse, queue an FSM vacuum of a
single table 256 times and fill up the queue with redundant items.
With the current structure, de-duplication is O(N), so if we wanted to
raise the limit of 256 work items, we'd need a structure that would
let us de-duplicate in less than O(N). In essence, it's a ton of work
for very little gain. Hence why I abandoned it.

> Or it might be better to make autovacuum launcher launch
> worker process if there is pending work-item in a database even if
> there is no table needs to be vacuumed/analyzed.

Quite a must, in fact.

>> For one, it would sometimes crash when adding the work item from
>> inside autovacuum itself. I didn't find the cause of the crash, but I suspect
>> AutoVacuumRequestWork was being called in a context where it
>> was not safe.
>
> Perhaps the cause of this might be that autovacuum work-item is
> implemented using DSA, which has been changed before by commit
> 31ae1638ce35c23979f9bcbb92c6bb51744dbccb.

I thought about that. Perhaps. But the work-item approach didn't fail
just because of the segfaults. It's also that it doesn't address the
free space visibility issue quickly enough.



Re: [HACKERS] Custom compression methods

2018-01-29 Thread Ildar Musin

Hello Ildus,

On 29.01.2018 14:44, Ildus Kurbangaliev wrote:


Thanks! Attached new version of the patch.



Patch applies cleanly, builds without any warnings, documentation builds 
ok, all tests pass.


A remark for the committers. The patch is quite big, so I really wish 
more reviewers looked into it for a more comprehensive review. Also a 
native English speaker should check the documentation and comments. 
Another thing is that tests don't cover the cmdrop method because the 
built-in pglz compression doesn't use it (I know there is a jsonbd 
extension [1] based on this patch and which should benefit from cmdrop 
method, but it doesn't test it either yet).


I think I did what I could and so passing this patch to committers for 
the review. Changed status to "Ready for committer".



[1] https://github.com/postgrespro/jsonbd

--
Ildar Musin
i.mu...@postgrespro.ru



Re: [HACKERS] MERGE SQL Statement for PG11

2018-01-29 Thread Pavel Stehule
Hi

2018-01-29 15:11 GMT+01:00 Simon Riggs :

> On 26 January 2018 at 11:25, Simon Riggs  wrote:
> > On 24 January 2018 at 04:12, Simon Riggs  wrote:
> >> On 24 January 2018 at 01:35, Peter Geoghegan  wrote:
> >>>
> >> Please rebase, and post a new version.
> >>
> >> Will do, though I'm sure that's only minor since we rebased only a few
> days ago.
> >
> > New v12 with various minor corrections and rebased.
> >
> > Main new aspect here is greatly expanded isolation tests. Please read
> > and suggest new tests.
> >
> > We've used those to uncover a few unhandled cases in the concurrency
> > of very complex MERGE statements, so we will repost again on Mon/Tues
> > with a new version covering all the new tests and any comments made
> > here. Nothing to worry about, just some changed logic.
> >
> > I will post again later today with written details of the concurrency
> > rules we're working to now. I've left most of the isolation test
> > expected output as "TO BE DECIDED", so that we can agree our way
> > forwards.
>
> New patch attached that correctly handles all known concurrency cases,
> with expected test output.
>
> The concurrency rules are very simple:
> If a MATCHED row is concurrently updated/deleted
> 1. We run EvalPlanQual
> 2. If the updated row is gone EPQ returns NULL slot or EPQ returns a
> row with NULL values, then
> {
>if NOT MATCHED action exists, then raise ERROR
>else continue to next row
> }
> else
>   re-check all MATCHED AND conditions and execute the first action
> whose WHEN Condition evaluates to TRUE
>
>
> This means MERGE will work just fine for "normal" UPDATEs, but it will
> often fail (deterministically) in concurrent tests with mixed
> insert/deletes or UPDATEs that touch the PK, as requested.
>

It would be nice to have a part about the differences between MERGE and
INSERT ON CONFLICT DO.

Regards

Pavel


> --
> Simon Riggshttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] WIP: Aggregation push-down

2018-01-29 Thread Chapman Flack
On 01/29/18 03:32, Antonin Houska wrote:
> Robert Haas  wrote:

>>> only take place if we had a special equality operator which distinguishes 
>>> the
>>> *binary* values (I don't know yet how to store this operator the catalog ---
...
>> We don't have an operator that tests for binary equality, but it's
>> certainly testable from C; see datumIsEqual.

Disclaimer: I haven't been following the whole thread closely. But could
there be some way to exploit the composite-type *= operator?

https://www.postgresql.org/docs/current/static/functions-comparisons.html#COMPOSITE-TYPE-COMPARISON
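
For what it's worth, those operators do expose exactly the distinction
being discussed, e.g. with numeric (one of the problematic types named
above):

SELECT ROW(1.0::numeric) =  ROW(1.00::numeric);  -- true: equal values
SELECT ROW(1.0::numeric) *= ROW(1.00::numeric);  -- false: binary images differ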

-Chap



Re: Logical Decoding and HeapTupleSatisfiesVacuum assumptions

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 13:34, Tomas Vondra  wrote:

> The important detail is that we only really care
> about aborts in transactions that modified catalogs in some way (e.g. by
> doing DDL). But we can safely decode (and stream) changes up to the
> point when the catalogs get modified, so we can do two different things
> at that point:

Safely? Hmm.

I think what we are missing here is a test case, or a detailed
description of what we think happens. I can't see why a concurrent
abort would be unsafe, as in, it would make decoding explode.

If all it does is give the wrong answer when we process DDL, then all
we have to do is check for abort before and after we process any DDL
during decoding. Handling DDL is rare, so that test won't cost much in
terms of handling DDL. So ISTM that we can continue decoding after we
hit DDL, we just need some care.

(My earlier comments were based on the idea that 2PC patch would
simply block aborts, which of course will not work for streaming,
hence the difference)

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] [PATCH] Generic type subscripting

2018-01-29 Thread Arthur Zakirov
Hi,

On Sun, Jan 28, 2018 at 06:26:56PM +0100, Dmitry Dolgov wrote:
> 
> Here is a new version of the patch:
> 
> * rebased to the latest master
> 
> * fixed issues I mentioned above
> 
> * updated an example from the tutorial part

I have a few comments.

0002-Base-implementation-of-subscripting-mechanism-v6.patch:

> - if (op->d.arrayref_subscript.isupper)
> - indexes = arefstate->upperindex;
> + if (op->d.sbsref_subscript.isupper)
> + indexes = sbsrefstate->upper;

I think upperindex is better here. There was no need to rename it. Same
for lowerindex/lower.

There are a couple of changes which are unrelated to the patch. For example:

> -  * subscripting.  Adjacent A_Indices nodes have to be treated as a 
> single
> +  * subscripting. Adjacent A_Indices nodes have to be treated as a single

It is better to avoid them for the sake of decreasing the size of the patch.

> -  * typmod to be applied to the base type.  Subscripting a domain is an
> +  * typmod to be applied to the base type. Subscripting a domain is an

Same here.

> +/* Non-inline data for container operations */
> +typedef struct SubscriptingRefState
> +{
> + boolisassignment;   /* is it assignment, or just fetch? */
> ...
> +} SubscriptingRefState;

It is not good to move up SubscriptingRefState, because it is hard to
see changes between SubscriptingRefState and ArrayRefState.

> + FmgrInfo   *eval_finfo; /* function to evaluate 
> subscript */
> + FmgrInfo   *nested_finfo;   /* function to handle 
> nested assignment */

I think eval_finfo and nested_finfo are not needed anymore.

> +typedef Datum (*SubscriptingFetch) (Datum source, struct 
> SubscriptingRefState *sbsefstate);
> +
> +typedef Datum (*SubscriptingAssign) (Datum source, struct 
> SubscriptingRefState *sbsefstate);

Typo here? Did you mean sbsrefstate in the second argument?

> +typedef struct SbsRoutines
> +{
> + SubscriptingPrepare prepare;
> + SubscriptingValidatevalidate;
> + SubscriptingFetch   fetch;
> + SubscriptingAssign  assign;
> +
> +} SbsRoutines;

SbsRoutines is not a good name for me. SubscriptRoutines or
SubscriptingRoutines sounds better and is consistent with other
structures.

0005-Subscripting-documentation-v6.patch:

> +class="parameter">type_modifier_output_function,
> +   analyze_function,
> +class="parameter">subscripting_handler_function,
>are optional.  Generally these functions have to be coded in C

Extra comma here.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: Logical Decoding and HeapTupleSatisfiesVacuum assumptions

2018-01-29 Thread Tomas Vondra
Hi,

On 01/29/2018 11:23 AM, Simon Riggs wrote:
> On 29 January 2018 at 07:15, Nikhil Sontakke  wrote:
> 
>>> Having this as responsibility of plugin sounds interesting. It
>>> certainly narrows the scope for which we need to solve the abort
>>> issue. For 2PC that may be okay as we need to somehow interact
>>> with transaction manager as Simon noted. I am not sure if this
>>> helps streaming use-case though as there is not going to be any
>>> external transaction management involved there.
> 
> I think we should recognize that the two cases are different.
> 
> 2PC decoding patch is looking at what happens AFTER a PREPARE has
> occurred on a transaction.
> 
> Streaming looks to have streaming occur right in the middle of a
> transaction, so it has other concerns and likely other solutions as a
> result.
> 

I don't quite see how these cases are so different, or why would they
require different handling. Can you explain?

We need to deal with decoding a transaction that aborts while we're
decoding it - we must not continue decoding it (in the sense of passing
the changes to the plugin). I don't see any major difference between
ROLLBACK and ROLLBACK PREPARED.

So I think the pre-abort hook solution should work for the streaming
case too. At least - I don't see why it wouldn't.

While discussing this with Peter Eisentraut off-list, I think we came up
with an alternative idea for the streaming case, that does not require
the pre-abort hook. The important detail is that we only really care
about aborts in transactions that modified catalogs in some way (e.g. by
doing DDL). But we can safely decode (and stream) changes up to the
point when the catalogs get modified, so we can do two different things
at that point:

(1) Stop streaming changes from that transaction, and instead start
spilling it to disk (and then continue with the replay only when it
actually commits).

(2) Note the transaction ID somewhere and restart the decoding (that is,
notify the downstream to throw away the data and go back in WAL to read
all the data from scratch) but spilling that one transaction to disk
instead of streaming it.

Neither of these solutions is currently implemented and would require
changes to ReorderBuffer (which currently does not support mixing of
spill-to-disk and streaming) and possibly the logical decoding
infrastructure (e.g. to stash the XIDs somewhere and allow restarting
from previous LSN).

The good thing is it does not need to know about aborted transactions
and so does not require communication with transaction manager using
pre-abort hooks etc. I think this would be a massive advantage.

The main question (at least for me) is if it's actually cheaper,
compared to the pre-abort hook. In my experience, aborts tend to be
fairly rare in practice - maybe 1:1000 to commits. On the other hand,
temporary tables are fairly common thing, and they count as catalog
changes, of course.

Maybe this is not that bad though, as it only really matters for large
transactions, which is when we start to spill to disk or stream. And
that should not be very often - if it happens very often, you probably
need a higher limit (similarly to work_mem).

The cases where this would matter are large ETL jobs, upgrade scripts
and so on - these tend to be large and mix DDL (temporary tables, ALTER
TABLE, ...). That's unfortunate, as it's one of the cases the streaming
was supposed to help.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Secondary index access optimizations

2018-01-29 Thread Konstantin Knizhnik

On 29.01.2018 07:34, Thomas Munro wrote:

On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik
 wrote:

On 19.01.2018 16:14, Antonin Houska wrote:

you should test the operator B-tree strategy: BTLessStrategyNumber,
BTLessEqualStrategyNumber, etc. The operator string alone does not tell
enough
about the operator semantics.

The strategy can be found in the pg_amop catalog.
get_op_btree_interpretation() function may be useful, but there may be
something better in utils/cache/lsyscache.c.


Thank you very much.
Shame on me that I didn't notice such a solution myself - such checking of
B-tree strategy is done in the same predtest.c file!
Now the checking of predicate clause compatibility is done in a much more
elegant and general way.
Attached please find the new version of the patch.

I wonder if you should create a new index and SysCache entry for
looking up range types by subtype.  I'll be interested to see what
others have to say about this range type-based technique -- it seems
clever to me, but I'm not familiar enough with this stuff to say if
there is some other approach you should be using instead.

I think that it is a good idea to add caching for the range type lookup.
If the community thinks it will be useful, I can try to add such a
mechanism. But it seems not so trivial, especially to properly handle
invalidations.


Some superficial project style comments (maybe these could be fixed
automatically with pgindent?):

+static TypeCacheEntry* lookup_range_type(Oid type)

+static RangeType* operator_to_range(TypeCacheEntry *typcache, Oid
oper, Const* literal)

... these should be like this:

static RangeType *
operator_to_range(...

I think the idea is that you can always search for a function
definition by looking for "name(" at the start of a line, so we
put a newline there.  Then there is the whitespace before "*", not
after it.

+ if (pred_range && clause_range && range_eq_internal(typcache,
pred_range, clause_range))
+ {
+ return true;
+ }

Unnecessary braces.

+/*
+ * Get range type for the corresprent scalar type.
+ * Returns NULl if such range type is not found.
+ * This function performs sequential scan in pg_range table,
+ * but since number of range rtype is not expected to be large (6
builtin range types),
+ * it should not be a problem.
+ */

Typos.


Thank you.
I fixed these issues.
New patch is attached.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bce3348..6a7e7fb 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 1df1e3a..c421530 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c 

Re: [HACKERS] path toward faster partition pruning

2018-01-29 Thread Jesper Pedersen

Hi Amit,

On 01/26/2018 04:17 AM, Amit Langote wrote:

I made a few of those myself in the updated patches.

Thanks a lot again for your work on this.



This needs a rebase.

Best regards,
 Jesper





Re: PATCH: Exclude unlogged tables from base backups

2018-01-29 Thread Michael Paquier
On Mon, Jan 29, 2018 at 07:28:22PM +0900, Masahiko Sawada wrote:
> Thank you for updating the patch! The patch looks good to me. But I
> have a question; can we exclude temp tables as well? The pg_basebackup
> includes even temp tables. But I don't think that it's necessary for
> backups.

They are not needed in base backups.  Note that RemovePgTempFiles() does
not remove temporary relfilenodes after a crash per the comments on its
top.  I have not looked at the patch in detail, but if you end up not
including those files, much refactoring of what's proposed is possible.
--
Michael


signature.asc
Description: PGP signature


Re: General purpose hashing func in pgbench

2018-01-29 Thread Ildar Musin



On 29.01.2018 15:03, Fabien COELHO wrote:



Patch applies, compiles, pgbench & global "make check" ok, doc built ok.

Ok for me, switched to "Ready".



Thank you for the thorough review!

--
Ildar Musin
i.mu...@postgrespro.ru



Re: A Generic Question about Generic type subscripting

2018-01-29 Thread Arthur Zakirov
On Mon, Jan 29, 2018 at 09:45:15AM +0200, Hannu Krosing wrote:
> ...
> I see two possibilities
> 
> 1) add a third "ARG" to the CREATE OPERATOR syntax, maybe VALUEARG
> 2) use composite types - so for
> 
> jsonb1[int1] = jsonb2
> 
> the operator would be defined by first defining a
> 
> CREATE TYPE intkey_and_jsonvalue as (key int, value jsonb)
> and then using this in
> CREATE OPERATOR [...] (PROCEDURE = jsonb_set_key, LEFTARG=jsonb,
> RIGHTARG=intkey_and_jsonvalue)

I think it will work for assignments. But what about fetching? For
example, say we have:

CREATE TYPE intkey_and_jsonvalue as (key int, value jsonb);
CREATE TYPE intkey_and_textvalue as (key int, value text);

What should the next query return?

select jsonb1[int1];

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Konstantin Knizhnik



On 26.01.2018 22:38, Andres Freund wrote:

And without it perf is not able to unwind stack trace for generated
code.

You can work around that by using --call-graph lbr with a sufficiently
new perf. That'll not know function names et al, but at least the parent
will be associated correctly.


With --call-graph lbr the result is ... slightly different (see attached
profile) but still there is an "unknown" bar.



But you are compiling code using LLVMOrcAddEagerlyCompiledIR
and I find no way to pass no-omit-frame pointer option here.

It shouldn't be too hard to open code support for it, encapsulated in a
function:
 // Set function attribute "no-frame-pointer-elim" based on
 // NoFramePointerElim.
 for (auto &F : *Mod) {
   auto Attrs = F.getAttributes();
   StringRef Value(options.NoFramePointerElim ? "true" : "false");
   Attrs = Attrs.addAttribute(F.getContext(), AttributeList::FunctionIndex,
  "no-frame-pointer-elim", Value);
   F.setAttributes(Attrs);
 }
that's all that option did for mcjit.


I have implemented the following function:

void
llvm_no_frame_pointer_elimination(LLVMModuleRef mod)
{
    llvm::Module *module = llvm::unwrap(mod);
    for (auto &F : *module) {
        auto Attrs = F.getAttributes();
        Attrs = Attrs.addAttribute(F.getContext(), 
llvm::AttributeList::FunctionIndex,

                                   "no-frame-pointer-elim", "true");
        F.setAttributes(Attrs);
    }
}

and call it before LLVMOrcAddEagerlyCompiledIR in llvm_compile_module:

        llvm_no_frame_pointer_elimination(context->module);
        smod = LLVMOrcMakeSharedModule(context->module);

        if (LLVMOrcAddEagerlyCompiledIR(compile_orc, &orc_handle, smod,
                                        llvm_resolve_symbol, NULL))
        {
            elog(ERROR, "failed to jit module");
        }


... but it has no effect: the produced profile is the same (with
--call-graph dwarf).

Maybe you can point me to my mistake...


Actually I am trying to find an answer to the question of why your version
of JIT provides a ~2x speedup on Q1, while the ISPRAS version
(https://www.pgcon.org/2017/schedule/attachments/467_PGCon%202017-05-26%2015-00%20ISPRAS%20Dynamic%20Compilation%20of%20SQL%20Queries%20in%20PostgreSQL%20Using%20LLVM%20JIT.pdf)
speeds up Q1 by 5.5x.
Maybe it is because they are using the double type to calculate aggregates
while, as far as I understand, you are using standard Postgres aggregate
functions?

Or maybe it is because the ISPRAS version is not checking for NULL values...

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: General purpose hashing func in pgbench

2018-01-29 Thread Fabien COELHO


Hello Ildar,


Fixed the doc, attached the patch. Thanks!


Patch applies, compiles, pgbench & global "make check" ok, doc built ok.

Ok for me, switched to "Ready".

--
Fabien.



Re: [HACKERS] [PATCH] Lockable views

2018-01-29 Thread Yugo Nagata
On Fri, 26 Jan 2018 21:30:49 +0900
Yugo Nagata  wrote:

> On Thu, 25 Jan 2018 20:51:41 +1300
> Thomas Munro  wrote:
> 
> > On Sun, Dec 31, 2017 at 11:57 PM, Yugo Nagata  wrote:
> > > On Fri, 29 Dec 2017 23:39:39 +0900 (JST)
> > > Tatsuo Ishii  wrote:
> > >> Your addition to the doc:
> > >> +   Automatically updatable views (see )
> > >> +   that do not have INSTEAD OF triggers or 
> > >> INSTEAD
> > >> +   rules are also lockable. When a view is locked, its base relations 
> > >> are
> > >> +   also locked recursively with the same lock mode.
> > >
> > > I added this point to the documentation.
> > 
> > +   Automatically updatable views (see )
> > +   that do not have INSTEAD OF triggers or INSTEAD
> 
> Thank you for pointing out this.
> 
> Also, I've noticed this patch cannot be applied to the HEAD, so I'll fix
> them together and update the patch.

Attached is the updated patch v5, including the SGML fixes and a rebase to HEAD.
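
To illustrate the intended behaviour (a sketch; object names are
illustrative):

CREATE TABLE t (a int PRIMARY KEY);
CREATE VIEW v AS SELECT * FROM t;  -- auto-updatable, no INSTEAD rules/triggers
BEGIN;
LOCK TABLE v IN ACCESS EXCLUSIVE MODE;  -- locks v and, recursively, t
COMMIT;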

Regards,

> 
> > 
> > The documentation doesn't build: you now need to say 
> > instead of , and you need to say .
> > 
> > -- 
> > Thomas Munro
> > http://www.enterprisedb.com
> > 
> 
> 
> -- 
> Yugo Nagata 
> 


-- 
Yugo Nagata 
diff --git a/doc/src/sgml/ref/lock.sgml b/doc/src/sgml/ref/lock.sgml
index b2c7203..6ddd128 100644
--- a/doc/src/sgml/ref/lock.sgml
+++ b/doc/src/sgml/ref/lock.sgml
@@ -46,6 +46,14 @@ LOCK [ TABLE ] [ ONLY ] name [ * ]
   
 
   
+   Automatically updatable views (see )
+   that do not have INSTEAD OF triggers or
+   INSTEAD rules are also lockable. When a view is
+   locked, its base relations are also locked recursively with the same
+   lock mode. Tables appearing in a subquery are ignored and not locked.
+  
+
+  
When acquiring locks automatically for commands that reference
tables, PostgreSQL always uses the least
restrictive lock mode possible. LOCK TABLE
diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
index 6479dcb..7ae3a84 100644
--- a/src/backend/commands/lockcmds.c
+++ b/src/backend/commands/lockcmds.c
@@ -19,15 +19,20 @@
 #include "commands/lockcmds.h"
 #include "miscadmin.h"
 #include "parser/parse_clause.h"
+#include "parser/parsetree.h"
 #include "storage/lmgr.h"
 #include "utils/acl.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
+#include "rewrite/rewriteHandler.h"
+#include "access/heapam.h"
 
-static void LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait);
-static AclResult LockTableAclCheck(Oid relid, LOCKMODE lockmode);
+static void LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait, Oid userid);
+static AclResult LockTableAclCheck(Oid relid, LOCKMODE lockmode, Oid userid);
 static void RangeVarCallbackForLockTable(const RangeVar *rv, Oid relid,
 			 Oid oldrelid, void *arg);
+static void LockViewRecurse(Oid reloid, Oid root_reloid, LOCKMODE lockmode, bool nowait);
+static void LockViewCheck(Relation view, Query *viewquery);
 
 /*
  * LOCK TABLE
@@ -62,8 +67,10 @@ LockTableCommand(LockStmt *lockstmt)
 		  RangeVarCallbackForLockTable,
 		  (void *) >mode);
 
-		if (recurse)
-			LockTableRecurse(reloid, lockstmt->mode, lockstmt->nowait);
+		if (get_rel_relkind(reloid) == RELKIND_VIEW)
+			LockViewRecurse(reloid, reloid, lockstmt->mode, lockstmt->nowait);
+		else if (recurse)
+			LockTableRecurse(reloid, lockstmt->mode, lockstmt->nowait, GetUserId());
 	}
 }
 
@@ -86,15 +93,17 @@ RangeVarCallbackForLockTable(const RangeVar *rv, Oid relid, Oid oldrelid,
 		return;	/* woops, concurrently dropped; no permissions
  * check */
 
-	/* Currently, we only allow plain tables to be locked */
-	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
+
+	/* Currently, we only allow plain tables or auto-updatable views to be locked */
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE &&
+		relkind != RELKIND_VIEW)
 		ereport(ERROR,
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("\"%s\" is not a table",
+ errmsg("\"%s\" is not a table or view",
 		rv->relname)));
 
 	/* Check permissions. */
-	aclresult = LockTableAclCheck(relid, lockmode);
+	aclresult = LockTableAclCheck(relid, lockmode, GetUserId());
 	if (aclresult != ACLCHECK_OK)
 		aclcheck_error(aclresult, get_relkind_objtype(get_rel_relkind(relid)), rv->relname);
 }
@@ -107,7 +116,7 @@ RangeVarCallbackForLockTable(const RangeVar *rv, Oid relid, Oid oldrelid,
  * multiply-inheriting children more than once, but that's no problem.
  */
 static void
-LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait)
+LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait, Oid userid)
 {
 	List	   *children;
 	ListCell   *lc;
@@ -120,7 +129,7 @@ LockTableRecurse(Oid reloid, LOCKMODE lockmode, bool nowait)
 		AclResult	aclresult;
 
 		/* Check permissions before acquiring the lock. */
-		aclresult = 

Re: [HACKERS] Restricting maximum keep segments by repslots

2018-01-29 Thread Kyotaro HORIGUCHI
Hello,

At Mon, 29 Jan 2018 19:26:34 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20180129.192634.217484965.horiguchi.kyot...@lab.ntt.co.jp>
> While rechecking the patch, I fixed the message issued on losing
> segments in 0001, revised the TAP test since I found that it was
> unstable.
> 
> The attached files are the correct version of the latest patch.

The name of the new function, GetMinKeepSegment, seemed to give the
wrong meaning. I renamed it to GetOldestKeepSegment.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 162dec1f6de1449047c4856722a276e1dcc14b63 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 21 Dec 2017 21:20:20 +0900
Subject: [PATCH 1/4] Add WAL relief vent for replication slots

Adds the capability to limit, via a GUC variable, the number of WAL
segments kept by replication slots.
---
 src/backend/access/transam/xlog.c | 114 +-
 src/backend/utils/misc/guc.c  |  11 +++
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/access/xlog.h |   1 +
 4 files changed, 107 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e42b828..c01d0b3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -105,6 +105,7 @@ int			wal_level = WAL_LEVEL_MINIMAL;
 int			CommitDelay = 0;	/* precommit delay in microseconds */
 int			CommitSiblings = 5; /* # concurrent xacts needed to sleep */
 int			wal_retrieve_retry_interval = 5000;
+int			max_slot_wal_keep_size_mb = 0;
 
 #ifdef WAL_DEBUG
 bool		XLOG_DEBUG = false;
@@ -861,6 +862,7 @@ static void checkTimeLineSwitch(XLogRecPtr lsn, TimeLineID newTLI,
 static void LocalSetXLogInsertAllowed(void);
 static void CreateEndOfRecoveryRecord(void);
 static void CheckPointGuts(XLogRecPtr checkPointRedo, int flags);
+static XLogSegNo GetOldestKeepSegment(XLogRecPtr currpos, XLogRecPtr minSlotPtr);
 static void KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo);
 static XLogRecPtr XLogGetReplicationSlotMinimumLSN(void);
 
@@ -9348,6 +9350,74 @@ CreateRestartPoint(int flags)
 }
 
 /*
+ * Returns the minimum segment number that the next checkpoint must leave,
+ * considering wal_keep_segments, replication slots and max_slot_wal_keep_size.
+ */
+static XLogSegNo
+GetOldestKeepSegment(XLogRecPtr currpos, XLogRecPtr minSlotPtr)
+{
+	uint64 keepSegs;
+	XLogSegNo currSeg;
+	XLogSegNo tailSeg;
+	uint64 slotlimitbytes;
+	uint64 slotlimitfragment;
+	uint64 currposoff;
+	XLogRecPtr slotpos = minSlotPtr;
+	XLogSegNo	slotSeg;
+
+	Assert(wal_keep_segments >= 0);
+	Assert(max_slot_wal_keep_size_mb >= 0);
+
+	XLByteToSeg(currpos, currSeg, wal_segment_size);
+	XLByteToSeg(slotpos, slotSeg, wal_segment_size);
+
+	/*
+	 * If wal_keep_segments keeps more segments than the slot, slotpos is no
+	 * longer useful. Avoid the subtraction so that values stay positive.
+	 */
+	if (slotpos != InvalidXLogRecPtr && currSeg <= slotSeg + wal_keep_segments)
+		slotpos = InvalidXLogRecPtr;
+
+	/* slots aren't useful, consider only wal_keep_segments */
+	if (slotpos == InvalidXLogRecPtr)
+	{
+		/* avoid underflow, don't go below 1 */
+		if (currSeg <= wal_keep_segments)
+			return 1;
+
+		return currSeg - wal_keep_segments;
+	}
+
+	/* just return slotSeg if we don't put a limit */
+	if (max_slot_wal_keep_size_mb == 0)
+		return slotSeg;
+
+	/*
+	 * The slot limit is defined and the slot gives the oldest segment to
+	 * keep; calculate the oldest segment that should not be removed.
+	 */
+	slotlimitbytes = 1024 * 1024 * max_slot_wal_keep_size_mb;
+	slotlimitfragment = XLogSegmentOffset(slotlimitbytes,
+										  wal_segment_size);
+	currposoff = XLogSegmentOffset(currpos, wal_segment_size);
+	keepSegs = wal_keep_segments +
+		ConvertToXSegs(max_slot_wal_keep_size_mb, wal_segment_size);
+	if (currposoff < slotlimitfragment)
+		keepSegs++;
+
+	/*
+	 * calculate the oldest segment that is kept by wal_keep_segments and
+	 * max_slot_wal_keep_size.
+	 */
+	if (currSeg <= keepSegs)
+		tailSeg = 1;
+	else
+		tailSeg = currSeg - keepSegs;
+
+	return tailSeg;
+}
+
+/*
  * Retreat *logSegNo to the last segment that we need to retain because of
  * either wal_keep_segments or replication slots.
  *
@@ -9359,34 +9429,38 @@ static void
 KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
 {
 	XLogSegNo	segno;
-	XLogRecPtr	keep;
+	XLogRecPtr	slotminptr = InvalidXLogRecPtr;
+	XLogSegNo	minSegNo;
+	XLogSegNo	slotSegNo;
 
 	XLByteToSeg(recptr, segno, wal_segment_size);
-	keep = XLogGetReplicationSlotMinimumLSN();
 
-	/* compute limit for wal_keep_segments first */
-	if (wal_keep_segments > 0)
-	{
-		/* avoid underflow, don't go below 1 */
-		if (segno <= wal_keep_segments)
-			segno = 1;
-		else
-			segno = segno - wal_keep_segments;
-	}
+	if (max_replication_slots > 0)
+		slotminptr = XLogGetReplicationSlotMinimumLSN();
 
-	/* then check whether slots limit 

Re: Logical Decoding and HeapTupleSatisfiesVacuum assumptions

2018-01-29 Thread Simon Riggs
On 23 January 2018 at 19:17, Petr Jelinek  wrote:

> I am not sure if this helps streaming use-case though as
> there is not going to be any external transaction management involved there.

So, I think we need some specific discussion of what to do in that case.

Streaming happens only with big transactions and only for short periods.

The problem only occurs when we are decoding and we hit a catalog
table change. Processing of that is short, and then we continue. So it
seems perfectly fine to block aborts in those circumstances. We can
just mark that state in an in-memory array of
StreamingDecodedTransactions that has size SizeOf(TransactionId) *
MaxNumWalSenders.

We can add a check into RecordTransactionAbort() just before the
critical section to see if we are currently processing a
StreamingDecodedTransaction and if so, poll until we're OK to abort.
The check will be quick and the abort path is not considered one we
need to optimize.
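
To make that concrete, here is a minimal sketch of the array and the
abort-time check. All names below are invented for illustration, and a
real version would need a spinlock or atomics around the shared state:

/* One entry per walsender: the XID it is currently stream-decoding. */
typedef struct StreamingDecodedTransactions
{
	TransactionId xids[FLEXIBLE_ARRAY_MEMBER];	/* max_wal_senders entries */
} StreamingDecodedTransactions;

static StreamingDecodedTransactions *StreamingState;	/* in shared memory */

/* Returns true while some walsender is stream-decoding the given xid. */
static bool
IsBeingStreamDecoded(TransactionId xid)
{
	int			i;

	for (i = 0; i < max_wal_senders; i++)
	{
		if (TransactionIdEquals(StreamingState->xids[i], xid))
			return true;
	}
	return false;
}

/* Then, in RecordTransactionAbort(), just before the critical section: */
	while (IsBeingStreamDecoded(xid))
		pg_usleep(1000L);	/* poll until decoding has moved past us */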

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PATCH: Exclude unlogged tables from base backups

2018-01-29 Thread Masahiko Sawada
On Fri, Jan 26, 2018 at 4:58 AM, David Steele  wrote:
> On 1/25/18 12:31 AM, Masahiko Sawada wrote:
>> On Thu, Jan 25, 2018 at 3:25 AM, David Steele  wrote:

 Here is the first review comments.

 +   unloggedDelim = strrchr(path, '/');

 I think it doesn't work fine on windows. How about using
 last_dir_separator() instead?
>>>
>>> I think this function is OK on Windows -- we use it quite a bit.
>>> However, last_dir_separator() is clearer so I have changed it.
>>
>> Thank you for updating this. I was concerned that the separator
>> character '/' might not work on Windows.
>
> Ah yes, I see what you mean now.
>
>> On Thu, Jan 25, 2018 at 6:23 AM, David Steele  wrote:
>>> On 1/24/18 4:02 PM, Adam Brightwell wrote:
>>> Actually, I was talking to Stephen about this and it seems like #3 would be
>>> more practical if we just stat'd the init fork for each relation file
>>> found.  I doubt the stat would add a lot of overhead and we can track
>>> each unlogged relation in a hash table to reduce overhead even more.
>>>
>>
>> Can readdir() handle files that are added during the loop? I think
>> that we still cannot exclude a new unlogged relation if the relation
>> is added after we first execute readdir. To eliminate this completely,
>> we would need some sort of lock that prevents current backends from
>> creating new unlogged relations, or we would need to run the readdir
>> loop multiple times to confirm no new relations were added while sending files.
>
> As far as I know readdir() is platform-dependent in terms of how it
> scans the dir and whether files created after the opendir() will appear.
>
> It shouldn't matter, though, since WAL replay will recreate those files.

Yea, agreed.

>
>> If you're updating the patch to implement #3, this patch should be
>> marked as "Waiting on Author". After updated I'll review it again.
> Attached is a new patch that uses stat() to determine if the init fork
> for a relation file exists.  I decided not to build a hash table as it
> could use considerable memory and I didn't think it would be much faster
> than a simple stat() call.
>
> The reinit.c refactor has been removed since it was no longer needed.
> I'll submit the tests I wrote for reinit.c as a separate patch for the
> next CF.
>

Thank you for updating the patch! The patch looks good to me. But I
have a question: can we exclude temp tables as well? pg_basebackup
currently includes even temp tables, but I don't think they are
necessary for backups.
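
As an aside, the stat()-based check described above can be quite cheap.
A minimal sketch, assuming the caller has already stripped any fork or
segment suffix from the file name (has_init_fork() is an invented name):

#include <sys/stat.h>

/*
 * Given a main-fork file path such as "base/16384/16385", test whether
 * the matching init fork "base/16384/16385_init" exists.
 */
static bool
has_init_fork(const char *path)
{
	char		initpath[MAXPGPATH];
	struct stat st;

	snprintf(initpath, sizeof(initpath), "%s_init", path);
	return (stat(initpath, &st) == 0);
}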

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Restricting maximum keep segments by repslots

2018-01-29 Thread Kyotaro HORIGUCHI
Thank you for kindly notifying me of that.

At Mon, 29 Jan 2018 11:07:31 +1300, Thomas Munro 
 wrote in 

Re: Logical Decoding and HeapTupleSatisfiesVacuum assumptions

2018-01-29 Thread Simon Riggs
On 29 January 2018 at 07:15, Nikhil Sontakke  wrote:

>> Having this as responsibility of plugin sounds interesting. It certainly
>> narrows the scope for which we need to solve the abort issue. For 2PC
>> that may be okay as we need to somehow interact with transaction manager
>> as Simon noted. I am not sure if this helps streaming use-case though as
>> there is not going to be any external transaction management involved there.

I think we should recognize that the two cases are different.

The 2PC decoding patch looks at what happens AFTER a PREPARE has
occurred on a transaction.

Streaming, by contrast, occurs right in the middle of a
transaction, so it has other concerns and likely other solutions as a
result.

>> In any case, all this interlocking could potentially be made less
>> impactful by only doing it when we know the transaction did catalog
>> changes prior to currently decoded change (which we do during decoding)
>> since that's the only time we are interested in if it aborted or not.
>>
>> This all leads me to another idea. What if logical decoding provided an API
>> for "locking/unlocking" the currently decoded transaction against abort.
>> This function would then be called by both decoding and output plugin
>> before any catalog read. The function can be smart enough to be NOOP if
>> the transaction is not running (ie we are not doing 2PC decoding or
>> streaming) or when the transaction didn't do any catalog modifications
>> (we already have that info easily accessible as bool).
>>
>> That would mean we'd never do any kind of heavy locking for prolonged
>> periods of time (ie network calls) but only during catalog access and
>> only when needed. It would also solve this for both 2PC and streaming
>> and it would be easy to use by plugin authors. Just document that some
>> call should be done before catalog access when in an output plugin; it can
>> probably even be Asserted that the call was done.
>>
>> Thoughts?
>>
>
> Yeah, this might work. We already have SET_LOCKTAG_TRANSACTION() via
> which every transaction takes an exclusive lock on its own
> transactionid when it starts, for example. We ideally want a single
> solution to handle 2PC and ongoing (streaming) transactions. We could
> introduce a new SET_LOCKTAG_LOGICALTRANSACTION(). The logical decoding
> process could take a SHARED lock on this, check if the XID is still ok
> to decode, read the catalog and unlock. Abort/Commit transaction
> processing could take this in EXCLUSIVE mode.
>
> As mentioned above, the plugin API which takes this lock will be smart
> enough to be a NOOP if the transaction is not running (i.e we are not
> doing 2PC decoding or streaming) or when the transaction didn't do any
> catalog modifications.
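
For concreteness, such a tag could be modeled directly on the existing
SET_LOCKTAG_TRANSACTION() in lock.h. Everything below is hypothetical;
LOCKTAG_LOGICALTRANSACTION does not exist today:

#define SET_LOCKTAG_LOGICALTRANSACTION(locktag, xid) \
	((locktag).locktag_field1 = (xid), \
	 (locktag).locktag_field2 = 0, \
	 (locktag).locktag_field3 = 0, \
	 (locktag).locktag_field4 = 0, \
	 (locktag).locktag_type = LOCKTAG_LOGICALTRANSACTION, \
	 (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)

/* Decoding side, around each catalog read: */
	LOCKTAG		tag;

	SET_LOCKTAG_LOGICALTRANSACTION(tag, xid);
	LockAcquire(&tag, ShareLock, false, false);
	/* ... confirm the XID is still decodable, read the catalog ... */
	LockRelease(&tag, ShareLock, false);

/* Abort/commit processing would take the same tag in ExclusiveLock mode. */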

I think it is enough to say that ROLLBACK PREPARED cannot abort a
transaction while decoding is taking place, if the plugin decides it
wishes to prevent that. So the plugin API must have a
pre-rollback-prepared call to allow the plugin to decide how to handle
that (Is that what you meant by the pre-abort hook?). That call will
just return when and if it is OK to abort. Any waiting or throwing of
ERRORs can occur inside the plugin.

The plugin can provide a function to allow that type of abort if it
wishes, but no new core functionality is required for that. If needed,
we would wait on a latch the plugin provides.
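
As a sketch of the eventual API (nothing like this exists in the output
plugin interface today, and plugin_is_decoding() is an invented
plugin-private check):

typedef void (*LogicalDecodePreRollbackPreparedCB) (
	struct LogicalDecodingContext *ctx,
	TransactionId xid);

static void
my_pre_rollback_prepared(struct LogicalDecodingContext *ctx,
						 TransactionId xid)
{
	/* Block until the decoder has finished with this transaction. */
	while (plugin_is_decoding(xid))
	{
		WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT, 1000L,
				  PG_WAIT_EXTENSION);
		ResetLatch(MyLatch);
		CHECK_FOR_INTERRUPTS();
	}
}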

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: JIT compiling with LLVM v9.0

2018-01-29 Thread Pierre Ducroquet
On Monday, January 29, 2018 10:46:13 AM CET Andres Freund wrote:
> Hi,
> 
> On 2018-01-28 23:02:56 +0100, Pierre Ducroquet wrote:
> > I have fixed the build issues with LLVM 3.9 and 4.0. The LLVM
> > documentation is really lacking when it comes to porting from version x
> > to x+1.
> > The only really missing part I found is that in 3.9, GlobalValueSummary
> > has no flag showing if it's not EligibleToImport. I am not sure about the
> > consequences.
> 
> I think that'd not be too bad, it'd just lead to some small increase in
> overhead as more modules would be loaded.
> 
> > BTW, the makefile for src/backend/lib does not remove the llvmjit_types.bc
> > file when cleaning, and doesn't seem to install in the right folder.
> 
> Hm, both seems to be right here? Note that the llvmjit_types.bc file
> should *not* go into the bitcode/ directory, as it's about syncing types
> not inlining. I've added a comment to that effect.

The file was installed in lib/ while the code expected it in lib/postgresql,
so there was something wrong here.
And deleting the file when cleaning is needed if a different LLVM version is
selected at configure time: the file must be generated with a clang release
that is no more recent than the LLVM version linked to PostgreSQL; otherwise,
the generated bitcode is not accepted by LLVM.

Regards

 Pierre


