Re: [HACKERS] Support to COMMENT ON DATABASE CURRENT_DATABASE

2017-10-05 Thread Jing Wang
Hi all,

The patch has been updated according to Nathan's comments.
Thanks Nathan's review.

Please find the updated patch in the attached files:
comment_on_current_database_no_pgdump_v4.3.patch --- support
current_database keyword exclude the pg_dump part.
comment_on_current_database_for_pgdump_v4.3.patch --- only for the pg_dump
part changed based on the previous patch

Regards,
Jing Wang
Fujitsu Australia


comment_on_current_database_no_pgdump_v4.3.patch
Description: Binary data


comment_on_current_database_for_pgdump_v4.3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-10-05 Thread Nico Williams
On Fri, Oct 06, 2017 at 11:04:38AM +0800, Craig Ringer wrote:
> On 6 October 2017 at 10:52, Pavel Stehule  wrote:
> 
> > It is better to work on GLOBAL TEMP tables.
> >
> > Current TEMP tables, if you do it for any session has pretty significant
> > overhead  - with possible risk of performance lost (system catalog bloat).
> >
> > pretty significant performance issue of my customers are related to temp
> > tables usage (under high load)
> 
> I've seen the same thing too. Especially when combined with logical
> decoding, where IIRC we mark transactions as having catalog changes
> due to temp tables.
> 
> Sometimes the catalog bloat can be truly horrible when a user has
> hundreds of plpgsql functions that all like to make temp tables.

I agree that we should have GLOBAL TEMP tables, but also we should have
a pg_temp_catalog where all TEMP schema elements go...  (That, I'm sure,
would be a lot of work.)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-10-05 Thread Nico Williams
On Fri, Oct 06, 2017 at 04:52:09AM +0200, Pavel Stehule wrote:
> 2017-10-05 22:31 GMT+02:00 Nico Williams :
> > On Tue, Aug 01, 2017 at 03:36:23PM -0400, Peter Eisentraut wrote:
> > > On 7/21/17 13:14, Jim Mlodgenski wrote:
> > > > When I first saw this thread, my initial thought of a use case is to
> > > > prepare some key application queries so they are there and ready to go.
> > > > That would need to be before the ExecutorStart_hook or
> > > > ProcessUtility_hook if an app would just want to execute the prepared
> > > > statement.
> > >
> > > Isn't that what the preprepare extension does already?
> >
> > more generic facility -> more useful
> >
> > My use case is to pre-create TEMP schema elements that VIEWs, FUNCTIONs,
> > and TRIGGERs, might need.
> 
> It is better to work on GLOBAL TEMP tables.

I don't disagree.

In fact, I was scoping out what it might take to do that just yesterday.

I've too thoughts on that: either a new relpersistence kind that is very
similar to persistent, but which always uses temp heaps, or a modifier
for the persistent kind that says to use temp heaps.  Either way it
looks like it should be fairly straightforward (but then, i've only
ever written one thing for PG, earlier this week, the ALWAYS DEFERRED
thing).

> Current TEMP tables, if you do it for any session has pretty significant
> overhead  - with possible risk of performance lost (system catalog bloat).

Because of the DDLs for them?

> So often creating local temp tables is antipattern (in Postgres)
> unfortunately.

I do it plenty, but sometimes I use an UNLOGGED table with a txid column
in the PK set to txid_current(), then I clean up where I can.  It'd be
nice to have COMMIT triggers for cleaning up such rows, among other
things.  I've implemented that using DDL event triggers, but to perform
well it needs to be a native feature.

> I am not sure, if we should to support this case more :( Probably is
> better, so it is hard to use local TEMP tables.

No, I want GLOBAL TEMP tables.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Append implementation

2017-10-05 Thread Amit Kapila
On Thu, Oct 5, 2017 at 4:11 PM, Amit Khandekar  wrote:
>
> Ok. How about removing pa_all_partial_subpaths altogether , and
> instead of the below condition :
>
> /*
> * If all the child rels have partial paths, and if the above Parallel
> * Append path has a mix of partial and non-partial subpaths, then consider
> * another Parallel Append path which will have *all* partial subpaths.
> * If enable_parallelappend is off, make this one non-parallel-aware.
> */
> if (partial_subpaths_valid && !pa_all_partial_subpaths)
> ..
>
> Use this condition :
> if (partial_subpaths_valid && pa_nonpartial_subpaths != NIL)
> ..
>

Sounds good to me.

One minor point:

+ if (!node->as_padesc)
+ {
+ /*
+ */
+ if (!exec_append_seq_next(node))
+ return ExecClearTuple(node->ps.ps_ResultTupleSlot);
+ }

It seems either you want to add a comment in above part of patch or
you just left /**/ mistakenly.

> 
>
>
> Regarding a mix of partial and non-partial paths, I feel it always
> makes sense for the leader to choose the partial path.
>

Okay, but why not cheapest partial path?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-10-05 Thread Craig Ringer
On 6 October 2017 at 10:52, Pavel Stehule  wrote:

> It is better to work on GLOBAL TEMP tables.
>
> Current TEMP tables, if you do it for any session has pretty significant
> overhead  - with possible risk of performance lost (system catalog bloat).
>
> pretty significant performance issue of my customers are related to temp
> tables usage (under high load)

I've seen the same thing too. Especially when combined with logical
decoding, where IIRC we mark transactions as having catalog changes
due to temp tables.

Sometimes the catalog bloat can be truly horrible when a user has
hundreds of plpgsql functions that all like to make temp tables.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PATCH: Expose generate_qualified_relation_name functionality

2017-10-05 Thread Craig Ringer
I'm regularly surprised that the only non-static function we seem to
have for getting a relation name by oid is get_rel_name. It doesn't
handle schema qualification at all, and it returns NULL rather than
ERROR'ing.

Doing it correctly involves interacting with the syscache, calling
RelationIsVisible and calling get_namespace_name if needed, then
passing the result to quote_qualified_identifier.

so here's what I propose to do:

Add get_qualified_relation_name(bool force_qualified, bool missing_ok)
to ruleutils.c and builtins.h.

(Unless there's somewhere better? I wanted to put it in lsyscache.c
alongside get_rel_name, but it uses quote_qualified_identifier,
StringInfo, etc, so it doesn't fit well in lsyscache.c.)

Replace generate_qualified_relation_name in ruleutils.c with a call to
get_qualified_relation_name(relid, true, false) .

Add comments to RelationGetRelationName, get_rel_name and regclassout
pointing to get_qualified_relation_name.

The only part I don't like here is the two boolean arguments, since
they're ugly to read. But inventing a new flags field seemed a bit
heavy for such a trivial task.

I had a quick look through the codebase for places where this pattern
is repeated and found astonishingly few. In most places we just use
get_rel_name and hope the user can figure out any ambiguity.

I did notice that in heap_copy_data and AlterTableMoveAll we manually
schema-qualify a relation name and fail to call
quote_qualified_identifier, so it's unclear if we mean a relation
named "my.relation" or the relation "relation" in schema "my". But in
diagnostic output, meh, whatever.

Arguably regclassout does the same thing as
get_qualified_relation_name, but I didn't replace it because it cares
about IsBootstrapProcessingMode().

(Prompted by https://dba.stackexchange.com/q/187788/7788)

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 011cd6e08d6d29854db637d6beb8709615f376cb Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Fri, 6 Oct 2017 10:48:17 +0800
Subject: [PATCH v1] Add get_qualified_relation_name

It was surprisingly hard to just get a table name in PostgreSQL at the
C level.

To simplify that, expose the functionality of generate_qualified_relation_name
and generate_relation_name to callers in the wider codebase and to extensions
in the form of a new get_qualified_relation_name function.
---
 src/backend/utils/adt/regproc.c |  2 +
 src/backend/utils/adt/ruleutils.c   | 81 +++--
 src/backend/utils/cache/lsyscache.c |  5 ++-
 src/include/utils/builtins.h|  2 +
 src/include/utils/rel.h |  4 ++
 5 files changed, 70 insertions(+), 24 deletions(-)

diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index 6fe81fa..691efa8 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -967,6 +967,8 @@ to_regclass(PG_FUNCTION_ARGS)
 
 /*
  * regclassout		- converts class OID to "class_name"
+ *
+ * (See get_qualified_relation_name for a direct-callable version).
  */
 Datum
 regclassout(PG_FUNCTION_ARGS)
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index eb01f35..0eccc02 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -10483,6 +10483,9 @@ quote_qualified_identifier(const char *qualifier,
  *
  * This differs from the underlying get_rel_name() function in that it will
  * throw error instead of silently returning NULL if the OID is bad.
+ *
+ * The returned relation name is not guaranteed to be unique outside the
+ * schema, so it is frequently preferable to use get_qualified_relation_name.
  */
 static char *
 get_relation_name(Oid relid)
@@ -10495,6 +10498,56 @@ get_relation_name(Oid relid)
 }
 
 /*
+ * get_qualified_relation_name
+ *
+ * Get a relation name as a palloc'd string in the current memory context.
+ *
+ * If the relation is not on the current search path, or if force_qualify is
+ * true, the namespace for the relation is prepended.
+ *
+ * ERROR's on a missing relation unless missing_ok is set, in which case returns
+ * NULL.
+ *
+ * The quoting rules used are the same as those for regclass quoting: each
+ * component is quoted if necessary to prevent ambiguity.
+ */
+char *
+get_qualified_relation_name(Oid relid, bool force_qualify, bool missing_ok)
+{
+	HeapTuple   tp;
+	Form_pg_class reltup;
+	char   *relname;
+	char   *nspname = NULL;
+	char   *result;
+
+	tp = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+	if (!HeapTupleIsValid(tp))
+	{
+		if (missing_ok)
+			return NULL;
+		else
+			elog(ERROR, "cache lookup failed for relation %u", relid);
+	}
+
+	reltup = (Form_pg_class) GETSTRUCT(tp);
+	relname = NameStr(reltup->relname);
+
+	if (force_qualify || !RelationIsVisible(relid))
+	{
+		nspname = get_namespace_name(reltup->relnamespace);
+		if (!nspname)
+			elog(ERROR, "cache 

Re: [HACKERS] [PATCH] A hook for session start

2017-10-05 Thread Pavel Stehule
2017-10-05 22:31 GMT+02:00 Nico Williams :

> On Tue, Aug 01, 2017 at 03:36:23PM -0400, Peter Eisentraut wrote:
> > On 7/21/17 13:14, Jim Mlodgenski wrote:
> > > When I first saw this thread, my initial thought of a use case is to
> > > prepare some key application queries so they are there and ready to go.
> > > That would need to be before the ExecutorStart_hook or
> > > ProcessUtility_hook if an app would just want to execute the prepared
> > > statement.
> >
> > Isn't that what the preprepare extension does already?
>
> more generic facility -> more useful
>
> My use case is to pre-create TEMP schema elements that VIEWs, FUNCTIONs,
> and TRIGGERs, might need.
>

It is better to work on GLOBAL TEMP tables.

Current TEMP tables, if you do it for any session has pretty significant
overhead  - with possible risk of performance lost (system catalog bloat).

pretty significant performance issue of my customers are related to temp
tables usage (under high load)

So often creating local temp tables is antipattern (in Postgres)
unfortunately.

I am not sure, if we should to support this case more :( Probably is
better, so it is hard to use local TEMP tables.

Regards

Pavel

>
> Nico
> --
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] valgrind complains about WaitEventSetWaitBlock on HEAD (fe9ba28e)

2017-10-05 Thread Andres Freund
On 2017-10-06 11:20:05 +0900, Michael Paquier wrote:
> Hi all,
> 
> While running valgrind on latest HEAD (suppression list included),  I
> am seeing complains with epoll_pwait() on Linux:
> ==12692== Syscall param epoll_pwait(sigmask) points to unaddressable byte(s)
> ==12692==at 0x62F72D0: epoll_pwait (in /usr/lib/libc-2.26.so)
> ==12692==by 0x5D819C: WaitEventSetWaitBlock (latch.c:1048)
> ==12692==by 0x5D8060: WaitEventSetWait (latch.c:1000)
> ==12692==by 0x5D774D: WaitLatchOrSocket (latch.c:385)
> ==12692==by 0x5D7619: WaitLatch (latch.c:339)
> ==12692==by 0x582E57: ApplyLauncherMain (launcher.c:976)
> ==12692==by 0x550BA7: StartBackgroundWorker (bgworker.c:841)
> ==12692==by 0x5658BC: do_start_bgworker (postmaster.c:5693)
> ==12692==by 0x565C37: maybe_start_bgworkers (postmaster.c:5897)
> ==12692==by 0x5620FC: reaper (postmaster.c:2887)
> ==12692==by 0x4E4AD9F: ??? (in /usr/lib/libpthread-2.26.so)
> ==12692==by 0x62EEAA6: select (in /usr/lib/libc-2.26.so)
> ==12692==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> 
> I have not looked at that very closely, but this does not look normal,
> hence this post.

I think this might be more a valgrind bug than anything. Note the
"sigmask points to to unaddressable byte" and "Address 0x0 is not
stack'd, malloc'd or (recently) free'd" bits. It's valid to pass a NULL
sigmask argument.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] valgrind complains about WaitEventSetWaitBlock on HEAD (fe9ba28e)

2017-10-05 Thread Michael Paquier
Hi all,

While running valgrind on latest HEAD (suppression list included),  I
am seeing complains with epoll_pwait() on Linux:
==12692== Syscall param epoll_pwait(sigmask) points to unaddressable byte(s)
==12692==at 0x62F72D0: epoll_pwait (in /usr/lib/libc-2.26.so)
==12692==by 0x5D819C: WaitEventSetWaitBlock (latch.c:1048)
==12692==by 0x5D8060: WaitEventSetWait (latch.c:1000)
==12692==by 0x5D774D: WaitLatchOrSocket (latch.c:385)
==12692==by 0x5D7619: WaitLatch (latch.c:339)
==12692==by 0x582E57: ApplyLauncherMain (launcher.c:976)
==12692==by 0x550BA7: StartBackgroundWorker (bgworker.c:841)
==12692==by 0x5658BC: do_start_bgworker (postmaster.c:5693)
==12692==by 0x565C37: maybe_start_bgworkers (postmaster.c:5897)
==12692==by 0x5620FC: reaper (postmaster.c:2887)
==12692==by 0x4E4AD9F: ??? (in /usr/lib/libpthread-2.26.so)
==12692==by 0x62EEAA6: select (in /usr/lib/libc-2.26.so)
==12692==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

I have not looked at that very closely, but this does not look normal,
hence this post.
Thanks,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A design for amcheck heapam verification

2017-10-05 Thread Peter Geoghegan
On Fri, Sep 29, 2017 at 10:54 AM, Peter Geoghegan  wrote:
>> Something that allocates new memory as the patch's bloom_init()
>> function does I'd tend to call 'make' or 'create' or 'new' or
>> something, rather than 'init'.
>
> I tend to agree. I'll adopt that style in the next version. I just
> didn't want the caller to have to manage the memory themselves.

v3 of the patch series, attached, does it that way -- it adds a
bloom_create(). The new bloom_create() function still allocates its
own memory, but does so while using a FLEXIBLE_ARRAY_MEMBER. A
separate bloom_init() function (that works with dynamic shared memory)
could easily be added later, for the benefit of parallel hash join.

Other notable changes:

* We now support bloom filters that have bitsets of up to 512MB in
size. The previous limit was 128MB.

* We now use TransactionXmin for our AccessShareLock xmin cutoff,
rather than calling GetOldestXmin(). This is the same cut-off used by
xacts that must avoid broken hot chains for their earliest snapshot.
This avoids a scan of the proc array, and allows more thorough
verification, since GetOldestXmin() was overly restrictive here.

* Expanded code comments describing the kinds of problems the new
verification capability is expected to be good at catching.

For example, there is now a passing reference to the CREATE INDEX
CONCURRENTLY bug that was fixed back in February (it's given as an
example of a more general problem -- faulty HOT safety assessment).
With the new heapallindexed enhancement added by this patch, amcheck
can be expected to catch that issue much of the time. We also go into
heap-only tuple handling within IndexBuildHeapScan(). The way that
CREATE INDEX tries to index the most recent tuple in a HOT chain
(while locating the root tuple in the chain, to get the right heap TID
for the index) has proven to be very useful as a smoke test while
investigating HOT/VACUUM FREEZE bugs in the past couple of weeks [1].
I believe it would have caught several historic MultiXact/recovery
bugs, too. This all seems worth noting explicitly.

[1] 
https://postgr.es/m/cah2-wznm4rcrhfaiwkpwtpew2bxdtgrozk7jwwgucxeh3d1...@mail.gmail.com
-- 
Peter Geoghegan
From 3bed03a9e0506c0b81097b634c5f1b5534a2dcb3 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Tue, 2 May 2017 00:19:24 -0700
Subject: [PATCH 2/2] Add amcheck verification of indexes against heap.

Add a new, optional capability to bt_index_check() and
bt_index_parent_check():  callers can check that each heap tuple that
ought to have an index entry does in fact have one.  This happens at the
end of the existing verification checks.

This is implemented by using a Bloom filter data structure.  The
implementation performs set membership tests within a callback (the same
type of callback that each index AM registers for CREATE INDEX).  The
Bloom filter is populated during the initial index verification scan.
---
 contrib/amcheck/Makefile |   2 +-
 contrib/amcheck/amcheck--1.0--1.1.sql|  28 
 contrib/amcheck/amcheck.control  |   2 +-
 contrib/amcheck/expected/check_btree.out |  14 +-
 contrib/amcheck/sql/check_btree.sql  |   9 +-
 contrib/amcheck/verify_nbtree.c  | 275 ---
 doc/src/sgml/amcheck.sgml| 157 ++
 src/include/utils/snapmgr.h  |   2 +-
 8 files changed, 423 insertions(+), 66 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.0--1.1.sql

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 43bed91..c5764b5 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,7 +4,7 @@ MODULE_big	= amcheck
 OBJS		= verify_nbtree.o $(WIN32RES)
 
 EXTENSION = amcheck
-DATA = amcheck--1.0.sql
+DATA = amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree
diff --git a/contrib/amcheck/amcheck--1.0--1.1.sql b/contrib/amcheck/amcheck--1.0--1.1.sql
new file mode 100644
index 000..e6cca0a
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.0--1.1.sql
@@ -0,0 +1,28 @@
+/* contrib/amcheck/amcheck--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.1'" to load this file. \quit
+
+--
+-- bt_index_check()
+--
+DROP FUNCTION bt_index_check(regclass);
+CREATE FUNCTION bt_index_check(index regclass,
+heapallindexed boolean DEFAULT false)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'bt_index_check'
+LANGUAGE C STRICT PARALLEL RESTRICTED;
+
+--
+-- bt_index_parent_check()
+--
+DROP FUNCTION bt_index_parent_check(regclass);
+CREATE FUNCTION bt_index_parent_check(index regclass,
+heapallindexed boolean DEFAULT false)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'bt_index_parent_check'
+LANGUAGE C STRICT PARALLEL RESTRICTED;
+
+-- Don't want these to be available to public
+REVOKE ALL ON FUNCTION bt_index_check(regclass, 

Re: [HACKERS] v10 bottom-listed

2017-10-05 Thread Amit Langote
On 2017/10/05 22:28, Erik Rijkers wrote:
> In the 'ftp' listing, v10 appears at the bottom:
>   https://www.postgresql.org/ftp/source/
> 
> With all the other v10* directories at the top, we could get a lot of
> people installing wrong binaries...
> 
> Maybe it can be fixed so that it appears at the top.

Still see it at the bottom.  Maybe ping pgsql-www?

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JIT compiling - v4.0

2017-10-05 Thread Robert Haas
On Thu, Oct 5, 2017 at 2:57 AM, Andres Freund  wrote:
> master q01 min: 14146.498 dev min: 11479.05 [diff -23.24] dev-jit 
> min: 8659.961 [diff -63.36] dev-jit-deform min: 7279.395 [diff -94.34]
>  dev-jit-deform-inline min: 6997.956 [diff -102.15]

I think this is a really strange way to display this information.
Instead of computing the percentage of time that you saved, you've
computed the negative of the percentage that you would have lost if
the patch were already committed and you reverted it.  That's just
confusing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimise default partition scanning while adding new partition

2017-10-05 Thread Amit Langote
On 2017/10/06 2:25, Robert Haas wrote:
> On Tue, Sep 26, 2017 at 4:27 AM, Amit Langote wrote:
>> I guess we don't need to squash, as they could be seen as implementing
>> different features. Reordering the patches helps though.  So, apply them
>> in this order:
>>
>> 1. My patch to teach ValidatePartitionConstraints() to skip scanning
>>a partition's own partitions, which optimizes ATTACH PARTITION
>>command's partition constraint validation scan (this also covers the
>>case of scanning the default partition to validate its updated
>>constraint when attaching a new partition)
>>
>> 2. Jeevan's patch to teach check_default_allows_bound() to skip scanning
>>the default partition's own partitions, which covers the case of
>>scanning the default partition to validate its updated constraint when
>>adding a new partition using CREATE TABLE
>>
>> Attached 0001 and 0002 are ordered that way.
> 
> OK, I pushed all of this, spread out over 3 commits.  I reworked the
> test cases not to be entangled with the existing test cases, and also
> to test both of these highly-related features together.  Hopefully you
> like the result.

Thanks for committing.

I noticed that 6476b26115f3 doesn't use the same INFO message as
14f67a8ee28.  You can notice in the following example that the message
emitted (that the default partition scan is skipped) is different when a
new partition is added with CREATE TABLE and with ATTACH PARTITION,
respectively.

create table foo (a int, b char) partition by list (a);

-- default partition
create table food partition of foo default partition by list (b);

-- default partition's partition with the check constraint
create table fooda partition of food (check (not (a is not null and a = 1
and a = 2))) for values in ('a');

create table foo1 partition of foo for values in (1);
INFO:  partition constraint for table "fooda" is implied by existing
constraints
CREATE TABLE

create table foo2 (like foo);
alter table foo attach partition foo2 for values in (2);
INFO:  updated partition constraint for default partition "fooda" is
implied by existing constraints
ALTER TABLE

That's because it appears that you applied Jeevan's original patch.
Revised version of his patch that I had last posted contained the new
message. Actually the revised patch had not only addressed the case where
the default partition's child's scan is skipped, but also the case where
the scan of default partition itself is skipped.  As things stand now:

alter table food add constraint skip_check check (not (a is not null and
(a = any (array[1, 2];

create table foo1 partition of foo for values in (1);
INFO:  partition constraint for table "food" is implied by existing
constraints
CREATE TABLE

create table foo2 (like foo);
CREATE TABLE

alter table foo attach partition foo2 for values in (2);
INFO:  updated partition constraint for default partition "food" is
implied by existing constraints
ALTER TABLE


Attached a patch to modify the INFO messages in check_default_allows_bound.

Thanks,
Amit
From 1bd3b03bcd1f8af6cec3a22c40e96c9cde46e705 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 6 Oct 2017 10:23:24 +0900
Subject: [PATCH] Like c31e9d4bafd, but for check_default_allows_bound

---
 src/backend/catalog/partition.c   | 10 +++---
 src/test/regress/expected/alter_table.out |  4 ++--
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 3a8306a055..1459fba778 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -920,7 +920,7 @@ check_default_allows_bound(Relation parent, Relation 
default_rel,
if (PartConstraintImpliedByRelConstraint(default_rel, 
def_part_constraints))
{
ereport(INFO,
-   (errmsg("partition constraint for table \"%s\" 
is implied by existing constraints",
+   (errmsg("updated partition constraint for 
default partition \"%s\" is implied by existing constraints",

RelationGetRelationName(default_rel;
return;
}
@@ -956,16 +956,12 @@ check_default_allows_bound(Relation parent, Relation 
default_rel,
{
part_rel = heap_open(part_relid, NoLock);
 
-   /*
-* If the partition constraints on default partition 
child imply
-* that it will not contain any row that would belong 
to the new
-* partition, we can avoid scanning the child table.
-*/
+   /* Can we skip scanning this part_rel? */
if (PartConstraintImpliedByRelConstraint(part_rel,

 def_part_constraints))
   

Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Craig Ringer
On 6 October 2017 at 08:06, Andres Freund  wrote:
> On 2017-10-06 07:59:40 +0800, Craig Ringer wrote:
>> The only thing that gets me excited about a threaded postgres is the
>> ability to have a PL/Java, PL/Mono etc that don't suck. We could do
>> some really cool things that just aren't practical right now.
>
> Faster parallelism with a lot less reinventing the wheel. Easier backend
> / session separation. Shared caches.

Yeah. We have a pretty major NIH problem in PostgreSQL, and I agree
that adopting threading and some commonplace tools would sure help us
reduce that burden a bit.

I would really miss shared-nothing-by-default though.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Andres Freund


On October 5, 2017 5:15:41 PM PDT, Tom Lane  wrote:
>Andres Freund  writes:
>> On 2017-10-06 07:59:40 +0800, Craig Ringer wrote:
>>> The only thing that gets me excited about a threaded postgres is the
>>> ability to have a PL/Java, PL/Mono etc that don't suck. We could do
>>> some really cool things that just aren't practical right now.
>
>> Faster parallelism with a lot less reinventing the wheel. Easier
>backend
>> / session separation. Shared caches.
>
>What you guys are talking about here is a threaded backend, which is a
>whole different matter from replacing the client-side threading that
>Nico
>was looking at.  That would surely offer far higher rewards, but the
>costs
>to get there are likewise orders of magnitude greater.

No disagreement there. Don't really see much need for it client side though.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Tom Lane
Andres Freund  writes:
> On 2017-10-06 07:59:40 +0800, Craig Ringer wrote:
>> The only thing that gets me excited about a threaded postgres is the
>> ability to have a PL/Java, PL/Mono etc that don't suck. We could do
>> some really cool things that just aren't practical right now.

> Faster parallelism with a lot less reinventing the wheel. Easier backend
> / session separation. Shared caches.

What you guys are talking about here is a threaded backend, which is a
whole different matter from replacing the client-side threading that Nico
was looking at.  That would surely offer far higher rewards, but the costs
to get there are likewise orders of magnitude greater.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Andres Freund
On 2017-10-06 07:59:40 +0800, Craig Ringer wrote:
> The only thing that gets me excited about a threaded postgres is the
> ability to have a PL/Java, PL/Mono etc that don't suck. We could do
> some really cool things that just aren't practical right now.

Faster parallelism with a lot less reinventing the wheel. Easier backend
/ session separation. Shared caches.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Craig Ringer
On 6 October 2017 at 06:49, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2017-10-05 17:31:07 -0500, Nico Williams wrote:
>>> You don't think eliminating a large difference between handling of WIN32
>>> vs. POSIX is a good reason?
>
>> I seems like you'd not really get a much reduced set of differences,
>> just a *different* set of differences. After investing time.
>
> Yeah -- unless we're prepared to drop threadless systems altogether,
> this doesn't seem like it does much for maintainability.  It might even
> be a net negative on that score, due to reducing the amount of testing
> the now-legacy code path would get.
>
> If there were reason to think we'd get a large performance benefit,
> or some other concrete win, it might be worth putting time into this.
> But I see no reason to believe that.
>
> (There's certainly an argument to be made that no-one cares about
> platforms without thread support anymore.  But I'm unconvinced that
> rewriting existing code that works fine is the most productive
> way to exploit such a choice if we were to make it.)

The only thing that gets me excited about a threaded postgres is the
ability to have a PL/Java, PL/Mono etc that don't suck. We could do
some really cool things that just aren't practical right now.

Not compelling to a wide audience, really.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Andres Freund
On 2017-10-05 18:49:22 -0400, Tom Lane wrote:
> (There's certainly an argument to be made that no-one cares about
> platforms without thread support anymore.  But I'm unconvinced that
> rewriting existing code that works fine is the most productive
> way to exploit such a choice if we were to make it.)

Yea, that's pretty much what I'm thinking too.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Tom Lane
Andres Freund  writes:
> On 2017-10-05 17:31:07 -0500, Nico Williams wrote:
>> You don't think eliminating a large difference between handling of WIN32
>> vs. POSIX is a good reason?

> I seems like you'd not really get a much reduced set of differences,
> just a *different* set of differences. After investing time.

Yeah -- unless we're prepared to drop threadless systems altogether,
this doesn't seem like it does much for maintainability.  It might even
be a net negative on that score, due to reducing the amount of testing
the now-legacy code path would get.

If there were reason to think we'd get a large performance benefit,
or some other concrete win, it might be worth putting time into this.
But I see no reason to believe that.

(There's certainly an argument to be made that no-one cares about
platforms without thread support anymore.  But I'm unconvinced that
rewriting existing code that works fine is the most productive
way to exploit such a choice if we were to make it.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Nico Williams
On Thu, Oct 05, 2017 at 03:34:41PM -0700, Andres Freund wrote:
> On 2017-10-05 17:31:07 -0500, Nico Williams wrote:
> > > >vfork() is widely demonized, but it's actually quite superior
> > > >(performance-wise) to fork() when all you want to do is exec-or-exit
> > > >since no page copying (COW or otherwise) needs be done when using
> > > >vfork().
> > > 
> > > Not on linux, at least not as of a year or two back.
> > 
> > glibc has it.  Other Linux C libraries might also; I've not checked them
> > all.
> 
> It has it, but it's not more efficient.

Because of signal-blocking issues?

> > > I do think it'd be good to move more towards threads, but not at all for
> > > the reasons mentioned here.
> > 
> > You don't think eliminating a large difference between handling of WIN32
> > vs. POSIX is a good reason?
> 
> I seems like you'd not really get a much reduced set of differences,
> just a *different* set of differences. After investing time.

Fair enough.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Andres Freund
Hi,

On 2017-10-05 17:31:07 -0500, Nico Williams wrote:
> > >vfork() is widely demonized, but it's actually quite superior
> > >(performance-wise) to fork() when all you want to do is exec-or-exit
> > >since no page copying (COW or otherwise) needs be done when using
> > >vfork().
> > 
> > Not on linux, at least not as of a year or two back.
> 
> glibc has it.  Other Linux C libraries might also; I've not checked them
> all.

It has it, but it's not more efficient.


> > I do think it'd be good to move more towards threads, but not at all for
> > the reasons mentioned here.
> 
> You don't think eliminating a large difference between handling of WIN32
> vs. POSIX is a good reason?

I seems like you'd not really get a much reduced set of differences,
just a *different* set of differences. After investing time.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Nico Williams
On Thu, Oct 05, 2017 at 03:13:07PM -0700, Andres Freund wrote:
> On 2017-10-05 17:02:22 -0500, Nico Williams wrote:
> >A quick look at the functions called on the child side of fork()
> >makes me think that it's unlikely that the children here use
> >async-signal-safe functions only.
> 
> That's not a requirement unless you're using fork *and* threads. At
> least by my last reading of posix and common practice.

True, yes.  One still has to be careful to fflush() all open FILEs (that
might be used on both sides of fork()) and such though.

> >  - fork() is used in a number of places where execl() or execv() are
> >called immediately after (and exit() if the exec fails).
> > 
> >It would be better to use vfork() where available and _exit() instead
> >of exit().
> 
> vfork is less portable, and doesn't really win us anything on common
> platforms. On most it's pretty much the same implementation.

It's trivial to use it where available, and fork() otherwise.  Mind you,
all current versions of Solaris/Illumos, *BSD, OS X, and Linux w/glibc
(and even Windows with WSL!) have a true vfork().

> >vfork() is widely demonized, but it's actually quite superior
> >(performance-wise) to fork() when all you want to do is exec-or-exit
> >since no page copying (COW or otherwise) needs be done when using
> >vfork().
> 
> Not on linux, at least not as of a year or two back.

glibc has it.  Other Linux C libraries might also; I've not checked them
all.

> I do think it'd be good to move more towards threads, but not at all for
> the reasons mentioned here.

You don't think eliminating a large difference between handling of WIN32
vs. POSIX is a good reason?

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search path security issue?

2017-10-05 Thread David G. Johnston
On Thu, Oct 5, 2017 at 3:05 PM, Joshua D. Drake 
wrote:

> On 10/05/2017 02:54 PM, David G. Johnston wrote:
>
>> On Thu, Oct 5, 2017 at 2:37 PM, Joshua D. Drake > >wrote:
>>
>> I get being able to change my search_path on the fly but it seems
>> odd that as user foo I can change my default search path?
>>
>>
>> Seems down-right thoughtful of us to allow users to change their own
>> defaults instead of forcing them to always change things on-the-fly or bug
>> a DBA to change the default for them.
>>
>
> It seems that if a super user changes the search path with ALTER
> USER/ROLE, then the user itself should not (assuming not an elevated
> privilege) should not be able to change it. Again, I get being able to do
> it with SET but a normal user shouldn't be able to reset a super user
> determined setting.
>

If the superuser really wanted to enforce it they could, a bit verbosely,
set the default for the user for every database explicitly so that the
database+role setting takes precedence over the role-only setting.​

I get where you are coming from but there is no demonstrated
security-related reason to enforce such a restriction and so adding the
book-keeping necessary to track "who done it" (i.e. mere presence of a
value is an insufficient distinguishing characteristic) has a cost but no
benefit (if the superuser is upset that someone changed their own default
search_path the fact that said user is entrusted with login credentials
should be questioned).

David J.


Re: [HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Andres Freund
Hi,

On 2017-10-05 17:02:22 -0500, Nico Williams wrote:
>A quick look at the functions called on the child side of fork()
>makes me think that it's unlikely that the children here use
>async-signal-safe functions only.

That's not a requirement unless you're using fork *and* threads. At
least by my last reading of posix and common practice.



>  - fork() is used in a number of places where execl() or execv() are
>called immediately after (and exit() if the exec fails).
> 
>It would be better to use vfork() where available and _exit() instead
>of exit().

vfork is less portable, and doesn't really win us anything on common
platforms. On most it's pretty much the same implementation.


>vfork() is widely demonized, but it's actually quite superior
>(performance-wise) to fork() when all you want to do is exec-or-exit
>since no page copying (COW or otherwise) needs be done when using
>vfork().

Not on linux, at least not as of a year or two back.


I do think it'd be good to move more towards threads, but not at all for
the reasons mentioned here.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search path security issue?

2017-10-05 Thread Joshua D. Drake

On 10/05/2017 02:54 PM, David G. Johnston wrote:
On Thu, Oct 5, 2017 at 2:37 PM, Joshua D. Drake >wrote:


I get being able to change my search_path on the fly but it seems
odd that as user foo I can change my default search path?


Seems down-right thoughtful of us to allow users to change their own 
defaults instead of forcing them to always change things on-the-fly or 
bug a DBA to change the default for them.


It seems that if a super user changes the search path with ALTER 
USER/ROLE, then the user itself should not (assuming not an elevated 
privilege) should not be able to change it. Again, I get being able to 
do it with SET but a normal user shouldn't be able to reset a super user 
determined setting.


Shrug,

JD



David J.
​



--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own.   *


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] fork()-safety, thread-safety

2017-10-05 Thread Nico Williams
A thread on parallelization made me wonder so I took a look:

 - src/bin/*/parallel.c uses threads on WIN32
 - src/bin/*/parallel.c uses fork() on not-WIN32

   (Ditto src/bin/pg_basebackup/pg_basebackup.c and
   src/backend/postmaster/syslogger.c.)

   A quick look at the functions called on the child side of fork()
   makes me think that it's unlikely that the children here use
   async-signal-safe functions only.

   Why not use threads on all systems where threads are available when
   we'd use threads on some such systems?  If this code is thread-safe
   on WIN32, why wouldn't it be thread-safe on POSIX?  (Well, naturally
   there may be calls to, e.g., getpwnam() and such that would not be
   thread-safe on POSIX, and which might not exist on WIN32.  But I
   mean, aside from that, if synchronization is done correctly on WIN32,
   what would stop that from being true on POSIX?)

 - fork() is used in a number of places where execl() or execv() are
   called immediately after (and exit() if the exec fails).

   It would be better to use vfork() where available and _exit() instead
   of exit().

   Alternatively posix_spawn() should be used (which generally uses
   vfork() or equivalent under the covers).

   vfork() is widely demonized, but it's actually quite superior
   (performance-wise) to fork() when all you want to do is exec-or-exit
   since no page copying (COW or otherwise) needs be done when using
   vfork().

   It's actually safer to use vfork() because POSIX limits one to
   async-signal-safe functions between fork() and exec-or-exit...  With
   fork(), where neither the parent nor the child immediately execs-or-
   exits, it's too easy to fail to make sure that the code they execute
   is fork-safe.  Whereas with vfork() the fact that the parent (just
   the one thread, incidentally, not all of them[*]) blocks until the
   child execs-or-exits means it's impossible to fail to notice a
   long-running child that does lots of fork-unsafe work.

   It's safer still to use posix_spawn(), naturally.

In Unix-land it's standard practice to ignore the async-signal-safe
requirement when using fork() early on in a daemon's life to start
worker processes.  This is fine, of course, though if we're using
CreateProcess*()/_spawn() on WIN32 anyways, it might be best to do the
equivalent on Unix and just spawn the children -- if nothing else, this
would reduce the likelihood of unintended divergence between WIN32 and
Unix.

Nico

[*] Actually, I do believe that on Solaris/Illumos vfork() stops all
threads in the parent, if I remember correctly anyways.  Linux's and
NetBSD's vfork() only stops the one thread in the parent that called
it.  I haven't checked other BSDs.  There was a patch for NetBSD to
stop all threads in the parent, but I convinced the NetBSD community
to discard that patch.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search path security issue?

2017-10-05 Thread David G. Johnston
On Thu, Oct 5, 2017 at 2:37 PM, Joshua D. Drake 
wrote:

> I get being able to change my search_path on the fly but it seems odd that
> as user foo I can change my default search path?
>

Seems down-right thoughtful of us to allow users to change their own
defaults instead of forcing them to always change things on-the-fly or bug
a DBA to change the default for them.

David J.
​


Re: [HACKERS] search path security issue?

2017-10-05 Thread Tom Lane
"Joshua D. Drake"  writes:
> I get being able to change my search_path on the fly but it seems odd 
> that as user foo I can change my default search path?

Why is that odd?  It's a USERSET variable.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] search path security issue?

2017-10-05 Thread Joshua D. Drake

-hackers,

Please see the below:

"""
postgres=# create user foo;
CREATE ROLE
postgres=# create schema foo;
CREATE SCHEMA
postgres=# alter role foo set search_path to 'foo';
ALTER ROLE
postgres=# \q
jd@jd-wks:~$ psql -U foo postgres
psql (9.6.5)
Type "help" for help.

postgres=> show search_path;
 search_path
-
 foo
(1 row)

postgres=> alter role foo set search_path to default;
ALTER ROLE
postgres=> show search_path;
 search_path
-
 foo
(1 row)

postgres=> \q


jd@jd-wks:~$ psql -U foo postgres
psql (9.6.5)
Type "help" for help.

postgres=> show search_path;
   search_path
-
 "$user", public
(1 row)


I get being able to change my search_path on the fly but it seems odd 
that as user foo I can change my default search path?


JD



--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own.   *


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2017-10-05 Thread Robert Haas
On Mon, Sep 25, 2017 at 10:17 AM, Alexander Kuzmenkov
 wrote:
> Here is some news about the CSN patch.
>
> * I merged it with master (58bd60995f), which now has the clog group update.
> With this optimization, CSN is now faster than the master by about 15% on
> 100 to 400 clients (72 cores, pgbench tpcb-like, scale 500). It does not
> degrade faster than master as it did before. The numbers of clients greater
> than 400 were not measured.

Hmm, that's gratifying.

> * Querying for CSN of subtransactions was not implemented in the previous
> version of the patch, so I added it. I tested the performance on the
> tpcb-like pgbench script with some savepoints added, and it was
> significantly worse than on the master. The main culprit seems to be the
> ProcArrayLock taken in GetSnapshotData, GetRecentGlobalXmin,
> ProcArrayEndTransaction. Although it is only taken in shared mode, just
> reading the current lock mode and writing the same value back takes about
> 10% CPU. Maybe we could do away with some of these locks, but there is some
> interplay with imported snapshots and replication slots which I don't
> understand well. I plan to investigate this next.

That's not so good, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [BUGS] Re: [PATCH] BUG #13416: Postgres >= 9.3 doesn't use optimized shared memory on Solaris anymore

2017-10-05 Thread Sean Chittenden
Fair enough.  We observed a ~4x amplification in memory usage so it was rather 
severe in our case.

The patch you referenced was a much nicer approach and Sam updated it to match 
that style (thank you Sam!).  We debated this internally and feel strongly that 
this should be exposed as a runtime GUC as suggested.  What's your take on the 
attached patch?

-sc



--
Sean Chittenden
se...@joyent.com

On Oct 4, 2017, 10:56 AM -0700, Andres Freund , wrote:
> Hi,
>
> On 2017-10-04 10:47:06 -0700, Sean Chittenden wrote:
> > Hello.  We identified the same problem.  Sam Gwydir and Josh Clulow were 
> > able to put together the appropriate fix after.
> >
> > The breakage in src/backend/port/sysv_shmem.c and 
> > src/include/storage/dsm_impl.h should be back ported to all supported 
> > versions (the regression was introduced between the 9.2 and 9.3 branches).
>
> Personally I don't think "breakage" is quite the right work.
>
> I also don't like that we're unconditionally not using
> USE_ANONYMOUS_SHMEM - doesn't that run into similar config limits on
> solaris based stuff as it does on linux etc?
>
> I think if we want to do this, we should rather go with a patch like
> https://www.postgresql.org/message-id/20140422121921.gd4...@awork2.anarazel.de
>
> Greetings,
>
> Andres Freund


ism2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallelize queries containing initplans

2017-10-05 Thread Robert Haas
On Thu, Oct 5, 2017 at 1:16 PM, Amit Kapila  wrote:
> On Thu, Oct 5, 2017 at 6:08 PM, Robert Haas  wrote:
>> On Thu, Oct 5, 2017 at 5:52 AM, Amit Kapila  wrote:
>>> Now, unless, I am missing something here, it won't be possible to
>>> detect params in such cases during forming of join rels and hence we
>>> need the tests in generate_gather_paths.  Let me know if I am missing
>>> something in this context or if you have any better ideas to make it
>>> work?
>>
>> Hmm, in a case like this, I think we shouldn't need to detect it.  The
>> Var is perfectly parallel-safe, the problem is that we can't use a
>> not-safe plan for the inner rel.  I wonder why that's happening
>> here...
>
> It is because the inner rel (Result Path) contains a reference to a
> param which appears to be at the same query level.  Basically due to
> changes in max_parallel_hazard_walker().

I spent several hours debugging this issue this afternoon.  I think
you've misdiagnosed the problem.  I think that the Param reference in
the result path is parallel-safe; that doesn't seem to me to be wrong.
If we see a Param reference for our own query level, then either we're
below the Gather and the new logic added by this patch will pass down
the value or we're above the Gather and we can access the value
directly.  Either way, no problem.

However, I think that if you attach an InitPlan to a parallel-safe
plan, it becomes parallel-restricted.  This is obvious in the case
where the InitPlan's plan isn't itself parallel-safe, but it's also
arguably true even when the InitPlan's plan *is* parallel-safe,
because pushing that below a Gather introduces a multiple-evaluation
hazard.  I think we should fix that problem someday by providing a
DSA-based parameter store, but that's a job for another day.

And it turns out that we actually have such logic already, but this
patch removes it:

@@ -2182,7 +2181,6 @@ SS_charge_for_initplans(PlannerInfo *root,
RelOptInfo *final_rel)

path->startup_cost += initplan_cost;
path->total_cost += initplan_cost;
-   path->parallel_safe = false;
}

/* We needn't do set_cheapest() here, caller will do it */

Now, the header comment for SS_charge_for_initplans() is wrong; it
claims we can't transmit initPlans to workers, but I think that's
already wrong even before this patch.  On the other hand, I think that
the actual code is right even after this patch.  If I put that line
back but make contains_parallel_unsafe_param always return false, then
I can still get plans like this (I modified EXPLAIN to show Parallel
Safe markings)...

rhaas=# explain select * from pgbench_accounts where bid = (select
max(g) from generate_series(1,1000)g);
   QUERY PLAN
-
 Gather  (cost=12.51..648066.51 rows=10 width=97)
   Parallel Safe: false
   Workers Planned: 2
   Params Evaluated: $0
   InitPlan 1 (returns $0)
 ->  Aggregate  (cost=12.50..12.51 rows=1 width=4)
   Parallel Safe: true
   ->  Function Scan on generate_series g  (cost=0.00..10.00
rows=1000 width=4)
 Parallel Safe: true
   ->  Parallel Seq Scan on pgbench_accounts  (cost=0.00..648054.00
rows=41667 width=97)
 Parallel Safe: true
 Filter: (bid = $0)
(12 rows)

...but Kuntal's example no longer misbehaves:

  QUERY PLAN
--
 Hash Semi Join
   Parallel Safe: false
   Output: t1.i, t1.j, t1.k
   Hash Cond: (t1.i = ((1 + $1)))
   ->  Gather
 Parallel Safe: false
 Output: t1.i, t1.j, t1.k
 Workers Planned: 2
 ->  Parallel Seq Scan on public.t1
   Parallel Safe: true
   Output: t1.i, t1.j, t1.k
   ->  Hash
 Parallel Safe: false
 Output: ((1 + $1))
 ->  Result
   Parallel Safe: false
   Output: (1 + $1)
   InitPlan 1 (returns $1)
 ->  Finalize Aggregate
   Parallel Safe: false
   Output: max(t3.j)
   ->  Gather
 Parallel Safe: false
 Output: (PARTIAL max(t3.j))
 Workers Planned: 2
 ->  Partial Aggregate
   Parallel Safe: true
   Output: PARTIAL max(t3.j)
   ->  Parallel Seq Scan on public.t3
 Parallel Safe: true
 Output: t3.j
(31 rows)

With your original path, the Result was getting marked parallel-safe,
but now it doesn't, which is correct, and after that everything seems
to just work.

Notice that in my 

Re: [HACKERS] [PATCH] A hook for session start

2017-10-05 Thread Nico Williams
On Tue, Aug 01, 2017 at 03:36:23PM -0400, Peter Eisentraut wrote:
> On 7/21/17 13:14, Jim Mlodgenski wrote:
> > When I first saw this thread, my initial thought of a use case is to
> > prepare some key application queries so they are there and ready to go.
> > That would need to be before the ExecutorStart_hook or
> > ProcessUtility_hook if an app would just want to execute the prepared
> > statement.
> 
> Isn't that what the preprepare extension does already?

more generic facility -> more useful

My use case is to pre-create TEMP schema elements that VIEWs, FUNCTIONs,
and TRIGGERs, might need.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-10-05 Thread Nico Williams
On Fri, Jul 21, 2017 at 11:10:52PM +0800, Craig Ringer wrote:
> What practical use cases are there for acting post-auth but that can't wait
> until the user tries to do something?

Creating TEMP schema that triggers and functions might need.

Doing CREATE TEMP TABLE IF NOT EXISTS in triggers slows things down.

It'd be super nice if PostgreSQL had some sort of persistent TEMP
schema option, where you can have schema elements that are persistent
in that they're always there, but where the data is all TEMP.  Oracle
has this and they call it GLOBAL TEMP IIRC.  There would be some
caveats, such as not being able to have FKs between these sorts of
persistent temp tables and persistent tables.

In the absence of such a feature, a session hook/trigger is a great
workaround.

> Can a user do anything remotely interesting or useful without hitting
> either ExecutorStart_hook or ProcessUtility_hook? They can parse queries I
> guess but you could just set your hook up in the parser instead. If you
> hook the parser all they can do is open an idle session and sit there...

In any other hook you'd have to check whether the session setup work you
wanted to do has been done.  That could be potentially slow.

I actually have an all SQL implementation of session/begin/commit
triggers.  The session triggers in that implementation only run on the
first DML statement, which could be too late for OP's purpose.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] A hook for session start

2017-10-05 Thread Fabrízio de Royes Mello
On Thu, Oct 5, 2017 at 4:14 PM, Fabrízio de Royes Mello <
fabriziome...@gmail.com> wrote:
>
>
>
> On Tue, Aug 1, 2017 at 4:55 PM, Robert Haas  wrote:
> >
> > On Tue, Aug 1, 2017 at 3:37 PM, Peter Eisentraut
> >  wrote:
> > > On 7/21/17 12:59, Robert Haas wrote:
> > >> That's an exceedingly-weak argument for rejecting this patch.  The
> > >> fact that you can probably hack around the lack of a hook for most
> > >> reasonable use cases is not an argument for having a hook that does
> > >> what people actually want to do.
> > >
> > > Still nobody has presented a concrete use case so far.
> >
> > I've been asked for this at EDB, too.  Inserting a row into some table
> > on each logon, for example.
> >
> > A quick Google search found 6 previous requests for this feature, some
> > of which describe intended use cases:
> >
> > https://www.postgresql.org/message-id/4ebc6852.5030...@fuzzy.cz (2011)
> >
https://www.postgresql.org/message-id/CAHyXU0wrsYShxmwBxZSGYoiBJa%3DgzEJ17iAeRvaf_vA%2BcoH_qA%40mail.gmail.com
> > (2011)
> >
https://www.postgresql.org/message-id/bay104-w513cf26c0046c9d28747b8d1...@phx.gbl
> > (2009, in Spanish)
> >
https://www.postgresql.org/message-id/758d5e7f0803130227m558d32cdl7159bed00d21f084%40mail.gmail.com
> > (2008)
> >
https://www.postgresql.org/message-id/001a01c48077%240b118e60%240200030a%40gendron.ca
> > (2004)
> >
https://www.postgresql.org/message-id/f96sgcorblsaqv6updv0...@hotmail.com
> > (2000)
> >
>
> Hi all,
>
> I'm sending a new rebased patches and added tests to src/tests/modules as
suggested before.
>

Also added for the next commitfest:

https://commitfest.postgresql.org/15/1318/

Att,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] [PATCH] A hook for session start

2017-10-05 Thread Fabrízio de Royes Mello
On Tue, Aug 1, 2017 at 4:55 PM, Robert Haas  wrote:
>
> On Tue, Aug 1, 2017 at 3:37 PM, Peter Eisentraut
>  wrote:
> > On 7/21/17 12:59, Robert Haas wrote:
> >> That's an exceedingly-weak argument for rejecting this patch.  The
> >> fact that you can probably hack around the lack of a hook for most
> >> reasonable use cases is not an argument for having a hook that does
> >> what people actually want to do.
> >
> > Still nobody has presented a concrete use case so far.
>
> I've been asked for this at EDB, too.  Inserting a row into some table
> on each logon, for example.
>
> A quick Google search found 6 previous requests for this feature, some
> of which describe intended use cases:
>
> https://www.postgresql.org/message-id/4ebc6852.5030...@fuzzy.cz (2011)
>
https://www.postgresql.org/message-id/CAHyXU0wrsYShxmwBxZSGYoiBJa%3DgzEJ17iAeRvaf_vA%2BcoH_qA%40mail.gmail.com
> (2011)
>
https://www.postgresql.org/message-id/bay104-w513cf26c0046c9d28747b8d1...@phx.gbl
> (2009, in Spanish)
>
https://www.postgresql.org/message-id/758d5e7f0803130227m558d32cdl7159bed00d21f084%40mail.gmail.com
> (2008)
>
https://www.postgresql.org/message-id/001a01c48077%240b118e60%240200030a%40gendron.ca
> (2004)
>
https://www.postgresql.org/message-id/f96sgcorblsaqv6updv0...@hotmail.com
> (2000)
>

Hi all,

I'm sending a new rebased patches and added tests to src/tests/modules as
suggested before.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c807b00..5c22f2d 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -165,6 +165,9 @@ static bool RecoveryConflictPending = false;
 static bool RecoveryConflictRetryable = true;
 static ProcSignalReason RecoveryConflictReason;
 
+/* Hook for plugins to get control at start of session */
+session_start_hook_type session_start_hook = NULL;
+
 /* 
  *		decls for routines only used in this file
  * 
@@ -3838,6 +3841,9 @@ PostgresMain(int argc, char *argv[],
 	if (!IsUnderPostmaster)
 		PgStartTime = GetCurrentTimestamp();
 
+	if (session_start_hook)
+		(*session_start_hook) (dbname, username);
+
 	/*
 	 * POSTGRES main processing loop begins here
 	 *
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index f8c535c..d349592 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -35,6 +35,11 @@ extern PGDLLIMPORT const char *debug_query_string;
 extern int	max_stack_depth;
 extern int	PostAuthDelay;
 
+/* Hook for plugins to get control at start of session */
+typedef void (*session_start_hook_type) (const char *dbname,
+		 const char *username);
+extern PGDLLIMPORT session_start_hook_type session_start_hook;
+
 /* GUC-configurable parameters */
 
 typedef enum
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 5c22f2d..79f3099 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -165,9 +165,9 @@ static bool RecoveryConflictPending = false;
 static bool RecoveryConflictRetryable = true;
 static ProcSignalReason RecoveryConflictReason;
 
-/* Hook for plugins to get control at start of session */
+/* Hook for plugins to get control at start or end of session */
 session_start_hook_type session_start_hook = NULL;
-
+session_end_hook_type session_end_hook = NULL;
 /* 
  *		decls for routines only used in this file
  * 
@@ -192,6 +192,7 @@ static void drop_unnamed_stmt(void);
 static void log_disconnections(int code, Datum arg);
 static void enable_statement_timeout(void);
 static void disable_statement_timeout(void);
+static void do_session_end_hook(int code, Datum arg);
 
 
 /* 
@@ -3845,6 +3846,12 @@ PostgresMain(int argc, char *argv[],
 		(*session_start_hook) (dbname, username);
 
 	/*
+	 * Setup handler to session end hook
+	 */
+	if (IsUnderPostmaster)
+		on_proc_exit(do_session_end_hook, 0);
+
+	/*
 	 * POSTGRES main processing loop begins here
 	 *
 	 * If an exception is encountered, processing resumes here so we abort the
@@ -4596,3 +4603,15 @@ disable_statement_timeout(void)
 		stmt_timeout_active = false;
 	}
 }
+
+/*
+ * on_proc_exit handler to call session end hook
+ */
+static void
+do_session_end_hook(int code, Datum arg)
+{
+	Port	   *port = MyProcPort;
+
+	if (session_end_hook)
+		(*session_end_hook) (port->database_name, port->user_name);
+}
\ No newline at end of file

Re: [HACKERS] Binary search in fmgr_isbuiltin() is a bottleneck.

2017-10-05 Thread Tom Lane
Andres Freund  writes:
> On 2017-10-05 17:08:39 +0300, Heikki Linnakangas wrote:
>> BTW, there's some alignment padding in FmgrBuiltin, when MAXIMUM_ALIGNOF==8.
>> You could easily shrink the struct from 32 to 24 bytes by moving funcName to
>> the end of the struct:

> Yea, that's probably worthwhile, although I suspect it's not a huge save
> overall. Do you just want to commit that?

+1

>> If we care about cache efficiency here, we could move funcName out of the
>> fmgr_builtins array, to a separate array of the same size.

> When'd that be beneficial? fmgr_builtins is pretty much only used for
> internal-language CREATE FUNCTIONs? In other cases oid bounds + mapping
> array should filter out the access before fmgr_builtins is accessed.

I'm -1 on this, it would complicate using the data structure to look up
the name of a built-in function from its OID.  (Don't know that anyone
actually does that, but I see no reason to break their code if they do.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2017 : Patch for predicate locking in Gist index

2017-10-05 Thread Shubham Barai
 Sent with Mailtrack

<#>

On 3 October 2017 at 00:32, Alexander Korotkov 
wrote:

> On Mon, Oct 2, 2017 at 9:11 PM, Andrew Borodin 
> wrote:
>
>> On Mon, Oct 2, 2017 at 8:00 PM, Alexander Korotkov
>>  wrote:
>> > What happen if exactly this "continue" fires?
>> >
>> >> if (GistFollowRight(stack->page))
>> >> {
>> >> if (!xlocked)
>> >> {
>> >> LockBuffer(stack->buffer, GIST_UNLOCK);
>> >> LockBuffer(stack->buffer, GIST_EXCLUSIVE);
>> >> xlocked = true;
>> >> /* someone might've completed the split when we unlocked */
>> >> if (!GistFollowRight(stack->page))
>> >> continue;
>> >
>> >
>> > In this case we might get xlocked == true without calling
>> > CheckForSerializableConflictIn().
>> Indeed! I've overlooked it. I'm remembering this issue, we were
>> considering not fixing splits if in Serializable isolation, but
>> dropped the idea.
>>
>
> Yeah, current insert algorithm assumes that split must be fixed before we
> can correctly traverse the tree downwards.
>
>
>> CheckForSerializableConflictIn() must be after every exclusive lock.
>>
>
> I'm not sure, that fixing split is the case to necessary call
> CheckForSerializableConflictIn().  This lock on leaf page is not taken to
> do modification of the page.  It's just taken to ensure that nobody else is
> fixing this split the same this.  After fixing the split, it might appear
> that insert would go to another page.
>
> > I think it would be rather safe and easy for understanding to more
>> > CheckForSerializableConflictIn() directly before gistinserttuple().
>> The difference is that after lock we have conditions to change page,
>> and before gistinserttuple() we have actual intent to change page.
>>
>> From the point of future development first version is better (if some
>> new calls occasionally spawn in), but it has 3 calls while your
>> proposal have 2 calls.
>> It seems to me that CheckForSerializableConflictIn() before
>> gistinserttuple() is better as for now.
>>
>
> Agree.
>

I have updated the location of  CheckForSerializableConflictIn()  and
changed the status of the patch to "needs review".

Regards,
Shubham


Predicate-locking-in-gist-index_3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-05 Thread Wood, Dan
Yes, I’ve been testing 9.6.  I’ll try Alvaro’s patch today.

I would prefer to focus on either latest 9X or 11dev.  Does Alvaro’s patch 
presume any of the other patch to set COMMITTED in the freeze code?


On 10/4/17, 7:17 PM, "Michael Paquier"  wrote:

On Thu, Oct 5, 2017 at 10:39 AM, Wood, Dan  wrote:
> Whatever you do make sure to also test 250 clients running lock.sql.  
Even with the communities fix plus YiWen’s fix I can still get duplicate rows.  
What works for “in-block” hot chains may not work when spanning blocks.

Interesting. Which version did you test? Only 9.6?

> Once nearly all 250 clients have done their updates and everybody is 
waiting to vacuum which one by one will take a while I usually just “pkill -9 
psql”.  After that I have many of duplicate “id=3” rows.  On top of that I 
think we might have a lock leak.  After the pkill I tried to rerun setup.sql to 
drop/create the table and it hangs.  I see an autovacuum process starting and 
existing every couple of seconds.  Only by killing and restarting PG can I drop 
the table.

Yeah, that's more or less what I have been doing. My tests involve
using your initial script with way more sessions triggering lock.sql,
minus the kill-9 portion (good idea actually). I can of course see the
sessions queuing for VACUUM, still I cannot see duplicated rows, even
if I headshot Postgres in the middle of the VACUUM waiting queue. Note
that I have just tested Alvaro's patch on 9.3.
-- 
Michael




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-05 Thread Peter Geoghegan
On Thu, Oct 5, 2017 at 4:35 AM, Alvaro Herrera  wrote:
> At any rate, I was thinking in a new routine to encapsulate the logic,
>
> /*
>  * Check the tuple XMIN against prior XMAX, if any
>  */
> if (!HeapTupleUpdateXmaxMatchesXmin(priorXmax, 
> HeapTupleHeaderGetXmin(htup)))
> break;
>
> where ..XmaxMatchesXmin would know about checking for possibly frozen
> tuples.

Makes sense.

>> I can see why it
>> would be safe (or at least no more dangerous) to rely on
>> HeapTupleHeaderGetRawXmin() in the way mentioned here, at least on
>> installations that initdb'd on a version after commit 37484ad2a
>> (version 9.4+). However, I'm not sure why what you propose here would
>> be safe when even raw xmin happens to be FrozenTransactionId. Are you
>> sure that that's truly race-free? If it's really true that we only
>> need to check for FrozenTransactionId on 9.3, why not just do that on
>> all versions, and never bother with HeapTupleHeaderGetRawXmin()?
>> ("Sheer paranoia" is a valid answer; I just want us to be clear on the
>> reasoning.)
>
> I think the RawXmin is the better mechanism.  I'm not absolutely certain
> that the windows is completely closed in 9.3; as I understand things, it
> is possible for transaction A to prune an aborted heap-only tuple, then
> transaction B to insert a frozen tuple in the same location, then
> transaction C follows a link to the HOT that was pruned.  I think we'd
> end up considering that the new frozen tuple is part of the chain, which
> is wrong ...  In 9.4 we can compare the real xmin.

Good point: the race doesn't exist on 9.4+ with pg_upgrade from 9.3,
when raw xmin happens to be FrozenTransactionId, because there can be
no new tuples that have that property. This possible race is strictly
a 9.3 issue. (You have to deal with a raw xmin of FrozenTransactionId
on 9.4+, but there are no race conditions, because affected tuples are
all pre-pg_upgrade tuples -- they were frozen months ago, not
microseconds ago.)

> I hope I am proved wrong about this, because if not, I think we're
> looking at an essentially unsolvable problem in 9.3.

Well, as noted within README.HOT, the xmin/xmax matching did not
appear in earlier versions of the original HOT patch, because it was
thought that holding a pin of the buffer was a sufficient interlock
against concurrent line pointer recycling (pruning must have a buffer
cleanup lock). Clearly it is more robust to match xmin to xmax, but is
there actually any evidence that it was really necessary at all (for
single-page HOT chain traversals) when HOT went in in 2007, or that it
has since become necessary? heap_page_prune()'s caller has to have a
buffer cleanup lock.

I'm certainly not advocating removing the xmin/xmax matching within
heap_prune_chain() on 9.3. However, it may be acceptable to rely on
holding a buffer cleanup lock within heap_prune_chain() on 9.3, just
for the rare/theoretical cases where the FrozenTransactionId raw xmin
ambiguity means that xmin/xmax matching alone might not be enough. As
you say, it seems unsolvable through looking at state on the page
directly on 9.3, so there may be no other way. And, that's still
strictly better than what we have today.

> As far as I understand, this problem only emerges if one part of a HOT
> chain reaches the min freeze age while another part of the same chain is
> still visible by some running transaction.  It is particularly
> noticeably in our current test case because we use a min freeze age of
> 0, with many concurrrent modifying the same page.  What this says to me
> is that VACUUM FREEZE is mildly dangerous when there's lots of high
> concurrent HOT UPDATE activity.

I'm not sure what you mean here. It is dangerous right now, because
there is a bug, but we can squash the bug.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw super user checks

2017-10-05 Thread Simon Riggs
On 4 October 2017 at 18:13, Jeff Janes  wrote:
> On Thu, Sep 14, 2017 at 1:08 PM, Robert Haas  wrote:
>>
>> On Thu, Sep 14, 2017 at 2:33 PM, Jeff Janes  wrote:
>> > I think that foreign tables ought to behave as views do, where they run
>> > as
>> > the owner rather than the invoker.  No one has talked me out of it, but
>> > no
>> > one has supported me on it either.  But I think it is too late to change
>> > that now.
>>
>> That's an interesting point.  I think that you can imagine use cases
>> for either method.  Obviously, if what you want to do is drill a hole
>> through the Internet to another server and then expose it to some of
>> your fellow users, having the FDW run with the owner's permissions
>> (and credentials) is exactly right.  But there's another use case too,
>> which is where you have something that looks like a multi-user
>> sharding cluster.  You want each person's own credentials to carry
>> over to everything they do remotely.
>
>
> OK.  And if you want the first one, you can wrap it in a view currently, but
> if it were changed I don't know what you would do if you want the 2nd one
> (other than having every user create their own set of foreign tables).  So I
> guess the current situation is more flexible.

Sounds like it would be a useful option on a Foreign Server to allow
it to run queries as either the invoker or the owner. We have that
choice for functions, so we already have the concept and syntax
available. We could have another default at FDW level that specifies
what the default is for that type of FDW, and if that is not
specified, we keep it like it currently is.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Error: dsa_area could not attach to a segment that has been freed

2017-10-05 Thread Gaddam Sai Ram
Hi Thomas,

  Thanks for cautioning us about possible memory leaks(during error cases) 
incase of long-lived DSA segements.



  Actually we are following an approach to avoid this DSA memory leaks. Let 
me explain our implementation and please validate and correct us in-case we 
  miss anything.



  Implementation:

  

  Basically we have to put our index data into memory (Index Column Value 
Vs Ctid) which we get in aminsert callback function.

  

  Coming to the implementation, in aminsert Callback function, 

We Switch to CurTransactionContext 

Cache the DMLs of a transaction into dlist(global per process)

Even if different clients work parallel, it won't be a problem because every 
client gets one dlist in separate process and it'll have it's own 
CurTransactionContext

We have registered transaction callback (using RegisterXactCallback() 
function). And during event pre-commit(XACT_EVENT_PRE_COMMIT), we populate all 
the transaction specific DMLs (from dlist) into our in-memory index(DSA) 
obviously inside PG_TRY/PG_CATCH block.

In case we got some errors(because of dsa_allocate() or something else) while 
processing dlist(while populating in-memory index), we cleanup the DSA memory 
in PG_CATCH block that is allocated/used till that point.

During other error cases, typically transactions gets aborted and PRE_COMMIT 
event is not called and hence we don't touch DSA at that time. Hence no need to 
bother about leaks.

Even sub transaction case is handled with sub transaction callbacks.

CurTransactionContext(dlist basically) is automatically cleared after that 
particular transaction.



I want to know if this approach is good and works well in all cases. Kindly 
provide your feedback on this.



Regards

G. Sai Ram







 On Wed, 20 Sep 2017 14:25:43 +0530 Thomas Munro 
thomas.mu...@enterprisedb.com wrote 




On Wed, Sep 20, 2017 at 6:14 PM, Gaddam Sai Ram 

gaddamsaira...@zohocorp.com wrote: 

 Thank you very much! That fixed my issue! :) 

 I was in an assumption that pinning the area will increase its lifetime 
but 

 yeah after taking memory context into consideration its working fine! 



So far the success rate in confusing people who first try to make 

long-lived DSA areas and DSM segments is 100%. Basically, this is all 

designed to ensure automatic cleanup of resources in short-lived 

scopes. 



Good luck for your graph project. I think you're going to have to 

expend a lot of energy trying to avoid memory leaks if your DSA lives 

as long as the database cluster, since error paths won't automatically 

free any memory you allocated in it. Right now I don't have any 

particularly good ideas for mechanisms to deal with that. PostgreSQL 

C has exception-like error handling, but doesn't (and probably can't) 

have a language feature like scoped destructors from C++. IMHO 

exceptions need either destructors or garbage collection to keep you 

sane. There is a kind of garbage collection for palloc'd memory and 

also for other resources like file handles, but if you're using a big 

long lived DSA area you have nothing like that. You can use 

PG_TRY/PG_CATCH very carefully to clean up, or (probably better) you 

can try to make sure that all your interaction with shared memory is 

no-throw (note that that means using dsa_allocate_extended(x, 

DSA_ALLOC_NO_OOM), because dsa_allocate itself can raise errors). The 

first thing I'd try would probably be to keep all shmem-allocating 

code in as few routines as possible, and use only no-throw operations 

in the 'critical' regions of them, and maybe look into some kind of 

undo log of things to free or undo in case of error to manage 

multi-allocation operations if that turned out to be necessary. 



-- 

Thomas Munro 

http://www.enterprisedb.com 








Re: [HACKERS] Binary search in fmgr_isbuiltin() is a bottleneck.

2017-10-05 Thread Andres Freund
Hi,

On 2017-10-05 17:08:39 +0300, Heikki Linnakangas wrote:
> > I pushed a further cleaned up version of these two patches.  If you see
> > a way to avoid initializing the "trailing" part of the
> > fmgr_builtin_oid_index in a different manner, I'm all ears ;)
> 
> You could put a dummy entry at fmgr_builtins[0].

Right, I'd considered that somewhere upthread. Didn't really seem
better.


> BTW, there's some alignment padding in FmgrBuiltin, when MAXIMUM_ALIGNOF==8.
> You could easily shrink the struct from 32 to 24 bytes by moving funcName to
> the end of the struct:
> 
> --- a/src/include/utils/fmgrtab.h
> +++ b/src/include/utils/fmgrtab.h
> @@ -25,11 +25,11 @@
>  typedef struct
>  {
>   Oid foid;   /* OID of the function 
> */
> - const char *funcName;   /* C name of the function */
>   short   nargs;  /* 0..FUNC_MAX_ARGS, or -1 if 
> variable count */
>   boolstrict; /* T if function is "strict" */
>   boolretset; /* T if function returns a set 
> */
>   PGFunction  func;   /* pointer to compiled function 
> */
> + const char *funcName;   /* C name of the function */
>  } FmgrBuiltin;
> 
>  extern const FmgrBuiltin fmgr_builtins[];

Yea, that's probably worthwhile, although I suspect it's not a huge save
overall. Do you just want to commit that?


> If we care about cache efficiency here, we could move funcName out of the
> fmgr_builtins array, to a separate array of the same size. I believe
> funcName is only used when you create an internal-language function with
> CREATE FUNCTION, and having it in a separate array shouldn't hurt those
> lookups.

When'd that be beneficial? fmgr_builtins is pretty much only used for
internal-language CREATE FUNCTIONs? In other cases oid bounds + mapping
array should filter out the access before fmgr_builtins is accessed.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimise default partition scanning while adding new partition

2017-10-05 Thread Robert Haas
On Tue, Sep 26, 2017 at 4:27 AM, Amit Langote
 wrote:
> On 2017/09/16 1:57, Amit Langote wrote:
>> On Sat, Sep 16, 2017 at 12:59 AM, Robert Haas  wrote:
>>> I believe the intended advantage of the current system is that if you
>>> specify multiple operations in a single ALTER TABLE command, you only
>>> do one scan rather than having a second scan per operation.  If that's
>>> currently working, we probably don't want to make it stop working.
>>
>> OK.
>>
>> How about squash Jeevan's and my patch, so both
>> check_default_allows_bound() and ValidatePartitionConstraints() know
>> to scan default partition's children and there won't be any surprises
>> in the regression test output as you found after applying just the
>> Jeevan's patch.  Unfortunately, I'm not able to post such a patch
>> right now.
>
> I guess we don't need to squash, as they could be seen as implementing
> different features. Reordering the patches helps though.  So, apply them
> in this order:
>
> 1. My patch to teach ValidatePartitionConstraints() to skip scanning
>a partition's own partitions, which optimizes ATTACH PARTITION
>command's partition constraint validation scan (this also covers the
>case of scanning the default partition to validate its updated
>constraint when attaching a new partition)
>
> 2. Jeevan's patch to teach check_default_allows_bound() to skip scanning
>the default partition's own partitions, which covers the case of
>scanning the default partition to validate its updated constraint when
>adding a new partition using CREATE TABLE
>
> Attached 0001 and 0002 are ordered that way.

OK, I pushed all of this, spread out over 3 commits.  I reworked the
test cases not to be entangled with the existing test cases, and also
to test both of these highly-related features together.  Hopefully you
like the result.

Thanks for the patches and the analysis.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JIT compiling - v4.0

2017-10-05 Thread Andres Freund
On 2017-10-05 23:43:37 +1300, David Rowley wrote:
> On 5 October 2017 at 19:57, Andres Freund  wrote:
> > Here's some numbers for a a TPC-H scale 5 run. Obviously the Q01 numbers
> > are pretty nice in partcular. But it's also visible that the shorter
> > query can loose, which is largely due to the JIT overhead - that can be
> > ameliorated to some degree, but JITing obviously isn't always going to
> > be a win.
> 
> It's pretty exciting to see thing being worked on.
> 
> I've not looked at the code, but I'm thinking, could you not just JIT
> if the total cost of the plan is estimated to be > X ? Where X is some
> JIT threshold GUC.

Right, that's the plan. But it seems fairly important to make the
envelope in which it is beneficial as broad as possible. Also, test
coverage is more interesting for me right now ;)

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallelize queries containing initplans

2017-10-05 Thread Amit Kapila
On Thu, Oct 5, 2017 at 6:08 PM, Robert Haas  wrote:
> On Thu, Oct 5, 2017 at 5:52 AM, Amit Kapila  wrote:
>> Now, unless, I am missing something here, it won't be possible to
>> detect params in such cases during forming of join rels and hence we
>> need the tests in generate_gather_paths.  Let me know if I am missing
>> something in this context or if you have any better ideas to make it
>> work?
>
> Hmm, in a case like this, I think we shouldn't need to detect it.  The
> Var is perfectly parallel-safe, the problem is that we can't use a
> not-safe plan for the inner rel.  I wonder why that's happening
> here...
>

It is because the inner rel (Result Path) contains a reference to a
param which appears to be at the same query level.  Basically due to
changes in max_parallel_hazard_walker().

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw super user checks

2017-10-05 Thread Nico Williams
On Thu, Sep 14, 2017 at 04:08:08PM -0400, Robert Haas wrote:
> On Thu, Sep 14, 2017 at 2:33 PM, Jeff Janes  wrote:
> > I think that foreign tables ought to behave as views do, where they run as
> > the owner rather than the invoker.  No one has talked me out of it, but no
> > one has supported me on it either.  But I think it is too late to change
> > that now.
> 
> That's an interesting point.  I think that you can imagine use cases
> for either method.  Obviously, if what you want to do is drill a hole
> through the Internet to another server and then expose it to some of
> your fellow users, having the FDW run with the owner's permissions
> (and credentials) is exactly right.  But there's another use case too,
> which is where you have something that looks like a multi-user
> sharding cluster.  You want each person's own credentials to carry
> over to everything they do remotely.

Hmmm, I don't think that's really right.

What I'd like instead is for the FDW client to tell the FDW server the
session_user/current_user on behalf of which it's acting, and let the
FDW server decide how to proceed.  This could be done by doing a SET
SESSION fdw.client.session_user... and so on.

We use Kerberos principal names as PG user/role names, _with_ @REALM
included, so a user foo@BAR is likely to make sense to the FDW server.

Of course, if you're not using Kerberos then the local and remote user
namespaces might be completely distinct, but by letting the FDW server
know a) the FDW client's username (via authentication) and b) the true
username on the client side (via SET/set_config()), the server might
have enough information to decide whether it trusts (a) to impersonate
(b) and how to map (b) to a local user.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-10-05 Thread Ashutosh Bapat
On Thu, Oct 5, 2017 at 7:18 PM, Alvaro Herrera  wrote:
> Robert Haas wrote:
>
>> Regarding nomenclature and my previous griping about wisdom, I was
>> wondering about just calling this a "partition join" like you have in
>> the regression test.  So the GUC would be enable_partition_join, you'd
>> have generate_partition_join_paths(), etc.  Basically just delete
>> "wise" throughout.
>
> If I understand correctly, what's being used here is the "-wise" suffix,
> unrelated to wisdom, which Merriam Webster lists as "adverb combining
> form" here https://www.merriam-webster.com/dictionary/wise (though you
> have to scroll down a lot), which is defined as
>
> 1 a :in the manner of  * crabwise * fanwise
>   b :in the position or direction of  * slantwise * clockwise
> 2 :with regard to :in respect of * dollarwise
>

That's right.

> According to that, the right way to write this is "partitionwise join"
> (no dash), which means "join in respect of partitions", "join with
> regard to partitions".

Google lists mostly  "partition wise" or "partition-wise" and very
rarely "partitionwise". The first being used in other DBMS literature.
The paper (there aren't many on this subject) I referred [1] uses
"partition-wise". It made more sense to replace " " or "-" with "_"
when syntax doesn't allow the first two. I am not against
"partitionwise" but I don't see any real reason why we should move
away from popular usage of this term.

[1] https://users.cs.duke.edu/~shivnath/papers/sigmod295-herodotou.pdf

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw super user checks

2017-10-05 Thread Jeff Janes
On Thu, Oct 5, 2017 at 6:44 AM, Robert Haas  wrote:

> On Wed, Oct 4, 2017 at 6:13 PM, Jeff Janes  wrote:
> > OK.  And if you want the first one, you can wrap it in a view currently,
> but
> > if it were changed I don't know what you would do if you want the 2nd one
> > (other than having every user create their own set of foreign tables).
> So I
> > guess the current situation is more flexible.
>
> So where does that leave this patch?


Sorry, I thought we were just having a digression.  I didn't think that
part was about this patch specifically, but what more could be done later.


> I don't think it makes this
> patch a bad idea, although I kind of lean towads the view that the
> patch should just be checking superuser_arg(), not superuser() ||
> superuser_arg().
>

I don't see a reason to block a directly-logged-in superuser from using a
mapping.  I asked in the closed list whether the current (released)
behavior was a security bug, and the answer was no.  And I don't know why
else to block superusers from doing something other than as a security
bug.  Also it would create a backwards compatibility hazard to revoke the
ability now.

Cheers,

Jeff


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-05 Thread Alvaro Herrera
I think this is the patch for 9.3.  I ran the test a few hundred times
(with some additional changes such as randomly having an update inside a
savepoint that's randomly aborted, randomly aborting the transaction,
randomly skipping the for key share lock, randomly sleeping at a few
points; and also adding filler columns, reducing fillfactor and using
5, 50, 180, 250, 500 sessions after verifying that it causes the tuples
to stay in the same page or migrate to later pages).  The final REINDEX
has never complained again about failing to find the root tuple.  I hope
it's good now.

The attached patch needs a few small tweaks, such as improving
commentary in the new function, maybe turn it into a macro (otherwise I
think it could be bad for performance; I'd like a static func but not
sure those are readily available in 9.3), change the XID comparison to
use the appropriate macro rather than ==, and such.

Regarding changes of xmin/xmax comparison, I also checked manually the
spots I thought should be modified and later double-checked against the
list that Michael posted.  It's a match, except for rewriteheap.c which
I cannot make heads or tails about.  (I think it's rather unfortunate
that it sticks a tuple's Xmax into a field that's called Xmin, but let's
put that aside).  Maybe there's a problem here, maybe there isn't.

I'm now going to forward-port this to 9.4.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e00dc6c1ca..87bce0e7ea 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1718,8 +1718,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation 
relation, Buffer buffer,
 * broken.
 */
if (TransactionIdIsValid(prev_xmax) &&
-   !TransactionIdEquals(prev_xmax,
-
HeapTupleHeaderGetXmin(heapTuple->t_data)))
+   !HeapTupleUpdateXmaxMatchesXmin(prev_xmax, 
heapTuple->t_data))
break;
 
/*
@@ -1888,7 +1887,7 @@ heap_get_latest_tid(Relation relation,
 * tuple.  Check for XMIN match.
 */
if (TransactionIdIsValid(priorXmax) &&
- !TransactionIdEquals(priorXmax, 
HeapTupleHeaderGetXmin(tp.t_data)))
+   !HeapTupleUpdateXmaxMatchesXmin(priorXmax, tp.t_data))
{
UnlockReleaseBuffer(buffer);
break;
@@ -1920,6 +1919,36 @@ heap_get_latest_tid(Relation relation,
}   /* end of loop 
*/
 }
 
+/*
+ * Given a tuple, verify whether the given Xmax matches the tuple's Xmin,
+ * taking into account that the Xmin might have been frozen.
+ */
+bool
+HeapTupleUpdateXmaxMatchesXmin(TransactionId xmax, HeapTupleHeader htup)
+{
+   TransactionId   xmin = HeapTupleHeaderGetXmin(htup);
+
+   /* xmax must not be invalid or frozen */
+   Assert(TransactionIdIsNormal(xmax));
+
+   /*
+* If the xmax of the old tuple is identical to the xmin of the new one,
+* it's a match.
+*/
+   if (xmax == xmin)
+   return true;
+
+   /* FIXME Here we need to check the HEAP_XMIN_FROZEN in 9.4 and up */
+
+   /*
+* We actually don't know if there's a match, but if the previous tuple
+* was frozen, we cannot really rely on a perfect match.
+*/
+   if (xmin == FrozenTransactionId)
+   return true;
+
+   return false;
+}
 
 /*
  * UpdateXmaxHintBits - update tuple hint bits after xmax transaction ends
@@ -5045,8 +5074,7 @@ l4:
 * the end of the chain, we're done, so return success.
 */
if (TransactionIdIsValid(priorXmax) &&
-   
!TransactionIdEquals(HeapTupleHeaderGetXmin(mytup.t_data),
-priorXmax))
+   !HeapTupleUpdateXmaxMatchesXmin(priorXmax, 
mytup.t_data))
{
UnlockReleaseBuffer(buf);
return HeapTupleMayBeUpdated;
@@ -5500,7 +5528,10 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
if (TransactionIdPrecedes(xid, cutoff_xid))
{
if (TransactionIdDidCommit(xid))
+   {
+   xid = FrozenTransactionId;
*flags = FRM_MARK_COMMITTED | 
FRM_RETURN_IS_XID;
+   }
else
{
*flags |= FRM_INVALIDATE_XMAX;
diff --git 

Re: [HACKERS] [PATCH] Add ALWAYS DEFERRED option for constraints

2017-10-05 Thread Nico Williams
I accidentally typoed when saving a file.  Here's the new patch with
that typo corrected, changes to information_schema dropped, and with the
addition of tab completion of ALWAYS DEFERRED in psql.

Nico
-- 
>From 97d3db0be9307eff5919821db7fc437da52ef7e3 Mon Sep 17 00:00:00 2001
From: Nicolas Williams 
Date: Tue, 3 Oct 2017 00:33:09 -0500
Subject: [PATCH] Add ALWAYS DEFERRED option for CONSTRAINTs

and CONSTRAINT TRIGGERs.

This is important so that one can have triggers and constraints that
must run after all of the user/client's statements in a transaction
(i.e., at COMMIT time), so that the user/client may make no further
changes (triggers, of course, still can).
---
 doc/src/sgml/catalogs.sgml | 17 -
 doc/src/sgml/ref/alter_table.sgml  |  4 +-
 doc/src/sgml/ref/create_table.sgml | 10 ++-
 doc/src/sgml/ref/create_trigger.sgml   |  2 +-
 doc/src/sgml/trigger.sgml  |  1 +
 src/backend/bootstrap/bootparse.y  |  2 +
 src/backend/catalog/heap.c |  1 +
 src/backend/catalog/index.c|  8 +++
 src/backend/catalog/information_schema.sql |  8 +++
 src/backend/catalog/pg_constraint.c|  2 +
 src/backend/catalog/toasting.c |  2 +-
 src/backend/commands/indexcmds.c   |  2 +-
 src/backend/commands/tablecmds.c   | 20 +-
 src/backend/commands/trigger.c | 28 +++--
 src/backend/commands/typecmds.c|  3 +
 src/backend/nodes/copyfuncs.c  |  3 +
 src/backend/nodes/outfuncs.c   |  4 ++
 src/backend/parser/gram.y  | 99 ++
 src/backend/parser/parse_utilcmd.c | 46 +-
 src/backend/utils/adt/ruleutils.c  |  4 ++
 src/bin/pg_dump/pg_dump.c  | 31 --
 src/bin/pg_dump/pg_dump.h  |  2 +
 src/bin/psql/describe.c| 34 +++---
 src/bin/psql/tab-complete.c|  4 +-
 src/include/catalog/index.h|  2 +
 src/include/catalog/pg_constraint.h| 42 +++--
 src/include/catalog/pg_constraint_fn.h |  1 +
 src/include/catalog/pg_trigger.h   | 16 ++---
 src/include/commands/trigger.h |  1 +
 src/include/nodes/parsenodes.h |  6 +-
 src/include/utils/reltrigger.h |  1 +
 src/test/regress/input/constraints.source  | 51 +++
 src/test/regress/output/constraints.source | 54 +++-
 33 files changed, 418 insertions(+), 93 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 9af77c1..2c3ed23 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2202,6 +2202,13 @@ SCRAM-SHA-256$iteration count:salt<
  
 
  
+  conalwaysdeferred
+  bool
+  
+  Is the constraint always deferred?
+ 
+
+ 
   convalidated
   bool
   
@@ -6948,6 +6955,13 @@ SCRAM-SHA-256$iteration count:salt<
  
 
  
+  tgalwaysdeferred
+  bool
+  
+  True if constraint trigger is always deferred
+ 
+
+ 
   tgnargs
   int2
   
@@ -7009,7 +7023,8 @@ SCRAM-SHA-256$iteration count:salt<

 When tgconstraint is nonzero,
 tgconstrrelid, tgconstrindid,
-tgdeferrable, and tginitdeferred are
+tgdeferrable, tginitdeferred, and
+tgalwaysdeferred are
 largely redundant with the referenced pg_constraint entry.
 However, it is possible for a non-deferrable trigger to be associated
 with a deferrable constraint: foreign key constraints can have some
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 0fb385e..e81d1fa 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -55,7 +55,7 @@ ALTER TABLE [ IF EXISTS ] name
 ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
 ADD table_constraint [ NOT VALID ]
 ADD table_constraint_using_index
-ALTER CONSTRAINT constraint_name [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
+ALTER CONSTRAINT constraint_name [ DEFERRABLE | NOT DEFERRABLE | ALWAYS DEFERRED ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
 VALIDATE CONSTRAINT constraint_name
 DROP CONSTRAINT [ IF EXISTS ]  constraint_name [ RESTRICT | CASCADE ]
 DISABLE TRIGGER [ trigger_name | ALL | USER ]
@@ -89,7 +89,7 @@ ALTER TABLE [ IF EXISTS ] name
 
 [ CONSTRAINT constraint_name ]
 { UNIQUE | PRIMARY KEY } USING INDEX index_name
-[ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
+[ DEFERRABLE | NOT DEFERRABLE | ALWAYS DEFERRED ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
 
  
 
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 1477288..38c88b8 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -67,7 +67,7 @@ 

Re: [HACKERS] Parallel Append implementation

2017-10-05 Thread Amit Kapila
On Thu, Oct 5, 2017 at 6:14 PM, Robert Haas  wrote:
> On Thu, Oct 5, 2017 at 6:29 AM, Amit Kapila  wrote:
>> Okay, but can't we try to pick the cheapest partial path for master
>> backend and maybe master backend can try to work on a partial path
>> which is already picked up by some worker.
>
> Well, the master backend is typically going to be the first process to
> arrive at the Parallel Append because it's already running, whereas
> the workers have to start.
>

Sure, the leader can pick the cheapest partial path to start with.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] utility commands benefiting from parallel plan

2017-10-05 Thread Robert Haas
On Fri, Sep 15, 2017 at 2:22 AM, Haribabu Kommi
 wrote:
> Thanks for the review.

I committed this patch with some cosmetic changes.  I think the fact
that several people have asked for this indicates that, even without
making some of the more complicated cases work, this has some value.
I am not convinced it is safe in any case other than when the DML
command is both creating and populating the table, so I removed
REFRESH MATERIALIZED VIEW support from the patch and worked over the
documentation and comments to a degree.

The problem with a case like REFRESH MATERIALIZED VIEW is that there's
nothing to prevent something that gets run in the course of the query
from trying to access the view (and the heavyweight lock won't prevent
that, due to group locking).  That's probably a stupid thing to do,
but it can't be allowed to break the world.  The other cases are safe
from that particular problem because the table doesn't exist yet.

I am still slightly nervous that there may be some other problem that
none of us have thought about that makes this unsafe, but we still
have quite a while until 11 goes out the door, so if there is such a
problem, maybe someone else will find it now that this is committed
and more likely to get some attention.  I thought about not committing
this just in case such a problem exists, but that seemed too timid: if
we never commit anything that might have an undetected bug, we'll
never commit anything at all.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw bug in 9.6

2017-10-05 Thread Tom Lane
Robert Haas  writes:
> On Tue, Aug 29, 2017 at 5:14 PM, Tom Lane  wrote:
>> We'd definitely need to do things that way in 9.6.  I'm not quite sure
>> whether it's too late to adopt the clean solution in v10.

> It probably is now.  Are you still planning to do something about this patch?

It's still on my list, but I didn't get to it during the CF.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Allow multiple tables to be specified in one VACUUM or ANALYZE c

2017-10-05 Thread Tom Lane
Michael Paquier  writes:
> Tom, it seems to me that in the portions you have editorialized, you
> have forgotten to update two comments still mentioning get_rel_oids()
> in vacuum.c and analyze.c. Those should now refer to
> expand_vacuum_rel() instead. Please see the attached.

Oh, good point.  I don't think that just s/get_rel_oids/expand_vacuum_rel/
will do though, as it leaves out the interaction with get_all_vacuum_rels.
I decided that the point was of merely historical interest anyway, and
just removed the reference to the other routine.

The partitioned-table patches seem to have been a bit sloppy with
maintaining these comments too, so I ended up doing more than that...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages

2017-10-05 Thread Robert Haas
On Sun, Sep 17, 2017 at 7:04 AM, Dilip Kumar  wrote:
> I used  lossy_pages = max(0, total_pages - maxentries / 2). as
> suggesed by Alexander.

Does that formula accurately estimate the number of lossy pages?

The performance results look good, but that's a slightly different
thing from whether the estimate is accurate.

+nbuckets = tbm_calculate_entires(maxbytes);

entires?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres 9.6 Logical and Fisical replication

2017-10-05 Thread Mario Fernando Guerrero Díaz
Thank you for the clarification.

El 5/10/2017 9:27 AM, "Robert Haas"  escribió:

> On Mon, Sep 18, 2017 at 5:30 PM, guedim  wrote:
> > I am working with Postgres9.6 with a Master/Slave cluster replication
> using
> > Streaming replication.
> > I would like to add a new Slave server database but this database with
> > logical replication .
> >
> > I tried with some configurations but it was not possible  :(
> >
> > https://github.com/guedim/postgres-streaming-replication
> >
> > Here is the image of what is in my mind:
> >  >
>
> This question is really off-topic for this list, which is probably why
> you haven't gotten any replies.  This list is for discussion of
> PostgreSQL development; there are other lists for user questions, like
> pgsql-general.  Logical replication is only supported beginning in
> PostgreSQL 10; if you are using some earlier version, you need an
> add-on tool like pglogical, slony, etc.
>
> Please also read https://wiki.postgresql.org/wiki/Guide_to_reporting_
> problems
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] Postgres 9.6 Logical and Fisical replication

2017-10-05 Thread Robert Haas
On Mon, Sep 18, 2017 at 5:30 PM, guedim  wrote:
> I am working with Postgres9.6 with a Master/Slave cluster replication using
> Streaming replication.
> I would like to add a new Slave server database but this database with
> logical replication .
>
> I tried with some configurations but it was not possible  :(
>
> https://github.com/guedim/postgres-streaming-replication
>
> Here is the image of what is in my mind:
> 

This question is really off-topic for this list, which is probably why
you haven't gotten any replies.  This list is for discussion of
PostgreSQL development; there are other lists for user questions, like
pgsql-general.  Logical replication is only supported beginning in
PostgreSQL 10; if you are using some earlier version, you need an
add-on tool like pglogical, slony, etc.

Please also read https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Additional logging for VACUUM and ANALYZE

2017-10-05 Thread Bossart, Nathan
On 10/5/17, 12:29 AM, "Michael Paquier"  wrote:
> On Thu, Oct 5, 2017 at 1:23 AM, Bossart, Nathan  wrote:
>> Presently, there are a few edge cases in vacuum_rel() and analyze_rel() that 
>> I
>> believe do not have sufficient logging.  This was discussed a bit in the
>> vacuum-multiple-relations thread [0], but it was ultimately decided that any
>> logging changes should be proposed separately.
>
> I think that I agree with that, especially now with VACUUM allowing
> multiple relations. The discussion then would be how much logging we
> want. WARNING looks adapted per the discussions we had on the other
> thread as manual VACUUMs can now involve much more relations, even
> with partitioned tables. More opinions would be welcome.

Thanks for taking a look.

>> and a test that exercises a bit of this functionality.
>
> My take on those test would be to not include them. This is a lot just
> to test two logging lines where the relation has been dropped.

Yeah, I'm not terribly concerned about those tests.

>> If this change were to be considered for back-patching, we would likely want 
>> to
>> also apply Michael's RangeVar fix for partitioned tables to 10 [1].  Without
>> this change, log messages for unspecified partitions will be emitted with the
>> parent's RangeVar information.
>
> Well, that's assuming that we begin logging some information for
> manual VACUUMs using the specified RangeVar, something that does not
> happen at the top of upstream REL_10_STABLE, but can happen if we were
> to include the patch you are proposing on this thread for
> REL_10_STABLE. But the latter is not going to happen. Or did you patch
> your version of v10 to do so in your stuff? For v10 the ship has
> already sailed, so I think that it would be better to just let it go,
> and rely on v11 which has added all the facility we wanted.

I agree.  I didn't mean to suggest that it should be back-patched, just
that v10 has a small quirk that needs to be handled if others feel
differently.

Nathan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw bug in 9.6

2017-10-05 Thread Robert Haas
On Tue, Aug 29, 2017 at 5:14 PM, Tom Lane  wrote:
> Etsuro Fujita  writes:
>> [ epqpath-for-foreignjoin-11.patch ]
>
> I started looking at this.  I've not yet wrapped my head around what
> CreateLocalJoinPath() is doing, but I noted that Robert's concerns
> about ABI breakage in the back branches would not be very difficult
> to satisfy.  What we'd need to do is
>
> (1) In the back-branch patch, add the new fields of JoinPathExtraData
> at the end, rather than in their more natural locations.  This should
> avoid any ABI break for uses of that struct, since there's no reason
> why an FDW would be creating its own variables of that type rather
> than just using what the core code passes it.
>
> (2) In the back branches, just leave GetExistingLocalJoinPath in place
> rather than deleting it.  Existing FDWs would still be subject to the
> bug until they were updated, but given how hard it is to trigger, that
> doesn't seem like a huge problem.
>
> A variant of (2) is to install a hack fix in GetExistingLocalJoinPath
> rather than leaving it unchanged.  I'm not very excited about that though;
> it doesn't seem unlikely that we might introduce new bugs that way, and
> it would certainly be a distraction from getting the real fix finished.
>
> We'd definitely need to do things that way in 9.6.  I'm not quite sure
> whether it's too late to adopt the clean solution in v10.

It probably is now.  Are you still planning to do something about this patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Binary search in fmgr_isbuiltin() is a bottleneck.

2017-10-05 Thread Heikki Linnakangas

On 10/04/2017 10:33 AM, Andres Freund wrote:

On 2017-10-02 15:01:36 -0700, Andres Freund wrote:

On 2017-10-02 17:57:51 -0400, Tom Lane wrote:

Andres Freund  writes:

Done that way. It's a bit annoying, because we've to take care to
initialize the "unused" part of the array with a valid signalling it's
an unused mapping. Can't use 0 for that because fmgr_builtins[0] is a
valid entry.


The prototype code I posted further upthread just used -1 as the "unused"
marker. There's no reason the array can't be int16 rather than uint16,
and "if (index < 0)" is probably a faster test anyway.


Right, but whether we use -1 or UINT16_MAX or such doesn't matter. The
relevant bit is that we can't use 0, so we can't rely on the rest of the
array being zero initialized, but instead of to initialize all of it
explicitly.  I've no real feelings about using -1 or UINT16_MAX - I'd be
very surprised if there's any sort of meaningful performance difference.


I pushed a further cleaned up version of these two patches.  If you see
a way to avoid initializing the "trailing" part of the
fmgr_builtin_oid_index in a different manner, I'm all ears ;)


You could put a dummy entry at fmgr_builtins[0].

BTW, there's some alignment padding in FmgrBuiltin, when 
MAXIMUM_ALIGNOF==8. You could easily shrink the struct from 32 to 24 
bytes by moving funcName to the end of the struct:


--- a/src/include/utils/fmgrtab.h
+++ b/src/include/utils/fmgrtab.h
@@ -25,11 +25,11 @@
 typedef struct
 {
Oid foid;   /* OID of the function 
*/
-   const char *funcName;   /* C name of the function */
short   nargs;  /* 0..FUNC_MAX_ARGS, or -1 if 
variable count */
boolstrict; /* T if function is "strict" */
boolretset; /* T if function returns a set 
*/
PGFunction  func;   /* pointer to compiled function 
*/
+   const char *funcName;   /* C name of the function */
 } FmgrBuiltin;

 extern const FmgrBuiltin fmgr_builtins[];

If we care about cache efficiency here, we could move funcName out of 
the fmgr_builtins array, to a separate array of the same size. I believe 
funcName is only used when you create an internal-language function with 
CREATE FUNCTION, and having it in a separate array shouldn't hurt those 
lookups.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] subscription worker signalling wal writer too much

2017-10-05 Thread Robert Haas
On Tue, Sep 19, 2017 at 8:48 AM, Kyotaro HORIGUCHI
 wrote:
> Though a bit uneasy to have similar code on both side
> (XLogBackgroundFlush and XLogSetAsyncXactLSN) but +1 to this from
> me.

This patch wasn't formatted very nicely; attached is a version that
pgindent likes better and doesn't bust past 80 columns.

I think this patch needs significantly better comments.  There's no
explanation of the rather complex formula the patch proposed to use.
Maybe it's just copied from someplace else that does have a comment,
but in that case, the comments here should mention that.  And the
comment in the other place should probably be updated too, like:

/* Make sure to keep this in sync with the similar logic in
XLogDoAwesomeStuff() */

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


change_walwriter_wakeup_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-10-05 Thread Robert Haas
On Thu, Oct 5, 2017 at 9:48 AM, Alvaro Herrera  wrote:
> Robert Haas wrote:
>> Regarding nomenclature and my previous griping about wisdom, I was
>> wondering about just calling this a "partition join" like you have in
>> the regression test.  So the GUC would be enable_partition_join, you'd
>> have generate_partition_join_paths(), etc.  Basically just delete
>> "wise" throughout.
>
> If I understand correctly, what's being used here is the "-wise" suffix,
> unrelated to wisdom, which Merriam Webster lists as "adverb combining
> form" here https://www.merriam-webster.com/dictionary/wise (though you
> have to scroll down a lot), which is defined as
>
> 1 a :in the manner of  * crabwise * fanwise
>   b :in the position or direction of  * slantwise * clockwise
> 2 :with regard to :in respect of * dollarwise
>
> According to that, the right way to write this is "partitionwise join"
> (no dash), which means "join in respect of partitions", "join with
> regard to partitions".

I'm fine with that, if others like it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-10-05 Thread Alvaro Herrera
Robert Haas wrote:

> Regarding nomenclature and my previous griping about wisdom, I was
> wondering about just calling this a "partition join" like you have in
> the regression test.  So the GUC would be enable_partition_join, you'd
> have generate_partition_join_paths(), etc.  Basically just delete
> "wise" throughout.

If I understand correctly, what's being used here is the "-wise" suffix,
unrelated to wisdom, which Merriam Webster lists as "adverb combining
form" here https://www.merriam-webster.com/dictionary/wise (though you
have to scroll down a lot), which is defined as

1 a :in the manner of  * crabwise * fanwise
  b :in the position or direction of  * slantwise * clockwise
2 :with regard to :in respect of * dollarwise

According to that, the right way to write this is "partitionwise join"
(no dash), which means "join in respect of partitions", "join with
regard to partitions".

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgresql gives error that role goes not exists while it exists

2017-10-05 Thread Tom Lane
Robert Haas  writes:
> On Tue, Oct 3, 2017 at 8:47 AM, Euler Taveira  wrote:
>> I'm not sure. I bet a dime that the role was created as "Iris" and you
>> are trying to assing "iris" (they are different). If you list the
>> roles, we can confirm that.

> I don't see how this would explain anything.

The query as given has obvious syntax problems:

regression=# select *
from users
where case when (select pg_has_role(select "current_user"(), 'hr_user'::name, 
'MEMBER'::text)) then 1=1 else userstatus <>'Active' end
;
ERROR:  syntax error at or near "select"
LINE 3: where case when (select pg_has_role(select "current_user"(),...
^

I'm betting that the complainant has tried to obscure what they actually
did, and has obscured away some critical detail --- maybe the case of the
user name, maybe something else.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-10-05 Thread Robert Haas
On Tue, Oct 3, 2017 at 9:38 PM, Andres Freund  wrote:
>> Do you have any suggestion as to how we should transmit the blacklist to
>> parallel workers?
>
> How about storing them in the a dshash table instead of dynahash?
> Similar to how we're now dealing with the shared typmod registry stuff?
> It should be fairly simple to now simply add a new struct Session member
> shared_enum_whatevs_table.

Yeah, that approach seems worth exploring.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw super user checks

2017-10-05 Thread Robert Haas
On Wed, Oct 4, 2017 at 6:13 PM, Jeff Janes  wrote:
> OK.  And if you want the first one, you can wrap it in a view currently, but
> if it were changed I don't know what you would do if you want the 2nd one
> (other than having every user create their own set of foreign tables).  So I
> guess the current situation is more flexible.

So where does that leave this patch?  I don't think it makes this
patch a bad idea, although I kind of lean towads the view that the
patch should just be checking superuser_arg(), not superuser() ||
superuser_arg().

> It does seem like it would then be a good idea to have a user mapping option
> of "pass_the_hash" which would look up md5 hash from the pg_authid (if the
> local username is the same as the remote user name) and use that to connect
> to the foreign server, as an alternative option to recording the password in
> plain text in the mapping itself.  But that would require some changes to
> libpq, not just postgres_fdw.
>
> And that wouldn't work for SCRAM.  I guess that SCRAM does have some feature
> to allow this kind of delegation, but I don't know enough about it to know
> how hard it would be to implement in postgres_fdw or how useful it would be
> to have.

We really need some kind of method for delegating authentication.  I
don't know how it should work.

Generally, password authentication is a silly choice for automated
logins because then you've got to store the password someplace from
which it can be digitally stolen; stealing a password from someone's
brain is a different kind of problem.  It's not even a good method for
this situation, yet it's the only one we allow.  I think that bites,
but I don't really know what to do about it.

> Yeah, I have not been finding it enjoyable.  How much flexibility does the
> SQL/MED spec even give us (I don't have access to the spec)?

Sorry, I don't know.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall

2017-10-05 Thread Robert Haas
On Wed, Oct 4, 2017 at 3:40 AM, Haribabu Kommi  wrote:
> There are some differences in handling database objects
> between pg_dump and pg_dumpall, To retain both pg_dump
> and pg_dumpall behavior even after refactoring, this option
> is added. Currently this option is used mainly for the three
> purposes.
>
> 1. Don't print unnecessary CREATE DATABASE options like
> ENCODING, LC_COLLATE and LC_CTYPE options if the
> default encoding is same with the above values.
>
> The above behavior is as per the pg_dumpall, but it can be
> changed to print irrespective of the default encoding.
>
> 2. Do not dump postgres and template0 databases.
>
> 3. Set default_transaction_read_only = off.

To me it seems that a refactoring which requires pg_dump to behave
differently in small ways like this based on whether it is being
called by pg_dumpall or not is probably not a good refactoring.  And I
don't see why the proposal from Tom that started this thread would
require such a thing to be true.

>From your list, I would say that (1) and (3) seem like behaviors that
we either want or do not want.  Whether pg_dump or pg_dumpall is
involved seems irrelevant.  (2) seems like it might need some special
handling, but that could be handled in pg_dumpall by just not calling
pg_dump at all for those database.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] v10 bottom-listed

2017-10-05 Thread Erik Rijkers

In the 'ftp' listing, v10 appears at the bottom:
  https://www.postgresql.org/ftp/source/

With all the other v10* directories at the top, we could get a lot of 
people installing wrong binaries...


Maybe it can be fixed so that it appears at the top.


Thanks,

Erik Rijkers




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgresql gives error that role goes not exists while it exists

2017-10-05 Thread Robert Haas
On Tue, Oct 3, 2017 at 8:47 AM, Euler Taveira  wrote:
> I'm not sure. I bet a dime that the role was created as "Iris" and you
> are trying to assing "iris" (they are different). If you list the
> roles, we can confirm that.

I don't see how this would explain anything.  "current_role"() is
going to return the role using its actual case, and that's also what
pg_hash_role() expects.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cache lookup errors for missing replication origins

2017-10-05 Thread Michael Paquier
On Thu, Oct 5, 2017 at 9:38 PM, Robert Haas  wrote:
> On Tue, Oct 3, 2017 at 2:16 AM, Masahiko Sawada  wrote:
>> The patch passes the regression test and I found no problems in this
>> patch. I've marked it as Ready for Committer.
>
> Committed and back-patched to 9.5, which was as far as it applied cleanly.

Thanks all.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Append implementation

2017-10-05 Thread Robert Haas
On Thu, Oct 5, 2017 at 6:29 AM, Amit Kapila  wrote:
> Okay, but can't we try to pick the cheapest partial path for master
> backend and maybe master backend can try to work on a partial path
> which is already picked up by some worker.

Well, the master backend is typically going to be the first process to
arrive at the Parallel Append because it's already running, whereas
the workers have to start.  So in the case that really matters, no
paths will have been picked yet.  Later on, we could have the master
try to choose a path on which some other worker is already working,
but I really doubt that's optimal.  Either the workers are generating
a lot of tuples (in which case the leader probably won't do much work
on any path because it will be busy reading tuples) or they are
generating only a few tuples (in which case the leader is probably
better off working on an a path not yet chosen, to reduce contention).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comment typo

2017-10-05 Thread Robert Haas
On Thu, Oct 5, 2017 at 6:11 AM, Etsuro Fujita
 wrote:
> Here is a small patch to fix a typo in a comment in partition.c:
> s/mantain/maintain/.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cache lookup errors for missing replication origins

2017-10-05 Thread Robert Haas
On Tue, Oct 3, 2017 at 2:16 AM, Masahiko Sawada  wrote:
> The patch passes the regression test and I found no problems in this
> patch. I've marked it as Ready for Committer.

Committed and back-patched to 9.5, which was as far as it applied cleanly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallelize queries containing initplans

2017-10-05 Thread Robert Haas
On Thu, Oct 5, 2017 at 5:52 AM, Amit Kapila  wrote:
> Now, unless, I am missing something here, it won't be possible to
> detect params in such cases during forming of join rels and hence we
> need the tests in generate_gather_paths.  Let me know if I am missing
> something in this context or if you have any better ideas to make it
> work?

Hmm, in a case like this, I think we shouldn't need to detect it.  The
Var is perfectly parallel-safe, the problem is that we can't use a
not-safe plan for the inner rel.  I wonder why that's happening
here... is it a problem with your patch or an existing bug?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logging idle checkpoints

2017-10-05 Thread Kyotaro HORIGUCHI
At Thu, 5 Oct 2017 13:41:42 +0200, Alvaro Herrera  
wrote in <20171005114142.dupjeqe2cnplhgkx@alvherre.pgsql>
> Kyotaro HORIGUCHI wrote:
> 
> > # This reminded me of a concern. I'd like to count vacuums that
> > # are required but skipped by lock-failure, or killed by other
> > # backend.
> 
> We clearly need to improve the stats and logs related to vacuuming work
> executed, both by autovacuum and manually invoked.  One other item I
> have in my head is to report numbers related to the truncation phase of
> a vacuum run, since in some cases it causes horrible and hard to
> diagnose problems.  (Also, add an reloption to stop vacuum from doing
> the truncation phase at all -- for some usage patterns that is a serious
> problem.)
> 
> However, please do open a new thread about it.

Thanks! Will do after a bit time of organization of the thougts.

reagareds,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logging idle checkpoints

2017-10-05 Thread Alvaro Herrera
Kyotaro HORIGUCHI wrote:

> # This reminded me of a concern. I'd like to count vacuums that
> # are required but skipped by lock-failure, or killed by other
> # backend.

We clearly need to improve the stats and logs related to vacuuming work
executed, both by autovacuum and manually invoked.  One other item I
have in my head is to report numbers related to the truncation phase of
a vacuum run, since in some cases it causes horrible and hard to
diagnose problems.  (Also, add an reloption to stop vacuum from doing
the truncation phase at all -- for some usage patterns that is a serious
problem.)

However, please do open a new thread about it.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-05 Thread Alvaro Herrera
Peter Geoghegan wrote:

> As you know, on version 9.4+, as of commit 37484ad2a, we decided that
> we are "largely ignoring the value to which it [xmin] is set". The
> expectation became that raw xmin is available after freezing, but
> mostly for forensic purposes. I think Alvaro should now memorialize
> the idea that its value is actually critical in some place
> (htup_details.h?).

I'd love to be certain that we preserve the original value in all
cases.

> On Wed, Oct 4, 2017 at 6:46 AM, Alvaro Herrera  
> wrote:

> > I independently arrived at the same conclusion.  Since I was trying with
> > 9.3, the patch differs -- in the old version we must explicitely test
> > for the FrozenTransactionId value, instead of using GetRawXmin.
> 
> Obviously you're going to have to be prepared for a raw xmin of
> FrozenTransactionId, even on 9.4+, due to pg_upgrade.

Hmm.  It would be good to be able to get rid of that case, actually,
because I now think it's less safe (see below).

At any rate, I was thinking in a new routine to encapsulate the logic,

/*
 * Check the tuple XMIN against prior XMAX, if any
 */
if (!HeapTupleUpdateXmaxMatchesXmin(priorXmax, 
HeapTupleHeaderGetXmin(htup)))
break;

where ..XmaxMatchesXmin would know about checking for possibly frozen
tuples.

> I can see why it
> would be safe (or at least no more dangerous) to rely on
> HeapTupleHeaderGetRawXmin() in the way mentioned here, at least on
> installations that initdb'd on a version after commit 37484ad2a
> (version 9.4+). However, I'm not sure why what you propose here would
> be safe when even raw xmin happens to be FrozenTransactionId. Are you
> sure that that's truly race-free? If it's really true that we only
> need to check for FrozenTransactionId on 9.3, why not just do that on
> all versions, and never bother with HeapTupleHeaderGetRawXmin()?
> ("Sheer paranoia" is a valid answer; I just want us to be clear on the
> reasoning.)

I think the RawXmin is the better mechanism.  I'm not absolutely certain
that the windows is completely closed in 9.3; as I understand things, it
is possible for transaction A to prune an aborted heap-only tuple, then
transaction B to insert a frozen tuple in the same location, then
transaction C follows a link to the HOT that was pruned.  I think we'd
end up considering that the new frozen tuple is part of the chain, which
is wrong ...  In 9.4 we can compare the real xmin.

I hope I am proved wrong about this, because if not, I think we're
looking at an essentially unsolvable problem in 9.3.

> Obviously any race would have a ridiculously tiny window, but it's not
> obvious why this protocol would be completely race-free (in the event
> of a FrozenTransactionId raw xmin).

As far as I understand, this problem only emerges if one part of a HOT
chain reaches the min freeze age while another part of the same chain is
still visible by some running transaction.  It is particularly
noticeably in our current test case because we use a min freeze age of
0, with many concurrrent modifying the same page.  What this says to me
is that VACUUM FREEZE is mildly dangerous when there's lots of high
concurrent HOT UPDATE activity.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another oddity in handling of WCO constraints in postgres_fdw

2017-10-05 Thread Kyotaro HORIGUCHI
At Thu, 5 Oct 2017 18:08:50 +0900, Etsuro Fujita  
wrote in <60e94494-4e5d-afed-e482-b9ad1986b...@lab.ntt.co.jp>
> On 2017/10/04 21:28, Ashutosh Bapat wrote:
> > On Wed, Oct 4, 2017 at 5:32 PM, Robert Haas 
> > wrote:
> >> On Wed, Oct 4, 2017 at 6:40 AM, Ashutosh Bapat
> >>  wrote:
> >>> We can
> >>> check whether a row being sent from the local server to the foreign
> >>> server obeys WCO, but what foreign server does to that row is beyond
> >>> local server's scope.
> >>
> >> But I think right now we're not checking the row being sent from the
> >> local server, either.
> 
> We don't check the row *before* sending it to the remote server, but
> check the row returned by ExecForeignInsert/ExecForeignUpdate, which
> is allowed to have been changed by the remote server.  In
> postgres_fdw, we currently return the data actually inserted/updated
> if RETURNING/AFTER TRIGGER present, but not if WCO only presents.  So,
> for the postgres_fdw foreign table, WCO is enforced on the data that
> was actually inserted/updated if RETURNING/AFTER TRIGGER present and
> on the original data core supplied if WCO only presents, which is
> inconsistent behavior.
> 
> > Didn't 7086be6e3627c1ad797e32ebbdd232905b5f577f fix that?
> 
> No.  The commit addressed another issue.
> 
> >> The WCO that is being ignored isn't a
> >> constraint on the foreign table; it's a constraint on a view which
> >> happens to reference the foreign table.  It seems quite odd for the
> >> "assume constraints are valid" property of the foreign table to
> >> propagate back up into the view that references it.
> 
> Agreed.
> 
> > The view with WCO is local but the modification which violates WCO is
> > being made on remote server by a trigger on remote table. Trying to
> > control that doesn't seem to be a good idea, just like we can't
> > control what rows get inserted on the foreign server when they violate
> > local constraints. I am using local constraints as an example of
> > precedence where we ignore what's happening on remote side and enforce
> > whatever we could enforce locally. Local server should make sure that
> > any rows sent from local server to the remote server do not violate
> > any local WCO.
> 
> Seems odd (and too restrictive) to me too.

Since WCO ensures finally inserted values, we can't do other than
acturally requesting for the values. So just merging WCO columns
to RETURNING in deparsed query is ok. But can't we concatenate
returningList and withCheckOptionList at more higher level?
Specifically, just passing calculated used_attr to
deparse(Insert|Update)Sql instead of returningList and
withCheckOptionList separately.  Deparsed queries anyway forget
the origin of requested columns.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JIT compiling - v4.0

2017-10-05 Thread David Rowley
On 5 October 2017 at 19:57, Andres Freund  wrote:
> Here's some numbers for a a TPC-H scale 5 run. Obviously the Q01 numbers
> are pretty nice in partcular. But it's also visible that the shorter
> query can loose, which is largely due to the JIT overhead - that can be
> ameliorated to some degree, but JITing obviously isn't always going to
> be a win.

It's pretty exciting to see thing being worked on.

I've not looked at the code, but I'm thinking, could you not just JIT
if the total cost of the plan is estimated to be > X ? Where X is some
JIT threshold GUC.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Append implementation

2017-10-05 Thread Amit Khandekar
On 30 September 2017 at 19:21, Amit Kapila  wrote:
> On Wed, Sep 20, 2017 at 10:59 AM, Amit Khandekar  
> wrote:
>> On 16 September 2017 at 10:42, Amit Kapila  wrote:
>>>
>>> At a broader level, the idea is good, but I think it won't turn out
>>> exactly like that considering your below paragraph which indicates
>>> that it is okay if leader picks a partial path that is costly among
>>> other partial paths as a leader won't be locked with that.
>>> Considering this is a good design for parallel append, the question is
>>> do we really need worker and leader to follow separate strategy for
>>> choosing next path.  I think the patch will be simpler if we can come
>>> up with a way for the worker and leader to use the same strategy to
>>> pick next path to process.  How about we arrange the list of paths
>>> such that first, all partial paths will be there and then non-partial
>>> paths and probably both in decreasing order of cost.  Now, both leader
>>> and worker can start from the beginning of the list. In most cases,
>>> the leader will start at the first partial path and will only ever
>>> need to scan non-partial path if there is no other partial path left.
>>> This is not bulletproof as it is possible that some worker starts
>>> before leader in which case leader might scan non-partial path before
>>> all partial paths are finished, but I think we can avoid that as well
>>> if we are too worried about such cases.
>>
>> If there are no partial subpaths, then again the leader is likely to
>> take up the expensive subpath. And this scenario would not be
>> uncommon.
>>
>
> While thinking about how common the case of no partial subpaths would
> be, it occurred to me that as of now we always create a partial path
> for the inheritance child if it is parallel-safe and the user has not
> explicitly set the value of parallel_workers to zero (refer
> compute_parallel_worker).  So, unless you are planning to change that,
> I think it will be quite uncommon to have no partial subpaths.

There are still some paths that can have non-partial paths cheaper
than the partial paths. Also, there can be UNION ALL queries which
could have non-partial subpaths. I guess this has already been
discussed in the other replies.

>
> Few nitpicks in your latest patch:
> 1.
> @@ -298,6 +366,292 @@ ExecReScanAppend(AppendState *node)
>   if (subnode->chgParam == NULL)
>   ExecReScan(subnode);
>   }
> +
>
> Looks like a spurious line.
>
> 2.
> @@ -1285,7 +1291,11 @@ add_paths_to_append_rel(PlannerInfo *root,
> RelOptInfo *rel,
> ..
> + if (chosen_path && chosen_path != cheapest_partial_path)
> + pa_all_partial_subpaths = false;
>
> It will keep on setting pa_all_partial_subpaths as false for
> non-partial paths which don't seem to be the purpose of this variable.
> I think you want it to be set even when there is one non-partial path,
> so isn't it better to write as below or something similar:
> if (pa_nonpartial_subpaths && pa_all_partial_subpaths)
> pa_all_partial_subpaths = false;

Ok. How about removing pa_all_partial_subpaths altogether , and
instead of the below condition :

/*
* If all the child rels have partial paths, and if the above Parallel
* Append path has a mix of partial and non-partial subpaths, then consider
* another Parallel Append path which will have *all* partial subpaths.
* If enable_parallelappend is off, make this one non-parallel-aware.
*/
if (partial_subpaths_valid && !pa_all_partial_subpaths)
..

Use this condition :
if (partial_subpaths_valid && pa_nonpartial_subpaths != NIL)
..




Regarding a mix of partial and non-partial paths, I feel it always
makes sense for the leader to choose the partial path. If it chooses a
non-partial path, no other worker would be able to help finish that
path. Among the partial paths, whether it chooses the cheapest one or
expensive one does not matter, I think. We have the partial paths
unordered. So whether it starts from the last partial path or the
first partial path should not matter.

Regarding scenario where all paths are non-partial, here is an e.g. :
Suppose we have 4 child paths with costs : 10 5 5 3, and with 2
workers plus one leader. And suppose the leader takes additionally
1/4th of these costs to process the returned tuples.

If leader takes least expensive one (3)  :
2 workers will finish 10, 5, 5 in 10 units,
and leader simultaneously chooses the plan with cost 3, and so it
takes 3 + (1/4)(10 + 5 + 5 + 3) = 9 units.
So the total time taken by Append is : 10.


Whereas if leader takes most expensive one (10) :
10 + .25 (total) = 10 + 6 = 16
The 2 workers will finish 2nd, 3rd and 4th plan (5,5,3) in 8 units.
and simultaneously leader will finish 1st plan (10) in 10 units, plus
tuple processing cost i.e. 10 +  (1/4)(10 + 5 + 5 + 3) = 15 units.
So the total time taken by Append is : 15.


-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


-- 

Re: [HACKERS] Parallel Append implementation

2017-10-05 Thread Amit Kapila
On Mon, Oct 2, 2017 at 6:21 PM, Robert Haas  wrote:
> On Sun, Oct 1, 2017 at 9:55 AM, Amit Kapila  wrote:
>> Isn't it for both?  I mean it is about comparing the non-partial paths
>> for child relations of the same relation and also when there are
>> different relations involved as in Union All kind of query.  In any
>> case, the point I was trying to say is that generally non-partial
>> relations will have relatively smaller scan size, so probably should
>> take lesser time to complete.
>
> I don't think that's a valid inference.  It's true that a relation
> could fail to have a partial path because it's small, but that's only
> one reason among very many.  The optimal index type could be one that
> doesn't support parallel index scans, for example.
>

Valid point.

>> Sure, I think it is quite good if we can achieve that but it seems to
>> me that we will not be able to achieve that in all scenario's with the
>> patch and rather I think in some situations it can result in leader
>> ended up picking the costly plan (in case when there are all partial
>> plans or mix of partial and non-partial plans).  Now, we are ignoring
>> such cases based on the assumption that other workers might help to
>> complete master backend.  I think it is quite possible that the worker
>> backends picks up some plans which emit rows greater than tuple queue
>> size and they instead wait on the master backend which itself is busy
>> in completing its plan.  So master backend will end up taking too much
>> time.  If we want to go with a strategy of master (leader) backend and
>> workers taking a different strategy to pick paths to work on, then it
>> might be better if we should try to ensure that master backend always
>> starts from the place which has cheapest plans irrespective of whether
>> the path is partial or non-partial.
>
> I agree that it's complicated to get this right in all cases; I'm
> realizing that's probably an unattainable ideal.
>
> However, I don't think ignoring the distinction between partial and
> non-partial plans is an improvement, because the argument that other
> workers may be able to help the leader is a correct one.  You are
> correct in saying that the workers might fill up their tuple queues
> while the leader is working, but once the leader returns one tuple it
> will switch to reading from the queues.  Every other tuple can be
> returned by some worker.  On the other hand, if the leader picks a
> non-partial plan, it must run that plan through to completion.
>
> Imagine (a) a non-partial path with a cost of 1000 returning 100
> tuples and (b) a partial path with a cost of 10,000 returning 1,000
> tuples.  No matter which path the leader picks, it has about 10 units
> of work to do to return 1 tuple.  However, if it picks the first path,
> it is committed to doing 990 more units of work later, regardless of
> whether the workers have filled the tuple queues or not.  If it picks
> the second path, it again has about 10 units of work to do to return 1
> tuple, but it hasn't committed to doing all the rest of the work of
> that path.  So that's better.
>
> Now it's hard to get all of the cases right.  If the partial path in
> the previous example had a cost of 1 crore then even returning 1 tuple
> is more costly than running the whole non-partial path.  When you
> introduce partition-wise join and parallel hash, there are even more
> problems.  Now the partial path might have a large startup cost, so
> returning the first tuple may be very expensive, and that work may
> help other workers (if this is a parallel hash) or it may not (if this
> is a non-parallel hash).
>

Yeah, these were the type of cases I am also worried.

>  I don't know that we can get all of these
> cases right, or that we should try.  However, I still think that
> picking partial paths preferentially is sensible -- we don't know all
> the details, but we do know that they're typically going to at least
> try to divide up the work in a fine-grained fashion, while a
> non-partial path, once started, the leader must run it to completion.
>

Okay, but can't we try to pick the cheapest partial path for master
backend and maybe master backend can try to work on a partial path
which is already picked up by some worker.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Comment typo

2017-10-05 Thread Etsuro Fujita
Here is a small patch to fix a typo in a comment in partition.c: 
s/mantain/maintain/.


Best regards,
Etsuro Fujita
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 1ab6dba..9777d40 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1236,7 +1236,7 @@ RelationGetPartitionDispatchInfo(Relation rel,
  * get_partition_dispatch_recurse
  * Recursively expand partition tree rooted at rel
  *
- * As the partition tree is expanded in a depth-first manner, we mantain two
+ * As the partition tree is expanded in a depth-first manner, we maintain two
  * global lists: of PartitionDispatch objects corresponding to partitioned
  * tables in *pds and of the leaf partition OIDs in *leaf_part_oids.
  *

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-05 Thread Alvaro Herrera
Wood, Dan wrote:
> Whatever you do make sure to also test 250 clients running lock.sql.  Even 
> with the communities fix plus YiWen’s fix I can still get duplicate rows.  
> What works for “in-block” hot chains may not work when spanning blocks.

Good idea.  You can achieve a similar effect by adding a filler column,
and reducing fillfactor.

> Once nearly all 250 clients have done their updates and everybody is
> waiting to vacuum which one by one will take a while I usually just
> “pkill -9 psql”.  After that I have many of duplicate “id=3” rows.

Odd ...

> On top of that I think we might have a lock leak.  After the pkill I
> tried to rerun setup.sql to drop/create the table and it hangs.  I see
> an autovacuum process starting and existing every couple of seconds.
> Only by killing and restarting PG can I drop the table.

Please do try to figure this one out.  It'd be a separate problem,
worthy of its own thread.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallelize queries containing initplans

2017-10-05 Thread Amit Kapila
On Wed, Oct 4, 2017 at 12:55 PM, Amit Kapila  wrote:
> On Wed, Oct 4, 2017 at 3:40 AM, Robert Haas  wrote:
>> On Tue, Oct 3, 2017 at 7:33 AM, Amit Kapila  wrote:
>>
>> Having said all that, I think that this patch only wants to handle the
>> subset of cases (2) and (4) where the relevant InitPlan is attached
>> ABOVE the Gather node -- which seems very reasonable, because
>> evaluating an InitPlan at a level of the plan tree above the level at
>> which it is defined sounds like it might be complex.  But I still
>> don't quite see why we need these tests.  I mean, if we only allow
>> Param references from a set of safe parameter IDs, and the safe
>> parameter IDs include only those IDs that can be generated in a
>> worker, then won't other InitPlans in the workers anyway be ruled out?
>
> It doesn't happen always.  There are cases when the part of required
> conditions are pushed one query level below, so when we check during
> max_parallel_hazard_walker, they look safe, but actually, they are
> not.  I will try to explain by example:
>
> postgres=# explain (costs off, verbose) select * from t1 where t1.i in
> ( select 1 + (select max(j) from t3));
>   QUERY PLAN
> --
>  Hash Semi Join
>Output: t1.i, t1.j, t1.k
>Hash Cond: (t1.i = ((1 + $1)))
>->  Seq Scan on public.t1
>  Output: t1.i, t1.j, t1.k
>->  Hash
>  Output: ((1 + $1))
>  ->  Result
>Output: (1 + $1)
>InitPlan 1 (returns $1)
>  ->  Finalize Aggregate
>Output: max(t3.j)
>->  Gather
>  Output: (PARTIAL max(t3.j))
>  Workers Planned: 2
>  ->  Partial Aggregate
>Output: PARTIAL max(t3.j)
>->  Parallel Seq Scan on public.t3
>  Output: t3.j
> (19 rows)
>
> In the above example, you can see that the condition referring to
> initplan (1 + $1) is pushed one level below.  So when it tries to
> check parallel safety for the join condition, it won't see Param node.
>

To add more detail, the param node in join qual is replaced with Var
during pullup of sublink.  Basically, it constructs vars from target
list of subquery and then replaces it in join quals.  Refer code:
convert_ANY_sublink_to_join()
{
..
/*
* Build a list of Vars representing the subselect outputs.
*/
subquery_vars = generate_subquery_vars(root,
  subselect->targetList,
  rtindex);

/*
* Build the new join's qual expression, replacing Params with these Vars.
*/
quals = convert_testexpr(root, sublink->testexpr, subquery_vars);
..
}

Now, unless, I am missing something here, it won't be possible to
detect params in such cases during forming of join rels and hence we
need the tests in generate_gather_paths.  Let me know if I am missing
something in this context or if you have any better ideas to make it
work?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-10-05 Thread Ashutosh Bapat
On Wed, Oct 4, 2017 at 9:01 PM, Robert Haas  wrote:
> On Thu, Sep 21, 2017 at 8:52 AM, Robert Haas  wrote:
>> On Thu, Sep 21, 2017 at 8:21 AM, Ashutosh Bapat
>>  wrote:
>>> About your earlier comment of making build_joinrel_partition_info()
>>> simpler. Right now, the code assumes that partexprs or
>>> nullable_partexpr can be NULL when either of them is not populated.
>>> That may be saves a sizeof(pointer) * (number of keys) byes of memory.
>>> Saving that much memory may not be worth the complexity of code. So,
>>> we may always allocate memory for those arrays and fill it with NIL
>>> values when there are no key expressions to populate those. That will
>>> simplify the code. I haven't done that change in this patchset. I was
>>> busy debugging the Q7 regression. Let me know your comments about
>>> that.
>>
>> Hmm, I'm not sure that's the best approach, but let me look at it more
>> carefully before I express a firm opinion.
>
> Having studied this a bit more, I now think your proposed approach is
> a good idea.

Thanks. Done.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another oddity in handling of WCO constraints in postgres_fdw

2017-10-05 Thread Etsuro Fujita

On 2017/10/04 21:28, Ashutosh Bapat wrote:

On Wed, Oct 4, 2017 at 5:32 PM, Robert Haas  wrote:

On Wed, Oct 4, 2017 at 6:40 AM, Ashutosh Bapat
 wrote:

We can
check whether a row being sent from the local server to the foreign
server obeys WCO, but what foreign server does to that row is beyond
local server's scope.


But I think right now we're not checking the row being sent from the
local server, either.


We don't check the row *before* sending it to the remote server, but 
check the row returned by ExecForeignInsert/ExecForeignUpdate, which is 
allowed to have been changed by the remote server.  In postgres_fdw, we 
currently return the data actually inserted/updated if RETURNING/AFTER 
TRIGGER present, but not if WCO only presents.  So, for the postgres_fdw 
foreign table, WCO is enforced on the data that was actually 
inserted/updated if RETURNING/AFTER TRIGGER present and on the original 
data core supplied if WCO only presents, which is inconsistent behavior.



Didn't 7086be6e3627c1ad797e32ebbdd232905b5f577f fix that?


No.  The commit addressed another issue.


The WCO that is being ignored isn't a
constraint on the foreign table; it's a constraint on a view which
happens to reference the foreign table.  It seems quite odd for the
"assume constraints are valid" property of the foreign table to
propagate back up into the view that references it.


Agreed.


The view with WCO is local but the modification which violates WCO is
being made on remote server by a trigger on remote table. Trying to
control that doesn't seem to be a good idea, just like we can't
control what rows get inserted on the foreign server when they violate
local constraints. I am using local constraints as an example of
precedence where we ignore what's happening on remote side and enforce
whatever we could enforce locally. Local server should make sure that
any rows sent from local server to the remote server do not violate
any local WCO.


Seems odd (and too restrictive) to me too.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Hash take II

2017-10-05 Thread Thomas Munro
On Thu, Oct 5, 2017 at 7:07 PM, Rushabh Lathia  wrote:
> v20 patch set (I was trying 0008, 0009 patch)  not getting cleanly apply on
> latest commit also getting compilation error due to refactor in below
> commit.
>
> commit 0c5803b450e0cc29b3527df3f352e6f18a038cc6

Hi Rushabh

I am about to post a new patch set that fixes the deadlock problem,
but in the meantime here is a rebase of those two patches (numbers
changed to 0006 + 0007).  In the next version I think I should
probably remove that 'stripe' concept.  The idea was to spread
temporary files over the available temporary tablespaces, but it's a
terrible API, since you have to promise to use the same stripe number
when opening the same name later... Maybe I should use a hash of the
name for that instead.  Thoughts?

-- 
Thomas Munro
http://www.enterprisedb.com


0006-Remove-BufFile-s-isTemp-flag.patch
Description: Binary data


0007-Add-BufFileSet-for-sharing-temporary-files-between-b.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JIT compiling - v4.0

2017-10-05 Thread Andres Freund
On 2017-10-04 11:56:47 +0300, Ants Aasma wrote:
> On Wed, Oct 4, 2017 at 9:48 AM, Andres Freund  wrote:
> > Here's an updated version of the patchset.  There's some substantial
> > changes here, but it's still very obviously very far from committable as
> > a whole. There's some helper commmits that are simple and independent
> > enough to be committable earlier on.
>
> Looks pretty impressive already.

Thanks!


> I wanted to take it for a spin, but got errors about the following
> symbols being missing:
>
> LLVMOrcUnregisterPerf
> LLVMOrcRegisterGDB
> LLVMOrcRegisterPerf
> LLVMOrcGetSymbolAddressIn
> LLVMLinkModules2Needed
>
> As far as I can tell these are not in mainline LLVM. Is there a branch
> or patchset of LLVM available somewhere that I need to use this?

Oops, I'd forgotten about the modifications. Sorry. I've attached them
here.  The GDB and Perf stuff should now be an optional dependency,
too.  The required changes are fairly small, so they hopefully shouldn't
be too hard to upstream.

Please check the git tree for a rebased version of the pg patches, with
a bunch bugfixes (oops, some last minute "cleanups") and performance
fixes.

Here's some numbers for a a TPC-H scale 5 run. Obviously the Q01 numbers
are pretty nice in partcular. But it's also visible that the shorter
query can loose, which is largely due to the JIT overhead - that can be
ameliorated to some degree, but JITing obviously isn't always going to
be a win.

It's pretty impressive that in q01, even after all of this, expression
evaluation *still* is 35% of the total time (25% in the aggregate
transition function). That's partially just because the query does
primarily aggregation, but also because the generated code can stand a
good chunk of improvements.

master q01 min: 14146.498 dev min: 11479.05 [diff -23.24] dev-jit min: 
8659.961 [diff -63.36] dev-jit-deform min: 7279.395 [diff -94.34] 
dev-jit-deform-inline min: 6997.956 [diff -102.15]
master q02 min: 1234.229 dev min: 1208.102 [diff -2.16] dev-jit min: 
1292.983 [diff +4.54] dev-jit-deform min: 1580.505 [diff +21.91] 
dev-jit-deform-inline min: 1809.046 [diff +31.77]
master q03 min: 6220.814 dev min: 5424.107 [diff -14.69] dev-jit min: 
5175.125 [diff -20.21] dev-jit-deform min: 4257.368 [diff -46.12] 
dev-jit-deform-inline min: 4218.115 [diff -47.48]
master q04 min: 947.476 dev min: 970.608 [diff +2.38] dev-jit min: 
969.944 [diff +2.32] dev-jit-deform min: 999.006 [diff +5.16] 
dev-jit-deform-inline min: 1033.78 [diff +8.35]
master q05 min: 4729.9 dev min: 4059.665 [diff -16.51] dev-jit min: 
4182.941 [diff -13.08] dev-jit-deform min: 4147.493 [diff -14.04] 
dev-jit-deform-inline min: 4284.473 [diff -10.40]
master q06 min: 1603.708 dev min: 1592.107 [diff -0.73] dev-jit min: 
1556.216 [diff -3.05] dev-jit-deform min: 1516.078 [diff -5.78] 
dev-jit-deform-inline min: 1579.839 [diff -1.51]
master q07 min: 4549.738 dev min: 4331.565 [diff -5.04] dev-jit min: 
4475.654 [diff -1.66] dev-jit-deform min: 4645.773 [diff +2.07] 
dev-jit-deform-inline min: 4885.781 [diff +6.88]
master q08 min: 1394.428 dev min: 1350.363 [diff -3.26] dev-jit min: 
1434.366 [diff +2.78] dev-jit-deform min: 1716.65 [diff +18.77] 
dev-jit-deform-inline min: 1938.152 [diff +28.05]
master q09 min: 5958.198 dev min: 5700.329 [diff -4.52] dev-jit min: 
5491.683 [diff -8.49] dev-jit-deform min: 5582.431 [diff -6.73] 
dev-jit-deform-inline min: 5797.475 [diff -2.77]
master q10 min: 5228.69 dev min: 4475.154 [diff -16.84] dev-jit min: 
4269.365 [diff -22.47] dev-jit-deform min: 3767.888 [diff -38.77] 
dev-jit-deform-inline min: 3962.084 [diff -31.97]
master q11 min: 281.201 dev min: 280.132 [diff -0.38] dev-jit min: 
351.85 [diff +20.08] dev-jit-deform min: 455.885 [diff +38.32] 
dev-jit-deform-inline min: 532.093 [diff +47.15]
master q12 min: 4289.268 dev min: 4082.359 [diff -5.07] dev-jit min: 
4007.199 [diff -7.04] dev-jit-deform min: 3752.396 [diff -14.31] 
dev-jit-deform-inline min: 3916.653 [diff -9.51]
master q13 min: 7110.545 dev min: 6898.576 [diff -3.07] dev-jit min: 
6579.554 [diff -8.07] dev-jit-deform min: 6304.15 [diff -12.79] 
dev-jit-deform-inline min: 6135.952 [diff -15.88]
master q14 min: 678.024 dev min: 650.943 [diff -4.16] dev-jit min: 
682.387 [diff +0.64] dev-jit-deform min: 746.354 [diff +9.16] 
dev-jit-deform-inline min: 878.437 [diff +22.81]
master q15 min: 1641.897 dev min: 1650.57 [diff +0.53] dev-jit min: 
1661.591 [diff +1.19] dev-jit-deform min: 1821.02 [diff +9.84] 
dev-jit-deform-inline min: 1863.304 [diff +11.88]
master q16 min: 1890.246 dev min: 1819.423 [diff -3.89] dev-jit min: 
1838.079 [diff -2.84] dev-jit-deform min: 1962.274 [diff +3.67] 
dev-jit-deform-inline min: 2096.154 [diff +9.82]
master 

Re: [HACKERS] Parallel Hash take II

2017-10-05 Thread Rushabh Lathia
v20 patch set (I was trying 0008, 0009 patch)  not getting cleanly apply on
latest commit also getting compilation error due to refactor in below
commit.

commit 0c5803b450e0cc29b3527df3f352e6f18a038cc6
Author: Peter Eisentraut 
Date:   Sat Sep 23 09:49:22 2017 -0400

Refactor new file permission handling



On Mon, Sep 25, 2017 at 11:38 AM, Thomas Munro <
thomas.mu...@enterprisedb.com> wrote:

> On Thu, Sep 21, 2017 at 5:49 AM, Peter Geoghegan  wrote:
> > Graefe's "Query Evaluation Techniques for Large Databases" has several
> > pages on deadlock avoidance strategies. It was written almost 25 years
> > ago, but still has some good insights IMV (you'll recall that Graefe
> > is the author of the Volcano paper; this reference paper seems like
> > his follow-up). Apparently, deadlock avoidance strategy becomes
> > important for parallel sort with partitioning. You may be able to get
> > some ideas from there. And even if you don't, his handling of the
> > topic is very deliberate and high level, which suggests that ours
> > should be, too.
>
> Very interesting and certainly relevant (the parts I've read so far),
> though we don't have multiple consumers.  Multiplexing one thread so
> that it is both a consumer and a producer is an extra twist though.
>
> --
> Thomas Munro
> http://www.enterprisedb.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Rushabh Lathia