date:20160912

Re: [HACKERS] WAL consistency check facility

2016-09-12 Thread Michael Paquier

On Sun, Sep 11, 2016 at 12:03 AM, Robert Haas  wrote:
> It seems entirely unnecessary for the master and the standby to agree
> here.  I think what we need is two GUCs.  One of them, which affects
> only the master, controls whether the validation information is
> including in the WAL, and the other, which affects only the standby,
> affects whether validation is performed when the necessary information
> is present.  Or maybe skip the second one and just decree that
> standbys will always validate if the necessary information is present.
> Using the same GUC on both the master and the standby but making it
> mean different things in each of those places (whether to log the
> validation info in one case, whether to perform validation in the
> other case) is another option that also avoids needing to enforce that
> the setting is the same in both places, but probably an inferior one.

Thinking more about that, there is no actual need to do anything
complicated here. We could just track at the record level if a
consistency check is needs to be done at replay and do it. If nothing
is set, just do nothing. That would allow us to promote this parameter
to SIGHUP. wal_compression does something similar.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.6 TAP tests and extensions

2016-09-12 Thread Craig Ringer

On 13 September 2016 at 13:27, Craig Ringer  wrote:
> This was wrong for out-of-tree builds, updated.
>
> Still pending fix for PG_REGRESS path when invoked using
> $(prove_check) from PGXS

Looking further at this, I think a pgxs-specific patch to add support
for prove tests and isolation tests would be best, and can be done
separately. Possibly even snuck into a point release, but if not, at
least extension authors can invoke prove in their own Makefile if the
required modules get installed. It just needs an adaptation of the
command used in the $(prove_check) definition.

Extension makefiles run tests by listing the tests in REGRESS .
Something similar would make sense for isolation checks. For prove,
probably just a macro that can be invoked to enable prove tests in
pgxs makefiles.

I suggest that the above patches be applied to 9.6 and v10. Then for
v10 I'll look at enhancing PGXS to make it easier to use isolation
tests and prove tests; extensions that want to use them in 9.6 can
just add something like:

prove_check:
rm -rf $(CURDIR)/tmp_check/log
cd $(srcdir) && TESTDIR='$(CURDIR)' PATH="$(shell pg_config
--bindir):$$PATH" PGPORT='6$(DEF_PGPORT)'
top_builddir='$(CURDIR)/$(top_builddir)' PG_REGRESS='pg_regress'
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) t/*.pl

.PHONY: prove_check

to their Makefile , so it's not necessary to have PGXS support for
this for it to be useful in 9.6.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WAL consistency check facility

2016-09-12 Thread Michael Paquier

On Mon, Sep 12, 2016 at 5:06 AM, Kuntal Ghosh
 wrote:
>>+   void(*rm_checkConsistency) (XLogReaderState *record);
>>All your _checkConsistency functions share the same pattern, in short
>>they all use a for loop for each block, call each time
>>XLogReadBufferExtended, etc. And this leads to a *lot* of duplication.
>>You would get a reduction by a couple of hundreds of lines by having a
>>smarter refactoring. And to be honest, if I look at your patch what I
>>think is the correct way of doing things is to add to the rmgr not
>>this check consistency function, but just a pointer to the masking
>>function.
>
> +1. In rmgrlist, I've added a pointer to the masking function for each rmid.
> A common function named checkConsistency calls these masking functions
> based on their rmid and does comparison for each block.

The patch size is down from 79kB to 38kB. That gets better :)

>>> - If WAL consistency check is enabled for a rmgrID, we always include
>>> the backup image in the WAL record.
>>
>>What happens if wal_consistency has different settings on a standby
>>and its master? If for example it is set to 'all' on the standby, and
>>'none' on the master, or vice-versa, how do things react? An update of
>>this parameter should be WAL-logged, no?
>
> If wal_consistency is enabled for a rmid, standby will always check whether
> backup image exists or not i.e. BKPBLOCK_HAS_IMAGE is set or not.
> (I guess Amit and Robert also suggested the same in the thread)
> Basically, BKPBLOCK_HAS_IMAGE is set if a block contains image and
> BKPIMAGE_IS_REQUIRED_FOR_REDO (I've added this one) is set if that backup
> image is required during redo. When we decode a wal record, has_image
> flag of DecodedBkpBlock is set to BKPIMAGE_IS_REQUIRED_FOR_REDO.

Ah I see. But do we actually store the status in the record itself,
then at replay we don't care of the value of wal_consistency at
replay. That's the same concept used by wal_compression. So shouldn't
you have more specific checks related to that in checkConsistency? You
actually don't need to check for anything in xlogreader.c, just check
for the consistency if there is a need to do so, or do nothing.

> For now, I've kept this as a WARNING message to detect all inconsistencies
> at once. Once, the patch is finalized, I'll modify it as an ERROR message.

Or say FATAL. This way the server is taken down.

> Thoughts?

A couple of extra thoughts:
1) The routines of each rmgr are located in a dedicated file, for
example GIN stuff is in ginxlog.c, etc. It seems to me that it would
be better to move each masking function where it should be instead
being centralized. A couple of routines need to be centralized, so I'd
suggest putting them in a new file, like xlogmask.c, xlog because now
this is part of WAL replay completely, including the lsn, the hint
bint and the other common routines.

2) Regarding page comparison:
+/*
+ * Compare the contents of two pages.
+ * If the two pages are exactly same, it returns BLCKSZ. Otherwise,
+ * it returns the location where the first mismatch has occurred.
+ */
+int
+comparePages(char *page1, char *page2)
We could just use memcpy() here. compareImages was useful to get a
clear image of what the inconsistencies were, but you don't do that
anymore.

3)
+static void checkConsistency(RmgrId rmid, XLogReaderState *record);
The RMGR if is part of the record decoded, so you could just remove
RmgrId from the list of arguments and simplify this interface.

4) If this patch still goes with the possibility to set up a list of
RMGRs, documentation is needed for that. I'd suggest writing first a
patch to explain what are RMGRs for WAL, then apply the WAL
consistency facility on top of it and link wal_consistency to it.

5)
+   has_image = record->blocks[block_id].has_image;
+   record->blocks[block_id].has_image = true;
+   if (!RestoreBlockImage(record, block_id, old_page))
+   elog(ERROR, "failed to restore block image");
+   record->blocks[block_id].has_image = has_image;
Er, what? And BKPIMAGE_IS_REQUIRED_FOR_REDO?

6)
+   /*
+* Remember that, if WAL consistency check is enabled for
the current rmid,
+* we always include backup image with the WAL record.
But, during redo we
+* restore the backup block only if needs_backup is set.
+*/
+   if (needs_backup)
+   bimg.bimg_info |= BKPIMAGE_IS_REQUIRED_FOR_REDO;
This should use wal_consistency[rmid]?

7) This patch has zero documentation. Please add some. Any human being
on this list other than those who worked on the first versions
(Heikki, Simon and I?) is going to have a hard time to review this
patch in details moving on if there is no reference to tell what this
feature does for the user...

This patch is going to the good direction, but I don't think it's far
from being ready for commit yet. So I am going to mark it as returned
with feedback if there are no objections.
-- 
Michae

Re: [HACKERS] OpenSSL 1.1 breaks configure and more

2016-09-12 Thread Michael Paquier

On Tue, Sep 13, 2016 at 1:51 AM, Heikki Linnakangas  wrote:
> I planned to commit this today, but while reading through it and testing, I
> ended up doing a bunch more changes, so this deserves another round of
> review.

OK, I am giving it a try. Note to people using OSX: at least for brew
there is the formula openssl@1.1 that you can use with the following
flags:
CFLAGS="-I/usr/local/opt/openssl@1.1/include"
LDFLAGS="-L/usr/local/opt/openssl@1.1/lib"
Postgres is not the only broken thing, so they kept the formula
openssl to 1.0.2.

> Changes since last version:
>
> * Added more error checks to the my_BIO_s_socket() function. Check for NULL
> result from malloc(). Check the return code of BIO_meth_set_*() functions;
> looking at OpenSSL sources, they always succeed, but all the test/example
> programs that come with OpenSSL do check them.
>
> * Use BIO_get_new_index() to get the index number for the wrapper BIO.
>
> * Also call BIO_meth_set_puts(). It was missing in previous patch versions.
>
> * Fixed src/test/ssl test suite to also work with OpenSSL 1.1.0.
>
> * Changed all references (in existing code) to SSLEAY_VERSION_NUMBER into
> OPENSSL_VERSION_NUMBER, for consistency.
>
> * Squashed all into one patch.
>
> I intend to apply this to all supported branches, so please have a look!
> This is now against REL9_6_STABLE, but there should be little difference
> between branches in the code that this touches.

I just took a look at this patch, testing it on the way with 1.1.0 and
1.0.2. And it looks in pretty good shape.

+   ResourceOwner owner;
+   struct OSSLDigest *next;
+   struct OSSLDigest *prev;
This could be done as well with a list of pg_list, the cost being a
couple of extra calls to switch memory contexts, but it would simplify
free_openssldigest when cleaning up an entry. I guessed you already
thought about that but discarded it?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.6 TAP tests and extensions

2016-09-12 Thread Craig Ringer

This was wrong for out-of-tree builds, updated.

Still pending fix for PG_REGRESS path when invoked using
$(prove_check) from PGXS


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 462a0ab51935b45d17820b83b8e9f6abd4ad2904 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Tue, 13 Sep 2016 11:06:58 +0800
Subject: [PATCH 1/3] Install the Perl TAP tests

---
 src/Makefile  |  3 ++-
 src/test/Makefile |  2 +-
 src/test/perl/GNUmakefile | 39 +++
 3 files changed, 42 insertions(+), 2 deletions(-)
 create mode 100644 src/test/perl/GNUmakefile

diff --git a/src/Makefile b/src/Makefile
index b526be7..977f80b 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -26,7 +26,8 @@ SUBDIRS = \
 	bin \
 	pl \
 	makefiles \
-	test/regress
+	test/regress \
+	test/perl
 
 # There are too many interdependencies between the subdirectories, so
 # don't attempt parallel make here.
diff --git a/src/test/Makefile b/src/test/Makefile
index 7f7754f..6b40cf5 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = regress isolation modules recovery
+SUBDIRS = perl regress isolation modules recovery
 
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
diff --git a/src/test/perl/GNUmakefile b/src/test/perl/GNUmakefile
new file mode 100644
index 000..3c5dc70
--- /dev/null
+++ b/src/test/perl/GNUmakefile
@@ -0,0 +1,39 @@
+#-
+#
+# Makefile for src/test/perl
+#
+# Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/perl/Makefile
+#
+#-
+
+subdir = src/test/perl
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+installdirs:
+	$(MKDIR_P) '$(DESTDIR)$(pgxsdir)/$(subdir)'
+
+ifeq ($(enable_tap_tests),yes)
+
+install: all installdirs
+	$(INSTALL_DATA) $(srcdir)/TestLib.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/TestLib.pm'
+	$(INSTALL_DATA) $(srcdir)/SimpleTee.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/SimpleTee.pm'
+	$(INSTALL_DATA) $(srcdir)/RecursiveCopy.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/RecursiveCopy.pm'
+	$(INSTALL_DATA) $(srcdir)/PostgresNode.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgresNode.pm'
+
+uninstall:
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/TestLib.pm'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/SimpleTee.pm'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/RecursiveCopy.pm'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgresNode.pm'
+
+else
+
+install: ;
+
+uninstall: ;
+
+endif
-- 
2.5.5

From 86aff40374e05fec0160cbab0d1879bd01c6f411 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Tue, 13 Sep 2016 11:00:41 +0800
Subject: [PATCH 2/3] Add install rules for isolation tester

Allow 'make install' for the isolation tester to work so it can be
used from PGXS extensions.
---
 src/Makefile|  1 +
 src/test/isolation/Makefile | 11 +++
 2 files changed, 12 insertions(+)

diff --git a/src/Makefile b/src/Makefile
index 977f80b..d4aa06b 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -27,6 +27,7 @@ SUBDIRS = \
 	pl \
 	makefiles \
 	test/regress \
+	test/isolation \
 	test/perl
 
 # There are too many interdependencies between the subdirectories, so
diff --git a/src/test/isolation/Makefile b/src/test/isolation/Makefile
index 3d272d5..e111bf0 100644
--- a/src/test/isolation/Makefile
+++ b/src/test/isolation/Makefile
@@ -66,3 +66,14 @@ installcheck-prepared-txns: all temp-install
 
 check-prepared-txns: all temp-install
 	./pg_isolation_regress --temp-instance=./tmp_check $(TEMP_CONF) $(EXTRA_REGRESS_OPTS) --inputdir=$(srcdir) --schedule=$(srcdir)/isolation_schedule prepared-transactions
+
+install: all installdirs
+	$(INSTALL_PROGRAM) isolationtester$(X) '$(DESTDIR)$(pgxsdir)/$(subdir)/isolationtester$(X)'
+	$(INSTALL_PROGRAM) pg_isolation_regress$(X) '$(DESTDIR)$(pgxsdir)/$(subdir)/pg_isolation_regress$(X)'
+
+installdirs:
+	$(MKDIR_P) '$(DESTDIR)$(pgxsdir)/$(subdir)'
+
+uninstall:
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/isolationtester$(X)'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/pg_isolation_regress$(X)'
-- 
2.5.5

From a651bc09c067bdcc8d0c165a91ae1c26ddb0befc Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Tue, 13 Sep 2016 11:48:43 +0800
Subject: [PATCH 3/3] Note that src/test/Makefile is not called from
 src/Makefile

Add a comment to help developers who're editing src/test/Makefile.
---
 src/test/Makefile | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/test/Makefile b/src/test/Makefile
index 6b40cf5..a24071e 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -11,6 +11,10 @@
 subdir = src/test
 top_builddir = ../..

Re: [HACKERS] Push down more full joins in postgres_fdw

2016-09-12 Thread Ashutosh Bapat

>
> 3. I think registerAlias stuff should happen really at the time of
>> creating paths and should be stored in fpinfo. Without that it
>> will be
>> computed every time we deparse the query. This means every time
>> we try
>> to EXPLAIN the query at various levels in the join tree and once
>> for the
>> query to be sent to the foreign server.
>>
>
> Hmm.  I think the overhead in calculating aliases would be
>> negligible compared to the overhead in explaining each remote query
>> for costing or sending the remote query for execution.  So, I
>> created aliases in the same way as remote params created and stored
>> into params_list in deparse_expr_cxt.  I'm not sure it's worth
>> complicating the code.
>>
>
> We defer remote parameter creation till deparse time since the the
>> parameter numbers are dependent upon the sequence in which we deparse
>> the query. Creating them at the time of path creation and storing them
>> in fpinfo is not possible because a. those present in the joining
>> relations will conflict with each other and need some kind of
>> serialization at the time of deparsing b. those defer for differently
>> parameterized paths or paths with different pathkeys. We don't defer it
>> because it's very light on performance.
>>
>> That's not true with the alias information. As long as we detect which
>> relations need subqueries, their RTIs are enough to create unique
>> aliases e.g. if a relation involves RTIs 1, 2, 3 corresponding subquery
>> can have alias r123 without conflicting with any other alias.
>>
>
> Hmm.  But another concern about the approach of generating an subselect
> alias for a path, if needed, at the path creation time would be that the
> path might be rejected by add_path, which would result in totally wasting
> cycles for generating the subselect alias for the path.
>

A path may get rejected but the relation is not rejected. The alias applies
to a relation and its targetlist, which will remain same for all paths
created for that relation, esp when it's a base relation or join. So, AFAIU
that work never gets wasted. Also, for costing paths with
use_remote_estimate, we deparse the query, which builds the alias
information again and again for very path. That's much worse than building
it once for a given relation.


>
> However minimum overhead it might have, we will deparse the query every
>> time we create a path involving that kind of relation i.e. for different
>> pathkeys, different parameterization and different joins in which that
>> relation participates in. This number can be significant. We want to
>> avoid this overhead if we can.
>>
>
> Exactly.  As I said above, I think the overhead would be negligible
> compared to the overhead in explaining each remote query for costing or the
> overhead in sending the final remote query for execution, though.
>

It won't remain minimal as the number of paths created increases,
increasing the number of times a query is deparsed. We deparse query every
time time we cost a path for a relation with use_remote_estimates true. As
we try to push down more and more stuff, we will create more paths and
deparse the query more time.

Also, that makes the interface better. Right now, in your patch, you have
changed the order of deparsing in the existing code, so that the aliases
are registered while deparsing FROM clause and before any Var nodes are
deparsed. If we create aliases at the time of path creation, only once in
GetForeignJoinPaths or GetForeignPaths as appropriate, that would require
less code churn and would save some CPU cycles as well.


>
> 6.
>> ! if (is_placeholder)
>> ! errcontext("placeholder expression at position %d in
>> select list",
>> !errpos->cur_attno);
>> A user wouldn't know what a placeholder expression is, there is
>> no such
>> term in the documentation. We have to device a better way to
>> provide an
>> error context for an expression in general.
>>
>
> Though I proposed that, I don't think that it's that important to
>> let users know that the expression is from a PHV.  How about just
>> saying "expression", not "placeholder expression", so that we have
>> the message "expression at position %d in select list" in the context?
>>
>
> Hmm, that's better than placeholder expression, but not as explanatory
>> as it should be since we won't be printing the "select list" in the
>> error context and user won't have a clue as to what error context is
>> about.
>>
>
> I don't think so.  Consider an example of the conversion error message,
> which is from the regression test:
>
> SELECT  ft1.c1,  ft2.c2, ft1 FROM ft1, ft2 WHERE ft1.c1 = ft2.c1 AND
> ft1.c1 = 1;
> ERROR:  invalid input syntax for integer: "foo"
> CONTEXT:  whole-row reference to foreign table "ft1"
>
> As shown in the example, the erro

Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2016-09-12 Thread Amit Langote

On 2016/09/09 18:47, Ashutosh Bapat wrote:
> A related change is renaming RangeBound structure in Amit
> Langote's patches to PartitionRangeBound to avoid name conflict with
> rangetypes.h. That change too should vanish once we decide where to keep
> that structure and its final name.

This change has been incorporated into the latest patch I posted on Sep 9 [1].

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/28ee345c-1278-700e-39a7-36a71f9a3...@lab.ntt.co.jp




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-12 Thread Michael Paquier

On Sat, Sep 10, 2016 at 6:27 PM, Peter Eisentraut
 wrote:
> In fsync_pgdata(), you have removed the progname from one error
> message, even though it is passed into the function.

Right. Good catch.

> Also, I think
> fsync_pgdata() should not be printing initdb's progress messages.
> That should stay in initdb.

That's a reference to that.
+   fputs(_("syncing data to disk ... "), stdout);
+   fflush(stdout);
Oops, fixed. I missed that now that I look at it. Perhaps I was
thinking something else when hacking that lastly, like getting this
message out for pg_basebackup as well.

> Worse, in 0002 you are then changing the
> output, presumably to suit pg_basebackup or something else, which
> messes up the initdb output.

Yes, fixed.

> durable_rename() does not fsync an existing newfile before renaming,
> in contrast to the backend code in fd.c.

That's in 0002. In the case of initdb it did not really matter, but
that matters for pg_receivexlog, so let's do it unconditionally. I
reworked the code to check if the existing new file is here, and
fsync() if it exists.

> I'm slightly concerned about lots of code duplication in
> durable_rename, fsync_parent_path between backend and standalone code.

Based on the ground of code readability, I am not sure that it would
be worth sharing the code of durable_rename or even fsync_parent_path
between the backend and the frontend, particularly because they are
actually not really duplicated... One good step in favor of that would
be to get a elog()/ereport() layer in src/common as well, but that's
really something that this set of patches should tackle, no?

> Also, I'm equally slightly concerned that having duplicate function
> names will mess up tag search for someone.

There are similar cases, take palloc().. Now perhaps some of those
functions could be renamed pg_fsync_... Thoughts?

> This is a preexisting mistake, but fsync_fname_ext() should just be
> fsync_fname() to match the backend naming.  In the backend,
> fsync_fname_ext() "extends" fsync_fname() by accepting an ignore_perm
> argument, but the initdb code doesn't do that.

OK.

> I don't think tar file output in pg_basebackup needs to be fsynced.
> It's just a backup file like what pg_dump produces, and we don't fsync
> that either.  The point of this change is to leave a data directory in
> a synced state equivalent to what initdb leaves behind.

Here I don't agree. The point of the patch is to make sure that what
gets generated by pg_basebackup is consistent on disk, for all the
formats. It seems weird to me that we could say that the plain format
makes things consistent while the tar format may cause some files to
be lost in case of power outage. The docs make it clear I think:
+By default, pg_basebackup will wait for all files
+to be written safely to disk.
But perhaps this should directly mention that this is done for all the formats?

> Some of the changes in receivelog.c seem excessive.  Replacing a plain
> fsync() with fsync_fname_ext() plus fsync_parent_path() isn't
> justified by the references you have provided.  This would need more
> explanation.

In mark_file_as_archived(), we had better do it or in case of a crash
the .done file would just be gone. open_walfile() as well need that,
to ensure that the segment is created and zeroed, and in the case
where it was padded (is this one overthinking it?).

Patch 0003 had conflicts caused by 9083353.
-- 
Michael
From 6158dc8f70dd100f59d7d2c4b7a60435b7c1cfac Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 13 Sep 2016 12:14:12 +0900
Subject: [PATCH 1/4] Relocation fsync routines of initdb into src/common

Those are aimed at being used by other utilities, pg_basebackup mainly,
and at some other degree by pg_dump and pg_receivexlog.
---
 src/bin/initdb/initdb.c | 270 ++-
 src/common/Makefile |   2 +-
 src/common/file_utils.c | 276 
 src/include/common/file_utils.h |  22 
 src/tools/msvc/Mkvcbuild.pm |   2 +-
 5 files changed, 311 insertions(+), 261 deletions(-)
 create mode 100644 src/common/file_utils.c
 create mode 100644 src/include/common/file_utils.h

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 3350e13..e52e67d 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -61,6 +61,7 @@
 #endif
 
 #include "catalog/catalog.h"
+#include "common/file_utils.h"
 #include "common/restricted_token.h"
 #include "common/username.h"
 #include "mb/pg_wchar.h"
@@ -70,13 +71,6 @@
 #include "fe_utils/string_utils.h"
 
 
-/* Define PG_FLUSH_DATA_WORKS if we have an implementation for pg_flush_data */
-#if defined(HAVE_SYNC_FILE_RANGE)
-#define PG_FLUSH_DATA_WORKS 1
-#elif defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
-#define PG_FLUSH_DATA_WORKS 1
-#endif
-
 /* Ideally this would be in a .h file, but it hardly seems worth the trouble */
 extern const char *select_default_timezone(c

[HACKERS] 9.6 TAP tests and extensions

2016-09-12 Thread Craig Ringer

Hi all

While updating an extension for 9.6 I noticed that while the
$(prove_check) definition is exposed for use by PGXS in
Makefile.global, extensions can't actually use the TAP tests because
we don't install the required Perl modules like PostgresNode.pm.

I don't see any reason not to make this available to extension authors
and doing so is harmless, so here's a small patch to install it. I
think it's reasonable to add this to 9.6 even at this late stage; IMO
it should've been installed from the beginning. They're only installed
if --enable-tap-tests is set, since otherwise $(prove_check) will just
error out with "TAP tests not enabled" anyway.

Not having this in 9.6 will make it way harder for extension authors
to benefit from the new TAP tooling.

Another patch allows the isolation tester to be installed too, again
so that extensions can use it.

The final patch just adds a comment to src/test/Makefile to warn that
src/Makefile doesn't call it directly, because I was confused as to
why 'make -C src/test install' installed everything, but 'make
install' did not.

Sorry this is so late in the piece. It only came up when I switched
from 10.0 dev/review to updating extensions for 9.6. But it's just
adding installed files and I think it's worth doing.

Another small patch (pending) will be needed because we look for
pg_regress in the wrong place when running out-of-tree with
$(prove_installcheck).

Can't exec 
"/home/craig/projects/2Q/postgres-bdr-extension//home/craig/pg/96/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress":
No such file or directory at
/home/craig/pg/96/lib/postgresql/pgxs/src/makefiles/../../src/test/perl/TestLib.pm
line 151.

$(prove_check) won't be usable because it assumes a temp install in
./tmp_install but that's to be expected.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 1974ef1e771e28c39d5f6acb29c648e864b0f057 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Tue, 13 Sep 2016 11:06:58 +0800
Subject: [PATCH 1/3] Install the Perl TAP tests

---
 src/Makefile  |  3 ++-
 src/test/Makefile |  2 +-
 src/test/perl/GNUmakefile | 39 +++
 3 files changed, 42 insertions(+), 2 deletions(-)
 create mode 100644 src/test/perl/GNUmakefile

diff --git a/src/Makefile b/src/Makefile
index b526be7..977f80b 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -26,7 +26,8 @@ SUBDIRS = \
 	bin \
 	pl \
 	makefiles \
-	test/regress
+	test/regress \
+	test/perl
 
 # There are too many interdependencies between the subdirectories, so
 # don't attempt parallel make here.
diff --git a/src/test/Makefile b/src/test/Makefile
index 7f7754f..6b40cf5 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = regress isolation modules recovery
+SUBDIRS = perl regress isolation modules recovery
 
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
diff --git a/src/test/perl/GNUmakefile b/src/test/perl/GNUmakefile
new file mode 100644
index 000..810a712
--- /dev/null
+++ b/src/test/perl/GNUmakefile
@@ -0,0 +1,39 @@
+#-
+#
+# Makefile for src/test/perl
+#
+# Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/perl/Makefile
+#
+#-
+
+subdir = src/test/perl
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+installdirs:
+	$(MKDIR_P) '$(DESTDIR)$(pgxsdir)/$(subdir)'
+
+ifeq ($(enable_tap_tests),yes)
+
+install: all installdirs
+	$(INSTALL_PROGRAM) TestLib.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/TestLib.pm'
+	$(INSTALL_PROGRAM) SimpleTee.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/SimpleTee.pm'
+	$(INSTALL_PROGRAM) RecursiveCopy.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/RecursiveCopy.pm'
+	$(INSTALL_PROGRAM) PostgresNode.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgresNode.pm'
+
+uninstall:
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/TestLib.pm'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/SimpleTee.pm'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/RecursiveCopy.pm'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgresNode.pm'
+
+else
+
+install: ;
+
+uninstall: ;
+
+endif
-- 
2.5.5

From bc023c43e44a615aaf310e13edeea97ca8928ce6 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Tue, 13 Sep 2016 11:00:41 +0800
Subject: [PATCH 2/3] Add install rules for isolation tester

Allow 'make install' for the isolation tester to work so it can be
used from PGXS extensions.
---
 src/Makefile|  1 +
 src/test/isolation/Makefile | 11 +++
 2 files changed, 12 insertions(+)

diff --git a/src/Makefile b/src/Makefile
index 977f80b..

Re: [HACKERS] 9.6 TAP tests and extensions

2016-09-12 Thread Andres Freund

On 2016-09-13 10:54:01 +0800, Craig Ringer wrote:
> [zap]

Uhm, empty email ;)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] 9.6 TAP tests and extensions

2016-09-12 Thread Craig Ringer

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hash Indexes

2016-09-12 Thread Amit Kapila

On Tue, Sep 13, 2016 at 3:58 AM, Mark Kirkwood
 wrote:
> On 13/09/16 01:20, Jesper Pedersen wrote:
>>
>> On 09/01/2016 11:55 PM, Amit Kapila wrote:
>>>
>>> I have fixed all other issues you have raised.  Updated patch is
>>> attached with this mail.
>>>
>>
>> The following script hangs on idx_val creation - just with v5, WAL patch
>> not applied.
>
>
> Are you sure it is actually hanging? I see 100% cpu for a few minutes but
> the index eventually completes ok for me (v5 patch applied to today's
> master).
>

It completed for me as well.  The second index creation is taking more
time and cpu, because it is just inserting duplicate values which need
lot of overflow pages.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-12 Thread Kyotaro HORIGUCHI

At Thu, 8 Sep 2016 07:09:51 +, "Tsunakawa, Takayuki" 
 wrote in 
<0A3221C70F24FB45833433255569204D1F5E7D4A@G01JPEXMBYT05>
> From: pgsql-hackers-ow...@postgresql.org
> > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro
> > HORIGUCHI
> > 
> > $ time psql postgres -c 'select t.a from t, generate_series(0, )' >
> > /dev/null
> > 
> > real0m22.696s
> > user0m16.991s
> > sys 0m0.182s>
> > 
> > Using binsearch the result for the same operation was
> > real0m35.296s
> > user0m17.166s
> > sys 0m0.216s
> > 
> > Returning in UTF-8 bloats the result string by about 1.5 times so it doesn't
> > seem to make sense comparing with it. But it takes real = 47.35s.
> 
> Cool, 36% speedup!  Does this difference vary depending on the actual 
> characters used, e.g. the speedup would be greater if most of the characters 
> are ASCII?

Binsearch on JIS X 0208 always needs about 10 times of comparison
and bisecting and the radix tree requires three hops on arrays
for most of the characters and two hops for some. In sort, this
effect won't be differ among 2 and 3 byte characters in UTF-8.

The translation speed of ASCII cahracters (U+20 - U+7f) is not
affected by the character conversion mechanism. They are just
copied without conversion.

As the result, there's no speedup if the output consists only of
ASCII characters and maximum speedup when the output consists
only of 2 byte UTF-8 characters.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] IF (NOT) EXISTS in psql-completion

2016-09-12 Thread Kyotaro HORIGUCHI

Hello,

At Sat, 10 Sep 2016 07:40:16 +0200, Pavel Stehule  
wrote in 
> 2016-09-06 15:00 GMT+02:00 Kyotaro HORIGUCHI <
> horiguchi.kyot...@lab.ntt.co.jp>:
> 
> > Hello, this is the new version of this patch. Rebased on the
> > current master.
..
> > This patch consists of the following files. Since these files are
> > splitted in strange criteria and order for historical reasons,
> > I'll reorganize this and post them later.
> >
> 
> ok, can I start with testing and review with some from these files?

No problem, of couese. The reorganizing won't be a labor but will
change the spape so as to affect reviewing. Please wait for a
while.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-12 Thread Michael Paquier

On Tue, Sep 13, 2016 at 10:37 AM, Andres Freund  wrote:
> On 2016-09-13 10:35:38 +0900, Michael Paquier wrote:
>> On Sat, Sep 10, 2016 at 7:36 PM, Craig Ringer
>>  wrote:
>> > We need it for tap tests. More and more will use pg_basebackup and it'll
>> > start hurting test speeds badly.
>>
>> Ah yes, that's a good argument. hamster would suffer pretty badly
>> after that if nothing is done. I'll get an extra patch out for that,
>> with --no-sync and not --nosync by the way.
>
> FWIW, it might be better to instead use eatmydata in the cron
> invocations on such slow machines, that way we also test the fsync paths
> in them.

Er, well.. libeatmydata is only available in AUR for Archlinux x86_64,
and not in sight for Arch ARM...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] An extra error for client disconnection on Windows

2016-09-12 Thread Kyotaro HORIGUCHI

Hello, thanks for revewing and the discussion.

At Sat, 10 Sep 2016 10:44:50 -0400, Tom Lane  wrote in 
<17326.1473518...@sss.pgh.pa.us>
> Michael Paquier  writes:
> > On Fri, Sep 9, 2016 at 11:39 PM, Tom Lane  wrote:
> >> So this change would deal nicely with the "peer application on the remote
> >> host is suddenly stopped" case, at the price of being not nice about any
> >> of the other cases.  Not convinced it's a good tradeoff.
> 
> > Yes, in the list of failure cases that could trigger this error, the
> > one that looks like a problem is to me is when a network interface is
> > disabled. It may be a good idea to let users know via the logs that
> > something was connected. Could we for example log a WARNING message,
> > and not report an error?

This "error" won't be raised when any side of NIC stopped. This
error is returned when the connection was "resetted", that is,
RST packet is sent and received from the peer. "connection reset"
is regarded as just a EOF at least for read() on Linux, I don't
think there's no problem to conceal the ECONNRESET on Windows if
we are satisfied with the behavior of Linux's read(). Users won't
know of the closed connections if a client doesn't show a message
for an EOF on Linux. Users will know of them on Windows if a
program shows a message for an EOF without a message for
ECONNRESET.

In another aspect is the policy of behavior unification.

If we take a policy to try to imitate the behavior of some
reference platform (specifically Linux) on other platforms, this
is required disguising. Another potential policy on this problem
is "following the platform's behavior". From this viewpoint, this
message should be shown to users because Windows says
so. Especially for socket operations, the simultion layer is
intending the former for non-error behaviors, but I'm not sure
about the behaviors on errors.

> It isn't an "error".  These conditions get logged at COMMERROR which is
> effectively LOG_SERVER_ONLY.

If RST is not expected at the time and distinguishing it from FIN
is significant to users, it would be an "error".

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-12 Thread Andres Freund

On 2016-09-13 10:35:38 +0900, Michael Paquier wrote:
> On Sat, Sep 10, 2016 at 7:36 PM, Craig Ringer
>  wrote:
> > We need it for tap tests. More and more will use pg_basebackup and it'll
> > start hurting test speeds badly.
> 
> Ah yes, that's a good argument. hamster would suffer pretty badly
> after that if nothing is done. I'll get an extra patch out for that,
> with --no-sync and not --nosync by the way.

FWIW, it might be better to instead use eatmydata in the cron
invocations on such slow machines, that way we also test the fsync paths
in them.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-12 Thread Michael Paquier

On Sat, Sep 10, 2016 at 7:36 PM, Craig Ringer
 wrote:
> We need it for tap tests. More and more will use pg_basebackup and it'll
> start hurting test speeds badly.

Ah yes, that's a good argument. hamster would suffer pretty badly
after that if nothing is done. I'll get an extra patch out for that,
with --no-sync and not --nosync by the way.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Let file_fdw access COPY FROM PROGRAM

2016-09-12 Thread Amit Langote

On 2016/09/13 2:01, Corey Huinker wrote:
> Thanks for the review!
> 
> I agree with all the code cleanups suggested and have made then in the
> attached patch, to save the committer some time.

Thanks.  Have already marked the patch as ready for the committer.

> Also in this patch, I changed sgml para to
>  
>   Changing table-level options requires superuser privileges, for security
>   reasons: only a superuser should be able to determine which file is read
>   or which program is run.  In principle non-superusers could be allowed to
>   change the other options, but that's not supported at present.
>  
> 
> "Determine" is an odd word in this context. I understand it to mean
> "set/change", but I can see where a less familiar reader would take the
> meaning to be "has permission to see the value already set". Either way,
> it now mentions program as an option in addition to filename.

Agreed.

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical Replication WIP

2016-09-12 Thread Craig Ringer

On 13 September 2016 at 06:03, Petr Jelinek  wrote:

> Oh sure, I don't see that as big problem, the TupleData already contains
> type of the data it sends (to distinguish between nulls and text data) so
> that's mostly about adding some different type there and we'll also need
> type info in the column part of the Relation message but that should be easy
> to fence with one if for different protocol version.

The missing piece seems to be negotiation.

If a binary-aware client connects to a non-binary aware server, the
non-binary-aware server needs a way to say "you requested this option
I don't understand, go away" or "you asked for binary but I don't
support that".

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 20:15:46 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 16:56:32 -0700, Andres Freund wrote:
> >> Attached is a noticeably expanded set of tests, with fixes for the stuff
> >> you'd found.  I plan to push that soon-ish.  Independent of the approach
> >> we'll be choosing, increased coverage seems quite useful.
> 
> > And for real.
> 
> Looks good to me, please push.

Done.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> On 2016-09-12 16:56:32 -0700, Andres Freund wrote:
>> Attached is a noticeably expanded set of tests, with fixes for the stuff
>> you'd found.  I plan to push that soon-ish.  Independent of the approach
>> we'll be choosing, increased coverage seems quite useful.

> And for real.

Looks good to me, please push.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2016-09-12 Thread Vitaly Burovoy

On 9/12/16, Peter Eisentraut  wrote:
> Thank you for this extensive testing.  I will work on getting the bugs
> fixed.  Just a couple of comments on some of your points:
>
> On 9/9/16 11:45 PM, Vitaly Burovoy wrote:
>> It compiles and passes "make check" tests, but fails with "make
>> check-world" at:
>> test foreign_data ... FAILED
>
> I do not see that.  You can you show the diffs?

I can't reproduce it, it is my fault, may be I did not clean build dir.

>> 1. The standard requires "... ALTER COLUMN ... SET GENERATED { ALWAYS
>> | BY DEFAULT }" (9075-2:2011 subcl 11.20), but the patch implements it
>> as "... ALTER COLUMN ... ADD GENERATED { ALWAYS | BY DEFAULT } AS
>> IDENTITY"
>
> SET and ADD are two different things.  The SET command just changes the
> parameters of the underlying sequence.

Well... As for me ADD is used when you can add more than one property
of the same kind to a relation (e.g. column or constraint), but SET is
used when you change something and it replaces previous state (e.g.
NOT NULL, DEFAULT, STORAGE, SCHEMA, TABLESPACE etc.)

You can't set ADD more than one IDENTITY to a column, so it should be "SET".

> This can be implemented later and doesn't seem so important now.

Hmm. Now you're passing params to CreateSeqStmt because they are the same.
Is it hard to pass them to AlterSeqStmt (if there is no SET GENERATED")?

> The ADD command is not in the standard, but I needed it for pg_dump, mainly.
> I will need to document this.

Firstly, why to introduce new grammar which differs from the standard
instead of follow the standard?
Secondly, I see no troubles to follow the standard:
src/bin/pg_dump/pg_dump.c:
- "ALTER COLUMN %s ADD GENERATED ",
+ "ALTER COLUMN %s SET GENERATED ",

src/backend/parser/gram.y:
- /* ALTER TABLE  ALTER [COLUMN]  ADD GENERATED ... AS
IDENTITY ... */
- | ALTER opt_column ColId ADD_P GENERATED generated_when AS
IDENTITY_P OptParenthesizedSeqOptList
- c->options = $9;
+ /* ALTER TABLE  ALTER [COLUMN]  SET GENERATED ... */
+ | ALTER opt_column ColId SET GENERATED generated_when OptSeqOptList
- c->options = $7;

I guess "ALTER opt_column ColId SET OptSeqOptList" is easy to be
implemented, after some research "ALTER opt_column ColId RESTART [WITH
...]" also can be added.

(and reflected in the docs)

>> 14. It would be fine if psql has support of new clauses.
>
> What do you mean by that?  Tab completion?

Yes, I'm about it. Or tab completion is usually developed later?

>> 16. I think it is a good idea to not raise exceptions for "SET
>> GENERATED/DROP IDENTITY" if a column has the same type of identity/not
>> an identity. To be consistent with "SET/DROP NOT NULL".
>
> These behaviors are per SQL standard.

Can you point to concrete rule(s) in the standard?

I could not find it in ISO/IEC 9075-2 subclauses 11.20 "" and 11.21 "".
Only subclause 4.15.11 "Identity columns" says "The columns of a base
table BT can optionally include not more than one identity column."
(which you don't follow).

For instance, subclause 11.42 , General
Rules p.1 says explicitly about exception.
Or (for columns): 11.4 , General Rules p.3: "The
 in the  SHALL NOT be equivalent to
the  of any other column of T."

> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Several additional thoughts:
1. I think it is wise to add ability to set name of a sequence (as
PG's extension of the standard) to SET GENERATED or GENERATED in a
relation definition (something like CONSTRAINTs names), without it it
is hard to fix conflicts with other sequences (e.g. from serial pseudo
type) and manual changes of the sequence ("alter seq rename", "alter
seq set schema" etc.).
2. Is it useful to rename sequence linked with identity constraint
when table is renamed (similar way when sequence moves to another
schema following the linked table)?
3. You're setting OWNER to a sequence, but what about USAGE privilege
to roles have INSERT/UPDATE privileges to the table? For security
reasons table is owned by a role different from roles which are using
the table (to prevent changing its definition).

-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PATCH: Exclude additional directories in pg_basebackup

2016-09-12 Thread Michael Paquier

On Tue, Sep 13, 2016 at 3:50 AM, Peter Eisentraut
 wrote:
> Add some tests.  At least test that one entry from the directory list
> and one entry from the files list is not contained in the backup
> result.  Testing the symlink handling would also be good.  (The
> pg_basebackup tests claim that Windows doesn't support symlinks and
> therefore skip all the symlink tests.  That seems a bit at odds with
> your code handling symlinks on Windows.)

The code proposed needs to support junction points on Windows so from
this side of things everything is fine. What is lacking here is
support for symlink() in perl for Windows, and that's why the tests
are skipped.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 16:56:32 -0700, Andres Freund wrote:
> On 2016-09-12 09:14:47 -0700, Andres Freund wrote:
> > On 2016-09-12 10:19:14 -0400, Tom Lane wrote:
> > > Andres Freund  writes:
> > > 
> > > > 0001-Add-some-more-targetlist-srf-tests.patch
> > > >   Add some test.
> > > 
> > > I think you should go ahead and push this tests-adding patch now, as it
> > > adds documentation of the current behavior that is a good thing to have
> > > independently of what the rest of the patchset does.  Also, I'm worried
> > > that some of the GROUP BY tests might have machine-dependent results
> > > (if they are implemented by hashing) so it would be good to get in a few
> > > buildfarm cycles and let that settle out before we start changing the
> > > answers.
> > 
> > Generally a sound plan - I started to noticeably expand it though,
> > there's some important edge cases it didn't cover.
> 
> Attached is a noticeably expanded set of tests, with fixes for the stuff
> you'd found.  I plan to push that soon-ish.  Independent of the approach
> we'll be choosing, increased coverage seems quite useful.

And for real.
>From 3bdaab7028c0ae7cf9bea666a6e555adbc68640e Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 3 Aug 2016 18:29:42 -0700
Subject: [PATCH] Add more tests for targetlist SRFs.

We're considering changing the implementation of targetlist SRFs
considerably, and a lot of the current behaviour isn't tested in our
regression tests. Thus it seems useful to increase coverage to avoid
accidental behaviour changes.

It's quite possible that some of the plans here will require adjustments
to avoid falling afoul of ordering differences (e.g. hashed group
bys). The buildfarm will tell us.

Reviewed-By: Tom Lane
Discussion: <20160827214829.zo2dfb5jaikii...@alap3.anarazel.de>
---
 src/test/regress/expected/tsrf.out | 501 +
 src/test/regress/parallel_schedule |   2 +-
 src/test/regress/serial_schedule   |   1 +
 src/test/regress/sql/tsrf.sql  | 124 +
 4 files changed, 627 insertions(+), 1 deletion(-)
 create mode 100644 src/test/regress/expected/tsrf.out
 create mode 100644 src/test/regress/sql/tsrf.sql

diff --git a/src/test/regress/expected/tsrf.out b/src/test/regress/expected/tsrf.out
new file mode 100644
index 000..983ce17
--- /dev/null
+++ b/src/test/regress/expected/tsrf.out
@@ -0,0 +1,501 @@
+--
+-- tsrf - targetlist set returning function tests
+--
+-- simple srf
+SELECT generate_series(1, 3);
+ generate_series 
+-
+   1
+   2
+   3
+(3 rows)
+
+-- parallel iteration
+SELECT generate_series(1, 3), generate_series(3,5);
+ generate_series | generate_series 
+-+-
+   1 |   3
+   2 |   4
+   3 |   5
+(3 rows)
+
+-- parallel iteration, different number of rows
+SELECT generate_series(1, 2), generate_series(1,4);
+ generate_series | generate_series 
+-+-
+   1 |   1
+   2 |   2
+   1 |   3
+   2 |   4
+(4 rows)
+
+-- srf, with SRF argument
+SELECT generate_series(1, generate_series(1, 3));
+ generate_series 
+-
+   1
+   1
+   2
+   1
+   2
+   3
+(6 rows)
+
+-- srf, with two SRF arguments
+SELECT generate_series(generate_series(1,3), generate_series(2, 4));
+ERROR:  functions and operators can take at most one set argument
+CREATE TABLE few(id int, dataa text, datab text);
+INSERT INTO few VALUES(1, 'a', 'foo'),(2, 'a', 'bar'),(3, 'b', 'bar');
+-- SRF output order of sorting is maintained, if SRF is not referenced
+SELECT few.id, generate_series(1,3) g FROM few ORDER BY id DESC;
+ id | g 
++---
+  3 | 1
+  3 | 2
+  3 | 3
+  2 | 1
+  2 | 2
+  2 | 3
+  1 | 1
+  1 | 2
+  1 | 3
+(9 rows)
+
+-- but SRFs can be referenced in sort
+SELECT few.id, generate_series(1,3) g FROM few ORDER BY id, g DESC;
+ id | g 
++---
+  1 | 3
+  1 | 2
+  1 | 1
+  2 | 3
+  2 | 2
+  2 | 1
+  3 | 3
+  3 | 2
+  3 | 1
+(9 rows)
+
+SELECT few.id, generate_series(1,3) g FROM few ORDER BY id, generate_series(1,3) DESC;
+ id | g 
++---
+  1 | 3
+  1 | 2
+  1 | 1
+  2 | 3
+  2 | 2
+  2 | 1
+  3 | 3
+  3 | 2
+  3 | 1
+(9 rows)
+
+-- it's weird to have ORDER BYs that increase the number of results
+SELECT few.id FROM few ORDER BY id, generate_series(1,3) DESC;
+ id 
+
+  1
+  1
+  1
+  2
+  2
+  2
+  3
+  3
+  3
+(9 rows)
+
+-- SRFs are computed after aggregation
+SELECT few.dataa, count(*), min(id), max(id), unnest('{1,1,3}'::int[]) FROM few WHERE few.id = 1 GROUP BY few.dataa;
+ dataa | count | min | max | unnest 
+---+---+-+-+
+ a | 1 |   1 |   1 |  1
+ a | 1 |   1 |   1 |  1
+ a | 1 |   1 |   1 |  3
+(3 rows)
+
+-- unless referenced in GROUP BY clause
+SELECT few

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 09:14:47 -0700, Andres Freund wrote:
> On 2016-09-12 10:19:14 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > 
> > > 0001-Add-some-more-targetlist-srf-tests.patch
> > >   Add some test.
> > 
> > I think you should go ahead and push this tests-adding patch now, as it
> > adds documentation of the current behavior that is a good thing to have
> > independently of what the rest of the patchset does.  Also, I'm worried
> > that some of the GROUP BY tests might have machine-dependent results
> > (if they are implemented by hashing) so it would be good to get in a few
> > buildfarm cycles and let that settle out before we start changing the
> > answers.
> 
> Generally a sound plan - I started to noticeably expand it though,
> there's some important edge cases it didn't cover.

Attached is a noticeably expanded set of tests, with fixes for the stuff
you'd found.  I plan to push that soon-ish.  Independent of the approach
we'll be choosing, increased coverage seems quite useful.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 19:35:22 -0400, Tom Lane wrote:
> >> You're inventing objections.
> 
> > Heh, it's actually your own objection ;)
> > http://archives.postgresql.org/message-id/32673.1464023429%40sss.pgh.pa.us
> 
> I'm changing my opinion in the light of unfavorable evidence.  Is that
> wrong?

Nah, not at all.  I was just referring to "inventing".

> > I wonder how much duplication we'd end up between nodeFunctionscan.c and
> > nodeSRF (or whatever). We'd need the latter node to support ValuePerCall
> > in an non-materializing fashion as well.  Could we combine them somehow?
> 
> Yeah, I was wondering that too.  I doubt that we want to make one node
> type do both things --- the fact that Result comes in two flavors with
> different semantics (with or without an input node) isn't very nice IMO,
> and this would be almost that identical case.

It might not, agreed. That imo has to be taken into acount calculating
the code costs - although the executor stuff usually is pretty boring in
comparison.

> But maybe they could share
> some code at the level of ExecMakeTableFunctionResult.  (I've not looked
> at your executor changes yet, not sure how much of that still exists.)

I'd split ExecMakeTableFunctionResult up, to allow for a percall mode,
and that seems like it'd still be needed to avoid performance
regressions.

It'd be an awfully large amount of code those two nodes would
duplicate. Excepting different rescan logic and ORDINALITY support,
nearly all the nodeFunctionscan.c code would be needed in both nodes.

> Anyway I'll draft a prototype and then we can compare.

Ok, cool.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> On 2016-09-12 18:35:03 -0400, Tom Lane wrote:
>> You're inventing objections.

> Heh, it's actually your own objection ;)
> http://archives.postgresql.org/message-id/32673.1464023429%40sss.pgh.pa.us

I'm changing my opinion in the light of unfavorable evidence.  Is that
wrong?

>> It won't require that any more than the
>> LATERAL approach does; it's basically the same code as whatever
>> nodeFunctionscan is going to do, but packaged as a pipeline eval node
>> rather than a base scan node.

> That might work.  It gets somewhat nasty though because you also need to
> handle, SRF arguments to SRFs. And those again can contain nearly
> arbitrary expressions inbetween. With the ROWS FROM approach that can be
> fairly easily handled via LATERAL.  I guess what we could do here is to
> use one pipeline node to evaluate all the argument SRFs, and then
> another for the outer expression. Where the outer node would evaluate
> the SRF arguments using normal expression evaluation, with the inner SRF
> output replaced by a var.

Right.  Nested SRFs translate to multiple ROWS-FROM RTEs with lateral
references in the one approach, and nested Result-thingies in the other.
It's pretty much the same thing mutatis mutandis, but I think it will
likely be a lot easier to get there from here with the Result-based
approach --- for example, we don't have to worry about forcing lateral
join order, and the ordering constraints vis-a-vis GROUP BY etc won't take
any great effort either.  Anyway I think it is worth trying.

> I wonder how much duplication we'd end up between nodeFunctionscan.c and
> nodeSRF (or whatever). We'd need the latter node to support ValuePerCall
> in an non-materializing fashion as well.  Could we combine them somehow?

Yeah, I was wondering that too.  I doubt that we want to make one node
type do both things --- the fact that Result comes in two flavors with
different semantics (with or without an input node) isn't very nice IMO,
and this would be almost that identical case.  But maybe they could share
some code at the level of ExecMakeTableFunctionResult.  (I've not looked
at your executor changes yet, not sure how much of that still exists.)

> I don't think the code for adding these intermediate SRF evaluating
> nodes will be noticeably simpler than what's in my prototype. We'll
> still have to do the whole conversion recursively, we'll still need
> complexity of figuring out whether to put those SRFs evaluations
> before/after group by, order by, distinct on and window functions.

I think it will slot into the code that's already there rather more
easily than what you've done, because we already *have* code that makes
decisions in that form.  We just need to teach it to break down what
it now thinks of as a single projection step into N+1 steps when there
are N levels of nested SRF present.  Anyway I'll draft a prototype and
then we can compare.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

Hi,

On 2016-09-12 18:35:03 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I don't think it'd be all that hard to add something like the current
> > LCM behaviour into nodeFunctionscan.c if we really wanted. But I think
> > it'll be better to just say no here.
> 
> "Just say no" soon translates to memes about "disasters like the removal
> of implicit casting" (which in fact is not what 8.3 did, but I've grown
> pretty damn tired of the amount of bitching that that cleanup did and
> still does provoke).  In any case, it feels like the LATERAL approach is
> locking us into more and subtler incompatibilities than just that one.

I can't see those being equivalent impact-wise. Note that the person
bitching most loudly about the "implicit casting" thing (Merlin Moncure)
is voting to remove the LCM behaviour ;)

I think we'll continue to get more bitching about the confusing LCM
behaviour than complaints the backward compat break would generate.

> >> If we go with a Result-like tSRF evaluation node, then whether we change
> >> semantics or not becomes mostly a matter of what that node does.  It could
> >> become basically a wrapper around the existing ExecTargetList() logic if
> >> we needed to provide backwards-compatible behavior.
> 
> > As you previously objected: If we keep ExecTargetList() style logic, we
> > need to keep most of execQual.c's handling of ExprMultipleResult et al,
> > and that's going to prevent the stuff I want to work on.
> 
> You're inventing objections.

Heh, it's actually your own objection ;)

http://archives.postgresql.org/message-id/32673.1464023429%40sss.pgh.pa.us

> It won't require that any more than the
> LATERAL approach does; it's basically the same code as whatever
> nodeFunctionscan is going to do, but packaged as a pipeline eval node
> rather than a base scan node.  Or to be clearer: what I'm suggesting it
> would contain is ExecTargetList's logic about restarting individual SRFs.
> That wouldn't propagate into execQual because we would only allow SRFs at
> the top level of the node's tlist, just like nodeFunctionscan does.
> ExecMakeTableFunctionResult doesn't require the generic execQual code
> to support SRFs today, and it still wouldn't.

(it accidentally does (see your cast example), but that that's besides
your point)

That might work.  It gets somewhat nasty though because you also need to
handle, SRF arguments to SRFs. And those again can contain nearly
arbitrary expressions inbetween. With the ROWS FROM approach that can be
fairly easily handled via LATERAL.  I guess what we could do here is to
use one pipeline node to evaluate all the argument SRFs, and then
another for the outer expression. Where the outer node would evaluate
the SRF arguments using normal expression evaluation, with the inner SRF
output replaced by a var.

I wonder how much duplication we'd end up between nodeFunctionscan.c and
nodeSRF (or whatever). We'd need the latter node to support ValuePerCall
in an non-materializing fashion as well.  Could we combine them somehow?

> In larger terms: the whole point here is to fish SRF calls up to the
> top level of the tlist of whatever node is executing them, where they
> can be special-cased by that node.  Their SRF-free argument
> expressions would be evaluated by generic execQual.  AFAICS this goes
> through in the same way from the executor's viewpoint whether we use
> LATERAL as the query restructuring method or a SRF-capable variant of
> Result.  But it's now looking to me like the latter would be a lot
> simpler from the point of view of planner complexity, and in
> particular from the point of view of proving correctness (equivalence
> of the query transformation).

It's nicer not to introduce subqueries from a two angles from my pov:
1) EXPLAINs will look more like the original query
2) The evaluation order of the non-srf part of the query, and the query
   itself, will be easier. That's by the thing I'm least happy with the
   LATERAL approach.

I don't think the code for adding these intermediate SRF evaluating
nodes will be noticeably simpler than what's in my prototype. We'll
still have to do the whole conversion recursively, we'll still need
complexity of figuring out whether to put those SRFs evaluations
before/after group by, order by, distinct on and window functions.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> On 2016-09-12 17:36:07 -0400, Tom Lane wrote:
>> Um, I dunno.  You've added half a thousand lines of not-highly-readable-
>> nor-extensively-commented code to the planner; that certainly reaches *my*
>> threshold of pain.

> Well, I certainly plan (and started to) make that code easier to
> understand, and better commented.  It also removes ~1400 LOC of not easy
> to understand code... A good chunk of that'd would also be removed with
> a Result style approach, but far from all.

Hm, I've not studied 0006 yet, but surely that's executor code that would
go away with *any* approach that takes away the need for generic execQual
to support SRFs?  I don't see that it counts while discussing which way
we take to reach that point.

>> I'm also growing rather concerned that the LATERAL
>> approach is going to lock us into some unremovable incompatibilities
>> no matter how much we might regret that later (and in view of how quickly
>> I got my wrist slapped in <001201d18524$f84c4580$e8e4d080$@pcorp.us>,
>> I am afraid there may be more pushback awaiting us than we think).

> I don't think it'd be all that hard to add something like the current
> LCM behaviour into nodeFunctionscan.c if we really wanted. But I think
> it'll be better to just say no here.

"Just say no" soon translates to memes about "disasters like the removal
of implicit casting" (which in fact is not what 8.3 did, but I've grown
pretty damn tired of the amount of bitching that that cleanup did and
still does provoke).  In any case, it feels like the LATERAL approach is
locking us into more and subtler incompatibilities than just that one.

>> If we go with a Result-like tSRF evaluation node, then whether we change
>> semantics or not becomes mostly a matter of what that node does.  It could
>> become basically a wrapper around the existing ExecTargetList() logic if
>> we needed to provide backwards-compatible behavior.

> As you previously objected: If we keep ExecTargetList() style logic, we
> need to keep most of execQual.c's handling of ExprMultipleResult et al,
> and that's going to prevent the stuff I want to work on.

You're inventing objections.  It won't require that any more than the
LATERAL approach does; it's basically the same code as whatever
nodeFunctionscan is going to do, but packaged as a pipeline eval node
rather than a base scan node.  Or to be clearer: what I'm suggesting it
would contain is ExecTargetList's logic about restarting individual SRFs.
That wouldn't propagate into execQual because we would only allow SRFs at
the top level of the node's tlist, just like nodeFunctionscan does.
ExecMakeTableFunctionResult doesn't require the generic execQual code
to support SRFs today, and it still wouldn't.

In larger terms: the whole point here is to fish SRF calls up to the top
level of the tlist of whatever node is executing them, where they can be
special-cased by that node.  Their SRF-free argument expressions would be
evaluated by generic execQual.  AFAICS this goes through in the same way
from the executor's viewpoint whether we use LATERAL as the query
restructuring method or a SRF-capable variant of Result.  But it's now
looking to me like the latter would be a lot simpler from the point of
view of planner complexity, and in particular from the point of view of
proving correctness (equivalence of the query transformation).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hash Indexes

2016-09-12 Thread Mark Kirkwood


On 13/09/16 01:20, Jesper Pedersen wrote:

On 09/01/2016 11:55 PM, Amit Kapila wrote:

I have fixed all other issues you have raised.  Updated patch is
attached with this mail.



The following script hangs on idx_val creation - just with v5, WAL patch
not applied.


Are you sure it is actually hanging? I see 100% cpu for a few minutes 
but the index eventually completes ok for me (v5 patch applied to 
today's master).


Cheers

Mark



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 17:36:07 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
> >> Stepping back a little bit ... way back at the start of this thread
> >> you muttered about possibly implementing tSRFs as a special pipeline
> >> node type, a la Result.  That would have the same benefits in terms
> >> of being able to take SRF support out of the main execQual code paths.
> >> I, and I think some other people, felt that the LATERAL approach would
> >> be a cleaner answer --- but now that we're seeing some of the messy
> >> details required to make the LATERAL way work, I'm beginning to have
> >> second thoughts.  I wonder if we should do at least a POC implementation
> >> of the other way to get a better fix on which way is really cleaner.
> 
> > I'm not particularly in love in restarting with a different approach. I
> > think fixing the ROWS FROM expansion is the only really painful bit, and
> > that seems like it's independently beneficial to allow for suppression
> > of expansion there.
> 
> Um, I dunno.  You've added half a thousand lines of not-highly-readable-
> nor-extensively-commented code to the planner; that certainly reaches *my*
> threshold of pain.

Well, I certainly plan (and started to) make that code easier to
understand, and better commented.  It also removes ~1400 LOC of not easy
to understand code... A good chunk of that'd would also be removed with
a Result style approach, but far from all.

> I'm also growing rather concerned that the LATERAL
> approach is going to lock us into some unremovable incompatibilities
> no matter how much we might regret that later (and in view of how quickly
> I got my wrist slapped in <001201d18524$f84c4580$e8e4d080$@pcorp.us>,
> I am afraid there may be more pushback awaiting us than we think).

I don't think it'd be all that hard to add something like the current
LCM behaviour into nodeFunctionscan.c if we really wanted. But I think
it'll be better to just say no here.

> If we go with a Result-like tSRF evaluation node, then whether we change
> semantics or not becomes mostly a matter of what that node does.  It could
> become basically a wrapper around the existing ExecTargetList() logic if
> we needed to provide backwards-compatible behavior.

As you previously objected: If we keep ExecTargetList() style logic, we
need to keep most of execQual.c's handling of ExprMultipleResult et al,
and that's going to prevent the stuff I want to work on. Because
handling ExprMultipleResult in all these places is a serious issue
WRT making expression evaluation faster.  If we find a good answer to
that, I'd be more open to pursuing this approach.

> > I actually had started to work on a Result style approach, and I don't
> > think it turned out that nice. But I didn't complete it, so I might just
> > be wrong.
> 
> It's also possible that it's easier now in the post-pathification code
> base than it was before.  After contemplating my navel for awhile, it
> seems like the core of the planner code for a Result-like approach would
> be something very close to make_group_input_target(), plus a thing like
> pull_var_clause() that extracts SRFs rather than Vars.  Those two
> functions, including their lengthy comments, weigh in at ~250 lines.
> Admittedly there'd be some boilerplate on top of that, if we invent a
> new plan node type rather than extending Result, but not all that much.

That's pretty much what I did (or rather started to do), yes.  I had
something that was called from grouping_planner() that added the new
node ontop of the original set of paths, after grouping or after
distinct, depending on where SRFs were referenced.  The biggest benefit
I saw with that is that there's no need to push things into a subquery,
and that the ordering is a lot more explicit.

> If you like, I'll have a go at drafting a patch in that style.  Do you
> happen to still have the executor side of what you did, so I don't have
> to reinvent that?

The executor side is actually what I found harder here. Either we end up
keeping most of ExecQual's handling, or we reinvent a good deal of
separate logic.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical Replication WIP

2016-09-12 Thread Petr Jelinek




On 12/09/16 22:21, Andres Freund wrote:

On 2016-09-12 21:57:39 +0200, Petr Jelinek wrote:

On 12/09/16 21:54, Andres Freund wrote:

On 2016-09-12 21:47:08 +0200, Petr Jelinek wrote:

On 09/09/16 06:33, Peter Eisentraut wrote:

The start_replication option pg_version option is not documented and
not used in any later patch.  We can probably do without it and just
rely on the protocol version.



That's leftover from binary type data transfer which is not part of this
initial approach as it adds a lot of complications to both protocol and
apply side. So yes can do without.


FWIW, I don't think we can leave this out of the initial protocol
design. We don't have to implement it, but it has to be part of the
design.



I don't think it's a good idea to have unimplemented parts of protocol, we
have protocol version so it can be added in v2 when we have code that is
able to handle it.


I don't think we have to have it part of the protocol. But it has to be
forseen, otherwise introducing it later will end up requiring more
invasive changes than acceptable. I don't want to repeat the "libpq v3
protocol" evolution story here.



Oh sure, I don't see that as big problem, the TupleData already contains 
type of the data it sends (to distinguish between nulls and text data) 
so that's mostly about adding some different type there and we'll also 
need type info in the column part of the Relation message but that 
should be easy to fence with one if for different protocol version.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
>> Stepping back a little bit ... way back at the start of this thread
>> you muttered about possibly implementing tSRFs as a special pipeline
>> node type, a la Result.  That would have the same benefits in terms
>> of being able to take SRF support out of the main execQual code paths.
>> I, and I think some other people, felt that the LATERAL approach would
>> be a cleaner answer --- but now that we're seeing some of the messy
>> details required to make the LATERAL way work, I'm beginning to have
>> second thoughts.  I wonder if we should do at least a POC implementation
>> of the other way to get a better fix on which way is really cleaner.

> I'm not particularly in love in restarting with a different approach. I
> think fixing the ROWS FROM expansion is the only really painful bit, and
> that seems like it's independently beneficial to allow for suppression
> of expansion there.

Um, I dunno.  You've added half a thousand lines of not-highly-readable-
nor-extensively-commented code to the planner; that certainly reaches *my*
threshold of pain.  I'm also growing rather concerned that the LATERAL
approach is going to lock us into some unremovable incompatibilities
no matter how much we might regret that later (and in view of how quickly
I got my wrist slapped in <001201d18524$f84c4580$e8e4d080$@pcorp.us>,
I am afraid there may be more pushback awaiting us than we think).
If we go with a Result-like tSRF evaluation node, then whether we change
semantics or not becomes mostly a matter of what that node does.  It could
become basically a wrapper around the existing ExecTargetList() logic if
we needed to provide backwards-compatible behavior.

> I actually had started to work on a Result style approach, and I don't
> think it turned out that nice. But I didn't complete it, so I might just
> be wrong.

It's also possible that it's easier now in the post-pathification code
base than it was before.  After contemplating my navel for awhile, it
seems like the core of the planner code for a Result-like approach would
be something very close to make_group_input_target(), plus a thing like
pull_var_clause() that extracts SRFs rather than Vars.  Those two
functions, including their lengthy comments, weigh in at ~250 lines.
Admittedly there'd be some boilerplate on top of that, if we invent a
new plan node type rather than extending Result, but not all that much.

If you like, I'll have a go at drafting a patch in that style.  Do you
happen to still have the executor side of what you did, so I don't have
to reinvent that?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] identity columns

2016-09-12 Thread Peter Eisentraut

Thank you for this extensive testing.  I will work on getting the bugs
fixed.  Just a couple of comments on some of your points:

On 9/9/16 11:45 PM, Vitaly Burovoy wrote:
> It compiles and passes "make check" tests, but fails with "make check-world" 
> at:
> test foreign_data ... FAILED

I do not see that.  You can you show the diffs?

> 1. The standard requires "... ALTER COLUMN ... SET GENERATED { ALWAYS
> | BY DEFAULT }" (9075-2:2011 subcl 11.20), but the patch implements it
> as "... ALTER COLUMN ... ADD GENERATED { ALWAYS | BY DEFAULT } AS
> IDENTITY"

SET and ADD are two different things.  The SET command just changes the
parameters of the underlying sequence.  This can be implemented later
and doesn't seem so important now.  The ADD command is not in the
standard, but I needed it for pg_dump, mainly.  I will need to document
this.

> 14. It would be fine if psql has support of new clauses.

What do you mean by that?  Tab completion?

> 16. I think it is a good idea to not raise exceptions for "SET
> GENERATED/DROP IDENTITY" if a column has the same type of identity/not
> an identity. To be consistent with "SET/DROP NOT NULL".

These behaviors are per SQL standard.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] feature request: explain "with details" option

2016-09-12 Thread Tom Lane

Robert Haas  writes:
> I think the only part of this that would be really brutal to try to
> represent is alternative join orders.  I see no reasonable way for
> EXPLAIN to output useful information about what other join orders were
> considered and why they were not chosen; the only thing that seems
> like it would help in that case would be an easy way to force the
> exact join order you want and then see what the plan looks like.

That exists today: write your query as a nest of explicit JOIN syntax
and set join_collapse_limit = 1.  Not sure that it's really all that
useful, though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical Replication WIP

2016-09-12 Thread Andres Freund

On 2016-09-12 21:57:39 +0200, Petr Jelinek wrote:
> On 12/09/16 21:54, Andres Freund wrote:
> > On 2016-09-12 21:47:08 +0200, Petr Jelinek wrote:
> > > On 09/09/16 06:33, Peter Eisentraut wrote:
> > > > The start_replication option pg_version option is not documented and
> > > > not used in any later patch.  We can probably do without it and just
> > > > rely on the protocol version.
> > > > 
> > > 
> > > That's leftover from binary type data transfer which is not part of this
> > > initial approach as it adds a lot of complications to both protocol and
> > > apply side. So yes can do without.
> > 
> > FWIW, I don't think we can leave this out of the initial protocol
> > design. We don't have to implement it, but it has to be part of the
> > design.
> > 
> 
> I don't think it's a good idea to have unimplemented parts of protocol, we
> have protocol version so it can be added in v2 when we have code that is
> able to handle it.

I don't think we have to have it part of the protocol. But it has to be
forseen, otherwise introducing it later will end up requiring more
invasive changes than acceptable. I don't want to repeat the "libpq v3
protocol" evolution story here.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] feature request: explain "with details" option

2016-09-12 Thread Robert Haas

On Fri, Sep 9, 2016 at 12:35 AM, Tom Lane  wrote:
> Craig Ringer  writes:
>> On 9 September 2016 at 01:40, Roger Pack  wrote:
>>> Today's explain tells us what loops and scans were used, and relative
>>> costs, etc.  It doesn't seem to tell *why* the planner elected to use
>>> what it did.
>
>> One thing that's been discussed here is to have a way to see which
>> potential plans are rejected and compare their costs.
>
>> This isn't simple because there are often *lots* of variants. You
>> don't just want to see the "top 10" candidate plans, because they're
>> probably a bunch of small variants on the same plan; the ones you'll
>> be interested in will probably be very different plans with very bad
>> relative estimates.
>
> The other big problem here is that the planner tries *very* hard to reject
> losing paths early; so it does not even form an explainable plan for a
> large fraction of the search space.  (And if it did, you'd be dead before
> you got your EXPLAIN result back.)
>
> People have experimented with making the planner log every candidate path
> before the path enters the comparison tournament (and, typically, doesn't
> survive the first round).  But I've never seen any version of that that
> I thought would be intelligible to non-experts.  It's exceedingly verbose
> and it certainly doesn't look anything like what we know as EXPLAIN output.

What I've observed when troubleshooting plan selection is that you
often want to change the planner's choice in one particular part of
the plan and see what happens - e.g. force a {sequential scan, index
scan using index X, bitmap index scan using index X, index-only scan
using index X} on a particular base relation, or force a {hash, merge,
nested loop} join between X and Y, possibly constraining which is the
inner side and which is the outer side.  Or, also fairly commonly, I
just want to know what other paths were generated at a given level of
the plan tree and what their estimate costs were.

Of course, at the risk of starting a flame war, query hints would be
rather useful here.  You'd be able to force the plan you want not
because forcing a plan choice is a great way to run a production
system (though somebody might want to do that, of course) but to debug
why you're not getting that plan.  Right now, the only tools we have
for this sort of thing are the enable_* GUCs and twiddling the cost
values, and that's OK for simple plans but for complex plans involving
multiple tables it's a pretty blunt instrument and it's often tedious
to understand exactly what made the planner do what it did.  I for one
would welcome a better way to elicit EXPLAIN (THE_PLAN_I_WANTED).

Another thing that would be useful is, for each base relation, save
all of the access paths and the costs of each; and for each join
relation, save the cheapest cost for each join method.  So if you have
a join between A, B, and C, you can see all of the possible access
methods and their costs for A, B, and C; plus, for the join order
actually chosen (but not any alternative join order), you can see
whether other join methods were judged feasible and if so what their
cost would have been given the actually-surviving paths for the
underlying relations.  So for a two-way join you might get something
like:

Hash Join
  Considered:
Merge Join (cost=...)
Nested Loop (cost=...)
  -> Seq Scan
  Considered:
  Index Scan on ... (cost=...)
  -> Hash
-> Seq Scan
   Considered:
  Index Scan on  (cost=...)

Of course, this wouldn't tell you everything you could possibly want
to know, but it would let you answer a lot of common questions like
"how much slower would it have been to use the index" (or "the other
index") and "were other join methods considered too expensive or were
they not even considered because the planner thinks they're not usable
here?" and "how much more expensive does the planner think that a hash
join would have been than the nested loop actually chosen?".

I think the only part of this that would be really brutal to try to
represent is alternative join orders.  I see no reasonable way for
EXPLAIN to output useful information about what other join orders were
considered and why they were not chosen; the only thing that seems
like it would help in that case would be an easy way to force the
exact join order you want and then see what the plan looks like.  Even
that's not totally perfect because sometimes there are a bunch of join
orders that are essentially interchangeable and what you really want
to know is whether the planner considered a join order that's
materially different, but the planner makes no such distinction
internally.  At any rate, I don't think the fact that it's difficult
or impossible to provide information about join orders should deter us
from having a way to display the stuff we can get our hands around.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hacker

Re: [HACKERS] Logical Replication WIP

2016-09-12 Thread Petr Jelinek


On 12/09/16 21:54, Andres Freund wrote:

On 2016-09-12 21:47:08 +0200, Petr Jelinek wrote:

On 09/09/16 06:33, Peter Eisentraut wrote:

The start_replication option pg_version option is not documented and
not used in any later patch.  We can probably do without it and just
rely on the protocol version.



That's leftover from binary type data transfer which is not part of this
initial approach as it adds a lot of complications to both protocol and
apply side. So yes can do without.


FWIW, I don't think we can leave this out of the initial protocol
design. We don't have to implement it, but it has to be part of the
design.



I don't think it's a good idea to have unimplemented parts of protocol, 
we have protocol version so it can be added in v2 when we have code that 
is able to handle it.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] inappropriate use of NameGetDatum macro

2016-09-12 Thread Tom Lane

Mark Dilger  writes:
> there are several places in the code where variables defined as
> (char *) or as (const char *) are passed to the NameGetDatum()
> macro.  I believe it would be better form to use CStringGetDatum()
> in these locations.  I am aware that these two macros are internally
> the same.

Hm, I agree, this feels wrong.  I suppose you could argue that the
called functions are expecting Name pointers not CString pointers,
but that type cheat is happening anyway.  It would be better form
to explicitly pass a CString datum if that's what we're passing.

I'm tempted to propose that we redefine NameGetDatum as

#define NameGetDatum(X) CStringGetDatum(NameStr(*(X)))

which should do the same thing at runtime, but would result in a
compile error if what's passed isn't declared as Name (or NameData*).
This would be asymmetrical with the way DatumGetName looks, though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical Replication WIP

2016-09-12 Thread Andres Freund

On 2016-09-12 21:47:08 +0200, Petr Jelinek wrote:
> On 09/09/16 06:33, Peter Eisentraut wrote:
> > The start_replication option pg_version option is not documented and
> > not used in any later patch.  We can probably do without it and just
> > rely on the protocol version.
> > 
> 
> That's leftover from binary type data transfer which is not part of this
> initial approach as it adds a lot of complications to both protocol and
> apply side. So yes can do without.

FWIW, I don't think we can leave this out of the initial protocol
design. We don't have to implement it, but it has to be part of the
design.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical Replication WIP

2016-09-12 Thread Petr Jelinek


On 09/09/16 06:33, Peter Eisentraut wrote:

Review of 0003-Define-logical-replication-protocol-and-output-plugi.patch:

(This is still based on the Aug 31 patch set, but at quick glance I
didn't see any significant changes in the Sep 8 set.)



Yep.


The start_replication option pg_version option is not documented and
not used in any later patch.  We can probably do without it and just
rely on the protocol version.



That's leftover from binary type data transfer which is not part of this 
initial approach as it adds a lot of complications to both protocol and 
apply side. So yes can do without.



In pgoutput_startup(), you check opt->output_type.  But it is not set
anywhere.  Actually, the startup callback is supposed to set it
itself.


Leftover from pglogical which actually supports both output types.


In init_rel_sync_cache(), the way hash_flags is set seems kind of
weird.  I think that variable could be removed and the flags put
directly into the hash_create() call.



Eh, yes no idea how that came to be.


pgoutput_config.c seems over-engineered, e.g., converting cstring to
Datum and back.  Just do normal DefElem list parsing in pgoutput.c.
That's not pretty either, but at least it's a common coding pattern.



Yes now that we have only couple of options I agree.


In the protocol documentation, explain the meaning of int64 as a
commit timestamp.



You mean that it's milliseconds since postgres epoch?


On the actual protocol messages:

Why do strings have a length byte?  That is not how other strings in
the protocol work.  As a minor side-effect, this would limit for
example column names to 255 characters.


Because I originally sent them without the null termination but I guess 
they don't really need it anymore. (the 255 char limit is not really 
important in practice given the column length is limited to 64 
characters anyway)




The message structure doesn't match the documentation in some ways.
For example Attributes and TupleData are not separate messages but are
contained in Relation and Insert/Update/Delete messages.  So the
documentation needs to be structured a bit differently.

In the Attributes message (or actually Relation message), we don't
need the 'A' and 'C' bytes.



Hmm okay will look into it. I guess if we remove the 'A' then rest of 
the Attribute message neatly merges into the Relation message. The more 
interesting part will be the TupleData as it's common part of other 
messages.



I'm not sure that pgoutput should concern itself with the client
encoding.  The client encoding should already be set by the initial
FE/BE protocol handshake.  I haven't checked that further yet, so it
might already work, or it should be made to work that way, or I might
be way off.


Yes, I think you are right, that was there mostly for same reason as the 
pg_version.




Slight abuse of pqformat functions.  We're not composing messages
using pq_beginmessage()/pq_endmessage(), and we're not using
pq_getmsgend() when reading.  The "proper" way to do this is probably
to define a custom set of PQcommMethods.  (low priority)



If we change that, I'd probably rather go with direct use of StringInfo 
functions.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] inappropriate use of NameGetDatum macro

2016-09-12 Thread Mark Dilger

Friends,

there are several places in the code where variables defined as
(char *) or as (const char *) are passed to the NameGetDatum()
macro.  I believe it would be better form to use CStringGetDatum()
in these locations.  I am aware that these two macros are internally
the same.

src/backend/commands/proclang.c, line 466.
src/backend/commands/dbcommands.c, lines 1263, 1489, 1606, 1746.

Am I overlooking some reason why the code is correct as is? If not,
I am attaching a patch that applies cleanly for me against master,
compiles, and passes the regression tests.

Thanks,

Mark Dilger



NameGetDatum.patch.1
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup wish list

2016-09-12 Thread Tom Lane

Peter Eisentraut  writes:
> On 8/19/16 1:04 AM, Masahiko Sawada wrote:
>> I agree with adding this as an option and removing directory by default.
>> And it looks good to me except for missing new line in usage output.
>> 
>> printf(_("  -l, --label=LABEL  set backup label\n"));
>> +   printf(_("  -n, --noclean  do not clean up after errors"));
>> printf(_("  -P, --progress show progress information\n"));

> Committed with that fix.  Thanks.

Hm, there was just a kerfuffle about spelling things like this
"--no-clean" etc.  I wasn't on board with removing existing spellings,
but surely all newly added switches should use the "no-" spelling?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup wish list

2016-09-12 Thread Peter Eisentraut

On 8/19/16 1:04 AM, Masahiko Sawada wrote:
> I agree with adding this as an option and removing directory by default.
> And it looks good to me except for missing new line in usage output.
> 
> printf(_("  -l, --label=LABEL  set backup label\n"));
> +   printf(_("  -n, --noclean  do not clean up after errors"));
> printf(_("  -P, --progress show progress information\n"));

Committed with that fix.  Thanks.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Tuplesort merge pre-reading

2016-09-12 Thread Peter Geoghegan

On Mon, Sep 12, 2016 at 12:07 PM, Heikki Linnakangas  wrote:
> Here's a fixed version. I'll go through Peter's comments and address those,
> but I don't think there was anything there that should affect performance
> much, so I think you can proceed with your benchmarking with this version.
> (You'll also need to turn off assertions for that!)

I agree that it's unlikely that addressing any of my feedback will
result in any major change to performance.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Tuplesort merge pre-reading

2016-09-12 Thread Heikki Linnakangas


On 09/12/2016 06:47 PM, Claudio Freire wrote:

On Mon, Sep 12, 2016 at 12:02 PM, Claudio Freire  wrote:

On Sun, Sep 11, 2016 at 12:47 PM, Heikki Linnakangas  wrote:

Here's a new version of these patches, rebased over current master. I
squashed the two patches into one, there's not much point to keep them
separate.


I don't know what was up with the other ones, but this one works fine.

Benchmarking now.


I spoke too soon, git AM had failed and I didn't notice.

regression.diffs attached

Built with

./configure --enable-debug --enable-cassert && make clean && make -j7
&& make check


Ah, of course! I had been building without assertions, as I was doing 
performance testing. With --enable-cassert, it failed for me as well 
(and there was even a compiler warning pointing out one of the issues). 
Sorry about that.


Here's a fixed version. I'll go through Peter's comments and address 
those, but I don't think there was anything there that should affect 
performance much, so I think you can proceed with your benchmarking with 
this version. (You'll also need to turn off assertions for that!)


- Heikki

>From 6101a4b91f537bf483059b0b6e8ff13d6e7be9fa Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Mon, 12 Sep 2016 22:02:34 +0300
Subject: [PATCH 1/1] Change the way pre-reading in external sort's merge phase
 works.

Don't pre-read tuples into SortTuple slots during merge. Instead, use the
memory for larger read buffers in logtape.c. We're doing the same number
of READTUP() calls either way, but managing the pre-read SortTuple slots
is much more complicated. Also, the on-tape representation is more compact
than SortTuples, so we can fit more pre-read tuples into the same amount
of memory this way. And we have better cache-locality, when we use just a
small number of SortTuple slots.

Now that we only hold one tuple from each tape in the SortTuple slots, we
can greatly simplify the "batch memory" management. We now maintain a
small set of fixed-sized buffers, to hold the tuples, and fall back to
palloc() for larger tuples. We use this method during all merge phases,
not just the final merge.
---
 src/backend/utils/sort/logtape.c   | 140 +-
 src/backend/utils/sort/tuplesort.c | 996 +++--
 src/include/utils/logtape.h|   1 +
 3 files changed, 399 insertions(+), 738 deletions(-)

diff --git a/src/backend/utils/sort/logtape.c b/src/backend/utils/sort/logtape.c
index 7745207..3377cef 100644
--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -79,6 +79,7 @@
 
 #include "storage/buffile.h"
 #include "utils/logtape.h"
+#include "utils/memutils.h"
 
 /*
  * Block indexes are "long"s, so we can fit this many per indirect block.
@@ -131,9 +132,12 @@ typedef struct LogicalTape
 	 * reading.
 	 */
 	char	   *buffer;			/* physical buffer (separately palloc'd) */
+	int			buffer_size;	/* allocated size of the buffer */
 	long		curBlockNumber; /* this block's logical blk# within tape */
 	int			pos;			/* next read/write position in buffer */
 	int			nbytes;			/* total # of valid bytes in buffer */
+
+	int			read_buffer_size;	/* buffer size to use when reading */
 } LogicalTape;
 
 /*
@@ -228,6 +232,53 @@ ltsReadBlock(LogicalTapeSet *lts, long blocknum, void *buffer)
 }
 
 /*
+ * Read as many blocks as we can into the per-tape buffer.
+ *
+ * The caller can specify the next physical block number to read, in
+ * datablocknum, or -1 to fetch the next block number from the internal block.
+ * If datablocknum == -1, the caller must've already set curBlockNumber.
+ *
+ * Returns true if anything was read, 'false' on EOF.
+ */
+static bool
+ltsReadFillBuffer(LogicalTapeSet *lts, LogicalTape *lt, long datablocknum)
+{
+	lt->pos = 0;
+	lt->nbytes = 0;
+
+	do
+	{
+		/* Fetch next block number (unless provided by caller) */
+		if (datablocknum == -1)
+		{
+			datablocknum = ltsRecallNextBlockNum(lts, lt->indirect, lt->frozen);
+			if (datablocknum == -1L)
+break;			/* EOF */
+			lt->curBlockNumber++;
+		}
+
+		/* Read the block */
+		ltsReadBlock(lts, datablocknum, (void *) (lt->buffer + lt->nbytes));
+		if (!lt->frozen)
+			ltsReleaseBlock(lts, datablocknum);
+
+		if (lt->curBlockNumber < lt->numFullBlocks)
+			lt->nbytes += BLCKSZ;
+		else
+		{
+			/* EOF */
+			lt->nbytes += lt->lastBlockBytes;
+			break;
+		}
+
+		/* Advance to next block, if we have buffer space left */
+		datablocknum = -1;
+	} while (lt->nbytes < lt->buffer_size);
+
+	return (lt->nbytes > 0);
+}
+
+/*
  * qsort comparator for sorting freeBlocks[] into decreasing order.
  */
 static int
@@ -546,6 +597,8 @@ LogicalTapeSetCreate(int ntapes)
 		lt->numFullBlocks = 0L;
 		lt->lastBlockBytes = 0;
 		lt->buffer = NULL;
+		lt->buffer_size = 0;
+		lt->read_buffer_size = BLCKSZ;
 		lt->curBlockNumber = 0L;
 		lt->pos = 0;
 		lt->nbytes = 0;
@@ -628,7 +681,10 @@ LogicalTapeWrite(LogicalTapeSet *lts, int tapenum,
 
 	/* Allocate data buffer and first indirect block on first write */
 	if (lt

Re: [HACKERS] PATCH: Exclude additional directories in pg_basebackup

2016-09-12 Thread Peter Eisentraut

The comments on excludeDirContents are somewhat imprecise.  For
example, none of those directories are actually removed or recreated
on startup, only the contents are.

The comment for pg_replslot is incorrect.  I think you can copy
replication slots just fine, but you usually don't want to.

What is the 2 for in this code?

if (strcmp(pathbuf + 2, excludeFile[excludeIdx]) == 0)

I would keep the signature of _tarWriteDir() similar to
_tarWriteHeader().  So don't pass sizeonly in there, or do pass in
sizeonly but do the same with _tarWriteHeader().

The documentation in pg_basebackup.sgml and protocol.sgml should be
updated.

Add some tests.  At least test that one entry from the directory list
and one entry from the files list is not contained in the backup
result.  Testing the symlink handling would also be good.  (The
pg_basebackup tests claim that Windows doesn't support symlinks and
therefore skip all the symlink tests.  That seems a bit at odds with
your code handling symlinks on Windows.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 14:05:33 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 13:48:05 -0400, Tom Lane wrote:
> >> Andres Freund  writes:
> >>> I kind of like ROWS FROM (... AS VALUE), that seems to confer the
> >>> meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
> >>> really work inside ROWS FROM() where AS is required.
> 
> >> Hm, wouldn't ... AS RECORD convey the meaning better?
> 
> > I was kind of envisioning AS VALUE to work for composite types without
> > removing their original type (possibly even for TYPEFUNC_SCALAR
> > ones).
> 
> Maybe.  A problem with any of these proposals though is that there's no
> place to put a column alias.  Yeah, you can stick it on outside the ROWS
> FROM, but it seems a bit non-orthogonal to have to do it that way when
> you can do it inside the ROWS FROM when adding a coldeflist.

I don't necessarily see that as a problem. The coldeflists inside ROWS
FROM() already don't allow assigning aliases for !record/composite
types, and they require specifying types.


> >> (Although once you look at it that way, it's just a cast spelled in
> >> an idiosyncratic fashion.)
> 
> > Well, not quite, by virtue of keeping the original type around. After a
> > record cast you likely couldn't directly access the columns anymore,
> > even if it were a known composite type, right?
> 
> Same is true for any of these syntax proposals, no?  So far as the rest of
> the query is concerned, the function output is going to be an anonymous
> record type.

Not for composite types, no. As implemented ROWS FROM (.. AS()) does:
CREATE OR REPLACE FUNCTION get_pg_class() RETURNS SETOF pg_class LANGUAGE sql 
AS $$SELECT * FROM pg_class;$$;
SELECT DISTINCT pg_typeof(f) FROM ROWS FROM (get_pg_class() AS ()) f;
┌───┐
│ pg_typeof │
├───┤
│ pg_class  │
└───┘
(1 row)
SELECT (f).relname FROM ROWS FROM (get_pg_class() AS ()) f LIMIT 1;
┌┐
│relname │
├┤
│ pg_toast_77994 │
└┘
(1 row)
which seems sensible to me.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> On 2016-09-12 13:48:05 -0400, Tom Lane wrote:
>> Andres Freund  writes:
>>> I kind of like ROWS FROM (... AS VALUE), that seems to confer the
>>> meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
>>> really work inside ROWS FROM() where AS is required.

>> Hm, wouldn't ... AS RECORD convey the meaning better?

> I was kind of envisioning AS VALUE to work for composite types without
> removing their original type (possibly even for TYPEFUNC_SCALAR
> ones).

Maybe.  A problem with any of these proposals though is that there's no
place to put a column alias.  Yeah, you can stick it on outside the ROWS
FROM, but it seems a bit non-orthogonal to have to do it that way when
you can do it inside the ROWS FROM when adding a coldeflist.

Maybe we could do it like

ROWS FROM (func(...) AS alias)

where the difference from a coldeflist is that there's no parenthesized
list of names/types.  It's a bit weird that adding an alias makes for
a semantic not just naming difference, but it's no weirder than these
other ideas.

>> (Although once you look at it that way, it's just a cast spelled in
>> an idiosyncratic fashion.)

> Well, not quite, by virtue of keeping the original type around. After a
> record cast you likely couldn't directly access the columns anymore,
> even if it were a known composite type, right?

Same is true for any of these syntax proposals, no?  So far as the rest of
the query is concerned, the function output is going to be an anonymous
record type.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ICU integration

2016-09-12 Thread Peter Geoghegan

On Fri, Sep 9, 2016 at 6:39 AM, Dave Page  wrote:
> Looking back at my old emails, apparently ICU 5.0 and later include
> ucol_strcollUTF8() which avoids the need to convert UTF-8 characters
> to 16 bit before sorting. RHEL 6 has the older 4.2 version of ICU.

At the risk of stating the obvious, there is a reason why ICU
traditionally worked with UTF-16 natively. It's the same reason why
many OSes and application frameworks (e.g., Java) use UTF-16
internally, even though UTF-8 is much more popular on the web. Which
is: there are certain low-level optimizations possible that are not
possible with UTF-8.

I'm not saying that it would be just as good if we were to not use the
UTF-8 optimized stuff that ICU now has. My point is that it's not
useful to prejudge whether or not performance will be acceptable based
on a factor like this, which is ultimately just an implementation
detail. The ICU patch either performs acceptably as a substitute for
something like glibc, or it does not.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 13:48:05 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
> >> Andres Freund  writes:
> > On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
>  I can't say that I like the proposed syntax much.
>
> >>> Me neither. But I haven't really found a better approach.  It seems
> >>> kinda consistent to have ROWS FROM (... AS ()) change the picked out
> >>> columns to 0, and just return the whole thing.
>
> >> I just remembered that we allow zero-column composite types, which
> >> makes this proposal formally ambiguous.  So we really need a different
> >> syntax.  I'm not especially in love with the cast-to-record idea, but
> >> it does dodge that problem.
>
> > I kind of like ROWS FROM (... AS VALUE), that seems to confer the
> > meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
> > really work inside ROWS FROM() where AS is required.
>
> Hm, wouldn't ... AS RECORD convey the meaning better?

I was kind of envisioning AS VALUE to work for composite types without
removing their original type (possibly even for TYPEFUNC_SCALAR
ones). That, for one, makes the SRF to ROWS FROM conversion easier, and
for another seems generally useful. composites keeping their type with
AS RECORD seems a bit confusing.  There's also the issue that VALUE is
already a keyword, record not...

> (Although once you look at it that way, it's just a cast spelled in
> an idiosyncratic fashion.)

Well, not quite, by virtue of keeping the original type around. After a
record cast you likely couldn't directly access the columns anymore,
even if it were a known composite type, right?

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
>> Andres Freund  writes:
> On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
 I can't say that I like the proposed syntax much.

>>> Me neither. But I haven't really found a better approach.  It seems
>>> kinda consistent to have ROWS FROM (... AS ()) change the picked out
>>> columns to 0, and just return the whole thing.

>> I just remembered that we allow zero-column composite types, which
>> makes this proposal formally ambiguous.  So we really need a different
>> syntax.  I'm not especially in love with the cast-to-record idea, but
>> it does dodge that problem.

> I kind of like ROWS FROM (... AS VALUE), that seems to confer the
> meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
> really work inside ROWS FROM() where AS is required.

Hm, wouldn't ... AS RECORD convey the meaning better?

(Although once you look at it that way, it's just a cast spelled in
an idiosyncratic fashion.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
> >> I can't say that I like the proposed syntax much.
> 
> > Me neither. But I haven't really found a better approach.  It seems
> > kinda consistent to have ROWS FROM (... AS ()) change the picked out
> > columns to 0, and just return the whole thing.
> 
> I just remembered that we allow zero-column composite types, which
> makes this proposal formally ambiguous.  So we really need a different
> syntax.  I'm not especially in love with the cast-to-record idea, but
> it does dodge that problem.

I kind of like ROWS FROM (... AS VALUE), that seems to confer the
meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
really work inside ROWS FROM() where AS is required.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

Hi,

On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
> >> I can't say that I like the proposed syntax much.
> 
> > Me neither. But I haven't really found a better approach.  It seems
> > kinda consistent to have ROWS FROM (... AS ()) change the picked out
> > columns to 0, and just return the whole thing.
> 
> I just remembered that we allow zero-column composite types, which
> makes this proposal formally ambiguous.

Well, we errored out in the grammar for AS () so far... We might want to
fix that independently.

> Stepping back a little bit ... way back at the start of this thread
> you muttered about possibly implementing tSRFs as a special pipeline
> node type, a la Result.  That would have the same benefits in terms
> of being able to take SRF support out of the main execQual code paths.
> I, and I think some other people, felt that the LATERAL approach would
> be a cleaner answer --- but now that we're seeing some of the messy
> details required to make the LATERAL way work, I'm beginning to have
> second thoughts.  I wonder if we should do at least a POC implementation
> of the other way to get a better fix on which way is really cleaner.

I'm not particularly in love in restarting with a different approach. I
think fixing the ROWS FROM expansion is the only really painful bit, and
that seems like it's independently beneficial to allow for suppression
of expansion there.  I'm working on this to actually be finally able to
get some stuff from the "faster executor" thread in a committable
shape,...  The other stuff like making SELECT * FROM func; not
materialize also seems independently useful; it's something people have
complained about repeatedly over the years.

I actually had started to work on a Result style approach, and I don't
think it turned out that nice. But I didn't complete it, so I might just
be wrong.

> Also, one of the points that's come up repeatedly in these discussions
> is the way that the parser's implementation of *-expansion sucks for
> composite-returning functions.  That is, if you write
>   SELECT (foo(...)).* FROM ...
> you get
>   SELECT (foo(...)).col1, (foo(...)).col2, ... FROM ...
> so that the function is executed N times not once.  We had discussed
> fixing that for setof-composite-returning functions by folding multiple
> identical SRF calls into a single LATERAL entry, but that doesn't
> directly fix the problem for non-SRF composite functions.  Also the
> whole idea of having the planner undo the parser's damage in this way
> is kinda grotty, not least because we can't safely combine multiple
> calls of volatile functions, so it only works for not-volatile ones.

> That line of thought leads to the idea that if we could have the *parser*
> do the transformation to LATERAL form, we could avoid breaking a
> composite-returning function call into multiple copies in the first place.
> I had said that I didn't think we wanted this transformation done in the
> parser, but maybe this is a sufficient reason to do so.

I still don't like doing all this is in the parser. It'd just trigger
complaints of users that we're changing their query structure, and we'd
have to solve a good bit of the same problems we have to solve here.

If we really want to reduce the expansion cost - and to me that's a
largely independent issue from this - it seems better to have the parser
emit some structure that's easily recognized at plan time, rather than
have the praser do all the work.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
>> I can't say that I like the proposed syntax much.

> Me neither. But I haven't really found a better approach.  It seems
> kinda consistent to have ROWS FROM (... AS ()) change the picked out
> columns to 0, and just return the whole thing.

I just remembered that we allow zero-column composite types, which
makes this proposal formally ambiguous.  So we really need a different
syntax.  I'm not especially in love with the cast-to-record idea, but
it does dodge that problem.

Stepping back a little bit ... way back at the start of this thread
you muttered about possibly implementing tSRFs as a special pipeline
node type, a la Result.  That would have the same benefits in terms
of being able to take SRF support out of the main execQual code paths.
I, and I think some other people, felt that the LATERAL approach would
be a cleaner answer --- but now that we're seeing some of the messy
details required to make the LATERAL way work, I'm beginning to have
second thoughts.  I wonder if we should do at least a POC implementation
of the other way to get a better fix on which way is really cleaner.

Also, one of the points that's come up repeatedly in these discussions
is the way that the parser's implementation of *-expansion sucks for
composite-returning functions.  That is, if you write
SELECT (foo(...)).* FROM ...
you get
SELECT (foo(...)).col1, (foo(...)).col2, ... FROM ...
so that the function is executed N times not once.  We had discussed
fixing that for setof-composite-returning functions by folding multiple
identical SRF calls into a single LATERAL entry, but that doesn't
directly fix the problem for non-SRF composite functions.  Also the
whole idea of having the planner undo the parser's damage in this way
is kinda grotty, not least because we can't safely combine multiple
calls of volatile functions, so it only works for not-volatile ones.

That line of thought leads to the idea that if we could have the *parser*
do the transformation to LATERAL form, we could avoid breaking a
composite-returning function call into multiple copies in the first place.
I had said that I didn't think we wanted this transformation done in the
parser, but maybe this is a sufficient reason to do so.

If we think in terms of pipeline evaluation nodes rather than LATERAL,
we could implement the above by having the parser emit multiple levels
of SELECT some-expressions FROM (SELECT some-expressions FROM ...),
with SRFs being rigidly separated into their own evaluation levels.

I'm not certain that any of these ideas are worth the electrons they're
written on, but I do think we ought to consider alternatives and not
just push forward with committing a first-draft implementation.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Let file_fdw access COPY FROM PROGRAM

2016-09-12 Thread Corey Huinker

On Mon, Sep 12, 2016 at 1:59 AM, Amit Langote  wrote:

> On 2016/09/11 8:04, Corey Huinker wrote:
> > V2 of this patch:
> >
> > Changes:
> > * rebased to most recent master
> > * removed non-tap test that assumed the existence of Unix sed program
> > * added non-tap test that assumes the existence of perl
> > * switched from filename/program to filename/is_program to more closely
> > follow patterns in copy.c
> > * slight wording change in C comments
>
> This version looks mostly good to me.  Except some whitespace noise in
> some hunks:
>
> @@ -139,7 +142,9 @@ static bool fileIsForeignScanParallelSafe(PlannerInfo
> *root, RelOptInfo *rel,
>   */
>  static bool is_valid_option(const char *option, Oid context);
>  static void fileGetOptions(Oid foreigntableid,
> -   char **filename, List **other_options);
> +   char **filename,
> +   bool *is_program,
>
> Space after "is_program,"
>
> @@ -196,16 +201,17 @@ file_fdw_validator(PG_FUNCTION_ARGS)
>
>  /*
>   * Only superusers are allowed to set options of a file_fdw foreign
> table.
> - * This is because the filename is one of those options, and we don't
> want
> - * non-superusers to be able to determine which file gets read.
> + * The reason for this is to prevent non-superusers from changing the
>
> Space after "the"
>
> -if (stat(filename, &stat_buf) == 0)
> +if ((! is_program) && (stat(filename, &stat_buf) == 0)))
>
> Space between ! and is_program.
>
>
> -if (strcmp(def->defname, "filename") == 0)
> +if ((strcmp(def->defname, "filename") == 0) ||
> (strcmp(def->defname, "program") == 0))
>
> I think the usual style would be to split the if statement into two lines
> as follows to keep within 80 characters per line [1]:
>
> +if ((strcmp(def->defname, "filename") == 0) ||
> +(strcmp(def->defname, "program") == 0))
>
> And likewise for:
>
> -   &fdw_private->filename, &fdw_private->options);
> +   &fdw_private->filename, &fdw_private->is_program,
> &fdw_private->options);
>
> By the way, doesn't the following paragraph in file-fdw.sgml need an
> update?
>
>  
>   Changing table-level options requires superuser privileges, for security
>   reasons: only a superuser should be able to determine which file is read.
>   In principle non-superusers could be allowed to change the other options,
>   but that's not supported at present.
>  
>
>
> I would like to mark this now as "Ready for Committer".
>
> Thanks,
> Amit
>
> [1] https://www.postgresql.org/docs/devel/static/source-format.html
>
>
>
(reposting non-top-posted...sorry)

Thanks for the review!

I agree with all the code cleanups suggested and have made then in the
attached patch, to save the committer some time.

Also in this patch, I changed sgml para to
 
  Changing table-level options requires superuser privileges, for security
  reasons: only a superuser should be able to determine which file is read
  or which program is run.  In principle non-superusers could be allowed to
  change the other options, but that's not supported at present.
 

"Determine" is an odd word in this context. I understand it to mean
"set/change", but I can see where a less familiar reader would take the
meaning to be "has permission to see the value already set". Either way, it
now mentions program as an option in addition to filename.
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index b471991..7f534b1 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -59,6 +59,7 @@ struct FileFdwOption
 static const struct FileFdwOption valid_options[] = {
/* File options */
{"filename", ForeignTableRelationId},
+   {"program", ForeignTableRelationId},
 
/* Format options */
/* oids option is not supported */
@@ -85,10 +86,11 @@ static const struct FileFdwOption valid_options[] = {
  */
 typedef struct FileFdwPlanState
 {
-   char   *filename;   /* file to read */
-   List   *options;/* merged COPY options, excluding 
filename */
-   BlockNumber pages;  /* estimate of file's physical 
size */
-   double  ntuples;/* estimate of number of rows 
in file */
+   char   *filename;   /* file/program to read */
+   boolis_program; /* true if filename represents 
an OS command */
+   List   *options;/* merged COPY options, excluding 
filename and program */
+   BlockNumber pages;  /* estimate of file or program 
output's physical size */
+   double  ntuples;/* estimate of number of rows 
in file/program output */
 } FileFdwPlanState;
 
 /*
@@ -96,9 +98,10 @@ typedef struct FileFdwPlanState
  */
 typedef struct FileFdwExecutionState
 {
-   char   *filename;   /* file to read */
-   L

Re: [HACKERS] Write Ahead Logging for Hash Indexes

2016-09-12 Thread Jesper Pedersen


Hi,

On 09/07/2016 05:58 AM, Amit Kapila wrote:

Okay, I have fixed this issue as explained above.  Apart from that, I
have fixed another issue reported by Mark Kirkwood upthread and few
other issues found during internal testing by Ashutosh Sharma.

The locking issue reported by Mark and Ashutosh is that the patch
didn't maintain the locking order while adding overflow page as it
maintains in other write operations (lock the bucket pages first and
then metapage to perform the write operation).  I have added the
comments in _hash_addovflpage() to explain the locking order used in
modified patch.

During stress testing with pgbench using master-standby setup, we
found an issue which indicates that WAL replay machinery doesn't
expect completely zeroed pages (See explanation of RBM_NORMAL mode
atop XLogReadBufferExtended).  Previously before freeing the overflow
page we were zeroing it, now I have changed it to just initialize the
page such that the page will be empty.

Apart from above, I have added support for old snapshot threshold in
the hash index code.

Thanks to Ashutosh Sharma for doing the testing of the patch and
helping me in analyzing some of the above issues.

I forgot to mention in my initial mail that Robert and I had some
off-list discussions about the design of this patch, many thanks to
him for providing inputs.



Some initial feedback.

README:
+in_complete split flag.  The reader algorithm works correctly, as it 
will scan


What flag ?

hashxlog.c:

hash_xlog_move_page_contents
hash_xlog_squeeze_page

Both have "bukcetbuf" (-> "bucketbuf"), and

+   if (BufferIsValid(bukcetbuf));

->

+   if (BufferIsValid(bucketbuf))

and indent following line:

LockBufferForCleanup(bukcetbuf);

hash_xlog_delete

has the "if" issue too.

hash.h:

Move the XLog related content to hash_xlog.h

Best regards,
 Jesper



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

Hi,

On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > 0003-Avoid-materializing-SRFs-in-the-FROM-list.patch
> >   To avoid performance regressions from moving SRFM_ValuePerCall SRFs to
> >   ROWS FROM, nodeFunctionscan.c needs to support not materializing
> >   output.
> 
> Personally I'd put this one later, as it's a performance optimization not
> part of the core patch IMO --- or is there something in the later ones
> that directly depends on it?  Anyway I'm setting it aside for now.

Not strongly dependant. But the ROWS FROM stuff touches a lot of the
same code. And I wanted to implement this before ripping out the current
implementation, to allow for meaningful performance tests.

> > 0004-Allow-ROWS-FROM-to-return-functions-as-single-record.patch
> >   To allow transforming SELECT record_srf(); nodeFunctionscan.c needs to
> >   learn to return the result as a record.  I chose
> >   ROWS FROM (record_srf() AS ()) as the syntax for that. It doesn't
> >   necessarily have to be SQL exposed, but it does make testing easier.
> 
> The proposed commit message is wrong, as it claims aclexplode()
> demonstrates the problem which it doesn't --- we get the column set
> from the function's declared OUT parameters.

Oops. I'd probably tested with some self defined function and was
looking for an example...

> I can't say that I like the proposed syntax much.

Me neither. But I haven't really found a better approach.  It seems
kinda consistent to have ROWS FROM (... AS ()) change the picked out
columns to 0, and just return the whole thing.

> What about leaving out
> any syntax changes, and simply saying that "if the function returns record
> and hasn't got OUT parameters, then return its result as an unexpanded
> record"?  That might not take much more than removing the error check ;-)

Having the ability to do this for non-record returning functions turned
out to be quite convenient. Otherwise we need to create ROW()
expressions for composite returning functions, which is neither cheap,
nor fun..

As you say, that might be doable with some form of casting, but,
ugh. I'm actually kind of surprised that even works. The function call
that nodeFunctionscan.c sees, isn't a function call, much less a set
returning one. Which means this hits the direct_function_call == false
path in ExecMakeTableFunctionResult(). If it didn't, we'd have hit
/* We don't allow sets in the arguments of the table function */
if (argDone != ExprSingleResult)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("set-valued function called in 
context that cannot accept a set")));
therein. Which you'd hit e.g. with
SELECT * FROM ROWS FROM (int4mul(generate_series(1, 2), 3));

Thus this actually relies on the SRF code path in execQual.c;
the thing we want to rip out...

> A possible objection is that then you could not get the no-expansion
> behavior for functions that return named composite types or have OUT
> parameters that effectively give them known composite types.  From a
> semantic standpoint we could fix that by saying "just cast the result to
> record", ie ROWS FROM (aclexplode('whatever')::record) would give the
> no-expansion behavior.  I'm not sure if there might be any implementation
> glitches in the way of doing it like that.

See above.  Personally I think just using explicit syntax is cleaner,
but I don't feel like arguing about this a whole lot.

- Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] OpenSSL 1.1 breaks configure and more

2016-09-12 Thread Heikki Linnakangas


On 09/05/2016 02:52 PM, Heikki Linnakangas wrote:

On 09/05/2016 03:23 AM, Tom Lane wrote:

Judging by the number of people who have popped up recently with their
own OpenSSL 1.1 patches, I think there is going to be a lot of demand for
back-patching some sort of 1.1 support into our back branches.  All this
talk of refactoring does not sound very back-patchable.  Should we be
thinking of what we can extract that is back-patchable?


Yes, I think you're right.


I planned to commit this today, but while reading through it and 
testing, I ended up doing a bunch more changes, so this deserves another 
round of review.


Changes since last version:

* Added more error checks to the my_BIO_s_socket() function. Check for 
NULL result from malloc(). Check the return code of BIO_meth_set_*() 
functions; looking at OpenSSL sources, they always succeed, but all the 
test/example programs that come with OpenSSL do check them.


* Use BIO_get_new_index() to get the index number for the wrapper BIO.

* Also call BIO_meth_set_puts(). It was missing in previous patch versions.

* Fixed src/test/ssl test suite to also work with OpenSSL 1.1.0.

* Changed all references (in existing code) to SSLEAY_VERSION_NUMBER 
into OPENSSL_VERSION_NUMBER, for consistency.


* Squashed all into one patch.

I intend to apply this to all supported branches, so please have a look! 
This is now against REL9_6_STABLE, but there should be little difference 
between branches in the code that this touches.


- Heikki

>From d535c3941c66600f2e1716dfbdc1f1fff7eebbe0 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Mon, 12 Sep 2016 19:48:48 +0300
Subject: [PATCH 1/1] Support OpenSSL 1.1.0.

Changes needed to build at all:

- Check for SSL_new in configure, now that SSL_library_init is a macro.
- Do not access struct members directly. This includes some new code in
  pgcrypto, to use the resource owner mechanism to ensure that we don't
  leak OpenSSL handles, now that we can't embed them in other structs
  anymore.
- RAND_SSLeay() -> RAND_OpenSSL()

Changes that were needed to silence deprecation warnings, but were not
strictly necessary:

- RAND_pseudo_bytes() -> RAND_bytes().
- SSL_library_init() and OpenSSL_config() -> OPENSSL_init_ssl()
- ASN1_STRING_data() -> ASN1_STRING_get0_data()
- DH_generate_parameters() -> DH_generate_parameters()
- Locking callbacks are not needed with OpenSSL 1.1.0 anymore. (Good
  riddance!)

Also change references to SSLEAY_VERSION_NUMBER with OPENSSL_VERSION_NUMBER,
for the sake of consistency. OPENSSL_VERSION_NUMBER has existed since time
immemorial.

Fix SSL test suite to work with OpenSSL 1.1.0. CA certificates must have
the "CA:true" basic constraint extension now, or OpenSSL will refuse them.
Regenerate the test certificates with that. The "openssl" binary, used to
generate the certificates, is also now more picky, and throws an error
if an X509 extension is specified in "req_extensions", but that section
is empty.

Backpatch to all supported branches, per popular demand. In back-branches,
we still support OpenSSL 0.9.7 and above. OpenSSL 0.9.6 should still work
too, but I didn't test it. In master, we only support 0.9.8 and above.

Patch by Andreas Karlsson, with additional changes by me.

Discussion: <20160627151604.gd1...@msg.df7cb.de>
---
 configure  |  44 -
 configure.in   |   4 +-
 contrib/pgcrypto/internal.c|   9 --
 contrib/pgcrypto/openssl.c | 130 +++--
 contrib/pgcrypto/pgcrypto.c|   2 +-
 contrib/pgcrypto/pgp-s2k.c |   6 +-
 contrib/pgcrypto/px-crypt.c|   2 +-
 contrib/pgcrypto/px.h  |   1 -
 contrib/sslinfo/sslinfo.c  |  14 +--
 src/backend/libpq/be-secure-openssl.c  | 101 +++
 src/interfaces/libpq/fe-secure-openssl.c   |  96 +-
 src/interfaces/libpq/libpq-int.h   |   2 +-
 src/test/ssl/Makefile  |   5 +-
 src/test/ssl/cas.config|   7 +-
 src/test/ssl/root_ca.config|   4 +
 src/test/ssl/server-cn-only.config |   1 -
 src/test/ssl/server-no-names.config|   1 -
 src/test/ssl/server-revoked.config |   1 -
 src/test/ssl/ssl/both-cas-1.crt|  67 ++---
 src/test/ssl/ssl/both-cas-2.crt|  67 ++---
 src/test/ssl/ssl/client-revoked.crt|  16 +--
 src/test/ssl/ssl/client-revoked.key|  26 ++---
 src/test/ssl/ssl/client.crl|  12 +--
 src/test/ssl/ssl/client.crt|  16 +--
 src/test/ssl/ssl/client.key|  26 ++---
 src/test/ssl/ssl/client_ca.crt |  22 ++---
 src/test/ssl/ssl/client_ca.key |  26 ++---
 src/test/ssl/ssl/root+client.crl   |  22 ++-

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2016-09-12 Thread Peter Geoghegan

On Wed, Sep 7, 2016 at 11:44 AM, Peter Geoghegan  wrote:
> I attach my V3.

I should point out that I'm leaving to go on Vacation this weekend.
I'll be away for a week, and will not be answering mail during that
period. If anyone has any questions on this for me, it might be
preferable to ask them before I leave.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> On 2016-09-12 11:29:37 -0400, Tom Lane wrote:
>> I'm on board with disallowing SRFs in UPDATE, because it produces
>> underdetermined and unspecified results; but the other restriction
>> seems 100% arbitrary.  There is no semantic difference between
>> SELECT a, b FROM ... ORDER BY srf();
>> and
>> SELECT a, b, srf() FROM ... ORDER BY 3;
>> except that in the first case the ordering column doesn't get returned to
>> the client.  I do not see why that's so awful that we should make it fail
>> after twenty years of allowing it.

> I do think it's awful that an ORDER BY / GROUP BY changes the number of
> rows processed. This should never have been allowed.

Meh.  That's just an opinion, and it's a bit late to be making such
changes.  I think the general consensus of the previous discussion was
that we would preserve existing tSRF behavior as far as it was reasonably
practical to do so, with the exception that there's wide agreement that
the least-common-multiple rule for number of rows emitted is bad.  I do
not think you're going to get anywhere near that level of agreement that
a SRF appearing only in ORDER BY is bad.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 11:29:37 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > 0002-Shore-up-some-weird-corner-cases-for-targetlist-SRFs.patch
> >   Forbid UPDATE ... SET foo = SRF() and ORDER BY / GROUP BY containing
> >   SRFs that would change the number of returned rows.  Without the
> >   latter e.g. SELECT 1 ORDER BY generate_series(1,10); returns 10 rows.
> 
> I'm on board with disallowing SRFs in UPDATE, because it produces
> underdetermined and unspecified results; but the other restriction
> seems 100% arbitrary.  There is no semantic difference between
>   SELECT a, b FROM ... ORDER BY srf();
> and
>   SELECT a, b, srf() FROM ... ORDER BY 3;
> except that in the first case the ordering column doesn't get returned to
> the client.  I do not see why that's so awful that we should make it fail
> after twenty years of allowing it.

I do think it's awful that an ORDER BY / GROUP BY changes the number of
rows processed. This should never have been allowed. You just need a
little typo somewhere that makes the targetlist entry not match the
ORDER/GROUP BY anymore and your results will differ in weird ways -
rather hard to debug in the GROUP BY case.


> And I certainly don't see why there
> would be an implementation reason why we couldn't support it anymore
> if we can still do the second one.

There's nothing requiring this here, indeed.


Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Let file_fdw access COPY FROM PROGRAM

2016-09-12 Thread Corey Huinker

Thanks for the review!

I agree with all the code cleanups suggested and have made then in the
attached patch, to save the committer some time.

Also in this patch, I changed sgml para to
 
  Changing table-level options requires superuser privileges, for security
  reasons: only a superuser should be able to determine which file is read
  or which program is run.  In principle non-superusers could be allowed to
  change the other options, but that's not supported at present.
 

"Determine" is an odd word in this context. I understand it to mean
"set/change", but I can see where a less familiar reader would take the
meaning to be "has permission to see the value already set". Either way, it
now mentions program as an option in addition to filename.


On Mon, Sep 12, 2016 at 1:59 AM, Amit Langote  wrote:

> On 2016/09/11 8:04, Corey Huinker wrote:
> > V2 of this patch:
> >
> > Changes:
> > * rebased to most recent master
> > * removed non-tap test that assumed the existence of Unix sed program
> > * added non-tap test that assumes the existence of perl
> > * switched from filename/program to filename/is_program to more closely
> > follow patterns in copy.c
> > * slight wording change in C comments
>
> This version looks mostly good to me.  Except some whitespace noise in
> some hunks:
>
> @@ -139,7 +142,9 @@ static bool fileIsForeignScanParallelSafe(PlannerInfo
> *root, RelOptInfo *rel,
>   */
>  static bool is_valid_option(const char *option, Oid context);
>  static void fileGetOptions(Oid foreigntableid,
> -   char **filename, List **other_options);
> +   char **filename,
> +   bool *is_program,
>
> Space after "is_program,"
>
> @@ -196,16 +201,17 @@ file_fdw_validator(PG_FUNCTION_ARGS)
>
>  /*
>   * Only superusers are allowed to set options of a file_fdw foreign
> table.
> - * This is because the filename is one of those options, and we don't
> want
> - * non-superusers to be able to determine which file gets read.
> + * The reason for this is to prevent non-superusers from changing the
>
> Space after "the"
>
> -if (stat(filename, &stat_buf) == 0)
> +if ((! is_program) && (stat(filename, &stat_buf) == 0)))
>
> Space between ! and is_program.
>
>
> -if (strcmp(def->defname, "filename") == 0)
> +if ((strcmp(def->defname, "filename") == 0) ||
> (strcmp(def->defname, "program") == 0))
>
> I think the usual style would be to split the if statement into two lines
> as follows to keep within 80 characters per line [1]:
>
> +if ((strcmp(def->defname, "filename") == 0) ||
> +(strcmp(def->defname, "program") == 0))
>
> And likewise for:
>
> -   &fdw_private->filename, &fdw_private->options);
> +   &fdw_private->filename, &fdw_private->is_program,
> &fdw_private->options);
>
> By the way, doesn't the following paragraph in file-fdw.sgml need an
> update?
>
>  
>   Changing table-level options requires superuser privileges, for security
>   reasons: only a superuser should be able to determine which file is read.
>   In principle non-superusers could be allowed to change the other options,
>   but that's not supported at present.
>  
>
>
> I would like to mark this now as "Ready for Committer".
>
> Thanks,
> Amit
>
> [1] https://www.postgresql.org/docs/devel/static/source-format.html
>
>
>
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index b471991..7f534b1 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -59,6 +59,7 @@ struct FileFdwOption
 static const struct FileFdwOption valid_options[] = {
/* File options */
{"filename", ForeignTableRelationId},
+   {"program", ForeignTableRelationId},
 
/* Format options */
/* oids option is not supported */
@@ -85,10 +86,11 @@ static const struct FileFdwOption valid_options[] = {
  */
 typedef struct FileFdwPlanState
 {
-   char   *filename;   /* file to read */
-   List   *options;/* merged COPY options, excluding 
filename */
-   BlockNumber pages;  /* estimate of file's physical 
size */
-   double  ntuples;/* estimate of number of rows 
in file */
+   char   *filename;   /* file/program to read */
+   boolis_program; /* true if filename represents 
an OS command */
+   List   *options;/* merged COPY options, excluding 
filename and program */
+   BlockNumber pages;  /* estimate of file or program 
output's physical size */
+   double  ntuples;/* estimate of number of rows 
in file/program output */
 } FileFdwPlanState;
 
 /*
@@ -96,9 +98,10 @@ typedef struct FileFdwPlanState
  */
 typedef struct FileFdwExecutionState
 {
-   char   *filename;   /* file to read */
-   List   *options;/*

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund

On 2016-09-12 10:19:14 -0400, Tom Lane wrote:
> Andres Freund  writes:
> 
> > 0001-Add-some-more-targetlist-srf-tests.patch
> >   Add some test.
> 
> I think you should go ahead and push this tests-adding patch now, as it
> adds documentation of the current behavior that is a good thing to have
> independently of what the rest of the patchset does.  Also, I'm worried
> that some of the GROUP BY tests might have machine-dependent results
> (if they are implemented by hashing) so it would be good to get in a few
> buildfarm cycles and let that settle out before we start changing the
> answers.

Generally a sound plan - I started to noticeably expand it though,
there's some important edge cases it didn't cover.


> Although tsrf.sql doesn't currently need to create any views, it doesn't
> seem like a great idea to assume that it never will.  Maybe add this
> after misc_functions in the previous parallel group, instead?

WFM


> +-- it's weird to GROUP BYs that increase the number of results
> 
> "it's weird to have ..."
> 
> +-- nonsensically that seems to be allowed
> +UPDATE fewmore SET data = generate_series(4,9);
> 
> "nonsense that seems to be allowed..."
> 
> +-- SRFs are now allowed in RETURNING
> +INSERT INTO fewmore VALUES(1) RETURNING generate_series(1,3);
> 
> s/now/not/, apparently

Err, yes. Will update.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> 0003-Avoid-materializing-SRFs-in-the-FROM-list.patch
>   To avoid performance regressions from moving SRFM_ValuePerCall SRFs to
>   ROWS FROM, nodeFunctionscan.c needs to support not materializing
>   output.

Personally I'd put this one later, as it's a performance optimization not
part of the core patch IMO --- or is there something in the later ones
that directly depends on it?  Anyway I'm setting it aside for now.

> 0004-Allow-ROWS-FROM-to-return-functions-as-single-record.patch
>   To allow transforming SELECT record_srf(); nodeFunctionscan.c needs to
>   learn to return the result as a record.  I chose
>   ROWS FROM (record_srf() AS ()) as the syntax for that. It doesn't
>   necessarily have to be SQL exposed, but it does make testing easier.

The proposed commit message is wrong, as it claims aclexplode()
demonstrates the problem which it doesn't --- we get the column set
from the function's declared OUT parameters.

I can't say that I like the proposed syntax much.  What about leaving out
any syntax changes, and simply saying that "if the function returns record
and hasn't got OUT parameters, then return its result as an unexpanded
record"?  That might not take much more than removing the error check ;-)

A possible objection is that then you could not get the no-expansion
behavior for functions that return named composite types or have OUT
parameters that effectively give them known composite types.  From a
semantic standpoint we could fix that by saying "just cast the result to
record", ie ROWS FROM (aclexplode('whatever')::record) would give the
no-expansion behavior.  I'm not sure if there might be any implementation
glitches in the way of doing it like that.  Also there seems to be some
syntactic issue with it ... actually, the current behavior there is just
weird:

regression=# select * from rows from (aclexplode('{=r/postgres}')::record); 
ERROR:  syntax error at or near "::"
LINE 1: ...lect * from rows from (aclexplode('{=r/postgres}')::record);
 ^
regression=# select * from rows from (cast(aclexplode('{=r/postgres}') as 
record));
 grantor | grantee | privilege_type | is_grantable 
-+-++--
  10 |   0 | SELECT | f
(1 row)

I was not aware that there was *anyplace* in the grammar where :: and CAST
behaved differently, and I'm not very pleased to find this one.

I haven't looked at the code, as there probably isn't much point in
reviewing in any detail till we choose the syntax.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Tuplesort merge pre-reading

2016-09-12 Thread Peter Geoghegan

On Mon, Sep 12, 2016 at 8:47 AM, Claudio Freire  wrote:
> I spoke too soon, git AM had failed and I didn't notice.

I wrote the regression test that causes Postgres to crash with the
patch applied. It tests, among other things, that CLUSTER tuples are
fixed-up by a routine like the current MOVETUP(), which is removed in
Heikki's patch. (There was a 9.6 bug where CLUSTER was broken due to
that.)

It shouldn't be too difficult for Heikki to fix this.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> 0002-Shore-up-some-weird-corner-cases-for-targetlist-SRFs.patch
>   Forbid UPDATE ... SET foo = SRF() and ORDER BY / GROUP BY containing
>   SRFs that would change the number of returned rows.  Without the
>   latter e.g. SELECT 1 ORDER BY generate_series(1,10); returns 10 rows.

I'm on board with disallowing SRFs in UPDATE, because it produces
underdetermined and unspecified results; but the other restriction
seems 100% arbitrary.  There is no semantic difference between
SELECT a, b FROM ... ORDER BY srf();
and
SELECT a, b, srf() FROM ... ORDER BY 3;
except that in the first case the ordering column doesn't get returned to
the client.  I do not see why that's so awful that we should make it fail
after twenty years of allowing it.  And I certainly don't see why there
would be an implementation reason why we couldn't support it anymore
if we can still do the second one.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Tuplesort merge pre-reading

2016-09-12 Thread Claudio Freire

On Sun, Sep 11, 2016 at 12:47 PM, Heikki Linnakangas  wrote:
> Here's a new version of these patches, rebased over current master. I
> squashed the two patches into one, there's not much point to keep them
> separate.


I don't know what was up with the other ones, but this one works fine.

Benchmarking now.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-12 Thread Kevin Grittner

On Mon, Sep 12, 2016 at 8:14 AM, Kevin Grittner  wrote:
> On Mon, Sep 12, 2016 at 7:55 AM, Tom Lane  wrote:

>> The query you committed is flat wrong, because it doesn't
>> account for database ownership being granted via role membership.
>
> Ah, there was a flaw in my testing script.  It appeared from my
> (flawed) testing that the inherited ownership wasn't enough to
> allow use as a template.  With the script fixed I see that it does.
>
> Sorry for the noise.  Will fix.

Done.

Thanks guys!

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane

Andres Freund  writes:
> Attached is a significantly updated patch series (see the mail one up
> for details about what this is, I don't want to quote it in its
> entirety).

I've finally cleared my plate enough to start reviewing this patchset.

> 0001-Add-some-more-targetlist-srf-tests.patch
>   Add some test.

I think you should go ahead and push this tests-adding patch now, as it
adds documentation of the current behavior that is a good thing to have
independently of what the rest of the patchset does.  Also, I'm worried
that some of the GROUP BY tests might have machine-dependent results
(if they are implemented by hashing) so it would be good to get in a few
buildfarm cycles and let that settle out before we start changing the
answers.

I do have some trivial nitpicks about 0001:

 # rules cannot run concurrently with any test that creates a view
-test: rules psql_crosstab amutils
+test: rules psql_crosstab amutils tsrf

Although tsrf.sql doesn't currently need to create any views, it doesn't
seem like a great idea to assume that it never will.  Maybe add this
after misc_functions in the previous parallel group, instead?

+-- it's weird to GROUP BYs that increase the number of results

"it's weird to have ..."

+-- nonsensically that seems to be allowed
+UPDATE fewmore SET data = generate_series(4,9);

"nonsense that seems to be allowed..."

+-- SRFs are now allowed in RETURNING
+INSERT INTO fewmore VALUES(1) RETURNING generate_series(1,3);

s/now/not/, apparently

More to come later, but 0001 is pushable with these fixes.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] multivariate statistics (v19)

2016-09-12 Thread Dean Rasheed

On 3 August 2016 at 02:58, Tomas Vondra  wrote:
> Attached is v19 of the "multivariate stats" patch series

Hi,

I started looking at this - just at a very high level - I've not read
much of the detail yet, but here are some initial review comments.

I think the overall infrastructure approach for CREATE STATISTICS
makes sense, and I agree with other suggestions upthread that it would
be useful to be able to build statistics on arbitrary expressions,
although that doesn't need to be part of this patch, it's useful to
keep that in mind as a possible future extension of this initial
design.

I can imagine it being useful to be able to create user-defined
statistics on an arbitrary list of expressions, and I think that would
include univariate as well as multivariate statistics. Perhaps that's
something to take into account in the naming of things, e.g., as David
Rowley suggested, something like pg_statistic_ext, rather than
pg_mv_statistic.

I also like the idea that this might one day be extended to support
statistics across multiple tables, although I think that might be
challenging to achieve -- you'd need a method of taking a random
sample of rows from a join between 2 or more tables. However, if the
intention is to be able to support that one day, I think that needs to
be accounted for in the syntax now -- specifically, I think it will be
too limiting to only support things extending the current syntax of
the form table1(col1, col2, ...), table2(col1, col2, ...), because
that precludes building statistics on an expression referring to
columns from more than one table. So I think we should plan further
ahead and use a syntax giving greater flexibility in the future, for
example something structured more like a query (like CREATE VIEW):

CREATE STATISTICS name
  [ WITH (options) ]
  ON expression [, ...]
  FROM table [, ...]
  WHERE condition

where the first version of the patch would only support expressions
that are simple column references, and would require at least 2 such
columns from a single table with no WHERE clause, i.e.:

CREATE STATISTICS name
  [ WITH (options) ]
  ON column1, column2 [, ...]
  FROM table

For multi-table statistics, a WHERE clause would typically be needed
to specify how the tables are expected to be joined, but potentially
such a clause might also be useful in single-table statistics, to
build partial statistics on a commonly queried subset of the table,
just like a partial index.

Of course, I'm not suggesting that the current patch do any of that --
it's big enough as it is. I'm just throwing out possible future
directions this might go in, so that we don't get painted into a
corner when designing the syntax for the current patch.


Regarding the statistics themselves, I read the description of soft
functional dependencies, and I'm somewhat skeptical about that
algorithm. I don't like the arbitrary thresholds or the sudden jump
from independence to dependence and clause reduction. As others have
said, I think this should account for a continuous spectrum of
dependence from fully independent to fully dependent, and combine
clause selectivities in a way based on the degree of dependence. For
example, if you computed an estimate for the fraction 'f' of the
table's rows for which a -> b, then it might be reasonable to combine
the selectivities using

  P(a,b) = P(a) * (f + (1-f) * P(b))

Of course, having just a single number that tells you the columns are
correlated, tells you nothing about whether the clauses on those
columns are consistent with that correlation. For example, in the
following table

CREATE TABLE t(a int, b int);
INSERT INTO t SELECT x/10, ((x/10)*789)%100 FROM generate_series(0,999) g(x);

'b' is functionally dependent on 'a' (and vice versa), but if you
query the rows with a<50 and with b<50, those clauses behave
essentially independently, because they're not consistent with the
functional dependence between 'a' and 'b', so the best way to combine
their selectivities is just to multiply them, as we currently do.

So whilst it may be interesting to determine that 'b' is functionally
dependent on 'a', it's not obvious whether that fact by itself should
be used in the selectivity estimates. Perhaps it should, on the
grounds that it's best to attempt to use all the available
information, but only if there are no more detailed statistics
available. In any case, knowing that there is a correlation can be
used as an indicator that it may be worthwhile to build more detailed
multivariate statistics like a MCV list or a histogram on those
columns.


Looking at the ndistinct coefficient 'q', I think it would be better
if the recorded statistic were just the estimate for
ndistinct(a,b,...) rather than a ratio of ndistinct values. That's a
more fundamental statistic, and it's easier to document and easier to
interpret. Also, I don't believe that the coefficient 'q' is the right
number to use for clause estimation:

Looking at README.ndistinct, I'm skeptical about the

Re: [HACKERS] Hash Indexes

2016-09-12 Thread Jesper Pedersen


On 09/01/2016 11:55 PM, Amit Kapila wrote:

I have fixed all other issues you have raised.  Updated patch is
attached with this mail.



The following script hangs on idx_val creation - just with v5, WAL patch 
not applied.


Best regards,
 Jesper



zero.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [PATCH] Tab completion for ALTER TYPE … RENAME VALUE …

2016-09-12 Thread Dagfinn Ilmari Mannsåker

ilm...@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

> Hi hackers,
>
> Here's a patch to add psql tab completion for the recently-added ALTER
> TYPE … RENAME VALUE feature (thanks to Tom for fixing it up and
> committing it).

I've added it to the 2016-11 commit fest:
https://commitfest.postgresql.org/11/795/

- ilmari

-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-12 Thread Kevin Grittner

On Mon, Sep 12, 2016 at 7:55 AM, Tom Lane  wrote:
> Kevin Grittner  writes:
>> But that gives incorrect results for the case I asked about earlier
>> on the thread, while the query I pushed gives correct results:
>
> AFAICS, my query gives correct results for that case.  bob is permitted
> to copy fred's databases db1 and postgres, or would be if he had createdb
> privileges.  The query you committed is flat wrong, because it doesn't
> account for database ownership being granted via role membership.

Ah, there was a flaw in my testing script.  It appeared from my
(flawed) testing that the inherited ownership wasn't enough to
allow use as a template.  With the script fixed I see that it does.

Sorry for the noise.  Will fix.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-12 Thread Tom Lane

Kevin Grittner  writes:
> But that gives incorrect results for the case I asked about earlier
> on the thread, while the query I pushed gives correct results:

AFAICS, my query gives correct results for that case.  bob is permitted
to copy fred's databases db1 and postgres, or would be if he had createdb
privileges.  The query you committed is flat wrong, because it doesn't
account for database ownership being granted via role membership.

> There is absolutely no question that the query you suggest is
> wrong;

Please provide some evidence of that.

> the only case I had to think about very hard after
> establishing that CREATEDB was not inherited was when bob owned a
> database but did not have CREATEDB permission.

I agree that we should not look at createdb here.  If we did, the
effect would be that for a user without createdb, psql would refuse
to offer any completions, which seems more confusing than helpful.
(If it were the tab completion's code to enforce that, which it isn't,
surely it should have complained much earlier in the command.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-12 Thread Michael Paquier

On Mon, Sep 12, 2016 at 9:08 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Mon, Sep 12, 2016 at 4:19 PM, Simon Riggs  wrote:
>>> Why would we need to backpatch this commit?
>
>> You are right there is no need to get that into 9.6. Sorry for the mistake.
>
> Oh, that's my fault, I'd incorrectly remembered this commit as having been
> further back than it was.  HEAD-only is correct so far as fixing
> Fujii-san's original complaint is concerned.  However, don't we have the
> problem that am_db_walsender processes will show up in pg_stat_activity
> anyway?

Yes, those show up..

> Do we want to do something about those further back?

Hiding them is not something that we should do, and changing the query
field to show something that we think is helpful may impact existing
applications that rely on the fact that this field is NULL. So I'd
vote for doing nothing.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-12 Thread Vitaly Burovoy

On 9/12/16, Kevin Grittner  wrote:
> On Sun, Sep 11, 2016 at 8:07 PM, Tom Lane  wrote:
>
>> I wasn't aware that this patch was doing anything nontrivial ...
>
> Well, it is not doing anything other than showing candidate
> templates for tab completion beyond those flagged as template
> databases.
>
>> After looking at it I think it's basically uninformed about how to test
>> for ownership.  An explicit join against pg_roles is almost never the
>> right way for SQL queries to do that.  Lose the join and write it more
>> like this:
>>
>> +"SELECT pg_catalog.quote_ident(d.datname) "\
>> +"  FROM pg_catalog.pg_database d "\
>> +" WHERE substring(pg_catalog.quote_ident(d.datname),1,%d)='%s' "\
>> +"   AND (d.datistemplate OR pg_catalog.pg_has_role(d.datdba, 'USAGE'))"
>
> But that gives incorrect results for the case I asked about earlier
> on the thread, while the query I pushed gives correct results:
>
> test=# \du fred
>   List of roles
>  Role name |   Attributes| Member of
> ---+-+---
>  fred  | Create DB, Cannot login | {}
>
> test=# \du bob
>List of roles
>  Role name | Attributes | Member of
> ---++---
>  bob   || {fred}
>
> test=# \l
>  List of databases
> Name|  Owner  | Encoding |   Collate   |Ctype|  Access
> privileges
> +-+--+-+-+-
>  db1| fred| UTF8 | en_US.UTF-8 | en_US.UTF-8 |
>  db2| bob | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
>  postgres   | fred| UTF8 | en_US.UTF-8 | en_US.UTF-8 |
>  regression | kgrittn | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
> =Tc/kgrittn+
> | |  | | |
> kgrittn=CTc/kgrittn
>  template0  | kgrittn | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
> =c/kgrittn +
> | |  | | |
> kgrittn=CTc/kgrittn
>  template1  | kgrittn | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
> =c/kgrittn +
> | |  | | |
> kgrittn=CTc/kgrittn
>  test   | kgrittn | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
> (7 rows)
>
> test=# set role bob;
> SET
> test=> SELECT pg_catalog.quote_ident(d.datname)
>   FROM pg_catalog.pg_database d
>  WHERE (d.datistemplate OR pg_catalog.pg_has_role(d.datdba, 'USAGE'));
>  quote_ident
> -
>  template1
>  template0
>  postgres
>  db1
>  db2
> (5 rows)
>
> test=> SELECT pg_catalog.quote_ident(datname)
>   FROM pg_catalog.pg_database d
>   JOIN pg_catalog.pg_roles r ON rolname = CURRENT_USER
>  WHERE (datistemplate OR rolsuper OR datdba = r.oid);
>  quote_ident
> -
>  template1
>  template0
>  db2
> (3 rows)
>
> There is absolutely no question that the query you suggest is
> wrong; the only case I had to think about very hard after
> establishing that CREATEDB was not inherited was when bob owned a
> database but did not have CREATEDB permission.  It seemed to me
> least astonishing to show such a database, because that seemed
> parallel to, for example, tab completion for table names on CREATE
> TABLE AS.  We show a database object in the tab completion when it
> would be available except for absence of a permission on the user.
> That seems sane to me, because the attempt to actually execute with
> that object gives a potentially useful error message.  Anyway, I
> tend to like symmetry in these things -- it could also be
> considered sane not to show t2 on tab completion fro bob above, but
> then we should probably add more security tests to other tab
> completion queries, like CREATE TABLE AS ... SELECT ... FROM.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

While I was checking Tom's version, I thought that way, but finally I
realized there is no matter what tab completion shows because user
without the CREATEDB privileges can create no database at all:

postgres=# \du fred
  List of roles
 Role name |   Attributes| Member of
---+-+---
 fred  | Create DB, Cannot login | {}

postgres=# \du bob
   List of roles
 Role name | Attributes | Member of
---++---
 bob   || {fred}

postgres=# \l db*
List of databases
 Name | Owner | Encoding |   Collate   |Ctype| Access privileges
--+---+--+-+-+---
 db1  | fred  | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 db2  | bob   | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(2 rows)

postgres=# set role bob;
SET
postgres=> CREATE DATABASE ss TEMPLATE db  -- shows both
db1  db2
postgres=> CREATE DATABASE ss TEMPLATE db2;
ERROR:  permission denied to create database
postgres=>

So a check for the CREATEDB privilege should

Re: [HACKERS] Push down more full joins in postgres_fdw

2016-09-12 Thread Etsuro Fujita


On 2016/09/09 21:35, Etsuro Fujita wrote:

On 2016/09/08 19:51, Etsuro Fujita wrote:

On 2016/09/06 22:07, Ashutosh Bapat wrote:

This patch tries to do two things at a time 1. support join pushdown for
FULL join when the joining relations have remote conditions 2. better
support for fetching placeholder vars, whole row references and some
system columns. To make reviews easy, I think we should split the patch
into two 1. supporting subqueries to be deparsed with support for one of
the above (I will suggest FULL join support) 2. support for the other.



OK, will try.



I extracted #1 from the patch.  Attached is a patch for that.  If that
patch is reasonable, I'll create a patch for #2 on top of it.


Attached is a patch for #2.  In that patch I fixed some bugs and added a 
bit more comments.  For testing, please apply the patch for #1 first.


Best regards,
Etsuro Fujita


postgres-fdw-phv-pushdown-v1.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...

2016-09-12 Thread Kevin Grittner

On Sun, Sep 11, 2016 at 8:07 PM, Tom Lane  wrote:

> I wasn't aware that this patch was doing anything nontrivial ...

Well, it is not doing anything other than showing candidate
templates for tab completion beyond those flagged as template
databases.

> After looking at it I think it's basically uninformed about how to test
> for ownership.  An explicit join against pg_roles is almost never the
> right way for SQL queries to do that.  Lose the join and write it more
> like this:
>
> +"SELECT pg_catalog.quote_ident(d.datname) "\
> +"  FROM pg_catalog.pg_database d "\
> +" WHERE substring(pg_catalog.quote_ident(d.datname),1,%d)='%s' "\
> +"   AND (d.datistemplate OR pg_catalog.pg_has_role(d.datdba, 'USAGE'))"

But that gives incorrect results for the case I asked about earlier
on the thread, while the query I pushed gives correct results:

test=# \du fred
  List of roles
 Role name |   Attributes| Member of
---+-+---
 fred  | Create DB, Cannot login | {}

test=# \du bob
   List of roles
 Role name | Attributes | Member of
---++---
 bob   || {fred}

test=# \l
 List of databases
Name|  Owner  | Encoding |   Collate   |Ctype|  Access
privileges
+-+--+-+-+-
 db1| fred| UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 db2| bob | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 postgres   | fred| UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 regression | kgrittn | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
=Tc/kgrittn+
| |  | | |
kgrittn=CTc/kgrittn
 template0  | kgrittn | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
=c/kgrittn +
| |  | | |
kgrittn=CTc/kgrittn
 template1  | kgrittn | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
=c/kgrittn +
| |  | | |
kgrittn=CTc/kgrittn
 test   | kgrittn | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(7 rows)

test=# set role bob;
SET
test=> SELECT pg_catalog.quote_ident(d.datname)
  FROM pg_catalog.pg_database d
 WHERE (d.datistemplate OR pg_catalog.pg_has_role(d.datdba, 'USAGE'));
 quote_ident
-
 template1
 template0
 postgres
 db1
 db2
(5 rows)

test=> SELECT pg_catalog.quote_ident(datname)
  FROM pg_catalog.pg_database d
  JOIN pg_catalog.pg_roles r ON rolname = CURRENT_USER
 WHERE (datistemplate OR rolsuper OR datdba = r.oid);
 quote_ident
-
 template1
 template0
 db2
(3 rows)

There is absolutely no question that the query you suggest is
wrong; the only case I had to think about very hard after
establishing that CREATEDB was not inherited was when bob owned a
database but did not have CREATEDB permission.  It seemed to me
least astonishing to show such a database, because that seemed
parallel to, for example, tab completion for table names on CREATE
TABLE AS.  We show a database object in the tab completion when it
would be available except for absence of a permission on the user.
That seems sane to me, because the attempt to actually execute with
that object gives a potentially useful error message.  Anyway, I
tend to like symmetry in these things -- it could also be
considered sane not to show t2 on tab completion fro bob above, but
then we should probably add more security tests to other tab
completion queries, like CREATE TABLE AS ... SELECT ... FROM.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-12 Thread Tom Lane

Michael Paquier  writes:
> On Mon, Sep 12, 2016 at 4:19 PM, Simon Riggs  wrote:
>> Why would we need to backpatch this commit?

> You are right there is no need to get that into 9.6. Sorry for the mistake.

Oh, that's my fault, I'd incorrectly remembered this commit as having been
further back than it was.  HEAD-only is correct so far as fixing
Fujii-san's original complaint is concerned.  However, don't we have the
problem that am_db_walsender processes will show up in pg_stat_activity
anyway?  Do we want to do something about those further back?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-09-12 Thread Jeevan Chalke

On Mon, Sep 12, 2016 at 12:20 PM, Prabhat Sahu <
prabhat.s...@enterprisedb.com> wrote:

> Hi,
>
> While testing "Aggregate pushdown", i found the below error:
> -- GROUP BY alias showing different behavior after adding patch.
>
> -- Create table "t1", insert few records.
> create table t1(c1 int);
> insert into t1 values(10), (20);
>
> -- Create foreign table:
> create foreign table f_t1 (c1 int) server db1_server options (table_name
> 't1');
>
> -- with local table:
> postgres=# select 2 a, avg(c1) from t1 group by a;
>  a | avg
> ---+-
>  2 | 15.
> (1 row)
>
> -- with foreign table:
> postgres=# select 2 a, avg(c1) from f_t1 group by a;
> ERROR:  aggregate functions are not allowed in GROUP BY
> CONTEXT:  Remote SQL command: EXPLAIN SELECT 2, avg(c1) FROM public.t1
> GROUP BY 2
>
>
>
Thanks for reporting this bug in *v1.patch Prabhat.

I will have a look over this issue and will post a fix in next version.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-09-12 Thread Jeevan Chalke

On Thu, Sep 8, 2016 at 10:41 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
>
>
>> While checking for shippability, we build the target list which is passed
>> to
>> the foreign server as fdw_scan_tlist. The target list contains
>> a. All the GROUP BY expressions
>> b. Shippable entries from the target list of upper relation
>> c. Var and Aggref nodes from non-shippable entries from the target list of
>> upper relation
>>
>
> The code in the patch doesn't seem to add Var nodes explicitly. It assumes
> that
> the Var nodes will be part of GROUP BY clause. The code is correct, I
> think.
>

Yes. Code is correct. Var nodes are already part of GROUP BY else we hit
error well before this point.

Thanks Ashutosh for the detailed review comments.

I am working on it and will post updated patch once I fix all your concerns.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

[HACKERS] [PATCH] Tab completion for ALTER TYPE … RENAME VALUE …

2016-09-12 Thread Dagfinn Ilmari Mannsåker

Hi hackers,

Here's a patch to add psql tab completion for the recently-added ALTER
TYPE … RENAME VALUE feature (thanks to Tom for fixing it up and
committing it).

It's modelled on the ALTER TYPE … RENAME ATTRIBUTE completion, but
tweaked to return string literals instead of identifiers.

- ilmari

>From f4fa474262e6e65f02095f9de09205bff7ea2a1d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 12 Sep 2016 12:17:37 +0100
Subject: [PATCH] =?UTF-8?q?Add=20psql=20tab=20completion=20for=20ALTER=20T?=
 =?UTF-8?q?YPE=20=E2=80=A6=20RENAME=20VALUE?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Modelled on the completion for attributes, tweaked to return string
literals intead of identifiers.
---
 src/bin/psql/tab-complete.c | 57 +++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 3e2f084..40790c9 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -202,6 +202,31 @@ do { \
 	matches = completion_matches(text, complete_from_query); \
 } while (0)
 
+#define COMPLETE_WITH_ENUM_VALUE(type) \
+do { \
+	char   *_completion_schema; \
+	char   *_completion_type; \
+\
+	_completion_schema = strtokx(type, " \t\n\r", ".", "\"", 0, \
+ false, false, pset.encoding); \
+	(void) strtokx(NULL, " \t\n\r", ".", "\"", 0, \
+   false, false, pset.encoding); \
+	_completion_type = strtokx(NULL, " \t\n\r", ".", "\"", 0, \
+			   false, false, pset.encoding);  \
+	if (_completion_type == NULL)\
+	{ \
+		completion_charp = Query_for_list_of_enum_values; \
+		completion_info_charp = type; \
+	} \
+	else \
+	{ \
+		completion_charp = Query_for_list_of_enum_values_with_schema; \
+		completion_info_charp = _completion_type; \
+		completion_info_charp2 = _completion_schema; \
+	} \
+	matches = completion_matches(text, complete_from_query); \
+} while (0)
+
 #define COMPLETE_WITH_FUNCTION_ARG(function) \
 do { \
 	char   *_completion_schema; \
@@ -598,6 +623,26 @@ static const SchemaQuery Query_for_list_of_matviews = {
 "   AND (pg_catalog.quote_ident(nspname)='%s' "\
 "OR '\"' || nspname || '\"' ='%s') "
 
+#define Query_for_list_of_enum_values \
+"SELECT pg_catalog.quote_literal(enumlabel) "\
+"  FROM pg_catalog.pg_enum e, pg_catalog.pg_type t "\
+" WHERE t.oid = e.enumtypid "\
+"   AND substring(pg_catalog.quote_literal(enumlabel),1,%d)='%s' "\
+"   AND (pg_catalog.quote_ident(typname)='%s' "\
+"OR '\"' || typname || '\"'='%s') "\
+"   AND pg_catalog.pg_type_is_visible(t.oid)"
+
+#define Query_for_list_of_enum_values_with_schema \
+"SELECT pg_catalog.quote_literal(enumlabel) "\
+"  FROM pg_catalog.pg_enum e, pg_catalog.pg_type t, pg_catalog.pg_namespace n "\
+" WHERE t.oid = e.enumtypid "\
+"   AND n.oid = t.typnamespace "\
+"   AND substring(pg_catalog.quote_literal(enumlabel),1,%d)='%s' "\
+"   AND (pg_catalog.quote_ident(typname)='%s' "\
+"OR '\"' || typname || '\"'='%s') "\
+"   AND (pg_catalog.quote_ident(nspname)='%s' "\
+"OR '\"' || nspname || '\"' ='%s') "
+
 #define Query_for_list_of_template_databases \
 "SELECT pg_catalog.quote_ident(d.datname) "\
 "  FROM pg_catalog.pg_database d "\
@@ -1873,11 +1918,13 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_LIST2("ATTRIBUTE", "VALUE");
 	/* ALTER TYPE  RENAME	*/
 	else if (Matches4("ALTER", "TYPE", MatchAny, "RENAME"))
-		COMPLETE_WITH_LIST2("ATTRIBUTE", "TO");
+		COMPLETE_WITH_LIST3("ATTRIBUTE", "TO", "VALUE");
 	/* ALTER TYPE xxx RENAME ATTRIBUTE yyy */
 	else if (Matches6("ALTER", "TYPE", MatchAny, "RENAME", "ATTRIBUTE", MatchAny))
 		COMPLETE_WITH_CONST("TO");
-
+	/* ALTER TYPE xxx RENAME VALUE yyy */
+	else if (Matches6("ALTER", "TYPE", MatchAny, "RENAME", "VALUE", MatchAny))
+		COMPLETE_WITH_CONST("TO");
 	/*
 	 * If we have ALTER TYPE  ALTER/DROP/RENAME ATTRIBUTE, provide list
 	 * of attributes
@@ -1897,6 +1944,12 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches5("ALTER", "GROUP", MatchAny, "ADD|DROP", "USER"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_roles);
 
+	/*
+	 * If we have ALTER TYPE  RENAME VALUE, provide list of enum values
+	 */
+	else if (Matches5("ALTER", "TYPE", MatchAny, "RENAME", "VALUE"))
+		COMPLETE_WITH_ENUM_VALUE(prev3_wd);
+
 /* BEGIN */
 	else if (Matches1("BEGIN"))
 		COMPLETE_WITH_LIST6("WORK", "TRANSACTION", "ISOLATION LEVEL", "READ", "DEFERRABLE", "NOT DEFERRABLE");
-- 
2.9.3


-- 
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.   - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.  - Calle Dybedahl

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Refactoring of heapam code.

2016-09-12 Thread Pavan Deolasee

On Tue, Sep 6, 2016 at 8:39 PM, Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

> 06.09.2016 07:44, Pavan Deolasee:
>
> 2. I don't understand the difference between PageGetItemHeapHeaderOnly()
> and PageGetItemHeap(). They seem to do exactly the same thing and can be
> used interchangeably.
>
>
> The only difference between these two macros is that
> PageGetItemHeapHeaderOnly() doesn't touch any key fields of a tuple,
> it only checks header fields (see HeapTupleHeaderData). I divided it into
> two separate functions, while I was working on page compression and
> I wanted to avoid unnecessary decompression calls. These names are
> just a kind of 'markers' to make the code reading and improving easier.
>
>
Ok. I still don't get it, but that's probably because I haven't seen a real
use case of that. Right now, both macros look exactly the same.


> 3. While I like the idea of using separate interfaces to get heap/index
> tuple from a page, may be they can internally use a common interface
> instead of duplicating what PageGetItem() does already.
>
>
> I don't sure I get it right. Do you suggest to leave PageGetItem and write
> PageGetItemHeap() and PageGetItemIndex() as wrappers on it?
> It sounds reasonable while we have similar layout for both heap and index
> pages.
> In any case, it'll be easy to separate them when necessary.
>
>
Yes, that's what I was thinking.


> 4. Should we rename PageGetItemHeap() to PageGetHeapTuple() or something
> like that?
>
>
> I don't feel like doing that, because HeapTuple is a different structure.
> What we do get from page is a HeapTupleHeaderData structure
> followed by user's data.
>

Ok, makes sense.


>
>
> I also looked at the refactoring design doc. Looks like a fine approach to
> me, but the API will need much more elaborate discussions. I am not sure if
> the interfaces as presented are enough to support everything that even
> heapam can do today.
>
>
> What features of heapam do you think could be unsupportable in this API?
> Maybe I've just missed them.
>

I was thinking about locking, bulk reading (like page-mode API) etc. While
you've added an API for vacuuming, we would probably also need an API to
collect dead tuples, pruning etc (not sure if that can be combined with
vacuum). Of course, these are probably very specific to current
implementation of heap/MVCC and not all storages will need that.


>
> I suggest refactoring, that will allow us to develop new heap-like access
> methods.
> For the first version, they must have methods to "heapify" tuple i.e turn
> internal
> representation of the data to regular HeapTuple, for example put some
> stubs into
> MVCC fields. Of course this approach has its disadvantages, such as
> performance issues.
> It definitely won't be enough to write column storage or to implement
> other great
> data structures. But it allows us not to depend of the Executor's code.
>
>
Ok, understood.


>
> - There are many upper modules that need access to system attributes
> (xmin, xmax for starters). How do you plan to handle that? You think we can
> provide enough abstraction so that the callers don't need to know the tuple
> structures of individual PAM?
>
>
> To be honest, I didn't thought about it.
> Do you mean external modules or upper levels of abstraction in Postgres?
>

I meant upper levels of abstraction like the executor. For example, while
inserting a new tuple, the executor (the index AM's insert routine to be
precise) may need to wait for another transaction to finish. Today, it can
easily get that information by looking at the xmax of the conflicting
tuple. How would we handle that with abstraction since not every PAM will
have a notion of xmax?

Thanks,
Pavan

 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] asynchronous and vectorized execution

2016-09-12 Thread Kyotaro HORIGUCHI

Hello,

At Thu, 01 Sep 2016 16:12:31 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20160901.161231.110068639.horiguchi.kyot...@lab.ntt.co.jp>
> There's perfomance degradation for non-asynchronous nodes, as
> shown as 't0' below.
> 
> The patch adds two "if-then" and one additional function call as
> asynchronous stuff into ExecProcnode, which is heavily passed and
> foremerly consists only five meaningful lines. The stuff slows
> performance by about 1% for simple seqscan case. The following is
> the performance numbers previously shown upthread.  (Or the
> difference might be too small to get meaningful performance
> difference..)

I tried __builtin_expect before moving the stuff out of
execProcNode. (attached patch) I found a conversation about the
pragma in past discussion.

https://www.postgresql.org/message-id/CA+TgmoYknejCgWMb8Tg63qA67JoUG2uCc0DZc5mm9=e18am...@mail.gmail.com

> If we can show cases where it reliably produces a significant
> speedup, then I would think it would be worthwhile

I got a result as the followings.

master(67e1e2a)-O2
  time(ms)  stddev(ms)
  t0: 3928.22 (  0.40)   # Simple SeqScan only
  pl: 1665.14 (  0.53)   # Append(SeqScan)

Patched-O2 / NOT Use __builtin_expect
  t0: 4042.69 (  0.92)degradation to master is 2.9%
  pl: 1698.46 (  0.44)degradation to master is 2.0%

Patched-O2 / Use __builtin_expect
  t0: 3886.69 (  1.93)*gain* to master is 1.06%
  pl: 1671.66 (  0.67)degradation to master is 0.39%

I haven't directly seen the pragmra's implication for
optimization on surrounding code but I suspect there's some
implication. I also tried the pragma to ExecAppend but no
difference seen. The numbers flucture easily by any changes in
the machine's state so the lower digits aren't trustworthy but
several succeeding repetitions showed fluctuations up to some
milliseconds.

execProcNode will be allowed to be as it is if __builtin_expect
is usable but ExecAppend still needs an improvement.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From f1f02557f7f4d694f0e3b4d62a6bdfad8e746b03 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Mon, 12 Sep 2016 16:36:37 +0900
Subject: [PATCH] Use __builtin_expect to optimize branches

__builtin_expect can minimize the cost of failure of branch prediction
for certain cases. It seems to work very fine here.
---
 src/backend/executor/execProcnode.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index cef262b..c247fa3 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -585,13 +585,22 @@ ExecExecuteNode(PlanState *node)
  *  execution subtree and every subtree should have an individual context.
  *  
  */
+#define MY_USE_LIKELY
+#if defined MY_USE_LIKELY
+#define my_likely(x) __builtin_expect((x),1)
+#define my_unlikely(x) __builtin_expect((x),0)
+#else
+#define my_likely(x) (x)
+#define my_unlikely(x) (x)
+#endif
+
 TupleTableSlot *
 ExecProcNode(PlanState *node)
 {
 	CHECK_FOR_INTERRUPTS();
 
 	/* Return unconsumed result if any */
-	if (node->result_ready)
+	if (my_unlikely(node->result_ready))
 		return ExecConsumeResult(node);
 
 	if (node->chgParam != NULL) /* something changed */
@@ -599,7 +608,7 @@ ExecProcNode(PlanState *node)
 
 	ExecDispatchNode(node);
 
-	if (!node->result_ready)
+	if (my_unlikely(!node->result_ready))
 		ExecAsyncWaitForNode(node);
 
 	return ExecConsumeResult(node);
-- 
2.9.2


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-12 Thread Michael Paquier

On Mon, Sep 12, 2016 at 4:59 PM, Simon Riggs  wrote:
> On 12 September 2016 at 08:28, Michael Paquier
>  wrote:
>> On Mon, Sep 12, 2016 at 4:19 PM, Simon Riggs  wrote:
>>> On 12 September 2016 at 03:27, Michael Paquier
>>>  wrote:
>>>
 So I'd propose the attached for 9.6 and HEAD.
>>>
>>> The $OP commit was against HEAD only, not against 9.6
>>>
>>> Why would we need to backpatch this commit?
>>
>> You are right there is no need to get that into 9.6. Sorry for the mistake.
>
> Committed to HEAD only.

Thanks. Using walsender here is fine for me.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CVE-2016-1238 fix breaks (at least) pg_rewind tests

2016-09-12 Thread Michael Paquier

On Fri, Sep 9, 2016 at 6:49 AM, Andres Freund  wrote:
> On 2016-09-08 18:13:06 -0300, Alvaro Herrera wrote:
>> > I can't vouch for the windows stuff, and
>> > the invocations indeed look vulnerable. I'm not sure if the fix actually
>> > matters on windows, given . is the default for pretty much everything
>> > there.
>>
>> Well, maybe it doesn't matter now but as I understand the fix is going
>> to enter the next stable upstream perl, so it'll fail eventually.  It'd
>> be saner to just fix the thing completely so that we can forget about
>> it.
>
> Yea, it'd need input from somebody on windows. Michael? What happens if
> you put a line remove . from INC (like upthread) in the msvc stuff?

I haven't tested that directly because I am not sure how to enforce
INC when running the prove command through system(), and there is no
point to use pop on @INC directly in vcregress.pl as that would just
be ignored in the system() call. But to be short: this will blow up as
@INC is part of the default per perl -V if one uses active perl. So it
looks fair to me to append the local source directory as well to what
is included. You can actually do that by just adding $dir to
$ENV{PERL5LIB} in vcregress.pl and that would be fine.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Typo in filename identification

2016-09-12 Thread Simon Riggs

On 11 September 2016 at 23:56, Daniel Gustafsson  wrote:
> The IDENTIFICATION filename in src/backend/storage/ipc/dsm_impl.c is
> incorrectly labelling the file dsm.c.  Patch fixing the typo attached.
>
> cheers ./daniel

Applied, thanks.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-12 Thread Simon Riggs

On 12 September 2016 at 08:28, Michael Paquier
 wrote:
> On Mon, Sep 12, 2016 at 4:19 PM, Simon Riggs  wrote:
>> On 12 September 2016 at 03:27, Michael Paquier
>>  wrote:
>>
>>> So I'd propose the attached for 9.6 and HEAD.
>>
>> The $OP commit was against HEAD only, not against 9.6
>>
>> Why would we need to backpatch this commit?
>
> You are right there is no need to get that into 9.6. Sorry for the mistake.

Committed to HEAD only.


[No longer a 9.6 issue.]

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench - allow to store select results into variables

2016-09-12 Thread Amit Langote

Hi Fabien,

On 2016/09/07 23:01, Fabien COELHO wrote:
>> Custom script looks like:
>>
>> \;
>> select a \into a
>>from tab where a = 1;
>> \set i debug(:a)
>>
>> I get the following error:
>>
>> undefined variable "a"
>> client 0 aborted in state 1; execution of meta-command failed
> 
> Good catch!
> 
> Indeed, there is a problem with empty commands which are simply ignored by
> libpq/postgres if there are other commands around, so my synchronization
> between results & commands was too naive.
> 
> In order to fix this, I made the scanner also count empty commands and
> ignore comments. I guessed that proposing to change libpq/postgres
> behavior was not an option.

Seems to be fixed.

>> Comments on the patch follow:
>>
>> +
>> + 
>> +  Stores the first fields of the resulting row from the current or
>> preceding
>> +  SELECT command into these variables.
>>
>> Any command returning rows ought to work, no?
> 
> Yes. I put "SQL command" instead.

Check.

>> -char   *line;   /* text of command line */
>> +char   *line;   /* first line for short display */
>> +char   *lines;  /* full multi-line text of command */
>>
>> When I looked at this and related hunks (and also the hunks related to
>> semicolons), it made me wonder whether this patch contains two logical
>> changes.  Is this a just a refactoring for the \into implementation or
>> does this provide some new externally visible feature?
> 
> There is essentially a refactoring that I did when updating how Command is
> managed because it has to be done in several stages to fit "into" into it
> and to take care of compounds.
> 
> However there was small "new externally visible feature": on -r, instead
> of cutting abruptly a multiline command at the end of the first line it
> appended "..." as an ellipsis because it looked nicer.
> I've removed this small visible changed.

There still seems to be a change in behavior of the -r option due to the
patch. Consider the following example:

# no \into used in the script
$ cat /tmp/into.sql
select a from a where a = 1 \;
select a+1 from a where a = 1;
\set a 1
\set b 2
\set i debug(:a)
\set i debug(:b)

$ pgbench -r -n -t 1 -f /tmp/into.sql postgres

 - statement latencies in milliseconds:
 2.889  select a from a where a = 1 ;
 0.012  \set a 1
 0.009  \set b 2
 0.031  \set i debug(:a)
 0.014  \set i debug(:b)

# behavior wrt compound statement changes when \into is used
$ cat /tmp/into.sql
select a from a where a = 1 \; \into a
select a+1 from a where a = 1; \into b
\set i debug(:a)
\set i debug(:b)

$ pgbench -r -n -t 1 -f /tmp/into.sql postgres

 - statement latencies in milliseconds:
 2.093  select a from a where a = 1 ; select a+1 from a where a = 1;
 0.034  \set i debug(:a)
 0.015  \set i debug(:b)

One more thing I observed which I am not sure if it's a fault of this
patch is illustrated below:

$ cat /tmp/into.sql
\;
select a from a where a = 1 \;
select a+1 from a where a = 1;
\set a 1
\set b 2
\set i debug(:a)
\set i debug(:b)

$ pgbench -r -n -t 1 -f /tmp/into.sql postgres

 - statement latencies in milliseconds:
 2.349  ;
 0.013  \set a 1
 0.009  \set b 2
 0.029  \set i debug(:a)
 0.015  \set i debug(:b)

Note that the compound select statement is nowhere to be seen in the
latencies output. The output remains the same even if I use the \into's.
What seems to be going on is that the empty statement on the first line
(\;) is the only part kept of the compound statement spanning lines 1-3.

>> Also I wonder if intonames or intoargs would be a slightly better name
>> for the field.
> 
> I put "intovars" as they are variable names.

Sounds fine.

>> +fprintf(stderr,
>> +"client %d state %d compound %d: "
>> +"cannot apply \\into to non SELECT statement\n",
>> +st->id, st->state, compound);
>>
>> How about make this error message say something like other messages
>> related to \into, perhaps something like: "\\into cannot follow non SELECT
>> statements\n"
> 
> As you pointed out above, there may be statements without "SELECT" which
> which return a row. I wrote "\\into expects a row" instead.

Sounds fine.

> 
>> /*
>>  * Read and discard the query result; note this is not
>> included in
>> - * the statement latency numbers.
>> + * the statement latency numbers (above), thus if reading the
>> + * response fails the transaction is counted nevertheless.
>>  */
>>
>> Does this comment need to mention that the result is not discarded when
>> \into is specified?
> 
> Hmmm. The result structure is discarded, but the results are copied into
> variables before that, so the comments seems ok...

Hmm, OK.

>> +boolsql_command_in_progress = false;
>>
>> This variable's name could be

Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-12 Thread Michael Paquier

On Mon, Sep 12, 2016 at 4:19 PM, Simon Riggs  wrote:
> On 12 September 2016 at 03:27, Michael Paquier
>  wrote:
>
>> So I'd propose the attached for 9.6 and HEAD.
>
> The $OP commit was against HEAD only, not against 9.6
>
> Why would we need to backpatch this commit?

You are right there is no need to get that into 9.6. Sorry for the mistake.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

2016-09-12 Thread Simon Riggs

On 12 September 2016 at 03:27, Michael Paquier
 wrote:

> So I'd propose the attached for 9.6 and HEAD.

The $OP commit was against HEAD only, not against 9.6

Why would we need to backpatch this commit?

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS]

2016-09-12 Thread Pavel Stehule

2016-09-12 9:07 GMT+02:00 Craig Ringer :

> On 12 September 2016 at 14:29, Pavel Stehule 
> wrote:
>
> >> I would've expected once per query. There's no way the expressions can
> >> reference the row data, so there's no reason to evaluate them each
> >> time.
> >
> > I disagree - it is hypothetical situation but it is possible
> >
> > if somebody store documents like
> >
> > id, xml
> > =
> > id = 1, xml =  <>
> > id = 2, xml =  
> >
> > Then evaluating one per query doesn't allow to use any reference to other
> > columns, and doesn't to build expressions like PATH (...[@id= ' || id ||
> ']
>
> Referencing columns on the same evaluation level? I dunno about that.
> You're relying on strict order of evaluation which is pretty unusual
> for SQL.
>
> I guess this is why full XQuery would be desirable, but that's a whole
> different business.
>
> I would personally expect this sort of thing to be handled by a second
> pass; isn't that part of why it's so easy to return xml fields from
> xmltable?
>
> Evaluating expressions each time seems likely to be bad for
> performance, but I guess it's not going to make a big difference
> compared to all the XML crud, so I don't have a super strong opinion
> here.
>

When expression will a constant, then the cost will be minimal - more, we
can do preevaluation in parser/transform time, and if expression is some
constant, then we should not to evaluate it later.

We can wait if some other people will have a opinion to this topic. This is
important topic, but it is not to hard implement both variants, and more -
this is corner case - it is not important for any example that I found on
net.

Regards

Pavel



>
> Either way, it's crucial that the behaviour be documented.
>
> > DEFAULT should be evaluated per output row - anybody can use volatile
> > function there - example: when I have not data - use some random there
>
> That would be consistent with how we handle DEFAULT on a table, so I
> agree. It's a departure from what we do normally, but we didn't have
> table functions before either.
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>

Re: [HACKERS]

2016-09-12 Thread Craig Ringer

On 12 September 2016 at 14:29, Pavel Stehule  wrote:

>> I would've expected once per query. There's no way the expressions can
>> reference the row data, so there's no reason to evaluate them each
>> time.
>
> I disagree - it is hypothetical situation but it is possible
>
> if somebody store documents like
>
> id, xml
> =
> id = 1, xml =  <>
> id = 2, xml =  
>
> Then evaluating one per query doesn't allow to use any reference to other
> columns, and doesn't to build expressions like PATH (...[@id= ' || id || ']

Referencing columns on the same evaluation level? I dunno about that.
You're relying on strict order of evaluation which is pretty unusual
for SQL.

I guess this is why full XQuery would be desirable, but that's a whole
different business.

I would personally expect this sort of thing to be handled by a second
pass; isn't that part of why it's so easy to return xml fields from
xmltable?

Evaluating expressions each time seems likely to be bad for
performance, but I guess it's not going to make a big difference
compared to all the XML crud, so I don't have a super strong opinion
here.

Either way, it's crucial that the behaviour be documented.

> DEFAULT should be evaluated per output row - anybody can use volatile
> function there - example: when I have not data - use some random there

That would be consistent with how we handle DEFAULT on a table, so I
agree. It's a departure from what we do normally, but we didn't have
table functions before either.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

95 matches

Mail list logo