Re: [HACKERS] logical replication deranged sender

2017-05-10 Thread Michael Paquier
On Wed, May 10, 2017 at 3:06 AM, Petr Jelinek
 wrote:
> Okay, then it's the same issue Masahiko Sawada reported in a nearby
> thread, or at least it has the same cause.

This report is here:
https://www.postgresql.org/message-id/CAD21AoBYpyqTSw+=ES+xXtRGMPKh=pkiqjnxzknnuae0pst...@mail.gmail.com
It may be useful to keep all discussions on the other thread as that's
the first report.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-10 Thread Michael Paquier
On Thu, May 11, 2017 at 11:56 AM, Tom Lane  wrote:
> It would be hard to reject at the grammar level, and not very friendly
> either because you'd only get "syntax error".  We could certainly make
> the runtime code throw an error if you gave a column list without saying
> ANALYZE.  But on the other hand, why bother?  I do not remember ever
> seeing a question that boiled down to somebody being confused by this.

The docs also say that adding a column list implies an ANALYZE even if
other keywords are added, and that's the case. Sorry for the noise.
-- 
Michael




Re: [HACKERS] Duplicate usage of tablespace location?

2017-05-10 Thread Kyotaro HORIGUCHI
Hello,

At Fri, 5 May 2017 21:42:47 +1000, Neha Khatri  wrote in 

> As Kyotaro-san pointed out, commit 22817041 started allowing creation
> of multiple "tablespace version directories" in the same location. However, the
> original purpose of that commit was to allow that just for upgrade
> purposes. So, a couple of points:
> - The commit violated the requirement that the tablespace
> location directory be empty.
> (Though it is still prevented to create multiple tablespaces belonging
> to one server in the same location.)
> - The commit did not document this change in the specification.
> 
> Probably it was not anticipated at that time that a user could create
> tablespaces for different server versions at the same location.
> 
> Now that this behaviour has been present in the field for a while, there is a
> likelihood of systems having tablespaces for two different versions
> in the same location. To avoid the problem reported in [1] for such systems,
> here are a couple of alternative approaches:
> 
> 1. Allow creation of multiple tablespaces in a single location for different
> server versions, but not for the same version (raise an error for that).
> a) Also allow this capability in utilities like pg_basebackup (and others
> that update tablespaces).
> b) Update the documentation about this specification change.
> 
> I don't see this breaking any backwards compatibility.

Yeah, it is just a clarification of the behavior in the
documentation. The current behavior is somewhat inconsistent, but
practical.

> 2. Retain the current base rule for creating tablespaces, i.e. "The location
> must be an existing, empty directory". This means:
> a) For a future release, have a strict directory emptiness check while
> creating a new tablespace.
> b) Only during upgrade, allow creation of multiple tablespaces in the same
> location.
> c) Document the fact that only during upgrade will the system create
> multiple tablespaces in the same location.

Honestly speaking, I think it adds nothing good other than apparent
consistency. (Even though I sent such a patch :p)

> d) Provide the flexibility to change the location of an existing tablespace,
> something like:
> ALTER TABLESPACE tblspcname SET LOCATION '/path/to/newlocation'
> [where newlocation is an existing empty directory]
> 
> With the altered location of a tablespace it should be possible to perform
> the pg_basebackup successfully.

If we can accept that multiple server versions share a tablespace
directory, pg_basebackup can also allow that situation. The
attached patch does that. Similarly to the server code, it
correctly fails if the same version's subdirectory already exists.

$ pg_basebackup -D $PGDATA -h /tmp -p 5432 -X stream -T 
/home/horiguti/data/tsp1=/home/horiguti/data/tsp2
pg_basebackup: could not create directory 
"/home/horiguti/data/tsp2/PG_10_201705091": File exists
pg_basebackup: removing contents of data directory 
"/home/horiguti/data/data_work_s"
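The server-side behavior being mirrored here — each major version creates its own PG_<version>_<catversion> subdirectory inside the tablespace location, and creation fails only when that exact subdirectory already exists — can be sketched as a toy model (the 9.6 directory name below is illustrative, not the real catalog version):

```python
import os
import tempfile

def create_version_dir(location, version_dir):
    """Create one server version's subdirectory inside a tablespace
    location; os.mkdir raises FileExistsError ("File exists") if the
    same version's subdirectory is already there."""
    path = os.path.join(location, version_dir)
    os.mkdir(path)
    return path

loc = tempfile.mkdtemp()
create_version_dir(loc, "PG_9.6_000000001")     # one server version (name illustrative)
create_version_dir(loc, "PG_10_201705091")      # a different version: allowed
try:
    create_version_dir(loc, "PG_10_201705091")  # same version again: refused
    raise AssertionError("expected FileExistsError")
except FileExistsError:
    pass
```

This matches the pg_basebackup error above: the mapping target is rejected only when the destination already holds a subdirectory for the same server version.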

> I noticed some solutions for moving PostgreSQL tablespaces on the internet.
> But some are slow, and others cause incompatibility for tools like pgAdmin. I
> am not able to find any discussion about moving tablespace locations in the
> mailing lists either. So I am not sure if there is already any conclusion
> about supporting or not supporting ALTER TABLESPACE LOCATION.
> To me, the first approach above looks like providing more independence to
> the user in the choice of tablespace location. Also, it is not clear
> why the directory emptiness rule was introduced in the first place. Any insight
> on that will be useful.

Originally (before 9.0), files in a tablespace were placed directly
in the "location", and that was reasonable at the time.

> Regards,
> Neha
> 
> [1]https://www.postgresql.org/message-id/2008148.rxBNyNRHPZ@peanuts2
> 
> Cheers,
> Neha
> 
> On Fri, Apr 7, 2017 at 11:02 AM, Kyotaro HORIGUCHI <
> horiguchi.kyot...@lab.ntt.co.jp> wrote:
> 
> > I don't mean that this is the only or best way to go.
> >
> > I apologize for the possible lack of explanation.
> >
> > At Thu, 06 Apr 2017 12:03:51 -0400, Tom Lane  wrote in
> > <21084.1491494...@sss.pgh.pa.us>
> > > Kyotaro HORIGUCHI  writes:
> > > > I noticed by the following report, PostgreSQL can share the same
> > > > directory as tablespaces of two servers with different
> > > > pg-versions.
> > >
> > > > https://www.postgresql.org/message-id/2008148.rxBNyNRHPZ@peanuts2
> > >
> > > > 8.4 checked that the tablespace location is empty, but from 9.0,
> > > > the check is replaced with creating a PG_PGVER_CATVER
> > > > subdirectory. This works for multiple servers with the same
> > > > version, but not for servers with different versions.
> > >
> > > Please explain why you think it doesn't work.  This patch seems to
> > > be reverting an intentional behavioral change, and you haven't
> >
> > I understand that the change is for in-place upgrade, not for
> > sharing a 

Re: [HACKERS] SCRAM in the PG 10 release notes

2017-05-10 Thread Michael Paquier
On Thu, May 11, 2017 at 11:50 AM, Bruce Momjian  wrote:
> I have added this as an open item because we will have to wait to see
> where we are with driver support as the release gets closer.

As Postgres ODBC now has a hard dependency on libpq, no action is
needed there. At least that makes one driver worth mentioning.
-- 
Michael




Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-10 Thread Bossart, Nathan
On 5/10/17, 8:10 PM, "Masahiko Sawada"  wrote:
> I agree with reporting per-table information. Especially in the case where
> one of the specified tables fails during vacuuming, I think we should report at
> least the information for the tables that were processed successfully so far.

+1

I believe you already get all per-table information when you do not specify a 
table name (“VACUUM VERBOSE;”), so I would expect to get this for free as long 
as we are building this into the existing ExecVacuum(…) and vacuum(…) functions.

Nathan




Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-10 Thread Masahiko Sawada
On Thu, May 11, 2017 at 11:13 AM, Michael Paquier
 wrote:
> On Thu, May 11, 2017 at 6:40 AM, Tom Lane  wrote:
>> "Bossart, Nathan"  writes:
>>> Currently, VACUUM commands allow you to specify one table or all of the 
>>> tables in the current database to vacuum.  I’ve recently found myself 
>>> wishing I could specify multiple tables in a single VACUUM statement.  For 
>>> example, this would be convenient when there are several large tables in a 
>>> database and only a few need cleanup for XID purposes.  Is this a feature 
>>> that the community might be interested in?
>>
>> I'm a bit surprised to realize that we don't allow that, since the
>> underlying code certainly can do it.
>>
>> You realize of course that ANALYZE should grow this capability as well.
>
> Yup. It is just a matter of extending ExecVacuum() to handle a list of
> qualified names, with a quick look at the grammar, as we are talking
> only about manual commands. One question I am wondering about, though, is
> whether we want everything to happen in the same transaction? I would
> say yes, to simplify the code. I think that VERBOSE should also
> report the per-table information; this can be noisy with many
> tables, but that's more helpful than aggregating all the results.

I agree with reporting per-table information. Especially in the case where
one of the specified tables fails during vacuuming, I think we should report at
least the information for the tables that were processed successfully so far.

>
>>> I’ve attached my first attempt at introducing this functionality.  In the 
>>> patch, I’ve extended the table_name parameter in the VACUUM grammar to a 
>>> qualified_name_list.  While this fits into the grammar decently well, I 
>>> suspect that it may be desirable to be able to specify a column list for 
>>> each table as well (e.g. VACUUM foo (a), bar (b)).
>>
>> The column list only matters for ANALYZE (or VACUUM ANALYZE).  But yes,
>> it should be per-table.
>
> The grammar allows that by the way:
> =# VACUUM (full) aa (a);
> VACUUM
> Perhaps that's an oversight? I don't think it makes much sense.
> --
> Michael

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-10 Thread Tom Lane
Michael Paquier  writes:
> On Thu, May 11, 2017 at 6:40 AM, Tom Lane  wrote:
>> The column list only matters for ANALYZE (or VACUUM ANALYZE).  But yes,
>> it should be per-table.

> The grammar allows that by the way:
> =# VACUUM (full) aa (a);
> VACUUM
> Perhaps that's an oversight? I don't think it makes much sense.

It would be hard to reject at the grammar level, and not very friendly
either because you'd only get "syntax error".  We could certainly make
the runtime code throw an error if you gave a column list without saying
ANALYZE.  But on the other hand, why bother?  I do not remember ever
seeing a question that boiled down to somebody being confused by this.

regards, tom lane




Re: [HACKERS] SCRAM in the PG 10 release notes

2017-05-10 Thread Bruce Momjian
On Mon, May  1, 2017 at 08:12:51AM -0400, Robert Haas wrote:
> On Tue, Apr 25, 2017 at 10:16 PM, Bruce Momjian  wrote:
> > Well, we could add "MD5 users are encouraged to switch to
> > SCRAM-SHA-256".  Now whether we want to list this as something on the
> > SCRAM-SHA-256 description, or mention it as an incompatibility, or
> > under Migration.  I am not clear that MD5 is in such terrible shape that
> > this is warranted.
> 
> I think it's warranted.  The continuing use of MD5 has been a headache
> for some EnterpriseDB customers who have compliance requirements which
> they must meet.  It's not that they themselves necessarily know or
> care whether MD5 is secure, although in some cases they do; it's that
> if they use it, they will be breaking laws or regulations to which
> their business or agency is subject.  I imagine customers of other
> PostgreSQL companies have similar issues.  But leaving that aside, the
> advantage of SCRAM isn't merely that it uses a better algorithm to
> hash the password.  It has other advantages also, like not being
> vulnerable to replay attacks.  If you're doing password
> authentication, you should really be using SCRAM, and encouraging
> people to move to SCRAM after upgrading is a good idea.
> 
> That having been said, SCRAM is a wire protocol break.  You will not
> be able to upgrade to SCRAM unless and until the drivers you use to
> connect to the database add support for it.  The only such driver
> that's part of libpq; other drivers that have reimplemented the
> PostgreSQL wire protocol will have to be updated with SCRAM support
> before it will be possible to use SCRAM with those drivers.  I think
> this should be mentioned in the release notes, too.  I also think it
> would be great if somebody would put together a wiki page listing all
> the popular drivers and (1) whether they use libpq or reimplement the
> wire protocol, and (2) if the latter, the status of any efforts to
> implement SCRAM, and (3) if those efforts have been completed, the
> version from which they support SCRAM.  Then, I think we should reach
> out to all of the maintainers of those drivers who aren't
> moving to support SCRAM and encourage them to do so.

I have added this as an open item because we will have to wait to see
where we are with driver support as the release gets closer.
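The replay-resistance point made above can be illustrated with a minimal challenge-response sketch. This is NOT the actual SCRAM-SHA-256 exchange (which involves client and server nonces, channel binding, and stored/server keys rather than the salted password itself); it only shows the core idea that a proof bound to a fresh server nonce cannot be replayed in a later session:

```python
import hashlib
import hmac
import os

def client_proof(password, salt, nonce):
    # Derive a salted password, then prove knowledge of it bound to this nonce.
    salted = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 4096)
    return hmac.new(salted, nonce, hashlib.sha256).digest()

def server_verify(stored_salted, nonce, proof):
    # Server recomputes the expected proof for the nonce it just issued.
    expected = hmac.new(stored_salted, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, proof)

salt = os.urandom(16)
stored = hashlib.pbkdf2_hmac("sha256", b"secret", salt, 4096)

nonce1 = os.urandom(16)                      # session 1: fresh nonce
proof1 = client_proof("secret", salt, nonce1)
assert server_verify(stored, nonce1, proof1)      # fresh proof accepted

nonce2 = os.urandom(16)                      # session 2: different nonce
assert not server_verify(stored, nonce2, proof1)  # replayed proof rejected
```

An eavesdropper who captured `proof1` gains nothing, because every new session demands a proof computed over a nonce that did not exist when `proof1` was made — unlike md5 authentication, where a captured hash can be replayed.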

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] multi-column range partition constraint

2017-05-10 Thread Amit Langote
On 2017/05/10 12:08, Robert Haas wrote:
> On Mon, May 8, 2017 at 2:59 AM, Amit Langote
>  wrote:
>> Yes, disallowing this in the first place is the best thing to do.
>> Attached patch 0001 implements that.  With the patch:
> 
> Committed.

Thanks.

> With regard to 0002, some of the resulting constraints are a bit more
> complicated than seems desirable:
> 
> create table foo1 partition of foo for values from (unbounded,
> unbounded, unbounded) to (1, unbounded, unbounded);
> yields:
> Partition constraint: CHECK (((a < 1) OR (a = 1) OR (a = 1)))
> 
> It would be better not to have (a = 1) in there twice, and better
> still to have the whole thing as (a <= 1).
> 
> create table foo2 partition of foo for values from (3, 4, 5) to (6, 7,
> unbounded);
> yields:
> Partition constraint: CHECK ((((a > 3) OR ((a = 3) AND (b > 4)) OR ((a
> = 3) AND (b = 4) AND (c >= 5))) AND ((a < 6) OR ((a = 6) AND (b < 7))
> OR ((a = 6) AND (b = 7)))))
> 
> The first half of that (for the lower bound) is of course fine, but
> the second half could be written better using <=, like instead of
> 
> ((a = 6) AND (b < 7)) OR ((a = 6) AND (b = 7))
> you could have:
> ((a = 6) AND (b <= 7))
> 
> This isn't purely cosmetic because the simpler constraint is probably
> noticeably faster to evaluate.

I think that makes sense.  I modified things so that the simpler
constraint expression you describe above is emitted in the presence of
UNBOUNDED values.
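The equivalence of the two upper-bound forms Robert compares above can be checked with a quick brute-force sketch (plain Python for illustration; the predicate names are invented):

```python
from itertools import product

def verbose_upper(a, b):
    # The longer form currently emitted for "to (6, 7, unbounded)":
    # (a < 6) OR ((a = 6) AND (b < 7)) OR ((a = 6) AND (b = 7))
    return a < 6 or (a == 6 and b < 7) or (a == 6 and b == 7)

def simplified_upper(a, b):
    # The proposed simpler form: (a < 6) OR ((a = 6) AND (b <= 7))
    return a < 6 or (a == 6 and b <= 7)

# The two predicates agree on every point of a small integer grid.
assert all(verbose_upper(a, b) == simplified_upper(a, b)
           for a, b in product(range(0, 12), repeat=2))
```

Since the forms are logically identical, the simplification is safe, and the shorter expression evaluates fewer clauses per row.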

> I think you should have a few test cases like this in the patch - that
> is, cases where some but not all bounds are finite.

Added tests like this in insert.sql and then in the second patch as well.

> 
>> BTW, is it strange that the newly added pg_get_partition_constraintdef()
>> requires the relcache entry to be created for the partition and all of its
>> ancestor relations up to the root (I mean the fact that the relcache entry
>> needs to be created at all)?  I can see only one other function,
>> set_relation_column_names(), creating the relcache entry at all.
> 
> I suggest that you display this information only when "verbose" is set
> - i.e. \d+ not just \d.  I don't intrinsically mind forcing
> the relcache entry to be built here, but note that it means this will
> block when the parent is locked.  Between that and the fact that this
> information will only sometimes be of interest, restricting it to \d+
> seems preferable.

OK, done.

> Next update on this issue by Thursday 5/11.

Attached updated patches.

Thanks,
Amit
>From 17ae801b3f3fa5e89ab9f2332a825382281522ad Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 2 May 2017 11:03:54 +0900
Subject: [PATCH 1/2] Emit "correct" range partition constraint expression

Currently emitted expression does not always describe the constraint
correctly, especially in the case of multi-column range key.

Code surrounding get_qual_for_*() has been rearranged a little to
move common code to a couple of new functions.

Original issue reported by Olaf Gawenda (olaf...@googlemail.com)
---
 src/backend/catalog/partition.c   | 733 --
 src/include/nodes/pg_list.h   |  14 +
 src/test/regress/expected/inherit.out |  90 +
 src/test/regress/expected/insert.out  |  59 +++
 src/test/regress/sql/inherit.sql  |  18 +
 src/test/regress/sql/insert.sql   |  43 ++
 6 files changed, 748 insertions(+), 209 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 8641ae16a2..b05ffd2d90 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -118,10 +118,19 @@ static int32 qsort_partition_list_value_cmp(const void *a, const void *b,
 static int32 qsort_partition_rbound_cmp(const void *a, const void *b,
 		   void *arg);
 
-static List *get_qual_for_list(PartitionKey key, PartitionBoundSpec *spec);
-static List *get_qual_for_range(PartitionKey key, PartitionBoundSpec *spec);
 static Oid get_partition_operator(PartitionKey key, int col,
 	   StrategyNumber strategy, bool *need_relabel);
+static Expr* make_partition_op_expr(PartitionKey key, int keynum,
+	   uint16 strategy, Expr *arg1, Expr *arg2);
+static void get_range_key_properties(PartitionKey key, int keynum,
+		 PartitionRangeDatum *ldatum,
+		 PartitionRangeDatum *udatum,
+		 ListCell **partexprs_item,
+		 Expr **keyCol,
+		 Const **lower_val, Const **upper_val,
+		 NullTest **nulltest);
+static List *get_qual_for_list(PartitionKey key, PartitionBoundSpec *spec);
+static List *get_qual_for_range(PartitionKey key, PartitionBoundSpec *spec);
 static List *generate_partition_qual(Relation rel);
 
 static PartitionRangeBound *make_one_range_bound(PartitionKey key, int index,
@@ -1146,6 +1155,123 @@ RelationGetPartitionDispatchInfo(Relation rel, int lockmode,
 /* Module-local functions */
 
 /*
+ * get_partition_operator
+ *
+ * Return oid of the operator of given strategy for a given 

Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-10 Thread Michael Paquier
On Thu, May 11, 2017 at 6:40 AM, Tom Lane  wrote:
> "Bossart, Nathan"  writes:
>> Currently, VACUUM commands allow you to specify one table or all of the 
>> tables in the current database to vacuum.  I’ve recently found myself 
>> wishing I could specify multiple tables in a single VACUUM statement.  For 
>> example, this would be convenient when there are several large tables in a 
>> database and only a few need cleanup for XID purposes.  Is this a feature 
>> that the community might be interested in?
>
> I'm a bit surprised to realize that we don't allow that, since the
> underlying code certainly can do it.
>
> You realize of course that ANALYZE should grow this capability as well.

Yup. It is just a matter of extending ExecVacuum() to handle a list of
qualified names, with a quick look at the grammar, as we are talking
only about manual commands. One question I am wondering about, though, is
whether we want everything to happen in the same transaction? I would
say yes, to simplify the code. I think that VERBOSE should also
report the per-table information; this can be noisy with many
tables, but that's more helpful than aggregating all the results.

>> I’ve attached my first attempt at introducing this functionality.  In the 
>> patch, I’ve extended the table_name parameter in the VACUUM grammar to a 
>> qualified_name_list.  While this fits into the grammar decently well, I 
>> suspect that it may be desirable to be able to specify a column list for 
>> each table as well (e.g. VACUUM foo (a), bar (b)).
>
> The column list only matters for ANALYZE (or VACUUM ANALYZE).  But yes,
> it should be per-table.

The grammar allows that by the way:
=# VACUUM (full) aa (a);
VACUUM
Perhaps that's an oversight? I don't think it makes much sense.
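A toy model of the direction discussed here — one loop over a list of (table, column-list) targets with per-table reporting, and a report of what succeeded if a later table fails — might look like this (pure illustration; this is not the ExecVacuum()/vacuum() code, and all names are invented):

```python
def vacuum_targets(targets, vacuum_one, verbose=False):
    """Process each (table, columns) target in turn.  If one table fails,
    report the tables that were completed successfully before the failure,
    matching the per-table reporting discussed in this thread."""
    done = []
    for table, columns in targets:
        try:
            stats = vacuum_one(table, columns)
        except Exception as err:
            # Surface partial progress instead of discarding it.
            return {"done": done, "failed": table, "error": str(err)}
        done.append((table, stats))
        if verbose:
            suffix = f" (columns: {columns})" if columns else ""
            print(f"vacuumed {table}{suffix}")
    return {"done": done, "failed": None, "error": None}
```

Whether all targets should share one transaction, as raised above, is orthogonal to this loop structure; the loop only fixes the shape of the per-table reporting.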
-- 
Michael




Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries

2017-05-10 Thread Michael Paquier
On Thu, May 11, 2017 at 1:40 AM, Remi Colinet  wrote:
> This is version 2 of the new command, named PROGRESS, which I wrote in order to
> monitor long-running SQL queries in a Postgres backend process.

It would be a good idea to add this patch to the next commit fest if
you are expecting some feedback:
https://commitfest.postgresql.org/14/
But please don't expect immediate feedback, the primary focus these
days is to stabilize the upcoming release.

> New command justification
> 
>
> [root@rco pg]# git diff --stat master..
> [...]
>  58 files changed, 5972 insertions(+), 2619 deletions(-)

For your first patch, you may want something less... challenging.
Please note as well that it would be nice if you reviewed other patches
to get more familiar with the community process; it is expected that,
for each patch submitted, an equivalent amount of review is done.
That's hard to measure, but for a patch as large as this one you will
need to pick up at least one patch of equivalent difficulty.

Regarding the patch, this is way too close to the progress facility
already in place. So why don't you extend it for the executor? We can
likely come up with something that can help, though I see the part
where the plan run by a backend is shared among all processes as a
major performance bottleneck. You have at least three different
concepts here:
- Store plans run in shared memory to allow access by any other process.
- Allow a process to look at the plan run by another one with a SQL interface.
- Track the progress of a running plan, via pg_stat_activity.
All in all, I think that a new command is not the right fit, and that you
had better split up each concept before implementing something
over-engineered like the patch attached.
-- 
Michael




[HACKERS] Race conditions with WAL sender PID lookups

2017-05-10 Thread Michael Paquier
Hi all,

I had my eyes on the WAL sender code this morning, and I have noticed
that walsender.c is not completely consistent in the PID lookups it
does. In two code paths, the PID value is checked
without holding the WAL sender spinlock (WalSndRqstFileReload and
pg_stat_get_wal_senders), which looks like a very bad idea, contrary to
what the new WalSndWaitStopping() does and what InitWalSenderSlot() has
been doing for ages.

Any thoughts about the patch attached to make things more consistent?
It seems to me that having some safeguards would be nice for
robustness.
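The pattern at issue can be modeled abstractly (Python threading stands in for the C spinlock here; none of this is the walsender.c code itself): reading a slot's PID without taking the slot's lock can observe a value that is concurrently being cleared or reused, whereas the check-under-lock variant always sees a consistent state.

```python
import threading

class WalSndSlot:
    """Toy stand-in for a WAL sender slot: by convention, pid may only
    be read or written while holding mutex (the "spinlock")."""
    def __init__(self):
        self.mutex = threading.Lock()
        self.pid = 0          # 0 means the slot is unused

def slot_is_active_unsafe(slot):
    # Racy: another process may claim or clear the slot between this
    # check and whatever the caller does with the answer.
    return slot.pid != 0

def slot_is_active_safe(slot):
    # Consistent: the check happens under the same lock that writers take,
    # which is what the proposed patch enforces.
    with slot.mutex:
        return slot.pid != 0
```

Both functions return the same answer in a single-threaded test; the difference only shows under concurrency, where the unsafe variant can act on a stale PID.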

That's an old bug, so I am adding that to the next CF.
Thanks,
-- 
Michael


walsnd-pid-races.patch
Description: Binary data



Re: [HACKERS] Removal of plaintext password type references

2017-05-10 Thread Michael Paquier
On Wed, May 10, 2017 at 10:22 PM, Michael Paquier
 wrote:
> On Wed, May 10, 2017 at 10:01 PM, Tom Lane  wrote:
>> Heikki Linnakangas  writes:
>>> Also note that changing the signature check_password_hook_type would
>>> break any external modules that use the hook. Removing
>>> PASSWORD_TYPE_PLAINTEXT will do that too, because any password hook
>>> function would use that constant (see e.g. contrib/passwordcheck). If we
>>> were to change the signature, I'd actually like to simplify it by
>>> removing the password_type parameter altogether. The hook function can
>>> call get_password_type() on the password itself to get the same
>>> information. But it's not worth changing the API and breaking external
>>> modules for that.
>
> Ahah. I just had the same thought before reading this message.

And attached is a patch to do that. I am all for this one, to get a
simpler interface in place.
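The classification a hook can now do for itself, once the password_type parameter is gone, is roughly as follows. This is a simplified sketch, not the backend's actual get_password_type() (the real checks live in the server's C code and may differ in detail):

```python
def get_password_type(shadow_pass):
    """Classify a stored password string by its format (simplified sketch):
    an 'md5' prefix followed by 32 hex digits is an MD5 hash, a
    'SCRAM-SHA-256$' prefix is a SCRAM verifier, anything else is plaintext."""
    if (shadow_pass.startswith("md5") and len(shadow_pass) == 35
            and all(c in "0123456789abcdef" for c in shadow_pass[3:])):
        return "MD5"
    if shadow_pass.startswith("SCRAM-SHA-256$"):
        return "SCRAM-SHA-256"
    return "PLAINTEXT"

assert get_password_type("md5" + "a" * 32) == "MD5"
assert get_password_type("secret") == "PLAINTEXT"
```

A check_password hook built on this idea would inspect the password string itself instead of trusting a caller-supplied type parameter, which is exactly the simplification proposed above.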
-- 
Michael


simplify-passwd-hook.patch
Description: Binary data



Re: [HACKERS] Removal of plaintext password type references

2017-05-10 Thread Vaishnavi Prabakaran
On Wed, May 10, 2017 at 5:58 PM, Heikki Linnakangas  wrote:

> On 05/10/2017 08:01 AM, Michael Paquier wrote:
>
>> On Wed, May 10, 2017 at 10:51 AM, Vaishnavi Prabakaran
>>  wrote:
>>
>>> Following recent removal of support to store password in plain text,
>>> modified the code to
>>>
>>> 1. Remove  "PASSWORD_TYPE_PLAINTEXT" macro
>>> 2. Instead of using "get_password_type" to retrieve the encryption method
>>> AND to check if the password is already encrypted or not, modified the
>>> code
>>> to
>>> a. Use "get_password_encryption_type" function to retrieve encryption
>>> method.
>>> b. Use "isPasswordEncrypted" function to check if the password is already
>>> encrypted or not.
>>>
>>> These changes are mainly to increase code readability and does not change
>>> underlying functionality.
>>>
>>
>> Actually, this patch reduces readability.
>>
>
> Yeah, I don't think this is an improvement. Vaishnavi, did you find the
> current code difficult? Perhaps some extra comments somewhere would help?
>

Thanks for being willing to add extra comments. The current code is not difficult,
but it kind of gives the impression that plaintext password storage
still exists, by keeping the "PASSWORD_TYPE_PLAINTEXT" macro and having
switch-case checks for this macro.

I see the "get_password_type" function call is used for two purposes. One is to
check the actual password encryption type during the authentication process, and
the other is to verify whether the password passed in is encrypted or not -
used in the CREATE/ALTER ROLE commands before checking the strength of the
password (via the passwordcheck module). So, I thought my patch would make this
purpose clear.  Nevertheless, if people think this actually reduces
readability, then I don't mind going with the flow.


Thanks & Regards,
Vaishnavi
Fujitsu Australia.


Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-05-10 Thread Michael Paquier
On Thu, May 11, 2017 at 9:15 AM, Bruce Momjian  wrote:
> On Wed, May 10, 2017 at 01:09:36PM -0700, Joe Conway wrote:
>> On 05/10/2017 12:22 PM, Tom Lane wrote:
>> > Robert Haas  writes:
>> >> On Wed, May 10, 2017 at 1:13 PM, Tom Lane  wrote:
>> >>> In terms of the alternatives I listed previously, it seems like
>> >>> nobody liked alternatives #3, #4, or #5, leaving us with #1 (do
>> >>> nothing) or #2 (apply this patch).  By my count, Peter is the
>> >>> only one in favor of doing nothing, and is outvoted.  I'll push
>> >>> the patch later today if I don't hear additional comments.
>> >
>> >> For the record, I also voted for doing nothing.
>> >
>> > Hm, well, anybody else want to vote?
>>
>> +1 for #2
>
> Same, +1 for #2 (apply this patch)

#1, do nothing.
-- 
Michael




[HACKERS] Moving relation extension locks out of heavyweight lock manager

2017-05-10 Thread Masahiko Sawada
Hi all,

Currently, the relation extension lock is implemented using the
heavyweight lock manager, and almost all functions (except for
brin_page_cleanup) that call LockRelationForExtension use it with
ExclusiveLock mode. But it doesn't actually need multiple lock modes,
deadlock detection, or any of the other functionality that the
heavyweight lock manager provides. I think it's enough to use
something like an LWLock. So I'd like to propose changing relation
extension lock management so that it works using LWLocks instead.

The attached draft patch makes relation extension locks use LWLocks rather
than the heavyweight lock manager, with a shared hash table storing
information about the relation extension locks. The basic idea is that we
add a hash table in shared memory for relation extension locks, where each
hash entry is an LWLock struct. Whenever a process wants to acquire a
relation extension lock, it looks up the appropriate LWLock entry in the
hash table and acquires it. The process can remove a hash entry when
unlocking it if nobody else is holding or waiting on it.
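The scheme described above can be sketched abstractly (Python threading primitives stand in for shared memory and LWLocks; this is a conceptual model, not the patch itself):

```python
import threading

class ExtensionLockTable:
    """Toy model of a shared hash table of per-relation extension locks:
    one lock per relation OID, with the entry removed on last release."""
    def __init__(self):
        self._guard = threading.Lock()   # protects the hash table itself
        self._entries = {}               # relid -> [lock, pin count]

    def acquire(self, relid):
        with self._guard:
            entry = self._entries.setdefault(relid, [threading.Lock(), 0])
            entry[1] += 1                # pin: holders + waiters keep entry alive
        entry[0].acquire()               # block outside _guard until we hold it

    def release(self, relid):
        with self._guard:
            entry = self._entries[relid]
            entry[0].release()
            entry[1] -= 1
            if entry[1] == 0:            # nobody holds or waits: drop the entry
                del self._entries[relid]
```

The pin count plays the role of "nobody is holding or waiting on it": an entry can only disappear once every acquirer has released, so a waiter never sees its lock vanish underneath it.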

This work would be helpful not only for existing workloads but also for
future work like the parallel utility commands discussed on other
threads [1]. At least for parallel vacuum, this feature helps to solve
an issue that the implementation of parallel vacuum has.

I ran pgbench for 10 minutes, three times (scale factor 5000); here are
the performance measurement results.

clients   TPS(HEAD)    TPS(Patched)
      4    2092.612        2031.277
      8    3153.732        3046.789
     16    4562.072        4625.419
     32    6439.391        6479.526
     64    7767.364        7779.636
    100    7917.173        7906.567

* 16 core Xeon E5620 2.4GHz
* 32 GB RAM
* ioDrive

With the current implementation, there seems to be no performance degradation so far.
Please give me feedback.

[1]
* Block level parallel vacuum WIP
   

* CREATE TABLE with parallel workers, 10.0?
  

* utility commands benefiting from parallel plan
  


Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Moving_extension_lock_out_of_heavyweight_lock_v1.patch
Description: Binary data



Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-05-10 Thread Bruce Momjian
On Wed, May 10, 2017 at 01:09:36PM -0700, Joe Conway wrote:
> On 05/10/2017 12:22 PM, Tom Lane wrote:
> > Robert Haas  writes:
> >> On Wed, May 10, 2017 at 1:13 PM, Tom Lane  wrote:
> >>> In terms of the alternatives I listed previously, it seems like
> >>> nobody liked alternatives #3, #4, or #5, leaving us with #1 (do
> >>> nothing) or #2 (apply this patch).  By my count, Peter is the
> >>> only one in favor of doing nothing, and is outvoted.  I'll push
> >>> the patch later today if I don't hear additional comments.
> > 
> >> For the record, I also voted for doing nothing.
> > 
> > Hm, well, anybody else want to vote?
> 
> +1 for #2

Same, +1 for #2 (apply this patch)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Fwd: Re: [HACKERS] MSVC odd TAP test problem

2017-05-10 Thread Michael Paquier
On Thu, May 11, 2017 at 7:29 AM, Andrew Dunstan
 wrote:
> This isn't going to work. If you look at the code in IPC/Run.pm you see
> that the coup_d_grace signal is only used after it has first sent the
> hardcoded SIGTERM. It might be tempting to play with using Sysinternals'
> pskill utility, but we can hardly expect buildfarm owners and others to
> hack their copies of IPC/Run.pm, so I'm going to go ahead and commit the
> changes I proposed.

OK, thanks for checking. That was worth a try.
-- 
Michael




Re: Fwd: Re: [HACKERS] MSVC odd TAP test problem

2017-05-10 Thread Andrew Dunstan


On 05/10/2017 01:53 AM, Andrew Dunstan wrote:
>
>> Does it make a different if you use for example coup_d_grace =>
>> "QUIT"? Per the docs of IPC::Run SIGTERM is used for kills on Windows.
>
> No idea. I'll try.
>
>
>


This isn't going to work. If you look at the code in IPC/Run.pm you see
that the coup_d_grace signal is only used after it has first sent the
hardcoded SIGTERM. It might be tempting to play with using Sysinternals'
pskill utility, but we can hardly expect buildfarm owners and others to
hack their copies of IPC/Run.pm, so I'm going to go ahead and commit the
changes I proposed.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-10 Thread Bossart, Nathan
Great, I’ll keep working on this patch and will update this thread with a more 
complete implementation.

Nathan




Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2017-05-10 Thread Alvaro Herrera
FWIW I ended up reverting the whole thing, even from master.  A more
complete solution would have to be researched.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-10 Thread Tom Lane
"Bossart, Nathan"  writes:
> Currently, VACUUM commands allow you to specify one table or all of the 
> tables in the current database to vacuum.  I’ve recently found myself wishing 
> I could specify multiple tables in a single VACUUM statement.  For example, 
> this would be convenient when there are several large tables in a database 
> and only a few need cleanup for XID purposes.  Is this a feature that the 
> community might be interested in?

I'm a bit surprised to realize that we don't allow that, since the
underlying code certainly can do it.

You realize of course that ANALYZE should grow this capability as well.

> I’ve attached my first attempt at introducing this functionality.  In the 
> patch, I’ve extended the table_name parameter in the VACUUM grammar to a 
> qualified_name_list.  While this fits into the grammar decently well, I 
> suspect that it may be desirable to be able to specify a column list for each 
> table as well (e.g. VACUUM foo (a), bar (b)).

The column list only matters for ANALYZE (or VACUUM ANALYZE).  But yes,
it should be per-table.
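For illustration (an example added here, not part of the original message), the multi-table form with per-table column lists under discussion would look something like this; this is the proposed syntax, not something implemented at the time of this thread:

```sql
-- Vacuum and analyze several tables in one statement, with a
-- per-table column list for the ANALYZE step (proposed syntax):
VACUUM ANALYZE pgbench_accounts (abalance), pgbench_tellers (tbalance);

-- Plain multi-table vacuum, no column lists:
VACUUM pgbench_branches, pgbench_history;
```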

regards, tom lane




Re: [HACKERS] Bug in pg_dump --table and --exclude-table for declarative partition table handling.

2017-05-10 Thread Tom Lane
Robert Haas  writes:
> On Tue, May 9, 2017 at 4:21 AM, Jeevan Ladhe
>  wrote:
>> Also, I can see similar issue exists with inheritance.

> That's a pretty insightful comment which makes me think that this
> isn't properly categorized as a bug.  Table partitioning is based on
> inheritance and is intended to behave like inheritance except when
> we've decided to make it behave otherwise.  We made no such decision
> in this case, so it behaves like inheritance in this case.  So, this
> is only a bug if the inheritance behavior is also a bug.

> And I think there's a pretty good argument that it isn't.

I agree.  There is room for a feature that would make --table or
--exclude-table on an inheritance parent apply to its children,
but that's a missing feature not a bug.  Also, if we did make that
happen automatically, for either inheritance or partitioning,
there would inevitably be people who need to turn it off.  ISTM that
the point of partitioning is mainly to be able to split up maintenance
work for a table that's too large to handle as an indivisible unit,
and it's hard to see why that wouldn't apply to dump/restore as much
as it does, say, vacuum.

So I think this can be classified as a feature to add later.

We should make sure that pg_dump behaves sanely when dumping just
some elements of a partition tree, of course.  And for that matter,
pg_restore ought to be able to successfully restore just some elements
out of an archive containing more.

regards, tom lane




Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-10 Thread David Fetter
On Wed, May 10, 2017 at 08:10:48PM +, Bossart, Nathan wrote:
> Hi Hackers,
> 
> Currently, VACUUM commands allow you to specify one table or all of
> the tables in the current database to vacuum.  I’ve recently found
> myself wishing I could specify multiple tables in a single VACUUM
> statement.  For example, this would be convenient when there are
> several large tables in a database and only a few need cleanup for
> XID purposes.  Is this a feature that the community might be
> interested in?

I haven't read the implementation yet, but I've wanted this feature
several times.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-05-10 Thread Petr Jelinek
On 10/05/17 21:22, Tom Lane wrote:
> Robert Haas  writes:
>> On Wed, May 10, 2017 at 1:13 PM, Tom Lane  wrote:
>>> In terms of the alternatives I listed previously, it seems like
>>> nobody liked alternatives #3, #4, or #5, leaving us with #1 (do
>>> nothing) or #2 (apply this patch).  By my count, Peter is the
>>> only one in favor of doing nothing, and is outvoted.  I'll push
>>> the patch later today if I don't hear additional comments.
> 
>> For the record, I also voted for doing nothing.
> 
> Hm, well, anybody else want to vote?
> 

I am for standardizing (although I don't have preference of location vs
lsn).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Function to move the position of a replication slot

2017-05-10 Thread Petr Jelinek
On 10/05/17 22:17, Dmitry Dolgov wrote:
>> On 4 May 2017 at 20:05, Magnus Hagander wrote:
>>
>> PFA a patch that adds a new function, pg_move_replication_slot, that
> makes it
>> possible to move the location of a replication slot without actually
>> consuming all the WAL on it.
>  
> Just a few questions out of curiosity:
> 
> * does it make sense to create a few tests for this function in
>   `contrib/test_decoding` (as shown in attachment)?
> 

As Craig said this will not work correctly for logical slots so it
should throw error on those at the moment (at least until we write
something that works, but that's far more complex than this patch).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Bug in pg_dump --table and --exclude-table for declarative partition table handling.

2017-05-10 Thread Robert Haas
On Tue, May 9, 2017 at 4:21 AM, Jeevan Ladhe
 wrote:
>> Current pg_dump --exclude-table option excludes partitioned relation
>> and dumps all its child relations and vice versa for --table option, which
>> I think is incorrect.
>>
>> In this case we might need to explore all partitions and exclude or
>> include
>> from dump according to given pg_dump option, attaching WIP patch proposing
>> same fix.   Thoughts/Comments?
>
> +1.
>
> Also, I can see similar issue exists with inheritance.

That's a pretty insightful comment which makes me think that this
isn't properly categorized as a bug.  Table partitioning is based on
inheritance and is intended to behave like inheritance except when
we've decided to make it behave otherwise.  We made no such decision
in this case, so it behaves like inheritance in this case.  So, this
is only a bug if the inheritance behavior is also a bug.

And I think there's a pretty good argument that it isn't.  Are we
saying we think that it was always intended that dumping an
inheritance parent would always dump its inheritance children as well,
and the code accidentally failed to achieve that?  Are we saying we'd
back-patch a behavior change in this area to correct the wrong
behavior we've had since time immemorial?  I can't speak for anyone
else, but I think the first is probably false and I would vote against
the second.

That's not to say that this isn't a possibly-useful behavior change.
I can easily imagine that users would find it convenient to be able to
dump a whole partitioning hierarchy by just selecting the parent table
rather than having to list all of the partitions.  But that's also
true of inheritance.  Is partitioning different enough from
inheritance that the two should have different behaviors, perhaps
because the partitioning parent can't contain data but the inheritance
parent could?  Or should we change the behavior for inheritance as
well, on the theory that the proposed new behavior is just plain
better than the current behavior and everyone will want it?  Or should
we do nothing at all, so as to avoid breaking things for the user who
says they want to dump JUST the parent and really means it?  Even if
the parent couldn't contain any rows, it's still got a schema that can
be dumped.  The option of changing partitioning in one patch and then
having a separate patch later that makes a similar change for
inheritance seems like the worst option, since then users might get
any of three behaviors depending on which release they have.  If we
want to change this, let's change it all at once.  But first we need
to get clarity on exactly what we're changing and for what reason.

There is a question of timing.  If we accept that this is not a bug
but a proposed behavior change, then it's not clear that it really
qualifies as an open item for v10.  I understand that there's an urge
to keep tinkering and making things better, but it's far from clear to
me that anything bad will happen if we just postpone changing anything
until v11, especially if we decide to change both partitioning and
inheritance.  I am somewhat inclined to classify this proposed open
item as a non-bug (i.e. a feature) but I'm also curious to hear what
others think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-10 Thread Sven R. Kunze

On 10.05.2017 17:59, Robert Haas wrote:

> Well, I don't think it would be a HUGE problem, but I think the fact
> that Amit chose to implement this with syntax similar to that of
> Oracle is probably not a coincidence, but rather a goal, and I think
> the readability problem that you're worrying about is really pretty
> minor.  I think most people aren't going to subpartition their default
> partition, and I think those who do will probably find the syntax
> clear enough anyway.

I agree here.

> Optional keywords may not be the root of ALL evil, but they're pretty
> evil.  See my posting earlier on this same thread on this topic:
>
> http://postgr.es/m/CA+TgmoZGHgd3vKZvyQ1Qx3e0L3n=voxy57mz9ttncvet-al...@mail.gmail.com
>
> The issues here are more or less the same.


Ah, I see. I didn't draw the conclusion from the optionality of a 
keyword the other day but after re-reading your post, it's exactly the 
same issue.

Let's avoid optional keywords!

Sven


Re: [HACKERS] Transaction held open by autoanalyze can be a bottleneck

2017-05-10 Thread Andres Freund
Hi,

On 2017-05-10 13:09:38 -0700, Jeff Janes wrote:
> Autovacuum's analyze starts a transaction when it starts on a table, and
> holds it for the duration. This holds back the xmin horizon.

Yea, I also complained about this:
http://archives.postgresql.org/message-id/20151031145303.GC6064%40alap3.anarazel.de


> Does analyze need all of its work done under the same transaction?

As pointed out in the above email, it's not trivial to change, but imo doable.

Greetings,

Andres Freund




Re: [HACKERS] Function to move the position of a replication slot

2017-05-10 Thread Dmitry Dolgov
> On 4 May 2017 at 20:05, Magnus Hagander  wrote:
>
> PFA a patch that adds a new function, pg_move_replication_slot, that
makes it
> possible to move the location of a replication slot without actually
> consuming all the WAL on it.

Just a few questions out of curiosity:

* does it make sense to create a few tests for this function in
  `contrib/test_decoding` (as shown in attachment)?

* what should happen if the second argument is `NULL`? There is a
verification
  `XLogRecPtrIsInvalid(moveto)`, but it's possible to pass `NULL`, and looks
  like it leads to result different from boolean:

```
SELECT pg_move_replication_slot('regression_slot_5', NULL);
 pg_move_replication_slot
--

(1 row)
```
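One way the patch could handle the NULL case explicitly is sketched below (illustration only, not from the submitted patch; the simpler alternative is to declare the function STRICT in pg_proc, in which case a NULL argument just yields a NULL result):

```c
/* Sketch only: explicit NULL handling for a non-STRICT V1 function.
 * PG_ARGISNULL() is the standard fmgr test; without such a check,
 * calling PG_GETARG_* on a NULL argument has undefined results.
 */
Datum
pg_move_replication_slot(PG_FUNCTION_ARGS)
{
    if (PG_ARGISNULL(0) || PG_ARGISNULL(1))
        ereport(ERROR,
                (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
                 errmsg("slot name and target position must not be null")));

    /* ... proceed as in the patch above ... */
}
```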
diff --git a/contrib/test_decoding/expected/slot.out b/contrib/test_decoding/expected/slot.out
index 9f5f8a9..7c69110 100644
--- a/contrib/test_decoding/expected/slot.out
+++ b/contrib/test_decoding/expected/slot.out
@@ -101,3 +101,39 @@ SELECT pg_drop_replication_slot('regression_slot1');
 ERROR:  replication slot "regression_slot1" does not exist
 SELECT pg_drop_replication_slot('regression_slot2');
 ERROR:  replication slot "regression_slot2" does not exist
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_3', 'test_decoding');
+ ?column? 
+--
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_4', 'test_decoding', true);
+ ?column? 
+--
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_5', 'test_decoding');
+ ?column? 
+--
+ init
+(1 row)
+
+SELECT pg_move_replication_slot('regression_slot_3', '0/aabbccdd');
+ pg_move_replication_slot 
+--
+ t
+(1 row)
+
+SELECT pg_move_replication_slot('regression_slot_4', '0/aabbccdd');
+ pg_move_replication_slot 
+--
+ t
+(1 row)
+
+SELECT pg_move_replication_slot('regression_slot_5', NULL);
+ pg_move_replication_slot 
+--
+ 
+(1 row)
+
diff --git a/contrib/test_decoding/sql/slot.sql b/contrib/test_decoding/sql/slot.sql
index fa9561f..207f845 100644
--- a/contrib/test_decoding/sql/slot.sql
+++ b/contrib/test_decoding/sql/slot.sql
@@ -53,3 +53,12 @@ SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot1', 'test_
 -- both should error as they should be dropped on error
 SELECT pg_drop_replication_slot('regression_slot1');
 SELECT pg_drop_replication_slot('regression_slot2');
+
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_3', 'test_decoding');
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_4', 'test_decoding', true);
+
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_5', 'test_decoding');
+
+SELECT pg_move_replication_slot('regression_slot_3', '0/aabbccdd');
+SELECT pg_move_replication_slot('regression_slot_4', '0/aabbccdd');
+SELECT pg_move_replication_slot('regression_slot_5', NULL);
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6298657..d05974e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18986,6 +18986,24 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
 be called when connected to the same database the slot was created on.

   
+  
+   
+
+ pg_move_replication_slot
+
+pg_move_replication_slot(slot_name name, position pg_lsn)
+   
+   
+bool
+   
+   
+Moves the current restart position of a physical or logical
+replication slot named slot_name.
+The slot will not be moved backwards, and it will not be
+moved beyond the current insert location. Returns true if
+the position was changed.
+   
+  
 
   

diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 6ee1e68..a9faa10 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -176,6 +176,66 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
 }
 
 /*
+ * SQL function for moving the position in a replication slot.
+ */
+Datum
+pg_move_replication_slot(PG_FUNCTION_ARGS)
+{
+	Name		slotname = PG_GETARG_NAME(0);
+	XLogRecPtr	moveto = PG_GETARG_LSN(1);
+	char	   *slotnamestr;
+	bool		changed = false;
+	bool 		backwards = false;
+
+	Assert(!MyReplicationSlot);
+
+	check_permissions();
+
+	if (XLogRecPtrIsInvalid(moveto))
+		ereport(ERROR,
+(errmsg("Invalid target xlog position ")));
+
+	/* Temporarily acquire the slot so we "own" it */
+	ReplicationSlotAcquire(NameStr(*slotname));
+
+	if (moveto > GetXLogWriteRecPtr())
+		/* Can't move past current position, so truncate there */
+		moveto = GetXLogWriteRecPtr();
+
+	/* Now adjust it */
+	SpinLockAcquire(&MyReplicationSlot->mutex);
+	if (MyReplicationSlot->data.restart_lsn != moveto)
+	{
+		/* Never move backwards, because bad things can happen */
+		if 

[HACKERS] Transaction held open by autoanalyze can be a bottleneck

2017-05-10 Thread Jeff Janes
Autovacuum's analyze starts a transaction when it starts on a table, and
holds it for the duration. This holds back the xmin horizon.

On a TPC-B-like benchmark, this can be a problem.  While it is
autoanalyzing pgbench_accounts and pgbench_history, dead-but-for-analyze
tuples accumulate rapidly in pgbench_tellers and pgbench_branches.  Now the
UPDATES to those tables have to walk the unprunable HOT chains to find
their tuples to update, greatly slowing them down.

The analyze actually takes quite a while, because it is frequently setting
hint bits and so dirtying pages and so sleeping
for autovacuum_vacuum_cost_delay.

If I set autovacuum_vacuum_cost_delay=0, then tps averaged over an hour
goes from 12,307.6 to 24,955.2.  I can get a similar gain just by changing
the relopts for those two tables to autovacuum_analyze_threshold =
20.  I don't think these are particularly attractive solutions, but
they do demonstrate the nature of the problem.

Does analyze need all of its work done under the same transaction?  Is
there an elegant way to make it periodically discard the transaction and
get a new one, so that the xmin horizon can advance? I think doing so every
time vacuum_delay_point decides to sleep would be a good time to do that,
but that would expand its contract quite a bit. And it is probably possible
to have analyze take a long time without ever deciding to sleep, so doing
it there would not be a fully general solution.
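One way to observe the held-back horizon (an illustration added here, not from the original message) is to watch backend_xmin for the autovacuum worker while the benchmark runs, and the per-table workaround mentioned above is an ordinary relopt change:

```sql
-- Show which backends are holding back the xmin horizon, oldest first.
SELECT pid, backend_xmin, age(backend_xmin) AS xmin_age, query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;

-- Per-table workaround: effectively disable autoanalyze on the small,
-- hot tables (the threshold value here is illustrative):
ALTER TABLE pgbench_tellers
  SET (autovacuum_analyze_threshold = 2000000000);
```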

Cheers,

Jeff


Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-05-10 Thread Joe Conway
On 05/10/2017 12:22 PM, Tom Lane wrote:
> Robert Haas  writes:
>> On Wed, May 10, 2017 at 1:13 PM, Tom Lane  wrote:
>>> In terms of the alternatives I listed previously, it seems like
>>> nobody liked alternatives #3, #4, or #5, leaving us with #1 (do
>>> nothing) or #2 (apply this patch).  By my count, Peter is the
>>> only one in favor of doing nothing, and is outvoted.  I'll push
>>> the patch later today if I don't hear additional comments.
> 
>> For the record, I also voted for doing nothing.
> 
> Hm, well, anybody else want to vote?

+1 for #2

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries

2017-05-10 Thread Pavel Stehule
Hi

2017-05-10 18:40 GMT+02:00 Remi Colinet :

> Hello,
>
> This is version 2 of the new command name PROGRESS which I wrote in order
> to monitor long running SQL queries in a Postgres backend process.
>

This patch introduces a lot of reserved keywords. Why? It can break a lot of
applications. Couldn't you enhance the EXPLAIN command instead?

Regards

Pavel


Re: [HACKERS] Issues with replication slots(which created manually) against logical replication

2017-05-10 Thread Robert Haas
On Tue, May 9, 2017 at 5:59 AM, tushar  wrote:
> I think -we should throw an error while creating subscription.

I don't really think it's worth complicating the code to check for
this.  If you work hard enough, you can create up a configuration that
doesn't work.  So don't do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries

2017-05-10 Thread David Fetter
On Wed, May 10, 2017 at 06:40:31PM +0200, Remi Colinet wrote:
> Hello,
> 
> This is version 2 of the new command name PROGRESS which I wrote in
> order to monitor long running SQL queries in a Postgres backend
> process.

Perhaps I missed something important in the discussion, but was there
a good reason that this isn't a function callable from SQL, i.e. not
restricted to the psql client?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-05-10 Thread Tom Lane
Robert Haas  writes:
> On Wed, May 10, 2017 at 1:13 PM, Tom Lane  wrote:
>> In terms of the alternatives I listed previously, it seems like
>> nobody liked alternatives #3, #4, or #5, leaving us with #1 (do
>> nothing) or #2 (apply this patch).  By my count, Peter is the
>> only one in favor of doing nothing, and is outvoted.  I'll push
>> the patch later today if I don't hear additional comments.

> For the record, I also voted for doing nothing.

Hm, well, anybody else want to vote?

regards, tom lane




Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-10 Thread Robert Haas
On Wed, May 10, 2017 at 12:12 PM, Alvaro Herrera
 wrote:
> I'm surprised that there is so much activity in this thread.  Is this
> patch being considered for pg10?

Of course not.  Feature freeze was a month ago.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-05-10 Thread Robert Haas
On Wed, May 10, 2017 at 1:13 PM, Tom Lane  wrote:
> David Steele  writes:
>> If I read this correctly, we won't change the names of any functions
>> that we haven't *already* changed the names of, and only one view would
>> be changed to bring it into line with the rest.
>
> I have now looked through the patch more carefully, and noted some changes
> I forgot to account for in my previous summary: it also renames some
> function arguments and output columns, which previously were variously
> "location", "wal_position", etc.  I'd missed that for functions that don't
> have a formal view in front of them.  This affects
>
> pg_control_checkpoint
> pg_control_recovery
> pg_create_logical_replication_slot
> pg_create_physical_replication_slot
> pg_logical_slot_get_binary_changes
> pg_logical_slot_get_changes
> pg_logical_slot_peek_binary_changes
> pg_logical_slot_peek_changes
>
> So that's an additional source of possible compatibility breaks.
> It doesn't seem like enough to change anybody's vote on the issue,
> but I mention it for completeness.
>
> In terms of the alternatives I listed previously, it seems like
> nobody liked alternatives #3, #4, or #5, leaving us with #1 (do
> nothing) or #2 (apply this patch).  By my count, Peter is the
> only one in favor of doing nothing, and is outvoted.  I'll push
> the patch later today if I don't hear additional comments.

For the record, I also voted for doing nothing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [POC] hash partitioning

2017-05-10 Thread Robert Haas
On Wed, May 3, 2017 at 9:09 AM, amul sul  wrote:
> Fixed in the attached version.

+[ PARTITION BY { HASH | RANGE | LIST } ( { column_name | ( expression ) } [ COLLATE ... ] ) ]
+... FOR VALUES WITH ( MODULUS value, REMAINDER value ) }

Maybe value -> modulus and value -> remainder?

  
+  When creating a hash partition, MODULUS should be
+  greater than zero and REMAINDER should be greater than
+  or equal to zero.  Every MODULUS must be a factor of
+  the next larger modulus.
[ ... and it goes on from there ... ]

This paragraph is fairly terrible, because it's a design spec that I
wrote, not an explanation intended for users.  Here's an attempt to
improve it:

===
When creating a hash partition, a modulus and remainder must be
specified.  The modulus must be a positive integer, and the remainder
must be a non-negative integer less than the modulus.  Typically, when
initially setting up a hash-partitioned table, you should choose a
modulus equal to the number of partitions and assign every partition the
same modulus and a different remainder (see examples, below).
However, it is not required that every partition have the same
modulus, only that every modulus which occurs among the children of a
hash-partitioned table is a factor of the next larger modulus.  This
allows the number of partitions to be increased incrementally without
needing to move all the data at once.  For example, suppose you have a
hash-partitioned table with 8 children, each of which has modulus 8,
but find it necessary to increase the number of partitions to 16.  You
can detach one of the modulus-8 partitions, create two new modulus-16
partitions covering the same portion of the key space (one with a
remainder equal to the remainder of the detached partition, and the
other with a remainder equal to that value plus 8), and repopulate
them with data.  You can then repeat this -- perhaps at a later time
-- for each modulus-8 partition until none remain.  While this may
still involve a large amount of data movement at each step, it is
still better than having to create a whole new table and move all the
data at once.
===
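The detach-and-split procedure above can be sketched numerically (an illustration added here, not from the original mail; `route` is a hypothetical stand-in for the routing rule "hash value mod modulus = remainder"):

```python
def route(hashval, partitions):
    """Return the (modulus, remainder) pair whose partition receives hashval."""
    for modulus, remainder in partitions:
        if hashval % modulus == remainder:
            return (modulus, remainder)
    return None  # no partition accepts this hash value

# Initial layout: 8 partitions, all with modulus 8.
old_layout = [(8, r) for r in range(8)]

# Detach (8, 0) and replace it with two modulus-16 partitions covering
# the same slice of the key space: remainders 0 and 0 + 8.
new_layout = [(8, r) for r in range(1, 8)] + [(16, 0), (16, 8)]

for h in range(10_000):
    assert route(h, old_layout) is not None   # old layout covers everything
    assert route(h, new_layout) is not None   # so does the new one
    if h % 8 == 0:
        # rows formerly in (8, 0) now land in one of the two new partitions
        assert route(h, new_layout) in [(16, 0), (16, 8)]
    else:
        # all other rows stay where they were
        assert route(h, new_layout) == route(h, old_layout)
```

The loop verifies the key property: the split touches only the rows of the detached partition, while coverage of the key space stays complete.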

+CREATE TABLE postal_code (
+code int not null,
+city_id  bigint not null,
+address  text
+) PARTITION BY HASH (code);

It would be fairly silly to hash-partition the postal_code table,
because there aren't enough postal codes to justify it.  Maybe make
this a lineitem or order table, and partition on the order number.
Also, extend the example to show creating 4 partitions with modulus 4.

+if (spec->strategy != PARTITION_STRATEGY_HASH)
+elog(ERROR, "invalid strategy in partition bound spec");

I think this should be an ereport() if it can happen or an Assert() if
it's supposed to be prevented by the grammar.

+if (!(datumIsEqual(b1->datums[i][0], b2->datums[i][0],
+   true, sizeof(int)) &&

It doesn't seem necessary to use datumIsEqual() here.  You know the
datums are pass-by-value, so why not just use == ?  I'd include a
comment but I don't think using datumIsEqual() adds anything here
except unnecessary complexity.  More broadly, I wonder why we're
cramming this into the datums arrays instead of just adding another
field to PartitionBoundInfoData that is only used by hash
partitioning.

/*
+ * Check rule that every modulus must be a factor of the
+ * next larger modulus.  For example, if you have a bunch
+ * of partitions that all have modulus 5, you can add a new
+ * partition with modulus 10 or a new partition with
+ * modulus 15, but you cannot add both a partition with
+ * modulus 10 and a partition with modulus 15, because 10
+ * is not a factor of 15.  However, you could
simultaneously
+ * use modulus 4, modulus 8, modulus 16, and modulus 32 if
+ * you wished, because each modulus is a factor of the next
+ * larger one.  You could also use modulus 10, modulus 20,
+ * and modulus 60. But you could not use modulus 10,
+ * modulus 15, and modulus 60 for the same reason.
+ */

I think just the first sentence is fine here; I'd nuke the rest of this.

The block that follows could be merged into the surrounding block.
There's no need to increase the indentation level here, so let's not.
I also suspect that the code itself is wrong.  There are two ways a
modulus can be invalid: it can either fail to be a multiple of the
next lower-modulus, or it can fail to be a factor of the next-higher
modulus.  I think your code only checks the latter.  So for example,
if the current modulus list is (4, 36), your code would correctly
disallow 3 because it's not a factor of 4 and would correctly disallow
23 because it's not a factor of 36, but it looks to me 
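The complete rule Robert describes — a candidate must be a multiple of the next lower modulus and a factor of the next higher one — reduces to pairwise divisibility over the sorted modulus list. A small sketch (illustration only; the function name is hypothetical):

```python
def modulus_allowed(existing, candidate):
    """True if adding `candidate` keeps the invariant that every modulus
    in the set is a factor of the next larger one."""
    moduli = sorted(set(existing) | {candidate})
    return all(larger % smaller == 0
               for smaller, larger in zip(moduli, moduli[1:]))

existing = [4, 36]
assert modulus_allowed(existing, 12)       # 4 | 12 and 12 | 36
assert not modulus_allowed(existing, 3)    # 3 is not a factor of 4
assert not modulus_allowed(existing, 23)   # 23 is not a factor of 36
# The case a factor-of-next-higher-only check would miss:
assert not modulus_allowed(existing, 9)    # 9 | 36, but 9 is not a multiple of 4
```

The last assertion is the failure mode at issue: 9 passes the "factor of the next higher modulus" test against 36 but must still be rejected because it is not a multiple of 4.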

Re: [HACKERS] [BUGS] Concurrent ALTER SEQUENCE RESTART Regression

2017-05-10 Thread Tom Lane
Andres Freund  writes:
> On 2017-05-10 10:29:02 -0400, Tom Lane wrote:
>> As long as it doesn't block, the change in lock strength doesn't actually
>> make any speed difference does it?

> The issue isn't the strength, but that we currently have this weird
> hackery around open_share_lock():

Oh!  I'd forgotten about that.  Yes, if we change that then we'd
need to do some performance checking.

regards, tom lane




Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-05-10 Thread Tom Lane
David Steele  writes:
> If I read this correctly, we won't change the names of any functions 
> that we haven't *already* changed the names of, and only one view would 
> be changed to bring it into line with the rest.

I have now looked through the patch more carefully, and noted some changes
I forgot to account for in my previous summary: it also renames some
function arguments and output columns, which previously were variously
"location", "wal_position", etc.  I'd missed that for functions that don't
have a formal view in front of them.  This affects

pg_control_checkpoint
pg_control_recovery
pg_create_logical_replication_slot
pg_create_physical_replication_slot
pg_logical_slot_get_binary_changes
pg_logical_slot_get_changes
pg_logical_slot_peek_binary_changes
pg_logical_slot_peek_changes

So that's an additional source of possible compatibility breaks.
It doesn't seem like enough to change anybody's vote on the issue,
but I mention it for completeness.

In terms of the alternatives I listed previously, it seems like
nobody liked alternatives #3, #4, or #5, leaving us with #1 (do
nothing) or #2 (apply this patch).  By my count, Peter is the
only one in favor of doing nothing, and is outvoted.  I'll push
the patch later today if I don't hear additional comments.

regards, tom lane




Re: [HACKERS] [POC] hash partitioning

2017-05-10 Thread Robert Haas
On Wed, May 10, 2017 at 8:34 AM, Ashutosh Bapat
 wrote:
> Hash partitioning will partition the data based on the hash value of the
> partition key. Does that require collation? Should we throw an error/warning
> if collation is specified in PARTITION BY clause?

Collation is only relevant for ordering, not equality.  Since hash
opclasses provide only equality, not ordering, it's not relevant here.
I'm not sure whether we should error out if it's specified or just
silently ignore it.  Maybe an ERROR is a good idea?  But not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [POC] hash partitioning

2017-05-10 Thread Robert Haas
On Thu, May 4, 2017 at 1:44 AM, Jeff Davis  wrote:
>> Hmm, that could be a problem in some cases.  I think there's probably
>> much less of a problem if the modulus isn't a power of two?
>
> That's true, but it's awkward to describe that to users. And I think
> most people would be inclined to use a power-of-two number of
> partitions, perhaps coming from other systems.

Yeah, true.

>>> To fix this, I think we need to include a salt in the hash API. Each
>>> level of hashing can choose a random salt.
>>
>> Do you mean that we'd salt partitioning hashing differently from
>> grouping hashing which would be salted different from aggregation
>> hashing which, I suppose, would be salted differently from hash index
>> hashing?
>
> Yes. The way I think about it is that choosing a new random salt is an
> easy way to get a new hash function.

OK.  One problem, though, is we don't quite have the opclass
infrastructure for this.  A hash opclass's support function is
expected to take one argument, a value of the data type at issue.  The
first idea that occurred to me was to allow an optional second
argument which would be a seed, but that seems like it would require
extensive changes to all of the datatype-specific hash functions and
some of them would probably emerge noticeably slower.  If a function
is just calling hash_uint32 right now then I don't see how we're going
to replace that with something more complex that folds in a salt
without causing performance to drop.  Even just the cost of unpacking
the extra argument might be noticeable.

Another alternative would be to be to add one additional, optional
hash opclass support function which takes a value of the type in
question as one argument and a seed as a second argument.  That seems
like it might work OK.  Existing code can use the existing support
function 1 with no change, and hash partitioning can use support
function 2.
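Such a second support function could look roughly like this sketch; the name and the shape of the API are invented here for illustration (the mixing steps are the 32-bit murmur3 finalizer applied to value XOR seed), not an existing PostgreSQL interface:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of an optional second hash support function taking a seed.
 * The name and constants are invented for illustration; the mixing
 * steps are the 32-bit murmur3 finalizer applied to (value XOR seed),
 * which is a bijection on 32-bit values for any fixed seed. */
static uint32_t
hash_uint32_seeded(uint32_t value, uint32_t seed)
{
	uint32_t	h = value ^ seed;

	h ^= h >> 16;
	h *= 0x85ebca6b;
	h ^= h >> 13;
	h *= 0xc2b2ae35;
	h ^= h >> 16;
	return h;
}
```

Hash partitioning would call this with its own salt, while hash joins, aggregation, and hash indexes keep using support function 1 (or a different salt), so the partition-level distribution and the executor-level bucket assignment stop being correlated.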

>> Or do you mean that you'd have to specify a salt when
>> creating a hash-partitioned table, and make sure it's the same across
>> all compatibly partitioned tables you might want to hash-join?  That
>> latter sounds unappealing.
>
> I don't see a reason to expose the salt to users. If we found a reason
> in the future, we could, but it would create all of the problems you
> are thinking about.

Right, OK.

>> You're basically describing what a hash opfamily already does, except
>> that we don't have a single opfamily that covers both varchar(10) and
>> char(10), nor do we have one that covers both int and numeric.  We
>> have one that covers int2, int4, and int8, though.  If somebody wanted
>> to make the ones you're suggesting, there's nothing preventing it,
>> although I'm not sure exactly how we'd encourage people to start using
>> the new one and deprecating the old one.  We don't seem to have a good
>> infrastructure for that.
>
> OK. I will propose new hash opfamilies for varchar/bpchar/text,
> int2/4/8/numeric, and timestamptz/date.

Cool!  I have no idea how we'll convert from the old ones to the new
ones without breaking things but I agree that it would be nicer if it
were like that rather than the way it is now.

> One approach is to promote the narrower type to the wider type, and
> then hash. The problem is that would substantially slow down the
> hashing of integers, so then we'd need to use one hash opfamily for
> partitioning and one for hashjoin, and it gets messy.

Yes, that sounds messy.

> The other approach is to check if the wider type is within the domain
> of the narrower type, and if so, *demote* the value and then hash. For
> instance, '4.2'::numeric would hash the same as it does today, but
> '4'::numeric would hash as an int2. I prefer this approach, and int8
> already does something resembling it.

Sounds reasonable.
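The demotion rule can be pictured as choosing, per value, which width's hash function to apply. A sketch with invented names follows; a real numeric would additionally need an "is the value integral?" test, as in the '4.2' vs '4' example above:

```c
#include <assert.h>
#include <stdint.h>

typedef enum
{
	HASH_AS_INT2,
	HASH_AS_INT4,
	HASH_AS_INT8
} hash_width;

/* Invented helper: pick the narrowest integer type whose domain
 * contains the value, so that e.g. 4::numeric and 4::int2 end up
 * hashed by the same function, while 4.2::numeric (not integral,
 * so never demoted) keeps its native numeric hashing. */
static hash_width
choose_hash_width(int64_t v)
{
	if (v >= INT16_MIN && v <= INT16_MAX)
		return HASH_AS_INT2;
	if (v >= INT32_MIN && v <= INT32_MAX)
		return HASH_AS_INT4;
	return HASH_AS_INT8;
}
```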

> It's a little early in the v11 cycle to be having this argument.
> Really what I'm saying is that a small effort now may save us a lot of
> headache later.

Well, that's fair enough.  My concern is basically that it may be the
other way around: a large effort to save a small headache later. I
agree that it's probably a good idea to figure out a way to salt the
hash function so that we don't end up with this and partitionwise join
interacting badly, but I don't see the other issues as being very
critical.  I don't have any evidence that there's a big need to
replace our hash functions with new ones, and over on the
partitionwise join thread we gave up on the idea of a cross-type
partitionwise join.  It wouldn't be particularly common (or sensible,
really) even if we ended up supporting it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [BUGS] Concurrent ALTER SEQUENCE RESTART Regression

2017-05-10 Thread Andres Freund
On 2017-05-10 10:29:02 -0400, Tom Lane wrote:
> Petr Jelinek  writes:
> > On 10/05/17 07:09, Peter Eisentraut wrote:
> >> I think the correct fix is to have nextval() and ALTER SEQUENCE use
> >> sensible lock levels so that they block each other.  Since
> >> nextval() currently uses AccessShareLock, the suggestion was for
> >> ALTER SEQUENCE to therefore use AccessExclusiveLock.  But I think a
> >> better idea would be for nextval() to use RowExclusiveLock
> >> (analogous to UPDATE) and ALTER SEQUENCE to use
> >> ShareRowExclusiveLock, which would also satisfy issue #1.
> 
> > When I proposed this upstream, Andres raised concern about performance
> > of nextval() if we do this, did you try to run any benchmark on this?
> 
> As long as it doesn't block, the change in lock strength doesn't actually
> make any speed difference does it?

The issue isn't the strength, but that we currently have this weird
hackery around open_share_lock():
/*
 * Open the sequence and acquire AccessShareLock if needed
 *
 * If we haven't touched the sequence already in this transaction,
 * we need to acquire AccessShareLock.  We arrange for the lock to
 * be owned by the top transaction, so that we don't need to do it
 * more than once per xact.
 */

This'd probably need to be removed, as we'd otherwise get very
weird semantics around aborted subxacts.  Upthread I theorized whether
that's actually still meaningful given fastpath locking and such, but I
guess we'll have to evaluate that.

Greetings,

Andres Freund




Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-10 Thread Alvaro Herrera
I'm surprised that there is so much activity in this thread.  Is this
patch being considered for pg10?

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Cached plans and statement generalization

2017-05-10 Thread Konstantin Knizhnik

On 02.05.2017 21:26, Robert Haas wrote:


I am sympathetic to the fact that this is a hard problem to solve.
I'm just telling you that the way you've got it is not acceptable and
nobody's going to commit it like that (or if they do, they will end up
having to revert it).  If you want to have a technical discussion
about what might be a way to change the patch to be more acceptable,
cool, but I don't want to get into a long debate about whether what
you have is acceptable or not; I've already said what I think about
that and I believe that opinion will be widely shared.  I am not
trying to beat you up here, just trying to be clear.




Based on Robert's feedback and Tom's proposal, I have implemented two
new versions of the autoprepare patch.


The first version is just a refactoring of my original implementation: I
have extracted the common code into prepare_cached_plan and
exec_prepared_plan functions to avoid code duplication. I also rewrote
the assignment of values to parameters. Types of parameters are now
inferred from the types of the literals, so there may be several
prepared plans that differ only in the types of their parameters.
Because of the problem with type coercion for parameters, I have to
catch errors in parse_analyze_varparams.


Robert, can you please explain why using TRY/CATCH is not safe here:

This is definitely not a safe way of using TRY/CATCH.


The second version is based on Tom's suggestion:

Personally I'd think about
replacing the entire literal-with-cast construct with a Param having
already-known type.
So here I first patch the raw parse tree, replacing A_Const with
ParamRef; this patched tree is what the cache lookup is performed on.
Then I restore the original raw parse tree, call pg_analyze_and_rewrite,
and mutate the analyzed tree, replacing Const with Param nodes.
At that point type coercion has already been performed, so I know the
precise types that should be used for the parameters.
It is the more sophisticated approach, and I cannot extract the common
code into prepare_cached_plan, because preparing a plan is currently a
mix of steps done in exec_simple_query and exec_parse_message.

But there is no need to catch analyze errors.

Finally, the performance of both approaches is the same: with pgbench it
is 180k TPS on read-only queries, compared with 80k TPS for unprepared
queries.
In both cases 7 out of 177 regression tests fail (mostly because the
session environment changes between subsequent query executions).


I am going to continue work on this patch and will be glad to receive
any feedback and suggestions for its improvement.
In most cases, applications do not access Postgres directly but go
through some connection pooling layer, and so they are not able to use
prepared statements.
Yet on a simple OLTP workload Postgres spends more time building the
query plan than on the execution itself, so it is possible to speed
Postgres up about two times on such a workload!
Another alternative is a true shared plan cache.  That may be an even
more promising approach, but it is definitely much more invasive and
harder to implement.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 6e52eb7..f2eb0f5 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -3696,6 +3696,454 @@ raw_expression_tree_walker(Node *node,
 }
 
 /*
+ * raw_expression_tree_mutator --- transform a raw parse tree.
+ *
+ * This function implements a slightly different approach to tree updates
+ * than expression_tree_mutator(): the callback is given a pointer to the
+ * pointer to the current node and can update that field instead of
+ * returning a reference to a new node.  This makes it possible to remember
+ * changes and easily revert them without an extra traversal of the tree.
+ *
+ * This function does not need the QTW_DONT_COPY_QUERY flag: it never
+ * implicitly copies tree nodes, updating them in place instead.
+ *
+ * Like raw_expression_tree_walker, there is no special rule about query
+ * boundaries: we descend to everything that's possibly interesting.
+ *
+ * Currently, the node type coverage here extends only to DML statements
+ * (SELECT/INSERT/UPDATE/DELETE) and nodes that can appear in them, because
+ * this is used mainly during analysis of CTEs, and only DML statements can
+ * appear in CTEs.  If some other node is visited, iteration stops
+ * immediately and true is returned.
+ */
+bool
+raw_expression_tree_mutator(Node *node,
+			bool (*mutator) (),
+			void *context)
+{
+	ListCell   *temp;
+
+	/*
+	 * The walker has already visited the current node, and so we need only
+	 * recurse into any sub-nodes it has.
+	 */
+	if (node == NULL)
+		return false;
+
+	/* Guard against stack overflow due to overly complex expressions */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		case T_SetToDefault:
+		case T_CurrentOfExpr:
+		case T_Integer:
+		case T_Float:
+		case T_String:
+		case T_BitString:
+		case T_Null:
+		case 

Re: [HACKERS] WIP Patch: Precalculate stable functions, infrastructure v1

2017-05-10 Thread Marina Polyakova

Hello, Aleksander!


I've noticed that this patch needs a review and decided to take a look.


Thank you very much!


There are trailing whitespaces,
  though.


Oh, sorry, I'll check them.


I see 8-10% performance improvement on full text search queries.


Glad to hear it =)


It seems that there is no obvious performance degradation on regular
  queries (according to pgbench).


Thanks for testing it, I'll try not to forget about it next time =[


In short, it looks very promising.


And thanks again!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-10 Thread Robert Haas
On Wed, May 10, 2017 at 10:59 AM, Sven R. Kunze  wrote:
> You are definitely right. Changing it here would require changing it
> everywhere AND thus losing syntax parity with Oracle.

Right.

> I am not in a position to judge properly whether this would be a huge
> problem. Personally, I don't have an issue with that. But don't count mine
> as the most important opinion on this.

Well, I don't think it would be a HUGE problem, but I think the fact
that Amit chose to implement this with syntax similar to that of
Oracle is probably not a coincidence, but rather a goal, and I think
the readability problem that you're worrying about is really pretty
minor.  I think most people aren't going to subpartition their default
partition, and I think those who do will probably find the syntax
clear enough anyway.   So I don't favor changing it.  Now, if there's
an outcry of support for your position then I'll stand aside but I
don't anticipate that.

>> So I guess I'm still in favor of the CREATE TABLE p1 PARTITION OF test
>> DEFAULT syntax, but if it ends up being AS DEFAULT instead, I can live
>> with that.
>
> Is making it optional an option?

Optional keywords may not be the root of ALL evil, but they're pretty
evil.  See my posting earlier on this same thread on this topic:

http://postgr.es/m/CA+TgmoZGHgd3vKZvyQ1Qx3e0L3n=voxy57mz9ttncvet-al...@mail.gmail.com

The issues here are more or less the same.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP Patch: Precalculate stable functions, infrastructure v1

2017-05-10 Thread Aleksander Alekseev
Hi Marina,

I've noticed that this patch needs a review and decided to take a look.
Here is a short summary:

* Patches apply to the master branch. There are trailing whitespaces,
  though.
* All tests pass.
* I see 8-10% performance improvement on full text search queries.
* It seems that there is no obvious performance degradation on regular
  queries (according to pgbench).

In short, it looks very promising. 

-- 
Best regards,
Aleksander Alekseev




Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-10 Thread Sven R. Kunze

On 09.05.2017 09:19, Rahila Syed wrote:
+1 for AS DEFAULT syntax if it helps in improving readability,
especially in the following case:


CREATE TABLE p1 PARTITION OF test AS DEFAULT PARTITION BY LIST(a);

Thank you,
Rahila Syed

On Tue, May 9, 2017 at 1:13 AM, Robert Haas wrote:


On Thu, May 4, 2017 at 4:40 PM, Sven R. Kunze wrote:
> It yields
>
> CREATE TABLE p1 PARTITION OF test DEFAULT PARTITION BY LIST(b);
>
> This reads to me like "DEFAULT PARTITION".
>
> I can imagine a lot of confusion when those queries are
encountered in the
> wild. I know this thread is about creating a default partition
but I were to
> propose a minor change in the following direction, I think
confusion would
> be greatly avoided:
>
> CREATE TABLE p1 PARTITION OF test AS DEFAULT PARTITIONED BY LIST(b);
>
> I know it's a bit longer but I think those 4 characters might serve
> readability in the long term. It was especially confusing to see
PARTITION
> in two positions serving two different functions.

Well, we certainly can't make that change just for default partitions.
I mean, that would be non-orthogonal, right?  You can't say that the
way to subpartition is to write "PARTITION BY strategy" when the table
unpartitioned or is a non-default partition but "PARTITIONED BY
strategy" when it is a default partition.  That would certainly not be
a good way of confusing users less, and would probably result in a
variety of special cases in places like ruleutils.c or pg_dump, plus
some weasel-wording in the documentation.  We COULD do a general
change from "CREATE TABLE table_name PARTITION BY strategy" to "CREATE
TABLE table_name PARTITIONED BY strategy".  I don't have any
particular arguments against that except that the current syntax is
more like Oracle, which might count for something, and maybe the fact
that we're a month after feature freeze.  Still, if we want to change
that, now would be the time; but I favor leaving it alone.



You are definitely right. Changing it here would require changing it
everywhere AND thus losing syntax parity with Oracle.


I am not in a position to judge properly whether this would be a huge
problem. Personally, I don't have an issue with that. But don't count
mine as the most important opinion on this.




So I guess I'm still in favor of the CREATE TABLE p1 PARTITION OF test
DEFAULT syntax, but if it ends up being AS DEFAULT instead, I can live
with that.



Is making it optional an option?

Sven


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-10 Thread Chapman Flack
On 05/10/2017 03:56 AM, Craig Ringer wrote:
> On 10 May 2017 10:44 am, "Chapman Flack"  wrote:
>> On 05/09/17 18:48, Mark Dilger wrote: 
>>> SET SESSION ON BEHALF OF 'joe user'
> 
> No need to do anything they custom and specific. No need for new syntax
> either.
> SET myapp.appuser = 'joe'

We seem to be agreed here ... I was claiming that Mark's new syntax
and simply setting a custom variable are different ways of "spelling"
the same idea, but the custom variable approach is less invasive and
can probably be done all within an extension.

>> The other bit of my proposal was to prevent Mallory from spoofing
>> his appident info by managing to inject some SQL through your app
> 
> If your attacker gets that far you're kind of screwed anyway.

Sure, but part of defense-in-depth still applies then, either to
collect more info on who's screwing you or to throw tables and chairs
to make them climb over.

> But that's where something like 'secure variables' or package variables
> come in. See the mailing list discussion on that topic a couple of months
> ago.

My hasty search for a mailing list discussion didn't find one (got
a link?), but I did find a wiki page; is that what you have in mind?

https://wiki.postgresql.org/wiki/Variable_Design

>> That's where it might be handy to have a
>> way to associate the info with the current thread or current request
>> in a way that doesn't need any support in third party layers in the middle,
>> but can be retrieved by the driver (or a thin wrapper around it, down
>> at the bottom of the stack) and turned into the proper SET commands.
> 
> I don't see how postgres can do anything about this. PgJDBC maybe. ...

We're agreed here too; that's why I described it as a separable,
future-work idea. I was just sketching how it might be possible in
the client-side stack to build on whatever basic function (secure
variables or a GUCs-with-cookie extension) postgres could provide.

Any uptake of that idea would have to happen in PgJDBC *and* in
psycopg2 *and* in DBD::Pg *and* in you-name-it for programming
languages/environments x, y, and z. So it's clearly science fiction.
(Actually, a more practical way to do it might be to write thin
wrappers that implement the DB driver API in each language, that
could be used *in place of* PgJDBC or psycopg2 or DBD::Pg and just
delegate everything to the real driver, except the 'connect'
operation would delegate to the real driver but then issue one
query for the cookie, and there'd be some additional non-standard
API for higher layers to call and set values, to be passed along
via SET commands and the cookie ahead of later queries.)

The design for any such thing would want to be idiomatic and
sensitive to the typical usages in each language or software
stack, so I don't remotely propose to do or even design any
for now. Just wanted to put rough ideas into the collective
consciousness to soak for a while.

> The main part I would like is a generic mech[an]ism to inject the value of a
> GUC into the logs.
> 
> For csvlog, it'd be a list of GUC names, each to be emitted as a
> separate field if set, or an empty field if unset.
> 
> For normal log, it'd be available in log_line_prefix as something like
> 
> %(myapp.user)g 
> 
> I can see this being plenty useful for all sorts of work, and nicely
> flexible.

Thank you for bringing that back in, since logging was my motive when
I started this thread, and then my last message was all about other
things.

Yes, I think I'd like to see logging extended in exactly those ways
(modulo whatever exact spelling ends up preferred for %(gucname)g).

Looked at that way, that's a logging extension of general utility
so it's also separable from the 'variables to store on-behalf-of
identities' extension proposed here.

... with the caveat that if the variables-to-store-identities idea
were to end up going a different direction than an-extension-with-GUCs
(secure variables or whatever), then I'd want the logging extension
to also provide an escape syntax for logging whatever those are.

-Chap
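For what it's worth, the %(gucname)g expansion discussed above could be prototyped along these lines; lookup_guc() is a stand-in for GetConfigOption(), and the escape spelling is the one proposed in the thread, not anything PostgreSQL currently supports:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for GetConfigOption(); an unset GUC expands to empty,
 * as proposed for the csvlog case.  Names here are hypothetical. */
static const char *
lookup_guc(const char *name)
{
	if (strcmp(name, "myapp.user") == 0)
		return "joe";
	return "";
}

/* Expand a hypothetical %(gucname)g escape in a log_line_prefix-style
 * template; everything else is copied through verbatim. */
static void
expand_prefix(const char *tmpl, char *out, size_t outlen)
{
	size_t		j = 0;
	const char *p;

	for (p = tmpl; *p != '\0' && j + 1 < outlen; p++)
	{
		const char *end;

		if (p[0] == '%' && p[1] == '(' &&
			(end = strchr(p + 2, ')')) != NULL && end[1] == 'g')
		{
			char		name[64];
			size_t		n = (size_t) (end - (p + 2));

			if (n < sizeof(name))
			{
				const char *val;

				memcpy(name, p + 2, n);
				name[n] = '\0';
				for (val = lookup_guc(name); *val != '\0' && j + 1 < outlen;)
					out[j++] = *val++;
				p = end + 1;	/* loop increment then skips the 'g' */
				continue;
			}
		}
		out[j++] = *p;
	}
	out[j] = '\0';
}
```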




Re: [HACKERS] [BUGS] Concurrent ALTER SEQUENCE RESTART Regression

2017-05-10 Thread Tom Lane
Petr Jelinek  writes:
> On 10/05/17 07:09, Peter Eisentraut wrote:
>> I think the correct fix is to have nextval() and ALTER SEQUENCE use
>> sensible lock levels so that they block each other.  Since
>> nextval() currently uses AccessShareLock, the suggestion was for
>> ALTER SEQUENCE to therefore use AccessExclusiveLock.  But I think a
>> better idea would be for nextval() to use RowExclusiveLock
>> (analogous to UPDATE) and ALTER SEQUENCE to use
>> ShareRowExclusiveLock, which would also satisfy issue #1.

> When I proposed this upstream, Andres raised concern about performance
> of nextval() if we do this, did you try to run any benchmark on this?

As long as it doesn't block, the change in lock strength doesn't actually
make any speed difference does it?

If we were taking AccessExclusiveLock somewhere, I'd be worried about
the cost of WAL-logging those; but this proposal does not include any.

regards, tom lane




Re: [HACKERS] 'nocopy data' option is set in SUBSCRIPTION but still data is getting migrated

2017-05-10 Thread Petr Jelinek
On 10/05/17 15:27, tushar wrote:
> Hi,
> 
> Please refer to this scenario, where the 'nocopy data' option is set in
> SUBSCRIPTION but data is still getting migrated
> 
> Publication - (X)
> create table t(n int);
> insert into t values (generate_series(1,99));
> create publication pub for table  t;
> 
> Subscription (Y)
> create table t(n int);
> CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
> port=5000 user=centos password=a' PUBLICATION pub WITH (copy
> data,SYNCHRONOUS_COMMIT=on);
> select count(*) from t;  ->showing 99 rows
> alter subscription sub refresh publication with (nocopy data);
> restart the server (Y)
> 
> X - insert more records into table 't'
> Y - check the row count , rows have been migrated from X .
> 
> Is it the right behavior in this case where nocopy data option is set ?
> 

Yes, (no)copy data only affects existing data at the time of running the
command; any additional data are always replicated.

The "alter subscription sub refresh publication" does nothing unless you
added/removed tables to/from publication.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [BUGS] Concurrent ALTER SEQUENCE RESTART Regression

2017-05-10 Thread Petr Jelinek
On 10/05/17 07:09, Peter Eisentraut wrote:
> On 5/7/17 19:43, Andres Freund wrote:
>> 3. Keep the catalog, make ALTER properly transactional, blocking
>>concurrent nextval()s. This resolves the issue that nextval() can't
>>see the changed definition of the sequence.
> 
> This was the intended choice.
> 
> [...]
> 
> 5. nextval() disregarding uncommitted ALTER SEQUENCE changes.  In
>Currently, it goes ahead even if there is an uncommitted ALTER
>SEQUENCE pending that would prohibit what it's about to do (e.g.,
>MAXVALUE change).
> 
>I think the correct fix is to have nextval() and ALTER SEQUENCE use
>sensible lock levels so that they block each other.  Since
>nextval() currently uses AccessShareLock, the suggestion was for
>ALTER SEQUENCE to therefore use AccessExclusiveLock.  But I think a
>better idea would be for nextval() to use RowExclusiveLock
>(analogous to UPDATE) and ALTER SEQUENCE to use
>ShareRowExclusiveLock, which would also satisfy issue #1.
> 

When I proposed this upstream, Andres raised concern about performance
of nextval() if we do this, did you try to run any benchmark on this?

Looking at the changes to open_share_lock(), I wonder if we need to have
lock level as parameter so that lastval() is not blocked.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] 'nocopy data' option is set in SUBSCRIPTION but still data is getting migrated

2017-05-10 Thread tushar

Hi,

Please refer to this scenario, where the 'nocopy data' option is set in
SUBSCRIPTION but data is still getting migrated


Publication - (X)
create table t(n int);
insert into t values (generate_series(1,99));
create publication pub for table  t;

Subscription (Y)
create table t(n int);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost 
port=5000 user=centos password=a' PUBLICATION pub WITH (copy 
data,SYNCHRONOUS_COMMIT=on);

select count(*) from t;  ->showing 99 rows
alter subscription sub refresh publication with (nocopy data);
restart the server (Y)

X - insert more records into table 't'
Y - check the row count , rows have been migrated from X .

Is it the right behavior in this case where the nocopy data option is set?

--
regards, tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company





Re: [HACKERS] Removal of plaintext password type references

2017-05-10 Thread Michael Paquier
On Wed, May 10, 2017 at 10:01 PM, Tom Lane  wrote:
> Heikki Linnakangas  writes:
>> Also note that changing the signature check_password_hook_type would
>> break any external modules that use the hook. Removing
>> PASSWORD_TYPE_PLAINTEXT will do that too, because any password hook
>> function would use that constant (see e.g. contrib/passwordcheck). If we
>> were to change the signature, I'd actually like to simplify it by
>> removing the password_type parameter altogether. The hook function can
>> call get_password_type() on the password itself to get the same
>> information. But it's not worth changing the API and breaking external
>> modules for that.

Ahah. I just had the same thought before reading this message.

> FWIW, I think we've never hesitated to change hook signatures across major
> versions if there was a good reason for it.  It seems actually rather
> unlikely that an external module interested in check_password_hook would
> not need to know about the SCRAM changes, so this case seems like it's
> easily justifiable.

My modules break every couple of months (the utility hook being the
latest one to change), and usually the fixes are no big deal.  Removing
password_type is in this category, and as long as compilation fails
cleanly people will be able to notice problems easily.
-- 
Michael




Re: [HACKERS] Declarative partitioning - another take

2017-05-10 Thread Robert Haas
On Wed, May 10, 2017 at 6:50 AM, Etsuro Fujita
 wrote:
> I started working on this.  I agree that the changes made in
> make_modifytable would be unacceptable, but I'd vote for Amit's idea of
> passing a modified PlannerInfo to PlanForeignModify so that the FDW can do
> query planning for INSERT into a foreign partition in the same way as for
> INSERT into a non-partition foreign table.  (Though, I think we should
> generate a more-valid-looking working-copy of the PlannerInfo which has
> Query with the foreign partition as target.)  I'm not sure it's a good idea
> to add a new FDW API or modify the existing one such as PlanForeignModify
> for this purpose.

Thanks for working on this, but I think it would be better to start a
new thread for this discussion.

And probably also for any other issues that come up.  This thread has
gotten so long that between the original discussion and commit of the
patch and discussion of multiple follow-on commits and patches that
it's very hard to follow.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)

2017-05-10 Thread Robert Haas
On Wed, May 10, 2017 at 8:02 AM, Thomas Munro
 wrote:
> On Wed, May 10, 2017 at 11:10 PM, Thomas Munro
>  wrote:
>> 2.  If you attach a row-level trigger with transition tables to any
>> inheritance child, it will see transition tuples from all tables in
>> the inheritance hierarchy at or below the directly named table that
>> were modified by the same statement, sliced so that they appear as
>> tuples from the directly named table.
>
> Of course that's a bit crazy, not only for trigger authors to
> understand and deal with, but also for plan caching: it just doesn't
> really make sense to have a database object, even an ephemeral one,
> whose type changes depending on how the trigger was invoked, because
> the plans stick around.  Perhaps you could modify NamedTuplestoreScan
> to convert on the fly to the TupleDesc of the table that the row-level
> trigger is attached to, using NULL for missing columns, but that'd be
> slightly strange too, depending on how you did it.

I don't think it's crazy from a user perspective, but the plan caching
thing sounds like a problem.

> Perhaps we should reject row-level triggers with transition tables on
> tables that are part of an inheritance hierarchy, but allow them for
> partitions.

Sounds like a sensible solution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [BUGS] Concurrent ALTER SEQUENCE RESTART Regression

2017-05-10 Thread Michael Paquier
On Wed, May 10, 2017 at 2:09 PM, Peter Eisentraut
 wrote:
> I think I have more clarity about the different, overlapping issues:
>
> 1. Concurrent ALTER SEQUENCE causes "tuple concurrently updated"
>error, caused by updates to pg_sequence catalog.  This can be fixed
>by taking a self-exclusive lock somewhere.
>
>A typical answer for other catalogs is to use
>ShareRowExclusiveLock.  But in another context it was argued that
>is undesirable for other reasons, and it's better to stick with
>RowExclusiveLock and deal with the occasional consequences.  But
>this question is obsoleted by #2.

Yes.

> 2. There is some inconsistency or disagreement about what to lock
>when modifying relation metadata.  When you alter a non-relation
>object, you just have the respective catalog to lock, and you have
>to make the trade-offs described in #1.  When you alter a relation
>object, you can lock the catalog or the actual relation, or both.
>I had gone with locking the catalog, but existing practice in
>tablecmds.c is to lock the relation with the most appropriate lock
>level, and use RowExclusiveLock for the catalog.  We can make
>sequences do that, too.

Agreed, sequences are relations as well, in the end. Consistency sounds
good to me.

> 3. Sequence WAL writes and catalog updates are not protected by same
>lock.  We can fix that by holding a lock longer.  If per #2 we make
>the lock on the sequence the "main" one, then we just hold that one
>across the catalog update.
>
> 4. Some concerns about in AlterSequence() opening the pg_sequence
>catalog while holding the sequence buffer lock.  Move some things
>around to fix that.
>
> 5. nextval() disregarding uncommitted ALTER SEQUENCE changes.
>Currently, it goes ahead even if there is an uncommitted ALTER
>SEQUENCE pending that would prohibit what it's about to do (e.g.,
>MAXVALUE change).

Yes, and we want to block all nextval() calls until concurrent ALTER
SEQUENCE are committed.

>I think the correct fix is to have nextval() and ALTER SEQUENCE use
>sensible lock levels so that they block each other.  Since
>nextval() currently uses AccessShareLock, the suggestion was for
>ALTER SEQUENCE to therefore use AccessExclusiveLock.  But I think a
>better idea would be for nextval() to use RowExclusiveLock
>(analogous to UPDATE) and ALTER SEQUENCE to use
>ShareRowExclusiveLock, which would also satisfy issue #1.

No objections to that.

> Attached patches:
>
> 0001 Adds isolation tests for sequence behavior, in particular regarding
> point #5.  The expected files in 0001 show how it behaves in 9.6, and if
> you apply it to 10 without the subsequent fixes, it will show some problems.
>
> 0002 Locking changes discussed above, with test changes.
>
> 0003 (optional) Reduce locking if only RESTART is specified (because
> that does not do a catalog update, so it can run concurrently with nextval).

Looking at 0001 and 0002... So you are correctly blocking nextval()
when ALTER SEQUENCE holds a lock on the sequence object. And
concurrent calls of nextval() don't conflict. As far as I can see this
matches the implementation of 3.

Here are some minor comments.

+   /* lock page' buffer and read tuple into new sequence structure */
Nit: there is a single quote in this comment.

 /*
+ * TODO rename for lock level change
+ *
  * Open the sequence and acquire AccessShareLock if needed
  *
Some of the functions calling init_sequence() reference
AccessShareLock in comments. Those will need an update as well.

+++ b/src/test/isolation/expected/sequence-nextval.out
@@ -0,0 +1,35 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1a s1b s2a s2b s1c s1d s2c s2d
+step s1a: SELECT nextval('seq1');
+nextval
I am not sure that this brings much value...
-- 
Michael




Re: [HACKERS] Removal of plaintext password type references

2017-05-10 Thread Tom Lane
Heikki Linnakangas  writes:
> Also note that changing the signature check_password_hook_type would 
> break any external modules that use the hook. Removing 
> PASSWORD_TYPE_PLAINTEXT will do that too, because any password hook 
> function would use that constant (see e.g. contrib/passwordcheck). If we 
> were to change the signature, I'd actually like to simplify it by 
> removing the password_type parameter altogether. The hook function can 
> call get_password_type() on the password itself to get the same 
> information. But it's not worth changing the API and breaking external 
> modules for that.

FWIW, I think we've never hesitated to change hook signatures across major
versions if there was a good reason for it.  It seems actually rather
unlikely that an external module interested in check_password_hook would
not need to know about the SCRAM changes, so this case seems like it's
easily justifiable.

regards, tom lane




Re: [HACKERS] [POC] hash partitioning

2017-05-10 Thread Ashutosh Bapat
On Wed, May 3, 2017 at 6:39 PM, amul sul  wrote:
> On Thu, Apr 27, 2017 at 1:42 AM, Robert Haas  wrote:
>
>>
>> This is not yet a detailed review - I may be missing things, and
>> review and commentary from others is welcome.  If there is no major
>> disagreement with the idea of moving forward using Amul's patch as a
>> base, then I will do a more detailed review of that patch (or,
>> hopefully, an updated version that addresses the above comments).
>

I agree that Amul's approach makes dump/restore feasible whereas
Nagata-san's approach makes that difficult. That is a major plus point
of Amul's patch. Also, it leaves open the possibility of implementing
Nagata-san's more user-friendly syntax in the future.

Here are some review comments after my initial reading of Amul's patch:

Hash partitioning will partition the data based on the hash value of the
partition key. Does that require collation? Should we throw an error/warning if
a collation is specified in the PARTITION BY clause?

+int   *indexes;/* Partition indexes; in case of hash
+ * partitioned table array length will be
+ * value of largest modulus, and for others
+ * one entry per member of the datums array
+ * (plus one if range partitioned table) */
This may be rewritten as "Partition indexes: For hash partitioned table the
number of indexes will be same as the largest modulus. For list partitioned
table the number of indexes will be same as the number of datums. For range
partitioned table the number of indexes will be number of datums plus one.".
You may be able to reword it to a shorter version, but essentially we will have
separate description for each strategy.

I guess we need to change the comments for the other members too. For example,
"datums" does not contain tuples with key->partnatts attributes for hash
partitions; it contains a tuple with two attributes, modulus and remainder. We
may not want to track null_index separately, since rows with a NULL partition key
will fit in the partition corresponding to the hash value of NULL. Or maybe we
want to set null_index to the partition which contains NULL values, if there is a
partition created for the corresponding (modulus, remainder) pair, and set has_null
accordingly. Accordingly we will need to update the comments.

cal_hash_value() may be renamed as calc_hash_value() or compute_hash_value()?

Should we change the if .. else if .. construct in RelationBuildPartitionDesc()
to a switch case? There's very little chance that we will support a fourth
partitioning strategy, so if .. else if .. may be fine.

+intmod = hbounds[i]->modulus,
+place = hbounds[i]->remainder;
Although there are places in the code where we separate variable declarations
of the same type by commas, most of the code declares each variable with its data
type on a separate line. Also, should the variable "place" be renamed to
"remainder", since that's what it ultimately is?

RelationBuildPartitionDesc() fills up the mapping array but never uses it. In this
code the index into the mapping array is itself the mapping, so it doesn't need to
be maintained separately as in the list partitioning case. Similarly, the next_index
usage looks unnecessary, although it probably improves readability, so that may be fine.

+ *   for p_p1: satisfies_hash_partition(2, 1, pkey, value)
+ *   for p_p2: satisfies_hash_partition(4, 2, pkey, value)
+ *   for p_p3: satisfies_hash_partition(8, 0, pkey, value)
+ *   for p_p4: satisfies_hash_partition(8, 4, pkey, value)
What the function builds is satisfies_hash_partition(2, 1, pkey). I don't see
code to add value as an argument to the function. Is that correct?

+intmodulus = DatumGetInt32(datum);
May be you want to rename this variable to greatest_modulus like in the other
places.

+Assert(spec->modulus > 0 && spec->remainder >= 0);
I liked this assertion. Do you want to also assert spec->modulus > spec->remainder
here?

+char   *strategy;/* partitioning strategy
+   ('hash', 'list' or 'range') */

We need the second line to start with '*'

+-- check validation when attaching list partitions
Do you want to say "hash" instead of "list" here?

I think we need to explain the reasoning behind this syntax somewhere
as a README or in the documentation or in the comments. Otherwise it's
difficult to understand how various pieces of code are related.

This is not a full review. I am still trying to understand how the hash
partitioning implementation fits with list and range partitioning. I am
going to continue reviewing this patch further.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)

2017-05-10 Thread Thomas Munro
On Wed, May 10, 2017 at 11:10 PM, Thomas Munro
 wrote:
> 2.  If you attach a row-level trigger with transition tables to any
> inheritance child, it will see transition tuples from all tables in
> the inheritance hierarchy at or below the directly named table that
> were modified by the same statement, sliced so that they appear as
> tuples from the directly named table.

Of course that's a bit crazy, not only for trigger authors to
understand and deal with, but also for plan caching: it just doesn't
really make sense to have a database object, even an ephemeral one,
whose type changes depending on how the trigger was invoked, because
the plans stick around.  Perhaps you could modify NamedTuplestoreScan
to convert on the fly to the TupleDesc of the table that the row-level
trigger is attached to, using NULL for missing columns, but that'd be
slightly strange too, depending on how you did it.

Perhaps we should reject row-level triggers with transition tables on
tables that are part of an inheritance hierarchy, but allow them for
partitions.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)

2017-05-10 Thread Thomas Munro
On Wed, May 10, 2017 at 3:55 PM, Robert Haas  wrote:
> On Tue, May 9, 2017 at 11:48 PM, Thomas Munro
>  wrote:
>> Hmm.  DB2 has transition tables (invented them maybe?) and it allows
>> OLD/NEW TABLE on row-level triggers:
>>
>> https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.1.0/com.ibm.db2.luw.admin.dbobj.doc/doc/t0020236.html
>
> Yeah, my impression is that Kevin was pretty keen on supporting that
> case.  I couldn't say exactly why, though.

Ok, here's a new version that handles row-level triggers with
transition tables on any child table.  The regression tests show
partition and inheritance examples of that.  To be clear about what
this does:

1.  If you attach a row-level trigger with transition tables to any
partition, it will see transition tuples from all partitions that were
modified by the same statement.

2.  If you attach a row-level trigger with transition tables to any
inheritance child, it will see transition tuples from all tables in
the inheritance hierarchy at or below the directly named table that
were modified by the same statement, sliced so that they appear as
tuples from the directly named table.

On Wed, May 10, 2017 at 3:41 PM, Robert Haas  wrote:
> Hmm.  What if the partitioning hierarchy contains foreign tables?

Arghalalkjhsdflg.  Looking into that...

-- 
Thomas Munro
http://www.enterprisedb.com


transition-tuples-from-child-tables-v3.patch
Description: Binary data



Re: [HACKERS] Declarative partitioning - another take

2017-05-10 Thread Etsuro Fujita

On 2016/11/18 1:43, Robert Haas wrote:
> On Thu, Nov 17, 2016 at 6:27 AM, Amit Langote
>  wrote:
>>> - The code in make_modifytable() to swap out the rte_array for a fake
>>> one looks like an unacceptable kludge.  I don't know offhand what a
>>> better design would look like, but what you've got is really ugly.
>>
>> Agree that it looks horrible.  The problem is we don't add partition
>> (child table) RTEs when planning an insert on the parent and FDW
>> partitions can't do without some planner handling - planForeignModify()
>> expects a valid PlannerInfo for deparsing target lists (basically, to be
>> able to use planner_rt_fetch()).
>
> If it's only needed for foreign tables, how about for v1 we just throw
> an error and say errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> errmsg("cannot route inserted tuples to a foreign table") for now.  We
> can come back and fix it later.  Doing more inheritance expansion
>
> Coming up with some new FDW API or some modification
> to the existing one is probably better, but I don't really want to get
> hung up on that right now.


I started working on this.  I agree that the changes made in 
make_modifytable would be unacceptable, but I'd vote for Amit's idea of 
passing a modified PlannerInfo to PlanForeignModify so that the FDW can 
do query planning for INSERT into a foreign partition in the same way as 
for INSERT into a non-partition foreign table.  (Though, I think we 
should generate a more-valid-looking working-copy of the PlannerInfo 
which has Query with the foreign partition as target.)  I'm not sure 
it's a good idea to add a new FDW API or modify the existing one such as 
PlanForeignModify for this purpose.


Best regards,
Etsuro Fujita





Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-10 Thread Rahila Syed
>It seems that adding a new partition at the same level as the default
>partition will require scanning it or its (leaf) partitions if
>partitioned.  Consider that p1, pd are partitions of a list-partitioned
>table p accepting 1 and everything else, respectively, and pd is further
>partitioned.  When adding p2 of p for 2, we need to scan the partitions of
>pd to check if there are any (2, ...) rows.

This is a better explanation. Maybe the following sentence was confusing:
"That is prohibit creation of new partition after a default partition which
is further partitioned"
Here, what I meant was that the default partition is itself partitioned further.

>As for fixing the reported issue whereby the partitioned default
>partition's non-existent file is being accessed, it would help to take a
>look at the code in ATExecAttachPartition() starting at the following:
OK, I get it now. If ATTACH PARTITION already supports scanning all the
partitions before attaching, similar support should be provided when adding
a partition after the default partition as well.

Thank you,
Rahila Syed

On Wed, May 10, 2017 at 6:42 AM, Amit Langote  wrote:

> On 2017/05/10 2:09, Robert Haas wrote:
> > On Tue, May 9, 2017 at 9:26 AM, Rahila Syed 
> wrote:
> >>> Hi Rahila,
> >>
> >>> I am not able add a new partition if default partition is further
> >>> partitioned
> >>> with default partition.
> >>
> >>> Consider example below:
> >>
> >>> postgres=# CREATE TABLE test ( a int, b int, c int) PARTITION BY LIST
> (a);
> >>> CREATE TABLE
> >>> postgres=# CREATE TABLE test_p1 PARTITION OF test FOR VALUES IN(4, 5,
> 6, 7,
> >>> 8);
> >>> CREATE TABLE
> >>> postgres=# CREATE TABLE test_pd PARTITION OF test DEFAULT PARTITION BY
> >>> LIST(b);
> >>> CREATE TABLE
> >>> postgres=# CREATE TABLE test_pd_pd PARTITION OF test_pd DEFAULT;
> >>> CREATE TABLE
> >>> postgres=# INSERT INTO test VALUES (20, 24, 12);
> >>> INSERT 0 1
> >>> postgres=# CREATE TABLE test_p2 PARTITION OF test FOR VALUES IN(15);
> >> ERROR:  could not open file "base/12335/16420": No such file or
> directory
> >>
> >> Regarding fix for this I think we need to prohibit this case. That is
> >> prohibit creation
> >> of new partition after a default partition which is further partitioned.
> >> Currently before adding a new partition after default partition all the
> rows
> >> of default
> >> partition are scanned and if a row which matches the new partitions
> >> constraint exists
> >> the new partition is not added.
> >>
> >> If we allow this for default partition which is partitioned further, we
> will
> >> have to scan
> >> all the partitions of default partition for matching rows which can slow
> >> down execution.
> >
> > I think this case should be allowed
>
> +1
>
> > and I don't think it should
> > require scanning all the partitions of the default partition.  This is
> > no different than any other case where multiple levels of partitioning
> > are used.  First, you route the tuple at the root level; then, you
> > route it at the next level; and so on.  It shouldn't matter whether
> > the routing at the top level is to that level's default partition or
> > not.
>
> It seems that adding a new partition at the same level as the default
> partition will require scanning it or its (leaf) partitions if
> partitioned.  Consider that p1, pd are partitions of a list-partitioned
> table p accepting 1 and everything else, respectively, and pd is further
> partitioned.  When adding p2 of p for 2, we need to scan the partitions of
> pd to check if there are any (2, ...) rows.
>
> As for fixing the reported issue whereby the partitioned default
> partition's non-existent file is being accessed, it would help to take a
> look at the code in ATExecAttachPartition() starting at the following:
>
> /*
>  * Set up to have the table be scanned to validate the partition
>  * constraint (see partConstraint above).  If it's a partitioned
> table, we
>  * instead schedule its leaf partitions to be scanned.
>  */
> if (!skip_validate)
> {
>
> Thanks,
> Amit
>
>


Re: [HACKERS] Page Scan Mode in Hash Index

2017-05-10 Thread Ashutosh Sharma
While doing code coverage testing of the v7 patch shared in [1], I
found that there are a few lines of code in _hash_next() that are
redundant and need to be removed. I basically noticed this while
testing the scenario where a hash index scan starts when a split is in
progress. I have removed those lines, and attached is a newer version
of the patch.

[1] - 
https://www.postgresql.org/message-id/CAE9k0Pmn92Le0iOTU5b6087eRXzUnkq1PKcihxtokHJ9boXCBg%40mail.gmail.com

On Mon, May 8, 2017 at 6:55 PM, Ashutosh Sharma  wrote:
> Hi,
>
> On Tue, Apr 4, 2017 at 3:56 PM, Amit Kapila  wrote:
>> On Sun, Apr 2, 2017 at 4:14 AM, Ashutosh Sharma  
>> wrote:
>>>
>>> Please note that these patches needs to be applied on top of  [1].
>>>
>>
>> Few more review comments:
>>
>> 1.
>> - page = BufferGetPage(so->hashso_curbuf);
>> + blkno = so->currPos.currPage;
>> + if (so->hashso_bucket_buf == so->currPos.buf)
>> + {
>> + buf = so->currPos.buf;
>> + LockBuffer(buf, BUFFER_LOCK_SHARE);
>> + page = BufferGetPage(buf);
>> + }
>>
>> Here, you are assuming that only bucket page can be pinned, but I
>> think that assumption is not right.  When _hash_kill_items() is called
>> before moving to next page, there could be a pin on the overflow page.
>> You need some logic to check if the buffer is pinned, then just lock
>> it.  I think once you do that, it might not be convenient to release
>> the pin at the end of this function.
>
> Yes, there are few cases where we might have pin on overflow pages too
> and we need to handle such cases in _hash_kill_items. I have taken
> care of this in the attached v7 patch. Thanks.
>
>>
>> Add some comments on top of _hash_kill_items to explain the new
>> changes or say some thing like "See _bt_killitems for details"
>
> Added some more comments on the new changes.
>
>>
>> 2.
>> + /*
>> + * We save the LSN of the page as we read it, so that we know whether it
>> + * safe to apply LP_DEAD hints to the page later.  This allows us to drop
>> + * the pin for MVCC scans, which allows vacuum to avoid blocking.
>> + */
>> + so->currPos.lsn = PageGetLSN(page);
>> +
>>
>> The second part of above comment doesn't make sense because vacuum's
>> will still be blocked because we hold the pin on primary bucket page.
>
> That's right. It doesn't make sense because we won't allow vacuum to
> start. I have removed it.
>
>>
>> 3.
>> + {
>> + /*
>> + * No more matching tuples were found. return FALSE
>> + * indicating the same. Also, remember the prev and
>> + * next block number so that if fetching tuples using
>> + * cursor we remember the page from where to start the
>> + * scan.
>> + */
>> + so->currPos.prevPage = (opaque)->hasho_prevblkno;
>> + so->currPos.nextPage = (opaque)->hasho_nextblkno;
>>
>> You can't read opaque without having lock and by this time it has
>> already been released.
>
> I have corrected it. Please refer to the attached v7 patch.
>
>> Also, I think if you want to save position for
>> cursor movement, then you need to save the position of last bucket
>> when scan completes the overflow chain, however as you have written it
>> will be always invalid block number. I think there is similar problem
>> with prevblock number.
>
> Did you mean the last bucket or the last page? If it is the last page, then
> I am already storing it.
>
>>
>> 4.
>> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber 
>> offnum,
>> +   OffsetNumber maxoff, ScanDirection dir)
>> +{
>> + HashScanOpaque so = (HashScanOpaque) scan->opaque;
>> + IndexTuple  itup;
>> + int itemIndex;
>> +
>> + if (ScanDirectionIsForward(dir))
>> + {
>> + /* load items[] in ascending order */
>> + itemIndex = 0;
>> +
>> + /* new page, relocate starting position by binary search */
>> + offnum = _hash_binsearch(page, so->hashso_sk_hash);
>>
>> What is the need to find offset number when this function already has
>> that as an input parameter?
>
> It's not required. I have removed it.
>
>>
>> 5.
>> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber 
>> offnum,
>> +   OffsetNumber maxoff, ScanDirection dir)
>>
>> I think maxoff is not required to be passed a parameter to this
>> function as you need it only for forward scan. You can compute it
>> locally.
>
> It is required for both forward and backward scan. However, I am not
> passing it to _hash_load_qualified_items() now.
>
>>
>> 6.
>> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber 
>> offnum,
>> +   OffsetNumber maxoff, ScanDirection dir)
>> {
>> ..
>> + if (ScanDirectionIsForward(dir))
>> + {
>> ..
>> + while (offnum <= maxoff)
>> {
>> ..
>> + if (so->hashso_sk_hash == _hash_get_indextuple_hashkey(itup) &&
>> + _hash_checkqual(scan, itup))
>> + {
>> + /* tuple is qualified, so remember it */
>> + _hash_saveitem(so, itemIndex, offnum, itup);
>> + itemIndex++;
>> + }
>> +
>> + offnum = OffsetNumberNext(offnum);
>> ..
>>
>> Why are you traversing the whole page 

Re: [HACKERS] snapbuild woes

2017-05-10 Thread Petr Jelinek
On 09/05/17 22:11, Erik Rijkers wrote:
> On 2017-05-09 21:00, Petr Jelinek wrote:
>> On 09/05/17 19:54, Erik Rijkers wrote:
>>> On 2017-05-09 11:50, Petr Jelinek wrote:
>>>
>>
>> Ah okay, so this is same issue that's reported by both Masahiko Sawada
>> [1] and Jeff Janes [2].
>>
>> [1]
>> https://www.postgresql.org/message-id/CAD21AoBYpyqTSw%2B%3DES%2BxXtRGMPKh%3DpKiqjNxZKnNUae0pSt9bg%40mail.gmail.com
>>
>> [2]
>> https://www.postgresql.org/message-id/flat/CAMkU%3D1xUJKs%3D2etq2K7bmbY51Q7g853HLxJ7qEB2Snog9oRvDw%40mail.gmail.com
>>
> 
> I don't understand why you come to that conclusion: both Masahiko Sawada
> and Jeff Janes have a DROP SUBSCRIPTION in the mix; my cases haven't. 
> Isn't that a real difference?
> 

Because I only see the sync workers running in the output you showed. The
bug described in those threads can, as one of its side effects, cause the
sync workers to wait forever for an apply worker that was killed, crashed,
etc. at just the right moment, which I guess is what happened during your
failed test (there should be some info in the log about the apply worker
exiting).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Documentation about pg_stat_bgwriter

2017-05-10 Thread Kyotaro HORIGUCHI
Hi. While reading the documentation I found the following descriptions
of some columns in pg_stat_bgwriter.

https://www.postgresql.org/docs/devel/static/monitoring-stats.html#pg-stat-bgwriter-view

This table shows cluster-global values, not per-backend values.

> maxwritten_clean:
>   Number of times the background writer stopped a cleaning scan
>   because it had written too many buffers
> buffers_backend:
>   Number of buffers written directly by a backend
> buffers_backend_fsync:
>   Number of times a backend had to execute its own fsync call
>   (normally the background writer handles those even when the
>   backend does its own write)

Since these values are cluster-wide summaries, the phrase 'a backend' in the
last two descriptions seems wrong *to me*. I suppose 'a backend' should be
just 'backends', or 'backends other than the background writer' (though that
seems a bit verbose).

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] Removal of plaintext password type references

2017-05-10 Thread Heikki Linnakangas

On 05/10/2017 08:01 AM, Michael Paquier wrote:
> On Wed, May 10, 2017 at 10:51 AM, Vaishnavi Prabakaran
>  wrote:
>> Following recent removal of support to store password in plain text,
>> modified the code to
>>
>> 1. Remove "PASSWORD_TYPE_PLAINTEXT" macro
>> 2. Instead of using "get_password_type" to retrieve the encryption method
>> AND to check if the password is already encrypted or not, modified the code
>> to
>> a. Use "get_password_encryption_type" function to retrieve encryption
>> method.
>> b. Use "isPasswordEncrypted" function to check if the password is already
>> encrypted or not.
>>
>> These changes are mainly to increase code readability and does not change
>> underlying functionality.
>
> Actually, this patch reduces readability.


Yeah, I don't think this is an improvement. Vaishnavi, did you find the 
current code difficult? Perhaps some extra comments somewhere would help?


Also note that changing the signature check_password_hook_type would 
break any external modules that use the hook. Removing 
PASSWORD_TYPE_PLAINTEXT will do that too, because any password hook 
function would use that constant (see e.g. contrib/passwordcheck). If we 
were to change the signature, I'd actually like to simplify it by 
removing the password_type parameter altogether. The hook function can 
call get_password_type() on the password itself to get the same 
information. But it's not worth changing the API and breaking external 
modules for that.


- Heikki





Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-10 Thread Craig Ringer
On 10 May 2017 10:44 am, "Chapman Flack"  wrote:

> On 05/09/17 18:48, Mark Dilger wrote:
>
>> I don't have any positive expectation that the postgres community will go
>> along with any of this, but just from my point of view, the cleaner way to
>> do what you are proposing is something like setting a session variable.
>>
>> In your middle tier java application or whatever, you'd run something like
>>
>> SET SESSION ON BEHALF OF 'joe user'


No need to do anything that custom and specific. No need for new syntax
either.


SET myapp.appuser = 'joe'

Or use SET LOCAL for xact scoped.


> The other bit of my proposal was to prevent Mallory from spoofing
> his appident info by managing to inject some SQL through your app


If your attacker gets that far you're kind of screwed anyway.

But that's where something like 'secure variables' or package variables
come in. See the mailing list discussion on that topic a couple of months
ago.



SET SESSION ON BEHALF OF 'joe user' BECAUSE I HAVE :cookie AND I SAY SO;


I do want something similar to this for SET SESSION AUTHORIZATION.

But for most things a secure variable model with a setter function should
work better.



Without any new syntax


Much, much more chance of this.


It's those more complex architectures I was thinking of with the client-
side ideas, where your code may be at the top of such a tall stack of
persistence/ORM/whatever layers that you're not sure you can just emit
an arbitrary SET command and have it come out in front of the right query
generated by the lower layers.


Surely in that case you have the same problem with something based on new
syntax?


That's where it might be handy to have a
way to associate the info with the current thread or current request
in a way that doesn't need any support in third party layers in the middle,
but can be retrieved by the driver (or a thin wrapper around it, down
at the bottom of the stack) and turned into the proper SET commands. That's
really a separable, less immediate, future-work idea.


I don't see how postgres can do anything about this. PgJDBC maybe. But
probably not.

The main part I would like is a generic mechanism to inject the value of a
GUC into the logs.

For csvlog, it'd be a list of GUC names, each to be emitted as a
separate field if set, or an empty field if unset.

For normal log, it'd be available in log_line_prefix as something like

%(myapp.user)g

... or whatever.

I can see this being plenty useful for all sorts of work, and nicely
flexible.
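
To make the proposal concrete, here is a toy sketch of how a %(name)g
escape could be expanded. The escape syntax is hypothetical, taken from
this thread; PostgreSQL's log_line_prefix does not implement it, and the
function name is made up for illustration.

```python
import re

# Toy expansion of the proposed %(name)g log_line_prefix escape.
# Hypothetical syntax from this thread, not an existing feature.

def expand_prefix(prefix: str, gucs: dict) -> str:
    """Replace each %(name)g with the GUC's value, or '' if unset."""
    def repl(m):
        return gucs.get(m.group(1), "")
    return re.sub(r"%\(([\w.]+)\)g", repl, prefix)
```

For example, with myapp.user set to 'joe', a prefix of
"user=%(myapp.user)g " would expand to "user=joe ", and to "user= " when
the GUC is unset.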


Re: [HACKERS] PQhost may return socket dir for network connection

2017-05-10 Thread Kyotaro HORIGUCHI
At Mon, 1 May 2017 15:48:17 -0400, Robert Haas  wrote in 

> On Mon, May 1, 2017 at 1:36 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Mon, May 1, 2017 at 12:06 PM, Tom Lane  wrote:
> >>> Having said that, the behavior stated in $subject does sound wrong.
> >
> >> I'm not sure.  My understanding of the relationship between host and
> >> hostaddr is that hostaddr overrides our notion of where to find host,
> >> but not our notion of the host to which we're connecting.  Under that
> >> definition, the current behavior as described by Kyotaro sounds
> >> correct.
> >
> > Perhaps.  But hostaddr also forces us to believe that we're making an
> > IP connection, so it still seems pretty dubious to return a socket
> > path.  The true situation is that we're connecting to an IP host that
> > we do not know the name of.
> 
> Yes, I think that's a reasonable interpretation.
> 
> > I notice that one of the recent changes was made to avoid situations where
> > PQhost() would return NULL and thereby provoke a crash if the application
> > wasn't expecting that (which is not unreasonable of it, since the PQhost()
> > documentation mentions no such hazard).  So I would not want to see us
> > return NULL in this case.
> > And I believe we already considered and rejected the idea of having it
> > return the hostaddr string, back in some of the older discussions.
> > (We could revisit that decision, no doubt, but let's go back and see
> > what the reasoning was first.)
> >
> > But maybe returning an empty string ("") would be OK?
> 
> Yeah, that might be OK.  But I'd be inclined not to back-patch any
> behavior changes we make in this area unless it's clear that 9.6
> regressed relative to previous releases.

I personally don't have a strong opinion on this since I don't
have a specific use for PQhost. (I think that users are
responsible for the result of specifying contradictory host and hostaddr.)

However, it might be a problem that the documentation doesn't make
clear what PQhost returns in this situation.

https://www.postgresql.org/docs/9.6/static/libpq-status.html

> Returns the server host name of the connection. This can be a
> host name, an IP address, or a directory path if the connection
> is via Unix socket. (The path case can be distinguished because
> it will always be an absolute path, beginning with /.)

I don't think a stricter definition is required, but the above
should describe the expected usage or its obvious limitations. In any
case, the documentation contradicts the current behavior: PQhost can
return a socket path for an IP connection.
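
To make the behaviors under discussion concrete, here is a toy Python
model of what PQhost() returns for each combination of host and hostaddr,
including Tom's proposed empty-string return. The function name and the
default socket directory are illustrative, not libpq's actual
implementation.

```python
from typing import Optional

DEFAULT_PGSOCKET_DIR = "/tmp"  # illustrative; the real default is build-time

def pqhost(host: Optional[str], hostaddr: Optional[str],
           return_empty_for_ip: bool = False) -> str:
    """Toy model of PQhost() for the cases discussed in this thread."""
    if host:
        # host given: PQhost returns it (name, IP, or socket directory).
        return host
    if hostaddr:
        # Only hostaddr given: the connection is over IP, yet the
        # reported current behavior falls back to the socket directory.
        # Tom's proposal is to return "" instead.
        return "" if return_empty_for_ip else DEFAULT_PGSOCKET_DIR
    # Neither given: default Unix-socket connection.
    return DEFAULT_PGSOCKET_DIR
```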

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] Time based lag tracking for logical replication

2017-05-10 Thread Noah Misch
On Mon, May 08, 2017 at 10:30:45PM -0700, Noah Misch wrote:
> On Fri, May 05, 2017 at 07:07:26AM +, Noah Misch wrote:
> > On Wed, May 03, 2017 at 08:28:53AM +0200, Simon Riggs wrote:
> > > On 23 April 2017 at 01:10, Petr Jelinek  
> > > wrote:
> > > > Hi,
> > > >
> > > > The time based lag tracking commit [1] added interface for logging
> > > > progress of replication so that we can report lag as time interval
> > > > instead of just bytes. But the patch didn't contain patch for the
> > > > builtin logical replication.
> > > >
> > > > So I wrote something that implements this. I didn't like all that much
> > > > the API layering in terms of exporting the walsender's LagTrackerWrite()
> > > > for use by plugin directly. Normally output plugin does not have to care
> > > > if it's running under walsender or not, it uses abstracted write
> > > > interface for that which can be implemented in various ways (that's how
> > > > we implement SQL interface to logical decoding after all). So I decided
> > > > to add another function to the logical decoding write api called
> > > > update_progress and call that one from the output plugin. The walsender
> > > > then implements that new API to call the LagTrackerWrite() while the SQL
> > > > interface just does not implement it at all. This seems like cleaner way
> > > > of doing it.
> > > >
> > > > Thoughts?
> > > 
> > > Agree cleaner.
> > > 
> > > I don't see any pacing or comments about it, nor handling of
> > > intermediate messages while we process a large transaction.
> > > 
> > > I'll look at adding some pacing code in WalSndUpdateProgress()
> > 
> > [Action required within three days.]
> > 
> > Simon, I understand, from an annotation on the open items list, that you 
> > have
> > volunteered to own this item.  Please observe the policy on open item
> > ownership[1] and send a status update within three calendar days of this
> > message.  Include a date for your subsequent status update.  Testers may
> > discover new open items at any time, and I want to plan to get them all 
> > fixed
> > well in advance of shipping v10.  Consequently, I will appreciate your 
> > efforts
> > toward speedy resolution.  Thanks.
> > 
> > [1] 
> > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
> 
> This PostgreSQL 10 open item is past due for your status update.  Kindly send
> a status update within 24 hours, and include a date for your subsequent status
> update.  Refer to the policy on open item ownership:
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

IMMEDIATE ATTENTION REQUIRED.  This PostgreSQL 10 open item is long past due
for your status update.  Please reacquaint yourself with the policy on open
item ownership[1] and then reply immediately.  If I do not hear from you by
2017-05-11 06:00 UTC, I will transfer this item to release management team
ownership without further notice.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] Remove pre-10 compatibility code in hash index

2017-05-10 Thread Amit Kapila
On Wed, May 10, 2017 at 9:17 AM, Robert Haas  wrote:
> On Tue, May 9, 2017 at 1:41 AM, Amit Kapila  wrote:
>> Commit ea69a0dead5128c421140dc53fac165ba4af8520 has bumped the hash
>> index version and obviates the need for backward compatibility code
>> added by commit 293e24e507838733aba4748b514536af2d39d7f2.  The same
>> has been mentioned in the commit message, please find attached patch
>> to remove the pre-10 compatibility code in hash index.
>
> Committed.
>

Thanks!

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] bumping HASH_VERSION to 3

2017-05-10 Thread Amit Kapila
On Tue, Apr 11, 2017 at 11:53 PM, Bruce Momjian  wrote:
> On Fri, Mar 31, 2017 at 02:48:05PM -0400, Tom Lane wrote:
>> +1, as long as we're clear on what will happen when pg_upgrade'ing
>> an installation containing hash indexes.  I think a minimum requirement is
>> that it succeed and be able to start up, and allow the user to manually
>> REINDEX such indexes afterwards.  Bonus points for:
>>
>> 1. teaching pg_upgrade to create a script containing the required REINDEX
>> commands.  (I think it's produced scripts for similar requirements in the
>> past.)
>>
>> 2. marking the index invalid so that the system would silently ignore it
>> until it's been reindexed.  I think there might be adequate infrastructure
>> for that already thanks to REINDEX CONCURRENTLY, and it'd just be a matter
>> of getting pg_upgrade to hack the indexes' catalog state.  (If not, it's
>> probably not worth the trouble.)
>
> We already have code to do all of that, but it was removed from
> pg_upgrade in 9.5.  You can still see it in 9.4:
>
> 
> contrib/pg_upgrade/version_old_8_3.c::old_8_3_invalidate_hash_gin_indexes()
>

Thanks for the pointer.

> I would be happy to restore that code and make it work for PG 10.
>

The attached patch implements the two points suggested by Tom. I will add
this to the PG 10 open items list so that we don't forget about this
work; let me know if you think otherwise.
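
As a sketch of Tom's first bonus point, here is a toy generator for the
kind of reindex script pg_upgrade could emit for hash indexes. The file
name, header comment, and quoting style are illustrative; the real
patch's output may differ.

```python
# Toy generator for a pg_upgrade-style REINDEX script.
# Illustrative only -- not the actual pg_upgrade implementation.

def make_reindex_script(hash_indexes):
    """Build a psql script reindexing each (schema, index) pair."""
    lines = ["-- Run with: psql -f reindex_hash.sql <database>"]
    for schema, index in hash_indexes:
        # Quote identifiers so mixed-case or reserved names survive.
        lines.append('REINDEX INDEX "%s"."%s";' % (schema, index))
    return "\n".join(lines) + "\n"
```

For the second point, pg_upgrade would additionally clear each index's
indisvalid flag in pg_index so the planner ignores the index until the
script has been run.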

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


pg_upgrade_invalidate_old_hash_index_v1.patch
Description: Binary data
