Re: [HACKERS] Parallel Append implementation

2017-09-15 Thread Amit Kapila
On Thu, Sep 14, 2017 at 9:41 PM, Robert Haas  wrote:
> On Mon, Sep 11, 2017 at 9:25 AM, Amit Kapila  wrote:
>> I think the patch stores only non-partial paths in decreasing order,
>> what if partial paths having more costs follows those paths?
>
> The general picture here is that we don't want the leader to get stuck
> inside some long-running operation because then it won't be available
> to read tuples from the workers.  On the other hand, we don't want to
> just have the leader do no work because that might be slow.  And in
> most cases, the leader will be the first participant to arrive at
> the Append node, because of the worker startup time.  So the idea is
> that the workers should pick expensive things first, and the leader
> should pick cheap things first.
>

At a broader level, the idea is good, but I think it won't turn out
exactly like that, considering your paragraph below which indicates
that it is okay if the leader picks a partial path that is costly
among the other partial paths, as the leader won't be locked into it.
Granting that this is a good design for parallel append, the question
is whether we really need the worker and leader to follow separate
strategies for choosing the next path.  I think the patch will be
simpler if we can come up with a way for the worker and leader to use
the same strategy to pick the next path to process.  How about we
arrange the list of paths such that all partial paths come first and
then the non-partial paths, probably both in decreasing order of cost?
Now, both leader and worker can start from the beginning of the list.
In most cases, the leader will start at the first partial path and
will only ever need to scan a non-partial path if there is no other
partial path left.  This is not bulletproof, as it is possible that
some worker starts before the leader, in which case the leader might
scan a non-partial path before all partial paths are finished, but I
think we can avoid that as well if we are too worried about such
cases.
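
To make the proposed arrangement concrete, here is a small, self-contained
C sketch -- not the patch's actual planner code, and the SubPath struct is
purely illustrative -- of a single list ordered with partial paths first
and each group in decreasing cost order, which leader and workers alike
would then scan from the front:

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an Append subpath; the real patch works on
 * planner Path nodes, not on a struct like this. */
typedef struct SubPath
{
    const char *name;
    int         is_partial;     /* 1 = partial path, 0 = non-partial */
    double      total_cost;
} SubPath;

/* Partial paths before non-partial ones; within each group, decreasing
 * cost, so the most expensive work is picked up first. */
static int
subpath_cmp(const void *a, const void *b)
{
    const SubPath *pa = (const SubPath *) a;
    const SubPath *pb = (const SubPath *) b;

    if (pa->is_partial != pb->is_partial)
        return pb->is_partial - pa->is_partial;
    if (pa->total_cost != pb->total_cost)
        return (pa->total_cost < pb->total_cost) ? 1 : -1;
    return 0;
}

int
main(void)
{
    SubPath paths[] = {
        {"nonpartial_big", 0, 900.0}, {"partial_small", 1, 50.0},
        {"partial_big", 1, 400.0}, {"nonpartial_small", 0, 120.0},
    };
    int     npaths = (int) (sizeof(paths) / sizeof(paths[0]));
    int     i;

    qsort(paths, npaths, sizeof(SubPath), subpath_cmp);

    /* Leader and workers all start at index 0; the leader only reaches a
     * non-partial path once no partial path remains. */
    for (i = 0; i < npaths; i++)
        printf("%d: %s (partial=%d, cost=%.0f)\n",
               i, paths[i].name, paths[i].is_partial, paths[i].total_cost);
    return 0;
}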


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-09-15 Thread Bruce Momjian
On Thu, Sep 14, 2017 at 09:21:25PM -0400, Stephen Frost wrote:
> Michael, all,
> 
> * Michael Paquier (michael.paqu...@gmail.com) wrote:
> > On Fri, Sep 15, 2017 at 8:23 AM, Andreas Joseph Krogh
> >  wrote:
> > > I tested upgrading from 9.6 to 10 now, using pg_upgrade, and pg_upgrade
> > > creates the new data-dir with pg_wal "in it" (just like regular initdb),
> > > so pg_upgrade seems not to care about where the old version's pg_xlog
> > > was. You have to move (by symlinking) pg_wal to a separate location
> > > manually *after* running pg_upgrade on the master.
> > 
> > That's true, this should definitely be mentioned in the documentation.
> 
> Uh, this seems like something that should be *fixed*, not documented.
> That initdb accepts an alternative location for pg_xlog/pg_wal now
> raises that to a first-class feature, in my view, and other tools should
> recognize the case and deal with it correctly.
> 
> Of course, that having been said, there's some technical challenges
> there.  One being that we don't really want to mess with the old
> cluster's WAL during pg_upgrade.  Frustratingly, we have a way to deal
> with that case today for tablespaces, it was just never applied to WAL
> when we started allowing WAL to be stored elsewhere (formally).  That
> seems like it was a mistake to me.
> 
> Then again, the more I think about this, the more I wonder about putting
> more-or-less everything in PGDATA into per-catalog-version directories
> and making everything self-contained.  Part of the ugly bit with the
> rsync is exactly that we have to go up an extra level for the data
> directories themselves, and users can actually put them anywhere so
> there might not even *be* a common parent directory to use.

I am going to need to outline where I think we are before I can suggest
a solution.

What we did with the tablespace directory is to use the catalog version
_inside_ the tablespace directory, e.g.:

/vol1/pg_tblsp/PG_9.5_201510051
/vol1/pg_tblsp/PG_9.6_201608131

We did not do this for the WAL directory because by _default_ it is in
PGDATA, which is major-version specific.  Where this breaks is when the
initdb --waldir option is used.  We could have created a major version
subdirectory inside the directory specified by --waldir, but that would
have made the PGDATA contents and the --waldir differ, and I think it
would have been confusing.

What we have now is the problem that perhaps people are not creating
major-version-specific names for --waldir.  I am not sure how anyone is
doing an upgrade except for dumping the data, deleting the old cluster
and the old WAL directory, recreating the new cluster and new WAL
directory, and then loading the data.  Because pg_upgrade needs to have
the old and new servers running, having different external WAL
directories for each cluster is a requirement.

We have not had any complaints about this so I am confused why it is now
an issue just because of rsync.  If they created separate WAL
directories on the primary, they will need separate ones on the standby,
the symlinks will be copied, and they need to use rsync to copy stuff. 
There are no hard links in there, but using the link option should be
harmless.

> > An improvement could be done as well here for pg_upgrade: when using
> > --link, the new PGDATA created could consider as well the source
> > pg_wal and create a link to it, and then clean up its contents. I am
> > not completely sure if this would be worth doing as people are likely
> > used to the current flow though. The documentation needs to outline
> > the matter at least.
> 
> No, one of the baseline requirements of pg_upgrade is to *not* screw
> with the existing cluster.  Removing its WAL or "cleaning it up"
> definitely seems like it's violating that principle.
> 
> I tend to agree that it'd be good for the documentation to address this,
> but this is all really getting to be a bit much for a manpage to be able
> to handle, I think..

Yes, I am struggling with this too.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b

2017-09-15 Thread Robert Haas
On Thu, Sep 14, 2017 at 12:59 PM, David G. Johnston
 wrote:
> I think we are being consistent as a project by enforcing strictness of
> input in this situation so I'll toss my +0.5/+1 here as well.

All right, since all three new votes are going in the same direction,
I have committed the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Should we cacheline align PGXACT?

2017-09-15 Thread Alexander Korotkov
On Tue, Sep 5, 2017 at 2:47 PM, Daniel Gustafsson  wrote:

> > On 04 Apr 2017, at 14:58, David Steele  wrote:
> >
> > On 4/4/17 8:55 AM, Alexander Korotkov wrote:
> >> On Mon, Apr 3, 2017 at 9:58 PM, Andres Freund  wrote:
> >>I'm inclined to push this to the next CF, it seems we need a lot more
> >>benchmarking here.
> >>
> >> No objections.
> >
> > This submission has been moved to CF 2017-07.
>
> This CF has now started (well, 201709 but that’s what was meant in above),
> can we reach closure on this patch in this CF?
>

During the previous commitfest I came to doubt whether this patch is
really needed, given that the same effect could be achieved by another
(perhaps random) change of alignment.  What I can do now is retry my
benchmark on current master and check what effect this patch has now.
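
For readers following the thread, what cache-line aligning a small, hot
struct does can be illustrated with a generic, self-contained C11 sketch;
this is not PGXACT or the patch itself, and the 64-byte line size is just
an assumption:

#include <stdalign.h>
#include <stdio.h>

/* Padding each element out to a presumed 64-byte cache line keeps
 * neighbouring slots from sharing a line (avoiding false sharing), at
 * the price of a larger array. */
typedef struct
{
    alignas(64) unsigned int xid;   /* forces 64-byte alignment and size */
    unsigned char flags;
} AlignedSlot;

int
main(void)
{
    AlignedSlot slots[4];

    printf("sizeof(AlignedSlot) = %zu\n", sizeof(AlignedSlot)); /* 64 */
    printf("stride = %td\n", (char *) &slots[1] - (char *) &slots[0]);
    return 0;
}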

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] path toward faster partition pruning

2017-09-15 Thread Amit Langote
On Sat, Sep 16, 2017 at 4:04 AM, Robert Haas  wrote:
> On Fri, Sep 15, 2017 at 4:50 AM, Amit Langote
>  wrote:
>> Rebased patches attached.  Because Dilip complained earlier today about
>> clauses of the form (const op var) not causing partition-pruning, I've
>> added code to commute the clause where it is required.  Some other
>> previously mentioned limitations remain -- no handling of OR clauses, no
>> elimination of redundant clauses for given partitioning column, etc.
>>
>> A note about 0001: this patch overlaps with
>> 0003-Canonical-partition-scheme.patch from the partitionwise-join patch
>> series that Ashutosh Bapat posted yesterday [1].
>
> It doesn't merely overlap; it's obviously a derivative work,

Yes it is.  I noted upthread [1] that most of these are derived
from Ashutosh's patch on his suggestion.  I guess I should have
repeated that in this message too, sorry.

> and the
> commit message in your version should credit all the authors.

That was a mistake on my part, too.  I will be careful from here on.

Thanks,
Amit

[1] 
https://www.postgresql.org/message-id/0e829199-a43c-2a66-b966-89a0020a6cd4%40lab.ntt.co.jp




Re: [HACKERS] additional contrib test suites

2017-09-15 Thread Michael Paquier
On Sat, Sep 16, 2017 at 5:15 AM, Tom Lane  wrote:
> I wrote:
>> Peter Eisentraut  writes:
>>> So, we have one failure for chkpass on OpenBSD, because OpenBSD crypt()
>>> doesn't support the traditional two-character salt format.
>
>>> Option:
>
>>> - Use the resultmap features to make this an expected failure on OpenBSD.
>
>>> - Fix the module to work on OpenBSD.  This would involve making a
>>> platform-specific modification to use whatever advanced salt format they
>>> want.
>
>>> - Replace the entire module with something that does not depend on crypt().
>
>> Or (4) drop the module's regression test again.
>
> Noting that mandrill is showing yet a different failure, one that I think
> is inherent to chkpass:
>
>   CREATE TABLE test (i int, p chkpass);
>   INSERT INTO test VALUES (1, 'hello'), (2, 'goodbye');
> + WARNING:  type chkpass has unstable input conversion for "hello"
> + LINE 1: INSERT INTO test VALUES (1, 'hello'), (2, 'goodbye');
> + ^
> + WARNING:  type chkpass has unstable input conversion for "goodbye"
> + LINE 1: INSERT INTO test VALUES (1, 'hello'), (2, 'goodbye');
> +   ^
>
> I'm starting to think that (4) might be the best avenue.  Or we could
> consider
>
> (5) drop contrib/chkpass altogether, on the grounds that it's too badly
> designed, and too obsolete crypto-wise, to be useful or supportable.

crypt() uses the 7 lowest bits of each of the first 8 characters,
which makes for 7.2e16 possible values, so I would be fine with (5),
then (4), as the test suite is not portable.
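
For reference, the 7.2e16 figure is just 2^56, i.e. 8 password characters
at 7 significant bits each as in traditional DES-based crypt(); a trivial,
self-contained C check of that arithmetic:

#include <math.h>
#include <stdio.h>

int
main(void)
{
    /* 8 characters x 7 significant bits = a 56-bit keyspace */
    printf("%.1e possible keys\n", pow(2.0, 7 * 8));    /* prints 7.2e+16 */
    return 0;
}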
-- 
Michael




Re: [JDBC] [HACKERS] Channel binding support for SCRAM-SHA-256

2017-09-15 Thread Michael Paquier
On Sat, Sep 16, 2017 at 12:42 AM, Robert Haas  wrote:
> On Thu, Sep 14, 2017 at 12:58 PM, Peter Eisentraut
>  wrote:
>> Second thoughts, to make things simpler.  All we need for channel
>> binding is a connection flag that says "I require channel binding".  It
>> could be modeled after the sslmode parameter, e.g., cbind=disable (maybe
>> for debugging), cbind=prefer (default), cbind=require.  If you specify
>> "require", then libpq would refuse to proceed unless scram-sha2-256-plus
>> (or future similar mechanisms) was offered for authentication.
>
> +1, although I think cbind is too brief.  I'd spell it out.

I would like to point out that, per the RFC, if the client attempts an
SSL connection with SCRAM and the server supports channel binding, then
the server has to publish the SASL mechanism for channel binding, aka
SCRAM-PLUS. If the client tries to force the use of SCRAM even though
SCRAM-PLUS is advertised, this is seen as a downgrade attack by the
server, which must reject the connection. So this parameter has
meaning only if you try to connect to a PG10 server using a PG11
client (assuming that channel binding gets into PG11). If you connect
with a PG11 client to a PG11 server with SSL, the server publishes
SCRAM-PLUS and the client has to use it, which makes cbind=disable and
prefer meaningless in the long term. If the client does not use SSL,
then there is no channel binding, and cbind=require loses its value.
So cbind's fate is actually linked to sslmode.
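
To illustrate the interaction described above, here is a small,
self-contained C sketch of the client-side decision; the cbind parameter
name and the PG10/PG11 behaviour are taken from this discussion, and none
of this is actual libpq code:

#include <stdio.h>

typedef enum { CBIND_DISABLE, CBIND_PREFER, CBIND_REQUIRE } CBindMode;

/* Pick a SASL mechanism given what the server advertised.  Returns the
 * mechanism name, or NULL if the connection must be refused. */
static const char *
choose_mechanism(CBindMode mode, int using_ssl, int server_offers_plus)
{
    if (!using_ssl)
    {
        /* Without SSL there is no channel binding at all. */
        return (mode == CBIND_REQUIRE) ? NULL : "SCRAM-SHA-256";
    }
    if (server_offers_plus)
    {
        /*
         * Once the server advertises the -PLUS mechanism over SSL, the
         * client has to use it or be treated as a downgrade attack, so
         * "disable" and "prefer" make no practical difference here.
         */
        return "SCRAM-SHA-256-PLUS";
    }
    /* SSL, but an older (e.g. PG10) server without channel binding. */
    return (mode == CBIND_REQUIRE) ? NULL : "SCRAM-SHA-256";
}

int
main(void)
{
    const char *m = choose_mechanism(CBIND_REQUIRE, 1, 0);

    printf("PG10 server, cbind=require: %s\n",
           m ? m : "refuse connection");
    return 0;
}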

>> We don't even need a parameter that specifies which channel binding type
>> to use.  If libpq implements tls-unique, it should always use that.  We
>> might need a flag for testing other types, but that should not be an
>> in-the-user's-face option.
>
> I'm not so sure about this part.  Why don't we want to let users control this?

That's useful at least for testing. tls-unique should be the default,
but we also need end-point, which we would most likely want to be able
to enforce from the client, and being able to enforce the type of
channel binding used does not go against the RFC.
-- 
Michael




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-15 Thread Thomas Munro
On Sat, Sep 16, 2017 at 9:23 AM, Robert Haas  wrote:
> On the overall patch set:
>
> - I am curious to know how this has been tested.  How much of the new
> code is covered by the tests in 0007-Partition-wise-join-tests.patch?
> How much does coverage improve with
> 0008-Extra-testcases-for-partition-wise-join-NOT-FOR-COMM.patch?  What
> code, if any, is not covered by either of those test suites?  Could we
> do meaningful testing of this with something like Andreas
> Seltenreich's sqlsmith?

FWIW I'm working on an answer to both of those questions, but keep
getting distracted by other things catching on fire...

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns

2017-09-15 Thread Andres Freund
Hi Thom,

On 2017-09-15 22:05:35 +0100, Thom Brown wrote:
> Okay, I've retested it using a prepared statement executed 100,000
> times (which selects a single row from a table with 101 columns, each
> column is 42-43 characters long, and 2 rows in the table), and I get
> the following:
> 
> master:
> 
> real    1m23.485s
> user    1m2.800s
> sys     0m1.200s
> 
> 
> patched:
> 
> real    1m22.530s
> user    1m2.860s
> sys     0m1.140s
> 
> 
> Not sure why I'm not seeing the gain.

I suspect part of that is that you're testing the patch in isolation,
whereas I tested it as part of multiple speedup patches. There are
some bigger bottlenecks than this one. I've attached my whole stack.

But even that being said, here's the result for these patches in
isolation on my machine:

setup:
psql -p 5440 -f ~/tmp/create_many_cols.sql

pgbench -M prepared -f ~/tmp/pgbench-many-cols.sql -n -T 10 -P 1
master (best of three):
tps = 13347.023301 (excluding connections establishing)
patched (best of three):
tps = 14309.690741 (excluding connections establishing)

Which is a bigger win than what you're observing. I've also attached the
benchmark scripts I used.  Could you detail how your benchmark works a
bit more? Any chance you looped in plpgsql or such?


Just for fun/reference, these are the results with all the patches
applied:
tps = 19069.115553 (excluding connections establishing)
and with just this patch reverted:
tps = 17342.006825 (excluding connections establishing)

Regards,

Andres
>From a2325a2649da403dc562b8e93df972839823d924 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 13 Sep 2017 19:25:34 -0700
Subject: [PATCH 1/8] Speedup pgstat_report_activity by moving mb-aware
 truncation to read side.

Previously multi-byte aware truncation was done on every
pgstat_report_activity() call - proving to be a bottleneck for
workloads with long query strings that execute quickly.

Instead move the truncation to the read side, which is commonly
executed far less frequently. That's possible because all server
encodings allow to determine the length of a multi-byte string from
the first byte.

Author: Andres Freund
Discussion: https://postgr.es/m/20170912071948.pa7igbpkkkvie...@alap3.anarazel.de
---
 src/backend/postmaster/pgstat.c | 63 -
 src/backend/utils/adt/pgstatfuncs.c | 17 +++---
 src/include/pgstat.h| 12 +--
 3 files changed, 72 insertions(+), 20 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index accf302cf7..1ffdac5448 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -2701,7 +2701,7 @@ CreateSharedBackendStatus(void)
 		buffer = BackendActivityBuffer;
 		for (i = 0; i < NumBackendStatSlots; i++)
 		{
-			BackendStatusArray[i].st_activity = buffer;
+			BackendStatusArray[i].st_activity_raw = buffer;
 			buffer += pgstat_track_activity_query_size;
 		}
 	}
@@ -2922,11 +2922,11 @@ pgstat_bestart(void)
 #endif
 	beentry->st_state = STATE_UNDEFINED;
 	beentry->st_appname[0] = '\0';
-	beentry->st_activity[0] = '\0';
+	beentry->st_activity_raw[0] = '\0';
 	/* Also make sure the last byte in each string area is always 0 */
 	beentry->st_clienthostname[NAMEDATALEN - 1] = '\0';
 	beentry->st_appname[NAMEDATALEN - 1] = '\0';
-	beentry->st_activity[pgstat_track_activity_query_size - 1] = '\0';
+	beentry->st_activity_raw[pgstat_track_activity_query_size - 1] = '\0';
 	beentry->st_progress_command = PROGRESS_COMMAND_INVALID;
 	beentry->st_progress_command_target = InvalidOid;
 
@@ -3017,7 +3017,7 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 			pgstat_increment_changecount_before(beentry);
 			beentry->st_state = STATE_DISABLED;
 			beentry->st_state_start_timestamp = 0;
-			beentry->st_activity[0] = '\0';
+			beentry->st_activity_raw[0] = '\0';
 			beentry->st_activity_start_timestamp = 0;
 			/* st_xact_start_timestamp and wait_event_info are also disabled */
 			beentry->st_xact_start_timestamp = 0;
@@ -3034,8 +3034,12 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 	start_timestamp = GetCurrentStatementStartTimestamp();
 	if (cmd_str != NULL)
 	{
-		len = pg_mbcliplen(cmd_str, strlen(cmd_str),
-		   pgstat_track_activity_query_size - 1);
+		/*
+		 * Compute length of to-be-stored string unaware of multi-byte
+		 * characters. For speed reasons that'll get corrected on read, rather
+		 * than computed every write.
+		 */
+		len = Min(strlen(cmd_str), pgstat_track_activity_query_size - 1);
 	}
 	current_timestamp = GetCurrentTimestamp();
 
@@ -3049,8 +3053,8 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 
 	if (cmd_str != NULL)
 	{
-		memcpy((char *) beentry->st_activity, cmd_str, len);
-		beentry->st_activity[len] = '\0';
+		memcpy((char *) beentry->st_activity_raw, cmd_str, len);
+		beentry->st_activity_raw[len] = '\0';
 		beentry->st_activity_start_timestamp = start_timestamp;
 	}
 
@@ 

Re: [HACKERS] pgjdbc logical replication client throwing exception

2017-09-15 Thread Dipesh Dangol
Hi Vladimir,
Yes, initially I was trying withStatusInterval(20, TimeUnit.SECONDS);
that didn't work, so only then did I switch to .withStatusInterval(20,
TimeUnit.MILLISECONDS), but it is not working either. I am not aware of
the kind of test cases you are pointing to. Could you please send me a
link for that?

For generating the load, I am using BenchmarkSQL, which generates
around 9000 transactions per second. I am trying to run the stream API
at the same time BenchmarkSQL is generating load. If I don't run
BenchmarkSQL it works fine; I mean, when there are only a few
transactions to replicate at a time, it works fine. But when I run it
with BenchmarkSQL and try to add some logic like some conditions, then
it breaks down in between, most of the time within a few seconds.

Hi Andres,
I haven't checked the server log yet. Right now I don't have access to
my working environment; I will be able to check it only on Monday. If I
find anything suspicious in the log, I will let you know.

Thank you guys.

On Fri, Sep 15, 2017 at 10:05 PM, Andres Freund  wrote:

> On 2017-09-15 20:00:34 +, Vladimir Sitnikov wrote:
> > ++pgjdbc dev list.
> >
> > >I am facing unusual connection breakdown problem. Here is the simple code
> > that I am using to read WAL file:
> >
> > Does it always fail?
> > Can you create a test case? For instance, if you file a pull request with
> > the test, it will get automatically tested across various PG versions, so
> > it would be easier to reason about.
> >
> > Have you tried "withStatusInterval(20, TimeUnit.SECONDS)" instead of 20
> > millis? I don't think it matters much, however 20ms seems to be overkill.
>
> Also, have you checked the server log?
>
> - Andres
>


Re: [HACKERS] More efficient truncation of pg_stat_activity query strings

2017-09-15 Thread Robert Haas
On Fri, Sep 15, 2017 at 4:49 PM, Andres Freund  wrote:
> On 2017-09-15 16:45:47 -0400, Tom Lane wrote:
>> Andres Freund  writes:
>> > Version correcting these is attached. Thanks!
>>
>> I'd vote for undoing the s/st_activity/st_activity_raw/g business.
>> That may have been useful while writing the patch, to ensure you
>> found all the references; but it's just creating a lot of unnecessary
>> delta in the final code, with the attendant hazards for back-patches.
>
> I was wondering about that too (see [1]). My only concern is that some
> extensions out there might be accessing the string expecting it to be
> properly truncated. The rename at least forces them to look for the new
> name...  I'm slightly in favor of keeping the rename, it doesn't seem
> likely to cause a lot of backpatch pain.

I tend to agree with you, but it's not a huge deal either way.  Even
if somebody fails to update third-party code that touches this, a lot
of times it'll work anyway.  But that very fact is, of course, why it
would be slightly better to break it explicitly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-15 Thread Robert Haas
On Fri, Sep 15, 2017 at 6:11 AM, Ashutosh Bapat
 wrote:
> Thanks a lot Robert.
>
> Here are rebased patches.

I didn't get quite as much time to look at these today as I would have
liked, but here's what I've got so far.

Comments on 0001:

- In the RelOptInfo, part_oids is defined in a completely different
part of the structure than nparts, but you can't use it without nparts
because you don't know how long it is.  I suggest moving the
definition to just after nparts.

- On the other hand, maybe we should just remove it completely.  I
don't see it in any of the subsequent patches.  If it's used by the
advanced matching code, let's leave it out of 0001 for now and add it
back after the basic feature is committed.

- Similarly, partsupfunc isn't used by the later patches either.  It
seems it could also be removed, at least for now.

- The comment for partexprs isn't very clear about how the lists
inside the array work.  My understanding is that the lists have as
many members as the partition key has columns/expressions.

- I'm not entirely sure whether maintaining partexprs and
nullable_partexprs is the right design.  If I understand correctly,
whether or not a partexpr is nullable is really a per-RTI property,
not a per-expression property.  You could consider something like
"Relids nullable_rels".

Comments on 0002:

- The relationship between deciding to set partition scheme and
related details and the configured value of enable_partition_wise_join
needs some consideration.  If the only use of the partition scheme is
partition-wise join, there's no point in setting it even for a baserel
unless enable_partition_wise_join is set -- but if there are other
important uses for that data, such as Amit's partition pruning work,
then we might want to always set it.  And similarly for a join: if the
details are only needed in the partition-wise join case, then we only
need to set them in that case, but if there are other uses, then it's
different.  If it turns out that setting these details for a baserel
is useful in other cases but that it's only for a joinrel in the
partition-wise join case, then the patch gets it exactly right.  But
is that correct?  I'm not sure.

- The naming of enable_partition_wise_join might also need some
thought.  What happens when we also have partition-wise aggregate?
What about the proposal to strength-reduce MergeAppend to Append --
would that use this infrastructure?  I wonder if we ought to call this
enable_partition_wise or enable_partition_wise_planning to make it a
bit more general.  Then, too, I've never really liked having
partition_wise in the GUC name because it might make someone think
that it makes you partitions have a lot of wisdom.  Removing the
underscore might help: partitionwise.  Or maybe there is some whole
different name that would be better.  If anyone wants to bikeshed,
now's the time.

- It seems to me that build_joinrel_partition_info() could be
simplified a bit.  One thing is that list_copy() is perfectly capable
of handling a NIL input, so there's no need to test for that before
calling it.

Comments on 0003:

- Instead of reorganizing add_paths_to_append_rel as you did, could
you just add an RTE_JOIN case to the switch?  Not sure if there's some
problem with that idea, but it seems like it might come out nicer.

On the overall patch set:

- I am curious to know how this has been tested.  How much of the new
code is covered by the tests in 0007-Partition-wise-join-tests.patch?
How much does coverage improve with
0008-Extra-testcases-for-partition-wise-join-NOT-FOR-COMM.patch?  What
code, if any, is not covered by either of those test suites?  Could we
do meaningful testing of this with something like Andreas
Seltenreich's sqlsmith?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns

2017-09-15 Thread Thom Brown
On 15 September 2017 at 19:23, Andres Freund  wrote:
> Hi Thom,
>
> Thanks for taking a whack at this!
>
> On 2017-09-15 12:16:22 +0100, Thom Brown wrote:
>> I've run a fairly basic test with a table with 101 columns, selecting
>> a single row from the table and I get the following results:
>>
>>
>> Columns with 1-character names:
>>
>> master (80 jobs, 80 connections, 60 seconds):
>
> FWIW, I don't think it's useful to test this with a lot of concurrency -
> at that point you're likely saturating the machine with context switches
> etc. unless you have a lot of cores. As this isn't related to
> concurrency I'd rather just check a single connection.
>
>
>> transaction type: /tmp/test.sql
>> scaling factor: 1
>> query mode: simple
>
> I think you'd need to use prepared statements / -M prepared to see
> benefits - when parsing statements for every execution the bottleneck is
> elsewhere (hello O(#available_columns * #selected_columns) in
> colNameToVar()).

Okay, I've retested it using a prepared statement executed 100,000
times (which selects a single row from a table with 101 columns, each
column is 42-43 characters long, and 2 rows in the table), and I get
the following:

master:

real    1m23.485s
user    1m2.800s
sys     0m1.200s


patched:

real    1m22.530s
user    1m2.860s
sys     0m1.140s


Not sure why I'm not seeing the gain.

Thom




Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS

2017-09-15 Thread Nico Williams
On Fri, Sep 15, 2017 at 04:07:33PM -0400, Tom Lane wrote:
> Nico Williams  writes:
> > On Fri, Sep 15, 2017 at 02:19:29PM -0500, Nico Williams wrote:
> >> On Fri, Sep 15, 2017 at 11:26:08AM -0700, Andres Freund wrote:
> >>> I think you should also explain why that's a desirable set of
> >>> semantics.
> 
> > Now, why is this desirable: atomicity.
> 
> >The client/user can no longer add statements to the transaction,
> >therefore now (commit-time) is a good time to run a procedure that
> >can *atomically* examine the totatility of the changes made by the
> >user.
> 
> I do not really understand this claim.  The argument for a commit trigger
> seems to be exactly "I want to be the last thing that happens in the
> transaction".  But as soon as you have two of them, one of them is not

The user defining the triggers is a very different thing from the user
sending the various statements up to and including the COMMIT.  They
need not be the same.

The DBA/developers/ops own the triggers.  The client user may not have
any trigger creation privilege.

Being in control of the [pre-]commit triggers, I can control the order
in which they run (by-name, lexical order).

The important thing is that all of the [pre-]commit triggers will run
after the last statement in the TX sent by the client.

Surely there's nothing strange about this -- it's already the case for
[DEFERRED] CONSTRAINT TRIGGERs!

> going to be the last thing.  Maybe you could address that by requiring
> the triggers to be read-only, but then it doesn't seem like a very useful

No, you think something is a problem that isn't.  And I need these
triggers to be able to write.

> feature.  If there can be only one, maybe it's a usable feature or maybe
> not, but I'm inclined to think that CREATE TRIGGER is a pretty poor API

Multiple [pre-]commit triggers make as much (or little) sense as
multiple triggers on any table.

> for such a definition.  Triggers are generally expected to be objects
> you can create any number of.

That's what I expect of commit triggers too: that I could have as many
as I want.

> Another question is exactly how you're going to "examine the totality of
> the transaction's changes"; a trigger, per se, isn't going to know *any*
> of what the transaction did let alone all of it.  We have some features
> like transition tables, but they're not accessible after end of statement;

Currently I use an audit facility I wrote (based on any number of audit
facilities I've seen) which automatically creates for-each-row triggers
for all tables and which then records (txid, old_row, new_row) for all
updates.  So a pre-commit trigger can simply examine all such audit
rows for txid_current() in each table it cares about.

> and they're pretty useless in the face of DDL too.  It's going to be hard

I use event triggers to discover schema changes.

(I'm not concerned about superusers -- they can disable my
implementation.  I _am_ concerned about unprivileged users.)

> to produce a very credible use-case without a lot of work to expose that
> information somehow.

One dead-trivial use-case is to

  INSERT INTO tx_log.tx_log("txid","who","when","how_long")
  SELECT txid_current(), current_user, current_timestamp,
 clock_timestamp() - current_timestamp;

And yes, I could just as well use a DEFERRED CONSTRAINT TRIGGER on every
table that does this INSERT with ON CONFLICT ("txid") DO NOTHING.
Except that an unprivileged user could SET CONSTRAINTS ALL IMMEDIATE
(ugh), and that this would be slower (once per-row) than doing it just
once at commit time.

> (Some closely related work is being done for logical decoding, btw.
> I wonder which use-cases for this might be satisfied by having a logical
> decoding plug-in watching the WAL output.)

My use case involves generating a stream of incremental updates to
hand-materialized views (because the materialized views in PG do not
expose deltas).  That *almost* fits the logical WAL decoder concept, but
fails on account of needing to be incremental updates not of _raw_ data,
but of views on that data.

> > Commit triggers also allow one to record transaction boundaries, and
> > NOTIFY listeners less frequently than if one did a NOTIFY in normal
> > for-each-row/statement triggers.
> 
> Um, NOTIFY already collapses duplicate notifications per-transaction.

Oh, that I didn't know.  (But it changes nothing for me.)

Nico
-- 




Re: [HACKERS] More efficient truncation of pg_stat_activity query strings

2017-09-15 Thread Andres Freund
On 2017-09-15 16:45:47 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Version correcting these is attached. Thanks!
> 
> I'd vote for undoing the s/st_activity/st_activity_raw/g business.
> That may have been useful while writing the patch, to ensure you
> found all the references; but it's just creating a lot of unnecessary
> delta in the final code, with the attendant hazards for back-patches.

I was wondering about that too (see [1]). My only concern is that some
extensions out there might be accessing the string expecting it to be
properly truncated. The rename at least forces them to look for the new
name...  I'm slightly in favor of keeping the rename, it doesn't seem
likely to cause a lot of backpatch pain.
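
As a side note for any extension reading st_activity_raw directly, the
read-side fix-up can be illustrated with a simplified, UTF-8-only,
self-contained sketch; PostgreSQL itself handles every server encoding
through its own multi-byte routines, so this is not the actual code:

#include <stdio.h>
#include <string.h>

/* Length of a UTF-8 character from its lead byte (all server encodings
 * share this property of being length-determinable from the first byte). */
static int
utf8_char_len(unsigned char lead)
{
    if (lead < 0x80)           return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;                   /* invalid lead byte: treat as one byte */
}

/* Cut the string before any incomplete trailing multi-byte character,
 * e.g. one produced by a byte-wise truncation on the write side. */
static void
clip_multibyte(char *s)
{
    size_t len = strlen(s);
    size_t i = 0;

    while (i < len)
    {
        size_t charlen = (size_t) utf8_char_len((unsigned char) s[i]);

        if (i + charlen > len)
        {
            s[i] = '\0';
            return;
        }
        i += charlen;
    }
}

int
main(void)
{
    /* "hé" truncated after the first byte of the two-byte 'é' */
    char raw[] = {'h', (char) 0xC3, '\0'};

    clip_multibyte(raw);
    printf("clipped length: %zu\n", strlen(raw));   /* prints 1 */
    return 0;
}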

Regards,

Andres Freund


[1] 
http://archives.postgresql.org/message-id/20170914060329.j7lt7oyyzquxiut6%40alap3.anarazel.de




Re: [HACKERS] More efficient truncation of pg_stat_activity query strings

2017-09-15 Thread Tom Lane
Andres Freund  writes:
> Version correcting these is attached. Thanks!

I'd vote for undoing the s/st_activity/st_activity_raw/g business.
That may have been useful while writing the patch, to ensure you
found all the references; but it's just creating a lot of unnecessary
delta in the final code, with the attendant hazards for back-patches.

regards, tom lane




Re: [HACKERS] The case for removing replacement selection sort

2017-09-15 Thread Peter Geoghegan
Tomas' e-mail from earlier today, which I've already replied to
directly, seems to have been lost to the mailing list. This must be
due to having a 1MB attachment (results spreadsheet), which seems a
bit aggressive as a reason to withhold it IMV.

Here is a link to his results, converted to a Google docs spreadsheet:

https://docs.google.com/spreadsheets/d/1SL_IIkPdiJUZ9BHUgROcYkEKWZUch4wRax2SyOmMuT8/edit?usp=sharing

Here is his e-mail, in full:

On Fri, Sep 15, 2017 at 6:34 AM, Tomas Vondra
 wrote:
> Attached is a spreadsheet with updated data, using the more accurate
> timing, and comparing master with the default replacement_sort_tuples
> value (150k) and with it increased per Peter's instructions (to 1B).
>
> There are 6 sheets - one for each combination of a dataset size (100k
> and 1M) and machine where I ran the tests (with different CPU models).
>
> There are 5 columns - first three are medians for each of the tested
> PostgreSQL configurations:
>
> - master (default)
> - master (1billion)
> - no-replacement-sort (with patch applied)
>
> The numbers are medians from 5 runs (perhaps minimums would be better in
> this case).
>
> The last two columns are comparisons / speed-ups
>
> - master(1B) / master(default)
> - no-replacement-sort / master(default)
>
> Green numbers (below 1.0) mean speed-up, red (above 1.0) slow-down.
>
> Firstly, the master with r_s_t=1B setting performs either the same or
> worse compared to the default value, in almost every test. On the 100k
> data set the results are a bit noisy (particularly on the oldest CPU),
> but on the 1M data set the difference is quite clear. So I've only
> compared results for master(default) and patched.
>
> Quick summary, for each CPU model (which clearly affects the behavior).
>
>
> e5-2620-v4
> --
> - probably the CPU we should be looking at, as it's the current model
> - in some cases this gives us 3-5x speedup with low work_mem settings
> - consider for example the very last line, which shows that
>
>   SELECT DISTINCT a FROM text_test_padding ORDER BY a
>
> completed in ~5531ms on work_mem=8MB and 1067ms on 32MB, but with the
> patch it completes in 1784ms (1MB), 1211ms (4MB) and 1104 (8MB).
>
> - Of course, this is for already-sorted data, and for other data sets
> the improvement is more modest. It's difficult to summarize this into a
> single number, but there are plenty of improvements in 20-30% range.
>
> - Some cases are a bit slower, of course, but overall I'd say the chart
> is much more green than red. Also the slow-downs are much smaller,
> compared to speed-ups (generally within 1-5%).
>
> i5-2500k
> 
> - same story, but this CPU gives more stable results (less noise)
>
> e5-5450
> ---
> - rather noisy CPU, particularly on the small (100k) dataset
> - the larger data set mostly matches the other CPUs, although the
> regressions are somewhat more significant
> - I wouldn't really worry about this too much, it's clearly an obsolete
> CPU and not something performance-conscious person would use nowadays
> (the other CPUs are often 2-3x faster).
>
> If needed, full data is available here (each machine is pushing data to
> a separate git repository):
>
> * https://bitbucket.org/tvondra/sort-bench-i5-2500k
> * https://bitbucket.org/tvondra/sort-bench-e5-2620-v4
> * https://bitbucket.org/tvondra/sort-bench-e5-5450
>
> At this point the 10M row tests are running, but I don't expect anything
> particularly surprising from those results. That is, it's not something
> that should block getting this patch committed, if the agreement is to
> commit otherwise.
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Peter Geoghegan




Re: [HACKERS] More efficient truncation of pg_stat_activity query strings

2017-09-15 Thread Andres Freund
On 2017-09-15 08:25:09 -0400, Robert Haas wrote:
> On Thu, Sep 14, 2017 at 2:03 AM, Andres Freund  wrote:
> > Here's a patch that implements that idea.
> 
> From the department of cosmetic nitpicking:
> 
> + * All supported server-encodings allow to determine the length of a
> 
> make it possible to determine
> 
> + * multi-byte character from it's first byte (not the case for client
> 
> its
> 
> this is not the case
> 
> + * encodings, see GB18030). As st_activity always is stored using server
> 
> is always stored using a server
> 
> + * pgstat_clip_activity() to trunctae correctly.

Version correcting these is attached. Thanks!

Greetings,

Andres Freund
>From a2325a2649da403dc562b8e93df972839823d924 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 13 Sep 2017 19:25:34 -0700
Subject: [PATCH] Speedup pgstat_report_activity by moving mb-aware truncation
 to read side.

Previously multi-byte aware truncation was done on every
pgstat_report_activity() call - proving to be a bottleneck for
workloads with long query strings that execute quickly.

Instead move the truncation to the read side, which is commonly
executed far less frequently. That's possible because all server
encodings allow to determine the length of a multi-byte string from
the first byte.

Author: Andres Freund
Discussion: https://postgr.es/m/20170912071948.pa7igbpkkkvie...@alap3.anarazel.de
---
 src/backend/postmaster/pgstat.c | 63 -
 src/backend/utils/adt/pgstatfuncs.c | 17 +++---
 src/include/pgstat.h| 12 +--
 3 files changed, 72 insertions(+), 20 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index accf302cf7..1ffdac5448 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -2701,7 +2701,7 @@ CreateSharedBackendStatus(void)
 		buffer = BackendActivityBuffer;
 		for (i = 0; i < NumBackendStatSlots; i++)
 		{
-			BackendStatusArray[i].st_activity = buffer;
+			BackendStatusArray[i].st_activity_raw = buffer;
 			buffer += pgstat_track_activity_query_size;
 		}
 	}
@@ -2922,11 +2922,11 @@ pgstat_bestart(void)
 #endif
 	beentry->st_state = STATE_UNDEFINED;
 	beentry->st_appname[0] = '\0';
-	beentry->st_activity[0] = '\0';
+	beentry->st_activity_raw[0] = '\0';
 	/* Also make sure the last byte in each string area is always 0 */
 	beentry->st_clienthostname[NAMEDATALEN - 1] = '\0';
 	beentry->st_appname[NAMEDATALEN - 1] = '\0';
-	beentry->st_activity[pgstat_track_activity_query_size - 1] = '\0';
+	beentry->st_activity_raw[pgstat_track_activity_query_size - 1] = '\0';
 	beentry->st_progress_command = PROGRESS_COMMAND_INVALID;
 	beentry->st_progress_command_target = InvalidOid;
 
@@ -3017,7 +3017,7 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 			pgstat_increment_changecount_before(beentry);
 			beentry->st_state = STATE_DISABLED;
 			beentry->st_state_start_timestamp = 0;
-			beentry->st_activity[0] = '\0';
+			beentry->st_activity_raw[0] = '\0';
 			beentry->st_activity_start_timestamp = 0;
 			/* st_xact_start_timestamp and wait_event_info are also disabled */
 			beentry->st_xact_start_timestamp = 0;
@@ -3034,8 +3034,12 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 	start_timestamp = GetCurrentStatementStartTimestamp();
 	if (cmd_str != NULL)
 	{
-		len = pg_mbcliplen(cmd_str, strlen(cmd_str),
-		   pgstat_track_activity_query_size - 1);
+		/*
+		 * Compute length of to-be-stored string unaware of multi-byte
+		 * characters. For speed reasons that'll get corrected on read, rather
+		 * than computed every write.
+		 */
+		len = Min(strlen(cmd_str), pgstat_track_activity_query_size - 1);
 	}
 	current_timestamp = GetCurrentTimestamp();
 
@@ -3049,8 +3053,8 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 
 	if (cmd_str != NULL)
 	{
-		memcpy((char *) beentry->st_activity, cmd_str, len);
-		beentry->st_activity[len] = '\0';
+		memcpy((char *) beentry->st_activity_raw, cmd_str, len);
+		beentry->st_activity_raw[len] = '\0';
 		beentry->st_activity_start_timestamp = start_timestamp;
 	}
 
@@ -3278,8 +3282,8 @@ pgstat_read_current_status(void)
  */
 strcpy(localappname, (char *) beentry->st_appname);
 localentry->backendStatus.st_appname = localappname;
-strcpy(localactivity, (char *) beentry->st_activity);
-localentry->backendStatus.st_activity = localactivity;
+strcpy(localactivity, (char *) beentry->st_activity_raw);
+localentry->backendStatus.st_activity_raw = localactivity;
 localentry->backendStatus.st_ssl = beentry->st_ssl;
 #ifdef USE_SSL
 if (beentry->st_ssl)
@@ -3945,10 +3949,13 @@ pgstat_get_backend_current_activity(int pid, bool checkUser)
 			/* Now it is safe to use the non-volatile pointer */
 			if (checkUser && !superuser() && beentry->st_userid != GetUserId())
 return "";
-			else if (*(beentry->st_activity) == '\0')
+			

Re: [HACKERS] postgres_fdw: evaluate placeholdervars on remote server

2017-09-15 Thread Daniel Gustafsson
> On 15 Sep 2017, at 17:19, Robert Haas  wrote:
> 
> On Fri, Sep 15, 2017 at 10:15 AM, Daniel Gustafsson  wrote:
>> Have you had a chance to look at this such that we can expect a rebased
>> version of this patch during the commitfest?
> 
> Frankly, I think things where there was a ping multiple weeks before
> the CommitFest started and no rebase before it started should be
> regarded as untimely submissions, and summarily marked Returned with
> Feedback.  The CommitFest is supposed to be a time to get things that
> are ready before it starts committed before it ends.  Some amount of
> back-and-forth during the CF is of course to be expected, but we don't
> even really have enough bandwidth to deal with the patches that are
> being timely updated, never mind the ones that aren’t.

I don’t necessarily disagree with this, and especially not the part about
bandwidth which is absolutely correct.

What has happened a lot, however, is that these patches have been moved to the
next CF, and possibly the next from there.  In that scenario, if we can get the
patches rebased now they won't be in a worse state when pushed to the next CF
compared to if they bitrot further.  That being said, perhaps we should move
closer to a model like what you describe, but that's for another thread to
discuss rather than threadjacking this one more IMO.

cheers ./daniel



Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-09-15 Thread Bruce Momjian
On Fri, Sep 15, 2017 at 01:23:45AM +0200, Andreas Joseph Krogh wrote:
> I tested upgrading from 9.6 to 10 now, using pg_upgrade, and pg_upgrade 
> creates
> the new data-dir with pg_wal "in it" (just like regular initdb), so pg_upgrade
> seems not to care about where the old version's pg_xlog was. You have to move
> (by symlinking) pg_wal to a separate location manually *after* running
> pg_upgrade on the master. No special handling is needed when rsync'ing it over
> to the standby, so it doesn't need any --hard-links or --size-only, correct?

What rsync is going to do is to reproduce the directory structure of the
old cluster _in_ the standby's old cluster, and the structure of the new
cluster on the standby's new cluster.  If you had the WAL directory
relocated in the new cluster, the relocation symbolic link will be
reproduced by rsync, but the directory it points _to_ will not be
copied, so it will point to nothing.

Of course, if both old and new clusters shared the same WAL directory,
which I think is impossible, things would get very confusing quickly.  I
will reply to this in a later email.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] additional contrib test suites

2017-09-15 Thread Tom Lane
I wrote:
> Peter Eisentraut  writes:
>> So, we have one failure for chkpass on OpenBSD, because OpenBSD crypt()
>> doesn't support the traditional two-character salt format.

>> Option:

>> - Use the resultmap features to make this an expected failure on OpenBSD.

>> - Fix the module to work on OpenBSD.  This would involve making a
>> platform-specific modification to use whatever advanced salt format they
>> want.

>> - Replace the entire module with something that does not depend on crypt().

> Or (4) drop the module's regression test again.

Noting that mandrill is showing yet a different failure, one that I think
is inherent to chkpass:

  CREATE TABLE test (i int, p chkpass);
  INSERT INTO test VALUES (1, 'hello'), (2, 'goodbye');
+ WARNING:  type chkpass has unstable input conversion for "hello"
+ LINE 1: INSERT INTO test VALUES (1, 'hello'), (2, 'goodbye');
+ ^
+ WARNING:  type chkpass has unstable input conversion for "goodbye"
+ LINE 1: INSERT INTO test VALUES (1, 'hello'), (2, 'goodbye');
+   ^

I'm starting to think that (4) might be the best avenue.  Or we could
consider

(5) drop contrib/chkpass altogether, on the grounds that it's too badly
designed, and too obsolete crypto-wise, to be useful or supportable.

regards, tom lane




Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS

2017-09-15 Thread Tom Lane
Nico Williams  writes:
> On Fri, Sep 15, 2017 at 02:19:29PM -0500, Nico Williams wrote:
>> On Fri, Sep 15, 2017 at 11:26:08AM -0700, Andres Freund wrote:
>>> I think you should also explain why that's a desirable set of
>>> semantics.

> Now, why is this desirable: atomicity.

>The client/user can no longer add statements to the transaction,
>therefore now (commit-time) is a good time to run a procedure that
>can *atomically* examine the totality of the changes made by the
>user.

I do not really understand this claim.  The argument for a commit trigger
seems to be exactly "I want to be the last thing that happens in the
transaction".  But as soon as you have two of them, one of them is not
going to be the last thing.  Maybe you could address that by requiring
the triggers to be read-only, but then it doesn't seem like a very useful
feature.  If there can be only one, maybe it's a usable feature or maybe
not, but I'm inclined to think that CREATE TRIGGER is a pretty poor API
for such a definition.  Triggers are generally expected to be objects
you can create any number of.

Another question is exactly how you're going to "examine the totality of
the transaction's changes"; a trigger, per se, isn't going to know *any*
of what the transaction did let alone all of it.  We have some features
like transition tables, but they're not accessible after end of statement;
and they're pretty useless in the face of DDL too.  It's going to be hard
to produce a very credible use-case without a lot of work to expose that
information somehow.

(Some closely related work is being done for logical decoding, btw.
I wonder which use-cases for this might be satisfied by having a logical
decoding plug-in watching the WAL output.)

> Commit triggers also allow one to record transaction boundaries, and
> NOTIFY listeners less frequently than if one did a NOTIFY in normal
> for-each-row/statement triggers.

Um, NOTIFY already collapses duplicate notifications per-transaction.

regards, tom lane




Re: [HACKERS] pgjdbc logical replication client throwing exception

2017-09-15 Thread Andres Freund
On 2017-09-15 20:00:34 +, Vladimir Sitnikov wrote:
> ++pgjdbc dev list.
> 
> >I am facing unusual connection breakdown problem. Here is the simple code
> that I am using to read WAL file:
> 
> Does it always fail?
> Can you create a test case? For instance, if you file a pull request with
> the test, it will get automatically tested across various PG versions, so
> it would be easier to reason about.
> 
> Have you tried "withStatusInterval(20, TimeUnit.SECONDS)" instead of 20
> millis? I don't think it matters much, however 20ms seems to be overkill.

Also, have you checked the server log?

- Andres




Re: [HACKERS] pgjdbc logical replication client throwing exception

2017-09-15 Thread Vladimir Sitnikov
++pgjdbc dev list.

>I am facing unusual connection breakdown problem. Here is the simple code
that I am using to read WAL file:

Does it always fail?
Can you create a test case? For instance, if you file a pull request with
the test, it will get automatically tested across various PG versions, so
it would be easier to reason about.

Have you tried "withStatusInterval(20, TimeUnit.SECONDS)" instead of 20
millis? I don't think it matters much, however 20ms seems to be overkill.

Vladimir

Fri, 15 Sep 2017 at 19:57, Dipesh Dangol :

> hi,
>
> I am trying to implement logical replication stream API of postgresql.
> I am facing unusual connection breakdown problem. Here is the simple code
> that I am using to read WAL file:
>
> String url = "jdbc:postgresql://pcnode2:5432/benchmarksql";
> Properties props = new Properties();
> PGProperty.USER.set(props, "benchmarksql");
> PGProperty.PASSWORD.set(props, "benchmarksql");
> PGProperty.ASSUME_MIN_SERVER_VERSION.set(props, "9.4");
> PGProperty.REPLICATION.set(props, "database");
> PGProperty.PREFER_QUERY_MODE.set(props, "simple");
>
> Connection conn = DriverManager.getConnection(url, props);
> PGConnection replConnection = conn.unwrap(PGConnection.class);
>
> PGReplicationStream stream = replConnection.getReplicationAPI()
> .replicationStream().logical()
> .withSlotName("replication_slot3")
> .withSlotOption("include-xids", true)
> .withSlotOption("include-timestamp", "on")
> .withSlotOption("skip-empty-xacts", true)
> .withStatusInterval(20, TimeUnit.MILLISECONDS).start();
> while (true) {
>
> ByteBuffer msg = stream.read();
>
> if (msg == null) {
> TimeUnit.MILLISECONDS.sleep(10L);
> continue;
> }
>
> int offset = msg.arrayOffset();
> byte[] source = msg.array();
> int length = source.length - offset;
> String data = new String(source, offset, length);
> System.out.println(data);
>
> stream.setAppliedLSN(stream.getLastReceiveLSN());
> stream.setFlushedLSN(stream.getLastReceiveLSN());
>
> }
>
> Even the slightest modification in the code like commenting
> System.out.println(data);
> which is just printing the data in the console, causes connection
> breakdown problem with
> following error msg
>
> org.postgresql.util.PSQLException: Database connection failed when reading
> from copy
> at
> org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1028)
> at
> org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
> at
> org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
> at
> org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
> at
> org.postgresql.core.v3.replication.V3PGReplicationStream.read(V3PGReplicationStream.java:70)
> at Server.main(Server.java:52)
> Caused by: java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at
> org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140)
> at
> org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:109)
> at
> org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:191)
> at org.postgresql.core.PGStream.receive(PGStream.java:495)
> at org.postgresql.core.PGStream.receive(PGStream.java:479)
> at
> org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1161)
> at
> org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1026)
> ... 5 more
>
> I am trying to implement some logic like filtering out the unrelated table
> after reading log.
> But due to this unusual behavior I couldn't implement properly.
> Can somebody give me some hint how to solve this problem.
>
> Thank you.
>


Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS

2017-09-15 Thread Nico Williams
On Fri, Sep 15, 2017 at 12:25:08PM -0700, Andres Freund wrote:
> On 2017-09-15 14:19:29 -0500, Nico Williams wrote:
> > Please see my post and the linked file to see why.
> 
> The discussions here are often going to be referred back to in years, so
> external links where we aren't sure about the longevity (like e.g. links
> to the mailing list archive, where we're fairly sure), aren't liked
> much.  If you want to argue for a change, it should happen on-list.

Fair enough.  I thought I had given enough detail, but here is the code.
It's just an event trigger that ensures every table has a DEFERRED
CONSTRAINT TRIGGER that runs a function that debounces invocations so
that the "commit trigger" function runs just once:

/*
 * Copyright (c) 2017 Two Sigma Open Source, LLC.
 * All Rights Reserved
 *
 * Permission to use, copy, modify, and distribute this software and its
 * documentation for any purpose, without fee, and without a written agreement
 * is hereby granted, provided that the above copyright notice and this
 * paragraph and the following two paragraphs appear in all copies.
 *
 * IN NO EVENT SHALL TWO SIGMA OPEN SOURCE, LLC BE LIABLE TO ANY PARTY FOR
 * DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
 * LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION,
 * EVEN IF TWO SIGMA OPEN SOURCE, LLC HAS BEEN ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 * TWO SIGMA OPEN SOURCE, LLC SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING,
 * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 * FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS"
 * BASIS, AND TWO SIGMA OPEN SOURCE, LLC HAS NO OBLIGATIONS TO PROVIDE
 * MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
 */

/*
 * This file demonstrates how to create a "COMMIT TRIGGER" for
 * PostgreSQL using CONSTRAINT TRIGGERs.
 *
 * There have been many threads on the PG mailing lists about commit
 * triggers, with much skepticism shown about the possible semantics of
 * such a thing.
 *
 * Below we demonstrate reasonable, useful, and desirable semantics, and how
 * to obtain them with PG today.
 *
 * There are three shortcomings of this implementation:
 *
 * a) It is possible to defeat this implementation by using
 *
 *  SET CONSTRAINTS ... IMMEDIATE;
 *
 *or otherwise disabling the triggers created under the hood herein.
 *
 *The ability to make these triggers run early can be *dangerous*,
 *depending on the application.  It is especially dangerous given
 *that no privilege is needed in order to do this, and there's no
 *way for a CONSTRAINT TRIGGER to detect when it is called _last_,
 *only when it is called _first_, in any transaction.
 *
 * b) This implementation serializes write transactions implicitly by
 *having a single row encode commit trigger state.
 *
 *(This is easily fixed though.)
 *
 * c) This implementation is inefficient because CONSTRAINT TRIGGERs
 *have to be FOR EACH ROW triggers.  Thus a transaction that does
 *1,000 inserts will cause 999 unnecessary trigger procedure calls
 *under the hood.  Also, because CONSTRAINT TRIGGERs have to be FOR
 *EACH ROW triggers, PG has to track OLD/NEW row values for all
 *affected rows, even though commit triggers obviously don't need
 *this.
 *
 * (Also, for simplicity we use SECURITY DEFINER functions here,
 * otherwise we'd have to have additional code to grant to
 * public the ability to call our functions.  We would need additional
 * code by which to ensure that users do not toggle internal state to
 * prevent commit trigger execution.)
 *
 * For example, to create a commit trigger that invokes
 * commit_trigger.example_proc() at the end of any _write_ transaction,
 * run the following in psql:
 *
 *  -- Load commit trigger functionality:
 *  \i commit_trigger.sql
 *
 *  -- CREATE COMMIT TRIGGER egt
 *  -- EXECUTE PROCEDURE commit_trigger.example_proc();
 *  INSERT INTO commit_trigger.triggers
 *  (trig_name, proc_schema, proc_name)
 *  SELECT 'egt', 'commit_trigger', 'example_proc';
 *
 * Demo:
 *
 *  db=# \i commit_trigger.sql
 *  
 *  db=# INSERT INTO commit_trigger.triggers
 *  db-# (trig_name, proc_schema, proc_name)
 *  db-# SELECT 'egt', 'commit_trigger', 'example_proc';
 *  db=#
 *  db=# CREATE SCHEMA eg;
 *  CREATE SCHEMA
 *  db=# CREATE TABLE eg.x(a text primary key);
 *  CREATE TABLE
 *  db=# BEGIN;
 *  BEGIN
 *  db=# INSERT INTO eg.x (a) VALUES('foo');
 *  INSERT 0 1
 *  db=# INSERT INTO eg.x (a) VALUES('bar');
 *  INSERT 0 1
 *  db=# COMMIT;
 *  NOTICE:  example_proc() here!  Should be just one for this TX (txid 208036)
 *  CONTEXT:  PL/pgSQL function example_proc() line 3 at
 *  RAISE
 *  COMMIT
 *  db=# INSERT INTO eg.x (a) VALUES('foobar');
 *  NOTICE:  example_proc() here!  Should be just one for this TX (txid 208037)
 *  CONTEXT:  PL/pgSQL function 
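In condensed form, the pattern boils down to something like the following
minimal sketch (illustration only, with hypothetical names; a
transaction-local GUC stands in for the single state row the full file uses
to debounce the trigger):

-- Commit-time procedure, debounced so its body runs once per transaction.
CREATE FUNCTION commit_sketch_fire() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF current_setting('commit_sketch.fired', true) IS DISTINCT FROM 'yes' THEN
        -- Transaction-local flag: reverts automatically at end of transaction.
        PERFORM set_config('commit_sketch.fired', 'yes', true);
        RAISE NOTICE 'commit-time body runs once for txid %', txid_current();
        -- consistency checks / NOTIFY / history recording would go here
    END IF;
    RETURN NULL;
END $$;

-- Event trigger that attaches a DEFERRED CONSTRAINT TRIGGER to every new table.
CREATE FUNCTION commit_sketch_attach() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
               WHERE command_tag = 'CREATE TABLE'
    LOOP
        EXECUTE format(
            'CREATE CONSTRAINT TRIGGER zzz_commit_sketch
                 AFTER INSERT OR UPDATE OR DELETE ON %s
                 DEFERRABLE INITIALLY DEFERRED
                 FOR EACH ROW EXECUTE PROCEDURE commit_sketch_fire()',
            obj.object_identity);
    END LOOP;
END $$;

CREATE EVENT TRIGGER commit_sketch_ddl
    ON ddl_command_end
    WHEN TAG IN ('CREATE TABLE')
    EXECUTE PROCEDURE commit_sketch_attach();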

Re: [HACKERS] POC: Sharing record typmods between backends

2017-09-15 Thread Andres Freund
On 2017-09-15 15:39:49 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-09-14 23:29:05 -0400, Tom Lane wrote:
> >> FWIW, I'm not on board with that.  I think the version of typedefs.list
> >> in the tree should reflect the last official pgindent run.
> 
> > Why? I see pretty much no upside to that. You can't reindent anyway, due
> > to unindented changes. You can get the used typedefs.list trivially from
> > git.
> 
> Perhaps, but the real problem is still this:
> 
> >> There's also a problem that it only works well if *every* committer
> >> faithfully updates typedefs.list, which isn't going to happen.
> 
> We can't even get everybody to pgindent patches before commit, let alone
> update typedefs.list.

Well, that's partially because right now it's really painful to do so,
and we've not tried to push people to do so.  You essentially have to:
1) Pull down a new typedefs.list (how many people know where from?)
2) Add new typedefs that have been added in the commit-to-be
3) Run pgindent only on the changed files, because there's bound to be
   thousands of unrelated reindents
4) Revert reindents in changed files that are unrelated to the commit.

1) is undocumented 2) is painful (add option to automatically
generate?), 3) is painful (add commandline tool?) 4) is painful.  So
it's not particularly surprising that many don't bother.


> >> For local pgindent'ing, I pull down
> >> https://buildfarm.postgresql.org/cgi-bin/typedefs.pl
> 
> > That's a mighty manual process - I want to be able to reindent files,
> > especially new ones where it's still reasonably possible, without having
> > to download files, then move changes out of the way, so I can rebase,
> 
> Well, that just shows you don't know how to use it.  You can tell pgindent
> to use an out-of-tree copy of typedefs.list.  I have the curl fetch and
> using the out-of-tree list all nicely scripted ;-)

Not sure how that invalidates my statement. If you have to script it
locally, and still have to add typedefs manually, that's still plenty of
stuff every committer (and better yet, every contributor!) has to learn.


> There might be something to be said for removing the typedefs list
> from git altogether, and adjusting the standard wrapper script to pull
> it from the buildfarm into a .gitignore'd location if there's not a
> copy there already.

I wonder if we could add a command that pulls down an up2date list *and*
regenerates a list for the local tree with the local settings. And then
runs pgindent with the combined list - in most cases that'd result in a
properly indented tree. The number of commits with platform specific
changes that the author/committer doesn't compile/run isn't that high.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS

2017-09-15 Thread Nico Williams
On Fri, Sep 15, 2017 at 02:19:29PM -0500, Nico Williams wrote:
> On Fri, Sep 15, 2017 at 11:26:08AM -0700, Andres Freund wrote:
> > I think you should also explain why that's a desirable set of
> > semantics.

Note that DEFERRED CONSTRAINT TRIGGERs already have these semantics,
except of course that an unprivileged user can force them to run
IMMEDIATEly rather than DEFERRED.

Now, why is this desirable: atomicity.

   The client/user can no longer add statements to the transaction,
   therefore now (commit-time) is a good time to run a procedure that
   can *atomically* examine the totality of the changes made by the
   user.

This allows one to perform consistency checks across the entire DB.
One might speed them up by examining deltas recorded by for-each-row
triggers, but logically one could check the entire state of the DB.

Commit triggers also allow one to record transaction boundaries, and
NOTIFY listeners less frequently than if one did a NOTIFY in normal
for-each-row/statement triggers.

These are all good things -- or at least they aren't bad things.

For NOTIFY, it would be nice to have a post-commit trigger, though such
a thing could do nothing more than NOTIFY!

(A "post-commit" trigger could certainly not write to the DB unless it
did so in a new transaction that itself could not invoke post-commit
triggers that could write more...  Such a thing would be very strange!
But I'm not asking for a post-commit trigger feature.)

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] POC: Sharing record typmods between backends

2017-09-15 Thread Tom Lane
Andres Freund  writes:
> On 2017-09-14 23:29:05 -0400, Tom Lane wrote:
>> FWIW, I'm not on board with that.  I think the version of typedefs.list
>> in the tree should reflect the last official pgindent run.

> Why? I see pretty much no upside to that. You can't reindent anyway, due
> to unindented changes. You can get the used typedefs.list trivially from
> git.

Perhaps, but the real problem is still this:

>> There's also a problem that it only works well if *every* committer
>> faithfully updates typedefs.list, which isn't going to happen.

We can't even get everybody to pgindent patches before commit, let alone
update typedefs.list.  So sooner or later your process is going to need
to involve getting a current list from the buildfarm.

>> For local pgindent'ing, I pull down
>> https://buildfarm.postgresql.org/cgi-bin/typedefs.pl

> That's a mighty manual process - I want to be able to reindent files,
> especially new ones where it's still reasonably possible, without having
> to download files, then move changes out of the way, so I can rebase,

Well, that just shows you don't know how to use it.  You can tell pgindent
to use an out-of-tree copy of typedefs.list.  I have the curl fetch and
using the out-of-tree list all nicely scripted ;-)

There might be something to be said for removing the typedefs list from
git altogether, and adjusting the standard wrapper script to pull it from
the buildfarm into a .gitignore'd location if there's not a copy there
already.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-09-15 Thread Oleg Bartunov
On Fri, Sep 15, 2017 at 7:31 PM, Robert Haas  wrote:
> On Fri, Sep 15, 2017 at 10:10 AM, Daniel Gustafsson  wrote:
>> Can we expect a rebased version of this patch for this commitfest?  Since 
>> it’s
>> a rather large feature it would be good to get it in as early as we can in 
>> the
>> process.
>
> Again, given that this needs a "major" rebase and hasn't been updated
> in a month, and given that the CF is already half over, this should
> just be bumped to the next CF.  We're supposed to be trying to review
> things that were ready to go by the start of the CF, not the end.

We are supporting v10 branch in our github repository
https://github.com/postgrespro/sqljson/tree/sqljson_v10

Since the first post we have made a lot of changes, mostly because of a
better understanding of the standard and the availability of the technical
report
(http://standards.iso.org/ittf/PubliclyAvailableStandards/c067367_ISO_IEC_TR_19075-6_2017.zip).
The most important are:

1. We abandoned FORMAT support, which could confuse our users, since we
have data types json[b].

2. We use the XMLTABLE infrastructure, extended for JSON_TABLE support.

3. We reorganized the commits, so we could split one big patch into several
smaller patches, which can be reviewed independently.

4. The biggest problem is documentation; we are working on it.

Nikita will submit patches soon.

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS

2017-09-15 Thread Andres Freund
Hi,

On 2017-09-15 14:19:29 -0500, Nico Williams wrote:
> Please see my post and the linked file to see why.

The discussions here are often going to be referred back to in years, so
external links where we aren't sure about the longevity (like e.g. links
to the mailing list archive, where we're fairly sure), aren't liked
much.  If you want to argue for a change, it should happen on-list.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS

2017-09-15 Thread Nico Williams
On Fri, Sep 15, 2017 at 11:26:08AM -0700, Andres Freund wrote:
> On 2017-09-14 14:41:12 -0500, Nico Williams wrote:
> > I've read through several old threads on COMMIT TRIGGERs.  Rather than
> > write a lengthy post addressing past debates, here's an implementation
> > and demonstration of [an approximation of] COMMIT TRIGGERs with natural
> > and _desirable_ semantics:
> 
> I think you should also explain why that's a desirable set of
> semantics. E.g. explain the use case you have and potential other
> use-cases.  I think it should also be explained that these are
> *pre*-commit triggers - IIRC some people have asked for *post*-commit
> triggers in the past.

Responding out of order:

 - Yes, this is a pre-commit thing.

   It's the *same* as DEFERRED CONSTRAINT TRIGGERs.  After all, that's
   how I'm implementing this now :)

   Critically, the client/user can no longer execute additional
   statements at this point, since they've executed COMMIT.  Therefore
   these trigger procedures will see *all* of the changes made by the
   user (and all of the changes made by commit triggers that run before
   them, so, as usual, trigger invocation order matters).

 - As to use-cases, I listed a few in my post:

- update/refresh view materializations
- consistency checks
- NOTIFY
- record history (in particular, record transaction boundaries)
- and, no doubt, others

Of course all of this can [almost!] be done with CONSTRAINT TRIGGERs,
since that's what I'm demonstrating.

HOWEVER, there's a *very serious* problem with CONSTRAINT TRIGGERs:
unprivileged users can make them run IMMEDIATEly rather than deferred.

Also, using CONSTRAINT TRIGGERs for this is inefficient.  Please see my
post and the linked file to see why.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] path toward faster partition pruning

2017-09-15 Thread Robert Haas
On Fri, Sep 15, 2017 at 4:50 AM, Amit Langote
 wrote:
> Rebased patches attached.  Because Dilip complained earlier today about
> clauses of the form (const op var) not causing partition-pruning, I've
> added code to commute the clause where it is required.  Some other
> previously mentioned limitations remain -- no handling of OR clauses, no
> elimination of redundant clauses for given partitioning column, etc.
>
> A note about 0001: this patch overlaps with
> 0003-Canonical-partition-scheme.patch from the partitionwise-join patch
> series that Ashutosh Bapat posted yesterday [1].

It doesn't merely overlap; it's obviously a derivative work, and the
commit message in your version should credit all the authors.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: add --if-exists to pg_recvlogical

2017-09-15 Thread Peter Eisentraut
On 9/15/17 13:18, Rosser Schwarz wrote:
> On Fri, Sep 15, 2017 at 04:15 Daniel Gustafsson  wrote:
> 
> This patch is "Waiting for Author” due to the above review comments
> from Peter
> and Thomas.  Do you think you will have time to address these
> shortly so we can
> move this patch further in the process?
> 
> 
> I have a patch addressing both comments that I'm testing and tweaking
> now (well, not *now* now), which I hope to submit this weekend. 
> 
> Is that enough time?

There is no rush of any kind.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS

2017-09-15 Thread Andres Freund
Hi,

On 2017-09-14 14:41:12 -0500, Nico Williams wrote:
> I've read through several old threads on COMMIT TRIGGERs.  Rather than
> write a lengthy post addressing past debates, here's an implementation
> and demonstration of [an approximation of] COMMIT TRIGGERs with natural
> and _desirable_ semantics:

I think you should also explain why that's a desirable set of
semantics. E.g. explain the use case you have and potential other
use-cases.  I think it should also be explained that these are
*pre*-commit triggers - IIRC some people have asked for *post*-commit
triggers in the past.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns

2017-09-15 Thread Andres Freund
Hi Thom,

Thanks for taking a whack at this!

On 2017-09-15 12:16:22 +0100, Thom Brown wrote:
> I've run a fairly basic test with a table with 101 columns, selecting
> a single row from the table and I get the following results:
> 
> 
> Columns with 1-character names:
> 
> master (80 jobs, 80 connections, 60 seconds):

FWIW, I don't think it's useful to test this with a lot of concurrency -
at that point you're likely saturating the machine with context switches
etc. unless you have a lot of cores. As this is isn't related to
concurrency I'd rather just check a single connection.


> transaction type: /tmp/test.sql
> scaling factor: 1
> query mode: simple

I think you'd need to use prepared statements / -M prepared to see
benefits - when parsing statements for every execution the bottleneck is
elsewhere (hello O(#available_columns * #selected_columns) in
colNameToVar()).

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CLUSTER command progress monitor

2017-09-15 Thread Jeff Janes
On Wed, Aug 30, 2017 at 7:12 PM, Tatsuro Yamada <
yamada.tats...@lab.ntt.co.jp> wrote:

>
> The view provides the information of CLUSTER command progress details as
> follows
> postgres=# \d pg_stat_progress_cluster
>View "pg_catalog.pg_stat_progress_cluster"
>Column|  Type   | Collation | Nullable | Default
> -+-+---+--+-
>  pid | integer |   |  |
>  datid   | oid |   |  |
>  datname | name|   |  |
>  relid   | oid |   |  |
>  phase   | text|   |  |
>  scan_method | text|   |  |
>  scan_index_relid| bigint  |   |  |
>  heap_tuples_total   | bigint  |   |  |
>  heap_tuples_scanned | bigint  |   |  |
>

I think it should be cluster_index_relid, not scan_index_relid.  If the
scan_method is seq, then the index isn't being scanned.
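As an aside, a monitoring query against the proposed view might look
something like this (a hedged sketch only: the view and its columns come
from the patch, not from core, and whether heap_tuples_total is always
populated is an assumption):

SELECT pid,
       datname,
       relid::regclass AS table_name,
       phase,
       scan_method,
       heap_tuples_scanned,
       round(100.0 * heap_tuples_scanned
                   / nullif(heap_tuples_total, 0), 1) AS pct_scanned
  FROM pg_stat_progress_cluster;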

Cheers,

Jeff


Re: [HACKERS] POC: Sharing record typmods between backends

2017-09-15 Thread Andres Freund
On 2017-09-14 23:29:05 -0400, Tom Lane wrote:
> Thomas Munro  writes:
> > On Fri, Sep 15, 2017 at 3:03 PM, Andres Freund  wrote:
> >> - added typedefs to typedefs.list
> 
> > Should I do this manually with future patches?

I think there's sort of a circuit split on that one. Robert and I do it
regularly; most others don't.


> FWIW, I'm not on board with that.  I think the version of typedefs.list
> in the tree should reflect the last official pgindent run.

Why? I see pretty much no upside to that. You can't reindent anyway, due
to unindented changes. You can get the used typedefs.list trivially from
git.


> There's also a problem that it only works well if *every* committer
> faithfully updates typedefs.list, which isn't going to happen.
> 
> For local pgindent'ing, I pull down
> 
> https://buildfarm.postgresql.org/cgi-bin/typedefs.pl
> 
> and then add any typedefs created by the patch I'm working on to that.
> But I don't put the result into the commit.  Maybe we need a bit better
> documentation and/or tool support for using an unofficial typedef list.

That's a mighty manual process - I want to be able to reindent files,
especially new ones where it's still reasonably possible, without having
to download files, then move changes out of the way, so I can rebase,
...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The case for removing replacement selection sort

2017-09-15 Thread Peter Geoghegan
On Fri, Sep 15, 2017 at 6:34 AM, Tomas Vondra
 wrote:
> e5-2620-v4
> --
> - probably the CPU we should be looking at, as it's the current model
> - in some cases this gives us 3-5x speedup with low work_mem settings
> - consider for example the very last line, which shows that
>
>   SELECT DISTINCT a FROM text_test_padding ORDER BY a
>
> completed in ~5531ms on work_mem=8MB and 1067ms on 32MB, but with the
> patch it completes in 1784ms (1MB), 1211ms (4MB) and 1104 (8MB).

I think that this is because of the importance of getting a final
on-the-fly merge. Overlapping I/O with computation matters a lot here.

> - Of course, this is for already-sorted data, and for other data sets
> the improvement is more modest. It's difficult to summarize this into a
> single number, but there are plenty of improvements in 20-30% range.
>
> - Some cases are a bit slower, of course, but overall I'd say the chart
> is much more green than red. Also the slow-downs are much smaller,
> compared to speed-ups (generally within 1-5%).

The larger regressions mostly involve numeric. We have two
palloc()/pfree() cycles in the standard numeric comparator (one for
each datum that is compared), a cost which could be removed by
enhancing numeric SortSupport directly. That's why the numeric
abbreviated key stuff was the most effective (more effective than
text) when added to 9.5. We currently have very expensive full
comparisons of datums within L1 cache ("merge comparisons") relative
to abbreviated key comparisons -- the difference is huge, but could be
reduced.

Anyway, it seems ironic that in those few cases where replacement
selection still does better, it seems to be due to reduced CPU costs,
not reduced I/O costs. This is the opposite benefit to the one you'd
expect from reading Knuth.

Thanks for benchmarking! I hope that this removes the doubt that
replacement selection previously benefited from; it now clearly
deserves to be removed from tuplesort.c.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-15 Thread Arseny Sher
Peter Eisentraut  writes:

> Here is a simple patch that fixes this, based on my original proposal
> point #4.

I checked it; it passes the tests and solves the problem. However, isn't
this

+   if (slotname || !subenabled)

a truism? Is it possible that a subscription has no slot but is still
enabled?

Besides, we could avoid stopping the workers if the subscription has no
associated replication origin, though that probably means the
subscription was broken by the user and is not worth it.

--
Arseny Sher


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: add --if-exists to pg_recvlogical

2017-09-15 Thread Rosser Schwarz
On Fri, Sep 15, 2017 at 04:15 Daniel Gustafsson  wrote:

> This patch is "Waiting for Author” due to the above review comments from
> Peter
> and Thomas.  Do you think you will have time to address these shortly so
> we can
> move this patch further in the process?


I have a patch addressing both comments that I'm testing and tweaking now
(well, not *now* now), which I hope to submit this weekend.

Is that enough time?

--rosser

> --
:wq


Re: [HACKERS] Optimise default partition scanning while adding new partition

2017-09-15 Thread Amit Langote
On Sat, Sep 16, 2017 at 12:59 AM, Robert Haas  wrote:
> On Fri, Sep 15, 2017 at 2:00 AM, Amit Langote
>  wrote:
>> I wonder if we should call check_default_allows_bound() from
>> ATExecAttachPartition(), too, instead of validating updated default
>> partition constraint using ValidatePartitionConstraints()?  That is, call
>> the latter only to validate the partition constraint of the table being
>> attached and call check_default_allows_bound() to validate the updated
>> default partition constraint.  That way, INFO/ERROR messages related to
>> default partition constraint are consistent across the board.
>
> I believe the intended advantage of the current system is that if you
> specify multiple operations in a single ALTER TABLE command, you only
> do one scan rather than having a second scan per operation.  If that's
> currently working, we probably don't want to make it stop working.

OK.

How about squashing Jeevan's patch and mine, so that both
check_default_allows_bound() and ValidatePartitionConstraints() know
to scan the default partition's children and there won't be any surprises
in the regression test output, as you found after applying just
Jeevan's patch?  Unfortunately, I'm not able to post such a patch
right now.

Thanks,
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pgjdbc logical replication client throwing exception

2017-09-15 Thread Dipesh Dangol
hi,

I am trying to implement the logical replication stream API of PostgreSQL.
I am facing an unusual connection breakdown problem. Here is the simple code
that I am using to read the WAL:

String url = "jdbc:postgresql://pcnode2:5432/benchmarksql";
Properties props = new Properties();
PGProperty.USER.set(props, "benchmarksql");
PGProperty.PASSWORD.set(props, "benchmarksql");
PGProperty.ASSUME_MIN_SERVER_VERSION.set(props, "9.4");
PGProperty.REPLICATION.set(props, "database");
PGProperty.PREFER_QUERY_MODE.set(props, "simple");

Connection conn = DriverManager.getConnection(url, props);
PGConnection replConnection = conn.unwrap(PGConnection.class);

PGReplicationStream stream = replConnection.getReplicationAPI()
.replicationStream().logical()
.withSlotName("replication_slot3")
.withSlotOption("include-xids", true)
.withSlotOption("include-timestamp", "on")
.withSlotOption("skip-empty-xacts", true)
.withStatusInterval(20, TimeUnit.MILLISECONDS).start();
while (true) {

ByteBuffer msg = stream.read();

if (msg == null) {
TimeUnit.MILLISECONDS.sleep(10L);
continue;
}

int offset = msg.arrayOffset();
byte[] source = msg.array();
int length = source.length - offset;
String data = new String(source, offset, length);
System.out.println(data);

stream.setAppliedLSN(stream.getLastReceiveLSN());
stream.setFlushedLSN(stream.getLastReceiveLSN());

}

Even the slightest modification to the code, like commenting out
System.out.println(data);
(which just prints the data to the console), causes the connection breakdown
problem with the
following error message:

org.postgresql.util.PSQLException: Database connection failed when reading from copy
	at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1028)
	at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
	at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
	at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
	at org.postgresql.core.v3.replication.V3PGReplicationStream.read(V3PGReplicationStream.java:70)
	at Server.main(Server.java:52)
Caused by: java.net.SocketException: Socket closed
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140)
	at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:109)
	at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:191)
	at org.postgresql.core.PGStream.receive(PGStream.java:495)
	at org.postgresql.core.PGStream.receive(PGStream.java:479)
	at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1161)
	at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1026)
	... 5 more

I am trying to implement some logic, like filtering out unrelated tables
after reading the log.
But due to this unusual behavior I couldn't implement it properly.
Can somebody give me a hint on how to solve this problem?

Thank you.


Re: [HACKERS] GnuTLS support

2017-09-15 Thread Jeff Janes
On Thu, Aug 31, 2017 at 10:52 AM, Andreas Karlsson 
wrote:

>
>
>
> = Work left to do
>
> - Test code with older versions of GnuTLS
>


I can't build against gnutls-2.12.23-21.el6.x86_64 from CentOS 6.9

be-secure-gnutls.c: In function 'be_tls_init':
be-secure-gnutls.c:168: warning: implicit declaration of function
'gnutls_certificate_set_pin_function'
be-secure-gnutls.c: In function 'get_peer_certificate':
be-secure-gnutls.c:656: error: 'GNUTLS_X509_CRT_LIST_SORT' undeclared
(first use in this function)
be-secure-gnutls.c:656: error: (Each undeclared identifier is reported only
once
be-secure-gnutls.c:656: error: for each function it appears in.)

But I can build on ubuntu 16.04, using whatever baffling salami of package
versions they turned gnutls into on that system.

I can interoperate in both directions: gnutls client to openssl server and
the reverse (and gnutls to gnutls).

Cheers,

Jeff


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-09-15 Thread Robert Haas
On Fri, Sep 15, 2017 at 10:10 AM, Daniel Gustafsson  wrote:
> Can we expect a rebased version of this patch for this commitfest?  Since it’s
> a rather large feature it would be good to get it in as early as we can in the
> process.

Again, given that this needs a "major" rebase and hasn't been updated
in a month, and given that the CF is already half over, this should
just be bumped to the next CF.  We're supposed to be trying to review
things that were ready to go by the start of the CF, not the end.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Log LDAP "diagnostic messages"?

2017-09-15 Thread Peter Eisentraut
On 9/14/17 10:21, Alvaro Herrera wrote:
> BTW I added --with-ldap and --with-pam to the configure line for the
> reports in coverage.postgresql.org and the % covered in auth.c went from
> 24% to 18.9% (from very bad to terribly sad).

You can add src/test/ldap/ now to make up for some of that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More flexible LDAP auth search filters?

2017-09-15 Thread Peter Eisentraut
On 9/12/17 19:04, Thomas Munro wrote:
>> Any further thoughts on the test suite?  Otherwise I'll commit it as we
>> have it, for manual use.

done

> I wonder if there is a reasonable way to indicate or determine whether
> you have slapd installed so that check-world could run this test...

The other problem is the open TCP/IP socket, so we can't really run it
by default anyway, like the SSL tests.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-09-15 Thread Robert Haas
On Fri, Sep 15, 2017 at 3:54 AM, Tsunakawa, Takayuki
 wrote:
> So far, there are four proponents (Tels (non-PG-developer), David J., Robert 
> and me), and two opponents (Tom and Michael).

4-2 is a reasonable vote in favor of proceeding, although it's a bit
marginal given the stature of the opponents.

I'm still not going to change this just before rc1 wraps though.  I
think it has to wait for 11.  There's too much chance of collateral
damage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimise default partition scanning while adding new partition

2017-09-15 Thread Robert Haas
On Fri, Sep 15, 2017 at 2:00 AM, Amit Langote
 wrote:
> I wonder if we should call check_default_allows_bound() from
> ATExecAttachPartition(), too, instead of validating updated default
> partition constraint using ValidatePartitionConstraints()?  That is, call
> the latter only to validate the partition constraint of the table being
> attached and call check_default_allows_bound() to validate the updated
> default partition constraint.  That way, INFO/ERROR messages related to
> default partition constraint are consistent across the board.

I believe the intended advantage of the current system is that if you
specify multiple operations in a single ALTER TABLE command, you only
do one scan rather than having a second scan per operation.  If that's
currently working, we probably don't want to make it stop working.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [JDBC] [HACKERS] Channel binding support for SCRAM-SHA-256

2017-09-15 Thread Robert Haas
On Thu, Sep 14, 2017 at 12:58 PM, Peter Eisentraut
 wrote:
> Second thoughts, to make things simpler.  All we need for channel
> binding is a connection flag that says "I require channel binding".  It
> could be modeled after the sslmode parameter, e.g., cbind=disable (maybe
> for debugging), cbind=prefer (default), cbind=require.  If you specify
> "require", then libpq would refuse to proceed unless scram-sha2-256-plus
> (or future similar mechanisms) was offered for authentication.

+1, although I think cbind is too brief.  I'd spell it out.

> We don't even need a parameter that specifies which channel binding type
> to use.  If libpq implements tls-unique, it should always use that.  We
> might need a flag for testing other types, but that should not be an
> in-the-user's-face option.

I'm not so sure about this part.  Why don't we want to let users control this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Process startup infrastructure is a mess

2017-09-15 Thread Robert Haas
On Thu, Sep 14, 2017 at 9:30 PM, Stephen Frost  wrote:
> I'm definitely in agreement with Andres on this one.  This isn't
> refactoring of little-to-never changed code, it's refactoring bits of
> the system which are changed with some regularity and looks likely to
> continue to need change as we add more features moving forward, and
> perhaps add greater controls over process startup.

I agree to some extent.

I think this is a mess and that cleaning it up would be better than
not cleaning it up.  I don't view it as a particularly urgent problem,
though, and I'm not sure I see a need to attack it in a monolithic
way.  I think that unifying the error recovery paths in various
processes would actually be more useful than unifying process startup,
but that's not to say unifying process startup wouldn't be nice.

Generally, I think the newer process startup methods are superior to
the older ones.  The new background worker mechanism provides a
flexible way of adding new processes whose exact number doesn't need
to be known in advance without having to tinker with so much code.  In
contrast, adding an auxiliary process is a big nuisance.  So if I were
going to attack this problem, I'd try to make the background worker
machinery sufficiently capable that we can do everything that way, and
then gradually move more things over to that mechanism.

However, I honestly probably wouldn't bother with this unless it were
blocking something specific that I wanted to do.  I wouldn't go as far
as Simon did in his reply; I would not favor refusing a good patch
from a highly qualified contributor that makes this all less ugly and
fragile.  On the other hand, I think he's right that we wouldn't
necessarily get a lot of immediate benefit out of it apart from
feeling good about having done it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surjective functional indexes

2017-09-15 Thread Konstantin Knizhnik



On 14.09.2017 18:53, Simon Riggs wrote:

It's not going to work, as already mentioned above. Those stats are at
table level and very little to do with this particular index.

But you've not commented on the design I mention that can work: index relcache.


Concerning your idea to check cost of index function: it certainly makes
sense.
The only problems: I do not understand now how to calculate this cost.
It can be easily calculated by optimizer when it is building query execution
plan.
But inside BuildIndexInfo I have just reference to Relation and have no idea
how
I can propagate here information about index expression cost from optimizer.

We could copy at create index, if we took that route. Or we can look
up the cost for the index expression and cache it.


Anyway, this is just jumping around because we still have a parameter
and the idea was to remove the parameter entirely by autotuning, which
I think is both useful and possible, just as HOT itself is autotuned.



Attached please find yet another version of the patch.
I had to significantly rewrite it, because my first attempts to add
auto-tune were not correct.

The new patch does it in the correct way (I hope) and more efficiently.
I moved the auto-tune code from BuildIndexInfo, which is called many times,
including from heap_update (so at least once per updated tuple), to
RelationGetIndexAttrBitmap, which is called only when the cached
RelationData is filled by the backend.
The problem with my original implementation of auto-tune was that switching
off the "projection" property of an index did not update the attribute masks
calculated by RelationGetIndexAttrBitmap.

I have also added a check for the maximal cost of the indexed expression.
So now the decision whether to apply the projection index optimization
(comparing old and new values of the indexed expression) is based on three
sources:

 1. The calculated HOT update statistic: we compare the number of HOT
    updates that were performed because the projection index check showed
    that the index expression had not changed with the total number of
    updates affecting attributes used in projection indexes.  If it is
    smaller than some threshold (10%), then the index is considered
    non-projective.

 2. The calculated cost of the index expression: if it is higher than some
    threshold (1000), then the extra comparison of index expression values
    is expected to be too expensive.

 3. The "projection" index option explicitly set by the user.  This setting
    overrides 1) and 2).
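
As a usage sketch (hedged: the "projection" reloption and the JSON example
come from the patch's documentation hunk below; the table and column names
are made up):

-- Expression index whose value rarely changes when the row is updated;
-- setting the reloption explicitly overrides the auto-tuning above.
CREATE TABLE booking (id serial PRIMARY KEY, bookinfo jsonb);
CREATE INDEX booking_isbn_idx ON booking ((bookinfo->>'isbn'))
    WITH (projection = true);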




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 83ee7d3..52189ac 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 The optional WITH clause specifies storage
 parameters for the index.  Each index method has its own set of allowed
-storage parameters.  The B-tree, hash, GiST and SP-GiST index methods all
-accept this parameter:
+storage parameters. All indexes accept the following parameter:
+   
+
+   
+   
+projection
+
+ 
+   Functional index is based on on projection function: function which extract subset of its argument.
+   In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed
+   expression is changed, then value of index expression is also changed. So to check that index is affected by the
+   update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered
+   as non-injective.
+   In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for
+   the expression expression(bookinfo-'isbn') defined
+   for column of JSON type is changed only when ISBN is changed, which rarely happen. The same is true for most
+   functional indexes. For non-injective functions, Postgres compares values of indexed expression for old and updated tuple and updates
+   index only when function results are different. It allows to eliminate index update and use HOT update.
+   But there are extra evaluations of the functions. So if function is expensive or probability that change of indexed column will not effect
+   the function value is small, then marking index as projection may increase update speed.
+
+
+   
+   
+
+   
+ The B-tree, hash, GiST and SP-GiST index methods all accept this parameter:

 

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index ec10762..b73165f 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -130,6 +130,15 @@ static relopt_bool boolRelOpts[] =
 	},
 	{
 		{
+			"projection",
+			"Evaluate functional index expression on update to check if its values is 

Re: [HACKERS] [PATCH] Improve geometric types

2017-09-15 Thread Robert Haas
On Fri, Sep 15, 2017 at 4:23 AM, Kyotaro HORIGUCHI
 wrote:
> /* don't merge the following same functions with different types
>into single macros so that double evaluation won't happen */
>
> Is it still too verbose?

Personally, I don't think such a comment has much value, but I am not
going to spend a lot of time arguing about it.  It's really up to the
eventual committer to decide, and that probably won't be me in this
case.  My knowledge of the geometric types isn't great, and I am a tad
busy breaking^Wimproving things I do understand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw: evaluate placeholdervars on remote server

2017-09-15 Thread Robert Haas
On Fri, Sep 15, 2017 at 10:15 AM, Daniel Gustafsson  wrote:
> Have you had a chance to look at this such that we can expect a rebased 
> version
> of this patch during the commitfest?

Frankly, I think things where there was a ping multiple weeks before
the CommitFest started and no rebase before it started should be
regarded as untimely submissions, and summarily marked Returned with
Feedback.  The CommitFest is supposed to be a time to get things that
are ready before it starts committed before it ends.  Some amount of
back-and-forth during the CF is of course to be expected, but we don't
even really have enough bandwidth to deal with the patches that are
being timely updated, never mind the ones that aren't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Renaming PG_GETARG functions (was Re: PG_GETARG_GISTENTRY?)

2017-09-15 Thread Daniel Gustafsson
> On 12 Sep 2017, at 22:07, Tom Lane  wrote:
> 
> [ changing subject line to possibly draw more attention ]
> 
> Mark Dilger  writes:
>>> On Apr 5, 2017, at 9:23 AM, Tom Lane  wrote:
>>> In short, if you are supposed to write
>>> FOO  *val = PG_GETARG_FOO(n);
>>> then the macro designer blew it, because the name implies that it
>>> returns FOO, not pointer to FOO.  This should be
>>> FOO  *val = PG_GETARG_FOO_P(n);
> 
>> I have written a patch to fix these macro definitions across src/ and
>> contrib/.
> 
> So to summarize, this patch proposes to rename some DatumGetFoo,
> PG_GETARG_FOO, and PG_RETURN_FOO macros for these datatypes:
> 
> NDBOX (contrib/cube)
> HSTORE
> LTREE and other contrib/ltree types
> 
> PG_GETARG_ANY_ARRAY (and there are some related macros it maybe should
> have touched, like PG_RETURN_EXPANDED_ARRAY)
> 
> JSONB
> 
> RANGE
> 
> The contrib types don't seem like much of a problem, but I wonder
> whether anyone feels that rationalizing the names for array, JSON,
> or range-type macros will break too much code.
> 
> One option if we do feel that way is that we could provide the
> old names as alternatives, thus not breaking external modules.
> But that seems like it's sabotaging the basic goal of improving
> consistency of naming.
> 
> If there are not objections, I plan to push forward with this.

Judging by this post, I’m updating this to “Ready for Committer”.

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept

2017-09-15 Thread i . kartyshov

Alexander Korotkov wrote on 2017-09-05 00:33:

I've following comments on this patch:
1) You shouldn't use ">=" to compare xids.  You should use
TransactionIdFollowsOrEquals() function which handles transaction id
wraparound correctly.
I fixed it, and in addition I added a GUC parameter that can turn off
this behavior. This parameter can be set in the postgresql.conf
configuration file or with the ALTER SYSTEM command. For the changes to
take effect, call the pg_reload_conf() function or reload the database
server.

Changes are in vacuum_lazy_truncate_v3.patch


2) As I understood, this patch makes heap truncate only when no
concurrent transactions are running on both master and replicas with
hot_standby_feedback enabled.  For busy system, it would be literally
"never do heap truncate after vacuum".
Yes, the difficulty is that if we have a lot of replicas and the requests
on them overlap each other, then truncation will never occur on a loaded
system.



Perhaps, it's certainly bad idea to disable replaying
AccessExclusiveLock *every time*, we should skip taking
AccessExclusiveLock only while writing XLOG_SMGR_TRUNCATE record.
Besides that, this code violates out coding style.
The patch (standby_truncate_skip_v2.patch) fixes this behaviour: it now
skips writing XLOG_STANDBY_LOCK records (with the AccessExclusiveLock in
them) only while autovacuum truncates the table.


Amit Kapila wrote on 2017-09-05 12:52:

I think one deficiency of this solution is that it will keep on
generating extra WAL even if standby doesn't need it (as standby has
successfully truncated the relation).  I don't know if we can just get
away by saying that an additional WAL record per vacuum cycle is
harmless especially when that doesn't serve any purpose (like for the
cases when standby is always able to truncate the heap on first WAL
record).  Am I missing some part of solution which avoids it?
You are right: this implementation generates extra WAL, and there could be
such cases. If you have any solutions to avoid it, I'll be glad to hear
them.


Simon Riggs wrote on 2017-09-05 14:44:

If we defer an action on standby replay, when and who will we apply
it? What happens if the standby is shutdown or crashes while an action
is pending.

Perhaps altering the way truncation requires an AccessExclusiveLock
would help workloads on both master and standby? If a Relation knew it
had a pending truncation then scans wouldn't need to go past newsize.
Once older lockers have gone we could simply truncate without the
lock. Would need a few pushups but seems less scary then trying to
make pending standby actions work well enough to commit.
Yes, that was the first idea that came to my mind, and the most difficult
one. I think that in a few days I'll finish a patch implementing this
idea.



--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Companydiff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 9a5fde0..a6dc369 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -31,6 +31,9 @@
 #include "storage/smgr.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
+#include "storage/lmgr.h"
+#include "storage/procarray.h"
+#include "access/transam.h"
 
 /*
  * We keep a list of all relations (represented as RelFileNode values)
@@ -495,9 +498,21 @@ smgr_redo(XLogReaderState *record)
 		xl_smgr_truncate *xlrec = (xl_smgr_truncate *) XLogRecGetData(record);
 		SMgrRelation reln;
 		Relation	rel;
+		LOCKTAG		locktag;
+		VirtualTransactionId *backends;
+		VirtualTransactionId *backends2;
 
 		reln = smgropen(xlrec->rnode, InvalidBackendId);
 
+		SET_LOCKTAG_RELATION(locktag, reln->smgr_rnode.node.dbNode,
+			 reln->smgr_rnode.node.relNode);
+
+		backends = GetLockConflicts(&locktag, AccessExclusiveLock);
+		backends2 = GetConflictingVirtualXIDs(InvalidTransactionId, InvalidOid);
+
+		if (!VirtualTransactionIdIsValid(*backends) && !VirtualTransactionIdIsValid(*backends2))
+		{
+
 		/*
 		 * Forcibly create relation if it doesn't exist (which suggests that
 		 * it was dropped somewhere later in the WAL sequence).  As in
@@ -542,6 +557,10 @@ smgr_redo(XLogReaderState *record)
 			visibilitymap_truncate(rel, xlrec->blkno);
 
 		FreeFakeRelcacheEntry(rel);
+		} else
+		{
+			XLogFlush(lsn);
+		}
 	}
 	else
 		elog(PANIC, "smgr_redo: unknown op code %u", info);
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 45b1859..3c25907 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -61,6 +61,8 @@
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
 #include "utils/tqual.h"
+#include "catalog/storage_xlog.h"
+#include "access/rmgr.h"
 
 
 /*
@@ -317,6 +319,32 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		new_rel_pages = vacrelstats->old_rel_pages;
 		new_rel_tuples = vacrelstats->old_rel_tuples;
 	}
+	else
+	{
+		if 

Re: [HACKERS] Support to COMMENT ON DATABASE CURRENT_DATABASE

2017-09-15 Thread Bossart, Nathan
A few general comments.

While this patch applies, I am still seeing some whitespace errors:

  comment_on_current_database_no_pgdump_v4.1.patch:488: trailing whitespace.
ColId
  comment_on_current_database_no_pgdump_v4.1.patch:490: trailing whitespace.
| CURRENT_DATABASE
  comment_on_current_database_no_pgdump_v4.1.patch:491: trailing whitespace.
{
  comment_on_current_database_no_pgdump_v4.1.patch:501: trailing whitespace.
ColId
  comment_on_current_database_no_pgdump_v4.1.patch:502: trailing whitespace.
{
  warning: squelched 9 whitespace errors
  warning: 14 lines add whitespace errors.

It looks like the 'dbname' test is currently failing when you run
'make check-world'.  The Travis CI build log [1] shows the details
of the failure.  From a brief glance, I would guess that some of
the queries are returning unexpected databases that are created in
other tests.
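
One illustrative way to make the test independent of databases created in
other tests -- an assumption on my part, not something taken from the patch
-- is to restrict the check to the database under test, e.g.:

  COMMENT ON DATABASE CURRENT_DATABASE IS 'dbname test comment';
  SELECT shobj_description(oid, 'pg_database')
    FROM pg_database
   WHERE datname = current_database();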

Also, I think this change will require updates to the
documentation.

+   if (dbspecname->dbnametype == DBSPEC_CURRENT_DATABASE )
+   dbname = get_database_name(MyDatabaseId);
+   else
+   dbname = dbspecname->dbname;

This pattern is repeated in the patch several times.  It looks like
the end goal of these code blocks is either to get the database
name or the database OID, so perhaps we should have
get_dbspec_name() and get_dbspec_oid() helper functions (like
get_rolespec_oid() for RoleSpec nodes).

+static bool
+_equalDatabaseSpec(const DBSpecName *a, const DBSpecName *b)
+{
+   COMPARE_SCALAR_FIELD(dbnametype);
+   COMPARE_STRING_FIELD(dbname);
+   COMPARE_LOCATION_FIELD(location);
+
+   return true;
+}

There are some inconsistencies in the naming strategy.  For
example, this function is called _equalDatabaseSpec(), but the
struct is DBSpecName.  I would suggest calling it DatabaseSpec or
DbSpec throughout the patch.

Nathan

[1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/275747367


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statement-level rollback

2017-09-15 Thread Daniel Gustafsson
> On 01 Sep 2017, at 13:44, Simon Riggs  wrote:
> 
> On 14 August 2017 at 23:58, Peter Eisentraut
>  wrote:
>> On 2/28/17 02:39, Tsunakawa, Takayuki wrote:
>>> The code for stored functions is not written yet, but I'd like your 
>>> feedback for the specification and design based on the current patch.  I'll 
>>> add this patch to CommitFest 2017-3.
>> 
>> This patch needs to be rebased for the upcoming commit fest.
> 
> I'm willing to review this if the patch is going to be actively worked on.

This sounds like a too good offer to pass up on, can we expect a rebased patch
for the commitfest?

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw: evaluate placeholdervars on remote server

2017-09-15 Thread Daniel Gustafsson
> On 15 Aug 2017, at 01:00, Peter Eisentraut  
> wrote:
> 
> On 4/3/17 22:00, Etsuro Fujita wrote:
>> On 2017/04/04 3:21, Andres Freund wrote:
>>> On 2017-02-28 21:45:22 +0900, Etsuro Fujita wrote:
 Here is a patch for $subject.
>>> 
>>> This is a nontrivial patch, submitted just before the start of the last
>>> CF for postgres 10.  Therefore I think we should move this to the next
>>> CF.
>> 
>> Honestly, I'm not satisfied with this patch and I think it would need 
>> more work.  Moved to the next CF.
> 
> This patch needs to be rebased for the upcoming commit fest.

Have you had a chance to look at this such that we can expect a rebased version
of this patch during the commitfest?

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind proposed scope and interface changes

2017-09-15 Thread Chris Travers
On Wed, Sep 13, 2017 at 6:28 AM, Michael Paquier 
wrote:

> On Tue, Sep 12, 2017 at 11:52 PM, Chris Travers
>  wrote:
> > Additionally the wal, xact, timestamp and logical directories must be
> > processed in some way.
>
> To what does the term "logical directories" refer?
>
> >   * if --wal=sync the directories are processed the way they are today
> >   * if --wal=clear then the contents of the directories are cleared and
> > replication is assumed to be used to bring the system up after.  Note
> this
> > will need to come with warning about the need for replication slots.
>
> Hm. I am not sure how --wal=clear is helpful. Keeping around WAL
> segments from the last checkpoint before the point where WAL forked, up
> to the point where WAL has forked, is helpful, because you don't need to
> copy those WAL segments again, whether they come from an archive or from
> streaming. Copying a set of WAL segments during the rewind of the new
> timeline is helpful as well because you don't need to do the copy
> again. One configuration where this is helpful is when an archive local
> to the target server already has the segments of the new timeline
> available.
>

So maybe --wal=clear should just clear the diverged WAL files.  That makes
some sense.

>
> > Base, global, pg_tablespace
> >
> > With
> > pg_wal, pg_xact, pg_commit_ts, pg_logical added if wal strategy is set to
> > sync.
>
> Skipping some directories in a way similar to what a base backup does
> would be nicer I think. We already have a list of those in
> basebackup.c in the shape of excludeDirContents and excludeFiles. I
> think that it would be a good idea to export those into a header that
> pg_rewind could include, and refer to in order to exclude them when
> fetching a set of files. At the end of the day, a rewind is a kind of
> base backup in itself, and this patch would already serve a lot of
> people well.
>
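(For reference, excludeDirContents/excludeFiles in basebackup.c are plain
arrays of path names, so the shared check being suggested could be as simple
as the sketch below.  The function name is hypothetical -- nothing like this
is exported today -- and the directory list is abbreviated; it only
illustrates the idea.)

#include <stdbool.h>
#include <string.h>

/* Hypothetical shared list; basebackup.c keeps its own private copy today. */
static const char *const excluded_dirs[] = {
    "pg_stat_tmp",
    "pg_replslot",
    "pg_dynshmem",
    "pg_notify",
    "pg_serial",
    "pg_snapshots",
    NULL
};

static bool
is_excluded_path(const char *path)
{
    int         i;

    for (i = 0; excluded_dirs[i] != NULL; i++)
    {
        size_t      len = strlen(excluded_dirs[i]);

        if (strncmp(path, excluded_dirs[i], len) == 0 &&
            (path[len] == '\0' || path[len] == '/'))
            return true;        /* skip this path when fetching files */
    }
    return false;
}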

I think there are several reasons not to just do this based on base
backup.  For a base backup there may be some things we restore that we
don't want to restore with pg_rewind.  Configuration files, logs, and
replication slots are the ones that immediately come to mind.  Given that
configuration files can have arbitrary names and locations, expecting
third-party failover tooling like repmgr or patroni to pick up on such
policies strikes me as asking a bit much.  I think we are better off with a
program that solves one problem well and solves it right, and that problem
is rewinding a system so that it can follow a replica that was previously
following it.

>
> Having on top of that a way to exclude a wanted set of files and the
> log directory (for example: should we look at log_directory and
> exclude it from the fetched paths if it is not an absolute path?),
> which is smart enough to take care of not removing paths critical for
> a rewind like anything in base/, then you are good to go with a
> full-blown tool that I think would serve the purposes you are looking
> for.
>

If you are going to go this route, I think you need to look at the config
files themselves, parse them, and possibly look at the command line by
which PostgreSQL is started.  Otherwise I don't know how you expect this
to be configured.


> --
> Michael
>



-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-09-15 Thread Daniel Gustafsson
> On 15 Aug 2017, at 04:30, Peter Eisentraut  
> wrote:
> 
> On 3/15/17 11:56, David Steele wrote:
>>> This patch has been moved to CF 2017-07.
>> 
>> I did not manage to move this patch when I said had.  It is now moved.
> 
> Unsurprisingly, this patch needs a major rebase.

Can we expect a rebased version of this patch for this commitfest?  Since it’s
a rather large feature it would be good to get it in as early as we can in the
process.

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow INSTEAD OF DELETE triggers to modify the tuple for RETURNING

2017-09-15 Thread Daniel Gustafsson
> On 05 Sep 2017, at 10:44, Haribabu Kommi  wrote:
> 
> On Mon, Aug 14, 2017 at 6:48 AM, Marko Tiikkaja  > wrote:
> On Fri, Jul 1, 2016 at 2:12 AM, I wrote:
> Currently the tuple returned by INSTEAD OF triggers on DELETEs is only used 
> to determine whether to pretend that the DELETE happened or not, which is 
> often not helpful enough; for example, the actual tuple might have been 
> concurrently UPDATEd by another transaction and one or more of the columns 
> now hold values different from those in the planSlot tuple. Attached is an 
> example case which is impossible to properly implement under the current 
> behavior.  For what it's worth, the current behavior seems to be an accident; 
> before INSTEAD OF triggers either the tuple was already locked (in case of 
> BEFORE triggers) or the actual pre-DELETE version of the tuple was fetched 
> from the heap.
> 
> So I'm suggesting to change this behavior and allow INSTEAD OF DELETE 
> triggers to modify the OLD tuple and use that for RETURNING instead of 
> returning the tuple in planSlot.  Attached is a WIP patch implementing that.
> 
> Is there any reason why we wouldn't want to change the current behavior?
> 
> Since nobody seems to have came up with a reason, here's a patch for that 
> with test cases and some documentation changes.  I'll also be adding this to 
> the open commit fest, as is customary.
> 
> Thanks for the patch. This patch improves the DELETE returning
> clause with the actual row.
> 
> Compilation and tests are passed. I have some review comments.
> 
> ! that was provided.  Likewise, for DELETE operations the
> ! OLD variable can be modified before returning it, and
> ! the changes will be reflected in the output data.
> 
> The above explanation is not very clear; how about the following?
> 
> Likewise, for DELETE operations the trigger may 
> modify the OLD row before returning it, and the
> change will be reflected in the output data of DELETE RETURNING.
> 
> 
> ! TupleTableSlot *
>   ExecIRDeleteTriggers(EState *estate, ResultRelInfo *relinfo,
> !  HeapTuple trigtuple, TupleTableSlot 
> *slot)
> 
> ! oldtuple = ExecMaterializeSlot(slot); --nodeModifyTable.c
> 
> The trigtuple is part of the slot anyway, so I feel there is no need to pass
> the trigtuple separately. The tuple can be formed internally by materializing
> the slot.
> 
> Or
> 
> Don't materialize the slot before the ExecIRDeleteTriggers function
> call.
> 
> ! /*
> !  * Return the modified tuple using the es_trig_tuple_slot.  We 
> assume
> !  * the tuple was allocated in per-tuple memory context, and 
> therefore
> !  * will go away by itself. The tuple table slot should not try 
> to
> !  * clear it.
> !  */
> ! TupleTableSlot *newslot = estate->es_trig_tuple_slot;
> 
> The input slot that is passed to the function ExecIRDeleteTriggers
> is the same as estate->es_trig_tuple_slot, and the tuple descriptor is
> also the same. Instead of using newslot, it is fine to use the slot directly.
> 
> 
> + /* trigger might have changed tuple */
> + oldtuple = ExecMaterializeSlot(slot);
> 
> 
> + if (resultRelInfo->ri_TrigDesc &&
> + resultRelInfo->ri_TrigDesc->trig_delete_instead_row)
> + return ExecProcessReturning(resultRelInfo, slot, 
> planSlot);
> 
> 
> Views cannot have before/after triggers. Once the call enters the
> INSTEAD OF trigger flow, the oldtuple is used to frame the slot if the
> RETURNING clause is present. But in the INSTEAD OF trigger case the call
> returns early, as above, so the framed old tuple is never used.
> 
> Either change the RETURNING logic for INSTEAD OF triggers, or remove
> the generation of oldtuple after the INSTEAD OF trigger call.

Have you had a chance to work on this patch to address the above review?

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-15 Thread Peter Eisentraut
On 9/12/17 15:37, Peter Eisentraut wrote:
> On 9/11/17 14:26, Peter Eisentraut wrote:
>> On 9/10/17 12:14, Noah Misch wrote:
>>> This PostgreSQL 10 open item is past due for your status update.  Kindly 
>>> send
>>> a status update within 24 hours, and include a date for your subsequent 
>>> status
>>> update.  Refer to the policy on open item ownership:
>>> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
>>
>> I'm looking into this now and will report tomorrow.
> 
> I need some community feedback on the possible solutions.  I will wait
> until Thursday.

I have posted a patch, which I will leave for comments until Sunday (so
that it could get into a tarball on Monday).

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-15 Thread Peter Eisentraut
On 9/14/17 15:47, Arseny Sher wrote:
> Peter Eisentraut  writes:
> 
>> On 9/13/17 07:00, Arseny Sher wrote:
>>> On the other hand, forbidding to execute disable sub and drop sub in one
>>> transaction makes it impossible e.g. to drop subscription in trigger
>>
>> Disabling a subscription before dropping it isn't very useful, is it?
>> So I'm not sure this is an important use case we need to optimize.  We
>> just need to prevent it from hanging.
> 
> Not quite so. Currently it is forbidden to set a subscription's slot to
> NONE on an enabled sub, and the only way to drop a sub without touching the
> slot is to run ALTER SUBSCRIPTION SET (SLOT_NAME = NONE) beforehand.
> Practically this means that we have to go through the disable sub -> alter
> sub -> drop sub pipeline if we want to leave the slot alone or would just
> like to drop the sub in a transaction block -- with a replication slot set
> the latter is impossible.

OK, good point.

Here is a simple patch that fixes this, based on my original proposal
point #4.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 0474d91ee46f5974fa2e814036b7ac92819f615c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 15 Sep 2017 09:41:55 -0400
Subject: [PATCH] Fix DROP SUBSCRIPTION hang

When ALTER SUBSCRIPTION DISABLE is run in the same transaction before
DROP SUBSCRIPTION, the latter will hang because workers will still be
running, not having seen the DISABLE committed, and DROP SUBSCRIPTION
will wait until the workers have vacated the replication origin slots.

To fix, kill the workers also if dropping a subscription that is
disabled.  This will address this specific scenario.  If the
subscription was already disabled before the current transaction, then
there shouldn't be any workers left and this won't make a difference.

Reported-by: Arseny Sher 
Discussion: 
https://www.postgresql.org/message-id/flat/87mv6av84w.fsf%40ars-thinkpad
---
 src/backend/commands/subscriptioncmds.c | 18 ++--
 src/test/subscription/t/007_ddl.pl  | 51 +
 2 files changed, 66 insertions(+), 3 deletions(-)
 create mode 100644 src/test/subscription/t/007_ddl.pl

diff --git a/src/backend/commands/subscriptioncmds.c 
b/src/backend/commands/subscriptioncmds.c
index 2ef414e084..9f85ae937e 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -816,6 +816,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool 
isTopLevel)
char   *subname;
char   *conninfo;
char   *slotname;
+   boolsubenabled;
List   *subworkers;
ListCell   *lc;
charoriginname[NAMEDATALEN];
@@ -886,6 +887,9 @@ DropSubscription(DropSubscriptionStmt *stmt, bool 
isTopLevel)
else
slotname = NULL;
 
+   /* Get subenabled */
+   subenabled = ((Form_pg_subscription) GETSTRUCT(tup))->subenabled;
+
/*
 * Since dropping a replication slot is not transactional, the 
replication
 * slot stays dropped even if the transaction rolls back.  So we cannot
@@ -910,8 +914,16 @@ DropSubscription(DropSubscriptionStmt *stmt, bool 
isTopLevel)
 
/*
 * If we are dropping the replication slot, stop all the subscription
-* workers immediately, so that the slot becomes accessible.  Otherwise
-* just schedule the stopping for the end of the transaction.
+* workers immediately, so that the slot becomes accessible.
+*
+* Also, if the subscription is disabled, stop the workers.  This is
+* necessary if the subscription was disabled in the same transaction, 
in
+* which case the workers haven't seen that yet and will still be 
running,
+* leading to hangs later when we want to drop the replication origin.  
If
+* the subscription was disabled before this transaction, then there
+* shouldn't be any workers left, so this won't make a difference.
+*
+* Otherwise just schedule the stopping for the end of the transaction.
 *
 * New workers won't be started because we hold an exclusive lock on the
 * subscription till the end of the transaction.
@@ -923,7 +935,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool 
isTopLevel)
{
LogicalRepWorker *w = (LogicalRepWorker *) lfirst(lc);
 
-   if (slotname)
+   if (slotname || !subenabled)
logicalrep_worker_stop(w->subid, w->relid);
else
logicalrep_worker_stop_at_commit(w->subid, w->relid);
diff --git a/src/test/subscription/t/007_ddl.pl 
b/src/test/subscription/t/007_ddl.pl
new file mode 100644
index 00..3f36238840
--- /dev/null
+++ b/src/test/subscription/t/007_ddl.pl
@@ -0,0 +1,51 @@

Re: [HACKERS] additional contrib test suites

2017-09-15 Thread Tom Lane
Peter Eisentraut  writes:
> So, we have one failure for chkpass on OpenBSD, because OpenBSD crypt()
> doesn't support the traditional two-character salt format.

> Option:

> - Use the resultmap features to make this an expected failure on OpenBSD.

> - Fix the module to work on OpenBSD.  This would involve making a
> platform-specific modification to use whatever advanced salt format they
> want.

> - Replace the entire module with something that does not depend on crypt().

Or (4) drop the module's regression test again.

I'd go for (1) at least as a short-term answer.  It's not clear to me
that it's worth the effort to do (2) or (3).  Also, (3) probably breaks
backwards compatibility, if there is anyone out there actually using
this datatype.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS

2017-09-15 Thread Nico Williams
On Fri, Sep 15, 2017 at 4:11 AM Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Thu, Sep 14, 2017 at 10:41 PM, Nico Williams 
> wrote:
>
>> https://github.com/twosigma/postgresql-contrib/
>>
>> https://github.com/twosigma/postgresql-contrib/blob/master/commit_trigger.sql
>>
>> https://raw.githubusercontent.com/twosigma/postgresql-contrib/master/commit_trigger.sql
>
>
> Do I understand correctly that this is an SQL implementation of COMMIT
> TRIGGER functionality, i.e. a prototype used to demonstrate it?  And if
> this prototype is well received, then you're going to write a patch for
> built-in COMMIT TRIGGER functionality.  Is that right?
>

That's the idea, yes.

Nico
-- 



Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-09-15 Thread Alexander Korotkov
On Fri, Sep 15, 2017 at 3:36 PM, Andrey Borodin 
wrote:

> > On 14 Sep 2017, at 18:42, Alexander Korotkov wrote:
> > It would be good if someone would write patch for removing useless
> compress/decompress methods from builtin and contrib GiST opclasses.
> Especially when it gives benefit in IndexOnlyScan enabling.
> If the patch is accepted, I'll either do this myself for the commitfest
> in November or charge some students from Yandex Data School to do this
> (they earn some points in their algorithms course for contributing to OSS).
>

Great!  I'm looking forward to seeing their patches in the commitfest!

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] additional contrib test suites

2017-09-15 Thread Peter Eisentraut
On 9/14/17 22:47, Peter Eisentraut wrote:
>> As for testing on more platforms, send it to the build farm?
> OK, committed, let's see.

So, we have one failure for chkpass on OpenBSD, because OpenBSD crypt()
doesn't support the traditional two-character salt format.
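(For context, the "traditional two-character salt" is the old DES-crypt
convention: crypt() is called with a 2-character salt and returns a
13-character hash.  A minimal sketch of that call pattern is below; on a
platform whose crypt() no longer supports DES, such a call may simply fail,
e.g. by returning NULL.  Header requirements vary by platform, so take the
includes as an assumption.)

#define _XOPEN_SOURCE           /* for crypt() on glibc; other systems differ */
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    /* Traditional DES-crypt usage: a two-character salt such as "ab". */
    char       *hash = crypt("secret", "ab");

    if (hash == NULL)
        printf("crypt() rejected the two-character salt\n");
    else
        printf("hash: %s\n", hash);
    return 0;
}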

Option:

- Use the resultmap features to make this an expected failure on OpenBSD.

- Fix the module to work on OpenBSD.  This would involve making a
platform-specific modification to use whatever advanced salt format they
want.

- Replace the entire module with something that does not depend on crypt().

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Log LDAP "diagnostic messages"?

2017-09-15 Thread Thomas Munro
On Fri, Sep 15, 2017 at 2:12 AM, Alvaro Herrera  wrote:
> I think the ldap_unbind() changes should be in a separate preliminary
> patch to be committed separately and backpatched.

OK, here it is split into two patches.

> The other bits looks fine, with nitpicks
>
> 1. please move the new support function to the bottom of the section
> dedicated to LDAP, and include a prototype

OK.

> 2. please wrap lines longer than 80 chars, other than error message
> strings.  (I don't know why this file plays fast & loose with
> project-wide line length rules, but I also see no reason to continue
> doing it.)

Done for all lines I touched in the patch.

Thanks for the review!

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Improve-LDAP-cleanup-code-in-error-paths.patch
Description: Binary data


0002-Log-diagnostic-messages-if-errors-occur-during-LDAP-.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-09-15 Thread Andrey Borodin
Dmitry and Alexander, thank you for looking into the patch!

> On 14 Sep 2017, at 18:42, Alexander Korotkov wrote:
> It would be good if someone would write patch for removing useless 
> compress/decompress methods from builtin and contrib GiST opclasses.  
> Especially when it gives benefit in IndexOnlyScan enabling.
If the patch is accepted, I'll either do this myself for the commitfest in
November or charge some students from Yandex Data School to do this (they earn
some points in their algorithms course for contributing to OSS).

> BTW, this change in the patch looks suspicious to me.
> 
> We typically avoid changing the formatting of code fragments not directly
> touched by the patch.

Thanks for spotting this. Here's a fixed version.

Best regards, Andrey Borodin.



0001-Allow-uncompressed-GiST-4.patch
Description: Binary data


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] no test coverage for ALTER FOREIGN DATA WRAPPER name HANDLER ...

2017-09-15 Thread Amit Langote
On Fri, Sep 15, 2017 at 9:20 PM, Robert Haas  wrote:
> On Thu, Sep 14, 2017 at 8:30 AM, Ashutosh Bapat
>  wrote:
>> LGTM. The patch applies cleanly on the current HEAD, compiles (small
>> bit in regress.c requires compilation), and make check passes. Marking
>> this as "ready for committer".
>
> Committed.

Thanks, closed in the CF app.

Regards,
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More efficient truncation of pg_stat_activity query strings

2017-09-15 Thread Robert Haas
On Thu, Sep 14, 2017 at 2:03 AM, Andres Freund  wrote:
> Here's a patch that implements that idea.

From the department of cosmetic nitpicking:

+ * All supported server-encodings allow to determine the length of a

make it possible to determine

+ * multi-byte character from it's first byte (not the case for client

its

this is not the case

+ * encodings, see GB18030). As st_activity always is stored using server

is always stored using a server

+ * pgstat_clip_activity() to trunctae correctly.

truncate

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] no test coverage for ALTER FOREIGN DATA WRAPPER name HANDLER ...

2017-09-15 Thread Robert Haas
On Thu, Sep 14, 2017 at 8:30 AM, Ashutosh Bapat
 wrote:
> LGTM. The patch applies cleanly on the current HEAD, compiles (small
> bit in regress.c requires compilation), and make check passes. Marking
> this as "ready for committer".

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More efficient truncation of pg_stat_activity query strings

2017-09-15 Thread Kuntal Ghosh
On Thu, Sep 14, 2017 at 11:33 AM, Andres Freund  wrote:
> On 2017-09-12 00:19:48 -0700, Andres Freund wrote:
>> Hi,
>>
>> I've recently seen a benchmark in which pg_mbcliplen() showed up
>> prominently. Which it will basically in any benchmark with longer query
>> strings, but fast queries. That's not that uncommon.
>>
>> I wonder if we could avoid the cost of pg_mbcliplen() from within
>> pgstat_report_activity(), by moving some of the cost to the read
>> side. pgstat values are obviously read far less frequently in nearly all
>> cases that are performance relevant.
>>
>> Therefore I wonder if we couldn't just store a querystring that's
>> essentially just a memcpy()ed prefix, and do a pg_mbcliplen() on the
>> read side.  I think that should work because all *server side* encodings
>> store character lengths in the *first* byte of a multibyte character
>> (at least one clientside encoding, gb18030, doesn't behave that way).
>>
>> That'd necessitate an added memory copy in pg_stat_get_activity(), but
>> that seems fairly harmless.
>>
>> Faults in my thinking?
>
> Here's a patch that implements that idea.  Seems to work well.  I'm a
> bit loathe to add proper regression tests for this, seems awfully
> dependent on specific track_activity_query_size settings.  I did confirm
> using gdb that I see incomplete characters before
> pgstat_clip_activity(), but not after.
>
> I've renamed st_activity to st_activity_raw to increase the likelihood
> that potential external users of st_activity notice and adapt. Increases
> the noise, but imo to a very bearable amount. Don't feel strongly
> though.
>
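(As an aside for readers following the design above: the key property is that
for server-side encodings the byte length of a character can be derived from
its first byte, so a memcpy'ed prefix can be clipped to a valid character
boundary later, at read time.  Below is a self-contained, UTF-8-only sketch of
that clipping; the real code goes through pg_mbcliplen() and the server
encoding rather than this hand-rolled helper.)

#include <stddef.h>
#include <string.h>

/* Byte length of a UTF-8 character, derived from its first byte. */
static int
utf8_char_len(unsigned char first)
{
    if (first < 0x80)
        return 1;
    if ((first & 0xE0) == 0xC0)
        return 2;
    if ((first & 0xF0) == 0xE0)
        return 3;
    if ((first & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte; treat as one byte */
}

/*
 * Copy at most "limit" bytes of "raw" into "dst" (sized limit + 1), stopping
 * at a character boundary so that no partial multibyte character survives.
 */
static void
clip_activity(char *dst, const char *raw, size_t limit)
{
    size_t      clipped = 0;

    while (raw[clipped] != '\0')
    {
        int         len = utf8_char_len((unsigned char) raw[clipped]);

        if (clipped + len > limit)
            break;
        clipped += len;
    }
    memcpy(dst, raw, clipped);
    dst[clipped] = '\0';
}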
Hello,

The patch looks good to me. I've done some regression testing with a
custom script on my local system. The script contains the following
statement:
SELECT 'aaa..' as col;

Test 1
---
duration: 300 seconds
clients/threads: 1

On HEAD TPS: 13181
+9.30% 0.27%  postgres  postgres [.] pgstat_report_activity
+8.88% 0.02%  postgres  postgres [.] pg_mbcliplen
+7.76% 4.77%  postgres  postgres [.] pg_encoding_mbcliplen
+4.06% 4.06%  postgres  postgres [.] pg_utf_mblen

With the patch TPS:13628 (+3.39%)
+0.36% 0.21%  postgres  postgres  [.] pgstat_report_activity

Test 2
---
duration: 300 seconds
clients/threads: 8

On HEAD TPS: 53084
+   12.17% 0.20%  postgres  postgres [.]
pgstat_report_activity
+   11.83% 0.02%  postgres  postgres [.] pg_mbcliplen
+   11.19% 8.03%  postgres  postgres [.] pg_encoding_mbcliplen
+3.74% 3.73%  postgres  postgres [.] pg_utf_mblen

With the patch TPS: 63949 (+20.4%)
+0.40% 0.25%  postgres  postgres  [.] pgstat_report_activity

This shows the significance of this patch in terms of the performance
improvement of pgstat_report_activity. Are there any other tests I
should run for this?


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-09-15 Thread Ashutosh Bapat
On Fri, Sep 15, 2017 at 2:09 PM, Rafia Sabih
 wrote:
> On TPC-H benchmarking of this patch, I found a regression in Q7. It
> was taking some 1500s with the patch and some 900s without the patch.
> Please find the attached pwd_reg.zip for the output of explain analyse
> on head and with patch.
>
> The experimental settings used were,
> commit-id = 0c504a80cf2e6f66df2cdea563e879bf4abd1629
> patch-version = v26
>
> Server settings:
> work_mem = 1GB
> shared_buffers = 10GB
> effective_cache_size = 10GB
> max_parallel_workers_per_gather = 4
>
> Partitioning information:
> Partitioning scheme = by range
> Number of partitions in lineitem and orders table = 106
> partition key for lineitem = l_orderkey
> partition key for orders = o_orderkey

I observe that with the partition-wise join patch the planner is using
GatherMerge along with partition-wise join, while on head it is not using
GatherMerge. Just to make sure that it is partition-wise join which is
causing the regression and not GatherMerge, can you please run the query
with enable_gathermerge = false?

I see the following lines in the explain analyze output 7_1.out without the patch:
 ->  Sort  (cost=84634030.40..84638520.55 rows=1796063
width=72) (actual time=1061001.435..1061106.608 rows=437209 loops=1)
   Sort Key: n1.n_name, n2.n_name,
(date_part('year'::text, (lineitem_001.l_shipdate)::timestamp without
time zone))
   Sort Method: quicksort  Memory: 308912kB
   ->  Hash Join  (cost=16080591.94..84447451.72
rows=1796063 width=72) (actual time=252745.701..1057447.219
rows=1749956 loops=1)
Since Sort doesn't filter any rows, we would expect it to output the
same number of rows as the hash join underneath it. But the row counts
differ in this case. I am wondering whether there's some problem with
the explain analyze output itself.

>
> Apart from these there is a regression case on a custom table, on head
> query completes in 20s and with this patch it takes 27s. Please find
> the attached .out and .sql file for the output and schema for the test
> case respectively. I have reported this case before (sometime around
> March this year) as well, but I am not sure if it was overlooked or is
> an unimportant and expected behaviour for some reason.
>

Are you talking about [1]? I have explained the regression in
[2] and [3]. This looks like an issue with the existing costing model.

[1] 
https://www.postgresql.org/message-id/caogqiimwcjnrunj_fcdbscrtlej-clp7exfzzipe2ut71n4...@mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CAFjFpRedUZPa7tKbCLEGK3u5UWdDNQoN=eyfb7ieg5d0d1p...@mail.gmail.com
[3] 
https://www.postgresql.org/message-id/cafjfprejksdcfaeuzjgd79hoetzpz5bkdxljgxr7qznrxx+...@mail.gmail.com
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

2017-09-15 Thread Daniel Gustafsson
> On 06 Sep 2017, at 09:45, Haribabu Kommi  wrote:
> 
> On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra  > wrote:
> On 7/25/17 12:55 AM, Tom Lane wrote:
> Tomas Vondra  > writes:
> It seems to me that VACUUM and ANALYZE somewhat disagree on what
> exactly reltuples means. VACUUM seems to be thinking that reltuples
> = live + dead while ANALYZE apparently believes that reltuples =
> live
> 
> The question is - which of the reltuples definitions is the right
> one? I've always assumed that "reltuples = live + dead" but perhaps
> not?
> 
> I think the planner basically assumes that reltuples is the live
> tuple count, so maybe we'd better change VACUUM to get in step.
> 
> Attached is a patch that (I think) does just that. The disagreement was 
> caused by VACUUM treating recently dead tuples as live, while ANALYZE treats 
> both of those as dead.
> 
> At first I was worried that this will negatively affect plans in the 
> long-running transaction, as it will get underestimates (due to reltuples not 
> including rows it can see). But that's a problem we already have anyway, you 
> just need to run ANALYZE in the other session.
> 
> Thanks for the patch.
> From the mail, I understand that this patch tries to improve the
> reltuples value update in the catalog table by the vacuum command
> to consider the proper visible tuples, similar to the analyze command.
> 
> - 
>  num_tuples);
> + 
>  num_tuples - nkeep);
> 
> With the above correction, there is a problem in reporting the number
> of live tuples to the stats.
> 
> postgres=# select reltuples, n_live_tup, n_dead_tup
>   from pg_stat_user_tables join pg_class using (relname)
>  where relname = 't';
>  reltuples | n_live_tup | n_dead_tup 
> ---++
> 899818 | 799636 | 100182
> (1 row)
> 
> 
> The live tuples value is decremented again by the dead tuples
> value before being sent to the stats collector in lazy_vacuum_rel():
> 
>   /* report results to the stats collector, too */
>   new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
> 
> The fix needs a correction here as well, or the correction should be made
> in the lazy_vacuum_rel() function itself before updating the catalog table,
> similar to the stats.

This patch is marked Waiting for Author; have you had a chance to look at this
and address the comments in the above review?

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Off-by-one error in logical slot resource retention

2017-09-15 Thread Daniel Gustafsson
> On 01 Sep 2017, at 14:28, Aleksander Alekseev  
> wrote:
> 
> The following review has been posted through the commitfest application:
> make installcheck-world:  not tested
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:not tested
> 
> Hi Craig,
> 
> I'm afraid patches 0002 and 0003 don't apply anymore. Could you please 
> resolve the conflicts?
> 
> The new status of this patch is: Waiting on Author

Have you had a chance to look at rebasing this patch so we can get it further
in the process?

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns

2017-09-15 Thread Thom Brown
On 14 September 2017 at 07:34, Andres Freund  wrote:
> Hi,
>
> When running workloads that include fast queries with a lot of columns,
> SendRowDescriptionMessage(), and the routines it calls, becomes a
> bottleneck.  Besides syscache lookups (see [1] and [2]) a major cost of
> that is manipulation of the StringBuffer and the version specific
> branches in the per-attribute loop.  As so often, the performance
> differential of this patch gets bigger when the other performance
> patches are applied.
>
> The issues in SendRowDescriptionMessage() are the following:
>
> 1) All the pq_sendint calls, and also the pq_sendstring, are
>expensive. The amount of calculations done for a single 2/4 byte
>addition to the stringbuffer (particularly enlargeStringInfo()) is
>problematic, as are the reallocations themselves.
>
>I addressed this by adding pq_send*_pre() wrappers, implemented as
>inline functions, that require that memory is pre-allocated.
>Combining that with doing a enlargeStringInfo() in
>SendRowDescriptionMessage() that pre-allocates the maximum required
>memory, yields pretty good speedup.
>
>I'm not yet super sure about the implementation. For one, I'm not
>sure this shouldn't instead be stringinfo.h functions, with very very
>tiny pqformat.h wrappers. But conversely I think it'd make a lot of
>sense for the pqformat integer functions to get rid of the
>continually maintained trailing null-byte - I was hoping the compiler
>could optimize that away, but alas, no luck.  As soon as a single
>integer is sent, you can't rely on 0 terminated strings anyway.
>
> 2) It creates a new StringInfo in every iteration. That causes
>noticeable memory management overhead, and restarts the size of the
>buffer every time. When the description is relatively large, that
>leads to a number of reallocations for every
>SendRowDescriptionMessage() call.
>
>My solution here was to create persistent StringInfo for
>SendRowDescriptionMessage() that never gets deallocated (just
>reset). That in combination with new versions of
>pq_beginmessage/endmessage that keep the buffer alive, yields a nice
>speedup.
>
>Currently I'm using a static variable to allocate a string buffer for
>the function. It'd probably better to manage that outside of a single
>function - I'm also wondering why we're allocating a good number of
>stringinfos in various places, rather than reuse them. Most of them
>can't be entered recursively, and even if that's a concern, we could
>have one reusable per portal or such.
>
> 3) The v2/v3 branches in the attribute loop are noticeable (others too,
>but well...).  I solved that by splitting out the v2 and v3
>per-attribute loops into separate functions.  Imo also a good chunk
>more readable.
>
> Comments?
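(A rough sketch of the kind of "pre-allocated" append described in point 1;
the names and details here are invented for illustration and are not from the
actual patch.  The idea is that the caller reserves the worst-case message
size once via enlargeStringInfo(), after which each per-attribute append can
skip the enlarge check and the trailing NUL maintenance.)

/*
 * Sketch only: assumes enlargeStringInfo(buf, total_needed) has already been
 * called for the whole message, so no bounds check is done here, and no
 * trailing '\0' is maintained (as discussed above).
 */
static inline void
pq_sendint32_pre(StringInfo buf, uint32 value)
{
    uint32      n32 = htonl(value);

    memcpy(buf->data + buf->len, &n32, sizeof(n32));
    buf->len += sizeof(n32);
}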

I've run a fairly basic test with a table with 101 columns, selecting
a single row from the table and I get the following results:


Columns with 1-character names:

master (80 jobs, 80 connections, 60 seconds):

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 559275
latency average = 8.596 ms
tps = 9306.984058 (including connections establishing)
tps = 11144.622096 (excluding connections establishing)


patched:

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 585526
latency average = 8.210 ms
tps = 9744.240847 (including connections establishing)
tps = 11454.339301 (excluding connections establishing)


master (80 jobs, 80 connections, 120 seconds):

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 1108312
latency average = 8.668 ms
tps = 9229.356994 (including connections establishing)
tps = 9987.385281 (excluding connections establishing)


patched:

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 120 s
number of transactions actually processed: 1130313
latency average = 8.499 ms
tps = 9412.904876 (including connections establishing)
tps = 10319.404302 (excluding connections establishing)



Columns with at least 42-character names:

master (80 jobs, 80 connections, 60 seconds):

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 60 s
number of transactions actually processed: 197815
latency average = 24.308 ms
tps = 3291.157486 (including connections establishing)
tps = 3825.309489 (excluding connections establishing)



patched:

transaction type: /tmp/test.sql
scaling factor: 1
query mode: simple
number of clients: 80
number of threads: 80
duration: 

Re: [HACKERS] Patch: add --if-exists to pg_recvlogical

2017-09-15 Thread Daniel Gustafsson
> On 09 Sep 2017, at 06:05, Thomas Munro  wrote:
> 
> On Sat, Sep 2, 2017 at 5:22 AM, Peter Eisentraut
>  wrote:
>>  
>> +  --if-exists
>> +  
>> +   
>> +Do not error out when --drop-slot or
>> --start are
>> +specified and a slot with the specified name does not exist.
>> +   
>> +  
>> + 
>> 
>> I understand the --drop-slot part.  But I don't understand what it means
>> to ignore a missing replication slot when running --start.
> 
> Also "--start" breaks the documentation build (missing
> slash on the closing tag).

This patch is "Waiting for Author” due to the above review comments from Peter
and Thomas.  Do you think you will have time to address these shortly so we can
move this patch further in the process?

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] path toward faster partition pruning

2017-09-15 Thread Amit Langote
On 2017/09/15 11:16, Amit Langote wrote:
> I will post rebased patches later today, although I think the overall
> design of the patch on the planner side of things is not quite there yet.
> Of course, your and others' feedback is greatly welcome.

Rebased patches attached.  Because Dilip complained earlier today about
clauses of the form (const op var) not causing partition-pruning, I've
added code to commute the clause where it is required.  Some other
previously mentioned limitations remain -- no handling of OR clauses, no
elimination of redundant clauses for given partitioning column, etc.
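(For anyone wondering what "commute the clause" amounts to here: roughly,
look up the operator's commutator and swap the arguments so that later
matching against the partition key only has to handle the "var on the left"
shape.  The sketch below is only an illustration of that idea, not the code
in the attached patches.)

/* Illustration only: turn "const op var" into "var op' const". */
static OpExpr *
commute_opclause(OpExpr *opclause)
{
    Oid         commutator = get_commutator(opclause->opno);
    OpExpr     *result;

    if (!OidIsValid(commutator))
        return NULL;            /* operator has no commutator; give up */

    result = (OpExpr *) copyObject(opclause);
    result->opno = commutator;
    result->opfuncid = InvalidOid;      /* let it be looked up again later */
    /* swap the two arguments */
    result->args = list_make2(lsecond(result->args),
                              linitial(result->args));
    return result;
}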

A note about 0001: this patch overlaps with
0003-Canonical-partition-scheme.patch from the partitionwise-join patch
series that Ashutosh Bapat posted yesterday [1].  Because I implemented
the planner-portion of this patch based on what 0001 builds, I'm posting
it here.  It might actually turn out that we will review and commit
0003-Canonical-partition-scheme.patch on that thread, but meanwhile apply
0001 if you want to play with the later patches.  I would certainly like
to review  0003-Canonical-partition-scheme.patch myself, but won't be able
to immediately (see below).

> Also, I must inform all of those who're looking at this thread that I
> won't be able to respond to emails from tomorrow (9/16, Sat) until 9/23,
> Sat, due to some personal business.

To remind.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAFiTN-skmaqeCVaoAHCBqe2DyfO3f6sgdtEjHWrUgi0kV1yPLQ%40mail.gmail.com
From ff9ccd8df6555cfca31e54e22293ac1613db327c Mon Sep 17 00:00:00 2001
From: amit 
Date: Wed, 13 Sep 2017 18:24:55 +0900
Subject: [PATCH 1/5] Some optimizer data structures for partitioned rels

Nobody uses it though.
---
 src/backend/optimizer/util/plancat.c | 120 +++
 src/backend/optimizer/util/relnode.c |  20 ++
 src/include/nodes/nodes.h|   1 +
 src/include/nodes/relation.h | 135 +++
 src/include/optimizer/plancat.h  |   2 +
 5 files changed, 278 insertions(+)

diff --git a/src/backend/optimizer/util/plancat.c 
b/src/backend/optimizer/util/plancat.c
index a1ebd4acc8..f7e3a1df5f 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -68,6 +68,8 @@ static List *get_relation_constraints(PlannerInfo *root,
 static List *build_index_tlist(PlannerInfo *root, IndexOptInfo *index,
  Relation heapRelation);
 static List *get_relation_statistics(RelOptInfo *rel, Relation relation);
+static void get_relation_partition_info(PlannerInfo *root, RelOptInfo *rel,
+   Relation relation);
 
 /*
  * get_relation_info -
@@ -420,6 +422,10 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, 
bool inhparent,
/* Collect info about relation's foreign keys, if relevant */
get_relation_foreign_keys(root, rel, relation, inhparent);
 
+   /* Collect partitioning info, if relevant. */
+   if (relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+   get_relation_partition_info(root, rel, relation);
+
heap_close(relation, NoLock);
 
/*
@@ -1802,3 +1808,117 @@ has_row_triggers(PlannerInfo *root, Index rti, CmdType 
event)
heap_close(relation, NoLock);
return result;
 }
+
+static void
+get_relation_partition_info(PlannerInfo *root, RelOptInfo *rel,
+   Relation relation)
+{
+   int i;
+   ListCell *l;
+   PartitionKey key = RelationGetPartitionKey(relation);
+   PartitionDesc partdesc = RelationGetPartitionDesc(relation);
+
+   rel->part_scheme = find_partition_scheme(root, relation);
+   rel->partexprs = (List **) palloc0(key->partnatts * sizeof(List *));
+
+   l = list_head(key->partexprs);
+   for (i = 0; i < key->partnatts; i++)
+   {
+   Expr *keyCol;
+
+   if (key->partattrs[i] != 0)
+   {
+   keyCol = (Expr *) makeVar(rel->relid,
+ 
key->partattrs[i],
+ 
key->parttypid[i],
+ 
key->parttypmod[i],
+ 
key->parttypcoll[i],
+ 0);
+   }
+   else
+   {
+   if (l == NULL)
+   elog(ERROR, "wrong number of partition key 
expressions");
+   keyCol = (Expr *) copyObject(lfirst(l));
+   l = lnext(l);
+   }
+
+   rel->partexprs[i] = list_make1(keyCol);
+   }
+
+   /* Values are filled in build_simple_rel(). */
+   

Re: [HACKERS] [PATCH] Improve geometric types

2017-09-15 Thread Kyotaro HORIGUCHI
Hello, just one point on 0001.

The patch replaces pg_hypot with hypot from libc. The man page says
the following:

man 3 hypot
>   If the result overflows, a range error occurs, and the functions return
>   HUGE_VAL, HUGE_VALF, or HUGE_VALL, respectively.
..
>ERRORS
>   See math_error(7) for information on how to determine whether an  error
>   has occurred when calling these functions.
>
>   The following errors can occur:
>
>   Range error: result overflow
>  errno  is  set  to ERANGE.  An overflow floating-point exception
>  (FE_OVERFLOW) is raised.
>
>   Range error: result underflow
>  An underflow floating-point exception (FE_UNDERFLOW) is raised.
>
>  These functions do not set errno for this case.

So the code seems to need some amendments to follow this
spec.
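(A minimal sketch of one possible amendment: clear errno before the call and
check for the overflow case afterwards.  This assumes a backend context, i.e.
postgres.h for ereport(); the error message and the exact check are just
examples, not a proposal for the final wording.)

#include <errno.h>
#include <math.h>

static double
checked_hypot(double x, double y)
{
    double      result;

    errno = 0;
    result = hypot(x, y);

    /* overflow: hypot() returns HUGE_VAL and sets errno to ERANGE */
    if (errno == ERANGE && isinf(result))
        ereport(ERROR,
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                 errmsg("value out of range: overflow")));

    return result;
}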

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Improve geometric types

2017-09-15 Thread Kyotaro HORIGUCHI
At Fri, 15 Sep 2017 17:23:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170915.172328.97446299.horiguchi.kyot...@lab.ntt.co.jp>
> At Thu, 14 Sep 2017 16:19:13 -0400, Robert Haas  wrote 
> in 
> > On Thu, Sep 14, 2017 at 3:33 AM, Kyotaro HORIGUCHI
> >  wrote:
> > > I recall a bit about the double-evaluation hazards. I think the
> > > functions need a comment describing the reasons so that no well-meaning
> > > person will try to merge them into a macro again.
> > 
> > I think we can count on PostgreSQL developers to understand the
> > advantages of an inline function over a macro.  Even if they don't,
> > the solution can't be to put a comment in every place where an inline
> > function is used explaining it.  That would be very repetitive.
> 
> Of course, putting such a comment on every inline function would be
> silly. The point here is that many pairs of functions with exactly the
> same shape but handling different types are defined side by side. Such a
> situation makes it tempting to merge them into single macros, as the
> previous author did there.
> 
> So a simple one like the following would be enough.
> 
> /* don't merge the following same functions with different types
>into single macros so that double evaluation won't happen */
> 
> Is it still too verbose?

That being said, I don't insist on it if Robert or others think
it is needless.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Improve geometric types

2017-09-15 Thread Kyotaro HORIGUCHI
At Thu, 14 Sep 2017 16:19:13 -0400, Robert Haas  wrote 
in 
> On Thu, Sep 14, 2017 at 3:33 AM, Kyotaro HORIGUCHI
>  wrote:
> > I recall a bit about the double-evaluation hazards. I think the
> > functions need a comment describing the reasons so that no well-meaning
> > person will try to merge them into a macro again.
> 
> I think we can count on PostgreSQL developers to understand the
> advantages of an inline function over a macro.  Even if they don't,
> the solution can't be to put a comment in every place where an inline
> function is used explaining it.  That would be very repetitive.

Of course, putting such a comment on every inline function would be
silly. The point here is that many pairs of functions with exactly the
same shape but handling different types are defined side by side. Such a
situation makes it tempting to merge them into single macros, as the
previous author did there.

So a simple one like the following would be enough.

/* don't merge the following same functions with different types
   into single macros so that double evaluation won't happen */

Is it still too verbose?
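(For readers who haven't followed the earlier discussion, a tiny illustration
of the hazard the proposed comment is about, with made-up names: the macro
evaluates its arguments twice, the inline function does not.)

/* Macro version: the argument expressions are evaluated twice. */
#define FLOAT8_MAX_BAD(a, b)    ((a) > (b) ? (a) : (b))

/* Inline-function version: each argument is evaluated exactly once. */
static inline double
float8_max_good(double a, double b)
{
    return (a > b) ? a : b;
}

/*
 * With the macro, a call such as
 *     m = FLOAT8_MAX_BAD(next_value(), 0.0);
 * invokes next_value() twice whenever its result is greater than 0.0,
 * which is exactly what merging the inline functions back into macros
 * would reintroduce.
 */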

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surjective functional indexes

2017-09-15 Thread Konstantin Knizhnik



On 14.09.2017 18:53, Simon Riggs wrote:



This works by looking at overall stats, and only looks at the overall
HOT %, so it's too heavyweight and coarse.

I suggested storing stat info on the relcache and was expecting you
would look at how often the expression evaluates to new == old. If we
evaluate new against old many times, then if the success rate is low
we should stop attempting the comparison. (<10%?)

Another idea:
If we don't make a check when we should have, then we will get a
non-HOT update, so we waste the extra time difference between a HOT
and a non-HOT update. If we check and fail, we waste the time taken to
perform the check. So the question is how expensive the check is against
how expensive a non-HOT update is. Could we simply say we don't bother to
check functions that have a cost higher than 1? So if the user
doesn't want to perform the check they can just increase the cost of
the function above the check threshold?


Attached please find one more patch which calculates the hot update check hit
rate more precisely: I have extended PgStat_StatTabEntry with two new
fields:
hot_update_hits and hot_update_misses.

It's not going to work, as already mentioned above. Those stats are at
table level and have very little to do with this particular index.

But you've not commented on the design I mentioned that can work: index relcache.

Sorry, I do not completely agree with you.
Yes, certainly whether a functional index is projective or not is a property
of the index, not of the table.
But the decision whether a hot update is applicable or not is made for the
whole table - for all indexes.
If the value of just one indexed expression is changed then we cannot
use a hot update and have to update all indexes.


Assume that we have a table with a "bookinfo" field of type JSONB,
and we create several functional indexes on this column:
(bookinfo->'isbn'), (bookinfo->'title'), (bookinfo->'author'),
(bookinfo->'rating').
The probability that the indexed expression is changed when the
"bookinfo" field is updated may be different for each of these indexes.
But there is no sense in checking whether 'isbn' has changed if we have
already detected that most updates change the 'rating' attribute, so that
comparing the old and new values of (bookinfo->'rating') is just a waste of
time. In that case we should not compare (bookinfo->'isbn') and the
other indexed expressions either, because the possibility of a hot update
has already been rejected.


So despite the fact that this information depends on a particular
index, it affects the behavior of the whole table, and it is reasonable (and
simpler) to collect it in the table's statistics.



Concerning your idea to check the cost of the index function: it certainly
makes sense.
The only problem is that I do not understand how to calculate this cost.
It can easily be calculated by the optimizer when it is building the query
execution plan.
But inside BuildIndexInfo I have just a reference to the Relation and have
no idea how I can propagate the index expression cost from the optimizer to
this place.

We could copy at create index, if we took that route. Or we can look
up the cost for the index expression and cache it.


Anyway, this is just jumping around because we still have a parameter
and the idea was to remove the parameter entirely by autotuning, which
I think is both useful and possible, just as HOT itself is autotuned.



A hot update is in almost all cases preferable to a normal update, which
causes an update of the indexes.
There can be some scenarios when hot updates reduce the speed of some
queries, but it is very difficult to predict such cases at the user level.

Usually the nature of an index is well known to the DBA or programmer. In
almost all cases it is clear to the person creating a functional index
whether it will perform a projection or not, and whether comparing the
old/new expression values makes sense or is just a waste of time. We can
guess it with autotuning, but such a decision may be wrong (simply because
of the application's business logic). Postgres indexes already have a lot
of options, and I think that a "projection" option (or whatever we name it)
is also needed.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS

2017-09-15 Thread Alexander Korotkov
On Thu, Sep 14, 2017 at 10:41 PM, Nico Williams 
wrote:

> https://github.com/twosigma/postgresql-contrib/
> https://github.com/twosigma/postgresql-contrib/blob/
> master/commit_trigger.sql
> https://raw.githubusercontent.com/twosigma/postgresql-
> contrib/master/commit_trigger.sql


Do I understand correctly that this is an SQL implementation of COMMIT TRIGGER
functionality, i.e. a prototype used to demonstrate it?  And if this
prototype is well received, then you're going to write a patch for built-in
COMMIT TRIGGER functionality.  Is that right?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Re: [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-09-15 Thread Tsunakawa, Takayuki
Hello Tom, Michael,
Robert, Noah

From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> On Thu, Sep 14, 2017 at 3:23 AM, Tsunakawa, Takayuki
>  wrote:
> > Sorry again, but how can we handle this?  A non-PG-developer, Tels (and
> possibly someone else, IIRC) claimed that the behavior be changed during
> the beta period.  Why should we do nothing?
> 
> Because we do not have consensus on changing it.  I've decided that you're
> right, but several other people are saying "no".  I can't just go change
> it in the face of objections.

How are things decided in cases like this?  Does the RMT team usually do some 
kind of poll?

So far, there are four proponents (Tels (non-PG-developer), David J., Robert 
and me), and two opponents (Tom and Michael).

Regards
Takayuki Tsunakawa


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-15 Thread Amit Langote
On 2017/09/14 16:00, Michael Paquier wrote:
> On Wed, Sep 13, 2017 at 4:43 PM, Amit Langote
>  wrote:
>> Sure, no problem.
> 
> OK, I took a closer look at all patches, but did not run any manual
> tests to test the compression except some stuff with
> wal_consistency_checking.

Thanks for the review.

> +   if (opaque->flags & GIN_DELETED)
> +   mask_page_content(page);
> +   else if (pagehdr->pd_lower != 0)
> +   mask_unused_space(page);
> [...]
> +   /* Mask the unused space, provided the page's pd_lower is set. */
> +   if (pagehdr->pd_lower != 0)
> mask_unused_space(page);
>
> For the masking functions, shouldn't those check use (pd_lower >
> SizeOfPageHeaderData) instead of 0? pd_lower gets set to a non-zero
> value on HEAD, so you would apply the masking even if the meta page is
> upgraded from an instance that did not enforce the value of pd_lower
> later on. Those conditions also definitely need comments. That will be
> a good reminder as to why it needs to be kept.

Agreed, done.

> +*
> +* This won't be of any help unless we use option like REGBUF_STANDARD
> +* while registering the buffer for metapage as otherwise, it won't try to
> +* remove the hole while logging the full page image.
>  */
> This comment is in the btree code. But you actually add
> REGBUF_STANDARD. So I think that this could be just removed.
> 
>  * Set pd_lower just past the end of the metadata.  This is not essential
> -* but it makes the page look compressible to xlog.c.
> +* but it makes the page look compressible to xlog.c.  See
> +* _bt_initmetapage.
> This reference could be removed as well as _bt_initmetapage does not
> provide any information, the existing comment is wrong without your
> patch, and then becomes right with this patch.

Amit K's reply may have addressed these comments.

> After that I have spotted a couple of places for btree, hash and
> SpGist where the updates of pd_lower are not happening. Let's keep in
> mind that updated meta pages could come from an upgraded version, so
> we need to be careful here about all code paths updating meta pages,
> and WAL-logging them.
>
> It seems to me that an update of pd_lower is missing in _bt_getroot(),
> just before marking the buffer as dirty I think. And there is a second
> in _bt_unlink_halfdead_page(), a third in _bt_insertonpg() and finally
> one in _bt_newroot().

Amit K's reply about btree and hash should've resolved any doubts for
those index types.  About SP-Gist, see the comment below.

> For SpGist, I think that there are two missing: spgbuild() and spgGetCache().

spgbuild() calls SpGistInitMetapage() before marking the metapage buffer
dirty.  The latter already sets pd_lower correctly, so we don't need to do
it explicitly in spgbuild() itself.

spgGetCache() doesn't write the metapage, only reads it:

/* Last, get the lastUsedPages data from the metapage */
metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
LockBuffer(metabuffer, BUFFER_LOCK_SHARE);

metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));

if (metadata->magicNumber != SPGIST_MAGIC_NUMBER)
elog(ERROR, "index \"%s\" is not an SP-GiST index",
 RelationGetRelationName(index));

cache->lastUsedPages = metadata->lastUsedPages;

UnlockReleaseBuffer(metabuffer);

So, I think it won't be correct to set pd_lower here, no?

> For hash, hashbulkdelete(), _hash_vacuum_one_page(),
> _hash_addovflpage(), _hash_freeovflpage() and _hash_doinsert() are
> missing the shot, no? We could have a meta page of a hash index
> upgraded from PG10.

Amit K's reply. :)


Updated patch attached, which implements your suggested changes to the
masking functions.

By the way, as I noted on another unrelated thread, I will not be able to
respond to emails from tomorrow until 9/23.

Thanks,
Amit
From 6741334ce5f3261b5e5caffa3914d4e1485fe5d8 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 23 Jun 2017 11:20:41 +0900
Subject: [PATCH 1/3] Set pd_lower correctly in the GIN metapage.

Also tell xlog.c to treat the metapage like a standard page, so any
hole in it is compressed.
---
 src/backend/access/gin/ginfast.c   | 22 --
 src/backend/access/gin/gininsert.c |  4 ++--
 src/backend/access/gin/ginutil.c   | 19 ++-
 src/backend/access/gin/ginxlog.c   | 25 ++---
 4 files changed, 50 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 59e435465a..d96529cf72 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -399,6 +399,15 @@ ginHeapTupleFastInsert(GinState *ginstate, 
GinTupleCollector *collector)
/*
 * Write metabuffer, make xlog entry
 */
+
+   /*
+* Set pd_lower just past the end of the metadata.  This is not 
essential
+* but it 

Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT

2017-09-15 Thread Amit Langote
On 2017/09/15 15:36, Amit Langote wrote:
> The fact that
> parent is locked after the child and with ShareUpdateExclusiveLock instead
> of AccessExclusiveLock, we observe this race condition when SELECTing from
> the parent.

Oops, I meant "parent table is locked with AccessShareLock instead of
AccessExclusiveLock..."

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT

2017-09-15 Thread Amit Langote
Hi.

On 2017/08/28 18:28, Kyotaro HORIGUCHI wrote:
> << the following is another topic >>
> 
>>>> BTW, in the partitioned table case, the parent is always locked first
>>>> using an AccessExclusiveLock.  There are other considerations in that case
>>>> such as needing to recreate the partition descriptor upon termination of
>>>> inheritance (both the DETACH PARTITION and also DROP TABLE child cases).
>>>
>>> Apart from the degree of concurrency, if we keep parent->children
>>> order of locking, such recreation does not seem to be
>>> needed. Maybe I'm missing something.
>>
>> Sorry to have introduced that topic in this thread, but I will try to
>> explain anyway why things are the way they are currently:
>>
>> Once a table is no longer a partition of the parent (detached or dropped),
>> we must make sure that the next commands in the transaction don't see it
>> as one.  That information is currently present in the relcache
>> (rd_partdesc), which is used by a few callers, most notably the
>> tuple-routing code.  Next commands must recreate the entry so that the
>> correct thing happens based on the updated information.  More precisely,
>> we must invalidate the current entry.  RelationClearRelation() will either
>> delete the entry or rebuild it.  If it's being referenced somewhere, it
>> will be rebuilt.  The place holding the reference may also be looking at
>> the content of rd_partdesc, which we don't require them to make a copy of,
>> so we must preserve its content while other fields of RelationData are
>> being read anew from the catalog.  We don't have to preserve it if there
>> has been any change (partition added/dropped), but to make such a change
>> one would need to take a strong enough lock on the relation (parent).  We
>> assume here that anyone who wants to reference rd_partdesc takes at least
>> AccessShareLock lock on the relation, and anyone who wants to change its
>> content must take a lock that will conflict with it, so
>> AccessExclusiveLock.  Note that in all of this, we are only talking about
>> one relation, that is the parent, so parent -> child ordering of taking
>> locks may be irrelevant.
> 
> I think I understand this, anyway DropInherit and DropPartition
> is different-but-at-the-same-level operations so surely needs
> amendment for drop/detach cases. Is there already a solution? Or
> reproducing steps?

Sorry, I think I forgot to reply to this.  Since you seem to have chosen
the other solution (checking that child is still a child), maybe this
reply is a bit too late, but anyway.

DropInherit or NO INHERIT is seen primarily as changing a child table's
(which is the target table of the command) property that it is no longer a
child of the parent, so we lock the child table to block concurrent
operations from considering it a child of the parent anymore.  Because the
parent is locked after the child and with ShareUpdateExclusiveLock instead
of AccessExclusiveLock, we observe this race condition when SELECTing from
the parent.

DropPartition or DETACH PARTITION is seen primarily as changing the parent
table's (which is the target table of the command) property that one of
the partitions is removed, so we lock the parent.   Any concurrent
operations that rely on the parent's relcache to get the partition list
will wait for the session that is dropping the partition to finish, so
that they get the fresh information from the relcache (or more importantly
do not end up with information obtained from the relcache going invalid
under them without notice).  Note that the lock on the partition/child is
also present and it plays more or less the same role as it does in the
DropInherit case, but due to the different order of locking, the reported
race condition does not occur between SELECT on the partitioned table and
DROP/DETACH PARTITION.
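
To make the contrast concrete, here is a rough sketch of the two lock
orderings described above (function names and call sites are illustrative,
not the actual tablecmds.c code; the parent's lock level in the NO INHERIT
case is shown as AccessShareLock, per the correction posted separately):

#include "postgres.h"
#include "access/heapam.h"
#include "storage/lockdefs.h"
#include "utils/rel.h"

/* ALTER TABLE child NO INHERIT parent: child locked first, parent weakly. */
static void
sketch_no_inherit(Oid childOid, Oid parentOid)
{
	Relation	child = heap_open(childOid, AccessExclusiveLock);
	Relation	parent = heap_open(parentOid, AccessShareLock);

	/* ... delete the pg_inherits entry linking child to parent ... */

	heap_close(parent, NoLock);		/* keep locks until commit */
	heap_close(child, NoLock);
}

/* ALTER TABLE parent DETACH PARTITION child: parent locked first, strongly. */
static void
sketch_detach_partition(Oid parentOid, Oid childOid)
{
	Relation	parent = heap_open(parentOid, AccessExclusiveLock);
	Relation	child = heap_open(childOid, AccessExclusiveLock);

	/* ... remove the partition bound; parent's rd_partdesc gets invalidated ... */

	heap_close(child, NoLock);
	heap_close(parent, NoLock);
}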


By the way, I will take a look at your patch when I come back from the
vacation.  Meanwhile, I noticed that it needs another rebase after
0a480502b092 [1].

Thanks,
Amit

[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0a480502b092



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-09-15 Thread Douglas Doole
Ah. Sorry I missed them - I'll give them a look. (Won't be able to get to
it until Saturday though.)
On Thu, Sep 14, 2017 at 10:06 PM Oliver Ford  wrote:

> I'll fix the brace, but there are two other patches in the first email for
> tests and docs. For some reason the commitfest app didn't pick them up.
>
> On Friday, 15 September 2017, Doug Doole  wrote:
>
>> The following review has been posted through the commitfest application:
>> make installcheck-world:  tested, passed
>> Implements feature:       tested, passed
>> Spec compliant:           not tested
>> Documentation:            not tested
>>
>> Code looks fine, but one niggly complaint at line 146 of the patch file
>> ("while (*cp) {"). A K style brace slipped in, which doesn't match the
>> formatting of the file.
>>
>> Given that this is providing new formatting options, there should be new
>> tests added that validate the options and error handling.
>>
>> There's also the "do we want this?" debate from the discussion thread
>> that still needs to be resolved. (I don't have an opinion either way.)
>>
>> I'm sending this back to the author to address the first two issues.
>>
>> The new status of this patch is: Waiting on Author
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>


Re: [HACKERS] [POC] hash partitioning

2017-09-15 Thread amul sul
On Fri, Sep 15, 2017 at 4:30 AM, Thom Brown  wrote:

> On 14 September 2017 at 09:58, amul sul  wrote:
> > On Wed, Sep 13, 2017 at 7:43 PM, Jesper Pedersen
> >  wrote:
> >>
> >> Hi Amul,
> >>
> >> On 09/08/2017 08:40 AM, amul sul wrote:
> >>>
> >>> Rebased 0002 against this commit & renamed to 0001, PFA.
> >>>
> >>
> >> This patch needs a rebase.
> >>
> >
> > Thanks for your note.
> > Attached is the patch rebased on the latest master head.
> > Also added error on creating default partition for the hash partitioned
> > table, and updated document & test script for the same.
>
> Sorry, but this needs another rebase as it's broken by commit
> 77b6b5e9ceca04dbd6f0f6cd3fc881519acc8714.
>
Attached rebased patch, thanks.

Regards,
Amul


0001-hash-partitioning_another_design-v20.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] utility commands benefiting from parallel plan

2017-09-15 Thread Haribabu Kommi
On Thu, Sep 14, 2017 at 2:42 PM, Rafia Sabih  wrote:

> On Wed, Sep 13, 2017 at 2:29 PM, Haribabu Kommi
>  wrote:
> >
> >
> > On Wed, Sep 13, 2017 at 4:17 PM, Rafia Sabih <
> rafia.sa...@enterprisedb.com>
> > wrote:
> >>
> >> On Fri, Sep 1, 2017 at 12:31 PM, Haribabu Kommi
> >>  wrote:
> >> >
> >> > Hi All,
> >> >
> >> > Attached a rebased patch that supports parallelism for the queries
> >> > that are underneath of some utility commands such as CREATE TABLE AS
> >> > and CREATE MATERIALIZED VIEW.
> >> >
> >> > Note: This patch doesn't make the utility statement (insert operation)
> >> > to run in parallel. It only allows the select query to be parallel if
> >> > the
> >> > query
> >> > is eligible for parallel.
> >> >
> >>
> >> Here is my feedback for this patch:
> >>
> >> - The patch is working as expected, all regression tests are passing
> >
> >
> > Thanks for the review.
> >
> >>
> >> - I agree with Dilip that having a similar mechanism for 'insert into
> >> select ...' statements would add more value to the patch, but even so
> >> this looks like a good idea to extend parallelism to at least a few of
> >> the write operations
> >
> >
> > Yes, I also agree that supporting 'insert into select' will provide more
> > benefit. I already tried to support the same in [1], but it has many
> > drawbacks, especially with triggers. To support proper parallelism for
> > DML queries, I feel the logic of ParallelMode needs an update to avoid
> > the errors from the PreventCommandIfParallelMode() function, to identify
> > whether it is a nested query operation that should execute only in the
> > backend, etc.
> >
> > As the current patch falls into the DDL category that benefits from
> > parallel query, I didn't add the 'insert into select' support to this
> > patch; it provides a benefit even without it.
> > I will work on DML write support with parallel query as a separate patch.
> >
> Sounds sensible. In that case, I'll be marking this patch ready for
> committer.


Thanks for the review.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Optimise default partition scanning while adding new partition

2017-09-15 Thread Amit Langote
On 2017/09/15 0:59, Robert Haas wrote:
> On Thu, Sep 14, 2017 at 4:03 AM, Jeevan Ladhe
>  wrote:
>> Thanks Amit for reviewing.
>>> Patch looks fine to me.  By the way, why don't we just say "Can we skip
>>> scanning part_rel?" in the comment before the newly added call to
>>> PartConstraintImpliedByRelConstraint()?  We don't need to repeat the
>>> explanation of what it does at the every place we call it.
>>
>> I have changed the comment.
>> PFA.
> 
> I'm probably missing something stupid, but why does this happen?
> 
>  ALTER TABLE list_parted2 ATTACH PARTITION part_7 FOR VALUES IN (7);
> -INFO:  partition constraint for table "list_parted2_def" is implied by existing constraints
>  ERROR:  partition constraint is violated by some row
> 
> Based on the regression test changes made up higher in the file, I'd
> expect that line to be replaced by two lines, one for
> list_parted2_def_p1 and another for list_parted2_def_p2, because at
> this point, list_parted2_def is a partitioned table with those two
> children, and they seem to have appropriate constraints.
> 
> list_parted2_def_p1 has
>  Check constraints:
>  "check_a" CHECK (a = ANY (ARRAY[21, 22]))
> 
> list_parted2_def_p2 has
>  Check constraints:
>  "check_a" CHECK (a = ANY (ARRAY[31, 32]))
> 
> Well, if a is 21 or 22 for the first, or 31 or 32 for the second, then
> it's not 7.  Or so I would think.

ISTM, Jeevan's patch modifies check_default_allows_bound(), which only
DefineRelation() calls as things stand today.  IOW, it doesn't handle the
ATTACH PARTITION case, so you're not seeing the lines for
list_parted2_def_p1 and list_parted2_def_p2, because ATExecAttachPartition
doesn't yet know how to skip the validation scan for default partition's
children.

I wonder if we should call check_default_allows_bound() from
ATExecAttachPartition(), too, instead of validating updated default
partition constraint using ValidatePartitionConstraints()?  That is, call
the latter only to validate the partition constraint of the table being
attached and call check_default_allows_bound() to validate the updated
default partition constraint.  That way, INFO/ERROR messages related to
default partition constraint are consistent across the board.  I hacked
that up in the attached 0001.  The patch also modifies the INFO message
that check_default_allows_bound() emits when the scan is skipped to make
it apparent that a default partition is involved.  (Sorry, this should've
been a review comment I should've posted before the default partition
patch was committed.)
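
To illustrate the idea (this is a simplified sketch, not the literal 0001
hunk; variable names such as defaultPartOid, rel and cmd are assumed from
the surrounding ATExecAttachPartition() code), the change amounts to
something like:

	if (OidIsValid(defaultPartOid))
	{
		Relation	default_rel = heap_open(defaultPartOid, AccessExclusiveLock);

		/*
		 * Scan the default partition and its children (or skip scanning
		 * where existing constraints make that unnecessary), emitting the
		 * same INFO/ERROR messages as DefineRelation() does.
		 */
		check_default_allows_bound(rel, default_rel, cmd->bound);

		heap_close(default_rel, NoLock);
	}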

Then apply 0002, which is Jeevan's patch.  With 0001 in place, Robert's
complaint is taken care of.

Then comes 0003, which is my patch to teach ValidatePartitionConstraints()
to skip child table scans using their existing constraint.
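
Roughly, 0003 adds a check like the following inside
ValidatePartitionConstraints()'s loop over the child tables (a sketch with
variable names assumed from that code, not the literal patch):

	foreach(lc, scanrel_children)
	{
		Oid			part_relid = lfirst_oid(lc);
		Relation	part_rel = heap_open(part_relid, NoLock);	/* already locked */

		/* Skip the validation scan if existing constraints already imply it. */
		if (PartConstraintImpliedByRelConstraint(part_rel, partConstraint))
		{
			ereport(INFO,
					(errmsg("partition constraint for table \"%s\" is implied by existing constraints",
							RelationGetRelationName(part_rel))));
			heap_close(part_rel, NoLock);
			continue;
		}

		/* ... otherwise queue the usual validation scan for part_rel ... */
		heap_close(part_rel, NoLock);
	}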

Thanks,
Amit
From 00765f2166c78902d0c92ed9d1a4dea033a50a12 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 15 Sep 2017 14:30:36 +0900
Subject: [PATCH 1/3] Teach ATExecAttachPartition to use
 check_default_allows_bound

Currently it schedules validation scan of the default partition
in the same way as it schedules the scan of the partition being
attached.  Instead, call check_default_allows_bound() to do the
scanning, so the handling is symmetric with DefineRelation().

Also, update the INFO message in check_default_allows_bound(), so
that it's clear from the message that the default partition is
involved.
---
 src/backend/catalog/partition.c   |  8 ++-
 src/backend/commands/tablecmds.c  | 40 ++-
 src/test/regress/expected/alter_table.out | 12 +-
 3 files changed, 20 insertions(+), 40 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 1ab6dba7ae..027b98d850 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -912,15 +912,11 @@ check_default_allows_bound(Relation parent, Relation default_rel,
 	def_part_constraints =
 		get_proposed_default_constraint(new_part_constraints);
 
-	/*
-	 * If the existing constraints on the default partition imply that it will
-	 * not contain any row that would belong to the new partition, we can
-	 * avoid scanning the default partition.
-	 */
+	/* Can we skip scanning default_rel? */
 	if (PartConstraintImpliedByRelConstraint(default_rel, def_part_constraints))
 	{
 		ereport(INFO,
-				(errmsg("partition constraint for table \"%s\" is implied by existing constraints",
+				(errmsg("updated partition constraint for default partition \"%s\" is implied by existing constraints",
 						RelationGetRelationName(default_rel))));
 		return;
 	}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 563bcda30c..ed4a2092b3 100644