Re: [HACKERS] Change GetLastImportantRecPtr's definition? (was Skip checkpoints, archiving on idle systems.)

2017-05-04 Thread Amit Kapila
On Fri, May 5, 2017 at 6:54 AM, Andres Freund  wrote:
> Hi,
>
> On 2016-12-22 19:33:30 +, Andres Freund wrote:
>> Skip checkpoints, archiving on idle systems.
>
> As part of an independent bugfix I noticed that Michael & I appear to
> have introduced an off-by-one here. A few locations do comparisons like:
> /*
>  * Only log if enough time has passed and interesting records have
>  * been inserted since the last snapshot.
>  */
> if (now >= timeout &&
> last_snapshot_lsn < GetLastImportantRecPtr())
> {
> last_snapshot_lsn = LogStandbySnapshot();
> ...
>
> which looks reasonable on its face.  But LogStandbySnapshot() (via XLogInsert()) is documented as:
>  * Returns XLOG pointer to end of record (beginning of next record).
>  * This can be used as LSN for data pages affected by the logged action.
>  * (LSN is the XLOG point up to which the XLOG must be flushed to disk
>  * before the data page can be written out.  This implements the basic
>  * WAL rule "write the log before the data".)
>
> and GetLastImportantRecPtr() is documented as:
>  * GetLastImportantRecPtr -- Returns the LSN of the last important record
>  * inserted. All records not explicitly marked as unimportant are considered
>  * important.
>
> which means that we'll e.g. not notice if there's exactly a *single* WAL
> record since the last logged snapshot (and likely similar in the other
> users of GetLastImportantRecPtr()), because XLogInsert() will return
> where the next record will most of the time be inserted, and
> GetLastImportantRecPtr() returns the beginning of said record.
>
> This is trivially fixable by replacing < with <=.  But I wonder if the
> better fix would be to redefine GetLastImportantRecPtr() to point to the
> end of the record, too?
>

If you think it is straightforward to note the end of the record, then
that sounds like a sensible thing to do.  However, note that we remember
the position based on the lock number (lockno), and the lock is released
before we can determine the EndPos.
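
To make the off-by-one concrete, here is a hypothetical sequence of
events (the LSNs are illustrative values, not taken from the report):

    /* Suppose the logged snapshot record occupies WAL bytes [0/1000, 0/1040). */
    last_snapshot_lsn = LogStandbySnapshot();   /* returns 0/1040: end of record */

    /* Exactly one important record is inserted next, starting at 0/1040. */

    GetLastImportantRecPtr();                   /* returns 0/1040: its *start* */

    /* "0/1040 < 0/1040" is false, so the record goes unnoticed; <= catches it. */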

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-05-04 Thread Michael Paquier
On Wed, May 3, 2017 at 12:25 AM, Peter Eisentraut
 wrote:
> On 5/2/17 10:08, Michael Paquier wrote:
>> On Tue, May 2, 2017 at 9:30 PM, Peter Eisentraut
>>  wrote:
>>> On 5/2/17 03:11, Petr Jelinek wrote:
 logical decoding can theoretically
 do HOT pruning (even if the chance is really small) so it's not safe to
 start logical replication either.
>>>
>>> This seems a bit impossible to resolve.  On the one hand, we want to
>>> allow streaming until after the shutdown checkpoint.  On the other hand,
>>> streaming itself might produce new WAL.
>>
>> It would be nice to split things into two:
>> - patch 1 adding the signal handling that wins a backpatch.
>> - patch 2 fixing the side cases with logical decoding.
>
> The side cases with logical decoding are also not new and would need
> backpatching, AIUI.

Okay, I thought that there was some new concept here that is part of
logical replication.

>>> Can we prevent HOT pruning during logical decoding?
>>
>> It does not sound very difficult to do; couldn't you just make it a
>> no-op when am_walsender is set?
>
> That's my hope.

The only code path doing HOT-pruning and generating WAL is
heap_page_prune(). Do you think that we need to worry about FPWs as
well?

Attached is an updated patch, which also forbids running any
replication commands once the stopping state is reached.
-- 
Michael
From c8a44b6f84926712b7b6f2b36b4f13b0d1b41977 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Fri, 5 May 2017 14:10:16 +0900
Subject: [PATCH] Prevent panic during shutdown checkpoint

When the checkpointer writes the shutdown checkpoint, it checks
afterwards whether any WAL has been written since it started and throws
a PANIC if so.  At that point, only walsenders are still active, so one
might think this could not happen, but WAL senders can generate WAL
in the context of certain replication commands that can be run during
the shutdown checkpoint:
- certain variants of CREATE_REPLICATION_SLOT.
- BASE_BACKUP and the backup-end WAL record.
- START_REPLICATION and logical decoding, able to do HOT-pruning.

To fix this, divide the walsender shutdown into two phases.  First, the
postmaster sends a SIGUSR2 signal to all walsenders.  The walsenders
then put themselves into the "stopping" state.  In this state, they
reject any commands that might generate WAL.  The checkpointer waits for
all walsenders to reach this state before proceeding with the shutdown
checkpoint.  After the shutdown checkpoint is done, the postmaster sends
SIGINT (previously unused) to the walsenders.  This triggers the
existing shutdown behavior of sending out the shutdown checkpoint and
then terminating.

Author: Michael Paquier 
Reported-by: Fujii Masao 
---
 doc/src/sgml/monitoring.sgml|   5 +
 src/backend/access/heap/pruneheap.c |   9 ++
 src/backend/access/transam/xlog.c   |   6 ++
 src/backend/postmaster/postmaster.c |   7 +-
 src/backend/replication/walsender.c | 143 
 src/include/replication/walsender.h |   1 +
 src/include/replication/walsender_private.h |   3 +-
 7 files changed, 150 insertions(+), 24 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 2a83671b53..80d12b26d7 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1690,6 +1690,11 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
backup: This WAL sender is sending a backup.
   
  
+ 
+  
+   stopping: This WAL sender is stopping.
+  
+ 

  
 
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index d69a266c36..d510649b18 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -22,6 +22,7 @@
 #include "catalog/catalog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "replication/walsender.h"
 #include "storage/bufmgr.h"
 #include "utils/snapmgr.h"
 #include "utils/rel.h"
@@ -189,6 +190,14 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
 	PruneState	prstate;
 
 	/*
+	 * Do nothing in the presence of a WAL sender. This code path can be
+	 * taken when doing logical decoding, and it is better to avoid WAL
+	 * generation as this interferes with shutdown checkpoints.
+	 */
+	if (am_walsender)
+		return ndeleted;
+
+	/*
 	 * Our strategy is to scan the page and make lists of items to change,
 	 * then apply the changes within a critical section.  This keeps as much
 	 * logic as possible out of the critical section, and also ensures that
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a89d99838a..5d6f8b75b8 100644
--- a/src/backend/access/transam/xlog.c
+++ 

Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

2017-05-04 Thread Amit Kapila
On Thu, May 4, 2017 at 10:40 PM, Robert Haas  wrote:
> On Thu, May 4, 2017 at 1:04 PM, Amit Kapila  wrote:
>> On Thu, May 4, 2017 at 10:18 PM, Robert Haas  wrote:
>>> On Thu, May 4, 2017 at 12:18 PM, Amit Kapila  
>>> wrote:
 As soon as the first command fails due to timeout, we will set
 'abort_cleanup_failure' which will make a toplevel transaction to
 abort and also won't allow other statements to execute.  The patch is
 trying to enforce a 30-second timeout after statement execution, so it
 has to anyway wait till the command execution completes (irrespective
 of whether the command succeeds or fails).
>>>
>>> I don't understand what you mean by this.  If the command doesn't
>>> finish within 30 seconds, we *don't* wait for it to finish.
>>>
>>
>> + /*
>> + * Submit a query.  Since we don't use non-blocking mode, this also can
>> + * block.  But its risk is relatively small, so we ignore that for now.
>> + */
>> + if (!PQsendQuery(conn, query))
>> + {
>> + pgfdw_report_error(WARNING, NULL, conn, false, query);
>> + return false;
>> + }
>> +
>> + /* Get the result of the query. */
>> + if (pgfdw_get_cleanup_result(conn, endtime, &result))
>> + return false;
>>
>> The 30 seconds logic is in function pgfdw_get_cleanup_result, can't
>> the command hang in PQsendQuery?
>
> Sure.  I thought about trying to fix that too, by using
> PQsetnonblocking(), but I thought the patch was doing enough already.
>

Okay, it seems better to deal with it in a separate patch.
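
For reference, a minimal sketch of what such a fix could look like,
combining PQsetnonblocking() with a deadline as Robert suggests (the
helper name and the select()-based wait are illustrative assumptions,
not code from the patch):

    #include <sys/select.h>
    #include "libpq-fe.h"

    /*
     * Hypothetical helper: submit a query without risking an indefinite
     * block inside PQsendQuery()/PQflush().  "remaining" is the time left
     * until the deadline; returns false on failure or timeout.
     */
    static bool
    send_query_nonblocking(PGconn *conn, const char *query,
                           struct timeval *remaining)
    {
        if (PQsetnonblocking(conn, 1) != 0)
            return false;           /* could not switch modes */
        if (!PQsendQuery(conn, query))
            return false;           /* dispatch failed outright */

        /* In non-blocking mode, PQflush() must be repeated until it returns 0. */
        for (;;)
        {
            fd_set      wfds;
            int         sock = PQsocket(conn);
            int         rc = PQflush(conn);

            if (rc == 0)
                return true;        /* all data handed to the kernel */
            if (rc < 0)
                return false;       /* send failure */

            /* rc == 1: wait until the socket is writable, up to the deadline */
            FD_ZERO(&wfds);
            FD_SET(sock, &wfds);
            if (select(sock + 1, NULL, &wfds, NULL, remaining) <= 0)
                return false;       /* timeout or select() error */
        }
    }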


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

2017-05-04 Thread Amit Kapila
On Thu, May 4, 2017 at 8:08 PM, Robert Haas  wrote:
> On Thu, May 4, 2017 at 7:13 AM, Amit Kapila  wrote:
>> In pgfdw_xact_callback, if the execution of ABORT TRANSACTION fails
>> for any reason, then I think it will close the connection.  The
>> relevant code is:
>> if (PQstatus(entry->conn) != CONNECTION_OK ||
>> PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
>>
>> Basically, if abort transaction fails then transaction status won't be
>> PQTRANS_IDLE.
>
> I don't think that's necessarily true.  Look at this code:
>
>  /* Rollback all remote subtransactions during abort */
>  snprintf(sql, sizeof(sql),
>   "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
>   curlevel, curlevel);
>  res = PQexec(entry->conn, sql);
>  if (PQresultStatus(res) != PGRES_COMMAND_OK)
>  pgfdw_report_error(WARNING, res, entry->conn, true, sql);
>  else
>  PQclear(res);
>
> If that sql command somehow returns a result status other than
> PGRES_COMMAND_OK, like say due to a local OOM failure or whatever, the
> connection is PQTRANS_IDLE but the rollback wasn't actually done.
>

I have tried to verify the above point, and it seems to me that in such
a situation the transaction status will be either PQTRANS_INTRANS or
PQTRANS_INERROR.  I have tried to mimic an "out of memory" failure by
making PQmakeEmptyPGresult return NULL (malloc failure).  AFAIU, the
state becomes PQTRANS_IDLE only when the statement is executed
successfully.
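
A quick way to observe the states mentioned above is a small libpq
check (hand-written for illustration; not part of the patch):

    #include <stdio.h>
    #include "libpq-fe.h"

    /* Print the libpq-visible transaction state of a connection. */
    static void
    report_txn_status(PGconn *conn)
    {
        switch (PQtransactionStatus(conn))
        {
            case PQTRANS_IDLE:
                puts("PQTRANS_IDLE: idle, last command succeeded");
                break;
            case PQTRANS_INTRANS:
                puts("PQTRANS_INTRANS: in a transaction block, no error");
                break;
            case PQTRANS_INERROR:
                puts("PQTRANS_INERROR: in a failed transaction block");
                break;
            case PQTRANS_ACTIVE:
                puts("PQTRANS_ACTIVE: a command is in progress");
                break;
            default:
                puts("PQTRANS_UNKNOWN: connection state is unknown");
                break;
        }
    }

Per the observation above, a failed ROLLBACK TO SAVEPOINT should leave
the connection reporting PQTRANS_INTRANS or PQTRANS_INERROR rather than
PQTRANS_IDLE.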

>> Also, does it matter if pgfdw_subxact_callback fails
>> during execution of "ROLLBACK TO SAVEPOINT", because anyway user needs
>> to execute rollback to savepoint command before proceeding and that
>> should be good enough?
>
> I don't understand this.  If pgfdw_subxact_callback doesn't roll the
> remote side back to the savepoint, how is it going to get done later?
> There's no code in postgres_fdw other than that code to issue the
> rollback.
>

It is done via the toplevel transaction ABORT.  I think how it works
is: after a "ROLLBACK TO SAVEPOINT" failure, any subsequent execution is
going to fail.  Let us say we issue COMMIT after the failure; it will
try to execute RELEASE SAVEPOINT, which will fail and lead to a toplevel
transaction ABORT.  Now if the toplevel transaction ABORT also fails, it
will drop the connection.

>> About patch:
>>
>> + /* Rollback all remote subtransactions during abort */
>> + snprintf(sql, sizeof(sql),
>> + "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
>> + curlevel, curlevel);
>> + if (!pgfdw_exec_cleanup_query(entry->conn, sql))
>> + abort_cleanup_failure = true;
>>
>> If the above execution fails for any reason, then it won't allow
>> even committing the main transaction, which I am not sure is the
>> right thing to do.
>
> Well, as I said in my reply to Tushar, I think it's the only correct
> thing to do.  Suppose you do the following:
>
> (1) make a change to the foreign table
> (2) enter a subtransaction
> (3) make another change to the foreign table
> (4) roll back the subtransaction
> (5) commit the transaction
>
> If the remote rollback gets skipped in step 4, it is dead wrong for
> step 5 to commit the remote transaction anyway.  Both the change made
> in step (1) and the change made in step (3) would get made on the
> remote side, which is 100% wrong.  Given the failure of the rollback
> in step 4, there is no way to achieve the desired outcome of
> committing the step (1) change and rolling back the step (3) change.
> The only way forward is to roll back everything - i.e. force a
> toplevel abort.
>

Agreed; in such a situation a toplevel transaction abort is the only
way out.  However, it seems to me that will happen anyway in the current
code, even if we don't try to force it via abort_cleanup_failure.  I
understand that it might be better to force it as you have done in the
patch, but I am not sure it is good to drop the connection where
previously we could have done without it (for example, if the first
statement ROLLBACK TO SAVEPOINT fails but ABORT succeeds).  I feel it is
worth considering whether such a behavior difference is okay, as you
have proposed to backpatch this.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] compiler warning with VS 2017

2017-05-04 Thread Tom Lane
Haribabu Kommi  writes:
> I am getting a compiler warning when I build the latest HEAD PostgreSQL with
> visual studio 2017.
> The code at the line is,
> tuple->values[i] = (char *) 0xdeadbeef; /* make bad usage more obvious */

Yeah, you're not the first to complain about this.  To my mind that
coding is not pretty, not cute, and not portable: there's not even
a good reason to believe that dereferencing the pointer would result
in a crash.  Perhaps the author can explain to us why this is better
than just assigning NULL.

Actually, looking around a bit there, it's not even clear why
we should be booby-trapping the value of an unchanged column in
the first place.  So I'd say that not only is the code dubious
but the comment is inadequate too.

regards, tom lane




Re: [HACKERS] CTE inlining

2017-05-04 Thread Andres Freund
Hi,

On 2017-05-04 19:57:21 -0700, Joe Conway wrote:
> One thought, is that we treat a CTE in a similar way to foreign tables,
> with the same set of push downs.

A bit surprised about that suggestion - there seems to be very little
similarity between the cases. What'd be the benefit of that?  With FDWs
there's not even a common set of things pushed down between different
FDW users.

Greetings,

Andres Freund




Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Thomas Munro
On Fri, May 5, 2017 at 2:23 PM, David Rowley
 wrote:
> On 5 May 2017 at 13:37, Andres Freund  wrote:
>> On 2017-05-02 15:13:58 -0400, Robert Haas wrote:
>>> Multiple people (including David Rowley
>>> as well as folks here at EnterpriseDB) have demonstrated that for
>>> certain queries, we can actually use a lot more workers and everything
>>> works great.  The problem is that for other queries, using a lot of
>>> workers works terribly.  The planner doesn't know how to figure out
>>> which it'll be - and honestly, I don't either.
>>
>> Have those benchmarks, even in a very informal form, been shared /
>> collected / referenced centrally?  I'd be very interested to know where
>> the different contention points are. Possibilities:
>
> I posted mine on [1], although the post does not go into much detail
> about the contention points. I only really briefly mention it at the
> end.

Just for fun, check out pages 42 and 43 of Wei Hong's thesis.  He
worked on Berkeley POSTGRES parallel query and a spin-off called XPRS,
and they got linear seq scan scaling up to number of spindles:

http://db.cs.berkeley.edu/papers/ERL-M93-28.pdf

I gather from flicking through the POSTGRES 4.2 sources and this
stuff about XPRS that they switched from a "launch N workers!" model
to a "generate tasks and schedule them" model somewhere between these
systems.  Chapters 2 and 3 cover the problem of avoiding excessive
parallelism that reduces performance, by adjusting dynamically to
maximize throughput.  I suspect we're going that way too at some point,
and it would certainly fix some problems I ran into with Parallel
Shared Hash.

XPRS's cost model included resource consumption, not just 'timerons'.
This is something I grappled with when trying to put a price tag on
Parallel Shared Hash plans where just one worker builds the hash table
while the others wait.  I removed that plan from the patch because it
became mostly redundant, but when it was there Postgres thought it was
the same cost as a plan where every worker hammers your system
building the same hash table, whereas XPRS would have considered such
a plan ludicrously expensive (depending on his 'w' term, see page 28,
which determines whether you care more about resource usage or
response time).
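
As a rough paraphrase of that trade-off (my rendering, not the
thesis's exact formula), a weighted objective of the form

    cost(plan) = w * resource_consumption(plan) + (1 - w) * response_time(plan)

makes w = 0 optimize purely for response time (parallelize freely, even
wastefully), while w = 1 optimizes purely for resource usage, so a plan
in which every worker redundantly builds the same hash table prices in
all of that duplicated work.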

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] CTE inlining

2017-05-04 Thread Joe Conway
On 05/04/2017 07:04 PM, Tom Lane wrote:
> Craig Ringer  writes:
>> We're carefully maintaining this bizarre cognitive dissonance where we
>> justify the need for using this as a planner hint at the same time as
>> denying that we have a hint. That makes it hard to make progress here.
>> I think there's fear that we're setting some kind of precedent by
>> admitting what we already have.
> 
> I think you're overstating the case.  It's clear that there's a
> significant subset of CTE functionality where there has to be an
> optimization fence.  The initial implementation basically took the
> easy way out by deeming *all* CTEs to be optimization fences.  Maybe
> we shouldn't have documented that behavior, but we did.  Now we're
> arguing about how much of a compatibility break it'd be to change that
> planner behavior.  I don't see any particular cognitive dissonance here,
> just disagreements about the extent to which backwards compatibility is
> more important than better query optimization.

Exactly.

One thought, is that we treat a CTE in a similar way to foreign tables,
with the same set of push downs.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Andres Freund
On 2017-05-04 19:45:33 -0700, Andres Freund wrote:
> Increment phs_cblock without checking rs_nblocks, but outside of the
> lock do a % scan->rs_nblocks, to get the "actual" position.  Finish if
> (phs_cblock - phs_startblock) / scan->rs_nblocks >= 1.

Err, as I've been pointed to: It should be s/lock/atomic operation/




Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Andres Freund
On 2017-05-05 14:40:43 +1200, David Rowley wrote:
> On 5 May 2017 at 14:36, Andres Freund  wrote:
> > I wonder how much doing the atomic ops approach alone can help, that
> > doesn't have the issue that the work might be unevenly distributed
> > between pages.
> 
> I wondered that too, since I thought the barrier for making this change
> would be lower by doing it that way.
> 
> I didn't manage to think of a way to get around wrapping the
> position back to 0 when synch-scans are involved.
> 
> i.e
> parallel_scan->phs_cblock++;
> if (parallel_scan->phs_cblock >= scan->rs_nblocks)
> parallel_scan->phs_cblock = 0;

Increment phs_cblock without checking rs_nblocks, but outside of the
lock do a % scan->rs_nblocks, to get the "actual" position.  Finish if
(phs_cblock - phs_startblock) / scan->rs_nblocks >= 1.

The difficult part seems to be the parallel_scan->phs_startblock
computation, but that we can probably do via a read barrier & unlocked
check, and then a spinlock & recheck if still uninitialized.
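
A rough sketch of that scheme using PostgreSQL's atomics API (the
pg_atomic_uint64 counter, the function shape, and the field names are
assumptions for illustration, not an actual patch):

    #include "postgres.h"
    #include "port/atomics.h"
    #include "storage/block.h"

    /*
     * Hypothetical next-page allocation: claim a position with a single
     * atomic increment and map it onto the relation outside the atomic
     * operation, so the wraparound needed for synchronized scans
     * requires no lock.
     */
    static BlockNumber
    next_page(pg_atomic_uint64 *phs_nallocated,
              BlockNumber phs_startblock, BlockNumber rs_nblocks)
    {
        uint64      pos = pg_atomic_fetch_add_u64(phs_nallocated, 1);

        if (pos >= rs_nblocks)
            return InvalidBlockNumber;  /* every block has been handed out */

        /* wrap around rs_nblocks to honor the synchronized-scan start point */
        return (BlockNumber) ((pos + phs_startblock) % rs_nblocks);
    }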

- Andres




Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread David Rowley
On 5 May 2017 at 14:36, Andres Freund  wrote:
> I wonder how much doing the atomic ops approach alone can help, that
> doesn't have the issue that the work might be unevenly distributed
> between pages.

I wondered that too, since I thought the barrier for making this change
would be lower by doing it that way.

I didn't manage to think of a way to get around wrapping the
position back to 0 when synch-scans are involved.

i.e
parallel_scan->phs_cblock++;
if (parallel_scan->phs_cblock >= scan->rs_nblocks)
parallel_scan->phs_cblock = 0;

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread David Rowley
On 3 May 2017 at 07:13, Robert Haas  wrote:
> Multiple people (including David Rowley
> as well as folks here at EnterpriseDB) have demonstrated that for
> certain queries, we can actually use a lot more workers and everything
> works great.  The problem is that for other queries, using a lot of
> workers works terribly.  The planner doesn't know how to figure out
> which it'll be - and honestly, I don't either.

For me, it seems pretty much related to the number of tuples processed
on a worker vs. how many they return.  As a general rule, I'd say the
higher this ratio, the higher the efficiency will be for the worker,
although that's not taking into account contention points where workers
must wait for fellow workers to complete some operation.  I think
parallel_tuple_cost is a good GUC to have; perhaps we can be smarter
about the use of it when deciding on how many workers should be used.

By efficiency, I mean that if a query takes 10 seconds with a normal
serial plan, and 5 seconds after adding 1 worker, then that worker was
100% efficient.  I charted this in [1].  It would have been interesting
to chart the same for a query that returned a larger number of groups,
but I ran out of time.  I think it pretty much goes, even without
testing, that more groups == less efficiency, which will be due to more
overhead in parallel tuple communication and more work to do in the
serial portion of the plan.
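
One plausible way to formalize that notion (my formula, an assumption
rather than anything from David's post): with n processes taking T(n)
seconds in total,

    efficiency(n) = T(1) / (n * T(n))

so T(1) = 10s and T(2) = 5s (serial plan plus one worker) gives
10 / (2 * 5) = 100%, whereas T(2) = 7s would give only ~71%.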

[1] https://blog.2ndquadrant.com/parallel-monster-benchmark
-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Andres Freund
Hi,

On 2017-05-05 14:20:48 +1200, David Rowley wrote:
> Yeah, I did get some time to look over the contention in Parallel Seq
> Scan a while back and I discovered that on the machine that I was
> testing on. the lock obtained in heap_parallelscan_nextpage() was
> causing workers to have to wait for other workers to fetch their next
> task to work on.

Oh, if it's "just" that, it should be easy enough to address.  Two
approaches:
1) use atomic ops for increment, modulo afterwards to deal with
   wraparound in the synchronous scan
2) batching


> I ended up writing the attached (which I'd not intended to post until
> some time closer to when the doors open for PG11). At the moment it's
> basically just a test patch to see how it affects things when we give
> workers a bit more to do before they come back to look for more work.
> In this case, I've just given them 10 pages to work on, instead of the
> 1 that's allocated in 9.6 and v10.

Right.


> A quick test on a pretty large table on a large machine shows:
> 
> Unpatched:
> 
> postgres=# select count(*) from a;
>    count
> ---------
>  187400
> (1 row)
> 
> Time: 5211.485 ms (00:05.211)
> 
> Patched:
> 
> postgres=# select count(*) from a;
>    count
> ---------
>  187400
> (1 row)
> 
> Time: 2523.983 ms (00:02.524)

Neat!


> I'd had thoughts that the 10 pages wouldn't be constant, but that the
> batching size would depend on the size of the relation to be scanned.
> I had a rough idea to just try to make about 1 million batches, something
> like batch_pages = Max(parallel_scan->phs_nblocks / 100, 1); so
> that we only take more than 1 page if there's some decent amount to
> process. We don't want to make the batches too big as we might end up
> having to wait on slow workers at the end of a scan.

I wonder how much doing the atomic ops approach alone can help, that
doesn't have the issue that the work might be unevenly distributed
between pages.


> Anyway. I don't want to hi-jack this thread with discussions on this.
> I just wanted to mark that I plan to work on this in order to avoid
> any repeat developments or analysis. I'll probably start a new thread
> for this sometime nearer PG11's dev cycle.

Cool.  I think it might make sense to post about this soon, just to
give it some more visibility and reduce the potential for duplication.

- andres




Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread David Rowley
On 5 May 2017 at 13:37, Andres Freund  wrote:
> On 2017-05-02 15:13:58 -0400, Robert Haas wrote:
>> Multiple people (including David Rowley
>> as well as folks here at EnterpriseDB) have demonstrated that for
>> certain queries, we can actually use a lot more workers and everything
>> works great.  The problem is that for other queries, using a lot of
>> workers works terribly.  The planner doesn't know how to figure out
>> which it'll be - and honestly, I don't either.
>
> Have those benchmarks, even in a very informal form, been shared /
> collected / referenced centrally?  I'd be very interested to know where
> the different contention points are. Possibilities:

I posted mine on [1], although the post does not go into much detail
about the contention points. I only really briefly mention it at the
end.

[1] https://blog.2ndquadrant.com/parallel-monster-benchmark/#comment-248273

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] compiler warning with VS 2017

2017-05-04 Thread Haribabu Kommi
I am getting a compiler warning when I build the latest HEAD PostgreSQL with
visual studio 2017.

src/backend/replication/logical/proto.c(482): warning C4312: 'type cast':
conversion from 'unsigned int' to 'char *' of greater size

Details of the warning are available at the link [1].

The code at the line is,

tuple->values[i] = (char *) 0xdeadbeef; /* make bad usage more obvious */

If I change the code to (char *) (Size) 0xdeadbeef;
or (char *) (INT_PTR) 0xdeadbeef; the warning disappears.

How about fixing it as below to add the typecast, or else disabling
this warning?

/*
 * PTR_CAST
 * Used when converting integer addresses to a pointer.
 * The cast is used to avoid generating a warning on
 * Windows.
 */
#ifdef WIN32
#define PTR_CAST INT_PTR
#else
#define PTR_CAST
#endif

[1] - https://msdn.microsoft.com/en-us/library/h97f4b9y.aspx

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread David Rowley
 On 3 May 2017 at 07:13, Robert Haas  wrote:
> It is of course possible that the Parallel Seq Scan could run into
> contention problems if the number of workers is large, but in my
> experience there are bigger problems here.  The non-parallel Seq Scan
> can also contend -- not of course over the shared mutex because there
> isn't one, but over access to the blocks themselves.  Every one of
> those blocks has a content lock and a buffer header and so on, and
> having multiple processes accessing those things at the same time
> scales well, but not perfectly.  The Hash node can also contend: if
> the hash join spills to disk, you've got multiple processes reading
> and writing to the temp directory at the same time and, of course,
> that can be worse than just one process doing it -- sometimes much
> worse.  It can also be better, depending on how much I/O gets
> generated and how much I/O bandwidth you have.

Yeah, I did get some time to look over the contention in Parallel Seq
Scan a while back and I discovered that on the machine that I was
testing on, the lock obtained in heap_parallelscan_nextpage() was
causing workers to have to wait for other workers to fetch their next
task to work on.

I ended up writing the attached (which I'd not intended to post until
some time closer to when the doors open for PG11). At the moment it's
basically just a test patch to see how it affects things when we give
workers a bit more to do before they come back to look for more work.
In this case, I've just given them 10 pages to work on, instead of the
1 that's allocated in 9.6 and v10.

A quick test on a pretty large table on a large machine shows:

Unpatched:

postgres=# select count(*) from a;
   count
---------
 187400
(1 row)

Time: 5211.485 ms (00:05.211)

Patched:

postgres=# select count(*) from a;
   count
---------
 187400
(1 row)

Time: 2523.983 ms (00:02.524)

So it seems worth looking into. "a" was just a table with a single int
column. I'm unsure as yet if there are more gains to be had for tables
with wider tuples. I do suspect the patch will be a bigger win in those
cases, since there's less work to do for each page, e.g. fewer advance
aggregate calls, so likely workers will be looking for their next page a
bit sooner.

Now I'm not going to pretend that this patch is ready for prime time.
I've not yet worked out how to properly report sync-scan locations
without risking reporting later pages after reporting the end of the
scan. What I have at the moment could cause a report to be missed if
SYNC_SCAN_REPORT_INTERVAL is not divisible by the batch size. I'm also
not sure how batching like this affects read-ahead, but at least the
numbers above speak for something, although none of the pages in this
case came from disk.

I'd had thoughts that the 10 pages wouldn't be constant, but that the
batching size would depend on the size of the relation to be scanned.
I had a rough idea to just try to make about 1 million batches, something
like batch_pages = Max(parallel_scan->phs_nblocks / 100, 1); so
that we only take more than 1 page if there's some decent amount to
process. We don't want to make the batches too big as we might end up
having to wait on slow workers at the end of a scan.
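
For illustration, a sketch of what batched allocation might look like
on top of an atomic counter (BATCH_PAGES and the structure here are
assumptions; the attached patch may differ):

    #include "postgres.h"
    #include "port/atomics.h"

    #define BATCH_PAGES 10      /* fixed batch size, as in the test patch */

    /* A worker-local range of claimed scan positions. */
    typedef struct WorkerBatch
    {
        uint64      next;       /* next position to scan in this batch */
        uint64      end;        /* one past the last claimed position */
    } WorkerBatch;

    /*
     * Hypothetical batched allocator: claim BATCH_PAGES consecutive
     * positions with one atomic add, so workers touch shared state an
     * order of magnitude less often.  Returns false once the whole
     * relation has been claimed.
     */
    static bool
    claim_batch(pg_atomic_uint64 *phs_nallocated, uint64 nblocks,
                WorkerBatch *batch)
    {
        uint64      start = pg_atomic_fetch_add_u64(phs_nallocated, BATCH_PAGES);

        if (start >= nblocks)
            return false;
        batch->next = start;
        batch->end = Min(start + BATCH_PAGES, nblocks);  /* clamp final batch */
        return true;
    }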

Anyway. I don't want to hi-jack this thread with discussions on this.
I just wanted to mark that I plan to work on this in order to avoid
any repeat developments or analysis. I'll probably start a new thread
for this sometime nearer PG11's dev cycle.

The patch is attached if in the meantime someone wants to run this on
some big hardware.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[Attachment: parallel_nextpage_batching_v2.patch]



Re: [HACKERS] CTE inlining

2017-05-04 Thread Stephen Frost
Tom, all,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Craig Ringer  writes:
> > We're carefully maintaining this bizarre cognitive dissonance where we
> > justify the need for using this as a planner hint at the same time as
> > denying that we have a hint. That makes it hard to make progress here.
> > I think there's fear that we're setting some kind of precedent by
> > admitting what we already have.
> 
> I think you're overstating the case.  It's clear that there's a
> significant subset of CTE functionality where there has to be an
> optimization fence.  The initial implementation basically took the
> easy way out by deeming *all* CTEs to be optimization fences.  Maybe
> we shouldn't have documented that behavior, but we did.  Now we're
> arguing about how much of a compatibility break it'd be to change that
> planner behavior.  I don't see any particular cognitive dissonance here,
> just disagreements about the extent to which backwards compatibility is
> more important than better query optimization.

This really begs the question of whether it's "better."

If it's always better, then there's no discussion to be had.  If there
are cases where it's not, then that could cause problems for people, and
there's a reason to question whether we're ok with that.

Of course, if a user runs into such an issue, there are a few ways they
can address it without needing any additional syntax (use a temp table
instead, introduce a volatile function into the CTE, etc.).

I had originally been on the fence about this myself, and even advocated
the new-syntax approach at one point (as I recall), but with the new
statistics we're gaining and the improvements in the planner, along with
parallel query and the advantages offered by being able to pull a CTE
into the main query, I think I've come around to agreeing that removing
the optimization fence, when it won't change the results, is a good
thing to do.

Thanks!

Stephen




Re: [HACKERS] CTE inlining

2017-05-04 Thread Serge Rielau
In my past life, when I was faced with such debates, I argued that the
number of customers we are hoping to attract in the future is much
bigger than the number we risk offending.  That doesn't mean I wanted to
piss everyone off; just that I didn't want to be held hostage by
history.

Cheers,
Serge

PS: On the opposing side was typically the mainframe crowd, hint hint.

Re: [HACKERS] CTE inlining

2017-05-04 Thread Tom Lane
Craig Ringer  writes:
> We're carefully maintaining this bizarre cognitive dissonance where we
> justify the need for using this as a planner hint at the same time as
> denying that we have a hint. That makes it hard to make progress here.
> I think there's fear that we're setting some kind of precedent by
> admitting what we already have.

I think you're overstating the case.  It's clear that there's a
significant subset of CTE functionality where there has to be an
optimization fence.  The initial implementation basically took the
easy way out by deeming *all* CTEs to be optimization fences.  Maybe
we shouldn't have documented that behavior, but we did.  Now we're
arguing about how much of a compatibility break it'd be to change that
planner behavior.  I don't see any particular cognitive dissonance here,
just disagreements about the extent to which backwards compatibility is
more important than better query optimization.

regards, tom lane




Re: [HACKERS] modeling parallel contention (was: Parallel Append implementation)

2017-05-04 Thread Andres Freund
On 2017-05-02 15:13:58 -0400, Robert Haas wrote:
> On Tue, Apr 18, 2017 at 2:48 AM, Amit Khandekar  
> wrote:
> The main thing that keeps this from being a crippling issue right now
> is the fact that we tend not to use that many parallel workers in the
> first place.  We're trying to scale a query that would otherwise use 1
> process out to 3 or 5 processes, and so the contention effects, in
> many cases, are not too bad.  Multiple people (including David Rowley
> as well as folks here at EnterpriseDB) have demonstrated that for
> certain queries, we can actually use a lot more workers and everything
> works great.  The problem is that for other queries, using a lot of
> workers works terribly.  The planner doesn't know how to figure out
> which it'll be - and honestly, I don't either.

Have those benchmarks, even in a very informal form, been shared /
collected / referenced centrally?  I'd be very interested to know where
the different contention points are. Possibilities:

- in non-resident workloads: too much concurrent IOs submitted, leading
  to overly much random IO
- internal contention in the parallel node, e.g. parallel seqscan
- contention on PG components like buffer mapping, procarray, clog
- contention on individual buffers, e.g. btree root pages, or small
  tables in nestloop joins
- just too many context switches, due to ineffective parallelization

probably multiple of those are a problem, but without trying to figure
them out, we're going to have a hard time developing a better costing
model...

Greetings,

Andres Freund




Re: [HACKERS] CTE inlining

2017-05-04 Thread Craig Ringer
On 5 May 2017 at 08:17, Joe Conway  wrote:
> On 05/04/2017 05:03 PM, Craig Ringer wrote:
>> On 5 May 2017 02:52, "Tom Lane" wrote:
>> I haven't been keeping close tabs either, but surely we still have
>> to have
>> the optimization fence in (at least) all these cases:
>>
>> * CTE contains INSERT/UPDATE/DELETE
>> * CTE contains SELECT FOR UPDATE/SHARE (else the set of rows that get
>>   locked might change)
>> * CTE contains volatile functions
>>
>> I'm willing to write off cases where, eg, a function should have been
>> marked volatile and was not.  That's user error and there are plenty
>> of hazards of that kind already.  But if the optimizer has reason
>> to know that discarding the fence might change any query side-effects,
>> it mustn't.
>>
>> I think everyone is in total agreement there.
>
> That's great, but if so, why do we need any change in syntax at all?

Because a lot of people use it as a query hint, and don't want to let
go of the one precious query hint they rely on in PostgreSQL to solve
urgent production issues or work around planner limitations. You can
currently (ab)use a CTE with fencing behaviour to force join order,
force evaluation of expensive functions, etc, if the planner would
otherwise pick a semantically identical plan that lands up running
slower.

Many people don't want to give up their hint because it's an important
tool in their "solve user/customer problems fast" toolbox.

Project policy says we don't want query hints, don't have them, and
won't add them. I understand the reasoning for this (and mostly
agree). But we kind of added one by accident anyway, and it's proving
rather hard to take it away!

We added it by documenting an implementation detail/limitation - that
there's limited or no inlining/pullup/pushdown across CTE terms.
People took that as blessing to rely on that behaviour in future
releases, treating it as an optimisation barrier query hint. Since
then some people have been promoting it as a workaround for
performance issues faced by queries with sub-optimal plans.

People "in the know" used to use OFFSET 0 for that instead, but it was
entirely undocumented and could go away in a future release without
any warning, plus it was uglier. The CTE limitation-as-feature got a
lot more press in blogs, etc since it's documented-ish.

So now we're collectively trying to get to a compromise position where
people who rely on it can keep the functionality by declaring the
no-inline requirement explicitly in their SQL where they depend on it.
That way it won't penalise everyone else who just wants the optimiser
to do its best the rest of the time, so we can just write SQL that
says "I want this outcome" without having to worry about a
semi-documented performance booby-trap.  It also stops us surprising
and confusing users who're (reasonably) expecting the optimiser to
treat a CTE like a subquery, view or SQL function, those who're writing
portable SQL, and those who're porting from other DBMSes.

The sticking point is that this seems to require admitting that it
serves as a hint, of sorts, where it forces the optimiser to choose a
different plan than its cost based estimates would suggest if all
candidate plans are semantically identical.

We're carefully maintaining this bizarre cognitive dissonance where we
justify the need for using this as a planner hint at the same time as
denying that we have a hint. That makes it hard to make progress here.
I think there's fear that we're setting some kind of precedent by
admitting what we already have.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Andres Freund
On 2017-05-04 18:23:38 -0700, Peter Geoghegan wrote:
> On Thu, May 4, 2017 at 6:08 PM, Robert Haas  wrote:
> >> E.g. to power amazon's data migration service (yes, that scares me).
> >
> > If you recall, I did predict prior to commit that test_decoding would
> > get put into production use regardless of the name.
> 
> I thought you were completely wrong when you said that. Lesson
> learned, I suppose.

At least I was a bit more optimistic that we'd get a decent json plugin
into core soon-ish.  While there was some initial work towards that, it
unfortunately stalled at some point.

Euler, are you perchance interested in working towards that? ;)

- Andres




[HACKERS] Change GetLastImportantRecPtr's definition? (was Skip checkpoints, archiving on idle systems.)

2017-05-04 Thread Andres Freund
Hi,

On 2016-12-22 19:33:30 +, Andres Freund wrote:
> Skip checkpoints, archiving on idle systems.

As part of an independent bugfix I noticed that Michael & I appear to
have introduced an off-by-one here. A few locations do comparisons like:
/*
 * Only log if enough time has passed and interesting records have
 * been inserted since the last snapshot.
 */
if (now >= timeout &&
last_snapshot_lsn < GetLastImportantRecPtr())
{
last_snapshot_lsn = LogStandbySnapshot();
...

which looks reasonable on its face.  But LogStandbySnapshot() (via XLogInsert()) is documented as:
 * Returns XLOG pointer to end of record (beginning of next record).
 * This can be used as LSN for data pages affected by the logged action.
 * (LSN is the XLOG point up to which the XLOG must be flushed to disk
 * before the data page can be written out.  This implements the basic
 * WAL rule "write the log before the data".)

and GetLastImportantRecPtr() is documented as:
 * GetLastImportantRecPtr -- Returns the LSN of the last important record
 * inserted. All records not explicitly marked as unimportant are considered
 * important.

which means that we'll e.g. not notice if there's exactly a *single* WAL
record since the last logged snapshot (and likely similar in the other
users of GetLastImportantRecPtr()), because XLogInsert() will return
where the next record will most of the time be inserted, and
GetLastImportantRecPtr() returns the beginning of said record.

This is trivially fixable by replacing < with <=.  But I wonder if the
better fix would be to redefine GetLastImportantRecPtr() to point to the
end of the record, too?  I don't quite see any upcoming user that'd need
the beginning, and this is a bit failure prone for likely users.

- Andres




Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Peter Geoghegan
On Thu, May 4, 2017 at 6:08 PM, Robert Haas  wrote:
>> E.g. to power amazon's data migration service (yes, that scares me).
>
> If you recall, I did predict prior to commit that test_decoding would
> get put into production use regardless of the name.

I thought you were completely wrong when you said that. Lesson
learned, I suppose.


-- 
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/




Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Robert Haas
On Thu, May 4, 2017 at 8:46 PM, Bruce Momjian  wrote:
> On Thu, May  4, 2017 at 05:09:40PM -0700, Andres Freund wrote:
>> > > I would not in any way refer to logical decoding as being only a proof
>> > > of concept, even before logical replication.
>> >
>> > The community ships a reliable logical _encoding_, and a test logical
>> > _decoding_.
>>
>> Yes, so what?  What you said is "I didn't think logical decoding was
>> really more than a proof-of-concept until now", which is plainly wrong,
>> given I know a significant number of users using it in production.  Some
>> of them are well known & large enterprises, and it's used to enable
>> critical things.
>
> I am getting tired of saying this.  When I am writing the release notes,
> I am trying to figure out how it affects our shipped code, and the only
> "decoding" I know of is test_decoding.

If you run 'git show --stat b89e151054a05f0f6d356ca52e3b725dd0505e53',
you will see that it includes a test_decoding module (which is a
sample logical decoding output plugin) and a tremendous pile of
changes to src/backend/replication/logical (which is the core logical
decoding infrastructure).  The latter is a larger volume of changes
than the former.  It would perhaps be fair to describe test_decoding
as a proof-of-concept, but it is not fair or correct to describe the
core infrastructure that way.  Anyway, they're separate things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Andres Freund
On 2017-05-04 21:08:41 -0400, Robert Haas wrote:
> On Thu, May 4, 2017 at 8:09 PM, Andres Freund  wrote:
> > On Mon, May  1, 2017 at 08:02:46AM -0400, Robert Haas wrote:
> >> Even test_decoding is (perhaps regrettably) being used to build production 
> >> solutions.
> >
> > E.g. to power amazon's data migration service (yes, that scares me).
>
> If you recall, I did predict prior to commit that test_decoding would
> get put into production use regardless of the name.

Yea.  I think we really should add a generic json plugin into core, I've
seen three (I think) distinct versions deployed into production now. One
of them derived from Euler's https://github.com/eulerto/wal2json, two
apparently developed largely from scratch (more efficient schema handling).

- Andres




Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Robert Haas
On Thu, May 4, 2017 at 8:09 PM, Andres Freund  wrote:
> On Mon, May  1, 2017 at 08:02:46AM -0400, Robert Haas wrote:
>> Even test_decoding is (perhaps regrettably) being used to build production 
>> solutions.
>
> E.g. to power amazon's data migration service (yes, that scares me).

If you recall, I did predict prior to commit that test_decoding would
get put into production use regardless of the name.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Andres Freund
On 2017-05-04 20:46:24 -0400, Bruce Momjian wrote:
> On Thu, May  4, 2017 at 05:09:40PM -0700, Andres Freund wrote:
> > > > I would not in any way refer to logical decoding as being only a proof
> > > > of concept, even before logical replication.
> > > 
> > > The community ships a reliable logical _encoding_, and a test logical
> > > _decoding_.
> > 
> > Yes, so what?  What you said is "I didn't think logical decoding was
> > really more than a proof-of-concept until now", which is plainly wrong,
> > given I know a significant number of users using it in production.  Some
> > of them are well known & large enterprises, and it's used to enable
> > critical things.
> 
> I am getting tired of saying this.  When I am writing the release notes,
> I am trying to figure out how it affects our shipped code, and the only
> "decoding" I know of is test_decoding.

Then ask for input instead of saying uninformed stuff like "I didn't
think logical decoding was really more than a proof-of-concept".  That's
what people are complaining about, not that you're not
all-knowledgeable.  I agree with you - and wrote that previously - that
the specific change is not necessarily worthwhile to mention on its
own, but I don't see how that makes your various statements in this
subthread ok.


My message was this:

>   https://www.postgresql.org/message-id/a6d13cf7-fbf8-913c-2353-f149c6f95fdc%402ndquadrant.com
>
>   >> Or the ability of logical decoding to follow timeline switches.
>   >
>   > I didn't think logical decoding was really more than a proof-of-concept
>   > until now.

> Now, if the "encoding"

There's not really an "encoding" part of this. The decoding refers to
decoding the WAL into something externally usable.


> I assume it is this commit:

That, and
commit 24c5f1a103ce6656a5cb430d9a996c34e61ab2a5
Author: Alvaro Herrera 
Date:   2016-03-30 20:07:05 -0300

Enable logical slots to follow timeline switches

When decoding from a logical slot, it's necessary for xlog reading to be
able to read xlog from historical (i.e. not current) timelines;
otherwise, decoding fails after failover, because the archives are in
the historical timeline.  This is required to make "failover logical
slots" possible; it currently has no other use, although theoretically
it could be used by an extension that creates a slot on a standby and
continues to replay from the slot when the standby is promoted.

This commit includes a module in src/test/modules with functions to
manipulate the slots (which is not otherwise possible in SQL code) in
order to enable testing, and a new test in src/test/recovery to ensure
that the behavior is as expected.

Author: Craig Ringer
Reviewed-By: Oleksii Kliukin, Andres Freund, Petr Jelínek




Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Robert Haas
On Thu, May 4, 2017 at 7:56 PM, Bruce Momjian  wrote:
> The community ships a reliable logical _encoding_, and a test logical
> _decoding_.

As far as I am aware, logical encoding is a term you just made up,
because it's not referenced anywhere in the commit log or the source
tree, unlike logical decoding, which is referenced many times.

[rhaas pgsql]$ git log --grep='logical decoding' --oneline | wc -l
  59
[rhaas pgsql]$ git log --grep='logical encoding' --oneline | wc -l
   0
[rhaas pgsql]$ git grep 'logical decoding' | wc -l
 242
[rhaas pgsql]$ git grep 'logical encoding' | wc -l
   0

> This came up from discussion related to this item:
>
> the ability of logical decoding to follow timeline switches
>
> My point was that based on the text it is test_decoding that can do
> timeline switches, and is that significant enough to mention in the
> release notes?

Actually, that commit message says nothing about test_decoding.  You
seem to be confusing a logical decoding output plugin (of which
test_decoding is an example) with the core logical decoding facilities
(which are what got modified by this commit).  As Andres said, that's
like confusing postgres_fdw with the fact that PostgreSQL supports
foreign data wrappers.

> You can have all the emotional reactions you want.

I'll try to avoid emotional reactions on the list, but I think you
should try to understand how the features work if you're going to
write the release notes.  The email to which I responded was factually
incorrect, and what you said in the email to which I'm replying now
was, too.  If you want to keep insisting otherwise, you have that
right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Thu, May  4, 2017 at 05:09:40PM -0700, Andres Freund wrote:
> > > I would not in any way refer to logical decoding as being only a proof
> > > of concept, even before logical replication.
> > 
> > The community ships a reliable logical _encoding_, and a test logical
> > _decoding_.
> 
> Yes, so what?  What you said is "I didn't think logical decoding was
> really more than a proof-of-concept until now", which is plainly wrong,
> given I know a significant number of users using it in production.  Some
> of them are well known & large enterprises, and it's used to enable
> critical things.

I am getting tired of saying this.  When I am writing the release notes,
I am trying to figure out how it affects our shipped code, and the only
"decoding" I know of is test_decoding.  My message was this:


    https://www.postgresql.org/message-id/a6d13cf7-fbf8-913c-2353-f149c6f95fdc%402ndquadrant.com

    >> Or the ability of logical decoding to follow timeline switches.
    >
    > I didn't think logical decoding was really more than a proof-of-concept
    > until now.

Now, if the "encoding" was changed so the external decoders can do more,
or the API we supply to the decoders has new abilities, that is fine to
mention, but someone needs to say that.  That is an "encoding" change,
right?

I said nothing about external decoders, only the user-interface decoder
we ship.

I assume it is this commit:

Author: Simon Riggs 
2017-03-22 [1148e22a8] Teach xlogreader to follow timeline switches

Teach xlogreader to follow timeline switches

Uses page-based mechanism to ensure we’re using the correct timeline.

Tests are included to exercise the functionality using a cold
disk-level copy of the master that's started up as a replica with
slots intact, but the intended use of the functionality is with later
features.

Craig Ringer, reviewed by Simon Riggs and Andres Freund

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] snapbuild woes

2017-05-04 Thread Andres Freund
On 2017-05-04 17:00:04 -0700, Andres Freund wrote:
> Attached is a prototype patch for that.

Oops.

Andres
>From b6eb46e376e40f3e2e9a55d16b1b37b27904564b Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 4 May 2017 16:40:52 -0700
Subject: [PATCH 1/2] WIP: Fix off-by-one around GetLastImportantRecPtr.

---
 src/backend/postmaster/bgwriter.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index dcb4cf249c..d409d977c0 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -325,10 +325,11 @@ BackgroundWriterMain(void)
 
 			/*
 			 * Only log if enough time has passed and interesting records have
-			 * been inserted since the last snapshot.
+			 * been inserted since the last snapshot (it's <= because
+			 * last_snapshot_lsn points at the end+1 of the record).
 			 */
 			if (now >= timeout &&
-				last_snapshot_lsn < GetLastImportantRecPtr())
+				last_snapshot_lsn <= GetLastImportantRecPtr())
 			{
 				last_snapshot_lsn = LogStandbySnapshot();
 				last_snapshot_ts = now;
-- 
2.12.0.264.gd6db3f2165.dirty

From 7ed2aeb832029f5602566a665b3f4dbe8baedfcd Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 4 May 2017 16:48:00 -0700
Subject: [PATCH 2/2] WIP: Possibly more robust snapbuild approach.

---
 contrib/test_decoding/expected/ondisk_startup.out |  15 +-
 contrib/test_decoding/specs/ondisk_startup.spec   |   8 +-
 src/backend/replication/logical/decode.c  |   3 -
 src/backend/replication/logical/snapbuild.c   | 386 +++---
 src/include/replication/snapbuild.h   |  25 +-
 5 files changed, 215 insertions(+), 222 deletions(-)

diff --git a/contrib/test_decoding/expected/ondisk_startup.out b/contrib/test_decoding/expected/ondisk_startup.out
index 65115c830a..c7b1f45b46 100644
--- a/contrib/test_decoding/expected/ondisk_startup.out
+++ b/contrib/test_decoding/expected/ondisk_startup.out
@@ -1,21 +1,30 @@
 Parsed test spec with 3 sessions
 
-starting permutation: s2txid s1init s3txid s2alter s2c s1insert s1checkpoint s1start s1insert s1alter s1insert s1start
-step s2txid: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT txid_current() IS NULL;
+starting permutation: s2b s2txid s1init s3b s3txid s2alter s2c s2b s2txid s3c s2c s1insert s1checkpoint s1start s1insert s1alter s1insert s1start
+step s2b: BEGIN;
+step s2txid: SELECT txid_current() IS NULL;
 ?column?   
 
 f  
 step s1init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); 
-step s3txid: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT txid_current() IS NULL;
+step s3b: BEGIN;
+step s3txid: SELECT txid_current() IS NULL;
 ?column?   
 
 f  
 step s2alter: ALTER TABLE do_write ADD COLUMN addedbys2 int;
 step s2c: COMMIT;
+step s2b: BEGIN;
+step s2txid: SELECT txid_current() IS NULL;
+?column?   
+
+f  
+step s3c: COMMIT;
 step s1init: <... completed>
 ?column?   
 
 init   
+step s2c: COMMIT;
 step s1insert: INSERT INTO do_write DEFAULT VALUES;
 step s1checkpoint: CHECKPOINT;
 step s1start: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', 'false');
diff --git a/contrib/test_decoding/specs/ondisk_startup.spec b/contrib/test_decoding/specs/ondisk_startup.spec
index 8223705639..12c57a813d 100644
--- a/contrib/test_decoding/specs/ondisk_startup.spec
+++ b/contrib/test_decoding/specs/ondisk_startup.spec
@@ -24,7 +24,8 @@ step "s1alter" { ALTER TABLE do_write ADD COLUMN addedbys1 int; }
 session "s2"
 setup { SET synchronous_commit=on; }
 
-step "s2txid" { BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT txid_current() IS NULL; }
+step "s2b" { BEGIN; }
+step "s2txid" { SELECT txid_current() IS NULL; }
 step "s2alter" { ALTER TABLE do_write ADD COLUMN addedbys2 int; }
 step "s2c" { COMMIT; }
 
@@ -32,7 +33,8 @@ step "s2c" { COMMIT; }
 session "s3"
 setup { SET synchronous_commit=on; }
 
-step "s3txid" { BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT txid_current() IS NULL; }
+step "s3b" { BEGIN; }
+step "s3txid" { SELECT txid_current() IS NULL; }
 step "s3c" { COMMIT; }
 
 # Force usage of ondisk snapshot by starting and not finishing a
@@ -40,4 +42,4 @@ step "s3c" { COMMIT; }
 # reached. In combination with a checkpoint forcing a snapshot to be
 # written and a new restart point computed that'll lead to the usage
 # of the snapshot.
-permutation "s2txid" "s1init" "s3txid" "s2alter" "s2c" "s1insert" "s1checkpoint" "s1start" "s1insert" "s1alter" "s1insert" "s1start"
+permutation "s2b" "s2txid" "s1init" "s3b" "s3txid" "s2alter" "s2c" "s2b" "s2txid" "s3c" "s2c" "s1insert" "s1checkpoint" "s1start" "s1insert" "s1alter" "s1insert" "s1start"
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 5c13d26099..68825ef598 100644
--- 

[HACKERS] no test coverage for ALTER FOREIGN DATA WRAPPER name HANDLER ...

2017-05-04 Thread Mark Dilger
Hackers,

just FYI, I cannot find any regression test coverage of this part of the
grammar, not even in the contrib/ directory or TAP tests.  I was going to
submit a patch to help out, and discovered it is not so easy to do, and
perhaps that is why the coverage is missing.

My apologies for the noise if this is already common knowledge.  Also, if
I'm wrong, I'd appreciate a pointer to the test that I am failing to find.
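
For reference, a minimal sketch of what exercising that clause would take
(all names here are hypothetical).  The snag is that a handler must be a
C-language function returning fdw_handler, which a plain SQL regression
test cannot create on its own:

    CREATE FOREIGN DATA WRAPPER dummy_fdw;
    -- the handler needs a compiled module; not possible from pure SQL
    CREATE FUNCTION dummy_fdw_handler() RETURNS fdw_handler
        AS 'dummy_fdw_lib' LANGUAGE C STRICT;
    ALTER FOREIGN DATA WRAPPER dummy_fdw HANDLER dummy_fdw_handler;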

Thanks,

Mark Dilger

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Andres Freund
On 2017-05-04 19:56:21 -0400, Bruce Momjian wrote:
> On Mon, May  1, 2017 at 08:02:46AM -0400, Robert Haas wrote:
> This came up from discussion related to this item:
> 
>   the ability of logical decoding to follow timeline switches
> 
> My point was that based on the text it is test_decoding that can do
> timeline switches, and is that significant enough to mention in the
> release notes?

FWIW, the relevant commit neither mentions nor modifies test_decoding
(It does add tests that *use* test_decoding, but that's it). Nor did
Petr's comment reference test_decoding.  So I don't see how "based on
the text it is test_decoding" follows.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Thu, May  4, 2017 at 07:01:17PM -0300, Alvaro Herrera wrote:
> Thanks for doing this, looks great.  A few notes:
> 
>   
>
>   
>Add the ability to compute a correlation ratio and the number of
>distinct values on several columns (Tomas Vondra, David Rowley)
>   
> 
> I think this should be worded in terms of "extended data statistics" or
> such.  I think your proposed wording is a bit obscure.  How about
> something like "Add extended statistics to improve query planning".
> Also, I'd add myself as co-author, with Tomas' permission.

I have adjusted the text to add your term, and added your name:

   Add multi-column optimizer statistics to compute the correlation
   ratio and number of distinct values (Tomas Vondra, David Rowley,
   Álvaro Herrera)

I think we have to mention the exact statistics collected because we
know that they are of limited usefulness (per Tomas) and full
multi-column statistics are needed and hopefully coming in PG 11.  If we
don't mention the details I am afraid people will be disappointed with PG
10 and will not be excited when they are more powerful in PG 11.  Any
better wording?

I will work on the other items you posted shortly.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Joe Conway
On 05/04/2017 05:03 PM, Craig Ringer wrote:
> On 5 May 2017 02:52, "Tom Lane" wrote:
> I haven't been keeping close tabs either, but surely we still have
> to have
> the optimization fence in (at least) all these cases:
> 
> * CTE contains INSERT/UPDATE/DELETE
> * CTE contains SELECT FOR UPDATE/SHARE (else the set of rows that get
>   locked might change)
> * CTE contains volatile functions
> 
> I'm willing to write off cases where, eg, a function should have been
> marked volatile and was not.  That's user error and there are plenty
> of hazards of that kind already.  But if the optimizer has reason
> to know that discarding the fence might change any query side-effects,
> it mustn't.
> 
> I think everyone is in total agreement there. 

That's great, but if so, why do we need any change in syntax at all?

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] CTE inlining

2017-05-04 Thread Craig Ringer
On 5 May 2017 06:04, "Andreas Karlsson"  wrote:

On 05/04/2017 06:22 PM, Andrew Dunstan wrote:

> I wrote this query:
>
> select (json_populate_record(null::mytype, myjson)).*
> from mytable;
>
>
> It turned out that this was an order of magnitude faster:
>
> with r as
> (
>select json_populate_record(null::mytype, myjson) as x
>from mytable
> )
> select (x).*
> from r;
>

I do not know the planner that well, but I imagined that when we remove the
optimization fence that one would be evaluated similar to if it had been a
lateral join, i.e. there would be no extra function calls in this case
after removing the fence.


Sort of. PostgreSQL has a wart around (x).* expansion where it's
essentially macro-expanded into

(x).a, (x).b, (x).c, ...

Now, if x is a function call, PG will merrily execute it n times for its n
output columns.

Andres is working on fixing this. And it's trivially worked around with a
lateral query; the above would be better written as

select (x).*
from mytable
cross join lateral json_populate_record(null::mytype, myjson) as x;

So this example just abuses our optimiser hint behaviour for CTEs to avoid
solving a planner issue (which is why project policy is against hints). But
there's already a solution.

I'm finding it increasingly hilarious watching people vociferously
defending their one precious (semi-documented) query/optimiser hint in
PostgreSQL. The one we don't admit is a hint, but treat as one by avoiding
optimising across it when it's safe to do so.

We can't remove or change our precious hint because we need it to solve
production issues. But we don't have hints because then people wouldn't
report planner/optimiser issues, would lock in bad plans and then complain
about it, etc.

Just like what's happening here. And people are leaping to defend it, lest
we risk exposing performance issues by changing anything, even though all
we're doing is documenting what is already so.

Hey. Crazy idea for backward compat to address Tom's complaint that adding
explicit syntax would require people who wanted the old behaviour to make
their queries incompatible with pg10 and below. Add the "MATERIALIZED"
keyword or whatever. Then back-patch the keyword as a no-op, since that's
what we already do in back branches. I can't see anything that could
possibly break in that context so long as we only go as far back as it was
already a keyword elsewhere.

We could at least add it to pg10.


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Thu, May  4, 2017 at 06:02:58PM -0300, Alvaro Herrera wrote:
> > I can't see how this can be added to an existing BRIN entry, so it would
> > have to be new.  The text would be:
> > 
> > Improve accuracy in determining if a BRIN index scan is beneficial
> > 
though this is not something I would normally mention because most users
> > don't understand the optimizer choices and just assume it works.
> 
> The problem is that previously it was possible for a BRIN index to be
> created, and cause some queries to choose it which were faster using
> some other index (a btree).  The point is that with this change it
> becomes more credible to create BRIN indexes without fear that such
> queries are going to slow down.
> 
> Your proposed text sounds good to me.  Authors would be David Rowley and
> Emre Hasegeli.

OK, added, thanks:


 
+
+ Improve accuracy in determining if a BRIN index scan
+ is beneficial (David Rowley, Emre Hasegeli)
+
+   

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Andres Freund
On 2017-05-04 19:56:21 -0400, Bruce Momjian wrote:
> On Mon, May  1, 2017 at 08:02:46AM -0400, Robert Haas wrote:
> > On Tue, Apr 25, 2017 at 11:01 AM, Bruce Momjian  wrote:
> > >> Or the ability of logical decoding to follow timeline switches.
> > >
> > > I didn't think logical decoding was really more than a proof-of-concept
> > > until now.
> > 
> > /me searches for jaw on floor.
> > 
> > It sounds like you don't understand how logical decoding works.  There
> > are plugins -- fairly widely used, I think -- like
> > https://github.com/confluentinc/bottledwater-pg and
> > https://github.com/eulerto/wal2json which use the in-core
> > infrastructure to do very nifty things, much like there are foreign
> > data wrappers other than postgres_fdw.  Even test_decoding is (perhaps
> > regrettably) being used to build production solutions.  The point is
> > that most of the logic is in core; test_decoding or bottlewater or
> > wal2json are just small plugins that tap into that infrastructure.
> > 
> > I would not in any way refer to logical decoding as being only a proof
> > of concept, even before logical replication.
> 
> The community ships a reliable logical _encoding_, and a test logical
> _decoding_.

Yes, so what?  What you said is "I didn't think logical decoding was
really more than a proof-of-concept until now", which is plainly wrong,
given I know a significant number of users using it in production.  Some
of them are well known & large enterprises, and it's used to enable
critical things.


On Mon, May  1, 2017 at 08:02:46AM -0400, Robert Haas wrote:
> Even test_decoding is (perhaps regrettably) being used to build production 
> solutions.

E.g. to power Amazon's data migration service (yes, that scares me).


> My point was that based on the text it is test_decoding that can do
> timeline switches, and is that significant enough to mention in the
> release notes?  Now, if it is that logical "encoding" now allows
> external logical decoding modules to handle timeline switches, that is
> different, but no one has said that yet.

The change has nothing to do with test_decoding.


Petr: The timeline change itself does, for the moment, not seem
particularly noteworthy to me - it's not really useful atm on its own?
To me it's more of infrastructure to add "logical decoding on standby"
next release?


> You can have all the emotional reactions you want.

Nice one.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Mon, May  1, 2017 at 10:20:38AM -0400, Robert Haas wrote:
> I'm pretty sure this is not the first year in which your policy of
> excluding certain performance-related items has met with opposition.
> I agree that there are some improvements that are sufficiently small
> and boring that they do not merit a mention, but (1) that's also true
> of non-performance items and (2) the fact that people keep complaining
> about performance items you excluded constitutes a discussion of
> changing your filter.
> 
> My own opinion is that the filter should not be particularly different
> for performance items vs. non-performance.  The question ought to be
> whether the change is significant enough that users are likely to
> care.  If you've got several people saying "hey, you forgot NN in
> the release notes!" it is probably a good bet that the change is
> significant enough that users will care about it.

Yes, the "do people care" filter is what I usually use.  When new
functionality is added, we usually mention it because someone usually
cares, but for performance, the threshold is usually whether workloads,
even rare ones, would have a visible performance change.  It is
difficult to determine this from the release notes, which is why I
always need feedback on these items.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Craig Ringer
On 5 May 2017 02:52, "Tom Lane"  wrote:

Tomas Vondra  writes:
> On 5/4/17 8:03 PM, Joe Conway wrote:
>>> I haven't been able to follow this incredibly long thread, so please
>>> excuse me if way off base, but are we talking about that a CTE would be
>>> silently be rewritten as an inline expression potentially unless it is
>>> decorated with some new syntax?

> I agree with this, but there's a difference between "executed exactly
> once" and "producing the same result as if executed exactly once".

> I may be misunderstanding what other people proposed in this thread, but
> I think the plan was to only inline CTEs where we know it won't change
> the results, etc. So e.g. CTEs with volatile functions would not get
> inlined, which includes nextval() for example.

I haven't been keeping close tabs either, but surely we still have to have
the optimization fence in (at least) all these cases:

* CTE contains INSERT/UPDATE/DELETE
* CTE contains SELECT FOR UPDATE/SHARE (else the set of rows that get
  locked might change)
* CTE contains volatile functions

I'm willing to write off cases where, eg, a function should have been
marked volatile and was not.  That's user error and there are plenty
of hazards of that kind already.  But if the optimizer has reason
to know that discarding the fence might change any query side-effects,
it mustn't.
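
(For illustration, a minimal sketch of the first case, with hypothetical
table names; a data-modifying CTE must run exactly once no matter how the
outer query references it:

    WITH moved AS (
        DELETE FROM queue WHERE done RETURNING *
    )
    INSERT INTO archive SELECT * FROM moved;

Inlining that WITH clause could duplicate or skip the DELETE's side
effects, so the fence has to stay.)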


I think everyone is in total agreement there.


Re: [HACKERS] snapbuild woes

2017-05-04 Thread Andres Freund
Hi,

On 2017-05-02 08:55:53 +0200, Petr Jelinek wrote:
> Aah, now I understand we talked about slightly different things, I
> considered the running thing to be first step towards tracking aborted
> txes everywhere.
> I think
> we'll have to revisit tracking of aborted transactions in PG11 then
> though because of the 'snapshot too large' issue when exporting, at
> least I don't see any other way to fix that.

FWIW, that seems unnecessary - we can just check for that using the
clog.  Should be very simple to check for aborted xacts when exporting
the snapshot (like 2 lines + comments).  That should address your
concern, right?


> If you think that adding the SNAPBUILD_BUILD_INITIAL_SNAPSHOT would be
> less invasive/smaller patch I am okay with doing that for PG10.

Attached is a prototype patch for that.

What I decided is that essentially tracking the running xacts is too
unreliable due to the race, so I decided that just relying on
oldestRunningXid and nextXid - which are solely in the procArray and
thus racefree - is better.

It's not perfect yet, primarily because we'd need to take a bit more
care about being ABI compatible for older releases, and because we'd
probably have to trigger LogStandbySnapshot() a bit more frequently
(presumably while waiting for WAL).  The change means we'll have to wait
a bit longer for slot creation, but it's considerably simpler / more
robust.

Could you have a look?

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Mon, May  1, 2017 at 08:02:46AM -0400, Robert Haas wrote:
> On Tue, Apr 25, 2017 at 11:01 AM, Bruce Momjian  wrote:
> >> Or the ability of logical decoding to follow timeline switches.
> >
> > I didn't think logical decoding was really more than a proof-of-concept
> > until now.
> 
> /me searches for jaw on floor.
> 
> It sounds like you don't understand how logical decoding works.  There
> are plugins -- fairly widely used, I think -- like
> https://github.com/confluentinc/bottledwater-pg and
> https://github.com/eulerto/wal2json which use the in-core
> infrastructure to do very nifty things, much like there are foreign
> data wrappers other than postgres_fdw.  Even test_decoding is (perhaps
> regrettably) being used to build production solutions.  The point is
> that most of the logic is in core; test_decoding or bottlewater or
> wal2json are just small plugins that tap into that infrastructure.
> 
> I would not in any way refer to logical decoding as being only a proof
> of concept, even before logical replication.

The community ships a reliable logical _encoding_, and a test logical
_decoding_.

This came up from discussion related to this item:

the ability of logical decoding to follow timeline switches

My point was that based on the text it is test_decoding that can do
timeline switches, and is that significant enough to mention in the
release notes?  Now, if it is that logical "encoding" now allows
external logical decoding modules to handle timeline switches, that is
different, but no one has said that yet.

You can have all the emotional reactions you want.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Potential issue with alter system

2017-05-04 Thread Joshua D. Drake

On 05/04/2017 12:49 PM, Tom Lane wrote:

"Joshua D. Drake"  writes:

So I did this:




If you have other entries you want to keep in the postgresql.auto.conf
file, you could get away with manually editing it to remove the newline.


Got it. Thanks for digging in. This is actually a very real and easy way 
to explain to people why they need to keep up to date on their dot releases.


Thanks,

JD



regards, tom lane




--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Fri, Apr 28, 2017 at 01:12:34PM +0900, Masahiko Sawada wrote:
> On Tue, Apr 25, 2017 at 10:31 AM, Bruce Momjian  wrote:
> > I have committed the first draft of the Postgres 10 release notes.  They
> > are current as of two days ago, and I will keep them current.  Please
> > give me any feedback you have.
> >
> 
> Related to the following item in the release notes, oldestxmin is also
> reported by VACUUM VERBOSE and autovacuum, which was introduced by
> commit 9eb344faf54a849898d9be012ddfa8204cfeb57c (author is Simon).
> 
> * Have VACUUM VERBOSE report the number of skipped frozen pages
> (Masahiko Sawada)
> 
> Could we add this item to the release note?

OK, adjusted text below and commit added above this item:

   Have VACUUM VERBOSE report
   the number of skipped frozen pages and oldest xmin (Masahiko Sawada)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Thu, Apr 27, 2017 at 04:05:09PM +0100, Dagfinn Ilmari Mannsåker wrote:
> Bruce Momjian  writes:
> 
> > I have committed the first draft of the Postgres 10 release notes.  They
> > are current as of two days ago, and I will keep them current.  Please
> > give me any feedback you have.
> 
> I noticed a few niggles with the links in "my" entires, patch attached.

Thanks, applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Thu, Apr 27, 2017 at 03:04:57PM +0200, Daniel Verite wrote:
>   Fabien COELHO wrote:
> 
> >Fix psql \p to always print what would be executed by \g or \w (Daniel
> >Vérité)
> > 
> >Previously \p didn't properly print the reverted-to command after a
> >buffer contents reset. CLARIFY?
> > 
> > The fix is linked to the change introduced by Tom when 
> > refactoring/cleaning up in e984ef586 (\if) which changed psql's \p 
> > behavior.
> 
> That's my recollection as well. The "Previously" does not refer to 9.6,
> but to that commit.

Yes, adjusted.

> > I'm not sure how this should appear in the release notes. Maybe not at 
> > all, associated to the feature in which the behavioral change was 
> > introduced...
> 
> There is small change of behavior coming as a by-product of the
> introduction of  /if.../endif blocks.
> 
> When doing in 9.x:
> 
> select '1st buffer' \g
> followed by \e
> and editing with select '2nd buffer' (without ending the query)
> and then back in psql doing '\r' and '\p', the result is
> select '2nd buffer'
> 
> The same with v10 leads instead to
> select '1st buffer'
> 
> I'm not sure whether it's above the level of detail worth being mentioned
> in the release notes.

Wow, I am not sure how to even explain that.  ;-)  Let's see if we get
any user feedback on the change.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Thu, Apr 27, 2017 at 10:43:34AM +0200, Fabien COELHO wrote:
> 
> Hello Bruce,
> 
> >I have committed the first draft of the Postgres 10 release notes.  They
> >are current as of two days ago, and I will keep them current.  Please
> >give me any feedback you have.
> 
> About:
> 
> """
>   Fix psql \p to always print what would be executed by \g or \w (Daniel
>   Vérité)
> 
>   Previously \p didn't properly print the reverted-to command after a
>   buffer contents reset. CLARIFY?
> """
> 
> The fix is linked to the change introduced by Tom when refactoring/cleaning
> up in e984ef586 (\if) which changed psql's \p behavior.
> 
> This is not a user-visible change version-to-version; it is more like a bug
> fix for a bug/behavioral change introduced within the pg10 development
> process itself.
> 
> I'm not sure how this should appear in the release notes. Maybe not at all,
> associated to the feature in which the behavioral change was introduced...

Agreed, I removed the item and moved those commits to the \if item.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Bruce Momjian
On Thu, Apr 27, 2017 at 10:21:44AM +0500, Andrew Borodin wrote:
> Hi, Bruce!
> 
> 2017-04-25 6:31 GMT+05:00 Bruce Momjian :
> > The only unusual thing is that this release has ~180 items while most
> > recent release have had ~220.  The pattern I see that there are more
> > large features in this release than previous ones.
> 
> I'm not sure, but it is probably worth mentioning this [1,2] in the
> E.1.3.1.2. Indexes section.
> Better tuple management on the GiST page allows faster inserts and
> updates. (In numbers: 15% for randomized data, 3x for ordered data in
> specific cases.)
> 
> [1] https://commitfest.postgresql.org/10/661/
> [2] 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b1328d78f88cdf4f7504004159e530b776f0de16

OK, got it.  Here is the new item:

   Author: Tom Lane 
   2016-09-09 [b1328d78f] Invent PageIndexTupleOverwrite, and
   teach BRIN and GiST

   Allow faster GiST inserts and updates by reusing
   index space more efficiently (Andrey Borodin)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication syntax (was DROP SUBSCRIPTION, query cancellations and slot handling)

2017-05-04 Thread Petr Jelinek
On 04/05/17 23:29, Tom Lane wrote:
> Robert Haas  writes:
>> On Wed, May 3, 2017 at 12:38 AM, Petr Jelinek
>>  wrote:
>>> Ok, Let me be clear, I actually happen to agree with your proposal. The
>>> reason I am moaning is that I always seem to find myself doing tons of
>>> mechanical work to rewrite some cosmetic aspect of some patch based on
>>> which committer is paying attention in a specific week. So while I am
>>> for doing exactly what you proposed, I'd like to see couple of +1s first
>>> (Peter?) since I don't want to rewrite it to something different again
>>> and it's also long past freeze.
> 
>> So, Tom Lane and Thom Brown and Josh Drake all seemed generally in
>> favor of cleaning this up.  Perhaps they could opine on this
>> particular proposal.
> 
> It seems like there's some remaining indecision between "make it look
> like the options in EXPLAIN, VACUUM, etc" and "make it look like the
> WITH options found in some other statements".  I do not have a strong
> opinion which one to do, but I'd definitely say that you should use WITH
> in the latter case but not in the former.  I think this mostly boils down
> to whether to use "=" or not; you've got "not" in the proposal, which
> means you are following the EXPLAIN precedent and should not use WITH.
> 

Okay, here is my initial attempt at changing this. I opted for WITH and
"=" as I like it slightly better (also the generic_options expect values
to be quoted which I don't like and then I would have to again invent
something like generic_options but not quite the same).

Most of the changes go to doc and tests, not that much code has changed
as I already used the definition (which is the parser's name for these
WITH things). Except that I removed the NO shorthands and changed
publish_insert, etc. to publish = 'insert,...'. I also changed the
NOREFRESH to SKIP REFRESH.

I didn't touch the DROP SUBSCRIPTION slot handling so far; that needs to
be a separate patch as there is a behavior change there, while this is
purely cosmetic, and IMHO it's better not to mix those. (I plan to send a
patch for that which changes the behavior heavily soonish as well.)
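
For illustration, a command under this rework would look roughly like the
following (a sketch; object names hypothetical):

    CREATE PUBLICATION mypub FOR TABLE t1
        WITH (publish = 'insert, update');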

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


Rework-the-options-for-logical-replication.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Andres Freund
On 2017-05-04 17:33:13 -0500, Merlin Moncure wrote:
> On Mon, May 1, 2017 at 7:02 AM, Robert Haas  wrote:
> > On Tue, Apr 25, 2017 at 11:01 AM, Bruce Momjian  wrote:
> >> I didn't think logical decoding was really more than a proof-of-concept
> >> until now.
> >
> > /me searches for jaw on floor.

Yea, this kind of argument is getting really old.  People, including
Robert and me in this thread, have put tremendous amounts of work into
it.  Not fun to just have that discounted.


> > I would not in any way refer to logical decoding as being only a proof
> > of concept, even before logical replication.
> 
> That's fair, but I think I understand what Bruce was going for here.
> Data point: github third party modules are generally not approved for
> deployment in my organization so logical decoding from a production
> perspective does not exist (for me) until 10.0.  Point being, an
> important threshold has been crossed.

By that argument the extension APIs, which after all are what external
FDWs, external index types, postgis, and other extensions use, aren't a
feature of postgres.  Some sites not being able to use external
extensions is *one* argument for building some things into core, but
that doesn't mean the extension APIs don't exist or aren't features.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Andres Freund
On 2017-04-25 15:29:01 -0400, Bruce Momjian wrote:
> Uh, the only logical decoding code that I know we ship pre-PG 10 is
> contrib/test_decoding/.

That's completely wrong.  src/backend/replication/logical/ is a bit
bigger than that...

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Merlin Moncure
On Mon, May 1, 2017 at 7:02 AM, Robert Haas  wrote:
> On Tue, Apr 25, 2017 at 11:01 AM, Bruce Momjian  wrote:
>> I didn't think logical decoding was really more than a proof-of-concept
>> until now.
>
> /me searches for jaw on floor.
>
> I would not in any way refer to logical decoding as being only a proof
> of concept, even before logical replication.

That's fair, but I think I understand what Bruce was going for here.
Data point: github third party modules are generally not approved for
deployment in my organization so logical decoding from a production
perspective does not exist (for me) until 10.0.  Point being, an
important threshold has been crossed.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-04 Thread Sven R. Kunze

On 04.05.2017 23:13, Tom Lane wrote:

I'm not against what you've done here, because I had no love for USING
in this context anyway; it conveys approximately nothing to the mind
about what is in the list it's introducing.  But I'm concerned whether
we're boxing ourselves in by using ON.

Actually, "ON" doesn't seem all that mnemonic either.  Maybe "FOR"
would be a good substitute, if it turns out that "ON" has a problem?


The whole syntax reminds me of a regular SELECT clause. So, SELECT?


Also, considering the most generic form of statistics support mentioned in
[1], one could even think about allowing aggregates, window functions,
etc., i.e. the full SELECT clause, in the future.



Sven


[1] 
https://www.postgresql.org/message-id/CAEZATCUtGR+U5+QTwjHhe9rLG2nguEysHQ5NaqcK=vbj78v...@mail.gmail.com 




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Andreas Karlsson

On 05/04/2017 06:22 PM, Andrew Dunstan wrote:

I wrote this query:

select (json_populate_record(null::mytype, myjson)).*
from mytable;


It turned out that this was an order of magnitude faster:

with r as
(
   select json_populate_record(null::mytype, myjson) as x
   from mytable
)
select (x).*
from r;


I do not know the planner that well, but I imagined that when we remove 
the optimization fence that one would be evaluated similar to if it had 
been a lateral join, i.e. there would be no extra function calls in this 
case after removing the fence.


Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] what's up with IDENTIFIER_LOOKUP_EXPR?

2017-05-04 Thread Tom Lane
Robert Haas  writes:
> The PLPGSQL_DTYPE_* constants are another thing that's not really
> documented.

Yeah :-(.  Complain to Jan sometime.

> You've mentioned that we should get rid of
> PLPGSQL_DTYPE_ROW in favor of, uh, whatever's better than that, but
> it's not clear to me what that really means, why one way is better
> than the other way, or what is involved.  I'm not really clear on what
> a PLpgSQL_datum_type is in general, or what any of the specific types
> actually mean.  I'll guess that PLPGSQL_DTYPE_VAR is a variable, but
> beyond that...

Well, from memory ...

PLpgSQL_datum is really a symbol table entry.  The conflict against what
we mean by "Datum" elsewhere is pretty unfortunate.

VAR - variable of scalar type (well, any non-composite type ---
arrays and ranges are this kind of PLpgSQL_datum too, for instance).

REC - variable of composite type, stored as a HeapTuple.

RECFIELD - reference to a field of a REC variable.  The datum includes
the field name and a link to the datum for the parent REC variable.
Notice this implies a runtime lookup of the field name whenever we're
accessing the datum; which sucks for performance but it makes life a
lot easier when you think about the possibility of the composite type
changing underneath you.

ARRAYELEM - reference to an element of an array.  The datum includes
a subscript expression and a link to the datum for the parent array
variable (which can be a VAR, and I think a RECFIELD too).

ROW - this is where it gets fun.  A ROW is effectively a variable
of a possibly-anonymous composite type, and it is defined by a list
(in its own datum) of links to PLpgSQL_datums representing the
individual columns.  Typically the member datums would be VARs
but that's not the only possibility.

As I mentioned earlier, the case that ROW is actually well adapted
for is multiple targets in INTO and similar constructs.  For example,
if you have

SELECT ...blah blah... INTO a,b,c

then the target of the PLpgSQL_stmt_execsql is represented as a single
ROW datum whose members are the datums for a, b, and c.  That's totally
determined by the text of the function and can't change under us.

However ... somebody thought it'd be cute to use the ROW infrastructure
for variables of named composite types, too.  So if you have

DECLARE foo some_composite_type;

then the name "foo" refers to a ROW datum, and the plpgsql compiler
generates additional anonymous VAR datums, one for each declared column
in some_composite_type, which become the members of the ROW datum.
The runtime representation is actually that each field value is stored
separately in its datum, as though it were an independent VAR.  Field
references "foo.col1" are not compiled into RECFIELD datums; we just look
up the appropriate member datum during compile and make the expression
tree point to that datum directly.

So, this representation is great for speed of access and modification
of individual fields of the composite variable.  It sucks when you
want to assign to the composite as a whole or retrieve its value as
a whole, because you have to deconstruct or reconstruct a tuple to
do that.  (The REC/RECFIELD approach has approximately the opposite
strengths and weaknesses.)  Also, dealing with changes in the named
composite type is a complete fail, because we've built its structure
into the function's symbol table at parse time.
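
For a concrete illustration (a sketch; names hypothetical):

    CREATE TYPE some_composite_type AS (col1 int, col2 text);

    DO $$
    DECLARE
        a int;
        b int;
        foo some_composite_type;  -- a ROW datum; members are anonymous VARs
    BEGIN
        -- the two INTO targets are gathered into a single ROW datum
        SELECT 1, 2 INTO a, b;
        -- resolved at compile time to the member VAR; no RECFIELD lookup
        foo.col1 := a;
    END
    $$;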

I forget why there's a dtype for EXPR.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP Patch: Precalculate stable functions, infrastructure v1

2017-05-04 Thread Alexander Korotkov
On Thu, May 4, 2017 at 7:51 PM, Marina Polyakova  wrote:

> and here I send infrastructure patch which includes <...>
>>
>
> Next 2 patches:
>
> Patch 'planning and execution', which includes:
> - replacement nonvolatile functions and operators by appropriate cached
> expressions;
> - planning and execution cached expressions;
> - regression tests.
>
> Patch 'costs', which includes cost changes for cached expressions
> (according to their behaviour).


Great, thank you for your work.
It's good and widely used practice to prepend a number to the patch name
when dealing with a patch set.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Alvaro Herrera
Thanks for doing this, looks great.  A few notes:

  
   
  
   Add the ability to compute a correlation ratio and the number of
   distinct values on several columns (Tomas Vondra, David Rowley)
  

I think this should be worded in terms of "extended data statistics" or
such.  I think your proposed wording is a bit obscure.  How about
something like "Add extended statistics to improve query planning".
Also, I'd add myself as co-author, with Tomas' permission.

  
   
   
Cause BRIN index summarization to happen more
aggressively (Álvaro Herrera)
   

   
Specifically, summarize the previous page range when a new page
range is created.
   
  

I think it should be pointed out that this is optional and defaults to
off.  Maybe start with "add option to ..."  (I wonder if it's worth
creating a linkable area in the docs that explains this feature, so that
we could have a link in the relnotes entry).


  
   
   
Add function brin_desummarize_range() to remove
BRIN summarization of a specified range (Álvaro
Herrera)
   

   
This allows future BRIN index summarization to be
more compact.  CLARIFY
   
  

The point of this change is that if data has changed (delete, update) in
a way that could use tighter min/max limits, it would be worthwhile to
remove the previous BRIN tuple so that a new one is created, letting
future scans skip the range.  Maybe word it something like
"This is useful if UPDATE and DELETE have changed the data to tighter
limits; a future BRIN index entry will be more accurate"?
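
For illustration (a sketch; the index name is hypothetical):

    -- drop the summary for the page range containing block 0; the next
    -- summarization pass can then build a tighter entry from current data
    SELECT brin_desummarize_range('my_brin_idx'::regclass, 0);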


 
  
  
   Add support for converting XML-formatted data into a row
set (Pavel Stehule, Álvaro Herrera)
  
XMLTABLE is a sql-standard-specified construct, so we should mention it
by name in the leading paragraph more prominently ("add support for the
XMLTABLE standard feature" or something like that); also, I think it
should be in the Queries section, not Functions.
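
For reference, the construct in question looks like this (a sketch; table
and column names hypothetical):

    SELECT t.*
    FROM xmldata,
         XMLTABLE('/rows/row' PASSING data
                  COLUMNS id int PATH '@id',
                          name text PATH 'name') AS t;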


   
Add GUC max_parallel_workers
to limit the number of worker processes that can be used for
parallelism (Julien Rouhaud)
   

   
This can be set lower than max_worker_processes to reserve worker
processes for non-parallel purposes.
   

Strange last phrase.  I suggest "...to reserve worker processes for
purposes other than parallel queries".  Also in the leading para, "for
parallelism" should probably be "for query parallelism".


I think commit e306df7f9 ("Full Text Search support for JSON
and JSONB") does definitely not belong to section Indexes; I
think Data Types is a better fit.


I think commits 67dc4ccbb and 1753b1b02 (pg_sequence and pg_sequences)
should be listed in the opposite order.  Maybe merge them into one?


   
This proves better security than the existing md5
negotiation and storage method.
   
You probably mean "provides" not "proves".


   
Allow SSL configuration to be updated at
SIGHUP (Andreas Karlsson, Tom Lane)
   
SIGHUP seems much too technical.  How about "... to be updated during
configuration reload"


I think the entry for commits a734fd5d1 and dafa0848d does not belong in
the reliability section.  In fact I wonder if it's necessary to list
this one at all.



 Increase the maximum configurable WAL size to 1
 gigabyte (Beena Emerson)

"WAL segment size" perhaps, and note that this is useful to reduce
frequency of archive_command?


   
Also add -nosync option to disable fsync.
   
Misspelled --no-sync.


   
Add log options for pg_ctl wait (--wait)
and no-wait (--no-wait) (Vik Fearing)
   
Misspelled "long options".


-- 
Álvaro Herrera        https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row Level Security UPDATE Confusion

2017-05-04 Thread Stephen Frost
Robert, all,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Apr 14, 2017 at 9:16 AM, Stephen Frost  wrote:
> > I agreed already up-thread that there's an issue there and will be
> > looking to fix it.  That comment was simply replying to Rod's point that
> > the documentation could also be improved.
> 
> OK, thanks.  The wrap for the next set of minor releases is, according
> to my understanding, scheduled for Monday, so you'd better jump on
> this soon if you're hoping to get a fix out this time around.

I've worked out what's happening here and it's because the ALL policy
has both USING and WITH CHECK that it's not acting the same as the
SELECT policy (which can only have USING).  add_with_check_quals() is
what determines if the WITH CHECK policy or the USING policy should be
used (through a bit of a grotty #define, if you ask me..).

I've been considering how best to fix it.  The two main options are to
use a different WCOKind and then track that through, which might be nice
as we might be able to provide a more useful error message in that case,
or to just add an additional flag to add_with_check_quals() to say
"always add the USING clause when this flag is true."

Either way, I expect to wrap this up either later tonight or tomorrow.

Thanks!

Stephen




Re: [HACKERS] pg_dump emits ALTER TABLE ONLY partitioned_table

2017-05-04 Thread Stephen Frost
Amit,

* Amit Langote (amitlangot...@gmail.com) wrote:
> On Wed, May 3, 2017 at 12:05 PM, Stephen Frost  wrote:
> > Assuming this looks good to you, I'll push it tomorrow, possibly with
> > other minor adjustments and perhaps a few more tests.
> 
> Your latest patch looks good to me.

Found a few more issues (like pg_dump not working against older versions
of PG, because the queries for older versions hadn't been adjusted) and
did a bit more tidying.

I'll push this in a couple of hours.

Thanks!

Stephen
From ec25e5ecf44d461a93187999d3a491eae49b587f Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Thu, 4 May 2017 17:14:51 -0400
Subject: [PATCH] Change the way pg_dump retrieves partitioning info

This gets rid of the code that issued separate queries to retrieve the
partitioning parent-child relationship, parent partition key, and child
partition bound information.  With this patch, the information is
retrieved instead using the queries issued from getTables() and
getInherits(), which is both more efficient than the previous approach
and doesn't require any new code.

Since the partitioning parent-child relationship is now retrieved with
the same old code that handles inheritance, partition attributes receive
a proper flagInhAttrs() treatment (that it didn't receive before), which
is needed so that the inherited NOT NULL constraints are not emitted if
we already emitted it for the parent.

Also, fix a bug in pg_dump's --binary-upgrade code, which caused pg_dump
to emit an invalid command to attach a partition to its parent.

Author: Amit Langote, with some additional changes by me.
---
 src/bin/pg_dump/common.c |  90 -
 src/bin/pg_dump/pg_dump.c| 264 +--
 src/bin/pg_dump/pg_dump.h|  15 +--
 src/bin/pg_dump/t/002_pg_dump.pl |  36 +-
 4 files changed, 153 insertions(+), 252 deletions(-)

diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index e2bc357..47191be 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -68,8 +68,6 @@ static int	numextmembers;
 
 static void flagInhTables(TableInfo *tbinfo, int numTables,
 			  InhInfo *inhinfo, int numInherits);
-static void flagPartitions(TableInfo *tblinfo, int numTables,
-			  PartInfo *partinfo, int numPartitions);
 static void flagInhAttrs(DumpOptions *dopt, TableInfo *tblinfo, int numTables);
 static DumpableObject **buildIndexArray(void *objArray, int numObjs,
 Size objSize);
@@ -77,8 +75,6 @@ static int	DOCatalogIdCompare(const void *p1, const void *p2);
 static int	ExtensionMemberIdCompare(const void *p1, const void *p2);
 static void findParentsByOid(TableInfo *self,
  InhInfo *inhinfo, int numInherits);
-static void findPartitionParentByOid(TableInfo *self, PartInfo *partinfo,
- int numPartitions);
 static int	strInArray(const char *pattern, char **arr, int arr_size);
 
 
@@ -97,10 +93,8 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	NamespaceInfo *nspinfo;
 	ExtensionInfo *extinfo;
 	InhInfo*inhinfo;
-	PartInfo*partinfo;
 	int			numAggregates;
 	int			numInherits;
-	int			numPartitions;
 	int			numRules;
 	int			numProcLangs;
 	int			numCasts;
@@ -238,10 +232,6 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	inhinfo = getInherits(fout, );
 
 	if (g_verbose)
-		write_msg(NULL, "reading partition information\n");
-	partinfo = getPartitions(fout, );
-
-	if (g_verbose)
 		write_msg(NULL, "reading event triggers\n");
 	getEventTriggers(fout, );
 
@@ -255,11 +245,6 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 		write_msg(NULL, "finding inheritance relationships\n");
 	flagInhTables(tblinfo, numTables, inhinfo, numInherits);
 
-	/* Link tables to partition parents, mark parents as interesting */
-	if (g_verbose)
-		write_msg(NULL, "finding partition relationships\n");
-	flagPartitions(tblinfo, numTables, partinfo, numPartitions);
-
 	if (g_verbose)
 		write_msg(NULL, "reading column info for interesting tables\n");
 	getTableAttrs(fout, tblinfo, numTables);
@@ -293,10 +278,6 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	getPolicies(fout, tblinfo, numTables);
 
 	if (g_verbose)
-		write_msg(NULL, "reading partition key information for interesting tables\n");
-	getTablePartitionKeyInfo(fout, tblinfo, numTables);
-
-	if (g_verbose)
 		write_msg(NULL, "reading publications\n");
 	getPublications(fout);
 
@@ -354,43 +335,6 @@ flagInhTables(TableInfo *tblinfo, int numTables,
 	}
 }
 
-/* flagPartitions -
- *	 Fill in parent link fields of every target table that is partition,
- *	 and mark parents of partitions as interesting
- *
- * modifies tblinfo
- */
-static void
-flagPartitions(TableInfo *tblinfo, int numTables,
-			  PartInfo *partinfo, int numPartitions)
-{
-	int		i;
-
-	for (i = 0; i < numTables; i++)
-	{
-		/* Some kinds are never partitions */
-		if (tblinfo[i].relkind == RELKIND_SEQUENCE ||
-			tblinfo[i].relkind == RELKIND_VIEW ||
-			tblinfo[i].relkind == 

Re: [HACKERS] logical replication syntax (was DROP SUBSCRIPTION, query cancellations and slot handling)

2017-05-04 Thread Tom Lane
Robert Haas  writes:
> On Wed, May 3, 2017 at 12:38 AM, Petr Jelinek
>  wrote:
>> Ok, Let me be clear, I actually happen to agree with your proposal. The
>> reason I am moaning is that I always seem to find myself doing tons of
>> mechanical work to rewrite some cosmetic aspect of some patch based on
>> which committer is paying attention in a specific week. So while I am
>> for doing exactly what you proposed, I'd like to see couple of +1s first
>> (Peter?) since I don't want to rewrite it to something different again
>> and it's also long past freeze.

> So, Tom Lane and Thom Brown and Josh Drake all seemed generally in
> favor of cleaning this up.  Perhaps they could opine on this
> particular proposal.

It seems like there's some remaining indecision between "make it look
like the options in EXPLAIN, VACUUM, etc" and "make it look like the
WITH options found in some other statements".  I do not have a strong
opinion which one to do, but I'd definitely say that you should use WITH
in the latter case but not in the former.  I think this mostly boils down
to whether to use "=" or not; you've got "not" in the proposal, which
means you are following the EXPLAIN precedent and should not use WITH.
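
The two precedents side by side, for comparison:

    EXPLAIN (ANALYZE, BUFFERS) SELECT 1;            -- EXPLAIN style: no WITH, no "="
    CREATE TABLE t (i int) WITH (fillfactor = 70);  -- WITH style: "=" inside WITH (...)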

I'm okay with the other specifics mentioned.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Missing feature in Phrase Search?

2017-05-04 Thread Sven R. Kunze

Hi everybody,

On 21.04.2017 20:47, Josh Berkus wrote:

Oleg, Teodor, folks:

I was demo'ing phrase search for a meetup yesterday, and the user
feedback I got showed that there's a missing feature with phrase search.
  Let me explain by example:


'fix <-> error' will match 'fixed error', 'fixing error'
but not 'fixed language error' or 'fixed a small error'

'fix <2> error' will match 'fixed language error',
but not 'fixing error' or 'fixed a small error'

'fix <3> error' will match 'fixed a small error',
but not any of the other strings.


This is because the # in <#> is an exact match.

Seems like we could really use a way for users to indicate that they
want a range of word gaps.  Like, in the example above, users could
search on:

'fix <1:3> error'

... which would search for any phrase where "error" followed "fix" by
between 1 and 3 words.

Not wedded to any particular syntax for that, of course.


That could be useful. I would like to add another idea here about 
leaving out one side of the range.


'fix <:3> error'
'fix <2:> error'

To either indicate 1 (left) or unbounded (right).
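
For reference, the exact-distance behaviour that exists today (runnable as
is), with the range forms above still being proposals:

    SELECT to_tsvector('english', 'fixed a small error')
           @@ to_tsquery('english', 'fix <3> error');   -- true: positions differ by 3
    SELECT to_tsvector('english', 'fixed a small error')
           @@ to_tsquery('english', 'fix <-> error');   -- false: <-> means exactly 1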

Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Gavin Flower

On 05/05/17 06:39, Tomas Vondra wrote:



On 5/4/17 8:03 PM, Joe Conway wrote:

On 05/04/2017 10:56 AM, Andrew Dunstan wrote:



On 05/04/2017 01:52 PM, Joe Conway wrote:

On 05/04/2017 10:33 AM, Alvaro Herrera wrote:

I'm not sure what your point is.  We know that for some cases the
optimization barrier semantics are useful, which is why the proposal is
to add a keyword to install one explicitly:

  with materialized r as
  (
 select json_populate_record(null::mytype, myjson) as x
 from mytable
  )
  select (x).*
  from r;

this would preserve the current semantics.

I haven't been able to follow this incredibly long thread, so please
excuse me if way off base, but are we talking about a CTE potentially
being silently rewritten as an inline expression unless it is decorated
with some new syntax?

I would find that very disconcerting myself. For example, would this CTE
potentially get rewritten with multiple evaluation as follows?

DROP SEQUENCE IF EXISTS foo_seq;
CREATE SEQUENCE foo_seq;

WITH a(f1) AS (SELECT nextval('foo_seq'))
SELECT a.f1, a.f1 FROM a;
 f1 | ?column? 
----+----------
  1 |        1
(1 row)

ALTER SEQUENCE foo_seq RESTART;
SELECT nextval('foo_seq'), nextval('foo_seq');
 nextval | ?column? 
---------+----------
       1 |        2
(1 row)



I think that would be a change in semantics, which we should definitely
not be getting. Avoiding a change in semantics might be an interesting
exercise, but we have lots of clever coders ...


Well I think my point is that I always have understood CTEs to be
executed precisely once producing a temporary result set that is then
referenced elsewhere. I don't think that property of CTEs should change.
Somewhere else in the thread someone mentioned predicate push down --
that makes sense and maybe some clever coder can come up with a patch
that does that, but I would not be in favor of CTEs being inlined and
therefore evaluated multiple times.



I agree with this, but there's a difference between "executed exactly 
once" and "producing the same result as if executed exactly once".


I may be misunderstanding what other people proposed in this thread, 
but I think the plan was to only inline CTEs where we know it won't 
change the results, etc. So e.g. CTEs with volatile functions would 
not get inlined, which includes nextval() for example.


regards

It was the behaviour of "producing the same result as if executed
exactly once" that I was thinking of - I think this is still valid for
triggers & volatile functions, but such behaviour should be clearly
documented.  This is what I implicitly thought about CTEs when I first
came across them - to me it is the intuitively obvious behaviour.
However, limiting the rows based on the body of the SELECT would often
be a very useful optimisation.



Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-04 Thread Tom Lane
Alvaro Herrera  writes:
> Here's a patch implementing this idea.  From gram.y's comment, the
> support syntax is now:

>   
> /*
>*
>*QUERY :
> !  *CREATE STATISTICS stats_name [(stat types)] arguments
> ! 
> !  *where 'arguments' can be one or more of:
> !  *{ ON (columns)
> !  *  | FROM relations
> !  *  | WITH (options)
> !  *  | WHERE expression }

> Note that I removed the USING keyword in the stat types list, and also
> made it mandatory that that list appears immediately after the new stats
> name.  This should make it possible to have USING in the relation list
> (the FROM clause), if we allow explicit multiple relations with join
> syntax there.  The other options can appear in any order.

Hmm ... I'm not sure that I buy that particular argument.  If you're
concerned that the grammar could not handle "FROM x JOIN y USING (z)",
wouldn't it also have a problem with "FROM x JOIN y ON (z)"?

It might work anyway, since the grammar should know whether ON or USING
is needed to complete the JOIN clause.  But I think you'd better check
whether the complete join syntax works there, even if we're not going
to support it now.

I'm not against what you've done here, because I had no love for USING
in this context anyway; it conveys approximately nothing to the mind
about what is in the list it's introducing.  But I'm concerned whether
we're boxing ourselves in by using ON.

Actually, "ON" doesn't seem all that mnemonic either.  Maybe "FOR"
would be a good substitute, if it turns out that "ON" has a problem?
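
For concreteness, the kind of future statement this grammar choice is
meant to keep open might look like this (hypothetical -- multi-relation
statistics are not implemented):

CREATE STATISTICS stts3 (ndistinct)
  ON (x.a, y.b)
  FROM x JOIN y USING (z);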

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] what's up with IDENTIFIER_LOOKUP_EXPR?

2017-05-04 Thread Robert Haas
On Thu, May 4, 2017 at 4:21 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> plpgsql has an enum called IdentifierLookup which includes a value
>> IDENTIFIER_LOOKUP_EXPR which is declared like this:
>> IDENTIFIER_LOOKUP_EXPR  /* In SQL expression --- special 
>> case */
>> It regrettably does not explain what exactly is special about it, and
>> AFAICT, neither does any other comment.  If I replace every usage of
>> IDENTIFIER_LOOKUP_EXPR with IDENTIFIER_LOOKUP_NORMAL, the regression
>> tests pass nonetheless.
>
> AFAICS, the only place where IDENTIFIER_LOOKUP_EXPR acts differently from
> IDENTIFIER_LOOKUP_NORMAL is plpgsql_parse_word(), and what it does is to
> prevent a lookup in plpgsql's namestack from occurring, so that an
> identifier is reported as "not recognized" even if there is a matching
> variable.  In the two places that currently select IDENTIFIER_LOOKUP_EXPR,
> that seems to be only an optimization, because they will act the same
> anyway for all the token types that plpgsql_parse_word could return.
> I don't remember if there originally was a bigger reason than that.
> But it seems reasonable to be able to signal the lookup machinery that
> we're in a SQL expression rather than someplace else; in principle that
> could have larger implications than it does right now.

Thanks for the explanation.

>> The rule, as far as I can tell from reading the code, is that
>> IDENTIFIER_LOOKUP_NORMAL looks up words, double-words (e.g. x.y), and
>> triple-words (e.g x.y.z), IDENTIFIER_LOOKUP_EXPR looks up only
>> double-words and triple-words, and IDENTIFIER_LOOKUP_DECLARE looks up
>> nothing.  But it's not clear to me exactly what the motivation for
>> that is. plpgsql_parse_word says:
>
> The doubleword and tripleword cases are slightly different: lookup can't
> be turned off completely, because we need to create matching RECFIELD
> datums for use later.  It's still true that we don't especially care what
> token is returned, but the side-effect of creating a RECFIELD entry is
> important.  Possibly we could shave a few more cycles by making the code
> know that explicitly, but it didn't seem worth the complexity at the time.

The PLPGSQL_DTYPE_* constants are another thing that's not really
documented.  You've mentioned that we should get rid of
PLPGSQL_DTYPE_ROW in favor of, uh, whatever's better than that, but
it's not clear to me what that really means, why one way is better
than the other way, or what is involved.  I'm not really clear on what
a PLpgSQL_datum_type is in general, or what any of the specific types
actually mean.  I'll guess that PLPGSQL_DTYPE_VAR is a variable, but
beyond that...

> Feel free to improve the comments if this bugs you ...

I might try to do that at some point, but I don't think my
understanding of all this is clear enough to attempt it at this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Alvaro Herrera
Bruce Momjian wrote:
> On Tue, Apr 25, 2017 at 04:03:53PM +1200, David Rowley wrote:
> >  ..On 25 April 2017 at 13:31, Bruce Momjian  wrote:
> > > The only unusual thing is that this release has ~180 items while most
> > > recent release have had ~220.  The pattern I see that there are more
> > > large features in this release than previous ones.
> > 
> > Thanks for drafting this up.
> > 
> > I understand that it may have been filtered out, but I'd say that
> > 7e534adcdc70 is worth a mention.
> > 
> > Users creating BRIN indexes previously would have had to know
> > beforehand that the table would be sufficiently correlated on the
> > indexed columns for the index to be worthwhile, whereas now there's a
> > lesser need for the user to know this beforehand.
> 
> I can't see how this can be added to an existing BRIN entry, so it would
> have to be new.  The text would be:
> 
>   Improve accuracy in determining if a BRIN index scan is beneficial
> 
though this is not something I would normally mention because most users
don't understand the optimizer choices and just assume it works.

The problem is that previously it was possible for a BRIN index to be
created, and cause some queries to choose it which were faster using
some other index (a btree).  The point is that with this change it
becomes more credible to create BRIN indexes without fear that such
queries are going to slow down.
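
As a minimal illustration of the scenario (my own example, not from the
commit):

-- ts correlates well with physical order in an append-mostly table,
-- so a BRIN index is cheap and effective here:
CREATE TABLE events (ts timestamptz, payload text);
CREATE INDEX events_ts_brin ON events USING brin (ts);
EXPLAIN SELECT * FROM events WHERE ts >= now() - interval '1 day';

-- before the change, a BRIN index on a poorly correlated column could
-- also be chosen even where a btree was faster; the improved cost
-- estimate makes that much less likely.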

Your proposed text sounds good to me.  Authors would be David Rowley and
Emre Hasegeli.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-05-04 Thread Alvaro Herrera
Claudio Freire wrote:
> On Tue, Apr 25, 2017 at 2:45 PM, Bruce Momjian  wrote:

> > However, given your explanation, I have added the item:
> >
> >Improve speed of VACUUM's removal of trailing empty
> >heap pages (Alvaro Herrera)
> 
> That's enough for me, thanks.

Thanks!  I amended the entry by cons'ing Claudio's name as author.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-04 Thread Sven R. Kunze

Hi Rahila,

still thinking about the syntax (sorry):


On 04.05.2017 13:44, Rahila Syed wrote:

[...] The syntax implemented in this patch is as follows,

CREATE TABLE p11 PARTITION OF p1 DEFAULT;


Rewriting the following:

On Thu, May 4, 2017 at 4:02 PM, amul sul wrote:


[...] CREATE TABLE p1 PARTITION OF test FOR VALUES IN  (DEFAULT)
PARTITION BY LIST(b); [...]



It yields

CREATE TABLE p1 PARTITION OF test DEFAULT PARTITION BY LIST(b);

This reads to me like "DEFAULT PARTITION".


I can imagine a lot of confusion when those queries are encountered in
the wild. I know this thread is about creating a default partition, but
if I were to propose a minor change in the following direction, I think
confusion would be greatly avoided:


CREATE TABLE p1 PARTITION OF test *AS* DEFAULT PARTITION*ED* BY LIST(b);

I know it's a bit longer but I think those 4 characters might serve 
readability in the long term. It was especially confusing to see 
PARTITION in two positions serving two different functions.


Sven


Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)

2017-05-04 Thread Thomas Munro
On Fri, May 5, 2017 at 2:40 AM, Robert Haas  wrote:
> On Thu, May 4, 2017 at 4:46 AM, Thomas Munro
>  wrote:
>> On Thu, May 4, 2017 at 4:02 AM, Alvaro Herrera  
>> wrote:
>>> Robert Haas wrote:
 I suspect that most users would find it more useful to capture all of
 the rows that the statement actually touched, regardless of whether
 they hit the named table or an inheritance child.
>>>
>>> Yes, agreed.  For the plain inheritance cases each row would need to
>>> have an indicator of which relation it comes from (tableoid); I'm not
>>> sure if such a thing would be useful in the partitioning case.
>>
>> On Thu, May 4, 2017 at 4:26 AM, David Fetter  wrote:
>>> +1 on the not-duct-tape view of partitioned tables.
>>
>> Hmm.  Ok.  Are we talking about PG10 or PG11 here?  Does this approach
>> makes sense?
>
> I was thinking PG10 if it can be done straightforwardly.

Ok, I will draft a patch to do it the way I described and see what people think.

>> To avoid the whiff of duct tape, we'd probably also want to make ROW
>> triggers created on the partitioned table(s) containing partition to
>> fire too, with appropriate TypeConversionMap treatment.  Not sure what
>> exactly is involved there.
>>
>> On the other hand, doing that with inheritance hierarchies would be an
>> incompatible behavioural change, which I guess people don't want -- am
>> I right?
>
> Incompatible with what?  Transition tables haven't been released yet.
> If we're going to fix the definition of what they do, now's the time.

The last two paragraphs of my email were about ROW triggers created on
partitioned tables and tables with inheritance children, not the new
transition table stuff.  I will forget that for now and look only at
making the transition tables duct-tape-free.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-04 Thread Jeevan Ladhe
While reviewing the code I was trying to explore more cases, and here
comes an open question to my mind:
should we allow the default partition table to be partitioned further?

If we allow it (as in the current case), then observe the following case,
where I have defined a default partition which is further partitioned on
a different column.

postgres=# CREATE TABLE test ( a int, b int, c int) PARTITION BY LIST (a);
CREATE TABLE
postgres=# CREATE TABLE test_p1 PARTITION OF test FOR VALUES IN(4, 5, 6, 7,
8);
CREATE TABLE
postgres=# CREATE TABLE test_pd PARTITION OF test DEFAULT PARTITION BY
LIST(b);
CREATE TABLE
postgres=# INSERT INTO test VALUES (20, 24, 12);
ERROR:  no partition of relation "test_pd" found for row
DETAIL:  Partition key of the failing row contains (b) = (24).

Note that it does not allow inserting the tuple (20, 24, 12): though
a=20 would fall in the default partition, i.e. test_pd, table test_pd
itself is further partitioned and does not have any partition satisfying
b=24.
Further, if I define a default partition for table test_pd, the tuple
gets inserted.
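
For completeness, this is the statement that makes the insert succeed
(continuing the example above):

CREATE TABLE test_pd_pd PARTITION OF test_pd DEFAULT;
INSERT INTO test VALUES (20, 24, 12);  -- now routed to test_pd's default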

Doesn't this sound like the whole purpose of having a DEFAULT partition
on table test is defeated?

Any views?

Regards,
Jeevan Ladhe


Re: [HACKERS] delta relations in AFTER triggers

2017-05-04 Thread Thomas Munro
On Fri, May 5, 2017 at 12:39 AM, Neha Sharma
 wrote:
> While testing the feature we encountered one more crash,below is the
> scenario to reproduce.
>
> create table t1 ( a int);
> create table t2 ( a int);
> insert into t1 values (11),(12),(13);
>
> create or replace function my_trig() returns trigger
> language plpgsql as $my_trig$
> begin
> insert into t2(select a from new_table);
> RETURN NEW;
> end;
> $my_trig$;
>
> create trigger my_trigger
> after truncate or update  on t1
> referencing new table as new_table old table as oldtab
> for each statement
> execute procedure my_trig();
>
> truncate t1;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.

Thanks.  Reproduced here.  The stack looks like this:

frame #3: 0x000103e5e8b0
postgres`ExceptionalCondition(conditionName="!((trigdata->tg_event)
& 0x0003) == 0x) || (((trigdata->tg_event) & 0x0003)
== 0x0002) || (((trigdata->tg_event) & 0x0003) == 0x0001))
&& (((trigdata->tg_event) & 0x0018) == 0x) &&
!(trigdata->tg_event & 0x0020) && !(trigdata->tg_event &
0x0040)) || (trigdata->tg_oldtable == ((void*)0) &&
trigdata->tg_newtable == ((void*)0)))", errorType="FailedAssertion",
fileName="trigger.c", lineNumber=2045) + 128 at assert.c:54
frame #4: 0x000103a6f542
postgres`ExecCallTriggerFunc(trigdata=0x7fff5c40bad0, tgindx=0,
finfo=0x7fd8ba0817b8, instr=0x,
per_tuple_context=0x7fd8b906f928) + 258 at trigger.c:2039
frame #5: 0x000103a754ed
postgres`AfterTriggerExecute(event=0x7fd8ba092460,
rel=0x0001043fd9c0, trigdesc=0x7fd8ba068758,
finfo=0x7fd8ba0817b8, instr=0x,
per_tuple_context=0x7fd8b906f928,
trig_tuple_slot1=0x,
trig_tuple_slot2=0x) + 1469 at trigger.c:3860
frame #6: 0x000103a73080
postgres`afterTriggerInvokeEvents(events=0x7fd8ba07fb00,
firing_id=1, estate=0x7fd8ba090440, delete_ok='\x01') + 592 at
trigger.c:4051
frame #7: 0x000103a72b7b
postgres`AfterTriggerEndQuery(estate=0x7fd8ba090440) + 203 at
trigger.c:4227
frame #8: 0x000103a498aa
postgres`ExecuteTruncate(stmt=0x7fd8ba059f40) + 2026 at
tablecmds.c:1485

There's an assertion that it's (one of INSERT, UPDATE, DELETE, an
AFTER trigger, not deferred) *or* there are no transition tables.
Here it's TRUNCATE and there are transition tables, so it fails:

/*
 * Protect against code paths that may fail to initialize transition table
 * info.
 */
Assert(((TRIGGER_FIRED_BY_INSERT(trigdata->tg_event) ||
 TRIGGER_FIRED_BY_UPDATE(trigdata->tg_event) ||
 TRIGGER_FIRED_BY_DELETE(trigdata->tg_event)) &&
TRIGGER_FIRED_AFTER(trigdata->tg_event) &&
!(trigdata->tg_event & AFTER_TRIGGER_DEFERRABLE) &&
!(trigdata->tg_event & AFTER_TRIGGER_INITDEFERRED)) ||
   (trigdata->tg_oldtable == NULL && trigdata->tg_newtable == NULL));


We can't possibly support transition tables on TRUNCATE (the whole
point of TRUNCATE is not to inspect all the rows so we can't collect
them), and we already reject ROW triggers on TRUNCATE, so we should
reject transition tables on STATEMENT triggers for TRUNCATE at
creation time too.  See attached.  Thoughts?
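
With the patch, the trigger definition from the report would be rejected
up front, something like (error wording illustrative):

CREATE TRIGGER my_trigger
AFTER TRUNCATE ON t1
REFERENCING OLD TABLE AS oldtab
FOR EACH STATEMENT
EXECUTE PROCEDURE my_trig();
-- ERROR:  TRUNCATE triggers with transition tables are not supported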

> Log file and core dump attached for reference.

Thanks!  Just by the way, it'd be better to post just an interesting
stack trace fragment rather than a core file, because core files can't
really be used without the exact executable that you built.

-- 
Thomas Munro
http://www.enterprisedb.com


no-transition-tables-for-truncate.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] what's up with IDENTIFIER_LOOKUP_EXPR?

2017-05-04 Thread Tom Lane
Robert Haas  writes:
> plpgsql has an enum called IdentifierLookup which includes a value
> IDENTIFIER_LOOKUP_EXPR which is declared like this:
> IDENTIFIER_LOOKUP_EXPR  /* In SQL expression --- special case 
> */
> It regrettably does not explain what exactly is special about it, and
> AFAICT, neither does any other comment.  If I replace every usage of
> IDENTIFIER_LOOKUP_EXPR with IDENTIFIER_LOOKUP_NORMAL, the regression
> tests pass nonetheless.

AFAICS, the only place where IDENTIFIER_LOOKUP_EXPR acts differently from
IDENTIFIER_LOOKUP_NORMAL is plpgsql_parse_word(), and what it does is to
prevent a lookup in plpgsql's namestack from occurring, so that an
identifier is reported as "not recognized" even if there is a matching
variable.  In the two places that currently select IDENTIFIER_LOOKUP_EXPR,
that seems to be only an optimization, because they will act the same
anyway for all the token types that plpgsql_parse_word could return.
I don't remember if there originally was a bigger reason than that.
But it seems reasonable to be able to signal the lookup machinery that
we're in a SQL expression rather than someplace else; in principle that
could have larger implications than it does right now.

> The rule, as far as I can tell from reading the code, is that
> IDENTIFIER_LOOKUP_NORMAL looks up words, double-words (e.g. x.y), and
> triple-words (e.g x.y.z), IDENTIFIER_LOOKUP_EXPR looks up only
> double-words and triple-words, and IDENTIFIER_LOOKUP_DECLARE looks up
> nothing.  But it's not clear to me exactly what the motivation for
> that is. plpgsql_parse_word says:

The doubleword and tripleword cases are slightly different: lookup can't
be turned off completely, because we need to create matching RECFIELD
datums for use later.  It's still true that we don't especially care what
token is returned, but the side-effect of creating a RECFIELD entry is
important.  Possibly we could shave a few more cycles by making the code
know that explicitly, but it didn't seem worth the complexity at the time.
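
A minimal illustration of why that side-effect matters (my own sketch,
assuming a trigger context): NEW.id below is a doubleword inside a SQL
expression, and the RECFIELD datum created at parse time is what lets
the per-row reference resolve later.

CREATE FUNCTION audit_trig() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
  IF NEW.id IS NOT NULL THEN  -- doubleword parsed -> RECFIELD created
    RAISE NOTICE 'row with id % changed', NEW.id;
  END IF;
  RETURN NEW;
END
$$;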

Feel free to improve the comments if this bugs you ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-04 Thread Jeevan Ladhe
Hi Rahila,

I have started reviewing your latest patch, and here are my initial
comments:

1.
In the following block, we can do with just def_index; we do not need
the found_def flag. We can check whether def_index is -1 to decide if a
default partition is present.

@@ -166,6 +172,8 @@ RelationBuildPartitionDesc(Relation rel)
  /* List partitioning specific */
  PartitionListValue **all_values = NULL;
  bool found_null = false;
+ bool found_def = false;
+ int def_index = -1;
  int null_index = -1;

2.
In check_new_partition_bound, in the case of PARTITION_STRATEGY_LIST,
remove the following duplicate declaration of boundinfo, because it is
confusing, and after your changes it is not needed as it's not getting
overridden locally in the if block.
if (partdesc->nparts > 0)
{
PartitionBoundInfo boundinfo = partdesc->boundinfo;
ListCell   *cell;


3.
In the following function, isDefaultPartitionBound, the first "return
false" statement is not needed.

+ * Returns true if the partition bound is default
+ */
+bool
+isDefaultPartitionBound(Node *value)
+{
+ if (IsA(value, DefElem))
+ {
+ DefElem *defvalue = (DefElem *) value;
+ if(!strcmp(defvalue->defname, "DEFAULT"))
+ return true;
+ return false;
+ }
+ return false;
+}

4.
As mentioned in my previous set of comments, the following if block
inside a loop in get_qual_for_default needs a break:

+ foreach(cell1, bspec->listdatums)
+ {
+ Node *value = lfirst(cell1);
+ if (isDefaultPartitionBound(value))
+ {
+ def_elem = true;
+ *defid  = inhrelid;
+ }
+ }

5.
In the grammar the rule default_part_list is not needed:

+default_partition:
+ DEFAULT  { $$ = (Node *)makeDefElem("DEFAULT", NULL, @1); }
+
+default_part_list:
+ default_partition { $$ = list_make1($1); }
+ ;
+

Instead you can simply declare default_partition as a list and write it as:

default_partition:
DEFAULT
{
Node *def = (Node *)makeDefElem("DEFAULT", NULL, @1);
$$ = list_make1(def);
}

6.
You need to change the output of the describe command, which is
currently as below:

postgres=# \d+ test;
                               Table "public.test"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 a      | integer |           |          |         | plain   |              |
 b      | date    |           |          |         | plain   |              |
Partition key: LIST (a)
Partitions: pd FOR VALUES IN (DEFAULT), test_p1 FOR VALUES IN (4, 5)

What about changing the Partitions output as below:

Partitions: pd DEFAULT, test_p1 FOR VALUES IN (4, 5)

7.
You need to handle tab completion for DEFAULT.
e.g.
If I partially type the following command:
CREATE TABLE pd PARTITION OF test DEFA
and then press tab, I get the following completion:
CREATE TABLE pd PARTITION OF test FOR VALUES

I did some primary testing and did not find any problem so far.

I will review and test further and let you know my comments.

Regards,
Jeevan Ladhe

On Thu, May 4, 2017 at 6:09 PM, Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> On Thu, May 4, 2017 at 5:14 PM, Rahila Syed 
> wrote:
>
>> The syntax implemented in this patch is as follows,
>>
>> CREATE TABLE p11 PARTITION OF p1 DEFAULT;
>>
> Applied v9 patches; table description still showing old pattern of
> default partition. Is it expected?
>
> create table lpd (a int, b int, c varchar) partition by list(a);
> create table lpd_d partition of lpd DEFAULT;
>
> \d+ lpd
>                                 Table "public.lpd"
>  Column |       Type        | Collation | Nullable | Default | Storage  | Stats target | Description
> --------+-------------------+-----------+----------+---------+----------+--------------+-------------
>  a      | integer           |           |          |         | plain    |              |
>  b      | integer           |           |          |         | plain    |              |
>  c      | character varying |           |          |         | extended |              |
> Partition key: LIST (a)
> Partitions: lpd_d FOR VALUES IN (DEFAULT)
>


[HACKERS] what's up with IDENTIFIER_LOOKUP_EXPR?

2017-05-04 Thread Robert Haas
plpgsql has an enum called IdentifierLookup which includes a value
IDENTIFIER_LOOKUP_EXPR which is declared like this:

IDENTIFIER_LOOKUP_EXPR  /* In SQL expression --- special case */

It regrettably does not explain what exactly is special about it, and
AFAICT, neither does any other comment.  If I replace every usage of
IDENTIFIER_LOOKUP_EXPR with IDENTIFIER_LOOKUP_NORMAL, the regression
tests pass nonetheless.  It was introduced by
01f7d29902cb27fb698e5078d72cb837398e074c, committed by Tom in 2010:

commit 01f7d29902cb27fb698e5078d72cb837398e074c
Author: Tom Lane 
Date:   Sun Jan 10 17:15:18 2010 +

Improve plpgsql's handling of record field references by forcing
all potential
field references in SQL expressions to have RECFIELD datum-array entries at
parse time.  If it turns out that the reference is actually to a SQL column,
the RECFIELD entry is useless, but it costs little.  This allows
us to get rid
of the previous use of FieldSelect applied to a whole-row Param
for the record
variable; which was not only slower than a direct RECFIELD reference, but
failed for references to system columns of a trigger's NEW or OLD record.
Per report and fix suggestion from Dean Rasheed.

The rule, as far as I can tell from reading the code, is that
IDENTIFIER_LOOKUP_NORMAL looks up words, double-words (e.g. x.y), and
triple-words (e.g x.y.z), IDENTIFIER_LOOKUP_EXPR looks up only
double-words and triple-words, and IDENTIFIER_LOOKUP_DECLARE looks up
nothing.  But it's not clear to me exactly what the motivation for
that is. plpgsql_parse_word says:

/*
 * We should do nothing in DECLARE sections.  In SQL expressions, there's
 * no need to do anything either --- lookup will happen when the
 * expression is compiled.
 */

...but that doesn't really explain why we go out of our way to have
this third state, because the wording implies that it doesn't
particularly matter one way or the other.

Any help appreciated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix freeing of dangling IndexScanDesc.xs_hitup in GiST

2017-05-04 Thread Nikita Glukhov

On 04.05.2017 22:16, Tom Lane wrote:


Nikita Glukhov  writes:

In gistrescan() IndexScanDesc.xs_hitup is not reset after the
MemoryContextReset() of so->queueCxt in which xs_hitup was allocated;
then getNextNearest() tries to pfree() the dangling xs_hitup, which
results in the reuse of this pointer and the subsequent crash.

Right.  I already did something about this, about an hour ago --- a
bit differently from your patch, but same idea.

regards, tom lane


Sorry that I'm not monitoring pgsql-bugs.

It might be interesting that I found this bug back in July 2016 when I
was experimenting with my KNN-btree implementation, but I failed to
report it because I could reproduce it only manually, by calling
gistrescan() and gistgettuple() in a loop.

--
Nikita Glukhov
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Potential issue with alter system

2017-05-04 Thread Tom Lane
"Joshua D. Drake"  writes:
> So I did this:

> postgres=# alter system set archive_command to 'rsynv -av %p 
> postgres@52.3.141.224:/data/archive/%f
> ';

> Note the new line. It properly created in postgresql.auto.conf:

> archive_command = 'rsynv -av %p postgres@52.3.141.224:/data/archive/%f
> '
> (note the new line)

Ugh.  That's broken --- we don't support newlines within values in
postgresql.conf files.

[ pokes around ... ]  HEAD seems to reject this properly:

regression=# alter system set archive_command to 'rsynv -av %p 
postgres@52.3.141.224:/data/archive/%f
regression'# ';
ERROR:  parameter value for ALTER SYSTEM must not contain a newline

> This is 9.5.2 (Yes, I know... I am in the process of setting up 
> replication to 9.5.6 for failover).

Ah.  [ digs further... ]  Yup, we fixed that in 9.5.3:

Author: Tom Lane 
Branch: master Release: REL9_6_BR [99f3b5613] 2016-04-04 18:05:23 -0400
Branch: REL9_5_STABLE Release: REL9_5_3 [f3d17491c] 2016-04-04 18:05:23 -0400
Branch: REL9_4_STABLE Release: REL9_4_8 [28148e258] 2016-04-04 18:05:24 -0400

Disallow newlines in parameter values to be set in ALTER SYSTEM.

As noted by Julian Schauder in bug #14063, the configuration-file parser
doesn't support embedded newlines in string literals.  While there might
someday be a good reason to remove that restriction, there doesn't seem
to be one right now.  However, ALTER SYSTEM SET could accept strings
containing newlines, since many of the variable-specific value-checking
routines would just see a newline as whitespace.  This led to writing a
postgresql.auto.conf file that was broken and had to be removed manually.

Pending a reason to work harder, just throw an error if someone tries this.

In passing, fix several places in the ALTER SYSTEM logic that failed to
provide an errcode() for an ereport(), and thus would falsely log the
failure as an internal XX000 error.

Back-patch to 9.4 where ALTER SYSTEM was introduced.


If you have other entries you want to keep in the postgresql.auto.conf
file, you could get away with manually editing it to remove the newline.
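
That is, once the stray newline is removed by hand (or on a fixed minor
release), the intended setting fits on one line:

ALTER SYSTEM SET archive_command TO 'rsync -av %p postgres@52.3.141.224:/data/archive/%f';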

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix freeing of dangling IndexScanDesc.xs_hitup in GiST

2017-05-04 Thread Tom Lane
Nikita Glukhov  writes:
> In gistrescan() IndexScanDesc.xs_hitup is not reset after the
> MemoryContextReset() of so->queueCxt in which xs_hitup was allocated;
> then getNextNearest() tries to pfree() the dangling xs_hitup, which
> results in the reuse of this pointer and the subsequent crash.

Right.  I already did something about this, about an hour ago --- a
bit differently from your patch, but same idea.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json_agg produces nonstandard json

2017-05-04 Thread Tom Lane
Jordan Deitch  writes:
> However, I don't see consistency between the results of these two
> statements:

> select jsonb_agg((select 1 where false));
> select sum((select 1 where false));

Well, SUM() is defined to ignore null input values, which is not too
surprising as it couldn't do anything very useful with them.  So it ends
up deciding there are no input rows.  jsonb_agg() is defined to translate
null input values to JSON "null", which seems like a sane behavior to me
although I agree that they aren't exactly the same concept.
If you don't want that, you could suppress the null inputs with a FILTER
clause:

regression=# select jsonb_agg(x) from (values (1),(2),(null),(4)) v(x);
    jsonb_agg
-----------------
 [1, 2, null, 4]
(1 row)

regression=# select jsonb_agg(x) filter (where x is not null) from (values 
(1),(2),(null),(4)) v(x);
 jsonb_agg 
-----------
 [1, 2, 4]
(1 row)

regression=# select jsonb_agg(x) filter (where x is not null) from (values 
(null),(null),(null)) v(x);
 jsonb_agg 
-----------
 
(1 row)

We could perhaps invent a "jsonb_agg_strict()" variant that skips
nulls for you.  But I'd want to see multiple requests before
concluding that it was worth carrying such a function.  The FILTER
workaround seems good enough if it's an infrequent need.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Serge Rielau
I haven't been keeping close tabs either, but surely we still have to have
the optimization fence in (at least) all these cases:

* CTE contains INSERT/UPDATE/DELETE
* CTE contains SELECT FOR UPDATE/SHARE (else the set of rows that get
locked might change)
* CTE contains volatile functions

I'm willing to write off cases where, eg, a function should have been
marked volatile and was not. That's user error and there are plenty
of hazards of that kind already. But if the optimizer has reason
to know that discarding the fence might change any query side-effects,
it mustn't.

Yes! +100

Cheers,
Serge

Re: [HACKERS] CTE inlining

2017-05-04 Thread Tom Lane
Tomas Vondra  writes:
> On 5/4/17 8:03 PM, Joe Conway wrote:
>>> I haven't been able to follow this incredibly long thread, so please
>>> excuse me if way off base, but are we talking about a CTE that would
>>> potentially be silently rewritten as an inline expression unless it is
>>> decorated with some new syntax?

> I agree with this, but there's a difference between "executed exactly 
> once" and "producing the same result as if executed exactly once".

> I may be misunderstanding what other people proposed in this thread, but 
> I think the plan was to only inline CTEs where we know it won't change 
> the results, etc. So e.g. CTEs with volatile functions would not get 
> inlined, which includes nextval() for example.

I haven't been keeping close tabs either, but surely we still have to have
the optimization fence in (at least) all these cases:

* CTE contains INSERT/UPDATE/DELETE
* CTE contains SELECT FOR UPDATE/SHARE (else the set of rows that get
  locked might change)
* CTE contains volatile functions

I'm willing to write off cases where, eg, a function should have been
marked volatile and was not.  That's user error and there are plenty
of hazards of that kind already.  But if the optimizer has reason
to know that discarding the fence might change any query side-effects,
it mustn't.
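
For instance, a data-modifying CTE of the first kind plainly must keep
the fence (illustrative tables):

WITH moved AS (
    DELETE FROM task_queue WHERE done RETURNING *
)
INSERT INTO task_archive SELECT * FROM moved;
-- must run exactly once, however the outer query uses "moved"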

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication syntax (was DROP SUBSCRIPTION, query cancellations and slot handling)

2017-05-04 Thread Robert Haas
On Wed, May 3, 2017 at 12:38 AM, Petr Jelinek
 wrote:
> Ok, Let me be clear, I actually happen to agree with your proposal. The
> reason I am moaning is that I always seem to find myself doing tons of
> mechanical work to rewrite some cosmetic aspect of some patch based on
> which committer is paying attention in a specific week. So while I am
> for doing exactly what you proposed, I'd like to see couple of +1s first
> (Peter?) since I don't want to rewrite it to something different again
> and it's also long past freeze.

So, Tom Lane and Thom Brown and Josh Drake all seemed generally in
favor of cleaning this up.  Perhaps they could opine on this
particular proposal.

> To repeat the proposal:
> - change the WITH (...) clauses in subscription/publication commands to:
> (create_slot true/false, connect true/false, slot_name 'something',
> copy_data true/false, etc)
>
> - change the NOREFRESH to NO REFRESH in ALTER SUBSCRIPTION name SET
> PUBLICATION (btw I originally had SKIP REFRESH there but changed it to
> NOREFRESH for consistency with the other NO* stuff; I wonder if SKIP
> would sound more English).
>
> - change the (publish insert/nopublish insert/publish update/nopublish
> update), etc options to (publish 'update,insert').
>
> And one question, if we are not using the definitions (key = value)
> should we keep the WITH keyword in the syntax, would it be confusing?

No opinion on that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json_agg produces nonstandard json

2017-05-04 Thread Jordan Deitch
Thank you for responding!

Good points.

However, I don't see consistency between the results of these two
statements:

select jsonb_agg((select 1 where false));
select sum((select 1 where false));

Therefore another option I would like to suggest is returning the same null
value-types for the sum() and json_agg().

So "select jsonb_agg((select 1 where false));" would return null as
opposed to [null]. In that case it would be compatible with coalesce().

---
Thanks
Jordan Deitch


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-04 Thread Alvaro Herrera
Here's a patch implementing this idea.  From gram.y's comment, the
support syntax is now:

  /*
   *
   *QUERY :
!  *CREATE STATISTICS stats_name [(stat types)] arguments
! 
!  *where 'arguments' can be one or more of:
!  *{ ON (columns)
!  *  | FROM relations
!  *  | WITH (options)
!  *  | WHERE expression }

Note that I removed the USING keyword in the stat types list, and also
made it mandatory that that list appears immediately after the new stats
name.  This should make it possible to have USING in the relation list
(the FROM clause), if we allow explicit multiple relations with join
syntax there.  The other options can appear in any order.

Also, both WITH and WHERE are accepted by the grammar, but immediately
throw "feature not implemented" error at parse time.

I was on the fence about adding copy/equal/out support for the new
StatisticArgument node; it seems pointless because that node does not
leave gram.y anyway.

Unless there are objections, I'll push this tomorrow.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
***
*** 1132,1138  WHERE tablename = 'road';
   To inspect functional dependencies on a statistics
   stts, you may do this:
  
! CREATE STATISTICS stts WITH (dependencies)
 ON (zip, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxname, stxkeys, stxdependencies
--- 1132,1138 
   To inspect functional dependencies on a statistics
   stts, you may do this:
  
! CREATE STATISTICS stts (dependencies)
 ON (zip, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxname, stxkeys, stxdependencies
***
*** 1219,1225  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 
AND b = 10;
   Continuing the above example, the n-distinct coefficients in a ZIP
   code table may look like the following:
  
! CREATE STATISTICS stts2 WITH (ndistinct)
 ON (zip, state, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxkeys AS k, stxndistinct AS nd
--- 1219,1225 
   Continuing the above example, the n-distinct coefficients in a ZIP
   code table may look like the following:
  
! CREATE STATISTICS stts2 (ndistinct)
 ON (zip, state, city) FROM zipcodes;
  ANALYZE zipcodes;
  SELECT stxkeys AS k, stxndistinct AS nd
*** a/doc/src/sgml/planstats.sgml
--- b/doc/src/sgml/planstats.sgml
***
*** 526,532  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND 
b = 1;
  multivariate statistics on the two columns:
  
  
! CREATE STATISTICS stts WITH (dependencies) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN  
 
--- 526,532 
  multivariate statistics on the two columns:
  
  
! CREATE STATISTICS stts (dependencies) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN  
 
***
*** 569,575  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY 
a, b;
  calculation, the estimate is much improved:
  
  DROP STATISTICS stts;
! CREATE STATISTICS stts WITH (dependencies, ndistinct) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
 QUERY PLAN 
   
--- 569,575 
  calculation, the estimate is much improved:
  
  DROP STATISTICS stts;
! CREATE STATISTICS stts (dependencies, ndistinct) ON (a, b) FROM t;
  ANALYZE t;
  EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
 QUERY PLAN 
   
*** a/doc/src/sgml/ref/create_statistics.sgml
--- b/doc/src/sgml/ref/create_statistics.sgml
***
*** 22,28  PostgreSQL documentation
   
  
  CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
! WITH ( option [= 
value] [, ... ] )
  ON ( column_name, 
column_name [, ...])
  FROM table_name
  
--- 22,28 
   
  
  CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
! [ ( statistic_type [, ... ] 
) ]
  ON ( column_name, 
column_name [, ...])
  FROM table_name
  
***
*** 75,80  CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
--- 75,93 
 
  
 
+ statistic_type
+ 
+  
+   A statistic type to be enabled for this statistics.  Currently
+   supported types are ndistinct, which enables
+   n-distinct coefficient tracking,
+   and 

[HACKERS] Fix freeing of dangling IndexScanDesc.xs_hitup in GiST

2017-05-04 Thread Nikita Glukhov

Hello, hackers!

The last query in the following script crashes Postgres:

create table t (id serial, amount int);
insert into t (amount) select random() * 1000 from generate_series(1, 100);
create extension btree_gist;
create index t_gist_idx on t using gist(id, amount);

select p.id, p.amount, s.nearest
from t as p left join lateral
(
  select p.id, array_agg(l.id) as nearest from (
select id from t order by amount <-> p.amount limit 10
  ) l
) s using(id);

In gistrescan() IndexScanDesc.xs_hitup is not reset after the
MemoryContextReset() of so->queueCxt in which xs_hitup was allocated;
then getNextNearest() tries to pfree() the dangling xs_hitup, which
results in the reuse of this pointer and the subsequent crash.

Attached patches fix this bug introduced in commit
d04c8ed9044eccebce043143a930617e3998c005 "Add support for index-only scans in 
GiST".
The bug is present in v9.5, v9.6, v10.0.

--
Nikita Glukhov
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 2526a39..580b6ac 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -148,6 +148,11 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 		/* third or later time through */
 		MemoryContextReset(so->queueCxt);
 		first_time = false;
+		/*
+		 * scan->xs_itup is allocated in so->queueCxt and now it is invalid,
+		 * so we need to reset it to prevent it from freeing in getNextNearest().
+		 */
+		scan->xs_itup = NULL;
 	}
 
 	/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 81ff8fc..5e10ada 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -148,6 +148,11 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 		/* third or later time through */
 		MemoryContextReset(so->queueCxt);
 		first_time = false;
+		/*
+		 * scan->xs_hitup is allocated in so->queueCxt and now it is invalid,
+		 * so we need to reset it to prevent it from freeing in getNextNearest().
+		 */
+		scan->xs_hitup = NULL;
 	}
 
 	/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Potential issue with alter system

2017-05-04 Thread Joshua D. Drake

Folks,

So I did this:

postgres=# alter system set archive_command to 'rsynv -av %p 
postgres@52.3.141.224:/data/archive/%f

';

Note the new line. It properly created in postgresql.auto.conf:

archive_command = 'rsynv -av %p postgres@52.3.141.224:/data/archive/%f
'
(note the new line)

I noticed I spelled rsync wrong and now I get this:

postgres=# alter system set archive_command to 'rsync -av %p 
postgres@52.3.141.224:/data/archive/%f'

;
ERROR:  could not parse contents of file "postgresql.auto.conf"

This is 9.5.2 (Yes, I know... I am in the process of setting up 
replication to 9.5.6 for failover).


JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Tomas Vondra



On 5/4/17 8:03 PM, Joe Conway wrote:

On 05/04/2017 10:56 AM, Andrew Dunstan wrote:



On 05/04/2017 01:52 PM, Joe Conway wrote:

On 05/04/2017 10:33 AM, Alvaro Herrera wrote:

I'm not sure what your point is.  We know that for some cases the
optimization barrier semantics are useful, which is why the proposal is
to add a keyword to install one explicitly:

  with materialized r as
  (
 select json_populate_record(null::mytype, myjson) as x
 from mytable
  )
  select (x).*
  from r;

this would preserve the current semantics.

I haven't been able to follow this incredibly long thread, so please
excuse me if way off base, but are we talking about a CTE that would
potentially be silently rewritten as an inline expression unless it is
decorated with some new syntax?

I would find that very disconcerting myself. For example, would this CTE
potentially get rewritten with multiple evaluation as follows?

DROP SEQUENCE IF EXISTS foo_seq;
CREATE SEQUENCE foo_seq;

WITH a(f1) AS (SELECT nextval('foo_seq'))
SELECT a.f1, a.f1 FROM a;
 f1 | ?column?
----+----------
  1 |        1
(1 row)

ALTER SEQUENCE foo_seq RESTART;
SELECT nextval('foo_seq'), nextval('foo_seq');
 nextval | ?column?
---------+----------
       1 |        2
(1 row)



I think that would be a change in semantics, which we should definitely
not be getting. Avoiding a change in semantics might be an interesting
exercise, but we have lots of clever coders ...


Well I think my point is that I always have understood CTEs to be
executed precisely once producing a temporary result set that is then
referenced elsewhere. I don't think that property of CTEs should change.
Somewhere else in the thread someone mentioned predicate push down --
that makes sense and maybe some clever coder can come up with a patch
that does that, but I would not be in favor of CTEs being inlined and
therefore evaluated multiple times.



I agree with this, but there's a difference between "executed exactly 
once" and "producing the same result as if executed exactly once".


I may be misunderstanding what other people proposed in this thread, but 
I think the plan was to only inline CTEs where we know it won't change 
the results, etc. So e.g. CTEs with volatile functions would not get 
inlined, which includes nextval() for example.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Tom Lane
Alvaro Herrera  writes:
> I'm not sure what your point is.  We know that for some cases the
> optimization barrier semantics are useful, which is why the proposal is
> to add a keyword to install one explicitly:

>  with materialized r as
>  (
> select json_populate_record(null::mytype, myjson) as x
> from mytable
>  )
>  select (x).*
>  from r;

> this would preserve the current semantics.

Uh, no, it wouldn't, until you had changed your query (and made it
non-SQL-compliant, and unable to work at all on pre-v11 PG).
This might be a good design, but surely it does not provide backwards
compatibility.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing runtime of stats regression test

2017-05-04 Thread Tom Lane
Robert Haas  writes:
> On Thu, May 4, 2017 at 10:22 AM, Tom Lane  wrote:
>> Yes, but that would be getting into the realm of new features, not
>> post-feature-freeze test adjustments.  It certainly couldn't be
>> a candidate for back-patching.

> I'm not sure there's some bright line between adding a new
> SQL-callable function to cut down the test time and any other
> tinkering we might do to reduce the regression test time.  I think
> there's a pretty good argument that all of the recent changes you made
> in this area constitute strictly optional tinkering.  I'm haven't been
> objecting because they don't seem likely to destabilize anything, but
> I don't see that they're really helping us get ready for beta either,
> which is presumably what we ought to be focusing on at this point.

Well, to my mind, making the regression tests faster is something that
could be quite helpful during beta, because a lot of people will be
running them.  (Or so we hope, at least.)  If the fact that e.g.
the recovery tests take a lot of time discourages people from running
them, that can't be a good thing for beta purposes.  So I respectfully
reject your opinion about what I should be working on.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json_agg produces nonstandard json

2017-05-04 Thread Tom Lane
Jordan Deitch  writes:
> A json(b)_agg() will produce the following result when no results are
> passed in:
> "[null]"
> per:
> select jsonb_agg((select 1 where false));

Looks fine to me.

> I believe, generally speaking, '[]' would be the more appropriate output.

Why?  What you gave it was one null value.  An empty array result would
imply that there were zero inputs, which is wrong.

Perhaps you're confused about the way scalar sub-selects work?  The
above is equivalent to "select jsonb_agg(null::integer)"; it's not
the same as

# select jsonb_agg(1) where false;
 jsonb_agg 
-----------
 
(1 row)

Now you could legitimately argue that this case, where there are zero
input rows, should produce '[]' rather than a SQL null.  But I think we
had that discussion already, and agreed that this behavior is more in
keeping with the behavior of SQL's standard aggregates, notably SUM().
You can use coalesce() to inject '[]' (or whatever result you want) for
the no-rows case:

# select coalesce(jsonb_agg(1), '[]') where false;
 coalesce 
----------
 []
(1 row)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Joe Conway
On 05/04/2017 10:56 AM, Andrew Dunstan wrote:
> 
> 
> On 05/04/2017 01:52 PM, Joe Conway wrote:
>> On 05/04/2017 10:33 AM, Alvaro Herrera wrote:
>>> I'm not sure what your point is.  We know that for some cases the
>>> optimization barrier semantics are useful, which is why the proposal is
>>> to add a keyword to install one explicitly:
>>>
>>>  with materialized r as
>>>  (
>>> select json_populate_record(null::mytype, myjson) as x
>>> from mytable
>>>  )
>>>  select (x).*
>>>  from r;
>>>
>>> this would preserve the current semantics.
>> I haven't been able to follow this incredibly long thread, so please
>> excuse me if way off base, but are we talking about a CTE that would
>> potentially be silently rewritten as an inline expression unless it is
>> decorated with some new syntax?
>>
>> I would find that very disconcerting myself. For example, would this CTE
>> potentially get rewritten with multiple evaluation as follows?
>>
>> DROP SEQUENCE IF EXISTS foo_seq;
>> CREATE SEQUENCE foo_seq;
>>
>> WITH a(f1) AS (SELECT nextval('foo_seq'))
>> SELECT a.f1, a.f1 FROM a;
>>  f1 | ?column?
>> ----+----------
>>   1 |        1
>> (1 row)
>>
>> ALTER SEQUENCE foo_seq RESTART;
>> SELECT nextval('foo_seq'), nextval('foo_seq');
>>  nextval | ?column?
>> ---------+----------
>>        1 |        2
>> (1 row)
>>
> 
> I think that would be a change in semantics, which we should definitely
> not be getting. Avoiding a change in semantics might be an interesting
> exercise, but we have lots of clever coders ...

Well I think my point is that I always have understood CTEs to be
executed precisely once producing a temporary result set that is then
referenced elsewhere. I don't think that property of CTEs should change.
Somewhere else in the thread someone mentioned predicate push down --
that makes sense and maybe some clever coder can come up with a patch
that does that, but I would not be in favor of CTEs being inlined and
therefore evaluated multiple times.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] CTE inlining

2017-05-04 Thread Tomas Vondra

On 5/4/17 7:56 PM, Andrew Dunstan wrote:



On 05/04/2017 01:52 PM, Joe Conway wrote:

On 05/04/2017 10:33 AM, Alvaro Herrera wrote:

I'm not sure what your point is.  We know that for some cases the
optimization barrier semantics are useful, which is why the proposal is
to add a keyword to install one explicitely:

  with materialized r as
  (
 select json_populate_record(null::mytype, myjson) as x
 from mytable
  )
  select (x).*
  from r;

this would preserve the current semantics.

I haven't been able to follow this incredibly long thread, so please
excuse me if way off base, but are we talking about a CTE that would
potentially be silently rewritten as an inline expression unless it is
decorated with some new syntax?

I would find that very disconcerting myself. For example, would this CTE
potentially get rewritten with multiple evaluation as follows?

DROP SEQUENCE IF EXISTS foo_seq;
CREATE SEQUENCE foo_seq;

WITH a(f1) AS (SELECT nextval('foo_seq'))
SELECT a.f1, a.f1 FROM a;
 f1 | ?column?
----+----------
  1 |        1
(1 row)

ALTER SEQUENCE foo_seq RESTART;
SELECT nextval('foo_seq'), nextval('foo_seq');
 nextval | ?column?
---------+----------
       1 |        2
(1 row)





I think that would be a change in semantics, which we should definitely
not be getting. Avoiding a change in semantics might be an interesting
exercise, but we have lots of clever coders ...



nextval is a volatile function, and in my understanding the plan was not 
to inline CTEs with volatile functions (or CTEs doing writes etc.). That 
is, we'd guarantee the same results as we get now.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Andrew Dunstan


On 05/04/2017 01:52 PM, Joe Conway wrote:
> On 05/04/2017 10:33 AM, Alvaro Herrera wrote:
>> I'm not sure what your point is.  We know that for some cases the
>> optimization barrier semantics are useful, which is why the proposal is
>> to add a keyword to install one explicitly:
>>
>>  with materialized r as
>>  (
>> select json_populate_record(null::mytype, myjson) as x
>> from mytable
>>  )
>>  select (x).*
>>  from r;
>>
>> this would preserve the current semantics.
> I haven't been able to follow this incredibly long thread, so please
> excuse me if way off base, but are we talking about a CTE that would
> potentially be silently rewritten as an inline expression unless it is
> decorated with some new syntax?
>
> I would find that very disconcerting myself. For example, would this CTE
> potentially get rewritten with multiple evaluation as follows?
>
> DROP SEQUENCE IF EXISTS foo_seq;
> CREATE SEQUENCE foo_seq;
>
> WITH a(f1) AS (SELECT nextval('foo_seq'))
> SELECT a.f1, a.f1 FROM a;
>  f1 | ?column?
> ----+----------
>   1 |        1
> (1 row)
>
> ALTER SEQUENCE foo_seq RESTART;
> SELECT nextval('foo_seq'), nextval('foo_seq');
>  nextval | ?column?
> ---------+----------
>        1 |        2
> (1 row)
>



I think that would be a change in semantics, which we should definitely
not be getting. Avoiding a change in semantics might be an interesting
exercise, but we have lots of clever coders ...

cheers

andrew

-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Joe Conway
On 05/04/2017 10:33 AM, Alvaro Herrera wrote:
> I'm not sure what your point is.  We know that for some cases the
> optimization barrier semantics are useful, which is why the proposal is
> to add a keyword to install one explicitly:
> 
>  with materialized r as
>  (
> select json_populate_record(null::mytype, myjson) as x
> from mytable
>  )
>  select (x).*
>  from r;
> 
> this would preserve the current semantics.

I haven't been able to follow this incredibly long thread, so please
excuse me if way off base, but are we talking about a CTE that would
potentially be silently rewritten as an inline expression unless it is
decorated with some new syntax?

I would find that very disconcerting myself. For example, would this CTE
potentially get rewritten with multiple evaluation as follows?

DROP SEQUENCE IF EXISTS foo_seq;
CREATE SEQUENCE foo_seq;

WITH a(f1) AS (SELECT nextval('foo_seq'))
SELECT a.f1, a.f1 FROM a;
 f1 | ?column?
----+----------
  1 |        1
(1 row)

ALTER SEQUENCE foo_seq RESTART;
SELECT nextval('foo_seq'), nextval('foo_seq');
 nextval | ?column?
---------+----------
       1 |        2
(1 row)

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] WIP Patch: Precalculate stable functions, infrastructure v1

2017-05-04 Thread Marina Polyakova

and here I send infrastructure patch which includes <...>


Next 2 patches:

Patch 'planning and execution', which includes:
- replacement of nonvolatile functions and operators by appropriate
cached expressions;

- planning and execution of cached expressions;
- regression tests.

Patch 'costs', which includes cost changes for cached expressions 
(according to their behaviour).
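
An illustrative case of what the patches target (my own example):
now() is stable, so currently it is re-evaluated for each row of the
scan; with cached expressions it would be computed once per execution
and the result reused.

SELECT count(*) FROM orders WHERE created_at < now();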


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From cf446cbfc8625701f9e3f32d1870b47de869802a Mon Sep 17 00:00:00 2001
From: Marina Polyakova 
Date: Thu, 4 May 2017 19:36:05 +0300
Subject: [PATCH 3/3] Precalculate stable functions, costs v1

Now in Postgresql only immutable functions are precalculated; stable functions
are calculated for every row so in fact they don't differ from volatile
functions.

This patch includes:
- cost changes for cached expressions (according to their behaviour)
---
 src/backend/optimizer/path/costsize.c | 58 ---
 1 file changed, 53 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 52643d0..34707fa 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -140,6 +140,7 @@ static MergeScanSelCache *cached_scansel(PlannerInfo *root,
 			   PathKey *pathkey);
 static void cost_rescan(PlannerInfo *root, Path *path,
 			Cost *rescan_startup_cost, Cost *rescan_total_cost);
+static double cost_eval_cacheable_expr_per_tuple(Node *node);
 static bool cost_qual_eval_walker(Node *node, cost_qual_eval_context *context);
 static void get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,
 		  ParamPathInfo *param_info,
@@ -3464,6 +3465,44 @@ cost_qual_eval_node(QualCost *cost, Node *qual, PlannerInfo *root)
 	*cost = context.total;
 }
 
+/*
+ * cost_eval_cacheable_expr_per_tuple
+ *		Evaluate per tuple cost for expressions that can be cacheable.
+ *
+ * This function was created to avoid duplicating code between plain
+ * expressions and their cached counterparts.
+ */
+static double
+cost_eval_cacheable_expr_per_tuple(Node *node)
+{
+	double result;
+
+	/*
+	 * For each operator or function node in the given tree, we charge the
+	 * estimated execution cost given by pg_proc.procost (remember to multiply
+	 * this by cpu_operator_cost).
+	 */
+	if (IsA(node, FuncExpr))
+	{
+		result = get_func_cost(((FuncExpr *) node)->funcid) * cpu_operator_cost;
+	}
+	else if (IsA(node, OpExpr))
+	{
+		OpExpr *opexpr = (OpExpr *) node;
+
+		/* rely on struct equivalence to treat these all alike */
+		set_opfuncid(opexpr);
+
+		result = get_func_cost(opexpr->opfuncid) * cpu_operator_cost;
+	}
+	else
+	{
+		elog(ERROR, "non cacheable expression node type: %d", (int) nodeTag(node));
+	}
+
+	return result;
+}
+
 static bool
 cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 {
@@ -3537,13 +3576,22 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 	 * moreover, since our rowcount estimates for functions tend to be pretty
 	 * phony, the results would also be pretty phony.
 	 */
-	if (IsA(node, FuncExpr))
+	if (IsA(node, FuncExpr) ||
+		IsA(node, OpExpr))
 	{
-		context->total.per_tuple +=
-			get_func_cost(((FuncExpr *) node)->funcid) * cpu_operator_cost;
+		context->total.per_tuple += cost_eval_cacheable_expr_per_tuple(node);
+	}
+	else if (IsA(node, CachedExpr))
+	{
+		/*
+		 * Calculate the subexpression cost per tuple as usual and add it to
+		 * the startup cost, because the subexpression will be executed only
+		 * once for all tuples.
+		 */
+		context->total.startup += cost_eval_cacheable_expr_per_tuple(
+			get_subexpr((CachedExpr *) node));
 	}
-	else if (IsA(node, OpExpr) ||
-			 IsA(node, DistinctExpr) ||
+	else if (IsA(node, DistinctExpr) ||
 			 IsA(node, NullIfExpr))
 	{
 		/* rely on struct equivalence to treat these all alike */
-- 
1.9.1

From 508f8b959ff9d1ab78dfc79ab4657b4c10a11690 Mon Sep 17 00:00:00 2001
From: Marina Polyakova 
Date: Thu, 4 May 2017 19:09:51 +0300
Subject: [PATCH 2/3] Precalculate stable functions, planning and execution v1

Now in PostgreSQL only immutable functions are precalculated; stable functions
are evaluated for every row, so in fact they don't differ from volatile
functions.

This patch includes:
- replacement of nonvolatile functions and operators with appropriate cached
expressions
- planning and execution of cached expressions
- regression tests
---
 src/backend/executor/execExpr.c|  70 ++
 src/backend/executor/execExprInterp.c  | 191 +
 src/backend/optimizer/path/allpaths.c  |   9 +-
 src/backend/optimizer/path/clausesel.c |  13 +
 src/backend/optimizer/plan/planagg.c   |   1 +
 src/backend/optimizer/plan/planner.c   |  28 +
 src/backend/optimizer/util/clauses.c   |  43 ++
 src/backend/utils/adt/ruleutils.c 

Re: [HACKERS] CTE inlining

2017-05-04 Thread Andrew Dunstan


On 05/04/2017 01:33 PM, Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>> Hadn't thought about LATERAL, good point. Still, there will be other cases.
> I'm not sure what your point is.  We know that for some cases the
> optimization barrier semantics are useful, which is why the proposal is
> to add a keyword to install one explicitly:
>
>  with materialized r as
>  (
> select json_populate_record(null::mytype, myjson) as x
> from mytable
>  )
>  select (x).*
>  from r;
>
> this would preserve the current semantics.
>


I understand. Some people appear to think hardly anyone will actually
need this - that's what I was arguing against.

cheers

andrew

-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE inlining

2017-05-04 Thread Alvaro Herrera
Andrew Dunstan wrote:

> Hadn't thought about LATERAL, good point. Still, there will be other cases.

I'm not sure what your point is.  We know that for some cases the
optimization barrier semantics are useful, which is why the proposal is
to add a keyword to install one explicitly:

 with materialized r as
 (
select json_populate_record(null::mytype, myjson) as x
from mytable
 )
 select (x).*
 from r;

this would preserve the current semantics.
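
Without the keyword, the planner would be free to inline the CTE, so the
query above could effectively end up planned as if it had been written
(a sketch of the rewrite, not literal planner output):

 select (json_populate_record(null::mytype, myjson)).*
 from mytable;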

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] json_agg produces nonstandard json

2017-05-04 Thread Jordan Deitch
Hello!

I apologize in advance if this has been discussed previously.

A json(b)_agg() will produce the following result when no results are
passed in:

"[null]"

per:

select jsonb_agg((select 1 where false));

I believe, generally speaking, '[]' would be the more appropriate output.

Would postgres welcome a patch to handle the empty case of json(b)_agg?
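
For reference, a workaround sketch: note that the query above actually
feeds the aggregate a single SQL NULL, while an aggregate over zero rows
returns SQL NULL, which can at least be coalesced:

select jsonb_agg(x) from (select 1 where false) as t(x);
-- NULL (zero input rows)

select coalesce(jsonb_agg(x), '[]'::jsonb)
from (select 1 where false) as t(x);
-- []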

Thanks!

---
Jordan Deitch


Re: [HACKERS] CTE inlining

2017-05-04 Thread Andrew Dunstan


On 05/04/2017 12:34 PM, David G. Johnston wrote:
> On Thu, May 4, 2017 at 9:22 AM, Andrew Dunstan wrote:
>
>
> Yeah, the idea that this won't cause possibly significant pain is
> quite wrong. Quite by accident I came across an example just this
> morning where rewriting as a CTE makes a big improvement.
>
> I wrote this query:
>
> select (json_populate_record(null::mytype, myjson)).*
> from mytable;
>
>
> It turned out that this was an order of magnitude faster:
>
> with r as
> (
>select json_populate_record(null::mytype, myjson) as x
>from mytable
> )
> select (x).*
> from r;
>
>
> Except I suspect we at least have a chance to detect the above and
> not de-optimize it by evaluating "json_populate_record" once for every
> column in mytype.
>
> The now idiomatic solution to the above is to use LATERAL so the
> above CTE is no longer actually a required workaround.


Hadn't thought about LATERAL, good point. Still, there will be other cases.
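
(For the record, the LATERAL form would presumably be something like:

select x.*
from mytable,
     lateral json_populate_record(null::mytype, myjson) as x;

which evaluates the function once per input row rather than once per
output column.)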

cheers

andrew


-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

2017-05-04 Thread Robert Haas
On Thu, May 4, 2017 at 1:04 PM, Amit Kapila  wrote:
> On Thu, May 4, 2017 at 10:18 PM, Robert Haas  wrote:
>> On Thu, May 4, 2017 at 12:18 PM, Amit Kapila  wrote:
>>> As soon as the first command fails due to timeout, we will set
>>> 'abort_cleanup_failure', which will make the toplevel transaction
>>> abort and also won't allow other statements to execute.  The patch is
>>> trying to enforce a 30-second timeout after statement execution, so it
>>> has to wait anyway until the command execution completes (irrespective
>>> of whether the command succeeds or fails).
>>
>> I don't understand what you mean by this.  If the command doesn't
>> finish within 30 seconds, we *don't* wait for it to finish.
>>
>
> + /*
> + * Submit a query.  Since we don't use non-blocking mode, this also can
> + * block.  But its risk is relatively small, so we ignore that for now.
> + */
> + if (!PQsendQuery(conn, query))
> + {
> + pgfdw_report_error(WARNING, NULL, conn, false, query);
> + return false;
> + }
> +
> + /* Get the result of the query. */
> + if (pgfdw_get_cleanup_result(conn, endtime, &result))
> + return false;
>
> The 30-second logic is in the function pgfdw_get_cleanup_result; can't
> the command hang in PQsendQuery?

Sure.  I thought about trying to fix that too, by using
PQsetnonblocking(), but I thought the patch was doing enough already.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

2017-05-04 Thread Amit Kapila
On Thu, May 4, 2017 at 10:18 PM, Robert Haas  wrote:
> On Thu, May 4, 2017 at 12:18 PM, Amit Kapila  wrote:
>> As soon as the first command fails due to timeout, we will set
>> 'abort_cleanup_failure', which will make the toplevel transaction
>> abort and also won't allow other statements to execute.  The patch is
>> trying to enforce a 30-second timeout after statement execution, so it
>> has to wait anyway until the command execution completes (irrespective
>> of whether the command succeeds or fails).
>
> I don't understand what you mean by this.  If the command doesn't
> finish within 30 seconds, we *don't* wait for it to finish.
>

+ /*
+ * Submit a query.  Since we don't use non-blocking mode, this also can
+ * block.  But its risk is relatively small, so we ignore that for now.
+ */
+ if (!PQsendQuery(conn, query))
+ {
+ pgfdw_report_error(WARNING, NULL, conn, false, query);
+ return false;
+ }
+
+ /* Get the result of the query. */
+ if (pgfdw_get_cleanup_result(conn, endtime, &result))
+ return false;

The 30-second logic is in the function pgfdw_get_cleanup_result; can't
the command hang in PQsendQuery?



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing runtime of stats regression test

2017-05-04 Thread Robert Haas
On Thu, May 4, 2017 at 10:22 AM, Tom Lane  wrote:
> Yes, but that would be getting into the realm of new features, not
> post-feature-freeze test adjustments.  It certainly couldn't be
> a candidate for back-patching.

I'm not sure there's some bright line between adding a new
SQL-callable function to cut down the test time and any other
tinkering we might do to reduce the regression test time.  I think
there's a pretty good argument that all of the recent changes you made
in this area constitute strictly optional tinkering.  I haven't been
objecting because they don't seem likely to destabilize anything, but
I don't see that they're really helping us get ready for beta either,
which is presumably what we ought to be focusing on at this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

