Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

2017-03-19 Thread David Rowley
On 17 March 2017 at 00:04, Amit Kapila  wrote:
> On Thu, Mar 16, 2017 at 7:22 AM, David Rowley
>  wrote:
>> On 16 March 2017 at 13:31, Simon Riggs  wrote:
>>>
>>> On 8 March 2017 at 10:02, David Rowley 
>>> wrote:
>>> > On 8 March 2017 at 01:13, Simon Riggs  wrote:
>>> >> Don't understand this. I'm talking about setting a flag on
>>> >> commit/abort WAL records, like the attached.
>>> >
>>> > There's nothing setting a flag in the attached. I only see the
>>> > checking of the flag.
>>>
>>> Yeh, it wasn't a full patch, just a way marker to the summit for you.
>>>
>>> >> We just need to track info so we can set the flag at EOXact and we're
>>> >> done.
>>> >
>>> > Do you have an idea where to store that information? I'd assume we'd
>>> > want to store something during LogAccessExclusiveLocks() and check
>>> > that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
>>> > anything existing which might help with that today. Do you think I
>>> > should introduce some new global variable for that? Or do you propose
>>> > calling GetRunningTransactionLocks() again while generating the
>>> > Commit/Abort record?
>>>
>>> Seemed easier to write it than explain further. Please see what you think.
>>
>>
>> Thanks. I had been looking for some struct to store the flag in. I'd not
>> considered just adding a new global variable.
>>
>> I was in fact just working on this again earlier, and I had come up with the
>> attached.
>>
>
> I think this will not work if you take an AEL in one of the
> subtransactions and then commit the transaction.  The reason is that
> you are using CurrentTransactionState to remember AEL locks, and during
> commit (at the time of XactLogCommitRecord) only the top-level
> CurrentTransactionState will be available, which is not the same as the
> subtransaction's CurrentTransactionState.  You can try with a simple
> test like below:
>
> Create table t1(c1 int);
> Begin;
> insert into t1 values(1);
> savepoint s1;
> insert into t1 values(2);
> savepoint s2;
> insert into t1 values(3);
> Lock t1 in Access Exclusive mode;
> commit;
>
> Now, we might want to store this info in the top-level transaction state
> to make such cases work, but I am not sure if it is better than what
> Simon has proposed in his approach.  Although I have not evaluated it in
> detail, it seems Simon's patch is a better way to solve this
> problem (in his patch, there is explicit handling of two-phase
> commits, which seems more robust).
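Amit's scenario can be modeled with a toy transaction-state stack. This is purely illustrative (the names do not match the real PostgreSQL structures), but it shows why a flag kept in the per-subtransaction state is invisible at top-level commit, and how propagating it upward would fix the test case above.

```python
# Toy model: didLogAELock recorded on the *current* (sub)transaction state,
# as in the patch under discussion, versus propagated to the top level.
# All names here are illustrative, not the actual PostgreSQL structures.

class TransactionState:
    def __init__(self, parent=None):
        self.parent = parent
        self.did_log_ael_lock = False

def lock_access_exclusive(current):
    # Patch under discussion: flag set on the current (sub)xact state.
    current.did_log_ael_lock = True

def commit_logs_ael(top):
    # XactLogCommitRecord() only sees the top-level state.
    return top.did_log_ael_lock

top = TransactionState()
sub = TransactionState(parent=top)   # "savepoint s1"
lock_access_exclusive(sub)           # LOCK ... IN ACCESS EXCLUSIVE MODE in subxact
print(commit_logs_ael(top))          # False: the commit record misses the AEL

# One possible fix: always record the flag on the top-level state.
def lock_access_exclusive_fixed(current):
    while current.parent is not None:
        current = current.parent
    current.did_log_ael_lock = True

top2 = TransactionState()
sub2 = TransactionState(parent=top2)
lock_access_exclusive_fixed(sub2)
print(commit_logs_ael(top2))         # True
```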
>
> Few other comments on patch:
> 1.
> @@ -5325,6 +5341,9 @@ XactLogAbortRecord(TimestampTz abort_time,
>   if (xl_xinfo.xinfo &
> XACT_XINFO_HAS_TWOPHASE)
>   XLogRegisterData((char *) (_twophase), sizeof
> (xl_xact_twophase));
>
> + if (CurrentTransactionState->didLogAELock)
> + xl_xinfo.xinfo |=
> XACT_XINFO_HAS_AE_LOCKS;
>
> You need to set xl_info before the below line in XactLogAbortRecord()
> to get it logged properly.
> if (xl_xinfo.xinfo != 0)
> info |= XLOG_XACT_HAS_INFO;
>
> 2.
> Explicitly Initialize didLogAELock in StartTransaction as we do for didLogXid.

Thanks for looking over this, Amit. I'll drop my version of the patch
and make the changes described by Simon about moving
MyXactAccessedTempRel and MyXactAcquiredAccessExclusiveLock into a
bitmap flags variable.
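A minimal sketch of the bitmap-flags idea, using hypothetical constant names (the names in the eventual patch may differ):

```python
# Folding several per-transaction booleans into one bitmask variable.
# Constant names are illustrative only.

XACT_FLAG_ACCESSED_TEMP_REL         = 1 << 0
XACT_FLAG_ACQUIRED_ACCESS_EXCL_LOCK = 1 << 1

my_xact_flags = 0

# Instead of "MyXactAcquiredAccessExclusiveLock = true":
my_xact_flags |= XACT_FLAG_ACQUIRED_ACCESS_EXCL_LOCK

# Instead of testing the separate boolean:
if my_xact_flags & XACT_FLAG_ACQUIRED_ACCESS_EXCL_LOCK:
    print("commit/abort record must carry the AE-locks flag")

# A single assignment at transaction start replaces resetting each boolean:
my_xact_flags = 0
```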

Thanks for confirming my concerns about the subxacts. I was confused
over how it was meant to work due to StandbyReleaseLockTree() having
some dead code.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

2017-03-19 Thread David Rowley
On 18 March 2017 at 21:52, Simon Riggs  wrote:
> 2. In LogAccessExclusiveLock() we can use GetCurrentTransactionId()
> rather than GetTopTransactionId(), so that we assign the lock to the
> subxid rather than the top xid. That could increase lock traffic, but
> that is less likely. It also solves the problem of early release when
> AELs are held by subxids.
>
> (2) looks safe enough, so patch attached.

I might be missing something here, but what's the point in doing this
if we're not releasing the lock any earlier?

Going by the other patch you posted, we're only recording whether the top
level xact has an AEL, so we'll never try to release the locks at the
end of the subxact, since the new flag bit won't be set when the
subxact ends.

It seems it would be better just to remove the dead code from
StandbyReleaseLockTree(), as that would mean one pass over the
RecoveryLockList when releasing locks at the end of the top-level
transaction. Probably we'd want to just move all of
StandbyReleaseLocks()'s code into StandbyReleaseLockTree(), since
StandbyReleaseLockTree() would otherwise just end up calling the
static function.

Probably I misunderstood this again. Let me know if that's the case.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

2017-03-19 Thread Craig Ringer
On 14 March 2017 at 19:57, Robert Haas  wrote:
> On Mon, Mar 13, 2017 at 10:22 PM, Craig Ringer  wrote:
>> I'll introduce a new LWLock, ClogTruncationLock, which will be held
>> from when we advance the new clogOldestXid field through to when clog
>> truncation completes.
>
> +1.

OK, here's the revised patch. Relevant comments added where needed.

It still changes the xlog record for clog truncation to carry the new
xid, but I've left advancing oldestXid until the checkpoint, as is
currently the case, and only advance oldestClogXid when we replay
clog truncation.
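The locking protocol Craig describes (advance the limit under ClogTruncationLock before truncating, and hold the same lock across arbitrary-XID lookups) can be sketched as a toy model. The names loosely mirror the patch, but none of this is the real SLRU code.

```python
import threading

# Toy model of the clog truncation race and its fix.
clog_truncation_lock = threading.Lock()
oldest_clog_xid = 100          # analogue of the new oldestClogXid field
clog = {xid: "committed" for xid in range(100, 200)}

def truncate_clog(new_oldest):
    global oldest_clog_xid
    # Advance the limit under the lock first...
    with clog_truncation_lock:
        oldest_clog_xid = new_oldest
    # ...then physically remove old entries (safe: lookups are excluded).
    for xid in [x for x in clog if x < new_oldest]:
        del clog[xid]

def txid_status(xid):
    # Hold the lock across both the limit check and the clog read, so
    # truncation cannot slip in between them.
    with clog_truncation_lock:
        if xid < oldest_clog_xid:
            return None        # "too old to determine", not an I/O error
        return clog[xid]

print(txid_status(150))   # committed
truncate_clog(160)
print(txid_status(150))   # None, rather than an error on a missing segment
```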

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From f2280911c8ba9d4ce75d72948aef2acd4821f613 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 23 Jan 2017 13:25:30 +0800
Subject: [PATCH 1/3] Fix race between clog truncation and lookup

There was previously no way to look up an arbitrary xid without
running the risk of having clog truncated out from under you.

This hasn't been a problem because anything looking up xids in clog knows
they're protected by datminxid, but that's not the case for arbitrary
user-supplied XIDs. clog is truncated before we advance oldestXid under
XidGenLock, so holding XidGenLock during a clog lookup is insufficient to
prevent the race. There's no way to look up an SLRU with soft failure;
attempting a lookup produces an I/O error. There's also no safe way to trap
and swallow the SLRU lookup error, due mainly to locking issues.

To address this, introduce a copy of oldestXid, oldestClogXid, that is advanced
before clog truncation under ClogTruncationLock. Lookups of arbitrary XIDs
must take and hold ClogTruncationLock to prevent concurrent advance of the
minimum valid xid in clog.

This race also exists in a worse form on standby servers. On a standby we only
advance oldestXid when we replay the next checkpoint, so there's a much larger
window between clog truncation and subsequent updating of the limit. Fix this
by recording the oldest xid in clog truncation records and applying the update
to oldestClogXid under ClogTruncationLock before replaying the clog truncation.

Note that there's no need to take ClogTruncationLock for normal clog lookups
protected by datfrozenxid, only if accepting arbitrary XIDs that might not be
protected by vacuum thresholds.
---
 src/backend/access/rmgrdesc/clogdesc.c   | 12 +++--
 src/backend/access/transam/clog.c| 46 +---
 src/backend/access/transam/transam.c | 10 +--
 src/backend/access/transam/varsup.c  | 23 +++-
 src/backend/access/transam/xlog.c| 11 
 src/backend/commands/vacuum.c|  2 +-
 src/backend/storage/lmgr/lwlocknames.txt |  1 +
 src/include/access/clog.h|  8 +-
 src/include/access/transam.h |  7 +
 9 files changed, 103 insertions(+), 17 deletions(-)

diff --git a/src/backend/access/rmgrdesc/clogdesc.c b/src/backend/access/rmgrdesc/clogdesc.c
index 352de48..ef268c5 100644
--- a/src/backend/access/rmgrdesc/clogdesc.c
+++ b/src/backend/access/rmgrdesc/clogdesc.c
@@ -23,12 +23,20 @@ clog_desc(StringInfo buf, XLogReaderState *record)
 	char	   *rec = XLogRecGetData(record);
 	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
 
-	if (info == CLOG_ZEROPAGE || info == CLOG_TRUNCATE)
+	if (info == CLOG_ZEROPAGE)
 	{
 		int			pageno;
 
		memcpy(&pageno, rec, sizeof(int));
-		appendStringInfo(buf, "%d", pageno);
+		appendStringInfo(buf, "page %d", pageno);
+	}
+	else if (info == CLOG_TRUNCATE)
+	{
+		xl_clog_truncate xlrec;
+
		memcpy(&xlrec, rec, sizeof(xl_clog_truncate));
+		appendStringInfo(buf, "page %d; oldestXact %u",
+			xlrec.pageno, xlrec.oldestXact);
 	}
 }
 
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 5b1d13d..2d33510 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -83,7 +83,8 @@ static SlruCtlData ClogCtlData;
 static int	ZeroCLOGPage(int pageno, bool writeXlog);
 static bool CLOGPagePrecedes(int page1, int page2);
 static void WriteZeroPageXlogRec(int pageno);
-static void WriteTruncateXlogRec(int pageno);
+static void WriteTruncateXlogRec(int pageno, TransactionId oldestXact,
+ Oid oldestXidDb);
 static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
 		   TransactionId *subxids, XidStatus status,
 		   XLogRecPtr lsn, int pageno);
@@ -640,7 +641,7 @@ ExtendCLOG(TransactionId newestXact)
  * the XLOG flush unless we have confirmed that there is a removable segment.
  */
 void
-TruncateCLOG(TransactionId oldestXact)
+TruncateCLOG(TransactionId oldestXact, Oid oldestxid_datoid)
 {
 	int			cutoffPage;
 
@@ -654,8 +655,26 @@ TruncateCLOG(TransactionId oldestXact)
 	if (!SlruScanDirectory(ClogCtl, SlruScanDirCbReportPresence, &cutoffPage))
 		return;	/* nothing to remove */
 
-	/* Write XLOG record 

Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-19 Thread Andres Freund
Hi,


On 2017-03-19 23:55:50 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I've been pondering if we can't entirely get rid of CaseTest etc, the
> > amount of hackery required seems not like a good thing. One way I'd
> > prototyped was to replace them with PARAM_EXEC nodes - then the whole
> > issue of them potentially having different values at different parts of
> > an expression vanishes because the aliasing is removed.
> 
> Yes, replacing all of that with Param slots had occurred to me too.
> We might want to keep the special parse node types for convenience in
> reverse-listing, but having them act just like PARAM_EXEC for execution
> purposes seems promising.

As long as that special parse-time node is part of the same value
numbering, that makes sense (could just make it a subtype of Param,
a la PARAM_CASE).  I don't think we actually do anything useful in
ruleutils etc with either CaseTest or CoerceToDomainValue.

- Andres




Re: [HACKERS] pageinspect and hash indexes

2017-03-19 Thread Amit Kapila
On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma  wrote:
> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila  wrote:
>> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma  
>>> wrote:
>>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes  wrote:
>>>> While trying to figure out some bloating in the newly logged hash indexes,
>>>> I'm looking into the type of each page in the index.  But I get an error:
>>>>
>>>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x))
>>>> from
>>>> generate_series(1650,1650) f(x)"
>>>>
>>>> ERROR:  page is not a hash page
>>>> DETAIL:  Expected ff80, got .
>>>>
>>>> The contents of the page are:
>>>>
>>>> \xa400d8f203bf65c91800f01ff01f0420...
>>>>
>>>> (where the elided characters at the end are all zero)
>>>>
>>>> What kind of page is that actually?
>>>
>>> it is basically either a newly allocated bucket page or a freed overflow 
>>> page.
>>>
>>
>> What makes you think that it can be a newly allocated page?
>> Basically, we always initialize the special space of newly allocated
>> page, so not sure what makes you deduce that it can be newly allocated
>> page.
>
> I came to know this from the following experiment.
>
> I  created a hash index and kept on inserting data in it till the split 
> happens.
>
> When split happened, I could see following values for firstblock and
> lastblock in _hash_alloc_buckets()
>
> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
> nblocks=32) at hashpage.c:993
> (gdb) n
> (gdb) p firstblock
> $15 = 34
> (gdb) p nblocks
> $16 = 32
> (gdb) n
> (gdb) p lastblock
> $17 = 65
>
> AFAIU, this bucket split resulted in creation of new bucket pages from
> block number 34 to 65.
>
> The contents for metap are as follows,
>
> (gdb) p *metap
> $18 = {hashm_magic = 105121344, hashm_version = 2, hashm_ntuples =
> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
> hashm_bmshift = 15,
>   hashm_maxbucket = 32, hashm_highmask = 63, hashm_lowmask = 31,
> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
> hashm_procid = 450,
>   hashm_spares = {0, 0, 0, 0, 0, 1, 1, 0 },
> hashm_mapp = {33, 0 }}
>
> Now, if i try to check the page type for block number 65, this is what i see,
>
> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
> ERROR:  page is not a hash page
> DETAIL:  Expected ff80, got .
> test=#
>

The contents of such a page should be zero, and Jeff has reported some
valid-looking contents of the page.  If you see this page's contents as
zero, then we can conclude that what Jeff is seeing was a freed overflow
page.
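The distinction being drawn can be sketched as a classifier over raw page bytes: an all-zero page was never initialized, while an initialized hash page carries the hash page id (the ff80 in Jeff's error message) in its special space. The layout here is simplified relative to the real PageHeader/HashPageOpaque structures.

```python
HASHO_PAGE_ID = 0xFF80  # hash page id, per the "Expected ff80" error

def classify_page(page: bytes) -> str:
    if not any(page):
        # Never initialized: a just-allocated bucket page or a freed page.
        return "zeroed"
    # An initialized hash page ends with the hash page id in its special area.
    page_id = int.from_bytes(page[-2:], "little")
    return "hash page" if page_id == HASHO_PAGE_ID else "not a hash page"

zero_page = bytes(8192)
hash_page = b"\xa4\x00" + bytes(8188) + (0xFF80).to_bytes(2, "little")
print(classify_page(zero_page))   # zeroed
print(classify_page(hash_page))   # hash page
```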


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Adding support for Default partition in partitioning

2017-03-19 Thread Rahila Syed
Hello,

Please find attached a rebased patch with support for pg_dump. I am working
on a patch to handle adding a new partition after a default partition: it
throws an error if conflicting rows exist in the default partition, and
adds the partition successfully otherwise.
Will post an updated patch by tomorrow.

Thank you,
Rahila Syed

On Mon, Mar 13, 2017 at 5:42 AM, Robert Haas  wrote:

> On Fri, Mar 10, 2017 at 2:17 PM, Peter Eisentraut
>  wrote:
> > On 3/2/17 21:40, Robert Haas wrote:
> >> On the point mentioned above, I
> >> don't think adding a partition should move tuples, necessarily; seems
> >> like it would be good enough - maybe better - for it to fail if there
> >> are any that would need to be moved.
> >
> > ISTM that the use cases of various combinations of adding a default
> > partition, adding another partition after it, removing the default
> > partition, clearing out the default partition in order to add more
> > nondefault partitions, and so on, need to be more clearly spelled out
> > for each partitioning type.  We also need to consider that pg_dump and
> > pg_upgrade need to be able to reproduce all those states.  Seems to be a
> > bit of work still ...
>
> This patch is only targeting list partitioning.   It is not entirely
> clear that the concept makes sense for range partitioning; you can
> already define a partition from the end of the last partition up to
> infinity, or from minus-infinity up to the starting point of the first
> partition.  The only thing a default range partition would let you do is
> have a single partition cover all of the ranges that would otherwise be
> unassigned, which might not even be something we want.
>
> I don't know how complete the patch is, but the specification seems
> clear enough.  If you have partitions for 1, 3, and 5, you get
> partition constraints of (a = 1), (a = 3), and (a = 5).  If you add a
> default partition, you get a constraint of (a != 1 and a != 3 and a !=
> 5).  If you then add a partition for 7, the default partition's
> constraint becomes (a != 1 and a != 3 and a != 5 and a != 7).  The
> partition must be revalidated at that point for conflicting rows,
> which we can either try to move to the new partition, or just error
> out if there are any, depending on what we decide we want to do.  I
> don't think any of that requires any special handling for either
> pg_dump or pg_upgrade; it all just falls out of getting the
> partitioning constraints correct and consistently enforcing them, just
> as for any other partition.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
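Robert's specification can be modeled directly: the default partition's implicit constraint is the negation of all bound values, and adding a partition tightens it, forcing revalidation of the default partition's rows. This illustrates the rule only, not the catalog machinery.

```python
# Default list partition constraint: (a != v1 and a != v2 and ...).
def default_partition_constraint(bound_values):
    return lambda a: all(a != v for v in bound_values)

bounds = [1, 3, 5]
accepts = default_partition_constraint(bounds)
print(accepts(7))        # True: 7 currently routes to the default partition

# Adding a partition for 7 tightens the constraint to
# (a != 1 and a != 3 and a != 5 and a != 7) ...
default_rows = [2, 7, 9]
tightened = default_partition_constraint(bounds + [7])

# ... so rows already in the default partition must be revalidated:
conflicting = [r for r in default_rows if not tightened(r)]
print(conflicting)       # [7]: move these to the new partition, or error out
```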


default_partition_v2.patch
Description: application/download



Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-19 Thread Tom Lane
Andres Freund  writes:
> On 2017-03-16 11:15:16 -0400, Tom Lane wrote:
>> The thing that actually made the ExecEvalCase code into a bug was that
>> we were using ExprContext-level fields to store the current caseValue,
>> allowing aliasing to occur across nested CASEs.  I think that in
>> this implementation, it ought to be possible to get rid of
>> ExprContext.caseValue_datum et al altogether, in favor of some storage
>> location that's private to each CASE expression.  I'm a bit disappointed
>> that that doesn't seem to have happened.

> ...  Unfortunately
> CaseTest/CoerceToDomainValue are reused outside of domain / case
> expressions in a bunch of places (plpgsql uses CaseTest for casts
> evaluation, validateDomainConstraint/domain_check_input evaluate domain
> constraints without a CoerceToDomain node).

Yeah, there's various stuff that did that for expediency.

> I'd like to get rid of those usages, but that'd recurse into rewriting
> plpgsql casts and other random pieces of code into a different approach
> - something I'd like to avoid doing at the same as this already large
> patch.

Agreed, maybe we should just plan to clean that up later.

> I've been pondering if we can't entirely get rid of CaseTest etc, the
> amount of hackery required seems not like a good thing. One way I'd
> prototyped was to replace them with PARAM_EXEC nodes - then the whole
> issue of them potentially having different values at different parts of
> an expression vanishes because the aliasing is removed.

Yes, replacing all of that with Param slots had occurred to me too.
We might want to keep the special parse node types for convenience in
reverse-listing, but having them act just like PARAM_EXEC for execution
purposes seems promising.

>> Eventually, I would also like to find a way to remove the restriction
>> imposed by the other part of f0c7b789a, ie that we can't inline a SQL
>> function when that would result in intermixing two levels of CASE
>> expression.

> That seems to suggest my PARAM_EXEC idea isn't necessarily perfect -
> inlining would cause aliasing again, but it'd also not hard to fix that
> up.

Right, inlining would probably require some parameter-renumbering to avoid
aliasing.  But at least we'd have a clear framework for how to handle it,
whereas the way things are now, it's just impossible.
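The aliasing hazard, and why per-CASE Param slots remove it, can be shown with a toy evaluator. This models the concept only, not the executor.

```python
# Buggy scheme: one shared slot, like ExprContext.caseValue_datum.
shared = {"case_value": None}

def eval_inner_shared(y):
    shared["case_value"] = y
    return shared["case_value"] * 10

def eval_outer_shared(x):
    shared["case_value"] = x          # outer CASE x WHEN ...
    inner = eval_inner_shared(x + 1)  # nested CASE clobbers the shared slot
    # The outer WHEN arm reads the slot again and sees the inner value:
    return (shared["case_value"], inner)

print(eval_outer_shared(1))   # (2, 20): the outer test value 1 was lost

# Param-slot scheme: each CASE gets its own slot number at plan time,
# akin to PARAM_EXEC value numbering.
params = {}

def eval_inner_params(y):
    params[1] = y                     # inner CASE assigned slot 1
    return params[1] * 10

def eval_outer_params(x):
    params[0] = x                     # outer CASE assigned slot 0
    inner = eval_inner_params(x + 1)
    return (params[0], inner)

print(eval_outer_params(1))   # (1, 20): no aliasing
```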

regards, tom lane




Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-19 Thread Andres Freund
Hi,


On 2017-03-16 11:15:16 -0400, Tom Lane wrote:
> The thing that actually made the ExecEvalCase code into a bug was that
> we were using ExprContext-level fields to store the current caseValue,
> allowing aliasing to occur across nested CASEs.  I think that in
> this implementation, it ought to be possible to get rid of
> ExprContext.caseValue_datum et al altogether, in favor of some storage
> location that's private to each CASE expression.  I'm a bit disappointed
> that that doesn't seem to have happened.

The patch actually does so - during ExecInitExprRec the "relevant"
case/domain testval is stored in ExprState->innermost_*, those pointers
are then stored directly in the relevant steps for
CaseTest/CoerceToDomainValue evaluation.  Unfortunately
CaseTest/CoerceToDomainValue are reused outside of domain / case
expressions in a bunch of places (plpgsql uses CaseTest for casts
evaluation, validateDomainConstraint/domain_check_input evaluate domain
constraints without a CoerceToDomain node).  I.e
ExprContext.caseValue_datum etc. aren't used for normal expressions
anymore, just for the ad-hoc hackery in a bunch of places.

I'd like to get rid of those usages, but that'd recurse into rewriting
plpgsql casts and other random pieces of code into a different approach
- something I'd like to avoid doing at the same as this already large
patch.


I've been pondering if we can't entirely get rid of CaseTest etc, the
amount of hackery required seems not like a good thing. One way I'd
prototyped was to replace them with PARAM_EXEC nodes - then the whole
issue of them potentially having different values at different parts of
an expression vanishes because the aliasing is removed.



> Eventually, I would also like to find a way to remove the restriction
> imposed by the other part of f0c7b789a, ie that we can't inline a SQL
> function when that would result in intermixing two levels of CASE
> expression.  An implementation along the lines of what I've sketched
> above could handle that easily enough, as long as we could identify
> which nested level of CASE a particular CaseTestExpr belongs to.
> I don't know how to do that offhand, but it seems like it ought to be
> soluble if we put a bit of thought into it.

I haven't thought overly much about this, but I agree, it looks like it
should be doable.

That seems to suggest my PARAM_EXEC idea isn't necessarily perfect -
inlining would cause aliasing again, but it'd also not hard to fix that
up.

- Andres




Re: [HACKERS] Determine if an error is transient by its error code.

2017-03-19 Thread Tom Lane
Craig Ringer  writes:
> On 20 March 2017 at 10:26, Dominick O'Dierno  wrote:
>> Essentially I want to determine by the error code if it is worth retrying
>> the call (transient) or if the error was due to a bad query or programmer
>> error, in which case don't retry.

> In general you'll need classes of retry:
> * just reissue the query (deadlock retry, etc)
> * reconnect and retry

Yeah.  There's a pretty significant fraction of these where just blindly
repeating the failing query isn't likely to help; the error code is meant
to suggest that the DBA has to fix something, eg adjust configuration
limits.  I'm also pretty dubious about the value of a blind retry for,
eg, disk_full.

One you missed that I think *is* supposed to imply "just retry" is
40001 serialization_failure.  You have to retry the whole transaction
though, not just one query.
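Tom's point that 40001 requires retrying the whole transaction can be sketched as a client-side wrapper. The TxnError class here is hypothetical, standing in for however a real driver exposes the SQLSTATE (e.g. psycopg2's Error.pgcode).

```python
# SQLSTATEs where rerunning the whole transaction from BEGIN makes sense.
RETRY_WHOLE_TXN = {"40001", "40P01"}   # serialization_failure, deadlock_detected

class TxnError(Exception):
    """Hypothetical driver error carrying the SQLSTATE."""
    def __init__(self, sqlstate):
        self.sqlstate = sqlstate

def run_transaction(txn_body, max_attempts=3):
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body()          # BEGIN ... COMMIT as one unit
        except TxnError as e:
            if e.sqlstate in RETRY_WHOLE_TXN and attempt < max_attempts:
                continue               # rerun from BEGIN, never mid-transaction
            raise

attempts = []
def flaky_txn():
    attempts.append(1)
    if len(attempts) < 3:
        raise TxnError("40001")        # first two tries hit serialization failure
    return "committed"

print(run_transaction(flaky_txn))      # committed (on the third attempt)
```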

regards, tom lane




Re: [HACKERS] Determine if an error is transient by its error code.

2017-03-19 Thread Craig Ringer
On 20 March 2017 at 10:26, Dominick O'Dierno  wrote:
> Hello folks,
>
> I'm trying to define a transient fault detection strategy for a client
> application when calling a postgres database.
>
> Essentially I want to determine by the error code if it is worth retrying
> the call (transient) or if the error was due to a bad query or programmer
> error, in which case don't retry.
>
> Going through the codes as posted here
> https://www.postgresql.org/docs/9.6/static/errcodes-appendix.html I had a go
> at making a list of error codes which may be transient:
>
> 53000: insufficient_resources
> 53100: disk_full
> 53200: out_of_memory
> 53300: too_many_connections
> 53400: configuration_limit_exceeded
> 57000: operator_intervention
> 57014: query_canceled
> 57P01: admin_shutdown
> 57P02: crash_shutdown
> 57P03: cannot_connect_now
> 57P04: database_dropped
> 58000: system_error
> 58030: io_error

Depends on how transient you mean, really.

I/O error, disk full, cannot_connect_now, etc may or may not require
admin intervention.

I would argue that database_dropped isn't transient. But I guess you
might be re-creating it?

> These next few I am not sure whether they should be treated as transient or
> not, but I am guessing so
>
> 55P03: lock_not_available

Yeah, I'd say that's transient.

> 55006: object_in_use

Same.

> 55000: object_not_in_prerequisite_state

Varies. This can be a bit of a catchall error, encompassing things
that need configuration changes, things that need system state changes
(won't work in recovery, or whatever), and things that will change in a
short span of time.

In general you'll need classes of retry:

* just reissue the query (deadlock retry, etc)
* reconnect and retry

etc.
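Craig's two retry classes can be written down as a simple SQLSTATE classifier. The class assignments follow the discussion above and are judgment calls, not an authoritative list.

```python
RETRY_SAME_CONNECTION = {
    "40001",  # serialization_failure (retry the whole transaction)
    "40P01",  # deadlock_detected
    "55P03",  # lock_not_available
    "55006",  # object_in_use
}
RECONNECT_AND_RETRY = {
    "57P01",  # admin_shutdown
    "57P02",  # crash_shutdown
    "57P03",  # cannot_connect_now
    "53300",  # too_many_connections
}

def retry_class(sqlstate):
    if sqlstate in RETRY_SAME_CONNECTION:
        return "reissue"
    if sqlstate in RECONNECT_AND_RETRY:
        return "reconnect"
    return "no-retry"     # e.g. 42601 syntax_error: a programmer error

print(retry_class("40001"))   # reissue
print(retry_class("57P03"))   # reconnect
print(retry_class("42601"))   # no-retry
```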

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2017-03-19 Thread Robert Haas
On Fri, Mar 17, 2017 at 2:30 AM, Amit Kapila  wrote:
>> I was wondering about doing an explicit test: if the XID being
>> committed matches the one in the PGPROC, and nsubxids matches, and the
>> actual list of XIDs matches, then apply the optimization.  That could
>> replace the logic that you've proposed to exclude non-commit cases,
>> gxact cases, etc. and it seems fundamentally safer.  But it might be a
>> more expensive test, too, so I'm not sure.
>
> I think if the number of subxids is very small, let us say under 5 or
> so, then such a check might not matter, but otherwise it could be
> expensive.

We could find out by testing it.  We could also restrict the
optimization to cases with just a few subxids, because if you've got a
large number of subxids this optimization probably isn't buying much
anyway.  We're trying to avoid grabbing CLogControlLock to do a very
small amount of work, but if you've got 10 or 20 subxids we're doing
as much work anyway as the group update optimization is attempting to
put into one batch.

> So we have four ways to proceed:
> 1. Have this optimization for subtransactions and make it safe by
> having some additional conditions like check for recovery, explicit
> check for if the actual transaction ids match with ids stored in proc.
> 2. Have this optimization when there are no subtransactions. In this
> case, we can have a very simple check for this optimization.
> 3. Drop this patch and idea.
> 4. Consider it for next version.
>
> I personally think the second way is okay for this release, as it looks
> safe and gets us the maximum benefit we can achieve by this
> optimization, and then we can consider adding the optimization for
> subtransactions (the first way) in a future version if we think it is
> safe and gives us a benefit.
>
> Thoughts?

I don't like #2 very much.  Restricting it to a relatively small
number of subtransactions - whatever we can show doesn't hurt performance
- seems OK, but restricting it to the exactly-zero-subtransactions
case seems poor.
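Robert's proposed safety test (the XID, subxid count, and subxid list must all match what the PGPROC advertises, perhaps only below a small threshold so the comparison stays cheap) can be sketched like this; the field names are illustrative.

```python
SUBXID_THRESHOLD = 5   # "under 5 or so", per Amit

class Proc:
    """Stand-in for the PGPROC fields relevant to the check."""
    def __init__(self, xid, subxids):
        self.xid = xid
        self.subxids = list(subxids)

def can_use_group_update(proc, commit_xid, commit_subxids):
    commit_subxids = list(commit_subxids)
    if len(commit_subxids) > SUBXID_THRESHOLD:
        return False     # with many subxids, batching buys little anyway
    return (proc.xid == commit_xid
            and proc.subxids == commit_subxids)

proc = Proc(xid=100, subxids=[101, 102])
print(can_use_group_update(proc, 100, [101, 102]))       # True
print(can_use_group_update(proc, 100, [101]))            # False: lists differ
print(can_use_group_update(proc, 100, range(101, 121)))  # False: too many subxids
```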

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)

2017-03-19 Thread Jim Nasby

On 3/19/17 2:32 PM, Tom Lane wrote:
> Jim Nasby  writes:
>> Having thought about it, I share Tom's concerns. And now I'm worried
>> about what happens if there are multiple separate OR clauses. I guess
>> those would be handled by separate UNIONs?
>
> As proposed, the patch would try to optimize by splitting each OR clause
> independently, and would choose whichever way gave the best cost estimate.
> I'm not sure it's possible to do better than that, and even if it is,
> I think improving it could be left for later.


Agreed.


> I'd also considered an approach of de-duping on the basis of all relation
> ctids, while allowing a rel's ctid to be returned as NULL from a UNION arm
> in which the rel was eliminated entirely.  But that doesn't fix it,
> because in this example the first arm would return (a.ctid, NULL) while
> the second arm would return (NULL, b.ctid), so that the UNION step would
> still fail to detect any duplication.  To make it work, we'd have to not
> eliminate the joins, which would pretty much defeat the usefulness of
> the optimization for your original example case.


It might still be worth-while in some circumstances. In your example, if 
there were these indexes:


a__id ON a(id), a__x ON a(x)
b__id ON b(id), b__y ON b(y)

then it might be faster to nested loop a__x=42 to b__id=a.id and union 
that with b__y=43 nested to a__id=b.id.


That said, now isn't the time to be adding more complexity.


> So full joins definitely break this whole optimization.  I think it's okay
> with left joins though, because when starting from "a left join b" it will
> never be possible to remove "a" so we'll always include a.ctid in the
> UNION de-duping step.  If b is removed in some arm, then it must be true
> that we get exactly one left-join output row per a row, regardless of the
> contents of b, in that arm.  The argument for the patch being okay is
> essentially that we must get exactly one left-join output row per a row,
> regardless of the contents of b, in *every* arm, because the various
> modified versions of the OR clause can't affect that conclusion.  In some
> of the arms we might not remove b, and we might even be able to reduce the
> left join to an inner join, but there should still be no more than one
> join output row per a row.  That being the case, it should be sufficient
> to de-dup using a.ctid while ignoring b.ctid.
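Tom's de-duplication argument in miniature: each UNION arm yields at most one row per a-row, so de-duping on a's ctid alone suffices, even in arms where b was removed and b.ctid would be NULL. Rows are modeled as (a_ctid, payload) tuples.

```python
# Arm 1: a.x = 42, with the join to b performed (or b removed entirely).
arm1 = [("a1", "x-match")]
# Arm 2: b.y = 43; b not removable, but still one output row per a-row.
arm2 = [("a1", "y-match"), ("a2", "y-match")]

seen, result = set(), []
for a_ctid, payload in arm1 + arm2:
    if a_ctid not in seen:        # de-dup on a.ctid only, ignoring b.ctid
        seen.add(a_ctid)
        result.append((a_ctid, payload))

print(result)   # [('a1', 'x-match'), ('a2', 'y-match')]: the a1 duplicate is caught
```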


The only case I can think of is: would it be possible (perhaps not 
today; maybe in the future) for other parts of the query to affect join 
elimination? I can't conceive of how that might happen, but if it did 
then it's possible that the elimination would work differently with the 
UNION than it would with an OR.


The comment on join_is_removable() does mention that there are other
potentially interesting cases that we can't handle now; it's maybe worth
mentioning this there.


Other than that, I can't see any issues with your logic.


> Any clearer yet?


Absolutely. I think it would be very valuable to include that with the 
initial comment in planunionor.c. Join reduction and removal is already 
tricky enough to wrap your head around.


Other comments:


> +  * is retty mechanical, but we can't do it until we have a RelOptInfo for the


s/retty/pretty/


I suspect that in many systems single-table queries are far more common 
than CTEs, so maybe it's worth reversing those two tests in 
split_join_or_clauses().



For the Unique path, it would be nice if the code did what would be 
necessary to consider a TID hash join, since that means a user could 
create the appropriate operator and it would just be picked up. 
Certainly not worth much effort at this point though.



> +   /*
> +* Must not have any volatile functions in FROM or WHERE (see notes at
> +* head of file).
> +*/
> +   if (contain_volatile_functions((Node *) parse->jointree))


Is there by chance anywhere else that needs to check that? Maybe worth 
adding the info to the Query struct if so.



+* We insist that all baserels used in the query be plain relations, so


Dumb question... views have already been decomposed at this point, right?

Perhaps less dumb question... what happens if the original query already 
had setOps? AIUI setOps work has already been done by the time 
split_join_or_clauses() happens; I just want to check that that's OK.


I'm not sure a GUC is worth it... I suspect that any query with multiple 
rels and an OR condition is going to be expensive enough that whatever 
additional plan time there is won't be noticeable.


I've verified that the patch still applies and make check-world is clean.
--
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-19 Thread Robert Haas
On Fri, Mar 17, 2017 at 8:10 PM, Robert Haas  wrote:
> While I was studying what you did with reparameterize_path_by_child(),
> I started to wonder whether reparameterize_path() doesn't need to
> start handling join paths.  I think it only handles scan paths right
> now because that's the only thing that can appear under an appendrel
> created by inheritance expansion, but you're changing that.  Maybe
> it's not critical -- I think the worst consequences of missing some
> handling there is that we won't consider a parameterized path in some
> case where it would be advantageous to do so.  Still, you might want
> to investigate a bit.

I spent a fair amount of time this weekend musing over
reparameterize_path_by_child().  I think a key question for this patch
- as you already pointed out - is whether we're happy with that
approach.  When we discover that we want to perform a partitionwise
parameterized nestloop, and therefore that we need the paths for each
inner appendrel to get their input values from the corresponding outer
appendrel members rather than from the outer parent, we've got two
choices.  The first is to do what the patch actually does, which is to
build a new path tree for the nestloop inner path parameterized by the
appropriate childrel.  The second is to use the existing paths, which
are parameterized by the parent rel, and then somehow make that
work.  For example, you can imagine that create_plan_recurse() could
pass down a list of parameterized nestloops above the current point in
the path tree, and a parent-child mapping for each, and then we could
try to substitute everything while actually generating the plan
instead of creating paths sooner.  Which is better?

It would be nice to hear opinions from anyone else who cares, but
after some thought I think the approach you've picked is probably
better, because it's more like what we do already.  We have existing
precedent for reparameterizing a path, but none for allowing a Var for
one relation (the parent) to in effect refer to another relation (the
child).

That having been said, having try_nestloop_path() perform the
reparameterization at the very top of the function seems quite
undesirable.  You're creating a new path there before you know whether
it's going to be rejected by the invalid-parameterization test and
also before you know whether initial_cost_nestloop is going to reject
it.  It would be much better if you could find a way to postpone the
reparameterization until after those steps, and only do it if you're
going to try add_path().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-19 Thread Robert Haas
On Sun, Mar 19, 2017 at 12:15 AM, Rafia Sabih
 wrote:
> I was trying to play around with this patch and came across a
> case where, without the patch, the query completes in 9 secs and with it in
> 15 secs.  Theoretically, I tried to capture the case where each
> partition produces a good number of rows and each has to
> build its own hash; in that case the cost of building so many hashes
> comes out higher than doing an append and then the join.  Thought
> it might be helpful to consider this case in better designing of the
> algorithm. Please feel free to point out if I missed something.

In the non-partitionwise plan, the query planner correctly chooses to
hash the smaller table (prt2) and probe from the large table (prt).  In
the partition-wise plan, it generally does the opposite.  There is a
mix of merge joins and hash joins, but of the 15 children that picked
hash joins, 14 of them hashed the larger partition (in each case,
from prt) and probed from the smaller one (in each case, from prt2),
which seems like an odd strategy.  So I think the problem is not that
building lots of hash tables is slower than building just one, but
rather that for some reason it's choosing the wrong table to hash.
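To illustrate the intuition with toy arithmetic (this is NOT PostgreSQL's actual cost model; the constants below are invented for the sketch):

```python
# Building the hash table costs roughly one insert per inner row, while
# probing costs roughly one lookup per outer row, so the smaller input
# should normally be the build side.

def toy_hashjoin_cost(build_rows, probe_rows,
                      insert_cost=2.0, probe_cost=1.0):
    return build_rows * insert_cost + probe_rows * probe_cost

big, small = 1_000_000, 10_000
hash_small = toy_hashjoin_cost(small, big)  # hash prt2, probe prt
hash_big = toy_hashjoin_cost(big, small)    # hash prt, probe prt2

print(hash_small < hash_big)  # True: hashing the smaller table wins
```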

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Determine if an error is transient by its error code.

2017-03-19 Thread Dominick O'Dierno
Hello folks,

I'm trying to define a transient fault detection strategy for a client
application when calling a postgres database.

Essentially I want to determine from the error code whether it is worth
retrying the call (transient) or whether the error was due to a bad query or
programmer error, in which case don't retry.

Going through the codes as posted here
https://www.postgresql.org/docs/9.6/static/errcodes-appendix.html I had a
go at making a list of error codes which may be transient:

53000: insufficient_resources
53100: disk_full
53200: out_of_memory
53300: too_many_connections
53400: configuration_limit_exceeded
57000: operator_intervention
57014: query_canceled
57P01: admin_shutdown
57P02: crash_shutdown
57P03: cannot_connect_now
57P04: database_dropped
58000: system_error
58030: io_error

I am not sure whether these next few should be treated as transient or
not, but I am guessing so:

55P03: lock_not_available
55006: object_in_use
55000: object_not_in_prerequisite_state
08000: connection_exception
08003: connection_does_not_exist
08006: connection_failure
08001: sqlclient_unable_to_establish_sqlconnection
08004: sqlserver_rejected_establishment_of_sqlconnection
08007: transaction_resolution_unknown

Are there any codes listed above where retrying would actually not be
helpful?
Are there any codes that I did not include that I should have?
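For concreteness, here is a minimal client-side sketch of that classification (plain Python; the grouping simply mirrors the two lists above and is a proposal being asked about, not an official recommendation):

```python
# SQLSTATEs the message proposes treating as transient, split into the
# "fairly sure" set and the "maybe" set whose retry-worthiness is the
# open question posed above.

TRANSIENT = {
    '53000', '53100', '53200', '53300', '53400',           # class 53: resources
    '57000', '57014', '57P01', '57P02', '57P03', '57P04',  # class 57: operator
    '58000', '58030',                                      # class 58: system/io
}
MAYBE_TRANSIENT = {
    '55P03', '55006', '55000',                             # class 55: object state
    '08000', '08003', '08006', '08001', '08004', '08007',  # class 08: connection
}

def should_retry(sqlstate, include_maybe=True):
    if sqlstate in TRANSIENT:
        return True
    return include_maybe and sqlstate in MAYBE_TRANSIENT

print(should_retry('53300'))  # too_many_connections -> True
print(should_retry('42601'))  # syntax_error -> False
```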

Thanks,

-Dominick


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-19 Thread Alvaro Herrera
Tom Lane wrote:

> I'm not entirely convinced that function-per-command is an improvement
> though.  Seems like it would only help to the extent that you could do a
> simple "return" to implement early exit, and it looks to me like that
> doesn't work in a lot of places because you still have to clean up things
> like malloc'd argument strings before you can return.  So the question
> we have to answer is whether this way looks cleaner than what we'd get if
> we just changed the logic in-place.  For the purpose of answering that
> question, looking at the final state is the right thing to do.
> 
> I don't have a definite opinion on that core question yet, since I've not
> read this version of the patch.  Anybody else want to give an opinion?

Currently, exec_command is a 1500-line function.  If I had to see how a
single \-command worked, I would have to fold everything but the command
I'm interested in, in case there's something nontrivial at function
start or end (or even in between -- I would have to start by figuring
out whether there's anything other than "else if" somewhere in those
1500 lines).  I think splitting into command-specific functions makes
this much easier to follow, particularly if we want to add extra tricks
such as returning early etc.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-19 Thread Vaishnavi Prabakaran
On Fri, Mar 17, 2017 at 12:37 AM, Daniel Verite 
wrote:

> Vaishnavi Prabakaran wrote:
>
> > So, attached the alternative fix for this issue.
> > Please share me your thoughts.
>
> I assume you prefer the alternative fix because it's simpler.
>

I would like to add one more reason for this fix: I think "PQsetSingleRowMode"
should be called only when the result is ready to be processed, and before
starting to consume the result, as it is currently documented:
"To enter single-row mode, call PQsetSingleRowMode immediately after a
successful call of PQsendQuery (or a sibling function). This mode selection
is effective only for the currently executing query. Then call PQgetResult
repeatedly..."


I agree that the first fix (you shared) will allow the user to set single-row
mode after PQsendQuery, but it also allows the user to set this mode at any
point in batch processing (not necessarily "immediately after PQsendQuery"),
and "mode selection is effective only for the currently executing query" would
then be false. Note that I don't see any problem with this deviation; I just
want to point out that the documentation here needs an update/note either way.


Before going further, I would like to mention that I have modified the
documentation of batch processing (in the v6 code patch) as below:
"To enter single-row mode, call PQsetSingleRowMode immediately after a
successful call of PQbatchQueueProcess. This mode selection is effective
only for the currently executing query. For more information on the
use of PQsetSingleRowMode
, refer to Section 33.6, “Retrieving Query Results Row-By-Row”. "

Please let me know if you think this is not enough and section 33.6 should
be updated as well.



>
> > I would also like to hear Craig's opinion on it before applying this fix
> > to the original patch, just to make sure am not missing anything here.
>
> +1
>
> The main question is whether the predicates enforced
> by PQsetSingleRowMode() apply in batch mode in all cases
> when it's legit to call that function. Two predicates
> that may be problematic are:
> if (conn->asyncStatus != PGASYNC_BUSY)
> return 0;
> and
> if (conn->result)
> return 0;
>
> The general case with batch mode is that, from the doc:
> "The client interleaves result processing with sending batch queries"
>

While sending batch queries in the middle of result processing, the query is
only appended to the list of queries maintained for batch processing, and no
attribute of the current connection affecting result processing is changed.
So, calling PQsetSingleRowMode in the middle of result processing will fail,
as it tries to set single-row mode for the currently executing query, for
which result processing has already started.


> Note that I've not even tested that here, I've tested
> batching a bunch of queries in a first step and getting the results
> in a second step.
> I am not confident that the above predicates will be true
> in all cases.

> Also your alternative fix assumes that we add
> a user-visible exception to PQsetSingleRowMode in batch mode,
> whereby it must not be called as currently documented:
>   "call PQsetSingleRowMode immediately after a successful call of
>PQsendQuery (or a sibling function)"

> My gut feeling is that it's not the right direction, I prefer making
> the single-row a per-query attribute internally and keep
> PQsetSingleRowMode's contract unchanged from the
> user's perspective.
>
>
I am going to include the test which you shared in the test patch. Please let
me know if you want to cover any more specific cases to gain confidence.

Thanks & Regards,
Vaishnavi,
Fujitsu Australia.


Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2017-03-19 Thread Peter Geoghegan
On Sun, Mar 12, 2017 at 3:05 PM, Peter Geoghegan  wrote:
> I attach my V9 of the patch. I came up some stuff for the design of
> resource management that I think meets every design goal that we have
> for shared/unified BufFiles:

Commit 2609e91fc broke the parallel CREATE INDEX cost model. I should
now pass -1 as the index block argument to compute_parallel_worker(),
just as all callers that aren't parallel index scan do after that
commit. This issue caused V9 to never choose parallel CREATE INDEX
within nbtsort.c. There was also a small amount of bitrot.

Attached V10 fixes this regression. I also couldn't resist adding a
few new assertions that I thought were worth having to buffile.c, plus
dedicated wait events for parallel tuplesort. And, I fixed a silly bug
added in V9 around where worker_wait() should occur.

-- 
Peter Geoghegan


0001-Add-parallel-B-tree-index-build-sorting.patch.gz
Description: GNU Zip compressed data


0002-Add-temporary-testing-tools.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Peter Eisentraut
On 3/18/17 16:12, Magnus Hagander wrote:
> But given that we are in the process of breaking a lot of other scripts
> for 10,

I don't think the xlog renaming really affects this area.

> perhaps we should rename it to pg_createuser?

I'm not keen on doing that now, but if we were to do it at some point, I
would do it for all programs not already starting with "p" or containing
"pg".

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] createlang/droplang deprecated

2017-03-19 Thread Peter Eisentraut
On 3/18/17 09:00, Peter Eisentraut wrote:
> I just noticed that createlang and droplang have been listed as
> deprecated since PG 9.1.
> 
> Do we dare remove them?

Patch

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 3ae8367b933d02246a305432a411236d309a2060 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sat, 18 Mar 2017 23:34:52 -0400
Subject: [PATCH] Remove createlang and droplang

They have been deprecated since PostgreSQL 9.1.
---
 doc/src/sgml/installation.sgml|  12 +-
 doc/src/sgml/plperl.sgml  |   3 +-
 doc/src/sgml/plpython.sgml|   3 +-
 doc/src/sgml/pltcl.sgml   |   7 +-
 doc/src/sgml/ref/allfiles.sgml|   2 -
 doc/src/sgml/ref/create_function.sgml |   1 -
 doc/src/sgml/ref/create_language.sgml |  15 +-
 doc/src/sgml/ref/createlang.sgml  | 291 --
 doc/src/sgml/ref/drop_language.sgml   |   1 -
 doc/src/sgml/ref/droplang.sgml| 288 -
 doc/src/sgml/reference.sgml   |   2 -
 doc/src/sgml/release-9.1.sgml |   4 +-
 doc/src/sgml/xplang.sgml  |   9 +-
 src/bin/scripts/.gitignore|   2 -
 src/bin/scripts/Makefile  |   6 +-
 src/bin/scripts/createlang.c  | 251 -
 src/bin/scripts/droplang.c| 250 -
 src/bin/scripts/nls.mk|   4 +-
 src/bin/scripts/t/030_createlang.pl   |  25 ---
 src/bin/scripts/t/060_droplang.pl |  23 ---
 src/tools/msvc/Install.pm |   4 +-
 21 files changed, 21 insertions(+), 1182 deletions(-)
 delete mode 100644 doc/src/sgml/ref/createlang.sgml
 delete mode 100644 doc/src/sgml/ref/droplang.sgml
 delete mode 100644 src/bin/scripts/createlang.c
 delete mode 100644 src/bin/scripts/droplang.c
 delete mode 100644 src/bin/scripts/t/030_createlang.pl
 delete mode 100644 src/bin/scripts/t/060_droplang.pl

diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 79201b78e3..f8a222e637 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -2256,17 +2256,17 @@ Memory Management
  memory management.  You can have a server with many multiples of
  gigabytes of RAM free, but still get out of memory or address
  space errors when running applications.  One example
- is createlang failing with unusual errors.
+ is loading of extensions failing with unusual errors.
  For example, running as the owner of the PostgreSQL installation:
 
--bash-3.00$ createlang plperl template1
-createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": A memory address is not in the address space for the process.
+=# CREATE EXTENSION plperl;
+ERROR:  could not load library "/opt/dbs/pgsql/lib/plperl.so": A memory address is not in the address space for the process.
 
 Running as a non-owner in the group possessing the PostgreSQL
 installation:
 
--bash-3.00$ createlang plperl template1
-createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": Bad address
+=# CREATE EXTENSION plperl;
+ERROR:  could not load library "/opt/dbs/pgsql/lib/plperl.so": Bad address
 
  Another example is out of memory errors in the PostgreSQL server
  logs, with every memory allocation near or greater than 256 MB
@@ -2284,7 +2284,7 @@ Memory Management
 
 
 
- In the case of the createlang example, above,
+ In the case of the plperl example, above,
  check your umask and the permissions of the binaries in your
  PostgreSQL installation.  The binaries involved in that example
  were 32-bit and installed as mode 750 instead of 755.  Due to the
diff --git a/doc/src/sgml/plperl.sgml b/doc/src/sgml/plperl.sgml
index 9117769125..dd2ffbc6ce 100644
--- a/doc/src/sgml/plperl.sgml
+++ b/doc/src/sgml/plperl.sgml
@@ -27,8 +27,7 @@ PL/Perl - Perl Procedural Language
 
   
To install PL/Perl in a particular database, use
-   CREATE EXTENSION plperl, or from the shell command line use
-   createlang plperl dbname.
+   CREATE EXTENSION plperl.
   
 
   
diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
index 46397781be..fb5d336efc 100644
--- a/doc/src/sgml/plpython.sgml
+++ b/doc/src/sgml/plpython.sgml
@@ -14,8 +14,7 @@ PL/Python - Python Procedural Language
 
  
   To install PL/Python in a particular database, use
-  CREATE EXTENSION plpythonu, or from the shell command line use
-  createlang plpythonu dbname (but
+  CREATE EXTENSION plpythonu (but
   see also ).
  
 
diff --git a/doc/src/sgml/pltcl.sgml b/doc/src/sgml/pltcl.sgml
index ed745a7481..ba4af2aec5 100644
--- a/doc/src/sgml/pltcl.sgml
+++ b/doc/src/sgml/pltcl.sgml
@@ -66,10 +66,9 @@ Overview
 directory if Tcl support is specified in the configuration step of
 

[HACKERS] Re: [COMMITTERS] pgsql: Add TAP tests for password-based authentication methods.

2017-03-19 Thread Peter Eisentraut
On 3/17/17 05:37, Heikki Linnakangas wrote:
> Add TAP tests for password-based authentication methods.
> 
> Tests all combinations of users with MD5, plaintext and SCRAM verifiers
> stored in pg_authid, with plain 'password', 'md5' and 'scram'
> authentication methods.

This is missing an entry for tmp_check/ in .gitignore.  But maybe we can
do that globally instead of repeating it in every directory?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] ICU integration

2017-03-19 Thread Andreas Karlsson

On 03/15/2017 05:33 PM, Peter Eisentraut wrote:

Updated patch attached.


Ok, I applied the patch again and ran the tests. I get a test failure on 
make check-world in the pg_dump tests but it can be fixed with the below.


diff --git a/src/bin/pg_dump/t/002_pg_dump.pl 
b/src/bin/pg_dump/t/002_pg_dump.pl

index 3cac4a9ae0..7d9c90363b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -2422,7 +2422,7 @@ qr/^\QINSERT INTO test_fifth_table (col1, col2, 
col3, col4, col5) VALUES (NULL,

  'CREATE COLLATION test0 FROM "C";',
regexp =>
  qr/^
- \QCREATE COLLATION test0 (lc_collate = 'C', lc_ctype = 'C');\E/xm,
+ \QCREATE COLLATION test0 (provider = 'libc', locale = 'C');\E/xm,
like => {
binary_upgrade   => 1,
clean=> 1,


- I do not like how it is not obvious which is the default version of
every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and not
"sv_standard%icu" as one might expect. Is this inherent to ICU or
something we can work around?


We get these keywords from ucol_getKeywordValuesForLocale(), which says
"Given a key and a locale, returns an array of string values in a
preferred order that would make a difference."  So all those are
supposedly different from each other.


I believe you are mistaken. The locale "sv" is just an alias for 
"sv-u-standard" as far as I can tell. See the definition of the Swedish 
locale 
(http://source.icu-project.org/repos/icu/trunk/icu4c/source/data/coll/sv.txt) 
and there are just three collations: reformed (default), standard, 
search (plus eot and emoji which are inherited).


I am also quite annoyed at col-emoji and col-eor (European Ordering 
Rules). They are defined at the root and inherited by all languages, but 
no language modifies col-emoji for its own needs, which makes it a 
footgun. See the Danish sorting example below, where at least I expected the 
same order. For col-eor it makes a bit more sense, since I would expect 
the locales en_GB-u-col-eor and sv_SE-u-col-eor to collate exactly the same.


# SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE 
"da-x-icu";

 x

 a
 k
 aa
(3 rows)

# SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE 
"da-u-co-emoji-x-icu";

 x

 a
 aa
 k
(3 rows)
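As a toy illustration of the Danish behavior above (pure Python, not ICU; the key function is a deliberately crude stand-in for a real collation tailoring):

```python
# Under Danish rules the digraph "aa" collates like "å", i.e. after "k"
# (and the rest of the basic alphabet), whereas a naive codepoint sort
# puts "aa" right after "a".

def danish_toy_key(s):
    # Map "aa" to a sort weight past 'z'; everything else keeps its
    # codepoint.  A real tailoring is far more involved than this.
    out = []
    i = 0
    while i < len(s):
        if s[i:i+2] == 'aa':
            out.append(ord('z') + 1)
            i += 2
        else:
            out.append(ord(s[i]))
            i += 1
    return out

words = ['a', 'k', 'aa']
print(sorted(words))                      # ['a', 'aa', 'k'] (codepoint order)
print(sorted(words, key=danish_toy_key))  # ['a', 'k', 'aa'] (Danish-like)
```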

It seems ICU has made quite the mess here, and I am not sure to what 
degree we need to handle it to avoid our users getting confused. I need 
to think some of it, and would love input from others. Maybe the right 
thing to do is to ignore the issue with col-emoji, but I would want to 
do something about the default collations.



- ICU talks about a new syntax for locale extensions (old:
de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page
http://userguide.icu-project.org/collation/api. Is this something we
need to care about? It looks like we currently use the old format, and
while I personally prefer it I am not sure we should rely on an old syntax.


Interesting.  I hadn't see this before, and the documentation is sparse.
 But it seems that this refers to BCP 47 language tags, which seem like
a good idea.

So what I have done is change all the predefined ICU collations to be
named after the BCP 47 scheme, with a "private use" -x-icu suffix
(instead of %icu).  The preserves the original idea but uses a standard
naming scheme.

I'm not terribly worried that they are going to remove the "old" locale
naming, but just to be forward looking, I have changed it so that the
collcollate entries are made using the "new" naming for ICU >=54.


Sounds good.


- I get an error when creating a new locale.

#CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR:  could not create locale "sv": Success

# CREATE COLLATION sv2 (LOCALE = 'sv');
ERROR:  could not create locale "sv": Resource temporarily unavailable
Time: 1.109 ms


Hmm, that's pretty straightforward code.  What is your operating system?
 What are the build options?  Does it work without this patch?


This issue is unrelated to ICU. I had forgotten to specify the provider, so 
the errors are correct (even though the value of errno is weird).



- For the collprovider is it really correct to say that 'd' is the
default value as it does in catalogs.sgml?


It doesn't say it's the default value, it says it uses the database
default.  This is all a bit confusing.  We have a collation object named
"default", which uses the locale set for the database.  That's been that
way for a while.  Now when introducing the collation providers, that
"default" collation gets its own collprovider category 'd'.  That is not
really used anywhere, but we have to put something there.


Ah, I see now. Hm, that is a bit awkward but I cannot think of a cleaner 
alternative.



- I do not know what the policy for formatting the documentation is, but
some of the paragraphs are in need of re-wrapping.


Hmm, I don't see anything terribly bad.


Maybe it is just me 

Re: [HACKERS] [COMMITTERS] pgsql: Fix and simplify check for whether we're running as Windows serv

2017-03-19 Thread Tom Lane
Heikki Linnakangas  writes:
> On 03/17/2017 07:57 PM, Tom Lane wrote:
>> This seems to have broken narwhal:

> It's not very nice to change the requirements in a minor version, but I 
> don't think this would be a real problem for anyone. Not many people 
> build PostgreSQL using MinGW, let alone with an ancient version of it. 
> But if people don't agree, we could instead revert this patch and apply 
> the smaller V2 patch [2] instead, in the back-branches.

> Thoughts? Any objections to requiring a newer version of MinGW? Any 
> objections to do so in the next minor release?

Hm.  I'm +1 for doing that in HEAD, but less so for the back branches.

Can we get some fix on when the functions in question were added
to MinGW?  If we knew how new a toolchain we'd be requiring here,
that would help make this decision.

regards, tom lane




[HACKERS] Re: [COMMITTERS] pgsql: Fix and simplify check for whether we're running as Windows serv

2017-03-19 Thread Heikki Linnakangas

On 03/17/2017 07:57 PM, Tom Lane wrote:

Heikki Linnakangas  writes:

Fix and simplify check for whether we're running as Windows service.


This seems to have broken narwhal:

Creating library file: libpostgres.a
../../src/port/libpgport_srv.a(win32security_srv.o)(.text+0x109): In function 
`pgwin32_is_admin':
C:/msys/1.0/local/pgbuildfarm/buildroot/HEAD/pgsql.build/src/port/win32security.c:77:
 undefined reference to `CheckTokenMembership'
../../src/port/libpgport_srv.a(win32security_srv.o)(.text+0x127):C:/msys/1.0/local/pgbuildfarm/buildroot/HEAD/pgsql.build/src/port/win32security.c:77:
 undefined reference to `CheckTokenMembership'
../../src/port/libpgport_srv.a(win32security_srv.o)(.text+0x264): In function 
`pgwin32_is_service':
C:/msys/1.0/local/pgbuildfarm/buildroot/HEAD/pgsql.build/src/port/win32security.c:138:
 undefined reference to `CheckTokenMembership'
../../src/port/libpgport_srv.a(win32security_srv.o)(.text+0x302):C:/msys/1.0/local/pgbuildfarm/buildroot/HEAD/pgsql.build/src/port/win32security.c:163:
 undefined reference to `CheckTokenMembership'
collect2: ld returned 1 exit status
make[2]: *** [postgres] Error 1


Hmm. The other MinGW animals are not having that problem.

According to the MSDN docs [1], CheckTokenMembership() is defined in the 
Advapi32 library. In pg_ctl.c and in 
src/include/common/restricted_token.c, we currently do this, for another 
function that's in Advapi32:



/*
 * Mingw headers are incomplete, and so are the libraries. So we have to load
 * a whole lot of API functions dynamically. Since we have to do this anyway,
 * also load the couple of functions that *do* exist in mingw headers but not
 * on NT4. That way, we don't break on NT4.
 */
typedef BOOL (WINAPI * __CreateRestrictedToken) (HANDLE, DWORD, DWORD, 
PSID_AND_ATTRIBUTES, DWORD, PLUID_AND_ATTRIBUTES, DWORD, PSID_AND_ATTRIBUTES, 
PHANDLE);
typedef BOOL (WINAPI * __IsProcessInJob) (HANDLE, HANDLE, PBOOL);
typedef HANDLE (WINAPI * __CreateJobObject) (LPSECURITY_ATTRIBUTES, LPCTSTR);
typedef BOOL (WINAPI * __SetInformationJobObject) (HANDLE, JOBOBJECTINFOCLASS, 
LPVOID, DWORD);
typedef BOOL (WINAPI * __AssignProcessToJobObject) (HANDLE, HANDLE);
typedef BOOL (WINAPI * __QueryInformationJobObject) (HANDLE, 
JOBOBJECTINFOCLASS, LPVOID, DWORD, LPDWORD);


CheckTokenMembership() is probably one of those missing functions, like 
CreateRestrictedToken.


I'd like to bump our minimum requirements, and stop supporting old MinGW 
versions that don't have those headers. I'm not sure what version they 
were added in, but frogmouth doesn't have this problem, and it uses gcc 
4.5.0. Once we do that, we can simplify pg_ctl.c and restricted_token.c 
to not open advapi32.dll dynamically, and call those functions the usual 
way.


It's not very nice to change the requirements in a minor version, but I 
don't think this would be a real problem for anyone. Not many people 
build PostgreSQL using MinGW, let alone with an ancient version of it. 
But if people don't agree, we could instead revert this patch and apply 
the smaller V2 patch [2] instead, in the back-branches.


Thoughts? Any objections to requiring a newer version of MinGW? Any 
objections to do so in the next minor release?


[1] 
https://msdn.microsoft.com/en-us/library/windows/desktop/aa376389%28v=vs.85%29.aspx


[2] 
https://www.postgresql.org/message-id/CAB7nPqSvfu%3DKpJ%3DNX%2BYAHmgAmQdzA7N5h31BjzXeMgczhGCC%2BQ%40mail.gmail.com


- Heikki





Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-19 Thread Corey Huinker
On Sun, Mar 19, 2017 at 1:18 PM, Fabien COELHO  wrote:

>
> Hello Corey,
>
> v24 highlights:
>>
>
> The v24 patch is twice as large as the previous submission. Sigh.
>
> If I'm reading headers correctly, it seems that it adds an
> "expected/psql-on-error-stop.out" file without a corresponding test
> source in "sql/". Is this file to be simply ignored, or is a source missing?


Ignore it. I created the new .sql/.out pair, realized that the file naming
convention was underscores not dashes, changed them and evidently forgot
that I had already added a dashed one to git.


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-19 Thread Corey Huinker
On Sun, Mar 19, 2017 at 4:23 PM, Fabien COELHO  wrote:

>
> Hello Tom,
>
> I'm not entirely convinced that function-per-command is an improvement
>> though. [...]
>>
>
> I don't have a definite opinion on that core question yet, since I've not
>> read this version of the patch.  Anybody else want to give an opinion?
>>
>
> My 0.02€:
>
> I've already provided my view...
>
> Personally I like good functions. Maybe a per-command-family set of
> functions could improve the code readability, but (1) I'm not sure this is
> achieved by this patch (eg the if-related state management is now
> dispatched in 4 functions) and (2) I'm not sure that this approach helps
> much with respect to trying to factor out backslash-command-related
> active-or-not argument management.
>
> However I have not looked at the patch in detail. I'm planing to do so
> later this week.


I offered to split the patch into two steps (1. break each "family" into
its own function and 2. do what's needed for \if-\endif) but got no
response. I can still do that if people think it's worthwhile.


Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Joshua D. Drake

On 03/19/2017 01:01 PM, Tom Lane wrote:

Andreas Karlsson  writes:

As for if we should have backwards compatibility for the old names I am
leaning weakly toward providing it in the case of createuser. I can see end
users being pissed off that the createuser command is suddenly gone
without any warning when they upgrade. On the flip side I have no idea
how much work it would be to maintain those legacy names.


It seems reasonably trivial to me as far as the code goes --- just
create a symlink during installation.  (On Windows I suppose we'd have
to make a second physical copy, but these files are not so large that
that seems unacceptable.)


Windows supports symlinks:

https://msdn.microsoft.com/en-us/library/windows/desktop/aa365680(v=vs.85).aspx

JD


--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Tom Lane
Andres Freund  writes:
> I wish 'pg' wasn't an already used binary name.  It'd be much nicer if
> we had a '/usr/bin/pg' wrapper binary in the git style, that we could easily
> expand over time, without hitting new conflicts.  I'd even consider a
> '/usr/bin/pgsql' that has commands for all our binaries a considerable
> improvement in the long term.

Meh.  I'm not seeing where "pg command args" is that much superior to
"pg_command args".  Yeah, if we'd done it like that from the beginning
it'd be fine; but people are used to the latter style and I don't feel
a need to make them change over.

regards, tom lane




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-19 Thread Fabien COELHO


Hello Tom,


I'm not entirely convinced that function-per-command is an improvement
though. [...]



I don't have a definite opinion on that core question yet, since I've not
read this version of the patch.  Anybody else want to give an opinion?


My 0.02€:

I've already provided my view...

Personally I like good functions. Maybe a per-command-family set of 
functions could improve the code readability, but (1) I'm not sure this is 
achieved by this patch (eg the if-related state management is now 
dispatched in 4 functions) and (2) I'm not sure that this approach helps 
much with respect to trying to factor out backslash-command-related 
active-or-not argument management.


However I have not looked at the patch in detail. I'm planning to do so 
later this week.


--
Fabien.


Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Tom Lane
Stephen Frost  writes:
> If we take your approach to its logical conclusion then we should be
> planning to maintain all user-facing deprecated features for as long as
> there is a version where it exists in a non-deprecated fashion, in other
> words for 5 years, if we assume that script authors only care about
> supporting as far back as we do, which I'm not entirely sure is a great
> assumption to begin with, but I'd say it's at least a minimum.

Well, that is definitely a straw man (as you admit later).  I don't want
to go there either.  Where to draw the line has to be a case-by-case
decision, though.  In this case the amount of effort involved in providing
backwards compatibility is pretty minimal, and the project has failed to
provide any warning to users that this is something we might choose to
break in future, so it seems to me that we ought to provide compatibility
for awhile.  If somebody bitches that we broke their script in v15, we'll
be in a much better position if we can point to five years' worth of
deprecation notices than if we just have to say "tough, we felt like
breaking your script so we did it".

> I'll admit that part of that is likely because I don't think I have
> *ever* used createdb or createlang or createuser (excepting whatever
> regression tests run them),

Well, personally I use createdb pretty often.  I basically never use
the other two, but I don't fool myself that I'm a representative user.
createuser in particular seems like it has pretty high value-added
if you create new roles often and need to give them passwords.  Yeah,
you can do it in psql, but it's not as convenient, especially if you
want to do proper quoting.

regards, tom lane




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Andres Freund
On 2017-03-18 16:23:17 -0400, Tom Lane wrote:
> Magnus Hagander  writes:
> > createuser, dropuser - definitely pollutes the namespace, people do
> > sometimes try them for the wrong thing. Unlike the db ones they do add
> > value though -- I don't think we have a psql way of in a single command
> > doing what --pwprompt on createuser does, do we? But given that we are in
> > the process of breaking a lot of other scripts for 10, perhaps we should
> > rename it to pg_createuser?
> 
> I'm not particularly on board with arguments like "we already broke a ton
> of stuff so let's break some more".  Eventually you've managed to create
> a daunting barrier to upgrading at all.

+1


> I think a more reasonable way to proceed is to install symlinks
> pg_createuser -> createuser (or the other direction), mark the older names
> as deprecated, and announce that we'll remove the old names a few releases
> from now.  That gives people time to adjust.

+1


> Maybe we should handle createdb likewise, rather than just kicking it to
> the curb.  I know I use it quite often; it's less typing than psql -c
> 'create database ...' postgres, and still would be with a pg_ prefix.

I think we should add pg_createdb, and do a normal deprecation cycle for
removing createdb.


I wish 'pg' wasn't an already used binary name.  It'd be much nicer if
we had a '/usr/bin/pg' wrapper binary in the git style, that we could easily
expand over time, without hitting new conflicts.  I'd even consider a
'/usr/bin/pgsql' that has commands for all our binaries a considerable
improvement in the long term.


Greetings,

Andres Freund




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Tom Lane
Andreas Karlsson  writes:
> As for if we should have backwards compatibility for the old names I am 
> leaning weakly for providing it in the case of createuser. I can see end 
> users being pissed off that the createuser command is suddenly gone 
> without any warning when they upgrade. On the flip side I have no idea 
> how much work it would be to maintain those legacy names.

It seems reasonably trivial to me as far as the code goes --- just
create a symlink during installation.  (On Windows I suppose we'd have
to make a second physical copy, but these files are not so large that
that seems unacceptable.)  It might be more work on the documentation
side, depending on whether you wanted two copies of the man page or
not.  I see that postmaster's man page pretty much just points to
postgres, so it seems like we could do that for these scripts too.

regards, tom lane




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Stephen Frost
Greetings,

* Andreas Karlsson (andr...@proxel.se) wrote:
> On 03/19/2017 07:35 PM, Stephen Frost wrote:
> >* Tom Lane (t...@sss.pgh.pa.us) wrote:
> >>Stephen Frost  writes:
> >>(Or in other words, we've been getting along fine with these script names
> >>for circa twenty years, so what's the rush to change them RIGHT NOW?)
> >
> >To be clear, I'm not in any particular rush to change them 'RIGHT NOW'.
> >I tend to agree with Magnus that we're doing a lot of other things in
> >PG10 and that makes it a bit of a natural point, but I don't hold that
> >position terribly strongly.  On the other hand, I do not relish the idea
> >of providing backwards-compatibility for every user-facing change we do
> >for 5 years and that's where I feel this approach is encouraging us to
> >go.
> 
> I think that argument is only applicable where the changes are
> closely related, e.g. renaming pg_clog, pg_xlog and pg_log in the
> same release. I do not see any strong connection between createuser
> and pg_xlog.

Given all that we're doing, it strikes me as pretty likely that people
will realize that PG10 does more and will therefore pay more attention,
in general, to what we tell them in the release notes about changes.

> As for if we should have backwards compatibility for the old names I
> am leaning weakly for providing it in the case of createuser. I can
> see end users being pissed off that the createuser command is
> suddenly gone without any warning when they upgrade. On the flip
> side I have no idea how much work it would be to maintain those
> legacy names.

This particular case might not be that much of a maintenance burden, but
to your example above, if they're going to be annoyed about it going
missing in PG10, it seems likely that they're going to be annoyed when
the symlink goes away in PG15 (or whatever) too.

Either way, we'll obviously document what we're doing in the release
notes, so the whole "without any warning" doesn't really fly, in my
view, either.

Thanks!

Stephen




Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)

2017-03-19 Thread Tom Lane
I wrote:
> Consider
>   SELECT count(*)
>   FROM a FULL JOIN b ON (a.id = b.id)
>   WHERE (a.x = 42 OR b.y = 43);
> and suppose that a and b have mutual FK constraints guaranteeing that
> every non-null a.id value has exactly one match in b and vice versa.

Oh, that was sloppy of me.  Join removal depends on unique constraints
not FK constraints.  So actually, this example just requires assuming
that both a.id and b.id are known unique, which is much less far-fetched
than assuming circular FK constraints.

regards, tom lane




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Andreas Karlsson

On 03/19/2017 07:35 PM, Stephen Frost wrote:

* Tom Lane (t...@sss.pgh.pa.us) wrote:

Stephen Frost  writes:
(Or in other words, we've been getting along fine with these script names
for circa twenty years, so what's the rush to change them RIGHT NOW?)


To be clear, I'm not in any particular rush to change them 'RIGHT NOW'.
I tend to agree with Magnus that we're doing a lot of other things in
PG10 and that makes it a bit of a natural point, but I don't hold that
position terribly strongly.  On the other hand, I do not relish the idea
of providing backwards-compatibility for every user-facing change we do
for 5 years and that's where I feel this approach is encouraging us to
go.


I think that argument is only applicable where the changes are 
closely related, e.g. renaming pg_clog, pg_xlog and pg_log in the same 
release. I do not see any strong connection between createuser and pg_xlog.


As for if we should have backwards compatibility for the old names I am 
leaning weakly for providing it in the case of createuser. I can see end 
users being pissed off that the createuser command is suddenly gone 
without any warning when they upgrade. On the flip side I have no idea 
how much work it would be to maintain those legacy names.


Andreas





Re: [HACKERS] Re: Improve OR conditions on joined columns (common star schema problem)

2017-03-19 Thread Tom Lane
Jim Nasby  writes:
> On 3/16/17 11:54 AM, David Steele wrote:
>> On 2/14/17 4:03 PM, Tom Lane wrote:
>>> Well, the key point is whether it's really OK to de-dup on the basis
>>> of only the CTIDs that are not eliminated in any UNION arm.  I was
>>> feeling fairly good about that until I thought of the full-join-to-
>>> left-join-to-no-join conversion issue mentioned in the comment.

>> Jim, have you had time to think about this?  Any insights?

> Having thought about it, I share Tom's concerns. And now I'm worried 
> about what happens if there are multiple separate OR clauses. I guess 
> those would be handled by separate UNIONs?

As proposed, the patch would try to optimize by splitting each OR clause
independently, and would choose whichever way gave the best cost estimate.
I'm not sure it's possible to do better than that, and even if it is,
I think improving it could be left for later.

> Tom, could you expand the description some, especially with some examples?

Here's a counterexample --- admittedly rather artificial, but still a
counterexample --- to applying the optimization when there are FULL joins.
Consider

SELECT count(*)
FROM a FULL JOIN b ON (a.id = b.id)
WHERE (a.x = 42 OR b.y = 43);

and suppose that a and b have mutual FK constraints guaranteeing that
every non-null a.id value has exactly one match in b and vice versa.  (For
the purposes of this example, I think it doesn't matter whether or not we
allow there to be null id values.)  Also suppose that there are some join
rows satisfying both a.x = 42 and b.y = 43, while other join rows satisfy
only one of the OR arms.  The patch would try to implement this query as,
approximately,

SELECT count(*)
FROM
( SELECT FROM a FULL JOIN b ON (a.id = b.id) WHERE a.x = 42
  UNION-using-ctids
  SELECT FROM a FULL JOIN b ON (a.id = b.id) WHERE b.y = 43 )

Now in the first UNION arm, we'd observe that the strict a.x = 42
condition allows reduction of the join strength to LEFT JOIN, and then
we'd deduce from the FK on b that the join to b can be dropped altogether.
In the second arm, we'd similarly conclude that a can be dropped
altogether, leaving

SELECT count(*)
FROM
( SELECT FROM a WHERE a.x = 42
  UNION-using-ctids
  SELECT FROM b WHERE b.y = 43 )

But at this point there are no rels that are not eliminated in any UNION
arm, so that no de-duping would occur in the UNION, meaning that we'd
double-count any join rows in which both a.x = 42 and b.y = 43.

I'd also considered an approach of de-duping on the basis of all relation
ctids, while allowing a rel's ctid to be returned as NULL from a UNION arm
in which the rel was eliminated entirely.  But that doesn't fix it,
because in this example the first arm would return (a.ctid, NULL) while
the second arm would return (NULL, b.ctid), so that the UNION step would
still fail to detect any duplication.  To make it work, we'd have to not
eliminate the joins, which would pretty much defeat the usefulness of
the optimization for your original example case.

So full joins definitely break this whole optimization.  I think it's okay
with left joins though, because when starting from "a left join b" it will
never be possible to remove "a" so we'll always include a.ctid in the
UNION de-duping step.  If b is removed in some arm, then it must be true
that we get exactly one left-join output row per a row, regardless of the
contents of b, in that arm.  The argument for the patch being okay is
essentially that we must get exactly one left-join output row per a row,
regardless of the contents of b, in *every* arm, because the various
modified versions of the OR clause can't affect that conclusion.  In some
of the arms we might not remove b, and we might even be able to reduce the
left join to an inner join, but there should still be no more than one
join output row per a row.  That being the case, it should be sufficient
to de-dup using a.ctid while ignoring b.ctid.

Any clearer yet?

regards, tom lane




Re: [HACKERS] Create replication slot in pg_basebackup if requested and not yet present

2017-03-19 Thread Michael Banck
Hi,

On Sun, Mar 19, 2017 at 05:01:23PM +0100, Magnus Hagander wrote:
> On Sun, Mar 19, 2017 at 11:21 AM, Michael Banck 
> wrote:
> > So I propose the attached tiny patch to just create the slot (if it does
> > not exist already) in pg_basebackup, somewhat similar to what
> > pg_receivewal does, albeit unconditionally, if the user explicitly
> > requested a slot:
> >
> > $ pg_basebackup --pgdata=data2 --write-recovery-conf --slot=pg2
> > $ echo $?
> > 0
> > $ psql -c "DROP_REPLICATION_SLOT pg2" "host=127.0.0.1 port=5432
> > replication=database user=mba dbname=postgres"
> > SELECT
> > $
> >
> > This would get us somewhat closer to near zero-config replication, in my
> > opinion. Pardon me if that was discussed already and shot down.
> >
> > Comments?
> 
> I think this is a good idea, since it makes replication slots easier to
> use. The change to use temporary slots was good for backups, but didn't
> help people setting up replicas.
> 
> I've been annoyed for a while we didn't have a "create slot" mode in
> pg_basebackup, but doing it integrated like this is definitely a step even
> better than that.

Great!
 
> I think maybe we should output a message when the slot is created, at least
> in verbose mode, to make sure people realize that happened. Does that seem
> reasonable?

So the patch I sent earlier creates the slot in ReceiveXlogStream() in
receivewal.c, as that's where the temp slot gets created as well, but
now I wonder whether that is maybe not the best place, as pg_receivewal
also calls that function. The other problem with receivewal.c is that
`verbose' isn't around in it so I don't know how I'd print out a message
there.

So probably it is better to create the slot in pg_basebackup.c's
StartLogStreamer(), see the attached first patch, that one also adds
a verbose message.

I also now realized I didn't run the TAP tests and they need updating;
this has been done in the first attached patch as well. Independently of
this patch series I think those two hunks from the third patch are
applicable to current master as well:

 $node->command_fails(
-   [ 'pg_basebackup', '-D', "$tempdir/fail", '-S', 'slot1' ],
+   [ 'pg_basebackup', '-D', "$tempdir/fail", '-X', 'none', '-S', 'slot0' ],
'pg_basebackup with replication slot fails without -X stream');

as '-X stream' is now the default, and (more cosmetically)

-like($lsn, qr!^0/[0-9A-Z]{7,8}$!, 'restart LSN of slot has advanced');
+like($lsn, qr!^0/[0-9A-F]{7,8}$!, 'restart LSN of slot has advanced');

as we are looking for hex values.

However, the other thing to ponder is that we don't really know whether
the user maybe setup a replication slot on the primary in preparation
already as there seems to be no way to query the list of current slots
via the replication protocol, and setting up another normal connection
just to query pg_replication_slots seems to be overkill. So maybe the
user would be confused then why the slot is created, but IDK. 

If there are other uses for querying the available replication slots,
maybe the protocol could be extended to that end for 11.

Finally, it is a bit inconsistent that we'd report the creation of the
permanent slot, but not the temporary one. 

I took a look, and it seems the main reason why ReceiveXlogStream() does
not call streamutil.c's CreateReplicationSlot() is that the
latter has no notion of temporary slots yet.  I'm attaching a second
patch which refactors things a bit more, adding a `is_temporary'
argument to CreateReplicationSlot() and moving the creation of the
temporary slot to pg_basebackup.c's StartLogStreamer() as well (as
pg_recvlogical and pg_receivewal do not deal with temp slots anyway).
This way, the creation of the temporary slot can be mentioned with --verbose
as well.

Personally, I think it'd be good to be able to have the --slot option
just work in 10, so if the first patch is still acceptable (pending
review) for 10 but not the second, that'd be entirely fine.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
>From 835e6fe3e30d534bc5918d1d6c399074a9a13626 Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Sun, 19 Mar 2017 10:58:13 +0100
Subject: [PATCH 1/2] Create replication slot in pg_basebackup if requested and
 not yet present.

If a replication slot is explicitly requested, try to create it before
starting to replicate from it.
---
 src/bin/pg_basebackup/pg_basebackup.c| 15 +++
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 19 ---
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 0a4944d..69ca4be 100644
--- 

Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> It's not a "half measure", it's providing a sane upgrade path.
> 
> > I really don't see it that way.  We're talking about existing scripts
> > which will break if the binary is renamed.  That means that, today,
> > they're using createlang/createdb/createuser.  The options for a user
> > using such a script with the proposed approach, when PG10 comes out
> > are:
> 
> > 1) don't change the script, because the old names work
> > 2) adjust the script to work with both X and pg_X values
> 
> You're neglecting what I think most people would want to do:
> 
> 3) Wait till they no longer care about supporting versions that have
> only the old names, then change their scripts to use pg_X.

Perhaps it's a bit stark, but my experience has been that people tend to
fall into one of the two camps I outlined above.

If we take your approach to its logical conclusion then we should be
planning to maintain all user-facing deprecated features for as long as
there is a version where it exists in a non-deprecated fashion, in other
words for 5 years, if we assume that script authors only care about
supporting as far back as we do, which I'm not entirely sure is a great
assumption to begin with, but I'd say it's at least a minimum.

To that I say: heck no.  I understand that there has been grief due to
the various user-facing changes we've made recently and I completely
understand those complaints because it means that developers have to
deal with version-specific differences, particularly when it comes to
monitoring and now with the pg_xlog -> pg_wal changes, more with backup
too, but trying to deal with multiple sets of interfaces over a long
period of time is going to be incredibly painful and very confusing for
users, especially new users.

In some cases, we would end up with 5 interfaces, all having to be
maintained across all of the back-branches, because we changed something
in every release, deprecating each prior version as we went.

No, I don't feel that this kind of backwards-compatibility is really
something we want, as a project.  We, for very good reason, support 5
years worth of back-branches to allow users the opportunity and time to
migrate at a reasoned pace, possibly even skipping versions as they go
if they don't wish to make changes every year.

I'll admit that this is a bit of a straw-man and goes beyond what you're
suggesting here, specifically, but that just begs the question of "where
do we draw the line", if we are going to draw one.  Are scripts which
use command-line utilities more of an issue to break than those which
query pg_stat_activity?

> > I anticipate an argument along the lines of "but we're giving them time
> > to make the change" but I don't see that as really holding all that much
> > weight either- we maintain back-branch releases for years to give users
> > time to adjust to changes that have been made in the latest releases.
> 
> Yes, and that also means that other tooling has to be prepared to work
> with multiple releases.  You're proposing to make that harder, and I
> do not think there's sufficient reason.

I do think, in the general sense, that tools based on PG should be
prepared to deal with differences between major versions, because we
have quite a few of those.  I'm hard-pressed to come up with a major PG
tool which *doesn't* have mechanisms for dealing with differences
between major versions already.

There's certainly no shortage of PG-based applications which also have
code for dealing with different PG major versions either.

> This line of argument means that we probably couldn't remove the old
> names until 9.6 is out of support, but so what?  We had the deprecation
> notice for createlang in place since 9.1, and I think that that's about
> the right timeline for this sort of changeover.  We should not
> cavalierly break peoples' scripts for what's fundamentally just a
> cosmetic improvement.

Per my question above- when is 5 years of deprecated-but-working support
for a feature the right timeline and when isn't it?  My feeling is that
most of the changes we make in this regard are accepted under the "well,
it's a major version change, so you'll need to adjust" but this change
isn't viewed in that same light and I'm not really sure that I see a
good reason for that.

I'll admit that part of that is likely because I don't think I have
*ever* used createdb or createlang or createuser (excepting whatever
regression tests run them), but even so, figuring out some idea where we
draw this line between things that need to be deprecated for 5 years and
things that don't would be useful, though likely a source of ongoing
discussion too.

> (Or in other words, we've been getting along fine with these script names
> for circa twenty years, so what's the rush to change them RIGHT NOW?)

To be clear, I'm not in any particular rush to change them 'RIGHT NOW'.

Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> It's not a "half measure", it's providing a sane upgrade path.

> I really don't see it that way.  We're talking about existing scripts
> which will break if the binary is renamed.  That means that, today,
> they're using createlang/createdb/createuser.  The options for a user
> using such a script with the proposed approach, when PG10 comes out
> are:

> 1) don't change the script, because the old names work
> 2) adjust the script to work with both X and pg_X values

You're neglecting what I think most people would want to do:

3) Wait till they no longer care about supporting versions that have
only the old names, then change their scripts to use pg_X.

> I anticipate an argument along the lines of "but we're giving them time
> to make the change" but I don't see that as really holding all that much
> weight either- we maintain back-branch releases for years to give users
> time to adjust to changes that have been made in the latest releases.

Yes, and that also means that other tooling has to be prepared to work
with multiple releases.  You're proposing to make that harder, and I
do not think there's sufficient reason.

This line of argument means that we probably couldn't remove the old
names until 9.6 is out of support, but so what?  We had the deprecation
notice for createlang in place since 9.1, and I think that that's about
the right timeline for this sort of changeover.  We should not
cavalierly break peoples' scripts for what's fundamentally just a
cosmetic improvement.

(Or in other words, we've been getting along fine with these script names
for circa twenty years, so what's the rush to change them RIGHT NOW?)

regards, tom lane




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-19 Thread Tom Lane
Alvaro Herrera  writes:
> The reason this is so large is that there is an entangled refactoring
> patch, splitting the exec_command() function from one giant switch()
> into one routine for each command.  It's up to the committer whether to
> do it all in one patch, or to request this to be split into a
> refactoring patch plus another adding functionality on top.

Assuming we want to do it that way at all, two steps would probably be
easier to review in detail.

I'm not entirely convinced that function-per-command is an improvement
though.  Seems like it would only help to the extent that you could do a
simple "return" to implement early exit, and it looks to me like that
doesn't work in a lot of places because you still have to clean up things
like malloc'd argument strings before you can return.  So the question
we have to answer is whether this way looks cleaner than what we'd get if
we just changed the logic in-place.  For the purpose of answering that
question, looking at the final state is the right thing to do.

I don't have a definite opinion on that core question yet, since I've not
read this version of the patch.  Anybody else want to give an opinion?

regards, tom lane




Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-03-19 Thread Peter Geoghegan
On Sun, Mar 19, 2017 at 8:40 AM, Greg Stark  wrote:
>> Out of idle curiosity, I decided to generate disassembly of both
>> macaddr_cmp_internal(), and the patch's abbreviated comparator. The
>> former consists of 49 x86-64 instructions at -02 on my machine,
>> totaling 135 bytes of object code. The latter consists of only 10
>> instructions, or 24 bytes of object code.
>
> I wonder if there's something that could be optimized out of the
> normal cmp function but we're defeating some compiler optimizations
> with all our casts and aliasing.

There was one shl instruction for every left shift (hibits() or
lowbits() call) that appears in macaddr_cmp_internal(). I suppose that
it's possible that that could have been better optimized on a
big-endian machine, where abbreviated keys do not need to be
byteswaped to make the abbreviated comparator work. Perhaps the
compiler could have recognized that macaddr is a struct that consists
of 6 unsigned bytes as digits.

One thing that I've noticed makes a relatively big difference to
instruction count in comparators is varlena overhead, which does come
up here, since macaddr is a type that doesn't have a varlena header
(it was recently suggested by Tom that this is a mistake on practical
grounds, though). I've informally considered the possibility of
providing alternative versions of comparators that do not detoast or
work with anything other than 1-byte header varlenas, because
tuplesort has detected that that happens to be generally safe. I doubt
that I'll ever get around to posting a patch to do that, since the
cost savings are probably still marginal. I could probably find
something better to work on.

-- 
Peter Geoghegan




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> >> On Sat, Mar 18, 2017 at 9:23 PM, Tom Lane  wrote:
> >>> I think a more reasonable way to proceed is to install symlinks
> >>> pg_createuser -> createuser (or the other direction), mark the older names
> >>> as deprecated, and announce that we'll remove the old names a few releases
> >>> from now.  That gives people time to adjust.
> 
> >> I'd suggest doing it in the other direction, but yeah, that seems like a
> >> softer way to handle it. As long as we clearly document them as such.
> >> Perhaps we should even have them output a little "hey you should be using
> >> pg_xyz" if they are used by the wrong name, but I wonder if that might
> >> break things.
> 
> > I don't really agree with these half-measures.
> 
> It's not a "half measure", it's providing a sane upgrade path.

I really don't see it that way.  We're talking about existing scripts
which will break if the binary is renamed.  That means that, today,
they're using createlang/createdb/createuser.  The options for a user
using such a script with the proposed approach, when PG10 comes out,
are:

1) don't change the script, because the old names work
2) adjust the script to work with both X and pg_X values

If option #1 is used, then it's just going to break whenever we do
remove those symlinks, meaning all we're doing is delaying things
without very much point.

If option #2 is used, which I believe is what we would want people to
do, then the symlinks are useless.

I anticipate an argument along the lines of "but we're giving them time
to make the change", but I don't see that as holding all that much
weight either -- we maintain back-branch releases for years to give users
time to adjust to changes that have been made in the latest releases.
The people who are going to rush out to deploy PG10 as soon as it's
released, with the expectation that nothing is going to break, aren't
very likely to be the group who will be reading the release notes
carefully and making note of the changes we made that impact their
environment and scripts.  In other words, they're likely to stick with
option #1 above, and complain loudly whenever we do remove the symlinks,
if we ever manage to.

> If we had not had the deprecation notice in place for createlang, I do
> not think Magnus' proposal to remove it would have been accepted so
> easily --- or, probably, at all.  There is no such notice in place
> for createdb or createuser, so arguing by analogy to that proposal
> falls flat.

I wasn't.

Thanks!

Stephen




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-19 Thread Alvaro Herrera
Fabien COELHO wrote:
> 
> Hello Corey,
> 
> > v24 highlights:
> 
> The v24 patch is twice as large as the previous submission. Sigh.

The reason this is so large is that there is an entangled refactoring
patch, splitting the exec_command() function from one giant switch()
into one routine for each command.  It's up to the committer whether to
do it all in one patch, or to request this to be split into a
refactoring patch plus another adding functionality on top.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-19 Thread Fabien COELHO


Hello Corey,


v24 highlights:


The v24 patch is twice as large as the previous submission. Sigh.

If I'm reading headers correctly, it seems that it adds an 
"expected/psql-on-error-stop.out" file without a corresponding test source 
in "sql/". Is this file to be simply ignored, or is a source missing?


--
Fabien.




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Tom Lane
Stephen Frost  writes:
> * Magnus Hagander (mag...@hagander.net) wrote:
>> On Sat, Mar 18, 2017 at 9:23 PM, Tom Lane  wrote:
>>> I think a more reasonable way to proceed is to install symlinks
>>> pg_createuser -> createuser (or the other direction), mark the older names
>>> as deprecated, and announce that we'll remove the old names a few releases
>>> from now.  That gives people time to adjust.

>> I'd suggest doing it in the other direction, but yeah, that seems like a
>> softer way to handle it. As long as we clearly document them as such.
>> Perhaps we should even have them output a little "hey you should be using
>> pg_xyz" if they are used by the wrong name, but I wonder if that might
>> break things.

> I don't really agree with these half-measures.

It's not a "half measure", it's providing a sane upgrade path.

If we had not had the deprecation notice in place for createlang, I do
not think Magnus' proposal to remove it would have been accepted so
easily --- or, probably, at all.  There is no such notice in place
for createdb or createuser, so arguing by analogy to that proposal
falls flat.

regards, tom lane




Re: [HACKERS] increasing the default WAL segment size

2017-03-19 Thread David Steele



On 3/17/17 4:56 PM, Tom Lane wrote:
> Peter Eisentraut  writes:
>> On 3/17/17 16:20, Peter Eisentraut wrote:
>>> I think we would have to extend restore_command with an additional
>>> placeholder that communicates the segment size, and add a new pg_standby
>>> option to accept that size somehow.  And specifying the size would have
>>> to be mandatory, for complete robustness.  Urgh.
>
>> Another way would be to name the WAL files in a more self-describing
>> way.  For example, instead of
>
> Actually, if you're content with having tools obtain this info by
> examining the WAL files, we shouldn't need to muck with the WAL naming
> convention (which seems like it would be a horrid mess, anyway --- too
> much outside code knows that).  Tools could get the segment size out of
> XLogLongPageHeaderData.xlp_seg_size in the first page of the segment.
>
> regards, tom lane


+1
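
As a concrete sketch of Tom's suggestion: a tool can read the value out of
the long page header at the start of a segment. The offsets below are my
assumption from the XLogLongPageHeaderData layout (standard page header,
then xlp_sysid, xlp_seg_size, xlp_xlog_blcksz); a real tool should include
the server headers and use the struct directly rather than hard-coding
offsets, which can vary with platform alignment rules:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed offsets into the first page of a WAL segment (see lead-in). */
#define ASSUMED_XLP_SYSID_OFFSET	24
#define ASSUMED_XLP_SEG_SIZE_OFFSET	(ASSUMED_XLP_SYSID_OFFSET + sizeof(uint64_t))

/*
 * Extract xlp_seg_size from the first page of a WAL segment.  memcpy
 * avoids unaligned reads; the value is in the byte order of the server
 * that generated the WAL.
 */
static uint32_t
wal_segment_size_from_page(const unsigned char *first_page)
{
	uint32_t	seg_size;

	memcpy(&seg_size, first_page + ASSUMED_XLP_SEG_SIZE_OFFSET,
		   sizeof(seg_size));
	return seg_size;
}
```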

--
-David
da...@pgmasters.net




Re: [HACKERS] createlang/droplang deprecated

2017-03-19 Thread David Steele

On 3/18/17 9:00 AM, Peter Eisentraut wrote:

I just noticed that createlang and droplang have been listed as
deprecated since PG 9.1.

Do we dare remove them?


+1

--
-David
da...@pgmasters.net




Re: [HACKERS] Create replication slot in pg_basebackup if requested and not yet present

2017-03-19 Thread Magnus Hagander
On Sun, Mar 19, 2017 at 11:21 AM, Michael Banck 
wrote:

> Hi,
>
> with the default configuration improvements so far for 10, it seems to
> be very easy to setup streaming replication (at least locally):
>
> $ initdb --pgdata=data1
> $ pg_ctl --pgdata=data1 start
> $ pg_basebackup --pgdata=data2 --write-recovery-conf
> $ sed -i -e 's/^#port.=.5432/port = 5433/' \
> > -e 's/^#hot_standby.=.off/hot_standby = on/' \
> > data2/postgresql.conf
> $ pg_ctl --pgdata=data2 start
>
> (there might be a case for having hot_standby=on by default, but I think
> that got discussed elsewhere and is anyway a different thread).
>
> However, the above does not use replication slots, and if you want to do
> so, you get:
>
> $ pg_basebackup --pgdata=data2 --write-recovery-conf --slot=pg2
> 2017-03-19 11:04:37.978 CET [25362] ERROR:  replication slot "pg2" does
> not exist
> pg_basebackup: could not send replication command "START_REPLICATION":
> ERROR:  replication slot "pg2" does not exist
> pg_basebackup: child process exited with error 1
> pg_basebackup: removing data directory "data2"
>
> The error message is clear enough, but I wonder whether people will
> start writing streaming replication tutorials just glossing over this
> because it's one more step to run CREATE_REPLICATION_SLOT manually.
>
> So I propose the attached tiny patch to just create the slot (if it does
> not exist already) in pg_basebackup, somewhat similar to what
> pg_receivewal does, albeit unconditionally, if the user explicitly
> requested a slot:
>
> $ pg_basebackup --pgdata=data2 --write-recovery-conf --slot=pg2
> $ echo $?
> 0
> $ psql -c "DROP_REPLICATION_SLOT pg2" "host=127.0.0.1 port=5432
> replication=database user=mba dbname=postgres"
> SELECT
> $
>
> This would get us somewhat closer to near zero-config replication, in my
> opinion. Pardon me if that was discussed already and shot down.
>
> Comments?
>

I think this is a good idea, since it makes replication slots easier to
use. The change to use temporary slots was good for backups, but didn't
help people setting up replicas.

I've been annoyed for a while that we didn't have a "create slot" mode in
pg_basebackup, but doing it integrated like this is definitely a step even
better than that.

I think maybe we should output a message when the slot is created, at least
in verbose mode, to make sure people realize that happened. Does that seem
reasonable?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-03-19 Thread Greg Stark
On 18 March 2017 at 22:22, Peter Geoghegan  wrote:
>
> Out of idle curiosity, I decided to generate disassembly of both
> macaddr_cmp_internal(), and the patch's abbreviated comparator. The
> former consists of 49 x86-64 instructions at -O2 on my machine,
> totaling 135 bytes of object code. The latter consists of only 10
> instructions, or 24 bytes of object code.

I wonder if there's something that could be optimized out of the
normal cmp function but we're defeating some compiler optimizations
with all our casts and aliasing.


-- 
greg




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
> On Sat, Mar 18, 2017 at 9:23 PM, Tom Lane  wrote:
> > Magnus Hagander  writes:
> > > createuser, dropuser - definitely pollutes the namespace, people do
> > > sometimes try them for the wrong thing. Unlike the db ones they do add
> > > value though -- I don't think we have a psql way of in a single command
> > > doing what --pwprompt on createuser does, do we? But given that we are in
> > > the process of breaking a lot of other scripts for 10, perhaps we should
> > > rename it to pg_createuser?
> >
> > I'm not particularly on board with arguments like "we already broke a ton
> > of stuff so let's break some more".  Eventually you've managed to create
> > a daunting barrier to upgrading at all.
> 
> The argument is more that if we are going to break it, now is a good time
> to do it because we are already forcing people to review their scripts.
> 
> If we decide not to break it at all that's one thing. But if we *are* going
> to break it, it's better to do it in 10 than in 11.

I tend to agree with this, but...

> > I think a more reasonable way to proceed is to install symlinks
> > pg_createuser -> createuser (or the other direction), mark the older names
> > as deprecated, and announce that we'll remove the old names a few releases
> > from now.  That gives people time to adjust.
> 
> I'd suggest doing it in the other direction, but yeah, that seems like a
> softer way to handle it. As long as we clearly document them as such.
> Perhaps we should even have them output a little "hey you should be using
> pg_xyz" if they are used by the wrong name, but I wonder if that might
> break things.

I don't really agree with these half-measures.  They don't do anything
for the namespace pollution problem and even if we do spit out a warning
or something (which may break things *anyway*, as you suggest), it's
just delaying when things break to some future point where we'll have to
argue, again, about removing the symlinks, and then it'll be in PG11 or
PG12 or some other release which isn't breaking as much.

> > Maybe we should handle createdb likewise, rather than just kicking it to
> > the curb.  I know I use it quite often; it's less typing than psql -c
> > 'create database ...' postgres, and still would be with a pg_ prefix.
> 
> As long as they have a pg_ prefix, I don't see much harm in them being
> there, they're tiny. It's not like they're a huge maintenance burden.

I agree with keeping them, with a pg_ prefix.

Thanks!

Stephen




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Magnus Hagander
On Sat, Mar 18, 2017 at 9:23 PM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > createuser, dropuser - definitely pollutes the namespace, people do
> > sometimes try them for the wrong thing. Unlike the db ones they do add
> > value though -- I don't think we have a psql way of in a single command
> > doing what --pwprompt on createuser does, do we? But given that we are in
> > the process of breaking a lot of other scripts for 10, perhaps we should
> > rename it to pg_createuser?
>
> I'm not particularly on board with arguments like "we already broke a ton
> of stuff so let's break some more".  Eventually you've managed to create
> a daunting barrier to upgrading at all.
>

The argument is more that if we are going to break it, now is a good time
to do it because we are already forcing people to review their scripts.

If we decide not to break it at all that's one thing. But if we *are* going
to break it, it's better to do it in 10 than in 11.



> I think a more reasonable way to proceed is to install symlinks
> pg_createuser -> createuser (or the other direction), mark the older names
> as deprecated, and announce that we'll remove the old names a few releases
> from now.  That gives people time to adjust.
>

I'd suggest doing it in the other direction, but yeah, that seems like a
softer way to handle it. As long as we clearly document them as such.
Perhaps we should even have them output a little "hey you should be using
pg_xyz" if they are used by the wrong name, but I wonder if that might
break things.


Maybe we should handle createdb likewise, rather than just kicking it to
> the curb.  I know I use it quite often; it's less typing than psql -c
> 'create database ...' postgres, and still would be with a pg_ prefix.
>

As long as they have a pg_ prefix, I don't see much harm in them being
there, they're tiny. It's not like they're a huge maintenance burden.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Magnus Hagander
On Sun, Mar 19, 2017 at 1:44 PM, Andreas Karlsson  wrote:

> On 03/18/2017 09:12 PM, Magnus Hagander wrote:
>
>> createdb, dropdb - also not clear they're about postgres, more likely to
>> be used by mistake but not that bad. That said, do they add any *value*
>> beyond what you can do with psql -c "CREATE DATABASE"? I don't really
>> see one, so I'd suggest dropping these too.
>>
>
> The value they add is that they quote the database name and options
> correctly which makes them easier to use safely and reliably in shell
> scripts. And unless I am missing something obvious I do not think there is
> any easy way for a beginner to do this with just psql.


Good point, I hadn't thought of that. I guess I just generally make sure I
don't use names that require quoting myself, but I can definitely see how
that's an actual value-add.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Logical decoding on standby

2017-03-19 Thread Petr Jelinek
Hi,

I don't know how well I can review the 0001 (the TAP infra patch) but it
looks okay to me.

I don't really have any complaints about 0002 either. I like that it's
more or less one self-contained function and there are no weird ifdefs
anymore like in the 9.6 version (btw, your commit message talks about 9.5
but it was 9.6). I also like the clever test :)

I am slightly worried about the impact of the readTimeLineHistory() call,
but I think it should be called so rarely that it should not matter.

That brings us to the big patch 0003.

I still don't like the "New in 10.0" comments in the documentation: for
one, it's 10, not 10.0, and mainly we don't generally write stuff like
this in the documentation; that's what release notes are for.

There is a fair amount of whitespace mess (empty lines containing only
whitespace, trailing spaces at the ends of lines); nothing horrible, but
it should be cleaned up.

One thing I don't understand much is the wal_level change and turning
off catalogXmin tracking. I don't really see anywhere that the
catalogXmin would be reset in the control file, for example. There is a
TODO in UpdateOldestCatalogXmin() that seems related, but tbh I don't
follow what's happening there - the comment talks about not doing
anything, but the code inside the if block consists only of Asserts.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] mat views stats

2017-03-19 Thread Jim Mlodgenski
> But that seems pretty ugly.  Given the lack of previous reports, I'm
> personally content to leave this unfixed in the back branches.
>
> Comments?

Most instances of this I've seen out in the field have worked around this
by just running analyze in the scheduled jobs after the refresh, so we're
probably good not back patching.


Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-03-19 Thread Petr Jelinek
On 19/03/17 12:32, Pavel Stehule wrote:
> 
> 
> 2017-03-18 19:30 GMT+01:00 Petr Jelinek:
> 
> On 16/03/17 17:15, David Steele wrote:
> > On 2/1/17 3:59 PM, Pavel Stehule wrote:
> >> Hi
> >>
> >> 2017-01-24 21:33 GMT+01:00 Pavel Stehule:
> >>
> >> Perhaps that's as simple as renaming all the existing _ns_*
> >> functions to _block_ and then adding support for pragmas...
> >>
> >> Since you're adding cursor_options to PLpgSQL_expr it 
> should
> >> probably be removed as an option to exec_*.
> >>
> >> I have to recheck it. Some cursor options going from dynamic
> >> cursor variables and are related to dynamic query - not query
> >> that creates query string.
> >>
> >> hmm .. so current state is better due using options like
> >> CURSOR_OPT_PARALLEL_OK
> >>
> >>  if (expr->plan == NULL)
> >> exec_prepare_plan(estate, expr, (parallelOK ?
> >>   CURSOR_OPT_PARALLEL_OK : 0) |
> >> expr->cursor_options);
> >>
> >> This options is not permanent feature of expression - and then I
> >> cannot to remove cursor_option argument from exec_*
> >>
> >> I did minor cleaning - remove cursor_options from plpgsql_var
> >>
> >> + basic doc
> >
> > This patch still applies cleanly and compiles at cccbdde.
> >
> > Any reviewers want to have a look?
> >
> 
> I'll bite.
> 
> I agree with Jim that it's not very nice to add yet another
> block/ns-like layer. I don't see why pragma could not be added to either
> PLpgSQL_stmt_block (yes pragma can be for whole function but function
> body is represented by PLpgSQL_stmt_block as well so no issue there), or
> to namespace code. In namespace, since they are used for another thing,
> there would be a bit of unnecessary propagation, but it's 8 bytes per
> namespace, which does not seem all that much.
> 
> My preference would be to add it to PLpgSQL_stmt_block (unless we plan
> to add the possibility to add pragmas for other loops and other things) but I
> am not sure if current block is easily (and in a fast way) accessible
> from all places where it's needed. Maybe the needed info could be pushed
> to estate from PLpgSQL_stmt_block during the execution.
> 
> 
> There is maybe a partial misunderstanding of pragma - it is a set of
> nested configurations used at compile time only. It can be used at
> execution time too - it changes nothing.
> 
> The pragma doesn't build a persistent tree. It is a stack of
> configurations that allows fast access to the current configuration, and
> fast leaving of a configuration when the change goes out of scope.
> 
> I don't see any advantage to integrating pragma into ns or into
> stmt_block. But maybe I don't understand your idea.
> 
> I see another possibility in the code - nesting init_block_directives()
> into plpgsql_ns_push() and free_block_directives() into plpgsql_ns_pop()
> 

That's more or less what I mean by "integrating" to ns :)

The main idea is to not add 3rd layer of block push/pop that's sprinkled
in "random" places.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical decoding of two-phase transactions

2017-03-19 Thread Petr Jelinek
On 17/03/17 03:34, Craig Ringer wrote:
> On 17 March 2017 at 08:10, Stas Kelvich  wrote:
> 
>> While working on this I've spotted quite a nasty corner case with aborted
>> prepared transactions. I have some not-that-great ideas how to fix it, but
>> maybe I blurred my view and missed something. So I want to ask here first.
>>
>> Suppose we created a table, then in 2pc tx we are altering it and after that 
>> aborting tx.
>> So pg_class will have something like this:
>>
>> xmin | xmax | relname
>> 100  | 200| mytable
>> 200  | 0| mytable
>>
>> After the abort, tuple (100,200,mytable) becomes visible again, and if we
>> alter the table again then the xmax of the first tuple will be set to the
>> current xid, resulting in the following table:
>>
>> xmin | xmax | relname
>> 100  | 300| mytable
>> 200  | 0| mytable
>> 300  | 0| mytable
>>
>> At that moment we've lost the information that the first tuple was
>> deleted by our prepared tx.
> 
> Right. And while the prepared xact has aborted, we don't control when
> it aborts and when those overwrites can start happening. We can and
> should check if a 2pc xact is aborted before we start decoding it so
> we can skip decoding it if it's already aborted, but it could be
> aborted *while* we're decoding it, then have data needed for its
> snapshot clobbered.
> 
> This hasn't mattered in the past because prepared xacts (and
> especially aborted 2pc xacts) have never needed snapshots, we've never
> needed to do something from the perspective of a prepared xact.
> 
> I think we'll probably need to lock the 2PC xact so it cannot be
> aborted or committed while we're decoding it, until we finish decoding
> it. So we lock it, then check if it's already aborted/already
> committed/in progress. If it's aborted, treat it like any normal
> aborted xact. If it's committed, treat it like any normal committed
> xact. If it's in progress, keep the lock and decode it.
> 
> People using logical decoding for 2PC will presumably want to control
> 2PC via logical decoding, so they're not so likely to mind such a
> lock.
> 
>> * Try at first to scan catalog filtering out tuples with xmax bigger than 
>> snapshot->xmax
>> as it was possibly deleted by our tx. Than if nothing found scan in a usual 
>> way.
> 
> I don't think that'll be at all viable with the syscache/relcache
> machinery. Way too intrusive.
> 

I think only genam would need changes to do a two-phase scan for this, as
the catalog scans should ultimately go there. It's going to slow things
down, but we could limit the impact by doing the two-phase scan only
when a historical snapshot is in use and the tx being decoded changed
catalogs (we already have global knowledge of the first one, and it
would be trivial to add the second one as we have local knowledge of
that as well).

What I think is a better strategy than filtering out by xmax would be
filtering "in" by xmin, though. Meaning that the first scan would return
only tuples modified by the current tx which are visible in the snapshot,
and the second scan would return the other visible tuples. That way,
whatever the decoded tx has seen should always win.
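
That preference can be modeled as a simple two-pass selection over
candidate tuples. This is a toy sketch with bare xids and a precomputed
visibility flag -- nothing like the actual genam or snapshot code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy catalog tuple: the inserting xid, a payload, and its visibility. */
typedef struct
{
	uint32_t	xmin;
	int			payload;
	bool		visible;	/* visibility under the historic snapshot */
} ToyTuple;

/*
 * Two-phase selection: first return a tuple the decoded transaction
 * itself inserted (xmin == decoded_xid) if one is visible; fall back to
 * other visible tuples only when the first pass finds nothing.  This
 * models "whatever the decoded tx has seen should always win".
 */
static const ToyTuple *
two_phase_lookup(const ToyTuple *tuples, int n, uint32_t decoded_xid)
{
	for (int i = 0; i < n; i++)	/* phase 1: the decoded tx's own rows */
		if (tuples[i].visible && tuples[i].xmin == decoded_xid)
			return &tuples[i];
	for (int i = 0; i < n; i++)	/* phase 2: any other visible row */
		if (tuples[i].visible)
			return &tuples[i];
	return NULL;
}
```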

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Andreas Karlsson

On 03/18/2017 09:12 PM, Magnus Hagander wrote:

createdb, dropdb - also not clear they're about postgres, more likely to
be used by mistake but not that bad. That said, do they add any *value*
beyond what you can do with psql -c "CREATE DATABASE"? I don't really
see one, so I'd suggest dropping these too.


The value they add is that they quote the database name and options 
correctly which makes them easier to use safely and reliably in shell 
scripts. And unless I am missing something obvious I do not think there 
is any easy way for a beginner to do this with just psql.
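
For reference, the quoting rule those wrappers apply is simple to state:
wrap the name in double quotes and double any embedded double quote. A
minimal sketch of that rule follows (a hypothetical helper; the real
client programs use a common fmtId() routine, which also skips quoting
when it is unnecessary):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Quote an arbitrary name as a SQL identifier: wrap it in double quotes
 * and double any embedded double quote.  Truncates silently if the
 * output buffer is too small; a real implementation would report that.
 */
static void
quote_sql_identifier(const char *name, char *out, size_t outlen)
{
	size_t		o = 0;

	if (o < outlen - 1)
		out[o++] = '"';
	for (const char *p = name; *p && o + 2 < outlen; p++)
	{
		if (*p == '"')
			out[o++] = '"';		/* double embedded quotes */
		out[o++] = *p;
	}
	if (o < outlen - 1)
		out[o++] = '"';
	out[o] = '\0';
}
```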


Andreas




Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)

2017-03-19 Thread Joshua D. Drake

On 03/18/2017 01:12 PM, Magnus Hagander wrote:

Magnus Hagander writes:

2017-03-18 14:00 GMT+01:00 Peter Eisentraut:



createuser, dropuser - definitely pollutes the namespace, people do
sometimes try them for the wrong thing. Unlike the db ones they do add
value though -- I don't think we have a psql way of in a single command
doing what --pwprompt on createuser does, do we? But given that we are
in the process of breaking a lot of other scripts for 10, perhaps we
should rename it to pg_createuser?


We have one chance in the near future to shake things up, break things
for the better, and lose a lot of long-standing issues. Making things
consistent and declarative (pg_ for example) is a great opportunity.


+1

JD


--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-03-19 Thread Pavel Stehule
2017-03-18 19:30 GMT+01:00 Petr Jelinek :

> On 16/03/17 17:15, David Steele wrote:
> > On 2/1/17 3:59 PM, Pavel Stehule wrote:
> >> Hi
> >>
> >> 2017-01-24 21:33 GMT+01:00 Pavel Stehule:
> >>
> >> Perhaps that's as simple as renaming all the existing _ns_*
> >> functions to _block_ and then adding support for pragmas...
> >>
> >> Since you're adding cursor_options to PLpgSQL_expr it should
> >> probably be removed as an option to exec_*.
> >>
> >> I have to recheck it. Some cursor options going from dynamic
> >> cursor variables and are related to dynamic query - not query
> >> that creates query string.
> >>
> >> hmm .. so current state is better due using options like
> >> CURSOR_OPT_PARALLEL_OK
> >>
> >>  if (expr->plan == NULL)
> >> exec_prepare_plan(estate, expr, (parallelOK ?
> >>   CURSOR_OPT_PARALLEL_OK : 0) |
> >> expr->cursor_options);
> >>
> >> This options is not permanent feature of expression - and then I
> >> cannot to remove cursor_option argument from exec_*
> >>
> >> I did minor cleaning - remove cursor_options from plpgsql_var
> >>
> >> + basic doc
> >
> > This patch still applies cleanly and compiles at cccbdde.
> >
> > Any reviewers want to have a look?
> >
>
> I'll bite.
>
> I agree with Jim that it's not very nice to add yet another
> block/ns-like layer. I don't see why pragma could not be added to either
> PLpgSQL_stmt_block (yes pragma can be for whole function but function
> body is represented by PLpgSQL_stmt_block as well so no issue there), or
> to namespace code. In namespace, since they are used for another thing,
> there would be a bit of unnecessary propagation, but it's 8 bytes per
> namespace, which does not seem all that much.
>
> My preference would be to add it to PLpgSQL_stmt_block (unless we plan
> to add the possibility to add pragmas for other loops and other things) but I
> am not sure if current block is easily (and in a fast way) accessible
> from all places where it's needed. Maybe the needed info could be pushed
> to estate from PLpgSQL_stmt_block during the execution.
>
>
There is maybe a partial misunderstanding of pragma - it is a set of nested
configurations used at compile time only. It can be used at execution time
too - it changes nothing.

The pragma doesn't build a persistent tree. It is a stack of configurations
that allows fast access to the current configuration, and fast leaving of a
configuration when the change goes out of scope.

I don't see any advantage to integrating pragma into ns or into stmt_block.
But maybe I don't understand your idea.

I see another possibility in the code - nesting init_block_directives()
into plpgsql_ns_push() and free_block_directives() into plpgsql_ns_pop()

Pavel



> --
>   Petr Jelinek  http://www.2ndQuadrant.com/
>   PostgreSQL Development, 24x7 Support, Training & Services
>


[HACKERS] Create replication slot in pg_basebackup if requested and not yet present

2017-03-19 Thread Michael Banck
Hi,

with the default configuration improvements so far for 10, it seems to
be very easy to setup streaming replication (at least locally):

$ initdb --pgdata=data1
$ pg_ctl --pgdata=data1 start
$ pg_basebackup --pgdata=data2 --write-recovery-conf
$ sed -i -e 's/^#port.=.5432/port = 5433/' \
> -e 's/^#hot_standby.=.off/hot_standby = on/' \
> data2/postgresql.conf
$ pg_ctl --pgdata=data2 start

(there might be a case for having hot_standby=on by default, but I think
that got discussed elsewhere and is anyway a different thread).

However, the above does not use replication slots, and if you want to do
so, you get:

$ pg_basebackup --pgdata=data2 --write-recovery-conf --slot=pg2 
2017-03-19 11:04:37.978 CET [25362] ERROR:  replication slot "pg2" does
not exist
pg_basebackup: could not send replication command "START_REPLICATION":
ERROR:  replication slot "pg2" does not exist
pg_basebackup: child process exited with error 1
pg_basebackup: removing data directory "data2"

The error message is clear enough, but I wonder whether people will
start writing streaming replication tutorials just glossing over this
because it's one more step to run CREATE_REPLICATION_SLOT manually.

So I propose the attached tiny patch to just create the slot (if it does
not exist already) in pg_basebackup, somewhat similar to what
pg_receivewal does, albeit unconditionally, if the user explicitly
requested a slot:

$ pg_basebackup --pgdata=data2 --write-recovery-conf --slot=pg2 
$ echo $?
0
$ psql -c "DROP_REPLICATION_SLOT pg2" "host=127.0.0.1 port=5432 
replication=database user=mba dbname=postgres" 
SELECT
$

This would get us somewhat closer to near zero-config replication, in my
opinion. Pardon me if that was discussed already and shot down.

Comments?


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
From 8a52b7d8cc6f956fba47465d280581e200ec5239 Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Sun, 19 Mar 2017 10:58:13 +0100
Subject: [PATCH] Create replication slot in pg_basebackup if requested and not
 yet present.

If a replication slot is explicitly requested, try to create it before
starting to replicate from it.
---
 src/bin/pg_basebackup/receivelog.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index f415135..0ddb7a6 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -527,6 +527,17 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
 	}
 
 	/*
+	 * Try to create a permanent replication slot if one is specified. Do
+	 * not error out if the slot already exists, other errors are already
+	 * reported by CreateReplicationSlot().
+	 */
+	if (!stream->temp_slot && stream->replication_slot)
+	{
+		if (!CreateReplicationSlot(conn, stream->replication_slot, NULL, true, true))
+			return false;
+	}
+
+	/*
 	 * initialize flush position to starting point, it's the caller's
 	 * responsibility that that's sane.
 	 */
-- 
2.1.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical decoding on standby

2017-03-19 Thread Simon Riggs
On 13 March 2017 at 10:56, Craig Ringer  wrote:
> On 7 March 2017 at 21:08, Simon Riggs  wrote:
>
>> Patch 4 committed. Few others need rebase.
>
> Since this patch series

Patch 1 fails to apply since the feature has already been committed. If
it fails for some other reason, let me know.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PinBuffer() no longer makes use of strategy

2017-03-19 Thread Simon Riggs
On 4 February 2017 at 09:33, Andres Freund  wrote:

> For pretty much everything else that logic seems worse,
> because it essentially prevents any buffers ever staying in s_b when
> only ringbuffer accesses are performed.

That was exactly its intent. The ringbuffer was designed to avoid
cache spoiling by large scans.
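As a toy illustration of that design (an invented sketch, not PostgreSQL's actual bufmgr/freelist code): a large scan reuses a small fixed-size ring of buffers, so it never evicts the shared pool's working set — which also means, as Andres notes, that ring-only accesses never populate shared buffers.

```python
from collections import OrderedDict

class BufferPool:
    """Toy shared buffer pool with LRU eviction plus a small scan ring."""
    def __init__(self, shared_size, ring_size):
        self.shared = OrderedDict()   # page -> buffer, in LRU order
        self.shared_size = shared_size
        self.ring = []                # fixed-size ring reused by large scans
        self.ring_size = ring_size
        self.ring_pos = 0

    def read_normal(self, page):
        # Normal access: goes through the shared pool, may evict the LRU victim.
        if page in self.shared:
            self.shared.move_to_end(page)
            return
        if len(self.shared) >= self.shared_size:
            self.shared.popitem(last=False)
        self.shared[page] = object()

    def read_scan(self, page):
        # Ring access: cycle through ring_size slots; shared pool untouched.
        if page in self.shared:
            return  # already cached: no I/O, no ring slot consumed
        if len(self.ring) < self.ring_size:
            self.ring.append(page)
        else:
            self.ring[self.ring_pos] = page
            self.ring_pos = (self.ring_pos + 1) % self.ring_size

pool = BufferPool(shared_size=4, ring_size=2)
for p in ("a", "b", "c", "d"):
    pool.read_normal(p)           # warm the shared pool
for p in range(1000):
    pool.read_scan(f"scan-{p}")   # huge scan cycles through the tiny ring
print(sorted(pool.shared))        # ['a', 'b', 'c', 'd'] -- working set survives
```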

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE command progress checker

2017-03-19 Thread Ashutosh Sharma
Hi,

I didn't find any major issues with the patch. It works as expected.
However, I have a few minor comments that I would like to share.

+ 
+   Total number of sample rows. The sample it reads is taken randomly.
+   Its size depends on the default_statistics_target
parameter value.
+ 

1) I think it would be better if you could add a reference link to
the GUC variable 'default_statistics_target'. Something like this:

<xref linkend="guc-default-statistics-target">.

If you go through monitoring.sgml, you will find that there are
reference links for all the GUC variables being used.

2) I feel that the 'computing_statistics' phase could have been
split into 'computing_column_stats' and 'computing_index_stats' phases.
Please let me know your thoughts on this.


+   certain commands during command execution.  Currently, the command
+   which supports progress reporting is VACUUM and
+   ANALYZE.  This may be expanded in the future.

3) I think the above needs to be rephrased. Something like: Currently,
the supported progress reporting commands are 'VACUUM' and
'ANALYZE'.

Moreover, I also ran your patch on the Windows platform and didn't find
any issues. Thanks.
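For context on the quoted documentation hunk: in stock PostgreSQL the per-column sample size is derived from the statistics target — std_typanalyze() in analyze.c requests 300 * target rows — so the "total number of sample rows" the view reports follows directly from that setting. A sketch of the relationship (the helper name is illustrative; the constant is from std_typanalyze):

```python
def analyze_sample_rows(statistics_target: int) -> int:
    # std_typanalyze() requests 300 * target sample rows per column,
    # following Chaudhuri et al.'s bound on sample size for histograms.
    return 300 * statistics_target

print(analyze_sample_rows(100))  # default_statistics_target = 100 -> 30000
```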

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

On Sat, Mar 18, 2017 at 5:30 PM, Ashutosh Sharma  wrote:
> Hi Vinayak,
>
> Here are couple of review comments that may need your attention.
>
> 1. Firstly, I am seeing some trailing whitespace errors when trying to
> apply your v3 patch using the git apply command.
>
> [ashu@localhost postgresql]$ git apply pg_stat_progress_analyze_v3.patch
> pg_stat_progress_analyze_v3.patch:155: trailing whitespace.
> CREATE VIEW pg_stat_progress_analyze AS
> pg_stat_progress_analyze_v3.patch:156: trailing whitespace.
> SELECT
> pg_stat_progress_analyze_v3.patch:157: trailing whitespace.
> S.pid AS pid, S.datid AS datid, D.datname AS datname,
> pg_stat_progress_analyze_v3.patch:158: trailing whitespace.
> S.relid AS relid,
> pg_stat_progress_analyze_v3.patch:159: trailing whitespace.
> CASE S.param1 WHEN 0 THEN 'initializing'
> error: patch failed: doc/src/sgml/monitoring.sgml:521
>
> 2) The test-case 'rules' is failing.  I think it's failing because in
> rules.sql 'ORDER BY viewname' is used, which will put
> 'pg_stat_progress_analyze' ahead of 'pg_stat_progress_vacuum' in the
> list of catalog views. You may need to correct the rules.out file
> accordingly. This is what I could see in rules.sql:
>
> SELECT viewname, definition FROM pg_views WHERE schemaname <>
> 'information_schema' ORDER BY viewname;
>
> I am still reviewing your patch and doing some testing. Will update if
> I find any issues. Thanks.
>
> --
> With Regards,
> Ashutosh Sharma
> EnterpriseDB: http://www.enterprisedb.com
>
> On Fri, Mar 17, 2017 at 3:16 PM, vinayak
>  wrote:
>>
>> On 2017/03/17 10:38, Robert Haas wrote:
>>>
>>> On Fri, Mar 10, 2017 at 2:46 AM, vinayak
>>>  wrote:

 Thank you for reviewing the patch.

 The attached patch incorporated Michael and Amit comments also.
>>>
>>> I reviewed this tonight.
>>
>> Thank you for reviewing the patch.
>>>
>>> +/* Report compute index stats phase */
>>> +pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
>>> +
>>> PROGRESS_ANALYZE_PHASE_COMPUTE_INDEX_STATS);
>>>
>>> Hmm, is there really any point to this?  And is it even accurate?  It
>>> doesn't look to me like we are computing any index statistics here;
>>> we're only allocating a few in-memory data structures here, I think,
>>> which is not a statistics computation and probably doesn't take any
>>> significant time.
>>>
>>> +/* Report compute heap stats phase */
>>> +pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
>>> +
>>> PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS);
>>>
>>> The phase you label as computing heap statistics also includes the
>>> computation of index statistics.  I wouldn't bother separating the
>>> computation of heap and index statistics; I'd just have a "compute
>>> statistics" phase that begins right after the comment that starts
>>> "Compute the statistics."
>>
>> Understood. Fixed in the attached patch.
>>>
>>>
>>> +/* Report that we are now doing index cleanup */
>>> +pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
>>> +PROGRESS_ANALYZE_PHASE_INDEX_CLEANUP);
>>>
>>> OK, this doesn't make any sense either.  We are not cleaning up the
>>> indexes here.  We are just closing them and releasing resources.  I
>>> don't see why you need this phase at all; it can't last more than some
>>> tiny fraction of a second.  It seems like you're trying to copy the
>>> exact same phases that exist for vacuum instead of thinking about what
>>> makes sense for ANALYZE.
>>
>> Understood. I have removed this phase.
>>>
>>>
>>> +/* Report number of heap blocks scaned so far*/
>>> +pgstat_progress_update_param(PROGRESS_ANALYZE_HEAP_BLKS_SCANNED,

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-19 Thread Pavan Deolasee
On Thu, Mar 16, 2017 at 12:53 PM, Robert Haas  wrote:

> On Wed, Mar 15, 2017 at 3:44 PM, Pavan Deolasee
>  wrote:
> > I couldn't find a better way without a lot of complex infrastructure.
> Even
> > though we now have ability to mark index pointers and we know that a
> given
> > pointer either points to the pre-WARM chain or post-WARM chain, this does
> > not solve the case when an index does not receive a new entry. In that
> case,
> > both pre-WARM and post-WARM tuples are reachable via the same old index
> > pointer. The only way we could deal with this is to mark index pointers
> as
> > "common", "pre-warm" and "post-warm". But that would require us to update
> > the old pointer's state from "common" to "pre-warm" for the index whose
> keys
> > are being updated. May be it's doable, but might be more complex than the
> > current approach.
>
> /me scratches head.
>
> Aren't pre-warm and post-warm just (better) names for blue and red?
>
>
Yeah, sounds better. Just to make it clear, the current design sets the
following information:

HEAP_WARM_TUPLE - When a row gets WARM updated, both old and new versions
of the row are marked with the HEAP_WARM_TUPLE flag. This allows us to
remember that a certain row was WARM-updated, even if the update later
aborts and we clean up the new version and truncate the chain. All
subsequent tuple versions will carry this flag until a non-HOT update
happens, which breaks the HOT chain.

HEAP_WARM_RED - After the first WARM update, the new version of the tuple
is marked with this flag. This flag is also carried forward to all future
HOT-updated tuples. So the only tuple that has HEAP_WARM_TUPLE but not
HEAP_WARM_RED is the old version before the WARM update. Also, all tuples
marked with the HEAP_WARM_RED flag satisfy the HOT property (i.e. all
index key columns share the same values). Similarly, all tuples NOT
marked with HEAP_WARM_RED also satisfy the HOT property. I've so far
called them the Red and Blue chains respectively.
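A toy model of the two-flag scheme as described above (flag names from the patch; this illustrates the invariants, not the actual heapam code):

```python
WARM_TUPLE = "HEAP_WARM_TUPLE"
WARM_RED = "HEAP_WARM_RED"

def warm_update(old_flags):
    """WARM update: both versions get WARM_TUPLE; only the new one gets RED."""
    old = set(old_flags) | {WARM_TUPLE}
    new = old | {WARM_RED}
    return old, new

def hot_update(prev_flags):
    """Plain HOT update: flags are simply carried forward to the new version."""
    return set(prev_flags)

v1 = set()                     # original tuple: plain blue chain, no flags
v1, v2 = warm_update(v1)       # first WARM update
v3 = hot_update(v2)            # later HOT updates stay in the red chain

assert v1 == {WARM_TUPLE}                 # old version: WARM_TUPLE only
assert WARM_RED in v2 and WARM_RED in v3  # red chain carries both flags
print("invariants hold")
```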

In addition, in the current patch, the new index pointers resulted from
WARM updates are marked BTREE_INDEX_RED_POINTER/HASH_INDEX_RED_POINTER

I think per your suggestion we can change HEAP_WARM_RED to HEAP_WARM_TUPLE
and similarly rename the index pointers to BTREE/HASH_INDEX_WARM_POINTER,
and replace HEAP_WARM_TUPLE with something like HEAP_WARM_UPDATED_TUPLE to
signify that this version, or some previous version in the chain, was once
WARM-updated.

Does that sound ok? I can change the patch accordingly.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services