Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-09 Thread Andres Freund
On 2017-11-09 16:45:07 -0800, Peter Geoghegan wrote:
> On Thu, Nov 9, 2017 at 4:17 PM, Andres Freund <and...@anarazel.de> wrote:
> >> Actually, on second thought, I take that back -- I don't think that
> >> REINDEXing will even finish once a HOT chain is broken by the bug.
> >> IndexBuildHeapScan() actually does quite a good job of making sure
> >> that HOT chains are sane, which is how the enhanced amcheck notices
> >> the bug here in practice.
> >
> > I think that's too optimistic.
> 
> Why? Because the "find the TID of the root" logic in
> IndexBuildHeapScan()/heap_get_root_tuples() won't reliably find the
> actual root (it might be some other HOT chain root following TID
> recycling by VACUUM)?

Primarily because it's not an anti-corruption tool. I'd be surprised if
there weren't ways to corrupt the page using these corruptions that
aren't detected by it.  But even if it were, I don't think there's
enough information to do so in the general case.  You very well can end
up with pages where subsequent hot pruning has removed a good bit of the
direct evidence of this bug.

But I'm not really sure why the error detection capabilities of matter
much for the principal point I raised, which is how much work we need to
do to not further worsen the corruption.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-09 Thread Andres Freund
On 2017-11-09 16:02:17 -0800, Peter Geoghegan wrote:
> > What I'm currently wondering about is how much we need to harden
> > postgres against such existing corruption. If e.g. the hot chains are
> > broken somebody might have reindexed thinking the problem is fixed - but
> > if they then later vacuum everything goes to shit again, with dead rows
> > reappearing.
> 
> I don't follow you here. Why would REINDEXing make the rows that
> should be dead disappear again, even for a short period of time?

It's not the REINDEX that makes them reappear. It's the second
vacuum. The reindex part was about $user trying to fix the problem...
As you need two vacuums with appropriate cutoffs to hit the "rows
revive" problem, that'll often in practice not happen immediately.


> Actually, on second thought, I take that back -- I don't think that
> REINDEXing will even finish once a HOT chain is broken by the bug.
> IndexBuildHeapScan() actually does quite a good job of making sure
> that HOT chains are sane, which is how the enhanced amcheck notices
> the bug here in practice.

I think that's too optimistic.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-09 Thread Andres Freund
On 2017-11-04 06:15:00 -0700, Andres Freund wrote:
> The reason for that is that I hadn't yet quite figured out how the bug I
> described in the commit message (and the previously committed testcase)
> would cause that. I was planning to diagnose / experiment more with this
> and write an email if I couldn't come up with an explanation.   The
> committed test does *not* actually trigger that.
> 
> The reason I couldn't quite figure out how the problem triggers is that
> [ long explanation ]

Attached is a version of the already existing regression test that both
reproduces the broken hot chain (and thus failing index lookups) and
then also the tuple reviving.  I don't see any need for letting this run
with arbitrary permutations.

Thanks to whoever allowed isolationtester permutations to go over
multiple lines and allow comments. I was wondering about adding that as
a feature just to discover it's already there ;)


What I'm currently wondering about is how much we need to harden
postgres against such existing corruption. If e.g. the hot chains are
broken somebody might have reindexed thinking the problem is fixed - but
if they then later vacuum everything goes to shit again, with dead rows
reappearing.  There's no way we can fix hot chains after the fact, but
preventing dead rows from reapparing seems important.  A minimal version
of that is fairly easy - we slap a bunch of if if
!TransactionIdDidCommit() elog(ERROR) at various code paths. But that'll
often trigger clog access errors when the problem occurred - if we want
to do better we need to pass down relfrozenxid/relminmxid to a few
functions.  I'm inclined to do so, but it'll make the patch larger...

Comments?

Greetings,

Andres Freund
# Test for interactions of tuple freezing with dead, as well as recently-dead
# tuples using multixacts via FOR KEY SHARE.
setup
{
  DROP TABLE IF EXISTS tab_freeze;
  CREATE TABLE tab_freeze (
id int PRIMARY KEY,
name char(3),
x int);
  INSERT INTO tab_freeze VALUES (1, '111', 0);
  INSERT INTO tab_freeze VALUES (3, '333', 0);
}

teardown
{
  DROP TABLE tab_freeze;
}

session "s1"
step "s1_begin" { BEGIN; }
step "s1_update"{ UPDATE tab_freeze SET x = x + 1 WHERE id = 3; }
step "s1_commit"{ COMMIT; }
step "s1_vacuum"{ VACUUM FREEZE tab_freeze; }
step "s1_selectone" {
BEGIN;
SET LOCAL enable_seqscan = false;
SET LOCAL enable_bitmapscan = false;
SELECT * FROM tab_freeze WHERE id = 3;
COMMIT;
}
step "s1_selectall" { SELECT * FROM tab_freeze ORDER BY name, id; }

session "s2"
step "s2_begin" { BEGIN; }
step "s2_key_share" { SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE; 
}
step "s2_commit"{ COMMIT; }
step "s2_vacuum"{ VACUUM FREEZE tab_freeze; }

session "s3"
step "s3_begin" { BEGIN; }
step "s3_key_share" { SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE; 
}
step "s3_commit"{ COMMIT; }
step "s3_vacuum"{ VACUUM FREEZE tab_freeze; }

# This permutation verfies that a previous bug
# https://postgr.es/m/e5711e62-8fdf-4dca-a888-c200bf6b5...@amazon.com
# https://postgr.es/m/20171102112019.33wb7g5wp4zpj...@alap3.anarazel.de
# is not reintroduced. We used to make wrong pruning / freezing
# decision for multixacts, which could lead to a) broken hot chains b)
# dead rows being revived.
permutation "s1_begin" "s2_begin" "s3_begin" # start transactions
   "s1_update" "s2_key_share" "s3_key_share" # have xmax be a multi with an 
updater, updater being oldest xid
   "s1_update" # create additional row version that has multis
   "s1_commit" "s2_commit" # commit both updater and share locker
   "s2_vacuum" # due to bug in freezing logic, we used to *not* prune updated 
row, and then froze it
   "s1_selectone" # if hot chain is broken, the row can't be found via index 
scan
   "s3_commit" # commit remaining open xact
   "s2_vacuum" # pruning / freezing in broken hot chains would unset xmax, 
reviving rows
   "s1_selectall" # show borkedness

-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Remove secondary checkpoint

2017-11-07 Thread Andres Freund
On 2017-11-07 14:12:19 -0500, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On 2017-11-07 17:57:31 +, Simon Riggs wrote:
> >> Remove secondary checkpoint
> 
> > FWIW, I don't think this should be applied without a pg_resetxlog
> > feature allowing to manually use older checkpoints.
> 
> Uh, what?  We have never before insisted on pg_resetxlog being able
> to use WAL across a WAL format change, and there have been many of
> those.

I think you misunderstand my point - I'm saying that pg_resetxlog should
be able to force the use of older checkpoints, basically as a fallback
to cases where the previous approach might actually have worked, not
that it needs to work across format changes.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Remove secondary checkpoint

2017-11-07 Thread Andres Freund
On 2017-11-07 17:57:31 +, Simon Riggs wrote:
> Remove secondary checkpoint
> 
> Previously server reserved WAL for last two checkpoints,
> which used too much disk space for small servers.
> 
> Bumps PG_CONTROL_VERSION

FWIW, I don't think this should be applied without a pg_resetxlog
feature allowing to manually use older checkpoints.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-04 Thread Andres Freund
On 2017-11-04 06:15:00 -0700, Andres Freund wrote:
> The current testcase, and I think the descriptions in the relevant
> threads, all actually fail to point out the precise way the bug is
> triggered.  As e.g. evidenced that the freeze-the-dead case appears to
> not cause any failures in !assertion builds even if the bug is present.

Trying to write up tests reproducing more of the issues in the area, I
think I might have found a third issue - although I'm not sure how
practically relevant it is:

When FreezeMultiXactId() decides it needs to create a new multi because
the old one is below the cutoff, that attempt can be defeated by the
multixact id caching. If the new multi has exactly the same members the
multixact id cache will just return the existing multi with the same
members. The cache will routinely be primed from the lookup of its
members.

I'm not yet sure how easily this can be hit in practice, because
commonly the multixact horizon should prevent a multi with all its
members living from being below the horizon. I found a situation where
that's not the case with the current bug, but I'm not sif that can
happen otherwise.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-04 Thread Andres Freund
On 2017-11-03 12:36:59 -0700, Peter Geoghegan wrote:
> Andres Freund <and...@anarazel.de> wrote:
> > Here's that patch.  I've stared at this some, and Robert did too. Robert
> > mentioned that the commit message might need some polish and I'm not
> > 100% sure about the error message texts yet.
> 
> The commit message should probably say that the bug involves the
> resurrection of previously dead tuples, which is different to there
> being duplicates because a constraint is not enforced because HOT chains
> are broken (that's a separate, arguably less serious problem).

The reason for that is that I hadn't yet quite figured out how the bug I
described in the commit message (and the previously committed testcase)
would cause that. I was planning to diagnose / experiment more with this
and write an email if I couldn't come up with an explanation.   The
committed test does *not* actually trigger that.

The reason I couldn't quite figure out how the problem triggers is that
the xmax removing branch in FreezeMultiXactId() can only be reached if
the multi is from before the cutoff - which it can't have been for a
single vacuum execution to trigger the bug, because that'd require the
multi to be running to falsely return RECENTLY_DEAD rather than DEAD (by
definition a multi can't be below the cutoff if running).

For the problem to occur I think vacuum has to be executed *twice*: The
first time through HTSV mistakenly returns RECENTLY_DEAD preventing the
tuple from being pruned. That triggers FreezeMultiXactId() to create a
*new* multi with dead members. At this point the database already is in
a bad state. Then in a second vacuum HTSV returns DEAD, but
 * Ordinarily, DEAD tuples would have 
been removed by
 * heap_page_prune(), but it's possible 
that the tuple
 * state changed since 
heap_page_prune() looked.  In
 * particular an INSERT_IN_PROGRESS 
tuple could have
 * changed to DEAD if the inserter 
aborted.  So this
 * cannot be considered an error 
condition.
 *
..
if (HeapTupleIsHotUpdated() ||
HeapTupleIsHeapOnly())
{
nkeep += 1;

prevents the tuple from being removed. If now the multi xmax is below
the xmin horizon it triggers
/*
 * If the xid is older than the cutoff, it has to have 
aborted,
 * otherwise the tuple would have gotten pruned away.
 */
if (TransactionIdPrecedes(xid, cutoff_xid))
{
if (TransactionIdDidCommit(xid))
elog(ERROR, "can't freeze committed 
xmax");
*flags |= FRM_INVALIDATE_XMAX;
in FreezeMultiXact. Without the new elog, this then causes xmax to be
removed, reviving the tuple.


The current testcase, and I think the descriptions in the relevant
threads, all actually fail to point out the precise way the bug is
triggered.  As e.g. evidenced that the freeze-the-dead case appears to
not cause any failures in !assertion builds even if the bug is present.


The good news is that the error checks I added in the patch upthread
prevent all of this from happening, even though I'd not yet understood
the mechanics fully - it's imnsho pretty clear that we need to be more
paranoid in production builds around this.   A bunch of users that
triggered largely "invisible" corruption (the first vacuum described
above) will possibly run into one of these elog()s, but that seems far
preferrable to making the corruption a lot worse.


I think unfortunately the check + elog() in the
HeapTupleIsHeapOnly())
{
nkeep += 1;

/*
 * If this were to happen for a 
tuple that actually
 * needed to be frozen, we'd be 
in trouble, because
 * it'd leave a tuple below the 
relation's xmin
 * horizon alive.
 */
if 
(heap_tuple_needs_freeze(tuple.t_data, FreezeLimit,
   

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-03 Thread Andres Freund


On November 4, 2017 1:22:04 AM GMT+05:30, Alvaro Herrera 
<alvhe...@alvh.no-ip.org> wrote:
>Peter Geoghegan wrote:
>> Andres Freund <and...@anarazel.de> wrote:
>
>> > Staring at the vacuumlazy hunk I think I might have found a related
>bug:
>> > heap_update_tuple() just copies the old xmax to the new tuple's
>xmax if
>> > a multixact and still running.  It does so without verifying
>liveliness
>> > of members.  Isn't that buggy? Consider what happens if we have
>three
>> > blocks: 1 has free space, two is being vacuumed and is locked,
>three is
>> > full and has a tuple that's key share locked by a live tuple and is
>> > updated by a dead xmax from before the xmin horizon. In that case
>afaict
>> > the multi will be copied from the third page to the first one. 
>Which is
>> > quite bad, because vacuum already processed it, and we'll set
>> > relfrozenxid accordingly.  I hope I'm missing something here?
>> 
>> Can you be more specific about what you mean here? I think that I
>> understand where you're going with this, but I'm not sure.
>
>He means that the tuple that heap_update moves to page 1 (which will no
>longer be processed by vacuum) will contain a multixact that's older
>than relminmxid -- because it is copied unchanged by heap_update
>instead
>of properly checking against age limit.

Right. That, or a member xid below relminxid. I think both scenarios are 
possible.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-03 Thread Andres Freund
On 2017-11-02 06:05:51 -0700, Andres Freund wrote:
> Hi,
> 
> On 2017-11-02 13:49:47 +0100, Alvaro Herrera wrote:
> > Andres Freund wrote:
> > > I think the problem is on the pruning, rather than the freezing side. We
> > > can't freeze a tuple if it has an alive predecessor - rather than
> > > weakining this, we should be fixing the pruning to not have the alive
> > > predecessor.
> > 
> > I gave a look at HTSV back then, but I didn't find what the right tweak
> > was, but then I only tried changing the return value to DEAD and
> > DELETE_IN_PROGRESS; the thought of selecting DEAD or RECENTLY_DEAD based
> > on OldestXmin didn't occur to me ... I was thinking that the fact that
> > there were live lockers meant that the tuple could not be removed,
> > obviously failing to notice that the subsequent versions of the tuple
> > would be good enough.
> 
> I'll try to write up a commit based on that idea. I think there's some
> comment work needed too, Robert and I were both confused by a few
> things.
> I'm unfortunately travelling atm - it's evening here, and I'll flying
> back to the US all Saturday. I'm fairly sure I'll be able to come up
> with a decent patch tomorrow, but I'll need review and testing by
> others.

Here's that patch.  I've stared at this some, and Robert did too. Robert
mentioned that the commit message might need some polish and I'm not
100% sure about the error message texts yet.

I'm not yet convinced that the new elog in vacuumlazy can never trigger
- but I also don't think we want to actually freeze the tuple in that
case.

Staring at the vacuumlazy hunk I think I might have found a related bug:
heap_update_tuple() just copies the old xmax to the new tuple's xmax if
a multixact and still running.  It does so without verifying liveliness
of members.  Isn't that buggy? Consider what happens if we have three
blocks: 1 has free space, two is being vacuumed and is locked, three is
full and has a tuple that's key share locked by a live tuple and is
updated by a dead xmax from before the xmin horizon. In that case afaict
the multi will be copied from the third page to the first one.  Which is
quite bad, because vacuum already processed it, and we'll set
relfrozenxid accordingly.  I hope I'm missing something here?

Greetings,

Andres Freund
>From 774376dc8f7ed76657660e6beb0f5191d1bdaae4 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Fri, 3 Nov 2017 07:52:29 -0700
Subject: [PATCH] Fix pruning of locked and updated tuples.

Previously it was possible that a tuple was not pruned during vacuum,
even though it's update xmax (i.e. the updating xid in a multixact
with both key share lockers and an updater) was below the cutoff
horizon.

As the freezing code assumed, rightly so, that that's not supposed to
happen, xmax would be preserved (as a member of a new multixact or
xmax directly). That causes two problems: For one the tuple is below
the xmin horizon, which can cause problems if the clog is truncated or
once there's an xid wraparound. The bigger problem is that that will
break HOT chains, leading to index lookups failing, which in turn can
lead to constraints being violated.

Fix the problem by recognizing that tuples can be DEAD instead of
RECENTLY_DEAD, even if the multixactid has alive members, if the
update_xid is below the xmin horizon. That's safe because newer
versions of the tuple will contain the locking xids.

As the wrong freezing of xmax can cause data corruption, extend sanity
checks and promote them to elogs rather than assertions.

Per-Discussion: Robert Haas and Freund
Reported-By: Daniel Wood
Discussion:
https://postgr.es/m/e5711e62-8fdf-4dca-a888-c200bf6b5...@amazon.com
https://postgr.es/m/20171102112019.33wb7g5wp4zpj...@alap3.anarazel.de
---
 src/backend/access/heap/heapam.c  | 34 --
 src/backend/commands/vacuumlazy.c | 13 +
 src/backend/utils/time/tqual.c| 53 +++
 src/test/isolation/isolation_schedule |  1 +
 4 files changed, 62 insertions(+), 39 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 765750b8743..9b963e0cd0d 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6412,7 +6412,8 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
 			 */
 			if (TransactionIdPrecedes(xid, cutoff_xid))
 			{
-Assert(!TransactionIdDidCommit(xid));
+if (TransactionIdDidCommit(xid))
+	elog(ERROR, "can't freeze committed xmax");
 *flags |= FRM_INVALIDATE_XMAX;
 xid = InvalidTransactionId; /* not strictly necessary */
 			}
@@ -6484,6 +6485,8 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
 		{
 			TransactionId xid = members[i].xid;
 
+			Assert(TransactionIdIsValid(xid));
+
 			/*
 			 * It's an upda

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-02 Thread Andres Freund
Hi,

On 2017-11-02 13:49:47 +0100, Alvaro Herrera wrote:
> Andres Freund wrote:
> > I think the problem is on the pruning, rather than the freezing side. We
> > can't freeze a tuple if it has an alive predecessor - rather than
> > weakining this, we should be fixing the pruning to not have the alive
> > predecessor.
> 
> I gave a look at HTSV back then, but I didn't find what the right tweak
> was, but then I only tried changing the return value to DEAD and
> DELETE_IN_PROGRESS; the thought of selecting DEAD or RECENTLY_DEAD based
> on OldestXmin didn't occur to me ... I was thinking that the fact that
> there were live lockers meant that the tuple could not be removed,
> obviously failing to notice that the subsequent versions of the tuple
> would be good enough.

I'll try to write up a commit based on that idea. I think there's some
comment work needed too, Robert and I were both confused by a few
things.
I'm unfortunately travelling atm - it's evening here, and I'll flying
back to the US all Saturday. I'm fairly sure I'll be able to come up
with a decent patch tomorrow, but I'll need review and testing by
others.


> > If the update xmin is actually below the cutoff we can remove the tuple
> > even if there's live lockers - the lockers will also be present in the
> > newer version of the tuple.  I verified that for me that fixes the
> > problem. Obviously that'd require some comment work and more careful
> > diagnosis.
> 
> Sounds good.
> 
> I'll revert those commits then, keeping the test, and then you can
> commit your change.  OK?

Generally that sounds good - but you can't keep the testcase in without
the new fix - the buildfarm would immediately turn red. I guess the best
thing would be to temporarily remove it from the schedule?


Do we care about people upgrading to unreleased versions? We could do
nothing, document it in the release notes, or ???


Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-02 Thread Andres Freund
Hi,

On 2017-09-28 14:47:53 +, Alvaro Herrera wrote:
> Fix freezing of a dead HOT-updated tuple
>
> Vacuum calls page-level HOT prune to remove dead HOT tuples before doing
> liveness checks (HeapTupleSatisfiesVacuum) on the remaining tuples.  But
> concurrent transaction commit/abort may turn DEAD some of the HOT tuples
> that survived the prune, before HeapTupleSatisfiesVacuum tests them.
> This happens to activate the code that decides to freeze the tuple ...
> which resuscitates it, duplicating data.
>
> (This is especially bad if there's any unique constraints, because those
> are now internally violated due to the duplicate entries, though you
> won't know until you try to REINDEX or dump/restore the table.)
>
> One possible fix would be to simply skip doing anything to the tuple,
> and hope that the next HOT prune would remove it.  But there is a
> problem: if the tuple is older than freeze horizon, this would leave an
> unfrozen XID behind, and if no HOT prune happens to clean it up before
> the containing pg_clog segment is truncated away, it'd later cause an
> error when the XID is looked up.
>
> Fix the problem by having the tuple freezing routines cope with the
> situation: don't freeze the tuple (and keep it dead).  In the cases that
> the XID is older than the freeze age, set the HEAP_XMAX_COMMITTED flag
> so that there is no need to look up the XID in pg_clog later on.

I think this is the wrong fix - the assumption that ctid chains can be
validated based on the prev-xmax = cur-xmin is fairly ingrained into the
system, and we shouldn't just be breaking it. The need to later
lobotomize the checks, in a5736bf754, is some evidence of that.

I spent some time discussing this with Robert today (with both of us
alternating between feeling the other and ourselves as stupid), and the
conclusion I think is that the problem is on the pruning, rather than
the freezing side.

FWIW, I don't think the explanation in the commit message of how the
problem triggers is actually correct - the testcase you added doesn't
have any transactions concurrently committing / aborting when
crashing. Rather the problem is that the liveliness checks for freezing
is different from the ones for pruning. HTSV considers xmin
RECENTLY_DEAD when there's a multi xmax with at least one alive locker,
whereas pruning thinks it has to do something because there's the member
xid below the cutoff. No concurrent activity is needed to trigger that.

I think the problem is on the pruning, rather than the freezing side. We
can't freeze a tuple if it has an alive predecessor - rather than
weakining this, we should be fixing the pruning to not have the alive
predecessor.

The relevant logic in HTSV is:

if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
{
TransactionId xmax;

if (MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), 
false))
{
/* already checked above */
Assert(!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask));

xmax = HeapTupleGetUpdateXid(tuple);

/* not LOCKED_ONLY, so it has to have an xmax */
Assert(TransactionIdIsValid(xmax));

if (TransactionIdIsInProgress(xmax))
return HEAPTUPLE_DELETE_IN_PROGRESS;
else if (TransactionIdDidCommit(xmax))
/* there are still lockers around -- can't 
return DEAD here */
return HEAPTUPLE_RECENTLY_DEAD;
/* updating transaction aborted */
return HEAPTUPLE_LIVE;

with the problematic branch being the TransactionIdDidCommit()
case. Rather than unconditionally returning HEAPTUPLE_RECENTLY_DEAD, we
should test the update xid against OldestXmin and return DEAD /
RECENTLY_DEAD according to that.

If the update xmin is actually below the cutoff we can remove the tuple
even if there's live lockers - the lockers will also be present in the
newer version of the tuple.  I verified that for me that fixes the
problem. Obviously that'd require some comment work and more careful
diagnosis.


I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in
the face of previously corrupted tuple chains and pg_upgraded clusters -
it can lead to tuples being considered related, even though they they're
from entirely independent hot chains. Especially when upgrading 9.3 post
your fix, to current releases.


In short, I think the two commits should be reverted, and replaced with
a fix along what I'm outlining above.

There'll be some trouble for people that upgraded to an unreleased
version, but I don't really see what we could do about that.

I could be entirely wrong - I've been travelling for the last two weeks
and my brain is somewhat more fried than usual.

Regards,

Andres


-- 
Sent via pgsql-committers mailing list 

Re: [COMMITTERS] pgsql: Add pg_noinline macro to c.h.

2017-10-13 Thread Andres Freund
On 2017-10-13 18:32:08 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Add pg_noinline macro to c.h.
> 
> I think you want this to be parenthesized so that pgindent doesn't
> go nuts when you use it.  At least, that was the reason why
> pg_attribute_noreturn() has parens.  Even if the current version
> of pgindent doesn't have that problem

The current pgindent actually generates *better* layout with pg_noinline
rather than pg_noinline().  The latter forces a linebreak both in
declarations and definitions like:

static pg_noinline()
HeapTuple SearchCatCacheMiss(CatCache *cache,

static pg_noinline()
HeapTuple
SearchCatCacheMiss(CatCache *cache,


One difference to pg_attribute_noreturn() is that for function
definitions - and noinline needs to be present there IME - you can't put
it after the parameter list.


> I would argue that consistency demands that you spell this 
> pg_attribute_noinline().

Hm, I personally find inline vs pg_noinline more aesthetically pleasing.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Improve sys/catcache performance.

2017-10-13 Thread Andres Freund
Improve sys/catcache performance.

The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
   more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
   entries - instead store them as a Datum array. This also allows to
   get rid of having to build dummy tuples for negative & list
   entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so imortant entries are more
   likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
   specific number of attributes allows to unroll loops and avoid
   other nkeys dependant initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
   greatly reduces cache unnecessary cpu cache misses.
6) Split of the cache-miss case from the hash lookup, reducing stack
   allocations etc in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
   piece.

This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.

I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
  shows up in profiles. Unfortunately it's not easy to do so safely as
  an entry's memory location can change at various times, which
  doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
  but the win isn't big and the code for it is ugly, because the
  tuples have to be freed as well.
- add more proper functions, rather than macros for
  SearchSysCacheCopyN etc., but right now they don't show up in
  profiles.

The reason the macro wrapper for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside.  That might be a good idea
anyway, but it's for another day.

Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: 
https://postgr.es/m/20170914061207.zxotvyopetm7l...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/141fd1b66ce6e3d10518d66d4008bd368f1505fd

Modified Files
--
src/backend/utils/cache/catcache.c | 692 +
src/backend/utils/cache/syscache.c |  49 ++-
src/include/utils/catcache.h   | 122 ---
src/include/utils/syscache.h   |  23 +-
src/tools/pgindent/typedefs.list   |   2 +
5 files changed, 608 insertions(+), 280 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add pg_noinline macro to c.h.

2017-10-13 Thread Andres Freund
Add pg_noinline macro to c.h.

Forcing a function not to be inlined can be useful if it's the
slow-path of a performance critical function, or should be visible in
profiles to allow for proper cost attribution.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170914061207.zxotvyopetm7l...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/a0247e7a11bb9f5fd55694b594a3906b7bd05881

Modified Files
--
src/include/c.h | 16 
1 file changed, 16 insertions(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Force "restrict" not to be used when compiling with xlc.

2017-10-13 Thread Andres Freund
Force "restrict" not to be used when compiling with xlc.

Per buildfarm animal Hornet and followup manual testing by Noah Misch,
it appears xlc miscompiles code using "restrict" in at least some
cases. Allow disabling restrict usage with FORCE_DISABLE_RESTRICT=yes
in template files, and do so for aix/xlc.

Author: Andres Freund and Tom Lane
Discussion: https://postgr.es/m/1820.1507918...@sss.pgh.pa.us

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/d133982d598c7e6208d16cb4fc0b552151796603

Modified Files
--
configure| 6 +-
configure.in | 6 +-
src/template/aix | 4 
3 files changed, 14 insertions(+), 2 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.

2017-10-13 Thread Andres Freund
On 2017-10-13 14:19:22 -0400, Tom Lane wrote:
> > So it'd probably better to introduce a FORCE_DISABLE_RESTRICT=yes, set
> > at the same place, that's then tested before running the relevant
> > configure check?
> 
> +1.  I think you don't actually have to skip the configure check,
> and there might be some value in letting it carry on normally
> (so that "restrict" is set properly).  We'd just want it to affect
> what pg_restrict gets defined as.  Something like

> if test "$ac_cv_c_restrict" = "no" -o "x$FORCE_DISABLE_RESTRICT" = "xyes"; 
> then
>   pg_restrict=""
> else
>   pg_restrict="$ac_cv_c_restrict"

Yea, that works. Will make it so.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.

2017-10-13 Thread Andres Freund
On 2017-10-13 13:48:07 -0400, Tom Lane wrote:
> I wrote:
> > Anyway, I will go make the sizeof() usages consistent, just to satisfy
> > my own uncontrollable neatnik-ism.  Assuming that hornet stays red,
> > which is probable, we should disable "restrict" for that compiler.
> 
> As expected, that didn't fix it.  Andres, are you going to put in
> a disable?  Do we know exactly what we want to test for that?

A easiest way to do this would be to put something like
CPPFLAGS="$CPPFLAGS -Dpg_restrict" into the existing
if test "$GCC" != yes ; then
block in a/src/template/linux. But that'd probably result in "macro
redefined" warnings or somesuch.

So it'd probably better to introduce a FORCE_DISABLE_RESTRICT=yes, set
at the same place, that's then tested before running the relevant
configure check?

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.

2017-10-13 Thread Andres Freund
Hi,

On 2017-10-13 00:02:47 -0700, Noah Misch wrote:
> I hacked psql to call PQtrace() and ran "psql -Xc 'select true'" in the
> defective configuration and in a working x64 GNU/Linux configuration.  I've
> attached both PQtrace() products.

Thanks!


> To backend> Msg Q
> To backend> "select true"
> To backend> Msg complete, length 17
> From backend> T
> From backend (#4)> 17
> From backend (#2)> 1
> From backend> "bool"
> From backend (#4)> 1
> From backend (#2)> 0
> From backend (#4)> 1140850688
> From backend (#2)> 2816
> From backend (#4)> 16777216
> From backend (#2)> 372
> From backend> D
> From backend (#4)> 11
> From backend> C
> From backend (#4)> 13
> From backend> "SELECT 1"
> From backend> Z
> From backend (#4)> 5
> From backend> Z
> From backend (#4)> 5
> From backend> I
> To backend> Msg X
> To backend> Msg complete, length 5

> To backend> Msg Q
> To backend> "select true"
> To backend> Msg complete, length 17
> From backend> T
> From backend (#4)> 29
> From backend (#2)> 1
> From backend> "bool"
> From backend (#4)> 0
> From backend (#2)> 0
> From backend (#4)> 16
> From backend (#2)> 1
> From backend (#4)> -1
> From backend (#2)> 0
> From backend> D
> From backend (#4)> 11
> From backend (#2)> 1
> From backend (#4)> 1
> From backend (1)> t
> From backend> C
> From backend (#4)> 13
> From backend> "SELECT 1"
> From backend> Z
> From backend (#4)> 5
> From backend> Z
> From backend (#4)> 5
> From backend> I
> To backend> Msg X
> To backend> Msg complete, length 5

That's certainly quite weird. I can't immediately pinpoint where the bug
is. I initially thought that the StringInfo's lengths might be wrong,
but that doesn'tqutie seem to make sense. Looks a bit like there's just
garbless mess in there...  Will have another look when I don't have to
force my eyes to stay open.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.

2017-10-13 Thread Andres Freund
On 2017-10-12 19:35:36 -0700, Noah Misch wrote:
> On Thu, Oct 12, 2017 at 04:08:44PM -0700, Andres Freund wrote:
> > So we've two animals (hornet, sungazer) that are:
> > #define SIZEOF_SIZE_T 8
> > #define WORDS_BIGENDIAN 1
> > #define restrict __restrict
> > 
> > one compiled with xlc that fails and one with gcc that succeeds. I'm
> > hesitant to reach for that, but I wonder if there's a compiler
> > bug. Alternatively there could be some undefined behaviour here that
> > only triggers on xlc 64bit, but I'm not quite seeing it.
> > 
> > Noah, any chance you could force restrict to off on that animal?
> 
> I can confirm it allows "make check" to pass.  Specifically, I did this
> against commit 91d5f1a:
> 
> --- src/include/pg_config.h~2017-10-12 18:11:33.0 -0700
> +++ src/include/pg_config.h 2017-10-12 18:22:34.0 -0700
> @@ -929 +929 @@
> - #define pg_restrict __restrict
> + #define pg_restrict
> @@ -934 +934 @@
> - #define restrict __restrict
> + #define restrict
> 
> I have no reason to believe this is specific to hornet's installation, so I
> recommend against altering hornet's configuration.  It's too likely that the
> next xlc user will need to do the same thing.

Yea, I wasn't trying to propose that - I just thought it'd be easier to
narrow down with access to the machine than 6h cycle buildfarm
debugging.


> > Otherwise I can push a platform fix that disables it.
> 
> This sounds reasonable.

I'm not getting a great vibe about the aix/xlc quality in this thread :(


Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.

2017-10-12 Thread Andres Freund
On 2017-10-12 20:06:32 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> >> fe-connect.c:2382:6: warning: implicit declaration of function 
> >> 'getpeereid' [-Wimplicit-function-declaration]
> >> Looks like we're missing
> >> #include 
> 
> > Hm, it got removed as part of
> > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9e3755ecb2d058f7d123dd35a2e1784006190962
> > but that's not an explanation, because
> 
> Nope, because that's quite old.

Right. I'd mentioned that it's *not* that commit, even though it
initially looked suspicious.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.

2017-10-12 Thread Andres Freund
On 2017-10-12 16:08:44 -0700, Andres Freund wrote:
> wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes 
> -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels 
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv 
> -fexcess-precision=standard -g -O2 -I../../../src/include-c -o auth.o 
> auth.c
> auth.c: In function 'auth_peer':
> auth.c:2002:2: warning: implicit declaration of function 'getpeereid' 
> [-Wimplicit-function-declaration]
>   if (getpeereid(port->sock, , ) != 0)
>   ^
> wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes 
> -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels 
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv 
> -fexcess-precision=standard -g -O2 -D_REENTRANT -D_THREAD_SAFE 
> -D_POSIX_PTHREAD_SEMANTICS  -DFRONTEND -DUNSAFE_STAT_OK -I. 
> -I../../../src/include   -I../../../src/port -I../../../src/port 
> -DSO_MAJOR_VERSION=5  -c -o fe-connect.o fe-connect.c
> fe-connect.c: In function 'PQconnectPoll':
> fe-connect.c:2382:6: warning: implicit declaration of function 'getpeereid' 
> [-Wimplicit-function-declaration]
>   if (getpeereid(conn->sock, , ) != 0)
>   ^
> 
> Looks like we're missing
> #include 

Hm, it got removed as part of
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9e3755ecb2d058f7d123dd35a2e1784006190962
but that's not an explanation, because
c.h includes sys/types.h. Which according to IBM's docs
https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.basetrf1/getpeereid.htm
is the right thing to include.  Given that xlc doesn't complain, I'll
just assume this is some issue with the headers gcc uses on aix, but I'm
far from confident.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Use C99 restrict via pg_restrict, rather than restrict directly.

2017-10-12 Thread Andres Freund
Use C99 restrict via pg_restrict, rather than restrict directly.

Unfortunately using 'restrict' plainly causes problems with MSVC,
which supports restrict only as '__restrict'. Defining 'restrict' to
'__restrict' unfortunately causes a conflict with MSVC's usage of
__declspec(restrict) in headers.

Therefore define pg_restrict to the appropriate keyword instead, and
replace existing usages.

This replaces the temporary workaround introduced in 36b4b91ba078.

Author: Andres Freund
Discussion: https://postgr.es/m/2656.1507830...@sss.pgh.pa.us

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/91d5f1a4a3e8aea2a6488243bac55806160408fb

Modified Files
--
configure | 107 --
configure.in  |  15 +-
src/include/libpq/pqformat.h  |  24 +-
src/include/pg_config.h.in|   4 ++
src/include/pg_config.h.win32 |  22 -
5 files changed, 101 insertions(+), 71 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.

2017-10-12 Thread Andres Freund
On 2017-10-12 10:44:58 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Improve performance of SendRowDescriptionMessage.
> 
> One or another of these patches has broken buildfarm member hornet.
> Apparently, it's transmitting incorrectly-formatted RowDescription
> messages.

This is curious.

Buildfarm animal hornet has failed. It's ppc64 compiled with xlc 12.1,
64 bit:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet=2017-10-12%2022%3A14%3A41

Buildfarm animal mandrill succeeded. It's ppc64 compiled with xlc 12.1,
32 bit:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill=2017-10-12%2018%3A35%3A47

Buildfarm animal sungazer succeeded. It's ppc64 compiled with gcc 4.8,
64 bit:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer=2017-10-12%2008%3A21%3A29

All of these are up to at least 31079a4.

So we've two animals (hornet, sungazer) that are:
#define SIZEOF_SIZE_T 8
#define WORDS_BIGENDIAN 1
#define restrict __restrict

one compiled with xlc that fails and one with gcc that succeeds. I'm
hesitant to reach for that, but I wonder if there's a compiler
bug. Alternatively there could be some undefined behaviour here that
only triggers on xlc 64bit, but I'm not quite seeing it.

Noah, any chance you could force restrict to off on that animal?
Otherwise I can push a platform fix that disables it.


Entirely independent of this, both machines report some interesting
warnings:
Hornet:
xlc_r -D_LARGE_FILES=1  -qnoansialias -g -O2 -qmaxmem=16384 -qsrcmsg  
-L../../src/port -L../../src/common  
-Wl,-blibpath:'/home/nm/farm/xlc64/HEAD/inst/lib:/usr/lib:/lib' 
-L../../src/port -lpgport  -Wl,-bnoentry -Wl,-H512 -Wl,-bM:SRE -o timetravel.so 
timetravel.o -Wl,-bE:timetravel.exp -Wl,-bI:../../src/backend/postgres.imp
ld: 0711-224 WARNING: Duplicate symbol: .pg_strcasecmp
ld: 0711-224 WARNING: Duplicate symbol: .pg_ascii_tolower
ld: 0711-224 WARNING: Duplicate symbol: .pg_ascii_toupper
ld: 0711-224 WARNING: Duplicate symbol: .pg_tolower
ld: 0711-224 WARNING: Duplicate symbol: .pg_toupper
ld: 0711-224 WARNING: Duplicate symbol: .pg_strncasecmp

Hm. Seems we've some workaround in some platforms:
# src/template/win32

# --allow-multiple-definition is required to link pg_dump because it finds
# pg_toupper() etc. in both libpq and pgport

But that, uh, seems not good?

Sungazer:
wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes 
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels 
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv 
-fexcess-precision=standard -g -O2 zic.o  -L../../src/port -L../../src/common  
-Wl,-blibpath:'/home/nm/farm/gcc64/HEAD/inst/lib:/usr/lib:/lib'  -lpgcommon 
-lpgport -lssl -lcrypto -lz -lreadline -lld -lm  -o zic
ld: 0711-224 WARNING: Duplicate symbol: .bcopy
ld: 0711-224 WARNING: Duplicate symbol: .memmove

Hm.

wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes 
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels 
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv 
-fexcess-precision=standard -g -O2 -I../../../src/include-c -o auth.o auth.c
auth.c: In function 'auth_peer':
auth.c:2002:2: warning: implicit declaration of function 'getpeereid' 
[-Wimplicit-function-declaration]
  if (getpeereid(port->sock, , ) != 0)
  ^
wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes 
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels 
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv 
-fexcess-precision=standard -g -O2 -D_REENTRANT -D_THREAD_SAFE 
-D_POSIX_PTHREAD_SEMANTICS  -DFRONTEND -DUNSAFE_STAT_OK -I. 
-I../../../src/include   -I../../../src/port -I../../../src/port 
-DSO_MAJOR_VERSION=5  -c -o fe-connect.o fe-connect.c
fe-connect.c: In function 'PQconnectPoll':
fe-connect.c:2382:6: warning: implicit declaration of function 'getpeereid' 
[-Wimplicit-function-declaration]
  if (getpeereid(conn->sock, , ) != 0)
  ^

Looks like we're missing
#include 


wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes 
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels 
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv 
-fexcess-precision=standard -g -O2 -I../../../../src/include-c -o 
pg_locale.o pg_locale.c
pg_locale.c: In function 'wchar2char':
pg_locale.c:1648:3: warning: implicit declaration of function 'wcstombs_l' 
[-Wimplicit-function-declaration]
   result = wcstombs_l(to, from, tolen, locale->info.lt);
   ^
This is curious. If I'm interpreting this correctly PGAC_FUNC_WCSTOMBS_L
fails to find a declaration, but AC_CHECK_FUNCS finds wcstombs_l, so we
happily use it.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric

2017-10-12 Thread Andres Freund
Hi,

On 2017-10-12 11:03:34 -0700, Andres Freund wrote:
> On 2017-10-12 13:55:07 -0400, Tom Lane wrote:
> > Or, if you insist on having it, we're going to have to go the pg_restrict
> > route.  I don't see why that means duplicating any configure logic: on
> > non-Windows we can use the autoconf probe and then write
> > "#define pg_restrict restrict".
> 
> Yea, that should work. I'll try to come up with a patch.

We can't do so unconditionally in c.h or such, because that'd again
cause conflicts with __declspec(restrict) on MSVC versions that don't
support restrict, because it'd require restrict to be defined empty.

But it's easy to do so in configure, and then have a separate definition
in pg_config.h.win32. Done so in the attached commit. It's slightly ugly
to have two definitions of restrict in pg_config.h.in, but whatever.

Greetings,

Andres Freund
>From 0ddc96424119682cf3b68c6d8d95ea894a60817a Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Thu, 12 Oct 2017 15:25:38 -0700
Subject: [PATCH] Use C99 restrict via pg_restrict, rather than restrict
 directly.

Unfortunately using 'restrict' plainly causes problems with MSVC,
which supports restrict only as '__restrict'. Defining 'restrict' to
'__restrict' unfortunately causes a conflict with MSVC's usage of
__declspec(restrict) in headers.

Therefore define pg_restrict to the appropriate keyword instead, and
replace existing usages.

This replaces the temporary workaround introduced in 36b4b91ba078.

Author: Andres Freund
Discussion: https://postgr.es/m/2656.1507830...@sss.pgh.pa.us
---
 configure | 107 --
 configure.in  |  15 +-
 src/include/libpq/pqformat.h  |  24 +-
 src/include/pg_config.h.in|   4 ++
 src/include/pg_config.h.win32 |  22 -
 5 files changed, 101 insertions(+), 71 deletions(-)

diff --git a/configure b/configure
index 910f0fc3736..cdcb3ceb0c8 100755
--- a/configure
+++ b/configure
@@ -11545,52 +11545,6 @@ _ACEOF
 ;;
 esac
 
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for C/C++ restrict keyword" >&5
-$as_echo_n "checking for C/C++ restrict keyword... " >&6; }
-if ${ac_cv_c_restrict+:} false; then :
-  $as_echo_n "(cached) " >&6
-else
-  ac_cv_c_restrict=no
-   # The order here caters to the fact that C++ does not require restrict.
-   for ac_kw in __restrict __restrict__ _Restrict restrict; do
- cat confdefs.h - <<_ACEOF >conftest.$ac_ext
-/* end confdefs.h.  */
-typedef int * int_ptr;
-	int foo (int_ptr $ac_kw ip) {
-	return ip[0];
-   }
-int
-main ()
-{
-int s[1];
-	int * $ac_kw t = s;
-	t[0] = 0;
-	return foo(t)
-  ;
-  return 0;
-}
-_ACEOF
-if ac_fn_c_try_compile "$LINENO"; then :
-  ac_cv_c_restrict=$ac_kw
-fi
-rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
- test "$ac_cv_c_restrict" != no && break
-   done
-
-fi
-{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_restrict" >&5
-$as_echo "$ac_cv_c_restrict" >&6; }
-
- case $ac_cv_c_restrict in
-   restrict) ;;
-   no) $as_echo "#define restrict /**/" >>confdefs.h
- ;;
-   *)  cat >>confdefs.h <<_ACEOF
-#define restrict $ac_cv_c_restrict
-_ACEOF
- ;;
- esac
-
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for printf format archetype" >&5
 $as_echo_n "checking for printf format archetype... " >&6; }
 if ${pgac_cv_printf_archetype+:} false; then :
@@ -12508,6 +12462,67 @@ $as_echo "#define LOCALE_T_IN_XLOCALE 1" >>confdefs.h
 
 fi
 
+# MSVC doesn't cope well with defining restrict to __restrict, the
+# spelling it understands, because it conflicts with
+# __declspec(restrict). Therefore we define pg_restrict to the
+# appropriate definition, which presumably won't conflict.
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for C/C++ restrict keyword" >&5
+$as_echo_n "checking for C/C++ restrict keyword... " >&6; }
+if ${ac_cv_c_restrict+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_cv_c_restrict=no
+   # The order here caters to the fact that C++ does not require restrict.
+   for ac_kw in __restrict __restrict__ _Restrict restrict; do
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+typedef int * int_ptr;
+	int foo (int_ptr $ac_kw ip) {
+	return ip[0];
+   }
+int
+main ()
+{
+int s[1];
+	int * $ac_kw t = s;
+	t[0] = 0;
+	return foo(t)
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  ac_cv_c_restrict=$ac_kw
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ test "$ac_cv_c_restrict" != no && break
+   done
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_restrict" &

Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric

2017-10-12 Thread Andres Freund
Hi,


On 2017-10-12 13:55:07 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On 2017-10-12 11:30:00 -0400, Tom Lane wrote:
> >> I don't actually see why you need a #define at all --- "restrict" is the
> >> standard spelling of the keyword no?
> 
> > It is, but a lot of compilers name it differently, e.g. __restrict in
> > the case of msvc.
> 
> It's 2017 and they're still not C99 compliant?  Oh well.

...


> TBH, I really doubt that restrict buys us enough performance to justify
> dealing with this.  I'd just revert that change altogether.

It's quite noticeable, and there's plenty of other case where it seems
likely to be beneficial.


> Or, if you insist on having it, we're going to have to go the pg_restrict
> route.  I don't see why that means duplicating any configure logic: on
> non-Windows we can use the autoconf probe and then write
> "#define pg_restrict restrict".

Yea, that should work. I'll try to come up with a patch.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric

2017-10-12 Thread Andres Freund
On 2017-10-12 11:30:00 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > I've temporarily silenced that error by moving the stdlib.h include
> > before the definition of restrict, but that seems fairly fragile. I
> > primarily wanted to see whether there's other problems.  At least thrips
> > is is now happy.
> 
> > I see a number of options to continue:
> > - only define restrict on msvc 2013+ - for some reason woodlouse didn't
> >   complain about this problem, but I'm very doubtful that that's
> >   actually reliable.
> > - rename restrict to pg_restrict. That's annoying because we'd have to
> >   copy the autoconf test.
> > - live with the hack of including stdlib.h early, in pg_config.h.win32.
> > - $better_idea
> 
> I don't actually see why you need a #define at all --- "restrict" is the
> standard spelling of the keyword no?

It is, but a lot of compilers name it differently, e.g. __restrict in
the case of msvc.


> I really do not like the stdlib.h hack: that amounts to assuming that
> only stdlib.h does or ever will contain declspec(restrict).  Maybe
> you could get away with that if you were applying it only to long-dead
> MSVC versions, but doing it "#if (_MSC_VER >= 1500)" is clearly going
> to break someday.

Yea, I dislike it quite a bit too. Unfortunately not even defining
restrict to empty as if it were unsupported looks viable to me :(

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric

2017-10-11 Thread Andres Freund
On 2017-10-11 21:59:53 -0700, Andres Freund wrote:
> That fixed that problem I think. But unfortunately since then another
> problem has been reported by some other animals, all with older msvc
> versions afaict (thrips - vs 2012, bowerbird - vs 2012).

Correction, thrips is vs 2010, not 2012.


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric

2017-10-11 Thread Andres Freund
On 2017-10-11 17:13:20 -0700, Andres Freund wrote:
> Hi,
> 
> On 2017-10-11 23:11:15 +0000, Andres Freund wrote:
> > Add configure infrastructure to detect support for C99's restrict.
> > 
> > Will be used in later commits improving performance for a few key
> > routines where information about aliasing allows for significantly
> > better code generation.
> > 
> > This allows to use the C99 'restrict' keyword without breaking C89, or
> > for that matter C++, compilers. If not supported it's defined to be
> > empty.
> 
> Woodlouse doesn't like this, erroring out with:
> C:\buildfarm\buildenv\HEAD\pgsql.build\src\include\libpq/pqformat.h(47): 
> error C2219: syntax error : type qualifier must be after '*' 
> (src/backend/access/common/printsimple.c) 
> [C:\buildfarm\buildenv\HEAD\pgsql.build\postgres.vcxproj]
> 
> It's MSVC being super peculiar about error checks. I think msvc is just
> confused by the pointer hiding typedef. Using some online MSVC (and
> other compilers) frontend:
> https://godbolt.org/g/TD3nmA
> 
> I confirmed that removing the pointer hiding typedef indeed resolves the
> isssue.
> 
> I can't quite decide whether msvc just has taste and dislikes pointer
> hiding typedefs as much as I do, or whether it's incredibly stupid ;)
> 
> I'll add a comment and use StringInfoData *.

That fixed that problem I think. But unfortunately since then another
problem has been reported by some other animals, all with older msvc
versions afaict (thrips - vs 2012, bowerbird - vs 2012). Those report
that the defining of restrict to __restrict interfers with some system
headers:
C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\include\stdlib.h(619): 
error C2485: '__restrict' : unrecognized extended attribute 
(src/backend/access/brin/brin.c) 
[c:\prog\bf\root\HEAD\pgsql.build\postgres.vcxproj]

Presumably that is because the headers contain __declspec(restrict) on
some function declarations.

I've temporarily silenced that error by moving the stdlib.h include
before the definition of restrict, but that seems fairly fragile. I
primarily wanted to see whether there's other problems.  At least thrips
is is now happy.

I see a number of options to continue:
- only define restrict on msvc 2013+ - for some reason woodlouse didn't
  complain about this problem, but I'm very doubtful that that's
  actually reliable.
- rename restrict to pg_restrict. That's annoying because we'd have to
  copy the autoconf test.
- live with the hack of including stdlib.h early, in pg_config.h.win32.
- $better_idea

Comments?

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Replace remaining uses of pq_sendint with pq_sendint{8, 16, 32}.

2017-10-11 Thread Andres Freund
Replace remaining uses of pq_sendint with pq_sendint{8,16,32}.

pq_sendint() remains, so extension code doesn't unnecessarily break.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/31079a4a8e66e56e48bad94d380fa6224e9ffa0d

Modified Files
--
contrib/hstore/hstore_io.c  |  8 +--
src/backend/access/common/printsimple.c | 18 +++
src/backend/access/common/printtup.c| 16 +++---
src/backend/access/transam/parallel.c   |  4 +-
src/backend/commands/async.c|  2 +-
src/backend/commands/copy.c |  8 +--
src/backend/libpq/auth.c|  2 +-
src/backend/replication/basebackup.c| 86 -
src/backend/replication/logical/proto.c | 20 
src/backend/replication/walreceiver.c   |  8 +--
src/backend/replication/walsender.c | 36 +++---
src/backend/tcop/fastpath.c |  4 +-
src/backend/tcop/postgres.c |  8 +--
src/backend/utils/adt/arrayfuncs.c  | 14 +++---
src/backend/utils/adt/date.c|  4 +-
src/backend/utils/adt/geo_ops.c |  4 +-
src/backend/utils/adt/int.c |  4 +-
src/backend/utils/adt/jsonb.c   |  2 +-
src/backend/utils/adt/nabstime.c| 10 ++--
src/backend/utils/adt/numeric.c | 14 +++---
src/backend/utils/adt/oid.c |  2 +-
src/backend/utils/adt/rangetypes.c  |  4 +-
src/backend/utils/adt/rowtypes.c|  8 +--
src/backend/utils/adt/tid.c |  6 +--
src/backend/utils/adt/timestamp.c   |  4 +-
src/backend/utils/adt/tsquery.c | 13 +++--
src/backend/utils/adt/tsvector.c|  6 +--
src/backend/utils/adt/txid.c|  2 +-
src/backend/utils/adt/varbit.c  |  2 +-
src/backend/utils/adt/xid.c |  4 +-
30 files changed, 160 insertions(+), 163 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Temporary attempt at a workaround for further MSVC restrict buil

2017-10-11 Thread Andres Freund
Temporary attempt at a workaround for further MSVC restrict build failures.

It appears some versions of msvc use __declspec(restrict) in stdlib.h
and subsidiary headers. Including those after defining 'restrict' to
'__restrict' doesn't work.  Try to get the buildfarm green to see
whether there's further problems, by including stdlib.h just before
said define.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/36b4b91ba07843406d5a30106facb59d8275c6de

Modified Files
--
src/include/pg_config.h.win32 | 5 +
1 file changed, 5 insertions(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Work around overly strict restrict checks by MSVC.

2017-10-11 Thread Andres Freund
Work around overly strict restrict checks by MSVC.

Apparently MSVC requires a * before a restrict in a variable
declaration, even if the adorned type already is a pointer, just via
typedef.

As reported by buildfarm animal woodlouse.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20171012001320.4putagiruueht...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/060b069984a69ff0255ce318f10681c553613bef

Modified Files
--
src/include/libpq/pqformat.h | 13 -
1 file changed, 8 insertions(+), 5 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.

2017-10-11 Thread Andres Freund
Improve performance of SendRowDescriptionMessage.

There's three categories of changes leading to better performance:
- Splitting the per-attribute part of SendRowDescriptionMessage into a
  v2 and a v3 version allows avoiding branches for every attribute.
- Preallocating the size of the buffer to be big enough for all
  attributes and then using pq_write* avoids unnecessary buffer
  size checks & resizing.
- Reusing a persistently allocated StringInfo for all
  SendRowDescriptionMessage() invocations avoids repeated allocations
  & reallocations.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/4c119fbcd49ba882791c7b99a1e934b985468e9f

Modified Files
--
src/backend/access/common/printtup.c | 146 ++-
src/backend/tcop/postgres.c  |  35 +++--
src/include/access/printtup.h|   4 +-
3 files changed, 138 insertions(+), 47 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric

2017-10-11 Thread Andres Freund
Hi,

On 2017-10-11 23:11:15 +, Andres Freund wrote:
> Add configure infrastructure to detect support for C99's restrict.
> 
> Will be used in later commits improving performance for a few key
> routines where information about aliasing allows for significantly
> better code generation.
> 
> This allows to use the C99 'restrict' keyword without breaking C89, or
> for that matter C++, compilers. If not supported it's defined to be
> empty.

Woodlouse doesn't like this, erroring out with:
C:\buildfarm\buildenv\HEAD\pgsql.build\src\include\libpq/pqformat.h(47): error 
C2219: syntax error : type qualifier must be after '*' 
(src/backend/access/common/printsimple.c) 
[C:\buildfarm\buildenv\HEAD\pgsql.build\postgres.vcxproj]

It's MSVC being super peculiar about error checks. I think msvc is just
confused by the pointer hiding typedef. Using some online MSVC (and
other compilers) frontend:
https://godbolt.org/g/TD3nmA

I confirmed that removing the pointer hiding typedef indeed resolves the
isssue.

I can't quite decide whether msvc just has taste and dislikes pointer
hiding typedefs as much as I do, or whether it's incredibly stupid ;)

I'll add a comment and use StringInfoData *.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Use one stringbuffer for all rows printed in printtup.c.

2017-10-11 Thread Andres Freund
Use one stringbuffer for all rows printed in printtup.c.

This avoids newly allocating, and then possibly growing, the
stringbuffer for every row. For wide rows this can substantially
reduce memory allocator overhead, at the price of not immediately
reducing memory usage after outputting an especially wide row.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/f2dec34e19d3969ddd616e671fe9a7b968bec812

Modified Files
--
src/backend/access/common/printtup.c | 46 
1 file changed, 25 insertions(+), 21 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Allow to avoid NUL-byte management for stringinfos and use in fo

2017-10-11 Thread Andres Freund
Allow to avoid NUL-byte management for stringinfos and use in format.c.

In a lot of the places having appendBinaryStringInfo() maintain a
trailing NUL byte wasn't actually meaningful, e.g. when appending an
integer which can contain 0 in one of its bytes.

Removing this yields some small speedup, but more importantly will be
more consistent when providing faster variants of pq_sendint etc.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/70c2d1be2b1e1efa8ef38a92b443fa290a9558dd

Modified Files
--
src/backend/lib/stringinfo.c | 21 -
src/backend/libpq/pqformat.c | 18 +-
src/include/lib/stringinfo.h |  8 
3 files changed, 37 insertions(+), 10 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add more efficient functions to pqformat API.

2017-10-11 Thread Andres Freund
Add more efficient functions to pqformat API.

There's three prongs to achieve greater efficiency here:

1) Allow reusing a stringbuffer across pq_beginmessage/endmessage,
   with the new pq_beginmessage_reuse/endmessage_reuse. This can be
   beneficial both because it avoids allocating the initial buffer,
   and because it's more likely to already have an correctly sized
   buffer.

2) Replacing pq_sendint() with pq_sendint$width() inline
   functions. Previously unnecessary and unpredictable branches in
   pq_sendint() were needed. Additionally the replacement functions
   are implemented more efficiently.  pq_sendint is now deprecated, a
   separate commit will convert all in-tree callers.

3) Add pq_writeint$width(), pq_writestring(). These rely on sufficient
   space in the StringInfo's buffer, avoiding individual space checks
   & potential individual resizing.  To allow this to be used for
   strings, expose mbutil.c's MAX_CONVERSION_GROWTH.

Followup commits will make use of these facilities.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/1de09ad8eb1fa673ee7899d6dfbb2b49ba204818

Modified Files
--
src/backend/libpq/pqformat.c   |  88 -
src/backend/utils/mb/mbutils.c |  11 ---
src/include/libpq/pqformat.h   | 168 -
src/include/mb/pg_wchar.h  |  11 +++
4 files changed, 208 insertions(+), 70 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric

2017-10-11 Thread Andres Freund
Add configure infrastructure to detect support for C99's restrict.

Will be used in later commits improving performance for a few key
routines where information about aliasing allows for significantly
better code generation.

This allows to use the C99 'restrict' keyword without breaking C89, or
for that matter C++, compilers. If not supported it's defined to be
empty.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/0b974dba2d6b5581ce422ed883209de46f313fb6

Modified Files
--
configure | 46 +++
configure.in  |  1 +
src/include/pg_config.h.in| 14 +
src/include/pg_config.h.win32 | 11 +++
4 files changed, 72 insertions(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Prevent idle in transaction session timeout from sometimes being

2017-10-11 Thread Andres Freund
Prevent idle in transaction session timeout from sometimes being ignored.

The previous coding in ProcessInterrupts() could lead to
idle_in_transaction_session_timeout being ignored, when
statement_timeout occurred earlier.

The problem was that ProcessInterrupts() would return before
processing the transaction timeout if QueryCancelPending was set while
QueryCancelHoldoffCount != 0 - which is the case when reading new
commands from the client. Ergo when the idle transaction timeout would
hit.

Fix that by removing the early return. Alternatively the transaction
timeout code could have been moved up, but that early return seems
like an issue that could hit other cases too.

Author: Lukas Fittl
Bug: #14821
Discussion:

https://www.postgresql.org/message-id/20170921010956.17345.61461%40wrigleys.postgresql.org

https://www.postgresql.org/message-id/CAP53PkxQnv3OWJpyNPGJYT62uY=n1=2cf_lpc6gvofnc0-g...@mail.gmail.com
Backpatch: 9.6-, where idle_in_transaction_session_timeout was introduced.

Branch
--
REL_10_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/61ace8fe7fe82dc04c1de493a414597989f05e56

Modified Files
--
src/backend/tcop/postgres.c | 32 +++-
1 file changed, 15 insertions(+), 17 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Prevent idle in transaction session timeout from sometimes being

2017-10-11 Thread Andres Freund
Prevent idle in transaction session timeout from sometimes being ignored.

The previous coding in ProcessInterrupts() could lead to
idle_in_transaction_session_timeout being ignored, when
statement_timeout occurred earlier.

The problem was that ProcessInterrupts() would return before
processing the transaction timeout if QueryCancelPending was set while
QueryCancelHoldoffCount != 0 - which is the case when reading new
commands from the client. Ergo when the idle transaction timeout would
hit.

Fix that by removing the early return. Alternatively the transaction
timeout code could have been moved up, but that early return seems
like an issue that could hit other cases too.

Author: Lukas Fittl
Bug: #14821
Discussion:

https://www.postgresql.org/message-id/20170921010956.17345.61461%40wrigleys.postgresql.org

https://www.postgresql.org/message-id/CAP53PkxQnv3OWJpyNPGJYT62uY=n1=2cf_lpc6gvofnc0-g...@mail.gmail.com
Backpatch: 9.6-, where idle_in_transaction_session_timeout was introduced.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/f676616651c83b14e1d879fbfabdd3ab2dc70bbe

Modified Files
--
src/backend/tcop/postgres.c | 32 +++-
1 file changed, 15 insertions(+), 17 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Prevent idle in transaction session timeout from sometimes being

2017-10-11 Thread Andres Freund
Prevent idle in transaction session timeout from sometimes being ignored.

The previous coding in ProcessInterrupts() could lead to
idle_in_transaction_session_timeout being ignored, when
statement_timeout occurred earlier.

The problem was that ProcessInterrupts() would return before
processing the transaction timeout if QueryCancelPending was set while
QueryCancelHoldoffCount != 0 - which is the case when reading new
commands from the client. Ergo when the idle transaction timeout would
hit.

Fix that by removing the early return. Alternatively the transaction
timeout code could have been moved up, but that early return seems
like an issue that could hit other cases too.

Author: Lukas Fittl
Bug: #14821
Discussion:

https://www.postgresql.org/message-id/20170921010956.17345.61461%40wrigleys.postgresql.org

https://www.postgresql.org/message-id/CAP53PkxQnv3OWJpyNPGJYT62uY=n1=2cf_lpc6gvofnc0-g...@mail.gmail.com
Backpatch: 9.6-, where idle_in_transaction_session_timeout was introduced.

Branch
--
REL9_6_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/0da46d75e31ddfa9180345a14d720814e36922fa

Modified Files
--
src/backend/tcop/postgres.c | 32 +++-
1 file changed, 15 insertions(+), 17 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add port/strnlen support to libpq and ecpg Makefiles.

2017-10-11 Thread Andres Freund
Hi,

On 2017-10-11 11:58:58 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Phew. This is a bit a sad state of affairs. The separate libpq logic for
> > getting pgport is presumably because of possibly different threading
> > flags and then because of the appropriate compiler/linker flags for a
> > shared library?
> 
> I don't see why threading would matter, but building with -fPIC or
> not is definitely an issue.

-pthread changes some "memory model" type assumptions by the compiler
too IIRC, not just linker stage things. In a non-threaded environment
the compiler is kinda free to invent phantom stores and such. It's
unlikely to matter for just pgbench, but ...


> I agree the PITA factor of the current approach keeps increasing.
> It sounds a bit silly to build libpgport three ways, but maybe
> we should just do that.

We already kinda are, just by copying things around ;)


> Or conceivably we should just build the FE version of libpgport.a
> with -fPIC and not worry about whether that loses some efficiency
> for client programs.  A lot of distros are effectively forcing
> that, or even -fPIE, anyway.

Hm.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add port/strnlen support to libpq and ecpg Makefiles.

2017-10-11 Thread Andres Freund
On 2017-10-11 15:28:16 +, Tom Lane wrote:
> Add port/strnlen support to libpq and ecpg Makefiles.
> 
> In the wake of fffd651e8, any makefile that pulls in snprintf.c
> from src/port/ needs to be prepared to pull in strnlen.c as well.
> Per buildfarm.

Thanks.


> Modified Files
> --
> src/interfaces/ecpg/compatlib/.gitignore  | 1 +
> src/interfaces/ecpg/compatlib/Makefile| 6 +++---
> src/interfaces/ecpg/ecpglib/.gitignore| 1 +
> src/interfaces/ecpg/ecpglib/Makefile  | 7 ---
> src/interfaces/ecpg/pgtypeslib/.gitignore | 1 +
> src/interfaces/ecpg/pgtypeslib/Makefile   | 7 ---
> src/interfaces/libpq/.gitignore   | 1 +
> src/interfaces/libpq/Makefile | 6 +++---
> 8 files changed, 18 insertions(+), 12 deletions(-)

Phew. This is a bit a sad state of affairs. The separate libpq logic for
getting pgport is presumably because of possibly different threading
flags and then because of the appropriate compiler/linker flags for a
shared library?

Wonder if we shouldn't have variants of libpqport that support threading
and shared libraries, but built centrally. libpgport{-shared}{-mt} or
such.  Formally speaking it's not quite right that we don't use
threading aware flags in pgbench for example. Some gcc platforms at
least pretty much assume everything is built with -pthread e.g.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Rewrite strnlen replacement implementation from 8a241792f96.

2017-10-10 Thread Andres Freund
On 2017-10-10 19:15:13 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Rewrite strnlen replacement implementation from 8a241792f96.
> 
> Hm, did you hand-edit the configure script and then forget to undo it?
> Cause what was committed was not right.

Yea, that was me testing the fallback :(.  I think I need holidays.


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.

2017-10-10 Thread Andres Freund
On 2017-10-10 18:10:15 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Here's a fix. Not quite sure whether we really need the
> > HAVE_DECL_STRNLEN test, added it for symmetry.
> 
> LGTM.

Thanks for checking.


> I think the DECL test is a good idea; the system definition
> might be a macro or otherwise weird, in which case we'd cause problems
> if we write an extern definition anyway.

I was thinking about protecting it with HAVE_STRNLEN rather than not
protecting it at all. But this works, so whatever ;)

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Rewrite strnlen replacement implementation from 8a241792f96.

2017-10-10 Thread Andres Freund
Rewrite strnlen replacement implementation from 8a241792f96.

The previous placement of the fallback implementation in libpgcommon
was problematic, because libpqport functions need strnlen
functionality.

Move replacement into libpgport. Provide strnlen() under its posix
name, instead of pg_strnlen(). Fix stupid configure bug, executing the
test only when compiled with threading support.

Author: Andres Freund
Discussion: https://postgr.es/m/e1e1gr2-0005fb...@gemulon.postgresql.org

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/fffd651e83ccbd6191a76be6ec7c6b1b27888fde

Modified Files
--
configure | 25 -
configure.in  |  6 +++---
src/backend/utils/mmgr/mcxt.c |  3 +--
src/common/string.c   | 20 
src/include/common/string.h   | 15 ---
src/include/pg_config.h.in|  4 
src/include/pg_config.h.win32 | 10 +++---
src/include/port.h|  4 
src/port/snprintf.c   |  4 +---
src/port/strnlen.c| 33 +
10 files changed, 77 insertions(+), 47 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.

2017-10-10 Thread Andres Freund
On 2017-10-10 17:34:17 -0400, Andrew Dunstan wrote:
> This test is governed by the test at line 946 of configure.in. You need
> to move it somewhere else.

Yea, sorry for the noise :(.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.

2017-10-10 Thread Andres Freund
On 2017-10-10 13:53:15 -0700, Andres Freund wrote:
> On 2017-10-10 16:51:39 -0400, Tom Lane wrote:
> > Andres Freund <and...@anarazel.de> writes:
> > > As far as I can tell it's still somehow using a configure from before
> > > the last commits:
> > 
> > No, it's pilot error.  The AC_CHECK_FUNCS call you added strnlen to
> > is only executed if
> > AS_IF([test "$enable_thread_safety" = yes -a "$PORTNAME" != "win32"],
> 
> Dear god, I'm daft. Dear god, m4 is a horrible lanuagage.

Here's a fix. Not quite sure whether we really need the
HAVE_DECL_STRNLEN test, added it for symmetry.

Afaict windows "always" had strnlen, so no need to meddle with the
windows build.

Will eat lunch and push, even if there's some futher adjustments, I'd
like to get Andrew's animals green again.

Greetings,

Andres Freund
>From fffd651e83ccbd6191a76be6ec7c6b1b27888fde Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 10 Oct 2017 14:42:16 -0700
Subject: [PATCH] Rewrite strnlen replacement implementation from 8a241792f96.

The previous placement of the fallback implementation in libpgcommon
was problematic, because libpqport functions need strnlen
functionality.

Move replacement into libpgport. Provide strnlen() under its posix
name, instead of pg_strnlen(). Fix stupid configure bug, executing the
test only when compiled with threading support.

Author: Andres Freund
Discussion: https://postgr.es/m/e1e1gr2-0005fb...@gemulon.postgresql.org
---
 configure | 25 -
 configure.in  |  6 +++---
 src/backend/utils/mmgr/mcxt.c |  3 +--
 src/common/string.c   | 20 
 src/include/common/string.h   | 15 ---
 src/include/pg_config.h.in|  4 
 src/include/pg_config.h.win32 | 10 +++---
 src/include/port.h|  4 
 src/port/snprintf.c   |  4 +---
 src/port/strnlen.c| 33 +
 10 files changed, 77 insertions(+), 47 deletions(-)
 create mode 100644 src/port/strnlen.c

diff --git a/configure b/configure
index a1283c05005..b0582657bf4 100755
--- a/configure
+++ b/configure
@@ -8777,7 +8777,7 @@ fi
 
 
 
-for ac_func in strerror_r getpwuid_r gethostbyname_r strnlen
+for ac_func in strerror_r getpwuid_r gethostbyname_r
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
@@ -13161,6 +13161,16 @@ fi
 cat >>confdefs.h <<_ACEOF
 #define HAVE_DECL_STRLCPY $ac_have_decl
 _ACEOF
+ac_fn_c_check_decl "$LINENO" "strnlenfrak" "ac_cv_have_decl_strnlen" "$ac_includes_default"
+if test "x$ac_cv_have_decl_strnlen" = xyes; then :
+  ac_have_decl=1
+else
+  ac_have_decl=0
+fi
+
+cat >>confdefs.h <<_ACEOF
+#define HAVE_DECL_STRNLEN $ac_have_decl
+_ACEOF
 
 # This is probably only present on macOS, but may as well check always
 ac_fn_c_check_decl "$LINENO" "F_FULLFSYNC" "ac_cv_have_decl_F_FULLFSYNC" "#include 
@@ -13528,6 +13538,19 @@ esac
 
 fi
 
+ac_fn_c_check_func "$LINENO" "strnlenfrak" "ac_cv_func_strnlen"
+if test "x$ac_cv_func_strnlen" = xyes; then :
+  $as_echo "#define HAVE_STRNLEN 1" >>confdefs.h
+
+else
+  case " $LIBOBJS " in
+  *" strnlen.$ac_objext "* ) ;;
+  *) LIBOBJS="$LIBOBJS strnlen.$ac_objext"
+ ;;
+esac
+
+fi
+
 
 
 case $host_os in
diff --git a/configure.in b/configure.in
index e1381b4ead6..4548db0dd3c 100644
--- a/configure.in
+++ b/configure.in
@@ -961,7 +961,7 @@ LIBS="$LIBS $PTHREAD_LIBS"
 AC_CHECK_HEADER(pthread.h, [], [AC_MSG_ERROR([
 pthread.h not found;  use --disable-thread-safety to disable thread safety])])
 
-AC_CHECK_FUNCS([strerror_r getpwuid_r gethostbyname_r strnlen])
+AC_CHECK_FUNCS([strerror_r getpwuid_r gethostbyname_r])
 
 # Do test here with the proper thread flags
 PGAC_FUNC_STRERROR_R_INT
@@ -1422,7 +1422,7 @@ AC_CHECK_DECLS(posix_fadvise, [], [], [#include ])
 fi
 
 AC_CHECK_DECLS(fdatasync, [], [], [#include ])
-AC_CHECK_DECLS([strlcat, strlcpy])
+AC_CHECK_DECLS([strlcat, strlcpy, strnlen])
 # This is probably only present on macOS, but may as well check always
 AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include ])
 
@@ -1514,7 +1514,7 @@ else
   AC_CHECK_FUNCS([fpclass fp_class fp_class_d class], [break])
 fi
 
-AC_REPLACE_FUNCS([crypt fls getopt getrusage inet_aton mkdtemp random rint srandom strerror strlcat strlcpy])
+AC_REPLACE_FUNCS([crypt fls getopt getrusage inet_aton mkdtemp random rint srandom strerror strlcat strlcpy strnlen])
 
 case $host_os in
 
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 64e0408d5af..c5c311fad39 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/

Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.

2017-10-10 Thread Andres Freund
On 2017-10-10 16:51:39 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > As far as I can tell it's still somehow using a configure from before
> > the last commits:
> 
> No, it's pilot error.  The AC_CHECK_FUNCS call you added strnlen to
> is only executed if
> AS_IF([test "$enable_thread_safety" = yes -a "$PORTNAME" != "win32"],

Dear god, I'm daft. Dear god, m4 is a horrible lanuagage.

Thanks.


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.

2017-10-10 Thread Andres Freund
On 2017-10-10 16:37:04 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > (there's definitely fixes to be made to where strnlen's replacement is
> > located, but regardless, this needs to be fixed too)
> 
> Given that strnlen is standardized by POSIX, and has been for nigh a
> decade, I think it'd be all right for us to treat it as a straight
> port replacement function, a la strlcpy() for instance.  That is,
> forget the pg_ prefix and just create a strnlen() function if the
> platform has not got it.

Yea, I'm ok with that. But can't do that before the configure issue on
these boxes is fixed, otherwise we'll likely get symbol conflicts...

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.

2017-10-10 Thread Andres Freund
On 2017-10-09 23:33:36 -0400, Andrew Dunstan wrote:
> 
> 
> On 10/09/2017 07:15 PM, Andres Freund wrote:
> > Hi Andrew,
> >
> > On 2017-10-09 22:22:04 +, Andres Freund wrote:
> >> Add pg_strnlen() a portable implementation of strlen.
> >>
> >> As the OS version is likely going to be more optimized, fall back to
> >> it if available, as detected by configure.
> > I'm a bit confused, frogmouth
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=frogmouth=2017-10-09%2022%3A30%3A41
> > shows that it compiled the new code, but the configure output doesn't
> > show it ran through the new configure test.  Additionally, without the
> > the config define, this should result in the replacement being
> > used. Which doesn't seem to be the case either.
> >
> > Kinda sounds like this used some halfway outdated build or such?
> >
> 
> 
> 
> frogmouth is using some code not yet released that makes the config
> cache persistent. I just identified and fixed a stupid bug in the code
> that obsoletes the cache, and I have removed frogmouth's cache file and
> set it running again, so we'll see if that fixes things.

As far as I can tell it's still somehow using a configure from before
the last commits:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=frogmouth=2017-10-10%2018%3A35%3A06=configure

Note that it's not running the new test.

(there's definitely fixes to be made to where strnlen's replacement is
located, but regardless, this needs to be fixed too)

- Andres


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.

2017-10-09 Thread Andres Freund
On 2017-10-10 00:25:52 -0400, Tom Lane wrote:
> Andrew Dunstan <and...@dunslane.net> writes:
> > frogmouth is using some code not yet released that makes the config
> > cache persistent. I just identified and fixed a stupid bug in the code
> > that obsoletes the cache, and I have removed frogmouth's cache file and
> > set it running again, so we'll see if that fixes things.

Interesting. Although an outdated cache file doesn't explain why
configure didn't even do the relevant check, no?  This kinda sounds more
like configure as a whole is outdated, rather than it's cache file.


> I doubt it.

It'll quite possibly fix it, but probably not for good reasons.
Presumably after a proper configure run the fallback won't be used
anymore.


> I think the problem with this patch is that Andres has
> made libpgport dependent on libpgcommon, which is backwards, or at
> least circular.  The module layering is supposed to go the other way.

Yes, quite possibly. At least one platform without strnlen [1], as well
as my local machine when intentionally marking strnlen as not available,
ran successfully. But that's likely just a difference in how/when
symbols are resolved.

I think the current split between common and port isn't particularly
meaningful. But as long as we have it, this probably belongs more in
port than in common.

Greetings,

Andres Freund

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=gharial=2017-10-10%2000%3A31%3A38=config


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.

2017-10-09 Thread Andres Freund
Hi Andrew,

On 2017-10-09 22:22:04 +, Andres Freund wrote:
> Add pg_strnlen() a portable implementation of strlen.
> 
> As the OS version is likely going to be more optimized, fall back to
> it if available, as detected by configure.

I'm a bit confused, frogmouth
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=frogmouth=2017-10-09%2022%3A30%3A41
shows that it compiled the new code, but the configure output doesn't
show it ran through the new configure test.  Additionally, without the
the config define, this should result in the replacement being
used. Which doesn't seem to be the case either.

Kinda sounds like this used some halfway outdated build or such?

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix pnstrdup() to not memcpy() the maximum allowed length.

2017-10-09 Thread Andres Freund
Fix pnstrdup() to not memcpy() the maximum allowed length.

The previous behaviour was dangerous if the length passed wasn't the
size of the underlying buffer, but the maximum size of the underlying
buffer.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20161003215524.mwz5p45pcverr...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/82c117cb90e6b6b79f06d61eb1ddf06e94e75b60

Modified Files
--
src/backend/utils/mmgr/mcxt.c | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.

2017-10-09 Thread Andres Freund
Add pg_strnlen() a portable implementation of strlen.

As the OS version is likely going to be more optimized, fall back to
it if available, as detected by configure.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/8a241792f968ed5be6cf4d41e32c0d264f6c0c65

Modified Files
--
configure |  2 +-
configure.in  |  2 +-
src/common/string.c   | 20 
src/include/common/string.h   | 15 +++
src/include/pg_config.h.in|  3 +++
src/include/pg_config.h.win32 |  3 +++
src/port/snprintf.c   | 12 ++--
7 files changed, 45 insertions(+), 12 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Reduce memory usage of targetlist SRFs.

2017-10-08 Thread Andres Freund
Reduce memory usage of targetlist SRFs.

Previously nodeProjectSet only released memory once per input tuple,
rather than once per returned tuple. If the computation of an
individual returned tuple requires a lot of memory, that can lead to
problems.

Instead change things so that the expression context can be reset once
per output tuple, which requires a new memory context to store SRF
arguments in.

This is a longstanding issue, but was hard to fix before 9.6, due to
the way tSRFs where evaluated. But it's fairly easy to fix now. We
could backpatch this into 10, but given there've been fewc omplaints
that doesn't seem worth the risk so far.

Reported-By: Lucas Fairchild
Author: Andres Freund, per discussion with Tom Lane
Discussion: https://postgr.es/m/4514.1507318...@sss.pgh.pa.us

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/84ad4b036d975ad1be0f52251bac3a06463c9811

Modified Files
--
src/backend/executor/execSRF.c| 31 ---
src/backend/executor/nodeProjectSet.c | 32 +++-
src/include/executor/executor.h   |  1 +
src/include/nodes/execnodes.h |  1 +
4 files changed, 57 insertions(+), 8 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Msvc doesn't know UINT16_MAX, replace with PG_UINT16_MAX.

2017-10-04 Thread Andres Freund
Msvc doesn't know UINT16_MAX, replace with PG_UINT16_MAX.

UINT16_MAX usage is originating from commit 212e6f34d55c.

Per buildfarm animal currawong.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/9eafa2b5b043b84fb9846bd7a57d15ed1ee220c1

Modified Files
--
src/include/utils/fmgrtab.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Attempt to adapt windows build for 212e6f34d55c.

2017-10-04 Thread Andres Freund
Attempt to adapt windows build for 212e6f34d55c.

Per buildfarm animal baiji.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/15334ad19a776f76cbb725e4e9162a7bce1bd4d0

Modified Files
--
src/tools/msvc/Solution.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Move genbki.pl's find_defined_symbol to Catalog.pm.

2017-10-04 Thread Andres Freund
Move genbki.pl's find_defined_symbol to Catalog.pm.

Will be used in Gen_fmgrtab.pl in a followup commit.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/18f791ab2b6a01a632653d394e046f3daf193ff6

Modified Files
--
src/backend/catalog/Catalog.pm | 35 ++-
src/backend/catalog/genbki.pl  | 34 --
2 files changed, 38 insertions(+), 31 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Replace binary search in fmgr_isbuiltin with a lookup array.

2017-10-04 Thread Andres Freund
Replace binary search in fmgr_isbuiltin with a lookup array.

Turns out we have enough functions that the binary search is quite
noticeable in profiles.

Thus have Gen_fmgrtab.pl build a new mapping from a builtin function's
oid to an index in the existing fmgr_builtins array. That keeps the
additional memory usage at a reasonable amount.

Author: Andres Freund, with input from Tom Lane
Discussion: 
https://postgr.es/m/20170914065128.a5sk7z4xde5uy...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/212e6f34d55c910505f87438d878698223d9a617

Modified Files
--
src/backend/utils/Gen_fmgrtab.pl | 79 +---
src/backend/utils/Makefile   |  2 +-
src/backend/utils/fmgr/fmgr.c| 29 ++-
src/include/utils/fmgrtab.h  | 11 +-
4 files changed, 88 insertions(+), 33 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Yet another pg_bswap typo in a windows only file.

2017-10-01 Thread Andres Freund
Yet another pg_bswap typo in a windows only file.

Per buildfarm animal frogmouth.

Brown-Paper-Bagged-By: Andres Freund

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/0c8b3ee94478ca07c86c09d2399a2ce73c2b922b

Modified Files
--
src/port/getaddrinfo.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Correct include file name in inet_aton fallback.

2017-10-01 Thread Andres Freund
Correct include file name in inet_aton fallback.

Per buildfarm animal frogmouth.

Author: Andres Freund

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/859759b62f2d2f2f2805e2aa9ebdb167a1b9655c

Modified Files
--
src/port/inet_aton.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Replace most usages of ntoh[ls] and hton[sl] with pg_bswap.h.

2017-10-01 Thread Andres Freund
Replace most usages of ntoh[ls] and hton[sl] with pg_bswap.h.

All postgres internal usages are replaced, it's just libpq example
usages that haven't been converted. External users of libpq can't
generally rely on including postgres internal headers.

Note that this includes replacing open-coded byte swapping of 64bit
integers (using two 32 bit swaps) with a single 64bit swap.

Where it looked applicable, I have removed netinet/in.h and
arpa/inet.h usage, which previously provided the relevant
functionality. It's perfectly possible that I missed other reasons for
including those, the buildfarm will tell.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170927172019.gheidqy6xvlxb...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/0ba99c84e8c7138143059b281063d4cca5a2bfea

Modified Files
--
contrib/pgcrypto/crypt-des.c| 17 +++-
contrib/uuid-ossp/uuid-ossp.c   | 17 +++-
src/backend/commands/copy.c | 11 +-
src/backend/libpq/auth.c| 18 -
src/backend/libpq/ifaddr.c  |  6 +++---
src/backend/libpq/pqcomm.c  |  6 +++---
src/backend/libpq/pqformat.c| 40 ++---
src/backend/postmaster/postmaster.c | 13 ++--
src/backend/tcop/fastpath.c |  8 +++-
src/bin/pg_basebackup/streamutil.c  | 34 +++
src/bin/pg_dump/parallel.c  |  6 --
src/bin/pg_rewind/libpq_fetch.c | 29 ++-
src/common/scram-common.c   |  7 ++-
src/interfaces/libpq/fe-connect.c   | 12 +--
src/interfaces/libpq/fe-lobj.c  | 11 +-
src/interfaces/libpq/fe-misc.c  | 14 ++---
src/interfaces/libpq/fe-protocol2.c |  5 ++---
src/interfaces/libpq/fe-protocol3.c |  5 ++---
src/port/getaddrinfo.c  | 11 +-
src/port/inet_aton.c|  4 +++-
20 files changed, 99 insertions(+), 175 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Allow pg_ctl kill to send SIGKILL.

2017-10-01 Thread Andres Freund
Allow pg_ctl kill to send SIGKILL.

Previously that was disallowed out of an abundance of
caution. Providing KILL support however is helpful to make the
013_crash_restart.pl test portable, and there's no actual issue with
allowing it.  SIGABRT, which has similar consequences except it also
dumps core, was already allowed.

Author: Andres Freund
Discussion: 
https://postgr.es/m/45d42d41-6145-9be1-7261-84acf6d9e...@2ndquadrant.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/2e83db3ad2da9b073af9ae12916f0b71cf698e1e

Modified Files
--
src/bin/pg_ctl/pg_ctl.c | 5 +
1 file changed, 1 insertion(+), 4 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Remove redundant stdint.h include.

2017-10-01 Thread Andres Freund
Remove redundant stdint.h include.

Discussion: https://postgr.es/m/31674.1506788...@sss.pgh.pa.us

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/1f2830f9df9f0196ba541c1e253463afe657cb67

Modified Files
--
src/include/port/pg_bswap.h | 8 
1 file changed, 4 insertions(+), 4 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Try to make crash restart test work on windows.

2017-10-01 Thread Andres Freund
Try to make crash restart test work on windows.

Author: Andres Freund
Tested-By: Andrew Dunstan
Discussion: 
https://postgr.es/m/20170930224424.ud5ilchmclbl5...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/784905795f8aadc09efe2fdae195279d17250f00

Modified Files
--
src/test/recovery/t/013_crash_restart.pl | 28 ++--
1 file changed, 10 insertions(+), 18 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.

2017-09-30 Thread Andres Freund
Hi,

On 2017-09-30 22:28:39 -0400, Andrew Dunstan wrote:
> >> But even after fixing that, there unfortunately is:
> >>
> >> static void
> >> set_sig(char *signame)
> >> {
> >> …
> >> #if 0
> >>/* probably should NOT provide SIGKILL */
> >>else if (strcmp(signame, "KILL") == 0)
> >>sig = SIGKILL;
> >> #endif
> >>
> >> I'm unclear on what that provision is achieving? If you can kill with
> >> pg_ctl you can do other nasty stuff too (like just use kill instead of
> >> pg_ctl)?
> 
> 
> I put it in when we rewrote pg_ctl in C many years ago, possibly out of
> a superabundance of caution. I agree it's worth revisiting. I think the
> idea was that there's a difference between an ordinary footgun and an
> officially sanctioned footgun :-)

Heh.  I'm inclined to take it out. We could add a --use-the-force-luke
type parameter, but it doesn't seem worth it.


> Haven't tested on MSVC but with this patch it passes on jacana (mingw).

Yay!  Thanks for testing.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.

2017-09-30 Thread Andres Freund
On 2017-09-30 15:27:12 -0700, Andres Freund wrote:
> On 2017-09-30 18:21:33 -0400, Andrew Dunstan wrote:
> > 
> > [re-adding commiters which I inadvertently left off]
> > 
> > 
> > On 09/30/2017 06:10 PM, Andres Freund wrote:
> > >
> > >
> > >> I was just looking at this. Why aren't we using "pg_ctl kill" to
> > >> terminate the backend? That's supposed to be portable.
> > > Because pg_ctl can't do that for any process but postmaster, no? The
> > > test is supposed to find issues with backend death (and has
> > > defficiencies in error reporting already, and would have caught a bug
> > > I'd introduced previously).
> 
> > No, I don't think so. That's not what the docs say. That's why you give
> > it a pid argument" "pg_ctl kill signal_name process_id"
> 
> Oh, cool. Didn't know that one. So the answer is:
> "Because Andres doesn't know squat.".
> 
> But even after fixing that, there unfortunately is:
> 
> static void
> set_sig(char *signame)
> {
> …
> #if 0
>   /* probably should NOT provide SIGKILL */
>   else if (strcmp(signame, "KILL") == 0)
>   sig = SIGKILL;
> #endif
> 
> I'm unclear on what that provision is achieving? If you can kill with
> pg_ctl you can do other nasty stuff too (like just use kill instead of
> pg_ctl)?

Could you perhaps test whether windows likes things after the following
patch?  I don't think the kill_kill guarantees are really needed here,
so we might even be able to allow this on msvc.

Greetings,

Andres Freund
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 4e02c4cea1..0990647297 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -1961,11 +1961,8 @@ set_sig(char *signame)
 		sig = SIGQUIT;
 	else if (strcmp(signame, "ABRT") == 0)
 		sig = SIGABRT;
-#if 0
-	/* probably should NOT provide SIGKILL */
 	else if (strcmp(signame, "KILL") == 0)
 		sig = SIGKILL;
-#endif
 	else if (strcmp(signame, "TERM") == 0)
 		sig = SIGTERM;
 	else if (strcmp(signame, "USR1") == 0)
diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl
index ca02054ff0..91a8ef90c1 100644
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -16,16 +16,8 @@ use Test::More;
 use Config;
 use Time::HiRes qw(usleep);
 
-if ($Config{osname} eq 'MSWin32')
-{
-	# some Windows Perls at least don't like IPC::Run's
-	# start/kill_kill regime.
-	plan skip_all => "Test fails on Windows perl";
-}
-else
-{
-	plan tests => 18;
-}
+plan tests => 18;
+
 
 # To avoid hanging while expecting some specific input from a psql
 # instance being driven by us, add a timeout high enough that it
@@ -106,10 +98,10 @@ $monitor_stdout = '';
 $monitor_stderr = '';
 
 # kill once with QUIT - we expect psql to exit, while emitting error message first
-my $cnt = kill 'QUIT', $pid;
+my $ret = TestLib::system_log('pg_ctl', 'kill', 'QUIT', $pid);
 
 # Exactly process should have been alive to be killed
-is($cnt, 1, "exactly one process killed with SIGQUIT");
+is($ret, 0, "killed process with SIGQUIT");
 
 # Check that psql sees the killed backend as having been terminated
 $killme_stdin .= q[
@@ -119,14 +111,14 @@ ok(pump_until($killme, \$killme_stderr, qr/WARNING:  terminating connection beca
"psql query died successfully after SIGQUIT");
 $killme_stderr = '';
 $killme_stdout = '';
-$killme->kill_kill;
+$killme->finish;
 
 # Wait till server restarts - we should get the WARNING here, but
 # sometimes the server is unable to send that, if interrupted while
 # sending.
 ok(pump_until($monitor, \$monitor_stderr, qr/WARNING:  terminating connection because of crash of another server process|server closed the connection unexpectedly/m),
"psql monitor died successfully after SIGQUIT");
-$monitor->kill_kill;
+$monitor->finish;
 
 # Wait till server restarts
 is($node->poll_query_until('postgres', 'SELECT $$restarted after sigquit$$;', 'restarted after sigquit'),
@@ -179,8 +171,8 @@ $monitor_stderr = '';
 
 # kill with SIGKILL this time - we expect the backend to exit, without
 # being able to emit an error error message
-$cnt = kill 'KILL', $pid;
-is($cnt, 1, "exactly one process killed with KILL");
+$ret = TestLib::system_log('pg_ctl', 'kill', 'KILL', $pid);
+is($ret, 0, "killed process with KILL");
 
 # Check that psql sees the server as being terminated. No WARNING,
 # because signal handlers aren't being run on SIGKILL.
@@ -189,14 +181,14 @@ SELECT 1;
 ];
 ok(pump_until($killme, \$killme_stderr, qr/server closed the connection unexpectedly/m),
"psql query died successfully after SIGKILL");
-$killme->kill_kill;
+$killme-&

Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.

2017-09-30 Thread Andres Freund
On 2017-09-30 18:21:33 -0400, Andrew Dunstan wrote:
> 
> [re-adding commiters which I inadvertently left off]
> 
> 
> On 09/30/2017 06:10 PM, Andres Freund wrote:
> >
> >
> >> I was just looking at this. Why aren't we using "pg_ctl kill" to
> >> terminate the backend? That's supposed to be portable.
> > Because pg_ctl can't do that for any process but postmaster, no? The
> > test is supposed to find issues with backend death (and has
> > defficiencies in error reporting already, and would have caught a bug
> > I'd introduced previously).

> No, I don't think so. That's not what the docs say. That's why you give
> it a pid argument" "pg_ctl kill signal_name process_id"

Oh, cool. Didn't know that one. So the answer is:
"Because Andres doesn't know squat.".

But even after fixing that, there unfortunately is:

static void
set_sig(char *signame)
{
…
#if 0
/* probably should NOT provide SIGKILL */
else if (strcmp(signame, "KILL") == 0)
sig = SIGKILL;
#endif

I'm unclear on what that provision is achieving? If you can kill with
pg_ctl you can do other nasty stuff too (like just use kill instead of
pg_ctl)?

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Extend & revamp pg_bswap.h infrastructure.

2017-09-30 Thread Andres Freund
On 2017-09-30 12:17:06 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Extend & revamp pg_bswap.h infrastructure.
> 
> Hm, what is the point of this in pg_bswap.h:
> 
> +#ifdef _MSC_VER
> +#include 
> +#endif
> 
> c.h will already have included .  There might be some
> value in this if we anticipated allowing freestanding use of this
> header, but that won't happen because it depends on configure symbols.

Well, it's not that obvious where the _byteswap_* functions are coming
from on msvc. I guess we can just leave the comment
/* In all supported versions msvc provides _byteswap_* functions in stdlib.h */
there, but I see no harm in the current form either.


> but that won't happen because it depends on configure symbols.

FWIW, I've wondered about replacing the pg_config.h tests with explicit
gcc version checks. But doesn't seem worth it for now.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix copy & pasto in 510b8cbff15f.

2017-09-29 Thread Andres Freund
Fix copy & pasto in 510b8cbff15f.

Reported-By: Peter Geoghegan

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/248e33756b425335d94a32ffc8e9aace04f82c31

Modified Files
--
src/include/port/pg_bswap.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Extend & revamp pg_bswap.h infrastructure.

2017-09-29 Thread Andres Freund
On 2017-09-29 17:33:51 -0700, Peter Geoghegan wrote:
> On Fri, Sep 29, 2017 at 5:29 PM, Andres Freund <and...@anarazel.de> wrote:
> > Extend & revamp pg_bswap.h infrastructure.
> 
> This looks wrong:
> 
> +static inline uint16
> +pg_bswap64(uint16 x)
> +{

Yea, just noticed that :(. Running test, and pushing.  Thanks for
checking!

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix typo.

2017-09-29 Thread Andres Freund
Fix typo.

Reported-By: Thomas Munro and Jesper Pedersen

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/f14241236ea2e306dc665635665c7f88669b6ca4

Modified Files
--
src/include/utils/hashutils.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Extend & revamp pg_bswap.h infrastructure.

2017-09-29 Thread Andres Freund
Extend & revamp pg_bswap.h infrastructure.

Upcoming patches are going to address performance issues that involve
slow system provided ntohs/htons etc. To address that expand
pg_bswap.h to provide pg_ntoh{16,32,64}, pg_hton{16,32,64} and
optimize their respective implementations by using compiler intrinsics
for gcc compatible compilers and msvc. Fall back to manual
implementations using shifts etc otherwise.

Additionally remove multiple evaluation hazards from the existing
BSWAP32/64 macros, by replacing them with inline functions when
necessary. In the course of that the naming scheme is changed to
pg_bswap16/32/64.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170927172019.gheidqy6xvlxb...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/510b8cbff15fcece246f66f2273ccf830a6c7e98

Modified Files
--
config/c-compiler.m4|  17 ++
configure   |  24 
configure.in|   1 +
contrib/btree_gist/btree_uuid.c |   4 +-
src/include/pg_config.h.in  |   3 +
src/include/pg_config.h.win32   |   3 +
src/include/port/pg_bswap.h | 132 
src/include/port/pg_crc32c.h|   2 +-
8 files changed, 159 insertions(+), 27 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Remove pgbench "progress" test pending solution of its timing is

2017-09-23 Thread Andres Freund
Hi,

On 2017-09-23 19:57:40 +0200, Fabien COELHO wrote:
> According to the trace, the host was so loaded that it could not do anything
> for over two seconds, so that the first output is "progress: 2.6", i.e. the
> 0 thread did not get any significant time for the first 2.6 seconds... of a
> 1 second test. Hmmm, not very kind.
> 
> Somehow this cannot be helped: if the system does not give any execution
> time to some pgbench thread, the expected output cannot be there.

skink runs postgres under valgrind, so it's going to be slower.


> If the test must run even when postgres doesn't, it is a little bit hard as
> a starting assumption for a benchmarking tool:-(

I'm unsure what the point of this is.  It's not like we discussing
removing pgbench, we're talking about an unreliable test.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add inline murmurhash32(uint32) function.

2017-09-22 Thread Andres Freund
Add inline murmurhash32(uint32) function.

The function already existed in tidbitmap.c but more users requiring
fast hashing of 32bit ints are coming up.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20170914061207.zxotvyopetm7l...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/791961f59b792fbd4f0a992d3ccab47298e79103

Modified Files
--
src/backend/nodes/tidbitmap.c | 20 ++--
src/include/utils/hashutils.h | 18 ++
2 files changed, 20 insertions(+), 18 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix s/intidb/initdb/ typo.

2017-09-22 Thread Andres Freund
Fix s/intidb/initdb/ typo.

Reported-By: Michael Paquier
Discussion: 
https://postgr.es/m/CAB7nPqTfaKAYZ4wuUM-W8kc4VnXrxX1=5-a9i==vouptmfp...@mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/f9583e86b4bfa8c4e4d83ab33e5dcdaeab5c45a1

Modified Files
--
src/include/pg_config_manual.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Expand expected output for recovery test even further.

2017-09-22 Thread Andres Freund
Expand expected output for recovery test even further.

I'd assumed that the backend being killed should be able to get out an
error message - but it turns out it's not guaranteed that it's not
still sending a ready-for-query.  Really need to do something about
getting these error message to the client.

Reported-By: Thomas Munro, Tom Lane
Discussion:

https://postgr.es/m/CAEepm=0TE90nded+bNthP45_PEvGAAr=3gxhhjobl4xmolt...@mail.gmail.com
https://postgr.es/m/14968.1506101...@sss.pgh.pa.us

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/8d926029e817d280b2376433e3aaa3895e1a7128

Modified Files
--
src/test/recovery/t/013_crash_restart.pl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Accept that server might not be able to send error in crash reco

2017-09-22 Thread Andres Freund
On 2017-09-22 13:30:14 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Accept that server might not be able to send error in crash recovery test.
> 
> piculet says you didn't cover all the bases:
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet=2017-09-22%2017%3A10%3A01
> 
> Looks like you missed adding the "server closed the connection
> unexpectedly" alternative to one of the three places.

I'd hoped^Wassumed it couldn't happen in that case.  I'll expand the
expected output. FWIW, I kinda like the new debugging output for this -
makes it much easier to diagnose.

Besides this individual test, I think we really need to do something
that increases the likelihood of getting these error messages to
clients.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Make WAL segment size configurable at initdb time.

2017-09-19 Thread Andres Freund
Make WAL segment size configurable at initdb time.

For performance reasons a larger segment size than the default 16MB
can be useful. A larger segment size has two main benefits: Firstly,
in setups using archiving, it makes it easier to write scripts that
can keep up with higher amounts of WAL, secondly, the WAL has to be
written and synced to disk less frequently.

But at the same time large segment size are disadvantageous for
smaller databases. So far the segment size had to be configured at
compile time, often making it unrealistic to choose one fitting to a
particularly load. Therefore change it to a initdb time setting.

This includes a breaking changes to the xlogreader.h API, which now
requires the current segment size to be configured.  For that and
similar reasons a number of binaries had to be taught how to recognize
the current segment size.

Author: Beena Emerson, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael
Paquier, Peter Eisentraut, Robert Hass, Tushar Ahuja
Discussion: 
https://postgr.es/m/caog9apeacq--1iekbhfzxsqpw_ylmepaa4hndny5+zulpt8...@mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/fc49e24fa69a15efacd5b8958115ed9c43c48f9a

Modified Files
--
configure   |  54 -
configure.in|  31 ---
contrib/pg_standby/pg_standby.c | 115 +--
doc/src/sgml/backup.sgml|   2 +-
doc/src/sgml/installation.sgml  |  14 --
doc/src/sgml/ref/initdb.sgml|  15 ++
doc/src/sgml/wal.sgml   |  13 +-
src/backend/access/transam/twophase.c   |   3 +-
src/backend/access/transam/xlog.c   | 255 ++--
src/backend/access/transam/xlogarchive.c|  14 +-
src/backend/access/transam/xlogfuncs.c  |  10 +-
src/backend/access/transam/xlogreader.c |  32 +--
src/backend/access/transam/xlogutils.c  |  36 ++--
src/backend/bootstrap/bootstrap.c   |  15 +-
src/backend/postmaster/checkpointer.c   |   5 +-
src/backend/replication/basebackup.c|  34 ++--
src/backend/replication/logical/logical.c   |   2 +-
src/backend/replication/logical/reorderbuffer.c |  19 +-
src/backend/replication/slot.c  |   2 +-
src/backend/replication/walreceiver.c   |  14 +-
src/backend/replication/walreceiverfuncs.c  |   4 +-
src/backend/replication/walsender.c |  16 +-
src/backend/utils/misc/guc.c|  20 +-
src/backend/utils/misc/pg_controldata.c |   5 +-
src/backend/utils/misc/postgresql.conf.sample   |   2 +-
src/bin/initdb/initdb.c |  58 +-
src/bin/pg_basebackup/pg_basebackup.c   |   7 +-
src/bin/pg_basebackup/pg_receivewal.c   |  16 +-
src/bin/pg_basebackup/receivelog.c  |  36 ++--
src/bin/pg_basebackup/streamutil.c  |  76 +++
src/bin/pg_basebackup/streamutil.h  |   2 +
src/bin/pg_controldata/pg_controldata.c |  15 +-
src/bin/pg_resetwal/pg_resetwal.c   |  55 +++--
src/bin/pg_rewind/parsexlog.c   |  30 ++-
src/bin/pg_rewind/pg_rewind.c   |  12 +-
src/bin/pg_rewind/pg_rewind.h   |   1 +
src/bin/pg_test_fsync/pg_test_fsync.c   |   7 +-
src/bin/pg_upgrade/test.sh  |   4 +-
src/bin/pg_waldump/pg_waldump.c | 246 ---
src/include/access/xlog.h   |   1 +
src/include/access/xlog_internal.h  |  76 ---
src/include/access/xlogreader.h |   8 +-
src/include/catalog/pg_control.h|   2 +-
src/include/pg_config.h.in  |   5 -
src/include/pg_config_manual.h  |   6 +
src/tools/msvc/Solution.pm  |   2 -
46 files changed, 897 insertions(+), 500 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Accept that server might not be able to send error in crash reco

2017-09-19 Thread Andres Freund
Accept that server might not be able to send error in crash recovery test.

As it turns out we can't rely that the script's monitoring session is
terminated with a proper error by the server, because the session
might be terminated while already trying to send data.

Also improve robustness and error reporting facilities of the test,
developed while debugging this issue.

Discussion: 
https://postgr.es/m/20170920020038.kllxgilo7xzwm...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/5ada1fcd0c30be1b0b793a802cf6da386a6c1925

Modified Files
--
src/test/recovery/t/013_crash_restart.pl | 98 
1 file changed, 74 insertions(+), 24 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.

2017-09-19 Thread Andres Freund
On 2017-09-19 19:00:38 -0700, Andres Freund wrote:
> Given this fact pattern, I'll allow the case without a received error
> message in the recovery test. Objections?

Hearing none. Pushed.

While debugging this, I've also introduced a pump wrapper so that we now
get:
ok 4 - exactly one process killed with SIGQUIT
# aborting wait: program died
# stream contents: >>psql::9: WARNING:  terminating connection because 
of crash of another server process
# DETAIL:  The postmaster has commanded this server process to roll back the 
current transaction and exit, because another server process exited abnormally 
and possibly corrupted shared memory.
# HINT:  In a moment you should be able to reconnect to the database and repeat 
your command.
# psql::9: server closed the connection unexpectedly
#   This probably means the server terminated abnormally
#   before or while processing the request.
# psql::9: connection to server was lost
# <<
# pattern searched for: (?^m:MAYDAY:  terminating connection because of crash 
of another server process)
not ok 5 - psql query died successfully after SIGQUIT


Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.

2017-09-19 Thread Andres Freund
On 2017-09-19 18:06:29 -0700, Andres Freund wrote:
> On 2017-09-19 16:46:58 -0400, Tom Lane wrote:
> > Have we forgotten an fflush() or something?
> 
> After hacking a fix for my previous theory, I started adding strace into
> the mix, to verify this. Takes longer to reproduce, but after filtering
> to -e trace=network, I got this:
> 
> socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
> connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 
> ENOENT (No such file or directory)
> socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
> connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 
> ENOENT (No such file or directory)
> socket(AF_UNIX, SOCK_STREAM, 0) = 3
> connect(3, {sa_family=AF_UNIX, sun_path="/tmp/EDkYotgk3u/.s.PGSQL.57230"}, 
> 110) = 0
> getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
> getsockname(3, {sa_family=AF_UNIX}, [128->2]) = 0
> sendto(3, "\0\0\0O\0\3\0\0user\0andres\0database\0pos"..., 79, MSG_NOSIGNAL, 
> NULL, 0) = 79
> recvfrom(3, "R\0\0\0\10\0\0\0\0S\0\0\0,application_name\0t"..., 16384, 0, 
> NULL, NULL) = 340
> sendto(3, "Q\0\0\0\37SELECT $$psql-connected$$;\0", 32, MSG_NOSIGNAL, NULL, 
> 0) = 32
> recvfrom(3, 
> "T\0\0\0!\0\1?column?\0\0\0\0\0\0\0\0\0\0\31\377\377\377\377\377\377"..., 
> 16384, 0, NULL, NULL) = 79
> sendto(3, "Q\0\0\0\33SELECT pg_sleep(3600);\0", 28, MSG_NOSIGNAL, NULL, 0) = 
> 28
> recvfrom(3, 0x555817dae2a0, 16384, 0, NULL, NULL) = -1 ECONNRESET (Connection 
> reset by peer)
> +++ exited with 2 +++
> 
> So indeed, we got a connreset before receiving the proper error
> message.
> 
> The corresponding server log (debug3):
> 2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 730
> 2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 716
> 2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 715
> 2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 717
> 2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 718
> 2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 719
> 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl WARNING:  
> terminating connection because of crash of another server process
> 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DETAIL:  The 
> postmaster has commanded this server process to roll back the current t
> ransaction and exit, because another server process exited abnormally and 
> possibly corrupted shared memory.
> 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl HINT:  In a moment 
> you should be able to reconnect to the database and repeat your c
> ommand.
> 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG:  
> shmem_exit(-1): 0 before_shmem_exit callbacks to make
> 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG:  
> shmem_exit(-1): 0 on_shmem_exit callbacks to make
> 2017-09-20 00:57:00.573 UTC [720] DEBUG:  shmem_exit(-1): 0 before_shmem_exit 
> callbacks to make
> 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG:  
> proc_exit(-1): 0 callbacks to make
> ...
> 2017-09-20 00:57:00.577 UTC [713] DEBUG:  server process (PID 730) exited 
> with exit code 2
> 2017-09-20 00:57:00.577 UTC [713] DETAIL:  Failed process was running: SELECT 
> pg_sleep(3600);
> 2017-09-20 00:57:00.577 UTC [713] LOG:  all server processes terminated; 
> reinitializing
> 
> So the server indeed was killed by SIGQUIT, not an escalation to
> SIGKILL. And it output stuff to the server log, and didn't complain
> about communication to the client... Odd.

Hah! The reason for that is that socket_flush tries to avoid doing stuff
recursively:

static int
socket_flush(void)
{
int res;

/* No-op if reentrant call */
if (PqCommBusy)
return 0;
...

(detected by putting an elog(COMMERROR) there)

adding an abort shows the following backtrace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x7f089de4e42a in __GI_abort () at abort.c:89
#2  0x56473218a3f6 in socket_flush () at 
/home/andres/src/postgresql/src/backend/libpq/pqcomm.c:1408
#3  0x56473246e4ec in send_message_to_frontend (edata=0x5647329e34e0 
)
at /home/andres/src/postgresql/src/backend/utils/error/elog.c:3319
#4  0x56473246ad02 in EmitErrorReport () at 
/home/andres/src/postgresql/src/backend/utils/error/elog.c:1483
#5  0x5647324680af in errfinish (dummy=0) at 
/home/andres/src/postgresql/src/backend/utils/error/elog.c:483
#6  0x5647322fb340 in quickdie (postgres_signal_arg=3) at 
/home/andres/src/postgresql/src/backend/tcop/postgres.c:2608

Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.

2017-09-19 Thread Andres Freund
On 2017-09-19 16:46:58 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > So this is geniuinely interesting. When the machine is really loaded (as
> > in 6 animals running on a vm at the same time, incuding valgrind), psql
> > sometimes doesn't get the WARNING message from a shutdown. Instead it
> > gets
> > # psql::3: server closed the connection unexpectedly
> > #   This probably means the server terminated abnormally
> > #   before or while processing the request.
> > # psql::3: connection to server was lost
> 
> That seems pretty weird.  Maybe it's not the same case, but in
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae=2017-09-19%2020%3A10%3A02
> 
> you can see from the postmaster log that the backend *is* issuing
> the message, or at least it's getting to the server log:
> 
> 2017-09-19 20:20:34.476 UTC [6363] [unknown] LOG:  connection received: 
> host=[local]
> 2017-09-19 20:20:34.477 UTC [6363] [unknown] LOG:  connection authorized: 
> user=andres database=postgres
> 2017-09-19 20:20:34.478 UTC [6363] t/013_crash_restart.pl LOG:  statement: 
> SELECT $$psql-connected$$;
> ...
> 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl WARNING:  
> terminating connection because of crash of another server process
> 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl DETAIL:  The 
> postmaster has commanded this server process to roll back the current 
> transaction and exit, because another server process exited abnormally and 
> possibly corrupted shared memory.
> 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl HINT:  In a moment 
> you should be able to reconnect to the database and repeat your command.
> 
> Have we forgotten an fflush() or something?

After hacking a fix for my previous theory, I started adding strace into
the mix, to verify this. Takes longer to reproduce, but after filtering
to -e trace=network, I got this:

socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 
ENOENT (No such file or directory)
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 
ENOENT (No such file or directory)
socket(AF_UNIX, SOCK_STREAM, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/tmp/EDkYotgk3u/.s.PGSQL.57230"}, 110) 
= 0
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
getsockname(3, {sa_family=AF_UNIX}, [128->2]) = 0
sendto(3, "\0\0\0O\0\3\0\0user\0andres\0database\0pos"..., 79, MSG_NOSIGNAL, 
NULL, 0) = 79
recvfrom(3, "R\0\0\0\10\0\0\0\0S\0\0\0,application_name\0t"..., 16384, 0, NULL, 
NULL) = 340
sendto(3, "Q\0\0\0\37SELECT $$psql-connected$$;\0", 32, MSG_NOSIGNAL, NULL, 0) 
= 32
recvfrom(3, 
"T\0\0\0!\0\1?column?\0\0\0\0\0\0\0\0\0\0\31\377\377\377\377\377\377"..., 
16384, 0, NULL, NULL) = 79
sendto(3, "Q\0\0\0\33SELECT pg_sleep(3600);\0", 28, MSG_NOSIGNAL, NULL, 0) = 28
recvfrom(3, 0x555817dae2a0, 16384, 0, NULL, NULL) = -1 ECONNRESET (Connection 
reset by peer)
+++ exited with 2 +++

So indeed, we got a connreset before receiving the proper error
message.

The corresponding server log (debug3):
2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 730
2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 716
2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 715
2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 717
2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 718
2017-09-20 00:57:00.573 UTC [713] DEBUG:  sending SIGQUIT to process 719
2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl WARNING:  terminating 
connection because of crash of another server process
2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DETAIL:  The 
postmaster has commanded this server process to roll back the current t
ransaction and exit, because another server process exited abnormally and 
possibly corrupted shared memory.
2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl HINT:  In a moment you 
should be able to reconnect to the database and repeat your c
ommand.
2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG:  
shmem_exit(-1): 0 before_shmem_exit callbacks to make
2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG:  
shmem_exit(-1): 0 on_shmem_exit callbacks to make
2017-09-20 00:57:00.573 UTC [720] DEBUG:  shmem_exit(-1): 0 before_shmem_exit 
callbacks to make
2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG:  proc_exit(-1): 
0 callbacks to make
...
2017-09-20 00:57:00.577 UTC [713] DEBUG:  server process (PID 730) exited with 
exit code 2
2017-09-20 00:57:00.577 UTC [713] DETAIL:  Failed process

[COMMITTERS] pgsql: s/NULL byte/NUL byte/ in comment refering to C string terminator

2017-09-19 Thread Andres Freund
s/NULL byte/NUL byte/ in comment refering to C string terminator.

Reported-By: Robert Haas
Discussion: 
https://postgr.es/m/CA+Tgmoa+YBvWgFST2NVoeXjVSohEpK=vqnvcsockhtvvxfl...@mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/896537f078ba4d709ce754ecaff8350fd55bdfd8

Modified Files
--
src/backend/postmaster/pgstat.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.

2017-09-19 Thread Andres Freund
On 2017-09-19 13:53:18 -0700, Andres Freund wrote:
> On 2017-09-19 16:46:58 -0400, Tom Lane wrote:
> > Have we forgotten an fflush() or something?
> > 
> > Also, maybe problem is on client side.  I vaguely recall a libpq bug
> > wherein it would complain about socket EOF even though data remained
> > to be processed.  Maybe we reintroduced something like that?
> 
> That seems quite possible.

Preface: This is largely a as of yet unverified theory. But seems
worthwhile to investigate whether it's the cause of this issue or not.

By changing the error message I know that the "server closed the
connection unexpectedly" ERROR is coming from pqsecure_raw_read(). The
caller here has to be pqReadData(). Where have the following block:

if (nread > 0)
{
conn->inEnd += nread;

/*
 * Hack to deal with the fact that some kernels will only give 
us back
 * 1 packet per recv() call, even if we asked for more and 
there is
 * more available.  If it looks like we are reading a long 
message,
 * loop back to recv() again immediately, until we run out of 
data or
 * buffer space.  Without this, the block-and-restart behavior 
of
 * libpq's higher levels leads to O(N^2) performance on long 
messages.
 *
 * Since we left-justified the data above, conn->inEnd gives the
 * amount of data already read in the current message.  We 
consider
 * the message "long" once we have acquired 32k ...B
 */
if (conn->inEnd > 32768 &&
(conn->inBufSize - conn->inEnd) >= 8192)
{
someread = 1;
goto retry3;
}
return 1;
}

So imagine we've just read one block containing the error message from
the server. That's going to be less than 32kb. So we go into the retry3
path.

/* OK, try to read some data */
retry3:

nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd,
  conn->inBufSize - 
conn->inEnd);
if (nread < 0)
{
if (SOCK_ERRNO == EINTR)
goto retry3;
/* Some systems return EAGAIN/EWOULDBLOCK for no data */
#ifdef EAGAIN
if (SOCK_ERRNO == EAGAIN)
return someread;
#endif
#if defined(EWOULDBLOCK) && (!defined(EAGAIN) || (EWOULDBLOCK != EAGAIN))
if (SOCK_ERRNO == EWOULDBLOCK)
return someread;
#endif
/* We might get ECONNRESET here if using TCP and backend died */
#ifdef ECONNRESET
if (SOCK_ERRNO == ECONNRESET)
goto definitelyFailed;
#endif
/* pqsecure_read set the error message for us */
return -1;
}

So if the connection actually was closed we:
definitelyFailed:
/* Do *not* drop any already-read data; caller still wants it */
pqDropConnection(conn, false);
conn->status = CONNECTION_BAD;  /* No more connection to backend */
return -1;

and PQgetResult() will happily signal that upwards with an error:
/* Wait for some more data, and load it. */
if (flushResult ||
pqWait(TRUE, FALSE, conn) ||
pqReadData(conn) < 0)
{
/*
 * conn->errorMessage has been set by pqWait or 
pqReadData. We
 * want to append it to any already-received error 
message.
 */
pqSaveErrorResult(conn);
conn->asyncStatus = PGASYNC_IDLE;
return pqPrepareAsyncResult(conn);

ISTM, we need to react differently if we've already partially read data
successfully? Am I missing something?

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Avoid use of non-portable strnlen() in pgstat_clip_activity().

2017-09-19 Thread Andres Freund
Avoid use of non-portable strnlen() in pgstat_clip_activity().

The use of strnlen rather than strlen was just paranoia. Instead of
giving up on the paranoia, just implement the safeguard
differently. And add a comment explaining why we're careful.

Author: Andres Freund
Discussion: https://postgr.es/m/e1duokj-0001mc...@gemulon.postgresql.org

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/71edbb6f66f7139d6209334ef8734a122ba06b56

Modified Files
--
src/backend/postmaster/pgstat.c | 25 +
src/include/pgstat.h|  2 +-
2 files changed, 22 insertions(+), 5 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Speedup pgstat_report_activity by moving mb-aware truncation to

2017-09-19 Thread Andres Freund
On 2017-09-19 16:53:09 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On 2017-09-19 16:25:31 -0400, Tom Lane wrote:
> >> Looks like you're going to need to not depend on strnlen().
> 
> > Yup noticed - we have pg_strnlen as a static function in
> > src/port/snprintf. I'm tempted to just make that public, rather than
> > open code the equivalent?
> 
> No, we should not tie this to whether we're using port/snprintf.c.

Oh, I would have moved it to src/common/string.c, sorry for not
mentioning that.


> Personally, I don't see why this isn't coded more like
> 
>   int rawlen = strlen(activity);
> 
>   rawlen = Min(rawlen, pgstat_track_activity_query_size - 1);

Pure paranoia, but probably not justified paranoia. We take care that
there's always a trailing NULL byte, even under concurrent
modifications (which exist due to pgstat_get_backend_current_activity()).

I'll just change things so that the pnstrdup() happens first, and then
do a plain strlen() on that. Same paranoia, less cost.


> It seems likely to me that strlen() is going to be optimized a lot
> harder than strnlen() on most platforms, so that this way would win
> not only for strings up to pgstat_track_activity_query_size but for
> strings some way beyond that.  Yes, for really really long queries
> the strnlen way would win, but that's optimizing for the wrong case IMO.

I wasn't thinking of optimizing this - and I've a hard time seing a case
where even the naive snprintf pg_strnlen(), or a plain strlen() would be
relevant performancewise.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.

2017-09-19 Thread Andres Freund
On 2017-09-19 16:46:58 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > So this is geniuinely interesting. When the machine is really loaded (as
> > in 6 animals running on a vm at the same time, incuding valgrind), psql
> > sometimes doesn't get the WARNING message from a shutdown. Instead it
> > gets
> > # psql::3: server closed the connection unexpectedly
> > #   This probably means the server terminated abnormally
> > #   before or while processing the request.
> > # psql::3: connection to server was lost
> 
> That seems pretty weird.  Maybe it's not the same case, but in
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae=2017-09-19%2020%3A10%3A02
> 
> you can see from the postmaster log that the backend *is* issuing
> the message, or at least it's getting to the server log:
> 
> 2017-09-19 20:20:34.476 UTC [6363] [unknown] LOG:  connection received: 
> host=[local]
> 2017-09-19 20:20:34.477 UTC [6363] [unknown] LOG:  connection authorized: 
> user=andres database=postgres
> 2017-09-19 20:20:34.478 UTC [6363] t/013_crash_restart.pl LOG:  statement: 
> SELECT $$psql-connected$$;
> ...
> 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl WARNING:  
> terminating connection because of crash of another server process
> 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl DETAIL:  The 
> postmaster has commanded this server process to roll back the current 
> transaction and exit, because another server process exited abnormally and 
> possibly corrupted shared memory.
> 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl HINT:  In a moment 
> you should be able to reconnect to the database and repeat your command.

I think it's likely the same - I've observed the same with the added
instrumentation.


> Have we forgotten an fflush() or something?
> 
> Also, maybe problem is on client side.  I vaguely recall a libpq bug
> wherein it would complain about socket EOF even though data remained
> to be processed.  Maybe we reintroduced something like that?

That seems quite possible.


> > We can obviously easily make the test accept both - but are we ok with
> > the client sometimes not getting the message?
> 
> I'm not ...

Same here.

I'll see if I can spot the bug in an hour or two. If not I'll make the
test temporarily accept both outputs while investigating?


Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Speedup pgstat_report_activity by moving mb-aware truncation to

2017-09-19 Thread Andres Freund
On 2017-09-19 16:25:31 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Speedup pgstat_report_activity by moving mb-aware truncation to read side.
> 
> Looks like you're going to need to not depend on strnlen().

Yup noticed - we have pg_strnlen as a static function in
src/port/snprintf. I'm tempted to just make that public, rather than
open code the equivalent?

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.

2017-09-19 Thread Andres Freund
On 2017-09-19 15:24:49 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > Checkining on calliphoridae why that's not sufficient - the machine's
> > busy, so the build & test will take a bit.
> 
> FWIW, prairiedog got through the recovery tests this time --- run's
> still going though.

So this is geniuinely interesting. When the machine is really loaded (as
in 6 animals running on a vm at the same time, incuding valgrind), psql
sometimes doesn't get the WARNING message from a shutdown. Instead it
gets
# psql::3: server closed the connection unexpectedly
#   This probably means the server terminated abnormally
#   before or while processing the request.
# psql::3: connection to server was lost

We can obviously easily make the test accept both - but are we ok with
the client sometimes not getting the message?

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Speedup pgstat_report_activity by moving mb-aware truncation to

2017-09-19 Thread Andres Freund
Speedup pgstat_report_activity by moving mb-aware truncation to read side.

Previously multi-byte aware truncation was done on every
pgstat_report_activity() call - proving to be a bottleneck for
workloads with long query strings that execute quickly.

Instead move the truncation to the read side, which commonly is
executed far less frequently. That's possible because all server
encodings allow to determine the length of a multi-byte string from
the first byte.

Rename PgBackendStatus.st_activity to st_activity_raw so existing
extension users of the field break - their code has to be adjusted to
use pgstat_clip_activity().

Author: Andres Freund
Tested-By: Khuntal Ghosh
Reviewed-By: Robert Haas, Tom Lane
Discussion: 
https://postgr.es/m/20170912071948.pa7igbpkkkvie...@alap3.anarazel.de

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/54b6cd589ac2f5635a42511236a5eb7299e2dcaf

Modified Files
--
src/backend/postmaster/pgstat.c | 63 -
src/backend/utils/adt/pgstatfuncs.c | 17 +++---
src/include/pgstat.h| 12 +--
3 files changed, 72 insertions(+), 20 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.

2017-09-19 Thread Andres Freund
On 2017-09-19 17:40:20 +, Andres Freund wrote:
> Make new crash restart test a bit more robust.
> 
> Add timeouts in case psql doesn't deliver the expected output, and try
> to cause the monitoring psql to be fully connected to a backend.  This
> isn't necessarily everything needed, but at least the timeouts should
> reduce the pain for buildfarm owners.
> 
> Author: Andres Freund
> Reported-By: Tom Lane, BF animals prairiedog and calliphoridae
> Discussion: https://postgr.es/m/e1du6zt-00043i...@gemulon.postgresql.org

Checkining on calliphoridae why that's not sufficient - the machine's
busy, so the build & test will take a bit.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.

2017-09-19 Thread Andres Freund
On 2017-09-19 09:58:26 -0700, Andres Freund wrote:
> 
> 
> On September 19, 2017 9:53:28 AM PDT, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >Well, please fix it ASAP, if you don't want to take it out pending
> >the fixes.
> 
> Will as soon as I finished my morning coffee. Uncaffeinated, which my phone 
> fittingly autocorrects to unvaccinated, commits aren't a good idea.

Done. Survived ~100 cycles here locally, while running make -j16 -s
check-world in two parallel branches. But I have a fast disk, so it's
not comparable.

If the buildfarm doesn't complain about the use of IPC::Run's timeout
functionality, we should probably patch that into the other use of
IPC::Run as well, but especially into the other user of the pump() until
... scheme.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Make new crash restart test a bit more robust.

2017-09-19 Thread Andres Freund
Make new crash restart test a bit more robust.

Add timeouts in case psql doesn't deliver the expected output, and try
to cause the monitoring psql to be fully connected to a backend.  This
isn't necessarily everything needed, but at least the timeouts should
reduce the pain for buildfarm owners.

Author: Andres Freund
Reported-By: Tom Lane, BF animals prairiedog and calliphoridae
Discussion: https://postgr.es/m/e1du6zt-00043i...@gemulon.postgresql.org

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/1910353675bd149e1020b29c0fae02538fc358cd

Modified Files
--
src/test/recovery/t/013_crash_restart.pl | 34 
1 file changed, 21 insertions(+), 13 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.

2017-09-19 Thread Andres Freund


On September 19, 2017 9:53:28 AM PDT, Tom Lane  wrote:
>Well, please fix it ASAP, if you don't want to take it out pending
>the fixes.

Will as soon as I finished my morning coffee. Uncaffeinated, which my phone 
fittingly autocorrects to unvaccinated, commits aren't a good idea.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.

2017-09-19 Thread Andres Freund
Hi,

On 2017-09-19 12:13:54 -0400, Tom Lane wrote:
> IOW, the "$monitor" instance of psql did not complete making its
> connection until after the crash/restart cycle had occurred.

That'd be easy enough to fix...

Just something like

$monitor_stdin .= q[
SELECT $$am-i-up$$;
];
$monitor->pump until $monitor_stdout =~ /am-i-up/;
$monitor_stdout = '';


> So we're just sitting there waiting for a crash report that won't
> come.  Which is another very serious deficiency in this test:
> lacking any sort of timeout, it will just freeze indefinitely
> if anything doesn't happen exactly the way it expects.  From a
> buildfarm owner's standpoint, that's pretty damn unfriendly.
> It means having to manually unwedge your animals from time to time.

Note that I just copied the code for that from another test - this is
isn't unique to this test. I agree that it'd be good to add a timeout to
those pump calls.


> I'd like to ask you to revert this test, at least pending making
> it a whole lot more bulletproof.

Hm. Ok. That seems like an overreaction to me - the failure rate isn't
actualy that high so far.  I'm happy to add both timeouts and "earlier
startup" of the $monitor, but I'd prefer to do so in-tree - I'd run the
test through 100+ iterations locally, without any of this showing up.


> We don't really need crash recovery testing in the buildfarm IMO ---
> we hackers crash the system plenty often enough to notice problems
> there.

I for one don't exercise that kind of crash restarts, my development
scripts all work with restart_after_crash = false. What I find more
concerning however is coverage of EXEC_BACKEND, which has far fewer
developers actively running it constantly.

Greetings,

Andres Freund


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.

2017-09-18 Thread Andres Freund


On September 18, 2017 8:55:35 PM PDT, Tom Lane <t...@sss.pgh.pa.us> wrote:
>Andres Freund <and...@anarazel.de> writes:
>> Add test for postmaster crash restarts.
>
>Hm, calliphoridae doesn't like this.

Yea. Not clear to me why yet. The machine ran a number of instances with nearly 
the same config successfully. Can't imagine that copyparse makes a difference 
here.  I suspect it's somehow load related... Ran a good number of iterations 
locally, didn't reproduce, even under high load.  Think I'll add bit more error 
reporting.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Rearm statement_timeout after each executed query.

2017-09-18 Thread Andres Freund
Rearm statement_timeout after each executed query.

Previously statement_timeout, in the extended protocol, affected all
messages till a Sync message.  For clients that pipeline/batch query
execution that's problematic.

Instead disable timeout after each Execute message, and enable, if
necessary, the timer in start_xact_command(). As that's done only for
Execute and not Parse / Bind, pipelining the latter two could still
cause undesirable timeouts. But a survey of protocol implementations
shows that all drivers issue Sync messages when preparing, and adding
timeout rearming to both is fairly expensive for the common parse /
bind / execute sequence.

Author: Tatsuo Ishii, editorialized by Andres Freund
Reviewed-By: Takayuki Tsunakawa, Andres Freund
Discussion: 
https://postgr.es/m/20170222.115044.1665674502985097185.t-is...@sraoss.co.jp

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/f8e5f156b30efee5d0038b03e38735773abcb7ed

Modified Files
--
src/backend/tcop/postgres.c | 77 ++---
1 file changed, 65 insertions(+), 12 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix uninitialized variable in dshash.c.

2017-09-18 Thread Andres Freund
Fix uninitialized variable in dshash.c.

A bugfix for commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20.  The code
would have crashed if hashtable->size_log2 ever had the same value as
hashtable->control->size_log2 by coincidence.

Per Valgrind.

Author: Thomas Munro
Reported-By: Tomas Vondra
Discussion: 
https://postgr.es/m/e72fb33c-4f31-f276-e972-263d9b59554d%402ndquadrant.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/0fb9e4ace5ce4d479d839a720f32b99fdc87f455

Modified Files
--
src/backend/lib/dshash.c | 9 +
1 file changed, 9 insertions(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix crash restart bug introduced in 8356753c212.

2017-09-18 Thread Andres Freund
Fix crash restart bug introduced in 8356753c212.

The bug was caused by not re-reading the control file during crash
recovery restarts, which lead to an attempt to pfree() shared memory
contents. The fix is to re-read the control file, which seems good
anyway.

It's unclear as of this moment, whether we want to keep the
refactoring introduced in the commit referenced above, or come up with
an alternative approach. But fixing the bug in the mean time seems
like a good idea regardless.

A followup commit will introduce regression test coverage for crash
restarts.

Reported-By: Tom Lane
Discussion: https://postgr.es/m/14134.1505572...@sss.pgh.pa.us

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/ec9e05b3c392ba9587f283507459737684539574

Modified Files
--
src/backend/access/transam/xlog.c   | 44 +++--
src/backend/postmaster/postmaster.c | 13 ---
src/backend/tcop/postgres.c |  2 +-
src/include/access/xlog.h   |  2 +-
4 files changed, 39 insertions(+), 22 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


  1   2   3   4   5   6   7   8   9   10   >