Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-09 16:45:07 -0800, Peter Geoghegan wrote: > On Thu, Nov 9, 2017 at 4:17 PM, Andres Freund <and...@anarazel.de> wrote: > >> Actually, on second thought, I take that back -- I don't think that > >> REINDEXing will even finish once a HOT chain is broken by the bug. > >> IndexBuildHeapScan() actually does quite a good job of making sure > >> that HOT chains are sane, which is how the enhanced amcheck notices > >> the bug here in practice. > > > > I think that's too optimistic. > > Why? Because the "find the TID of the root" logic in > IndexBuildHeapScan()/heap_get_root_tuples() won't reliably find the > actual root (it might be some other HOT chain root following TID > recycling by VACUUM)? Primarily because it's not an anti-corruption tool. I'd be surprised if there weren't ways to corrupt the page using these corruptions that aren't detected by it. But even if it were, I don't think there's enough information to do so in the general case. You very well can end up with pages where subsequent hot pruning has removed a good bit of the direct evidence of this bug. But I'm not really sure why the error detection capabilities of matter much for the principal point I raised, which is how much work we need to do to not further worsen the corruption. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-09 16:02:17 -0800, Peter Geoghegan wrote: > > What I'm currently wondering about is how much we need to harden > > postgres against such existing corruption. If e.g. the hot chains are > > broken somebody might have reindexed thinking the problem is fixed - but > > if they then later vacuum everything goes to shit again, with dead rows > > reappearing. > > I don't follow you here. Why would REINDEXing make the rows that > should be dead disappear again, even for a short period of time? It's not the REINDEX that makes them reappear. It's the second vacuum. The reindex part was about $user trying to fix the problem... As you need two vacuums with appropriate cutoffs to hit the "rows revive" problem, that'll often in practice not happen immediately. > Actually, on second thought, I take that back -- I don't think that > REINDEXing will even finish once a HOT chain is broken by the bug. > IndexBuildHeapScan() actually does quite a good job of making sure > that HOT chains are sane, which is how the enhanced amcheck notices > the bug here in practice. I think that's too optimistic. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-04 06:15:00 -0700, Andres Freund wrote: > The reason for that is that I hadn't yet quite figured out how the bug I > described in the commit message (and the previously committed testcase) > would cause that. I was planning to diagnose / experiment more with this > and write an email if I couldn't come up with an explanation. The > committed test does *not* actually trigger that. > > The reason I couldn't quite figure out how the problem triggers is that > [ long explanation ] Attached is a version of the already existing regression test that both reproduces the broken hot chain (and thus failing index lookups) and then also the tuple reviving. I don't see any need for letting this run with arbitrary permutations. Thanks to whoever allowed isolationtester permutations to go over multiple lines and allow comments. I was wondering about adding that as a feature just to discover it's already there ;) What I'm currently wondering about is how much we need to harden postgres against such existing corruption. If e.g. the hot chains are broken somebody might have reindexed thinking the problem is fixed - but if they then later vacuum everything goes to shit again, with dead rows reappearing. There's no way we can fix hot chains after the fact, but preventing dead rows from reapparing seems important. A minimal version of that is fairly easy - we slap a bunch of if if !TransactionIdDidCommit() elog(ERROR) at various code paths. But that'll often trigger clog access errors when the problem occurred - if we want to do better we need to pass down relfrozenxid/relminmxid to a few functions. I'm inclined to do so, but it'll make the patch larger... Comments? Greetings, Andres Freund # Test for interactions of tuple freezing with dead, as well as recently-dead # tuples using multixacts via FOR KEY SHARE. setup { DROP TABLE IF EXISTS tab_freeze; CREATE TABLE tab_freeze ( id int PRIMARY KEY, name char(3), x int); INSERT INTO tab_freeze VALUES (1, '111', 0); INSERT INTO tab_freeze VALUES (3, '333', 0); } teardown { DROP TABLE tab_freeze; } session "s1" step "s1_begin" { BEGIN; } step "s1_update"{ UPDATE tab_freeze SET x = x + 1 WHERE id = 3; } step "s1_commit"{ COMMIT; } step "s1_vacuum"{ VACUUM FREEZE tab_freeze; } step "s1_selectone" { BEGIN; SET LOCAL enable_seqscan = false; SET LOCAL enable_bitmapscan = false; SELECT * FROM tab_freeze WHERE id = 3; COMMIT; } step "s1_selectall" { SELECT * FROM tab_freeze ORDER BY name, id; } session "s2" step "s2_begin" { BEGIN; } step "s2_key_share" { SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE; } step "s2_commit"{ COMMIT; } step "s2_vacuum"{ VACUUM FREEZE tab_freeze; } session "s3" step "s3_begin" { BEGIN; } step "s3_key_share" { SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE; } step "s3_commit"{ COMMIT; } step "s3_vacuum"{ VACUUM FREEZE tab_freeze; } # This permutation verfies that a previous bug # https://postgr.es/m/e5711e62-8fdf-4dca-a888-c200bf6b5...@amazon.com # https://postgr.es/m/20171102112019.33wb7g5wp4zpj...@alap3.anarazel.de # is not reintroduced. We used to make wrong pruning / freezing # decision for multixacts, which could lead to a) broken hot chains b) # dead rows being revived. permutation "s1_begin" "s2_begin" "s3_begin" # start transactions "s1_update" "s2_key_share" "s3_key_share" # have xmax be a multi with an updater, updater being oldest xid "s1_update" # create additional row version that has multis "s1_commit" "s2_commit" # commit both updater and share locker "s2_vacuum" # due to bug in freezing logic, we used to *not* prune updated row, and then froze it "s1_selectone" # if hot chain is broken, the row can't be found via index scan "s3_commit" # commit remaining open xact "s2_vacuum" # pruning / freezing in broken hot chains would unset xmax, reviving rows "s1_selectall" # show borkedness -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Remove secondary checkpoint
On 2017-11-07 14:12:19 -0500, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > On 2017-11-07 17:57:31 +, Simon Riggs wrote: > >> Remove secondary checkpoint > > > FWIW, I don't think this should be applied without a pg_resetxlog > > feature allowing to manually use older checkpoints. > > Uh, what? We have never before insisted on pg_resetxlog being able > to use WAL across a WAL format change, and there have been many of > those. I think you misunderstand my point - I'm saying that pg_resetxlog should be able to force the use of older checkpoints, basically as a fallback to cases where the previous approach might actually have worked, not that it needs to work across format changes. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Remove secondary checkpoint
On 2017-11-07 17:57:31 +, Simon Riggs wrote: > Remove secondary checkpoint > > Previously server reserved WAL for last two checkpoints, > which used too much disk space for small servers. > > Bumps PG_CONTROL_VERSION FWIW, I don't think this should be applied without a pg_resetxlog feature allowing to manually use older checkpoints. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-04 06:15:00 -0700, Andres Freund wrote: > The current testcase, and I think the descriptions in the relevant > threads, all actually fail to point out the precise way the bug is > triggered. As e.g. evidenced that the freeze-the-dead case appears to > not cause any failures in !assertion builds even if the bug is present. Trying to write up tests reproducing more of the issues in the area, I think I might have found a third issue - although I'm not sure how practically relevant it is: When FreezeMultiXactId() decides it needs to create a new multi because the old one is below the cutoff, that attempt can be defeated by the multixact id caching. If the new multi has exactly the same members the multixact id cache will just return the existing multi with the same members. The cache will routinely be primed from the lookup of its members. I'm not yet sure how easily this can be hit in practice, because commonly the multixact horizon should prevent a multi with all its members living from being below the horizon. I found a situation where that's not the case with the current bug, but I'm not sif that can happen otherwise. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-03 12:36:59 -0700, Peter Geoghegan wrote: > Andres Freund <and...@anarazel.de> wrote: > > Here's that patch. I've stared at this some, and Robert did too. Robert > > mentioned that the commit message might need some polish and I'm not > > 100% sure about the error message texts yet. > > The commit message should probably say that the bug involves the > resurrection of previously dead tuples, which is different to there > being duplicates because a constraint is not enforced because HOT chains > are broken (that's a separate, arguably less serious problem). The reason for that is that I hadn't yet quite figured out how the bug I described in the commit message (and the previously committed testcase) would cause that. I was planning to diagnose / experiment more with this and write an email if I couldn't come up with an explanation. The committed test does *not* actually trigger that. The reason I couldn't quite figure out how the problem triggers is that the xmax removing branch in FreezeMultiXactId() can only be reached if the multi is from before the cutoff - which it can't have been for a single vacuum execution to trigger the bug, because that'd require the multi to be running to falsely return RECENTLY_DEAD rather than DEAD (by definition a multi can't be below the cutoff if running). For the problem to occur I think vacuum has to be executed *twice*: The first time through HTSV mistakenly returns RECENTLY_DEAD preventing the tuple from being pruned. That triggers FreezeMultiXactId() to create a *new* multi with dead members. At this point the database already is in a bad state. Then in a second vacuum HTSV returns DEAD, but * Ordinarily, DEAD tuples would have been removed by * heap_page_prune(), but it's possible that the tuple * state changed since heap_page_prune() looked. In * particular an INSERT_IN_PROGRESS tuple could have * changed to DEAD if the inserter aborted. So this * cannot be considered an error condition. * .. if (HeapTupleIsHotUpdated() || HeapTupleIsHeapOnly()) { nkeep += 1; prevents the tuple from being removed. If now the multi xmax is below the xmin horizon it triggers /* * If the xid is older than the cutoff, it has to have aborted, * otherwise the tuple would have gotten pruned away. */ if (TransactionIdPrecedes(xid, cutoff_xid)) { if (TransactionIdDidCommit(xid)) elog(ERROR, "can't freeze committed xmax"); *flags |= FRM_INVALIDATE_XMAX; in FreezeMultiXact. Without the new elog, this then causes xmax to be removed, reviving the tuple. The current testcase, and I think the descriptions in the relevant threads, all actually fail to point out the precise way the bug is triggered. As e.g. evidenced that the freeze-the-dead case appears to not cause any failures in !assertion builds even if the bug is present. The good news is that the error checks I added in the patch upthread prevent all of this from happening, even though I'd not yet understood the mechanics fully - it's imnsho pretty clear that we need to be more paranoid in production builds around this. A bunch of users that triggered largely "invisible" corruption (the first vacuum described above) will possibly run into one of these elog()s, but that seems far preferrable to making the corruption a lot worse. I think unfortunately the check + elog() in the HeapTupleIsHeapOnly()) { nkeep += 1; /* * If this were to happen for a tuple that actually * needed to be frozen, we'd be in trouble, because * it'd leave a tuple below the relation's xmin * horizon alive. */ if (heap_tuple_needs_freeze(tuple.t_data, FreezeLimit,
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On November 4, 2017 1:22:04 AM GMT+05:30, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote: >Peter Geoghegan wrote: >> Andres Freund <and...@anarazel.de> wrote: > >> > Staring at the vacuumlazy hunk I think I might have found a related >bug: >> > heap_update_tuple() just copies the old xmax to the new tuple's >xmax if >> > a multixact and still running. It does so without verifying >liveliness >> > of members. Isn't that buggy? Consider what happens if we have >three >> > blocks: 1 has free space, two is being vacuumed and is locked, >three is >> > full and has a tuple that's key share locked by a live tuple and is >> > updated by a dead xmax from before the xmin horizon. In that case >afaict >> > the multi will be copied from the third page to the first one. >Which is >> > quite bad, because vacuum already processed it, and we'll set >> > relfrozenxid accordingly. I hope I'm missing something here? >> >> Can you be more specific about what you mean here? I think that I >> understand where you're going with this, but I'm not sure. > >He means that the tuple that heap_update moves to page 1 (which will no >longer be processed by vacuum) will contain a multixact that's older >than relminmxid -- because it is copied unchanged by heap_update >instead >of properly checking against age limit. Right. That, or a member xid below relminxid. I think both scenarios are possible. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-02 06:05:51 -0700, Andres Freund wrote: > Hi, > > On 2017-11-02 13:49:47 +0100, Alvaro Herrera wrote: > > Andres Freund wrote: > > > I think the problem is on the pruning, rather than the freezing side. We > > > can't freeze a tuple if it has an alive predecessor - rather than > > > weakining this, we should be fixing the pruning to not have the alive > > > predecessor. > > > > I gave a look at HTSV back then, but I didn't find what the right tweak > > was, but then I only tried changing the return value to DEAD and > > DELETE_IN_PROGRESS; the thought of selecting DEAD or RECENTLY_DEAD based > > on OldestXmin didn't occur to me ... I was thinking that the fact that > > there were live lockers meant that the tuple could not be removed, > > obviously failing to notice that the subsequent versions of the tuple > > would be good enough. > > I'll try to write up a commit based on that idea. I think there's some > comment work needed too, Robert and I were both confused by a few > things. > I'm unfortunately travelling atm - it's evening here, and I'll flying > back to the US all Saturday. I'm fairly sure I'll be able to come up > with a decent patch tomorrow, but I'll need review and testing by > others. Here's that patch. I've stared at this some, and Robert did too. Robert mentioned that the commit message might need some polish and I'm not 100% sure about the error message texts yet. I'm not yet convinced that the new elog in vacuumlazy can never trigger - but I also don't think we want to actually freeze the tuple in that case. Staring at the vacuumlazy hunk I think I might have found a related bug: heap_update_tuple() just copies the old xmax to the new tuple's xmax if a multixact and still running. It does so without verifying liveliness of members. Isn't that buggy? Consider what happens if we have three blocks: 1 has free space, two is being vacuumed and is locked, three is full and has a tuple that's key share locked by a live tuple and is updated by a dead xmax from before the xmin horizon. In that case afaict the multi will be copied from the third page to the first one. Which is quite bad, because vacuum already processed it, and we'll set relfrozenxid accordingly. I hope I'm missing something here? Greetings, Andres Freund >From 774376dc8f7ed76657660e6beb0f5191d1bdaae4 Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Fri, 3 Nov 2017 07:52:29 -0700 Subject: [PATCH] Fix pruning of locked and updated tuples. Previously it was possible that a tuple was not pruned during vacuum, even though it's update xmax (i.e. the updating xid in a multixact with both key share lockers and an updater) was below the cutoff horizon. As the freezing code assumed, rightly so, that that's not supposed to happen, xmax would be preserved (as a member of a new multixact or xmax directly). That causes two problems: For one the tuple is below the xmin horizon, which can cause problems if the clog is truncated or once there's an xid wraparound. The bigger problem is that that will break HOT chains, leading to index lookups failing, which in turn can lead to constraints being violated. Fix the problem by recognizing that tuples can be DEAD instead of RECENTLY_DEAD, even if the multixactid has alive members, if the update_xid is below the xmin horizon. That's safe because newer versions of the tuple will contain the locking xids. As the wrong freezing of xmax can cause data corruption, extend sanity checks and promote them to elogs rather than assertions. Per-Discussion: Robert Haas and Freund Reported-By: Daniel Wood Discussion: https://postgr.es/m/e5711e62-8fdf-4dca-a888-c200bf6b5...@amazon.com https://postgr.es/m/20171102112019.33wb7g5wp4zpj...@alap3.anarazel.de --- src/backend/access/heap/heapam.c | 34 -- src/backend/commands/vacuumlazy.c | 13 + src/backend/utils/time/tqual.c| 53 +++ src/test/isolation/isolation_schedule | 1 + 4 files changed, 62 insertions(+), 39 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 765750b8743..9b963e0cd0d 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -6412,7 +6412,8 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask, */ if (TransactionIdPrecedes(xid, cutoff_xid)) { -Assert(!TransactionIdDidCommit(xid)); +if (TransactionIdDidCommit(xid)) + elog(ERROR, "can't freeze committed xmax"); *flags |= FRM_INVALIDATE_XMAX; xid = InvalidTransactionId; /* not strictly necessary */ } @@ -6484,6 +6485,8 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask, { TransactionId xid = members[i].xid; + Assert(TransactionIdIsValid(xid)); + /* * It's an upda
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Hi, On 2017-11-02 13:49:47 +0100, Alvaro Herrera wrote: > Andres Freund wrote: > > I think the problem is on the pruning, rather than the freezing side. We > > can't freeze a tuple if it has an alive predecessor - rather than > > weakining this, we should be fixing the pruning to not have the alive > > predecessor. > > I gave a look at HTSV back then, but I didn't find what the right tweak > was, but then I only tried changing the return value to DEAD and > DELETE_IN_PROGRESS; the thought of selecting DEAD or RECENTLY_DEAD based > on OldestXmin didn't occur to me ... I was thinking that the fact that > there were live lockers meant that the tuple could not be removed, > obviously failing to notice that the subsequent versions of the tuple > would be good enough. I'll try to write up a commit based on that idea. I think there's some comment work needed too, Robert and I were both confused by a few things. I'm unfortunately travelling atm - it's evening here, and I'll flying back to the US all Saturday. I'm fairly sure I'll be able to come up with a decent patch tomorrow, but I'll need review and testing by others. > > If the update xmin is actually below the cutoff we can remove the tuple > > even if there's live lockers - the lockers will also be present in the > > newer version of the tuple. I verified that for me that fixes the > > problem. Obviously that'd require some comment work and more careful > > diagnosis. > > Sounds good. > > I'll revert those commits then, keeping the test, and then you can > commit your change. OK? Generally that sounds good - but you can't keep the testcase in without the new fix - the buildfarm would immediately turn red. I guess the best thing would be to temporarily remove it from the schedule? Do we care about people upgrading to unreleased versions? We could do nothing, document it in the release notes, or ??? Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Hi, On 2017-09-28 14:47:53 +, Alvaro Herrera wrote: > Fix freezing of a dead HOT-updated tuple > > Vacuum calls page-level HOT prune to remove dead HOT tuples before doing > liveness checks (HeapTupleSatisfiesVacuum) on the remaining tuples. But > concurrent transaction commit/abort may turn DEAD some of the HOT tuples > that survived the prune, before HeapTupleSatisfiesVacuum tests them. > This happens to activate the code that decides to freeze the tuple ... > which resuscitates it, duplicating data. > > (This is especially bad if there's any unique constraints, because those > are now internally violated due to the duplicate entries, though you > won't know until you try to REINDEX or dump/restore the table.) > > One possible fix would be to simply skip doing anything to the tuple, > and hope that the next HOT prune would remove it. But there is a > problem: if the tuple is older than freeze horizon, this would leave an > unfrozen XID behind, and if no HOT prune happens to clean it up before > the containing pg_clog segment is truncated away, it'd later cause an > error when the XID is looked up. > > Fix the problem by having the tuple freezing routines cope with the > situation: don't freeze the tuple (and keep it dead). In the cases that > the XID is older than the freeze age, set the HEAP_XMAX_COMMITTED flag > so that there is no need to look up the XID in pg_clog later on. I think this is the wrong fix - the assumption that ctid chains can be validated based on the prev-xmax = cur-xmin is fairly ingrained into the system, and we shouldn't just be breaking it. The need to later lobotomize the checks, in a5736bf754, is some evidence of that. I spent some time discussing this with Robert today (with both of us alternating between feeling the other and ourselves as stupid), and the conclusion I think is that the problem is on the pruning, rather than the freezing side. FWIW, I don't think the explanation in the commit message of how the problem triggers is actually correct - the testcase you added doesn't have any transactions concurrently committing / aborting when crashing. Rather the problem is that the liveliness checks for freezing is different from the ones for pruning. HTSV considers xmin RECENTLY_DEAD when there's a multi xmax with at least one alive locker, whereas pruning thinks it has to do something because there's the member xid below the cutoff. No concurrent activity is needed to trigger that. I think the problem is on the pruning, rather than the freezing side. We can't freeze a tuple if it has an alive predecessor - rather than weakining this, we should be fixing the pruning to not have the alive predecessor. The relevant logic in HTSV is: if (tuple->t_infomask & HEAP_XMAX_IS_MULTI) { TransactionId xmax; if (MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false)) { /* already checked above */ Assert(!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)); xmax = HeapTupleGetUpdateXid(tuple); /* not LOCKED_ONLY, so it has to have an xmax */ Assert(TransactionIdIsValid(xmax)); if (TransactionIdIsInProgress(xmax)) return HEAPTUPLE_DELETE_IN_PROGRESS; else if (TransactionIdDidCommit(xmax)) /* there are still lockers around -- can't return DEAD here */ return HEAPTUPLE_RECENTLY_DEAD; /* updating transaction aborted */ return HEAPTUPLE_LIVE; with the problematic branch being the TransactionIdDidCommit() case. Rather than unconditionally returning HEAPTUPLE_RECENTLY_DEAD, we should test the update xid against OldestXmin and return DEAD / RECENTLY_DEAD according to that. If the update xmin is actually below the cutoff we can remove the tuple even if there's live lockers - the lockers will also be present in the newer version of the tuple. I verified that for me that fixes the problem. Obviously that'd require some comment work and more careful diagnosis. I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in the face of previously corrupted tuple chains and pg_upgraded clusters - it can lead to tuples being considered related, even though they they're from entirely independent hot chains. Especially when upgrading 9.3 post your fix, to current releases. In short, I think the two commits should be reverted, and replaced with a fix along what I'm outlining above. There'll be some trouble for people that upgraded to an unreleased version, but I don't really see what we could do about that. I could be entirely wrong - I've been travelling for the last two weeks and my brain is somewhat more fried than usual. Regards, Andres -- Sent via pgsql-committers mailing list
Re: [COMMITTERS] pgsql: Add pg_noinline macro to c.h.
On 2017-10-13 18:32:08 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > Add pg_noinline macro to c.h. > > I think you want this to be parenthesized so that pgindent doesn't > go nuts when you use it. At least, that was the reason why > pg_attribute_noreturn() has parens. Even if the current version > of pgindent doesn't have that problem The current pgindent actually generates *better* layout with pg_noinline rather than pg_noinline(). The latter forces a linebreak both in declarations and definitions like: static pg_noinline() HeapTuple SearchCatCacheMiss(CatCache *cache, static pg_noinline() HeapTuple SearchCatCacheMiss(CatCache *cache, One difference to pg_attribute_noreturn() is that for function definitions - and noinline needs to be present there IME - you can't put it after the parameter list. > I would argue that consistency demands that you spell this > pg_attribute_noinline(). Hm, I personally find inline vs pg_noinline more aesthetically pleasing. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Improve sys/catcache performance.
Improve sys/catcache performance. The following are the individual improvements: 1) Avoidance of FunctionCallInfo based function calls, replaced by more efficient functions with a native C argument interface. 2) Don't extract columns from a cache entry's tuple whenever matching entries - instead store them as a Datum array. This also allows to get rid of having to build dummy tuples for negative & list entries, and of a hack for dealing with cstring vs. text weirdness. 3) Reorder members of catcache.h struct, so imortant entries are more likely to be on one cacheline. 4) Allowing the compiler to specialize critical SearchCatCache for a specific number of attributes allows to unroll loops and avoid other nkeys dependant initialization. 5) Only initializing the ScanKey when necessary, i.e. catcache misses, greatly reduces cache unnecessary cpu cache misses. 6) Split of the cache-miss case from the hash lookup, reducing stack allocations etc in the common case. 7) CatCTup and their corresponding heaptuple are allocated in one piece. This results in making cache lookups themselves roughly three times as fast - full-system benchmarks obviously improve less than that. I've also evaluated further techniques: - replace open coded hash with simplehash - the list walk right now shows up in profiles. Unfortunately it's not easy to do so safely as an entry's memory location can change at various times, which doesn't work well with the refcounting and cache invalidation. - Cacheline-aligning CatCTup entries - helps some with performance, but the win isn't big and the code for it is ugly, because the tuples have to be freed as well. - add more proper functions, rather than macros for SearchSysCacheCopyN etc., but right now they don't show up in profiles. The reason the macro wrapper for syscache.c/h have to be changed, rather than just catcache, is that doing otherwise would require exposing the SysCache array to the outside. That might be a good idea anyway, but it's for another day. Author: Andres Freund Reviewed-By: Robert Haas Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7l...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/141fd1b66ce6e3d10518d66d4008bd368f1505fd Modified Files -- src/backend/utils/cache/catcache.c | 692 + src/backend/utils/cache/syscache.c | 49 ++- src/include/utils/catcache.h | 122 --- src/include/utils/syscache.h | 23 +- src/tools/pgindent/typedefs.list | 2 + 5 files changed, 608 insertions(+), 280 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Add pg_noinline macro to c.h.
Add pg_noinline macro to c.h. Forcing a function not to be inlined can be useful if it's the slow-path of a performance critical function, or should be visible in profiles to allow for proper cost attribution. Author: Andres Freund Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7l...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/a0247e7a11bb9f5fd55694b594a3906b7bd05881 Modified Files -- src/include/c.h | 16 1 file changed, 16 insertions(+) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Force "restrict" not to be used when compiling with xlc.
Force "restrict" not to be used when compiling with xlc. Per buildfarm animal Hornet and followup manual testing by Noah Misch, it appears xlc miscompiles code using "restrict" in at least some cases. Allow disabling restrict usage with FORCE_DISABLE_RESTRICT=yes in template files, and do so for aix/xlc. Author: Andres Freund and Tom Lane Discussion: https://postgr.es/m/1820.1507918...@sss.pgh.pa.us Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/d133982d598c7e6208d16cb4fc0b552151796603 Modified Files -- configure| 6 +- configure.in | 6 +- src/template/aix | 4 3 files changed, 14 insertions(+), 2 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [HACKERS] [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.
On 2017-10-13 14:19:22 -0400, Tom Lane wrote: > > So it'd probably better to introduce a FORCE_DISABLE_RESTRICT=yes, set > > at the same place, that's then tested before running the relevant > > configure check? > > +1. I think you don't actually have to skip the configure check, > and there might be some value in letting it carry on normally > (so that "restrict" is set properly). We'd just want it to affect > what pg_restrict gets defined as. Something like > if test "$ac_cv_c_restrict" = "no" -o "x$FORCE_DISABLE_RESTRICT" = "xyes"; > then > pg_restrict="" > else > pg_restrict="$ac_cv_c_restrict" Yea, that works. Will make it so. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [HACKERS] [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.
On 2017-10-13 13:48:07 -0400, Tom Lane wrote: > I wrote: > > Anyway, I will go make the sizeof() usages consistent, just to satisfy > > my own uncontrollable neatnik-ism. Assuming that hornet stays red, > > which is probable, we should disable "restrict" for that compiler. > > As expected, that didn't fix it. Andres, are you going to put in > a disable? Do we know exactly what we want to test for that? A easiest way to do this would be to put something like CPPFLAGS="$CPPFLAGS -Dpg_restrict" into the existing if test "$GCC" != yes ; then block in a/src/template/linux. But that'd probably result in "macro redefined" warnings or somesuch. So it'd probably better to introduce a FORCE_DISABLE_RESTRICT=yes, set at the same place, that's then tested before running the relevant configure check? Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.
Hi, On 2017-10-13 00:02:47 -0700, Noah Misch wrote: > I hacked psql to call PQtrace() and ran "psql -Xc 'select true'" in the > defective configuration and in a working x64 GNU/Linux configuration. I've > attached both PQtrace() products. Thanks! > To backend> Msg Q > To backend> "select true" > To backend> Msg complete, length 17 > From backend> T > From backend (#4)> 17 > From backend (#2)> 1 > From backend> "bool" > From backend (#4)> 1 > From backend (#2)> 0 > From backend (#4)> 1140850688 > From backend (#2)> 2816 > From backend (#4)> 16777216 > From backend (#2)> 372 > From backend> D > From backend (#4)> 11 > From backend> C > From backend (#4)> 13 > From backend> "SELECT 1" > From backend> Z > From backend (#4)> 5 > From backend> Z > From backend (#4)> 5 > From backend> I > To backend> Msg X > To backend> Msg complete, length 5 > To backend> Msg Q > To backend> "select true" > To backend> Msg complete, length 17 > From backend> T > From backend (#4)> 29 > From backend (#2)> 1 > From backend> "bool" > From backend (#4)> 0 > From backend (#2)> 0 > From backend (#4)> 16 > From backend (#2)> 1 > From backend (#4)> -1 > From backend (#2)> 0 > From backend> D > From backend (#4)> 11 > From backend (#2)> 1 > From backend (#4)> 1 > From backend (1)> t > From backend> C > From backend (#4)> 13 > From backend> "SELECT 1" > From backend> Z > From backend (#4)> 5 > From backend> Z > From backend (#4)> 5 > From backend> I > To backend> Msg X > To backend> Msg complete, length 5 That's certainly quite weird. I can't immediately pinpoint where the bug is. I initially thought that the StringInfo's lengths might be wrong, but that doesn'tqutie seem to make sense. Looks a bit like there's just garbless mess in there... Will have another look when I don't have to force my eyes to stay open. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.
On 2017-10-12 19:35:36 -0700, Noah Misch wrote: > On Thu, Oct 12, 2017 at 04:08:44PM -0700, Andres Freund wrote: > > So we've two animals (hornet, sungazer) that are: > > #define SIZEOF_SIZE_T 8 > > #define WORDS_BIGENDIAN 1 > > #define restrict __restrict > > > > one compiled with xlc that fails and one with gcc that succeeds. I'm > > hesitant to reach for that, but I wonder if there's a compiler > > bug. Alternatively there could be some undefined behaviour here that > > only triggers on xlc 64bit, but I'm not quite seeing it. > > > > Noah, any chance you could force restrict to off on that animal? > > I can confirm it allows "make check" to pass. Specifically, I did this > against commit 91d5f1a: > > --- src/include/pg_config.h~2017-10-12 18:11:33.0 -0700 > +++ src/include/pg_config.h 2017-10-12 18:22:34.0 -0700 > @@ -929 +929 @@ > - #define pg_restrict __restrict > + #define pg_restrict > @@ -934 +934 @@ > - #define restrict __restrict > + #define restrict > > I have no reason to believe this is specific to hornet's installation, so I > recommend against altering hornet's configuration. It's too likely that the > next xlc user will need to do the same thing. Yea, I wasn't trying to propose that - I just thought it'd be easier to narrow down with access to the machine than 6h cycle buildfarm debugging. > > Otherwise I can push a platform fix that disables it. > > This sounds reasonable. I'm not getting a great vibe about the aix/xlc quality in this thread :( Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.
On 2017-10-12 20:06:32 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > >> fe-connect.c:2382:6: warning: implicit declaration of function > >> 'getpeereid' [-Wimplicit-function-declaration] > >> Looks like we're missing > >> #include > > > Hm, it got removed as part of > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9e3755ecb2d058f7d123dd35a2e1784006190962 > > but that's not an explanation, because > > Nope, because that's quite old. Right. I'd mentioned that it's *not* that commit, even though it initially looked suspicious. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.
On 2017-10-12 16:08:44 -0700, Andres Freund wrote: > wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes > -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv > -fexcess-precision=standard -g -O2 -I../../../src/include-c -o auth.o > auth.c > auth.c: In function 'auth_peer': > auth.c:2002:2: warning: implicit declaration of function 'getpeereid' > [-Wimplicit-function-declaration] > if (getpeereid(port->sock, , ) != 0) > ^ > wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes > -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv > -fexcess-precision=standard -g -O2 -D_REENTRANT -D_THREAD_SAFE > -D_POSIX_PTHREAD_SEMANTICS -DFRONTEND -DUNSAFE_STAT_OK -I. > -I../../../src/include -I../../../src/port -I../../../src/port > -DSO_MAJOR_VERSION=5 -c -o fe-connect.o fe-connect.c > fe-connect.c: In function 'PQconnectPoll': > fe-connect.c:2382:6: warning: implicit declaration of function 'getpeereid' > [-Wimplicit-function-declaration] > if (getpeereid(conn->sock, , ) != 0) > ^ > > Looks like we're missing > #include Hm, it got removed as part of http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9e3755ecb2d058f7d123dd35a2e1784006190962 but that's not an explanation, because c.h includes sys/types.h. Which according to IBM's docs https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.basetrf1/getpeereid.htm is the right thing to include. Given that xlc doesn't complain, I'll just assume this is some issue with the headers gcc uses on aix, but I'm far from confident. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Use C99 restrict via pg_restrict, rather than restrict directly.
Use C99 restrict via pg_restrict, rather than restrict directly. Unfortunately using 'restrict' plainly causes problems with MSVC, which supports restrict only as '__restrict'. Defining 'restrict' to '__restrict' unfortunately causes a conflict with MSVC's usage of __declspec(restrict) in headers. Therefore define pg_restrict to the appropriate keyword instead, and replace existing usages. This replaces the temporary workaround introduced in 36b4b91ba078. Author: Andres Freund Discussion: https://postgr.es/m/2656.1507830...@sss.pgh.pa.us Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/91d5f1a4a3e8aea2a6488243bac55806160408fb Modified Files -- configure | 107 -- configure.in | 15 +- src/include/libpq/pqformat.h | 24 +- src/include/pg_config.h.in| 4 ++ src/include/pg_config.h.win32 | 22 - 5 files changed, 101 insertions(+), 71 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.
On 2017-10-12 10:44:58 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > Improve performance of SendRowDescriptionMessage. > > One or another of these patches has broken buildfarm member hornet. > Apparently, it's transmitting incorrectly-formatted RowDescription > messages. This is curious. Buildfarm animal hornet has failed. It's ppc64 compiled with xlc 12.1, 64 bit: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet=2017-10-12%2022%3A14%3A41 Buildfarm animal mandrill succeeded. It's ppc64 compiled with xlc 12.1, 32 bit: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill=2017-10-12%2018%3A35%3A47 Buildfarm animal sungazer succeeded. It's ppc64 compiled with gcc 4.8, 64 bit: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer=2017-10-12%2008%3A21%3A29 All of these are up to at least 31079a4. So we've two animals (hornet, sungazer) that are: #define SIZEOF_SIZE_T 8 #define WORDS_BIGENDIAN 1 #define restrict __restrict one compiled with xlc that fails and one with gcc that succeeds. I'm hesitant to reach for that, but I wonder if there's a compiler bug. Alternatively there could be some undefined behaviour here that only triggers on xlc 64bit, but I'm not quite seeing it. Noah, any chance you could force restrict to off on that animal? Otherwise I can push a platform fix that disables it. Entirely independent of this, both machines report some interesting warnings: Hornet: xlc_r -D_LARGE_FILES=1 -qnoansialias -g -O2 -qmaxmem=16384 -qsrcmsg -L../../src/port -L../../src/common -Wl,-blibpath:'/home/nm/farm/xlc64/HEAD/inst/lib:/usr/lib:/lib' -L../../src/port -lpgport -Wl,-bnoentry -Wl,-H512 -Wl,-bM:SRE -o timetravel.so timetravel.o -Wl,-bE:timetravel.exp -Wl,-bI:../../src/backend/postgres.imp ld: 0711-224 WARNING: Duplicate symbol: .pg_strcasecmp ld: 0711-224 WARNING: Duplicate symbol: .pg_ascii_tolower ld: 0711-224 WARNING: Duplicate symbol: .pg_ascii_toupper ld: 0711-224 WARNING: Duplicate symbol: .pg_tolower ld: 0711-224 WARNING: Duplicate symbol: .pg_toupper ld: 0711-224 WARNING: Duplicate symbol: .pg_strncasecmp Hm. Seems we've some workaround in some platforms: # src/template/win32 # --allow-multiple-definition is required to link pg_dump because it finds # pg_toupper() etc. in both libpq and pgport But that, uh, seems not good? Sungazer: wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 zic.o -L../../src/port -L../../src/common -Wl,-blibpath:'/home/nm/farm/gcc64/HEAD/inst/lib:/usr/lib:/lib' -lpgcommon -lpgport -lssl -lcrypto -lz -lreadline -lld -lm -o zic ld: 0711-224 WARNING: Duplicate symbol: .bcopy ld: 0711-224 WARNING: Duplicate symbol: .memmove Hm. wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -I../../../src/include-c -o auth.o auth.c auth.c: In function 'auth_peer': auth.c:2002:2: warning: implicit declaration of function 'getpeereid' [-Wimplicit-function-declaration] if (getpeereid(port->sock, , ) != 0) ^ wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -DFRONTEND -DUNSAFE_STAT_OK -I. -I../../../src/include -I../../../src/port -I../../../src/port -DSO_MAJOR_VERSION=5 -c -o fe-connect.o fe-connect.c fe-connect.c: In function 'PQconnectPoll': fe-connect.c:2382:6: warning: implicit declaration of function 'getpeereid' [-Wimplicit-function-declaration] if (getpeereid(conn->sock, , ) != 0) ^ Looks like we're missing #include wrap-gcc -D_THREAD_SAFE=1 -D_LARGE_FILES=1 -maix64 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -I../../../../src/include-c -o pg_locale.o pg_locale.c pg_locale.c: In function 'wchar2char': pg_locale.c:1648:3: warning: implicit declaration of function 'wcstombs_l' [-Wimplicit-function-declaration] result = wcstombs_l(to, from, tolen, locale->info.lt); ^ This is curious. If I'm interpreting this correctly PGAC_FUNC_WCSTOMBS_L fails to find a declaration, but AC_CHECK_FUNCS finds wcstombs_l, so we happily use it. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric
Hi, On 2017-10-12 11:03:34 -0700, Andres Freund wrote: > On 2017-10-12 13:55:07 -0400, Tom Lane wrote: > > Or, if you insist on having it, we're going to have to go the pg_restrict > > route. I don't see why that means duplicating any configure logic: on > > non-Windows we can use the autoconf probe and then write > > "#define pg_restrict restrict". > > Yea, that should work. I'll try to come up with a patch. We can't do so unconditionally in c.h or such, because that'd again cause conflicts with __declspec(restrict) on MSVC versions that don't support restrict, because it'd require restrict to be defined empty. But it's easy to do so in configure, and then have a separate definition in pg_config.h.win32. Done so in the attached commit. It's slightly ugly to have two definitions of restrict in pg_config.h.in, but whatever. Greetings, Andres Freund >From 0ddc96424119682cf3b68c6d8d95ea894a60817a Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Thu, 12 Oct 2017 15:25:38 -0700 Subject: [PATCH] Use C99 restrict via pg_restrict, rather than restrict directly. Unfortunately using 'restrict' plainly causes problems with MSVC, which supports restrict only as '__restrict'. Defining 'restrict' to '__restrict' unfortunately causes a conflict with MSVC's usage of __declspec(restrict) in headers. Therefore define pg_restrict to the appropriate keyword instead, and replace existing usages. This replaces the temporary workaround introduced in 36b4b91ba078. Author: Andres Freund Discussion: https://postgr.es/m/2656.1507830...@sss.pgh.pa.us --- configure | 107 -- configure.in | 15 +- src/include/libpq/pqformat.h | 24 +- src/include/pg_config.h.in| 4 ++ src/include/pg_config.h.win32 | 22 - 5 files changed, 101 insertions(+), 71 deletions(-) diff --git a/configure b/configure index 910f0fc3736..cdcb3ceb0c8 100755 --- a/configure +++ b/configure @@ -11545,52 +11545,6 @@ _ACEOF ;; esac -{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for C/C++ restrict keyword" >&5 -$as_echo_n "checking for C/C++ restrict keyword... " >&6; } -if ${ac_cv_c_restrict+:} false; then : - $as_echo_n "(cached) " >&6 -else - ac_cv_c_restrict=no - # The order here caters to the fact that C++ does not require restrict. - for ac_kw in __restrict __restrict__ _Restrict restrict; do - cat confdefs.h - <<_ACEOF >conftest.$ac_ext -/* end confdefs.h. */ -typedef int * int_ptr; - int foo (int_ptr $ac_kw ip) { - return ip[0]; - } -int -main () -{ -int s[1]; - int * $ac_kw t = s; - t[0] = 0; - return foo(t) - ; - return 0; -} -_ACEOF -if ac_fn_c_try_compile "$LINENO"; then : - ac_cv_c_restrict=$ac_kw -fi -rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext - test "$ac_cv_c_restrict" != no && break - done - -fi -{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_restrict" >&5 -$as_echo "$ac_cv_c_restrict" >&6; } - - case $ac_cv_c_restrict in - restrict) ;; - no) $as_echo "#define restrict /**/" >>confdefs.h - ;; - *) cat >>confdefs.h <<_ACEOF -#define restrict $ac_cv_c_restrict -_ACEOF - ;; - esac - { $as_echo "$as_me:${as_lineno-$LINENO}: checking for printf format archetype" >&5 $as_echo_n "checking for printf format archetype... " >&6; } if ${pgac_cv_printf_archetype+:} false; then : @@ -12508,6 +12462,67 @@ $as_echo "#define LOCALE_T_IN_XLOCALE 1" >>confdefs.h fi +# MSVC doesn't cope well with defining restrict to __restrict, the +# spelling it understands, because it conflicts with +# __declspec(restrict). Therefore we define pg_restrict to the +# appropriate definition, which presumably won't conflict. +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for C/C++ restrict keyword" >&5 +$as_echo_n "checking for C/C++ restrict keyword... " >&6; } +if ${ac_cv_c_restrict+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_cv_c_restrict=no + # The order here caters to the fact that C++ does not require restrict. + for ac_kw in __restrict __restrict__ _Restrict restrict; do + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +typedef int * int_ptr; + int foo (int_ptr $ac_kw ip) { + return ip[0]; + } +int +main () +{ +int s[1]; + int * $ac_kw t = s; + t[0] = 0; + return foo(t) + ; + return 0; +} +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + ac_cv_c_restrict=$ac_kw +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext + test "$ac_cv_c_restrict" != no && break + done + +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_restrict" &
Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric
Hi, On 2017-10-12 13:55:07 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > On 2017-10-12 11:30:00 -0400, Tom Lane wrote: > >> I don't actually see why you need a #define at all --- "restrict" is the > >> standard spelling of the keyword no? > > > It is, but a lot of compilers name it differently, e.g. __restrict in > > the case of msvc. > > It's 2017 and they're still not C99 compliant? Oh well. ... > TBH, I really doubt that restrict buys us enough performance to justify > dealing with this. I'd just revert that change altogether. It's quite noticeable, and there's plenty of other case where it seems likely to be beneficial. > Or, if you insist on having it, we're going to have to go the pg_restrict > route. I don't see why that means duplicating any configure logic: on > non-Windows we can use the autoconf probe and then write > "#define pg_restrict restrict". Yea, that should work. I'll try to come up with a patch. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric
On 2017-10-12 11:30:00 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > I've temporarily silenced that error by moving the stdlib.h include > > before the definition of restrict, but that seems fairly fragile. I > > primarily wanted to see whether there's other problems. At least thrips > > is is now happy. > > > I see a number of options to continue: > > - only define restrict on msvc 2013+ - for some reason woodlouse didn't > > complain about this problem, but I'm very doubtful that that's > > actually reliable. > > - rename restrict to pg_restrict. That's annoying because we'd have to > > copy the autoconf test. > > - live with the hack of including stdlib.h early, in pg_config.h.win32. > > - $better_idea > > I don't actually see why you need a #define at all --- "restrict" is the > standard spelling of the keyword no? It is, but a lot of compilers name it differently, e.g. __restrict in the case of msvc. > I really do not like the stdlib.h hack: that amounts to assuming that > only stdlib.h does or ever will contain declspec(restrict). Maybe > you could get away with that if you were applying it only to long-dead > MSVC versions, but doing it "#if (_MSC_VER >= 1500)" is clearly going > to break someday. Yea, I dislike it quite a bit too. Unfortunately not even defining restrict to empty as if it were unsupported looks viable to me :( Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric
On 2017-10-11 21:59:53 -0700, Andres Freund wrote: > That fixed that problem I think. But unfortunately since then another > problem has been reported by some other animals, all with older msvc > versions afaict (thrips - vs 2012, bowerbird - vs 2012). Correction, thrips is vs 2010, not 2012. -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric
On 2017-10-11 17:13:20 -0700, Andres Freund wrote: > Hi, > > On 2017-10-11 23:11:15 +0000, Andres Freund wrote: > > Add configure infrastructure to detect support for C99's restrict. > > > > Will be used in later commits improving performance for a few key > > routines where information about aliasing allows for significantly > > better code generation. > > > > This allows to use the C99 'restrict' keyword without breaking C89, or > > for that matter C++, compilers. If not supported it's defined to be > > empty. > > Woodlouse doesn't like this, erroring out with: > C:\buildfarm\buildenv\HEAD\pgsql.build\src\include\libpq/pqformat.h(47): > error C2219: syntax error : type qualifier must be after '*' > (src/backend/access/common/printsimple.c) > [C:\buildfarm\buildenv\HEAD\pgsql.build\postgres.vcxproj] > > It's MSVC being super peculiar about error checks. I think msvc is just > confused by the pointer hiding typedef. Using some online MSVC (and > other compilers) frontend: > https://godbolt.org/g/TD3nmA > > I confirmed that removing the pointer hiding typedef indeed resolves the > isssue. > > I can't quite decide whether msvc just has taste and dislikes pointer > hiding typedefs as much as I do, or whether it's incredibly stupid ;) > > I'll add a comment and use StringInfoData *. That fixed that problem I think. But unfortunately since then another problem has been reported by some other animals, all with older msvc versions afaict (thrips - vs 2012, bowerbird - vs 2012). Those report that the defining of restrict to __restrict interfers with some system headers: C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\include\stdlib.h(619): error C2485: '__restrict' : unrecognized extended attribute (src/backend/access/brin/brin.c) [c:\prog\bf\root\HEAD\pgsql.build\postgres.vcxproj] Presumably that is because the headers contain __declspec(restrict) on some function declarations. I've temporarily silenced that error by moving the stdlib.h include before the definition of restrict, but that seems fairly fragile. I primarily wanted to see whether there's other problems. At least thrips is is now happy. I see a number of options to continue: - only define restrict on msvc 2013+ - for some reason woodlouse didn't complain about this problem, but I'm very doubtful that that's actually reliable. - rename restrict to pg_restrict. That's annoying because we'd have to copy the autoconf test. - live with the hack of including stdlib.h early, in pg_config.h.win32. - $better_idea Comments? Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Replace remaining uses of pq_sendint with pq_sendint{8, 16, 32}.
Replace remaining uses of pq_sendint with pq_sendint{8,16,32}. pq_sendint() remains, so extension code doesn't unnecessarily break. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/31079a4a8e66e56e48bad94d380fa6224e9ffa0d Modified Files -- contrib/hstore/hstore_io.c | 8 +-- src/backend/access/common/printsimple.c | 18 +++ src/backend/access/common/printtup.c| 16 +++--- src/backend/access/transam/parallel.c | 4 +- src/backend/commands/async.c| 2 +- src/backend/commands/copy.c | 8 +-- src/backend/libpq/auth.c| 2 +- src/backend/replication/basebackup.c| 86 - src/backend/replication/logical/proto.c | 20 src/backend/replication/walreceiver.c | 8 +-- src/backend/replication/walsender.c | 36 +++--- src/backend/tcop/fastpath.c | 4 +- src/backend/tcop/postgres.c | 8 +-- src/backend/utils/adt/arrayfuncs.c | 14 +++--- src/backend/utils/adt/date.c| 4 +- src/backend/utils/adt/geo_ops.c | 4 +- src/backend/utils/adt/int.c | 4 +- src/backend/utils/adt/jsonb.c | 2 +- src/backend/utils/adt/nabstime.c| 10 ++-- src/backend/utils/adt/numeric.c | 14 +++--- src/backend/utils/adt/oid.c | 2 +- src/backend/utils/adt/rangetypes.c | 4 +- src/backend/utils/adt/rowtypes.c| 8 +-- src/backend/utils/adt/tid.c | 6 +-- src/backend/utils/adt/timestamp.c | 4 +- src/backend/utils/adt/tsquery.c | 13 +++-- src/backend/utils/adt/tsvector.c| 6 +-- src/backend/utils/adt/txid.c| 2 +- src/backend/utils/adt/varbit.c | 2 +- src/backend/utils/adt/xid.c | 4 +- 30 files changed, 160 insertions(+), 163 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Temporary attempt at a workaround for further MSVC restrict buil
Temporary attempt at a workaround for further MSVC restrict build failures. It appears some versions of msvc use __declspec(restrict) in stdlib.h and subsidiary headers. Including those after defining 'restrict' to '__restrict' doesn't work. Try to get the buildfarm green to see whether there's further problems, by including stdlib.h just before said define. Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/36b4b91ba07843406d5a30106facb59d8275c6de Modified Files -- src/include/pg_config.h.win32 | 5 + 1 file changed, 5 insertions(+) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Work around overly strict restrict checks by MSVC.
Work around overly strict restrict checks by MSVC. Apparently MSVC requires a * before a restrict in a variable declaration, even if the adorned type already is a pointer, just via typedef. As reported by buildfarm animal woodlouse. Author: Andres Freund Discussion: https://postgr.es/m/20171012001320.4putagiruueht...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/060b069984a69ff0255ce318f10681c553613bef Modified Files -- src/include/libpq/pqformat.h | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Improve performance of SendRowDescriptionMessage.
Improve performance of SendRowDescriptionMessage. There's three categories of changes leading to better performance: - Splitting the per-attribute part of SendRowDescriptionMessage into a v2 and a v3 version allows avoiding branches for every attribute. - Preallocating the size of the buffer to be big enough for all attributes and then using pq_write* avoids unnecessary buffer size checks & resizing. - Reusing a persistently allocated StringInfo for all SendRowDescriptionMessage() invocations avoids repeated allocations & reallocations. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/4c119fbcd49ba882791c7b99a1e934b985468e9f Modified Files -- src/backend/access/common/printtup.c | 146 ++- src/backend/tcop/postgres.c | 35 +++-- src/include/access/printtup.h| 4 +- 3 files changed, 138 insertions(+), 47 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric
Hi, On 2017-10-11 23:11:15 +, Andres Freund wrote: > Add configure infrastructure to detect support for C99's restrict. > > Will be used in later commits improving performance for a few key > routines where information about aliasing allows for significantly > better code generation. > > This allows to use the C99 'restrict' keyword without breaking C89, or > for that matter C++, compilers. If not supported it's defined to be > empty. Woodlouse doesn't like this, erroring out with: C:\buildfarm\buildenv\HEAD\pgsql.build\src\include\libpq/pqformat.h(47): error C2219: syntax error : type qualifier must be after '*' (src/backend/access/common/printsimple.c) [C:\buildfarm\buildenv\HEAD\pgsql.build\postgres.vcxproj] It's MSVC being super peculiar about error checks. I think msvc is just confused by the pointer hiding typedef. Using some online MSVC (and other compilers) frontend: https://godbolt.org/g/TD3nmA I confirmed that removing the pointer hiding typedef indeed resolves the isssue. I can't quite decide whether msvc just has taste and dislikes pointer hiding typedefs as much as I do, or whether it's incredibly stupid ;) I'll add a comment and use StringInfoData *. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Use one stringbuffer for all rows printed in printtup.c.
Use one stringbuffer for all rows printed in printtup.c. This avoids newly allocating, and then possibly growing, the stringbuffer for every row. For wide rows this can substantially reduce memory allocator overhead, at the price of not immediately reducing memory usage after outputting an especially wide row. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/f2dec34e19d3969ddd616e671fe9a7b968bec812 Modified Files -- src/backend/access/common/printtup.c | 46 1 file changed, 25 insertions(+), 21 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Allow to avoid NUL-byte management for stringinfos and use in fo
Allow to avoid NUL-byte management for stringinfos and use in format.c. In a lot of the places having appendBinaryStringInfo() maintain a trailing NUL byte wasn't actually meaningful, e.g. when appending an integer which can contain 0 in one of its bytes. Removing this yields some small speedup, but more importantly will be more consistent when providing faster variants of pq_sendint etc. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/70c2d1be2b1e1efa8ef38a92b443fa290a9558dd Modified Files -- src/backend/lib/stringinfo.c | 21 - src/backend/libpq/pqformat.c | 18 +- src/include/lib/stringinfo.h | 8 3 files changed, 37 insertions(+), 10 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Add more efficient functions to pqformat API.
Add more efficient functions to pqformat API. There's three prongs to achieve greater efficiency here: 1) Allow reusing a stringbuffer across pq_beginmessage/endmessage, with the new pq_beginmessage_reuse/endmessage_reuse. This can be beneficial both because it avoids allocating the initial buffer, and because it's more likely to already have an correctly sized buffer. 2) Replacing pq_sendint() with pq_sendint$width() inline functions. Previously unnecessary and unpredictable branches in pq_sendint() were needed. Additionally the replacement functions are implemented more efficiently. pq_sendint is now deprecated, a separate commit will convert all in-tree callers. 3) Add pq_writeint$width(), pq_writestring(). These rely on sufficient space in the StringInfo's buffer, avoiding individual space checks & potential individual resizing. To allow this to be used for strings, expose mbutil.c's MAX_CONVERSION_GROWTH. Followup commits will make use of these facilities. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/1de09ad8eb1fa673ee7899d6dfbb2b49ba204818 Modified Files -- src/backend/libpq/pqformat.c | 88 - src/backend/utils/mb/mbutils.c | 11 --- src/include/libpq/pqformat.h | 168 - src/include/mb/pg_wchar.h | 11 +++ 4 files changed, 208 insertions(+), 70 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Add configure infrastructure to detect support for C99's restric
Add configure infrastructure to detect support for C99's restrict. Will be used in later commits improving performance for a few key routines where information about aliasing allows for significantly better code generation. This allows to use the C99 'restrict' keyword without breaking C89, or for that matter C++, compilers. If not supported it's defined to be empty. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbek...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/0b974dba2d6b5581ce422ed883209de46f313fb6 Modified Files -- configure | 46 +++ configure.in | 1 + src/include/pg_config.h.in| 14 + src/include/pg_config.h.win32 | 11 +++ 4 files changed, 72 insertions(+) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Prevent idle in transaction session timeout from sometimes being
Prevent idle in transaction session timeout from sometimes being ignored. The previous coding in ProcessInterrupts() could lead to idle_in_transaction_session_timeout being ignored, when statement_timeout occurred earlier. The problem was that ProcessInterrupts() would return before processing the transaction timeout if QueryCancelPending was set while QueryCancelHoldoffCount != 0 - which is the case when reading new commands from the client. Ergo when the idle transaction timeout would hit. Fix that by removing the early return. Alternatively the transaction timeout code could have been moved up, but that early return seems like an issue that could hit other cases too. Author: Lukas Fittl Bug: #14821 Discussion: https://www.postgresql.org/message-id/20170921010956.17345.61461%40wrigleys.postgresql.org https://www.postgresql.org/message-id/CAP53PkxQnv3OWJpyNPGJYT62uY=n1=2cf_lpc6gvofnc0-g...@mail.gmail.com Backpatch: 9.6-, where idle_in_transaction_session_timeout was introduced. Branch -- REL_10_STABLE Details --- https://git.postgresql.org/pg/commitdiff/61ace8fe7fe82dc04c1de493a414597989f05e56 Modified Files -- src/backend/tcop/postgres.c | 32 +++- 1 file changed, 15 insertions(+), 17 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Prevent idle in transaction session timeout from sometimes being
Prevent idle in transaction session timeout from sometimes being ignored. The previous coding in ProcessInterrupts() could lead to idle_in_transaction_session_timeout being ignored, when statement_timeout occurred earlier. The problem was that ProcessInterrupts() would return before processing the transaction timeout if QueryCancelPending was set while QueryCancelHoldoffCount != 0 - which is the case when reading new commands from the client. Ergo when the idle transaction timeout would hit. Fix that by removing the early return. Alternatively the transaction timeout code could have been moved up, but that early return seems like an issue that could hit other cases too. Author: Lukas Fittl Bug: #14821 Discussion: https://www.postgresql.org/message-id/20170921010956.17345.61461%40wrigleys.postgresql.org https://www.postgresql.org/message-id/CAP53PkxQnv3OWJpyNPGJYT62uY=n1=2cf_lpc6gvofnc0-g...@mail.gmail.com Backpatch: 9.6-, where idle_in_transaction_session_timeout was introduced. Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/f676616651c83b14e1d879fbfabdd3ab2dc70bbe Modified Files -- src/backend/tcop/postgres.c | 32 +++- 1 file changed, 15 insertions(+), 17 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Prevent idle in transaction session timeout from sometimes being
Prevent idle in transaction session timeout from sometimes being ignored. The previous coding in ProcessInterrupts() could lead to idle_in_transaction_session_timeout being ignored, when statement_timeout occurred earlier. The problem was that ProcessInterrupts() would return before processing the transaction timeout if QueryCancelPending was set while QueryCancelHoldoffCount != 0 - which is the case when reading new commands from the client. Ergo when the idle transaction timeout would hit. Fix that by removing the early return. Alternatively the transaction timeout code could have been moved up, but that early return seems like an issue that could hit other cases too. Author: Lukas Fittl Bug: #14821 Discussion: https://www.postgresql.org/message-id/20170921010956.17345.61461%40wrigleys.postgresql.org https://www.postgresql.org/message-id/CAP53PkxQnv3OWJpyNPGJYT62uY=n1=2cf_lpc6gvofnc0-g...@mail.gmail.com Backpatch: 9.6-, where idle_in_transaction_session_timeout was introduced. Branch -- REL9_6_STABLE Details --- https://git.postgresql.org/pg/commitdiff/0da46d75e31ddfa9180345a14d720814e36922fa Modified Files -- src/backend/tcop/postgres.c | 32 +++- 1 file changed, 15 insertions(+), 17 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add port/strnlen support to libpq and ecpg Makefiles.
Hi, On 2017-10-11 11:58:58 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > Phew. This is a bit a sad state of affairs. The separate libpq logic for > > getting pgport is presumably because of possibly different threading > > flags and then because of the appropriate compiler/linker flags for a > > shared library? > > I don't see why threading would matter, but building with -fPIC or > not is definitely an issue. -pthread changes some "memory model" type assumptions by the compiler too IIRC, not just linker stage things. In a non-threaded environment the compiler is kinda free to invent phantom stores and such. It's unlikely to matter for just pgbench, but ... > I agree the PITA factor of the current approach keeps increasing. > It sounds a bit silly to build libpgport three ways, but maybe > we should just do that. We already kinda are, just by copying things around ;) > Or conceivably we should just build the FE version of libpgport.a > with -fPIC and not worry about whether that loses some efficiency > for client programs. A lot of distros are effectively forcing > that, or even -fPIE, anyway. Hm. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add port/strnlen support to libpq and ecpg Makefiles.
On 2017-10-11 15:28:16 +, Tom Lane wrote: > Add port/strnlen support to libpq and ecpg Makefiles. > > In the wake of fffd651e8, any makefile that pulls in snprintf.c > from src/port/ needs to be prepared to pull in strnlen.c as well. > Per buildfarm. Thanks. > Modified Files > -- > src/interfaces/ecpg/compatlib/.gitignore | 1 + > src/interfaces/ecpg/compatlib/Makefile| 6 +++--- > src/interfaces/ecpg/ecpglib/.gitignore| 1 + > src/interfaces/ecpg/ecpglib/Makefile | 7 --- > src/interfaces/ecpg/pgtypeslib/.gitignore | 1 + > src/interfaces/ecpg/pgtypeslib/Makefile | 7 --- > src/interfaces/libpq/.gitignore | 1 + > src/interfaces/libpq/Makefile | 6 +++--- > 8 files changed, 18 insertions(+), 12 deletions(-) Phew. This is a bit a sad state of affairs. The separate libpq logic for getting pgport is presumably because of possibly different threading flags and then because of the appropriate compiler/linker flags for a shared library? Wonder if we shouldn't have variants of libpqport that support threading and shared libraries, but built centrally. libpgport{-shared}{-mt} or such. Formally speaking it's not quite right that we don't use threading aware flags in pgbench for example. Some gcc platforms at least pretty much assume everything is built with -pthread e.g. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Rewrite strnlen replacement implementation from 8a241792f96.
On 2017-10-10 19:15:13 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > Rewrite strnlen replacement implementation from 8a241792f96. > > Hm, did you hand-edit the configure script and then forget to undo it? > Cause what was committed was not right. Yea, that was me testing the fallback :(. I think I need holidays. -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.
On 2017-10-10 18:10:15 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > Here's a fix. Not quite sure whether we really need the > > HAVE_DECL_STRNLEN test, added it for symmetry. > > LGTM. Thanks for checking. > I think the DECL test is a good idea; the system definition > might be a macro or otherwise weird, in which case we'd cause problems > if we write an extern definition anyway. I was thinking about protecting it with HAVE_STRNLEN rather than not protecting it at all. But this works, so whatever ;) Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Rewrite strnlen replacement implementation from 8a241792f96.
Rewrite strnlen replacement implementation from 8a241792f96. The previous placement of the fallback implementation in libpgcommon was problematic, because libpqport functions need strnlen functionality. Move replacement into libpgport. Provide strnlen() under its posix name, instead of pg_strnlen(). Fix stupid configure bug, executing the test only when compiled with threading support. Author: Andres Freund Discussion: https://postgr.es/m/e1e1gr2-0005fb...@gemulon.postgresql.org Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/fffd651e83ccbd6191a76be6ec7c6b1b27888fde Modified Files -- configure | 25 - configure.in | 6 +++--- src/backend/utils/mmgr/mcxt.c | 3 +-- src/common/string.c | 20 src/include/common/string.h | 15 --- src/include/pg_config.h.in| 4 src/include/pg_config.h.win32 | 10 +++--- src/include/port.h| 4 src/port/snprintf.c | 4 +--- src/port/strnlen.c| 33 + 10 files changed, 77 insertions(+), 47 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.
On 2017-10-10 17:34:17 -0400, Andrew Dunstan wrote: > This test is governed by the test at line 946 of configure.in. You need > to move it somewhere else. Yea, sorry for the noise :(. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.
On 2017-10-10 13:53:15 -0700, Andres Freund wrote: > On 2017-10-10 16:51:39 -0400, Tom Lane wrote: > > Andres Freund <and...@anarazel.de> writes: > > > As far as I can tell it's still somehow using a configure from before > > > the last commits: > > > > No, it's pilot error. The AC_CHECK_FUNCS call you added strnlen to > > is only executed if > > AS_IF([test "$enable_thread_safety" = yes -a "$PORTNAME" != "win32"], > > Dear god, I'm daft. Dear god, m4 is a horrible lanuagage. Here's a fix. Not quite sure whether we really need the HAVE_DECL_STRNLEN test, added it for symmetry. Afaict windows "always" had strnlen, so no need to meddle with the windows build. Will eat lunch and push, even if there's some futher adjustments, I'd like to get Andrew's animals green again. Greetings, Andres Freund >From fffd651e83ccbd6191a76be6ec7c6b1b27888fde Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Tue, 10 Oct 2017 14:42:16 -0700 Subject: [PATCH] Rewrite strnlen replacement implementation from 8a241792f96. The previous placement of the fallback implementation in libpgcommon was problematic, because libpqport functions need strnlen functionality. Move replacement into libpgport. Provide strnlen() under its posix name, instead of pg_strnlen(). Fix stupid configure bug, executing the test only when compiled with threading support. Author: Andres Freund Discussion: https://postgr.es/m/e1e1gr2-0005fb...@gemulon.postgresql.org --- configure | 25 - configure.in | 6 +++--- src/backend/utils/mmgr/mcxt.c | 3 +-- src/common/string.c | 20 src/include/common/string.h | 15 --- src/include/pg_config.h.in| 4 src/include/pg_config.h.win32 | 10 +++--- src/include/port.h| 4 src/port/snprintf.c | 4 +--- src/port/strnlen.c| 33 + 10 files changed, 77 insertions(+), 47 deletions(-) create mode 100644 src/port/strnlen.c diff --git a/configure b/configure index a1283c05005..b0582657bf4 100755 --- a/configure +++ b/configure @@ -8777,7 +8777,7 @@ fi -for ac_func in strerror_r getpwuid_r gethostbyname_r strnlen +for ac_func in strerror_r getpwuid_r gethostbyname_r do : as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh` ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var" @@ -13161,6 +13161,16 @@ fi cat >>confdefs.h <<_ACEOF #define HAVE_DECL_STRLCPY $ac_have_decl _ACEOF +ac_fn_c_check_decl "$LINENO" "strnlenfrak" "ac_cv_have_decl_strnlen" "$ac_includes_default" +if test "x$ac_cv_have_decl_strnlen" = xyes; then : + ac_have_decl=1 +else + ac_have_decl=0 +fi + +cat >>confdefs.h <<_ACEOF +#define HAVE_DECL_STRNLEN $ac_have_decl +_ACEOF # This is probably only present on macOS, but may as well check always ac_fn_c_check_decl "$LINENO" "F_FULLFSYNC" "ac_cv_have_decl_F_FULLFSYNC" "#include @@ -13528,6 +13538,19 @@ esac fi +ac_fn_c_check_func "$LINENO" "strnlenfrak" "ac_cv_func_strnlen" +if test "x$ac_cv_func_strnlen" = xyes; then : + $as_echo "#define HAVE_STRNLEN 1" >>confdefs.h + +else + case " $LIBOBJS " in + *" strnlen.$ac_objext "* ) ;; + *) LIBOBJS="$LIBOBJS strnlen.$ac_objext" + ;; +esac + +fi + case $host_os in diff --git a/configure.in b/configure.in index e1381b4ead6..4548db0dd3c 100644 --- a/configure.in +++ b/configure.in @@ -961,7 +961,7 @@ LIBS="$LIBS $PTHREAD_LIBS" AC_CHECK_HEADER(pthread.h, [], [AC_MSG_ERROR([ pthread.h not found; use --disable-thread-safety to disable thread safety])]) -AC_CHECK_FUNCS([strerror_r getpwuid_r gethostbyname_r strnlen]) +AC_CHECK_FUNCS([strerror_r getpwuid_r gethostbyname_r]) # Do test here with the proper thread flags PGAC_FUNC_STRERROR_R_INT @@ -1422,7 +1422,7 @@ AC_CHECK_DECLS(posix_fadvise, [], [], [#include ]) fi AC_CHECK_DECLS(fdatasync, [], [], [#include ]) -AC_CHECK_DECLS([strlcat, strlcpy]) +AC_CHECK_DECLS([strlcat, strlcpy, strnlen]) # This is probably only present on macOS, but may as well check always AC_CHECK_DECLS(F_FULLFSYNC, [], [], [#include ]) @@ -1514,7 +1514,7 @@ else AC_CHECK_FUNCS([fpclass fp_class fp_class_d class], [break]) fi -AC_REPLACE_FUNCS([crypt fls getopt getrusage inet_aton mkdtemp random rint srandom strerror strlcat strlcpy]) +AC_REPLACE_FUNCS([crypt fls getopt getrusage inet_aton mkdtemp random rint srandom strerror strlcat strlcpy strnlen]) case $host_os in diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index 64e0408d5af..c5c311fad39 100644 --- a/src/backend/utils/mmgr/mcxt.c +++ b/
Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.
On 2017-10-10 16:51:39 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > As far as I can tell it's still somehow using a configure from before > > the last commits: > > No, it's pilot error. The AC_CHECK_FUNCS call you added strnlen to > is only executed if > AS_IF([test "$enable_thread_safety" = yes -a "$PORTNAME" != "win32"], Dear god, I'm daft. Dear god, m4 is a horrible lanuagage. Thanks. -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.
On 2017-10-10 16:37:04 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > (there's definitely fixes to be made to where strnlen's replacement is > > located, but regardless, this needs to be fixed too) > > Given that strnlen is standardized by POSIX, and has been for nigh a > decade, I think it'd be all right for us to treat it as a straight > port replacement function, a la strlcpy() for instance. That is, > forget the pg_ prefix and just create a strnlen() function if the > platform has not got it. Yea, I'm ok with that. But can't do that before the configure issue on these boxes is fixed, otherwise we'll likely get symbol conflicts... Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.
On 2017-10-09 23:33:36 -0400, Andrew Dunstan wrote: > > > On 10/09/2017 07:15 PM, Andres Freund wrote: > > Hi Andrew, > > > > On 2017-10-09 22:22:04 +, Andres Freund wrote: > >> Add pg_strnlen() a portable implementation of strlen. > >> > >> As the OS version is likely going to be more optimized, fall back to > >> it if available, as detected by configure. > > I'm a bit confused, frogmouth > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=frogmouth=2017-10-09%2022%3A30%3A41 > > shows that it compiled the new code, but the configure output doesn't > > show it ran through the new configure test. Additionally, without the > > the config define, this should result in the replacement being > > used. Which doesn't seem to be the case either. > > > > Kinda sounds like this used some halfway outdated build or such? > > > > > > frogmouth is using some code not yet released that makes the config > cache persistent. I just identified and fixed a stupid bug in the code > that obsoletes the cache, and I have removed frogmouth's cache file and > set it running again, so we'll see if that fixes things. As far as I can tell it's still somehow using a configure from before the last commits: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=frogmouth=2017-10-10%2018%3A35%3A06=configure Note that it's not running the new test. (there's definitely fixes to be made to where strnlen's replacement is located, but regardless, this needs to be fixed too) - Andres -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.
On 2017-10-10 00:25:52 -0400, Tom Lane wrote: > Andrew Dunstan <and...@dunslane.net> writes: > > frogmouth is using some code not yet released that makes the config > > cache persistent. I just identified and fixed a stupid bug in the code > > that obsoletes the cache, and I have removed frogmouth's cache file and > > set it running again, so we'll see if that fixes things. Interesting. Although an outdated cache file doesn't explain why configure didn't even do the relevant check, no? This kinda sounds more like configure as a whole is outdated, rather than it's cache file. > I doubt it. It'll quite possibly fix it, but probably not for good reasons. Presumably after a proper configure run the fallback won't be used anymore. > I think the problem with this patch is that Andres has > made libpgport dependent on libpgcommon, which is backwards, or at > least circular. The module layering is supposed to go the other way. Yes, quite possibly. At least one platform without strnlen [1], as well as my local machine when intentionally marking strnlen as not available, ran successfully. But that's likely just a difference in how/when symbols are resolved. I think the current split between common and port isn't particularly meaningful. But as long as we have it, this probably belongs more in port than in common. Greetings, Andres Freund [1] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=gharial=2017-10-10%2000%3A31%3A38=config -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.
Hi Andrew, On 2017-10-09 22:22:04 +, Andres Freund wrote: > Add pg_strnlen() a portable implementation of strlen. > > As the OS version is likely going to be more optimized, fall back to > it if available, as detected by configure. I'm a bit confused, frogmouth https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=frogmouth=2017-10-09%2022%3A30%3A41 shows that it compiled the new code, but the configure output doesn't show it ran through the new configure test. Additionally, without the the config define, this should result in the replacement being used. Which doesn't seem to be the case either. Kinda sounds like this used some halfway outdated build or such? Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Fix pnstrdup() to not memcpy() the maximum allowed length.
Fix pnstrdup() to not memcpy() the maximum allowed length. The previous behaviour was dangerous if the length passed wasn't the size of the underlying buffer, but the maximum size of the underlying buffer. Author: Andres Freund Discussion: https://postgr.es/m/20161003215524.mwz5p45pcverr...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/82c117cb90e6b6b79f06d61eb1ddf06e94e75b60 Modified Files -- src/backend/utils/mmgr/mcxt.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Add pg_strnlen() a portable implementation of strlen.
Add pg_strnlen() a portable implementation of strlen. As the OS version is likely going to be more optimized, fall back to it if available, as detected by configure. Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/8a241792f968ed5be6cf4d41e32c0d264f6c0c65 Modified Files -- configure | 2 +- configure.in | 2 +- src/common/string.c | 20 src/include/common/string.h | 15 +++ src/include/pg_config.h.in| 3 +++ src/include/pg_config.h.win32 | 3 +++ src/port/snprintf.c | 12 ++-- 7 files changed, 45 insertions(+), 12 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Reduce memory usage of targetlist SRFs.
Reduce memory usage of targetlist SRFs. Previously nodeProjectSet only released memory once per input tuple, rather than once per returned tuple. If the computation of an individual returned tuple requires a lot of memory, that can lead to problems. Instead change things so that the expression context can be reset once per output tuple, which requires a new memory context to store SRF arguments in. This is a longstanding issue, but was hard to fix before 9.6, due to the way tSRFs where evaluated. But it's fairly easy to fix now. We could backpatch this into 10, but given there've been fewc omplaints that doesn't seem worth the risk so far. Reported-By: Lucas Fairchild Author: Andres Freund, per discussion with Tom Lane Discussion: https://postgr.es/m/4514.1507318...@sss.pgh.pa.us Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/84ad4b036d975ad1be0f52251bac3a06463c9811 Modified Files -- src/backend/executor/execSRF.c| 31 --- src/backend/executor/nodeProjectSet.c | 32 +++- src/include/executor/executor.h | 1 + src/include/nodes/execnodes.h | 1 + 4 files changed, 57 insertions(+), 8 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Msvc doesn't know UINT16_MAX, replace with PG_UINT16_MAX.
Msvc doesn't know UINT16_MAX, replace with PG_UINT16_MAX. UINT16_MAX usage is originating from commit 212e6f34d55c. Per buildfarm animal currawong. Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/9eafa2b5b043b84fb9846bd7a57d15ed1ee220c1 Modified Files -- src/include/utils/fmgrtab.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Attempt to adapt windows build for 212e6f34d55c.
Attempt to adapt windows build for 212e6f34d55c. Per buildfarm animal baiji. Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/15334ad19a776f76cbb725e4e9162a7bce1bd4d0 Modified Files -- src/tools/msvc/Solution.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Move genbki.pl's find_defined_symbol to Catalog.pm.
Move genbki.pl's find_defined_symbol to Catalog.pm. Will be used in Gen_fmgrtab.pl in a followup commit. Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/18f791ab2b6a01a632653d394e046f3daf193ff6 Modified Files -- src/backend/catalog/Catalog.pm | 35 ++- src/backend/catalog/genbki.pl | 34 -- 2 files changed, 38 insertions(+), 31 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Replace binary search in fmgr_isbuiltin with a lookup array.
Replace binary search in fmgr_isbuiltin with a lookup array. Turns out we have enough functions that the binary search is quite noticeable in profiles. Thus have Gen_fmgrtab.pl build a new mapping from a builtin function's oid to an index in the existing fmgr_builtins array. That keeps the additional memory usage at a reasonable amount. Author: Andres Freund, with input from Tom Lane Discussion: https://postgr.es/m/20170914065128.a5sk7z4xde5uy...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/212e6f34d55c910505f87438d878698223d9a617 Modified Files -- src/backend/utils/Gen_fmgrtab.pl | 79 +--- src/backend/utils/Makefile | 2 +- src/backend/utils/fmgr/fmgr.c| 29 ++- src/include/utils/fmgrtab.h | 11 +- 4 files changed, 88 insertions(+), 33 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Yet another pg_bswap typo in a windows only file.
Yet another pg_bswap typo in a windows only file. Per buildfarm animal frogmouth. Brown-Paper-Bagged-By: Andres Freund Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/0c8b3ee94478ca07c86c09d2399a2ce73c2b922b Modified Files -- src/port/getaddrinfo.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Correct include file name in inet_aton fallback.
Correct include file name in inet_aton fallback. Per buildfarm animal frogmouth. Author: Andres Freund Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/859759b62f2d2f2f2805e2aa9ebdb167a1b9655c Modified Files -- src/port/inet_aton.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Replace most usages of ntoh[ls] and hton[sl] with pg_bswap.h.
Replace most usages of ntoh[ls] and hton[sl] with pg_bswap.h. All postgres internal usages are replaced, it's just libpq example usages that haven't been converted. External users of libpq can't generally rely on including postgres internal headers. Note that this includes replacing open-coded byte swapping of 64bit integers (using two 32 bit swaps) with a single 64bit swap. Where it looked applicable, I have removed netinet/in.h and arpa/inet.h usage, which previously provided the relevant functionality. It's perfectly possible that I missed other reasons for including those, the buildfarm will tell. Author: Andres Freund Discussion: https://postgr.es/m/20170927172019.gheidqy6xvlxb...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/0ba99c84e8c7138143059b281063d4cca5a2bfea Modified Files -- contrib/pgcrypto/crypt-des.c| 17 +++- contrib/uuid-ossp/uuid-ossp.c | 17 +++- src/backend/commands/copy.c | 11 +- src/backend/libpq/auth.c| 18 - src/backend/libpq/ifaddr.c | 6 +++--- src/backend/libpq/pqcomm.c | 6 +++--- src/backend/libpq/pqformat.c| 40 ++--- src/backend/postmaster/postmaster.c | 13 ++-- src/backend/tcop/fastpath.c | 8 +++- src/bin/pg_basebackup/streamutil.c | 34 +++ src/bin/pg_dump/parallel.c | 6 -- src/bin/pg_rewind/libpq_fetch.c | 29 ++- src/common/scram-common.c | 7 ++- src/interfaces/libpq/fe-connect.c | 12 +-- src/interfaces/libpq/fe-lobj.c | 11 +- src/interfaces/libpq/fe-misc.c | 14 ++--- src/interfaces/libpq/fe-protocol2.c | 5 ++--- src/interfaces/libpq/fe-protocol3.c | 5 ++--- src/port/getaddrinfo.c | 11 +- src/port/inet_aton.c| 4 +++- 20 files changed, 99 insertions(+), 175 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Allow pg_ctl kill to send SIGKILL.
Allow pg_ctl kill to send SIGKILL. Previously that was disallowed out of an abundance of caution. Providing KILL support however is helpful to make the 013_crash_restart.pl test portable, and there's no actual issue with allowing it. SIGABRT, which has similar consequences except it also dumps core, was already allowed. Author: Andres Freund Discussion: https://postgr.es/m/45d42d41-6145-9be1-7261-84acf6d9e...@2ndquadrant.com Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/2e83db3ad2da9b073af9ae12916f0b71cf698e1e Modified Files -- src/bin/pg_ctl/pg_ctl.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Remove redundant stdint.h include.
Remove redundant stdint.h include. Discussion: https://postgr.es/m/31674.1506788...@sss.pgh.pa.us Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/1f2830f9df9f0196ba541c1e253463afe657cb67 Modified Files -- src/include/port/pg_bswap.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Try to make crash restart test work on windows.
Try to make crash restart test work on windows. Author: Andres Freund Tested-By: Andrew Dunstan Discussion: https://postgr.es/m/20170930224424.ud5ilchmclbl5...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/784905795f8aadc09efe2fdae195279d17250f00 Modified Files -- src/test/recovery/t/013_crash_restart.pl | 28 ++-- 1 file changed, 10 insertions(+), 18 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.
Hi, On 2017-09-30 22:28:39 -0400, Andrew Dunstan wrote: > >> But even after fixing that, there unfortunately is: > >> > >> static void > >> set_sig(char *signame) > >> { > >> … > >> #if 0 > >>/* probably should NOT provide SIGKILL */ > >>else if (strcmp(signame, "KILL") == 0) > >>sig = SIGKILL; > >> #endif > >> > >> I'm unclear on what that provision is achieving? If you can kill with > >> pg_ctl you can do other nasty stuff too (like just use kill instead of > >> pg_ctl)? > > > I put it in when we rewrote pg_ctl in C many years ago, possibly out of > a superabundance of caution. I agree it's worth revisiting. I think the > idea was that there's a difference between an ordinary footgun and an > officially sanctioned footgun :-) Heh. I'm inclined to take it out. We could add a --use-the-force-luke type parameter, but it doesn't seem worth it. > Haven't tested on MSVC but with this patch it passes on jacana (mingw). Yay! Thanks for testing. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.
On 2017-09-30 15:27:12 -0700, Andres Freund wrote: > On 2017-09-30 18:21:33 -0400, Andrew Dunstan wrote: > > > > [re-adding commiters which I inadvertently left off] > > > > > > On 09/30/2017 06:10 PM, Andres Freund wrote: > > > > > > > > >> I was just looking at this. Why aren't we using "pg_ctl kill" to > > >> terminate the backend? That's supposed to be portable. > > > Because pg_ctl can't do that for any process but postmaster, no? The > > > test is supposed to find issues with backend death (and has > > > defficiencies in error reporting already, and would have caught a bug > > > I'd introduced previously). > > > No, I don't think so. That's not what the docs say. That's why you give > > it a pid argument" "pg_ctl kill signal_name process_id" > > Oh, cool. Didn't know that one. So the answer is: > "Because Andres doesn't know squat.". > > But even after fixing that, there unfortunately is: > > static void > set_sig(char *signame) > { > … > #if 0 > /* probably should NOT provide SIGKILL */ > else if (strcmp(signame, "KILL") == 0) > sig = SIGKILL; > #endif > > I'm unclear on what that provision is achieving? If you can kill with > pg_ctl you can do other nasty stuff too (like just use kill instead of > pg_ctl)? Could you perhaps test whether windows likes things after the following patch? I don't think the kill_kill guarantees are really needed here, so we might even be able to allow this on msvc. Greetings, Andres Freund diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c index 4e02c4cea1..0990647297 100644 --- a/src/bin/pg_ctl/pg_ctl.c +++ b/src/bin/pg_ctl/pg_ctl.c @@ -1961,11 +1961,8 @@ set_sig(char *signame) sig = SIGQUIT; else if (strcmp(signame, "ABRT") == 0) sig = SIGABRT; -#if 0 - /* probably should NOT provide SIGKILL */ else if (strcmp(signame, "KILL") == 0) sig = SIGKILL; -#endif else if (strcmp(signame, "TERM") == 0) sig = SIGTERM; else if (strcmp(signame, "USR1") == 0) diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl index ca02054ff0..91a8ef90c1 100644 --- a/src/test/recovery/t/013_crash_restart.pl +++ b/src/test/recovery/t/013_crash_restart.pl @@ -16,16 +16,8 @@ use Test::More; use Config; use Time::HiRes qw(usleep); -if ($Config{osname} eq 'MSWin32') -{ - # some Windows Perls at least don't like IPC::Run's - # start/kill_kill regime. - plan skip_all => "Test fails on Windows perl"; -} -else -{ - plan tests => 18; -} +plan tests => 18; + # To avoid hanging while expecting some specific input from a psql # instance being driven by us, add a timeout high enough that it @@ -106,10 +98,10 @@ $monitor_stdout = ''; $monitor_stderr = ''; # kill once with QUIT - we expect psql to exit, while emitting error message first -my $cnt = kill 'QUIT', $pid; +my $ret = TestLib::system_log('pg_ctl', 'kill', 'QUIT', $pid); # Exactly process should have been alive to be killed -is($cnt, 1, "exactly one process killed with SIGQUIT"); +is($ret, 0, "killed process with SIGQUIT"); # Check that psql sees the killed backend as having been terminated $killme_stdin .= q[ @@ -119,14 +111,14 @@ ok(pump_until($killme, \$killme_stderr, qr/WARNING: terminating connection beca "psql query died successfully after SIGQUIT"); $killme_stderr = ''; $killme_stdout = ''; -$killme->kill_kill; +$killme->finish; # Wait till server restarts - we should get the WARNING here, but # sometimes the server is unable to send that, if interrupted while # sending. ok(pump_until($monitor, \$monitor_stderr, qr/WARNING: terminating connection because of crash of another server process|server closed the connection unexpectedly/m), "psql monitor died successfully after SIGQUIT"); -$monitor->kill_kill; +$monitor->finish; # Wait till server restarts is($node->poll_query_until('postgres', 'SELECT $$restarted after sigquit$$;', 'restarted after sigquit'), @@ -179,8 +171,8 @@ $monitor_stderr = ''; # kill with SIGKILL this time - we expect the backend to exit, without # being able to emit an error error message -$cnt = kill 'KILL', $pid; -is($cnt, 1, "exactly one process killed with KILL"); +$ret = TestLib::system_log('pg_ctl', 'kill', 'KILL', $pid); +is($ret, 0, "killed process with KILL"); # Check that psql sees the server as being terminated. No WARNING, # because signal handlers aren't being run on SIGKILL. @@ -189,14 +181,14 @@ SELECT 1; ]; ok(pump_until($killme, \$killme_stderr, qr/server closed the connection unexpectedly/m), "psql query died successfully after SIGKILL"); -$killme->kill_kill; +$killme-&
Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.
On 2017-09-30 18:21:33 -0400, Andrew Dunstan wrote: > > [re-adding commiters which I inadvertently left off] > > > On 09/30/2017 06:10 PM, Andres Freund wrote: > > > > > >> I was just looking at this. Why aren't we using "pg_ctl kill" to > >> terminate the backend? That's supposed to be portable. > > Because pg_ctl can't do that for any process but postmaster, no? The > > test is supposed to find issues with backend death (and has > > defficiencies in error reporting already, and would have caught a bug > > I'd introduced previously). > No, I don't think so. That's not what the docs say. That's why you give > it a pid argument" "pg_ctl kill signal_name process_id" Oh, cool. Didn't know that one. So the answer is: "Because Andres doesn't know squat.". But even after fixing that, there unfortunately is: static void set_sig(char *signame) { … #if 0 /* probably should NOT provide SIGKILL */ else if (strcmp(signame, "KILL") == 0) sig = SIGKILL; #endif I'm unclear on what that provision is achieving? If you can kill with pg_ctl you can do other nasty stuff too (like just use kill instead of pg_ctl)? Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Extend & revamp pg_bswap.h infrastructure.
On 2017-09-30 12:17:06 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > Extend & revamp pg_bswap.h infrastructure. > > Hm, what is the point of this in pg_bswap.h: > > +#ifdef _MSC_VER > +#include > +#endif > > c.h will already have included . There might be some > value in this if we anticipated allowing freestanding use of this > header, but that won't happen because it depends on configure symbols. Well, it's not that obvious where the _byteswap_* functions are coming from on msvc. I guess we can just leave the comment /* In all supported versions msvc provides _byteswap_* functions in stdlib.h */ there, but I see no harm in the current form either. > but that won't happen because it depends on configure symbols. FWIW, I've wondered about replacing the pg_config.h tests with explicit gcc version checks. But doesn't seem worth it for now. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Fix copy & pasto in 510b8cbff15f.
Fix copy & pasto in 510b8cbff15f. Reported-By: Peter Geoghegan Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/248e33756b425335d94a32ffc8e9aace04f82c31 Modified Files -- src/include/port/pg_bswap.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Extend & revamp pg_bswap.h infrastructure.
On 2017-09-29 17:33:51 -0700, Peter Geoghegan wrote: > On Fri, Sep 29, 2017 at 5:29 PM, Andres Freund <and...@anarazel.de> wrote: > > Extend & revamp pg_bswap.h infrastructure. > > This looks wrong: > > +static inline uint16 > +pg_bswap64(uint16 x) > +{ Yea, just noticed that :(. Running test, and pushing. Thanks for checking! Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Fix typo.
Fix typo. Reported-By: Thomas Munro and Jesper Pedersen Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/f14241236ea2e306dc665635665c7f88669b6ca4 Modified Files -- src/include/utils/hashutils.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Extend & revamp pg_bswap.h infrastructure.
Extend & revamp pg_bswap.h infrastructure. Upcoming patches are going to address performance issues that involve slow system provided ntohs/htons etc. To address that expand pg_bswap.h to provide pg_ntoh{16,32,64}, pg_hton{16,32,64} and optimize their respective implementations by using compiler intrinsics for gcc compatible compilers and msvc. Fall back to manual implementations using shifts etc otherwise. Additionally remove multiple evaluation hazards from the existing BSWAP32/64 macros, by replacing them with inline functions when necessary. In the course of that the naming scheme is changed to pg_bswap16/32/64. Author: Andres Freund Discussion: https://postgr.es/m/20170927172019.gheidqy6xvlxb...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/510b8cbff15fcece246f66f2273ccf830a6c7e98 Modified Files -- config/c-compiler.m4| 17 ++ configure | 24 configure.in| 1 + contrib/btree_gist/btree_uuid.c | 4 +- src/include/pg_config.h.in | 3 + src/include/pg_config.h.win32 | 3 + src/include/port/pg_bswap.h | 132 src/include/port/pg_crc32c.h| 2 +- 8 files changed, 159 insertions(+), 27 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Remove pgbench "progress" test pending solution of its timing is
Hi, On 2017-09-23 19:57:40 +0200, Fabien COELHO wrote: > According to the trace, the host was so loaded that it could not do anything > for over two seconds, so that the first output is "progress: 2.6", i.e. the > 0 thread did not get any significant time for the first 2.6 seconds... of a > 1 second test. Hmmm, not very kind. > > Somehow this cannot be helped: if the system does not give any execution > time to some pgbench thread, the expected output cannot be there. skink runs postgres under valgrind, so it's going to be slower. > If the test must run even when postgres doesn't, it is a little bit hard as > a starting assumption for a benchmarking tool:-( I'm unsure what the point of this is. It's not like we discussing removing pgbench, we're talking about an unreliable test. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Add inline murmurhash32(uint32) function.
Add inline murmurhash32(uint32) function. The function already existed in tidbitmap.c but more users requiring fast hashing of 32bit ints are coming up. Author: Andres Freund Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7l...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/791961f59b792fbd4f0a992d3ccab47298e79103 Modified Files -- src/backend/nodes/tidbitmap.c | 20 ++-- src/include/utils/hashutils.h | 18 ++ 2 files changed, 20 insertions(+), 18 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Fix s/intidb/initdb/ typo.
Fix s/intidb/initdb/ typo. Reported-By: Michael Paquier Discussion: https://postgr.es/m/CAB7nPqTfaKAYZ4wuUM-W8kc4VnXrxX1=5-a9i==vouptmfp...@mail.gmail.com Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/f9583e86b4bfa8c4e4d83ab33e5dcdaeab5c45a1 Modified Files -- src/include/pg_config_manual.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Expand expected output for recovery test even further.
Expand expected output for recovery test even further. I'd assumed that the backend being killed should be able to get out an error message - but it turns out it's not guaranteed that it's not still sending a ready-for-query. Really need to do something about getting these error message to the client. Reported-By: Thomas Munro, Tom Lane Discussion: https://postgr.es/m/CAEepm=0TE90nded+bNthP45_PEvGAAr=3gxhhjobl4xmolt...@mail.gmail.com https://postgr.es/m/14968.1506101...@sss.pgh.pa.us Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/8d926029e817d280b2376433e3aaa3895e1a7128 Modified Files -- src/test/recovery/t/013_crash_restart.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Accept that server might not be able to send error in crash reco
On 2017-09-22 13:30:14 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > Accept that server might not be able to send error in crash recovery test. > > piculet says you didn't cover all the bases: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet=2017-09-22%2017%3A10%3A01 > > Looks like you missed adding the "server closed the connection > unexpectedly" alternative to one of the three places. I'd hoped^Wassumed it couldn't happen in that case. I'll expand the expected output. FWIW, I kinda like the new debugging output for this - makes it much easier to diagnose. Besides this individual test, I think we really need to do something that increases the likelihood of getting these error messages to clients. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Make WAL segment size configurable at initdb time.
Make WAL segment size configurable at initdb time. For performance reasons a larger segment size than the default 16MB can be useful. A larger segment size has two main benefits: Firstly, in setups using archiving, it makes it easier to write scripts that can keep up with higher amounts of WAL, secondly, the WAL has to be written and synced to disk less frequently. But at the same time large segment size are disadvantageous for smaller databases. So far the segment size had to be configured at compile time, often making it unrealistic to choose one fitting to a particularly load. Therefore change it to a initdb time setting. This includes a breaking changes to the xlogreader.h API, which now requires the current segment size to be configured. For that and similar reasons a number of binaries had to be taught how to recognize the current segment size. Author: Beena Emerson, editorialized by Andres Freund Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael Paquier, Peter Eisentraut, Robert Hass, Tushar Ahuja Discussion: https://postgr.es/m/caog9apeacq--1iekbhfzxsqpw_ylmepaa4hndny5+zulpt8...@mail.gmail.com Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/fc49e24fa69a15efacd5b8958115ed9c43c48f9a Modified Files -- configure | 54 - configure.in| 31 --- contrib/pg_standby/pg_standby.c | 115 +-- doc/src/sgml/backup.sgml| 2 +- doc/src/sgml/installation.sgml | 14 -- doc/src/sgml/ref/initdb.sgml| 15 ++ doc/src/sgml/wal.sgml | 13 +- src/backend/access/transam/twophase.c | 3 +- src/backend/access/transam/xlog.c | 255 ++-- src/backend/access/transam/xlogarchive.c| 14 +- src/backend/access/transam/xlogfuncs.c | 10 +- src/backend/access/transam/xlogreader.c | 32 +-- src/backend/access/transam/xlogutils.c | 36 ++-- src/backend/bootstrap/bootstrap.c | 15 +- src/backend/postmaster/checkpointer.c | 5 +- src/backend/replication/basebackup.c| 34 ++-- src/backend/replication/logical/logical.c | 2 +- src/backend/replication/logical/reorderbuffer.c | 19 +- src/backend/replication/slot.c | 2 +- src/backend/replication/walreceiver.c | 14 +- src/backend/replication/walreceiverfuncs.c | 4 +- src/backend/replication/walsender.c | 16 +- src/backend/utils/misc/guc.c| 20 +- src/backend/utils/misc/pg_controldata.c | 5 +- src/backend/utils/misc/postgresql.conf.sample | 2 +- src/bin/initdb/initdb.c | 58 +- src/bin/pg_basebackup/pg_basebackup.c | 7 +- src/bin/pg_basebackup/pg_receivewal.c | 16 +- src/bin/pg_basebackup/receivelog.c | 36 ++-- src/bin/pg_basebackup/streamutil.c | 76 +++ src/bin/pg_basebackup/streamutil.h | 2 + src/bin/pg_controldata/pg_controldata.c | 15 +- src/bin/pg_resetwal/pg_resetwal.c | 55 +++-- src/bin/pg_rewind/parsexlog.c | 30 ++- src/bin/pg_rewind/pg_rewind.c | 12 +- src/bin/pg_rewind/pg_rewind.h | 1 + src/bin/pg_test_fsync/pg_test_fsync.c | 7 +- src/bin/pg_upgrade/test.sh | 4 +- src/bin/pg_waldump/pg_waldump.c | 246 --- src/include/access/xlog.h | 1 + src/include/access/xlog_internal.h | 76 --- src/include/access/xlogreader.h | 8 +- src/include/catalog/pg_control.h| 2 +- src/include/pg_config.h.in | 5 - src/include/pg_config_manual.h | 6 + src/tools/msvc/Solution.pm | 2 - 46 files changed, 897 insertions(+), 500 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Accept that server might not be able to send error in crash reco
Accept that server might not be able to send error in crash recovery test. As it turns out we can't rely that the script's monitoring session is terminated with a proper error by the server, because the session might be terminated while already trying to send data. Also improve robustness and error reporting facilities of the test, developed while debugging this issue. Discussion: https://postgr.es/m/20170920020038.kllxgilo7xzwm...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/5ada1fcd0c30be1b0b793a802cf6da386a6c1925 Modified Files -- src/test/recovery/t/013_crash_restart.pl | 98 1 file changed, 74 insertions(+), 24 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.
On 2017-09-19 19:00:38 -0700, Andres Freund wrote: > Given this fact pattern, I'll allow the case without a received error > message in the recovery test. Objections? Hearing none. Pushed. While debugging this, I've also introduced a pump wrapper so that we now get: ok 4 - exactly one process killed with SIGQUIT # aborting wait: program died # stream contents: >>psql::9: WARNING: terminating connection because of crash of another server process # DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. # HINT: In a moment you should be able to reconnect to the database and repeat your command. # psql::9: server closed the connection unexpectedly # This probably means the server terminated abnormally # before or while processing the request. # psql::9: connection to server was lost # << # pattern searched for: (?^m:MAYDAY: terminating connection because of crash of another server process) not ok 5 - psql query died successfully after SIGQUIT Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.
On 2017-09-19 18:06:29 -0700, Andres Freund wrote: > On 2017-09-19 16:46:58 -0400, Tom Lane wrote: > > Have we forgotten an fflush() or something? > > After hacking a fix for my previous theory, I started adding strace into > the mix, to verify this. Takes longer to reproduce, but after filtering > to -e trace=network, I got this: > > socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3 > connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 > ENOENT (No such file or directory) > socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3 > connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 > ENOENT (No such file or directory) > socket(AF_UNIX, SOCK_STREAM, 0) = 3 > connect(3, {sa_family=AF_UNIX, sun_path="/tmp/EDkYotgk3u/.s.PGSQL.57230"}, > 110) = 0 > getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 > getsockname(3, {sa_family=AF_UNIX}, [128->2]) = 0 > sendto(3, "\0\0\0O\0\3\0\0user\0andres\0database\0pos"..., 79, MSG_NOSIGNAL, > NULL, 0) = 79 > recvfrom(3, "R\0\0\0\10\0\0\0\0S\0\0\0,application_name\0t"..., 16384, 0, > NULL, NULL) = 340 > sendto(3, "Q\0\0\0\37SELECT $$psql-connected$$;\0", 32, MSG_NOSIGNAL, NULL, > 0) = 32 > recvfrom(3, > "T\0\0\0!\0\1?column?\0\0\0\0\0\0\0\0\0\0\31\377\377\377\377\377\377"..., > 16384, 0, NULL, NULL) = 79 > sendto(3, "Q\0\0\0\33SELECT pg_sleep(3600);\0", 28, MSG_NOSIGNAL, NULL, 0) = > 28 > recvfrom(3, 0x555817dae2a0, 16384, 0, NULL, NULL) = -1 ECONNRESET (Connection > reset by peer) > +++ exited with 2 +++ > > So indeed, we got a connreset before receiving the proper error > message. > > The corresponding server log (debug3): > 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 730 > 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 716 > 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 715 > 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 717 > 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 718 > 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 719 > 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl WARNING: > terminating connection because of crash of another server process > 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DETAIL: The > postmaster has commanded this server process to roll back the current t > ransaction and exit, because another server process exited abnormally and > possibly corrupted shared memory. > 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl HINT: In a moment > you should be able to reconnect to the database and repeat your c > ommand. > 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG: > shmem_exit(-1): 0 before_shmem_exit callbacks to make > 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG: > shmem_exit(-1): 0 on_shmem_exit callbacks to make > 2017-09-20 00:57:00.573 UTC [720] DEBUG: shmem_exit(-1): 0 before_shmem_exit > callbacks to make > 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG: > proc_exit(-1): 0 callbacks to make > ... > 2017-09-20 00:57:00.577 UTC [713] DEBUG: server process (PID 730) exited > with exit code 2 > 2017-09-20 00:57:00.577 UTC [713] DETAIL: Failed process was running: SELECT > pg_sleep(3600); > 2017-09-20 00:57:00.577 UTC [713] LOG: all server processes terminated; > reinitializing > > So the server indeed was killed by SIGQUIT, not an escalation to > SIGKILL. And it output stuff to the server log, and didn't complain > about communication to the client... Odd. Hah! The reason for that is that socket_flush tries to avoid doing stuff recursively: static int socket_flush(void) { int res; /* No-op if reentrant call */ if (PqCommBusy) return 0; ... (detected by putting an elog(COMMERROR) there) adding an abort shows the following backtrace: #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x7f089de4e42a in __GI_abort () at abort.c:89 #2 0x56473218a3f6 in socket_flush () at /home/andres/src/postgresql/src/backend/libpq/pqcomm.c:1408 #3 0x56473246e4ec in send_message_to_frontend (edata=0x5647329e34e0 ) at /home/andres/src/postgresql/src/backend/utils/error/elog.c:3319 #4 0x56473246ad02 in EmitErrorReport () at /home/andres/src/postgresql/src/backend/utils/error/elog.c:1483 #5 0x5647324680af in errfinish (dummy=0) at /home/andres/src/postgresql/src/backend/utils/error/elog.c:483 #6 0x5647322fb340 in quickdie (postgres_signal_arg=3) at /home/andres/src/postgresql/src/backend/tcop/postgres.c:2608
Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.
On 2017-09-19 16:46:58 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > So this is geniuinely interesting. When the machine is really loaded (as > > in 6 animals running on a vm at the same time, incuding valgrind), psql > > sometimes doesn't get the WARNING message from a shutdown. Instead it > > gets > > # psql::3: server closed the connection unexpectedly > > # This probably means the server terminated abnormally > > # before or while processing the request. > > # psql::3: connection to server was lost > > That seems pretty weird. Maybe it's not the same case, but in > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae=2017-09-19%2020%3A10%3A02 > > you can see from the postmaster log that the backend *is* issuing > the message, or at least it's getting to the server log: > > 2017-09-19 20:20:34.476 UTC [6363] [unknown] LOG: connection received: > host=[local] > 2017-09-19 20:20:34.477 UTC [6363] [unknown] LOG: connection authorized: > user=andres database=postgres > 2017-09-19 20:20:34.478 UTC [6363] t/013_crash_restart.pl LOG: statement: > SELECT $$psql-connected$$; > ... > 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl WARNING: > terminating connection because of crash of another server process > 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl DETAIL: The > postmaster has commanded this server process to roll back the current > transaction and exit, because another server process exited abnormally and > possibly corrupted shared memory. > 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl HINT: In a moment > you should be able to reconnect to the database and repeat your command. > > Have we forgotten an fflush() or something? After hacking a fix for my previous theory, I started adding strace into the mix, to verify this. Takes longer to reproduce, but after filtering to -e trace=network, I got this: socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory) socket(AF_UNIX, SOCK_STREAM, 0) = 3 connect(3, {sa_family=AF_UNIX, sun_path="/tmp/EDkYotgk3u/.s.PGSQL.57230"}, 110) = 0 getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 getsockname(3, {sa_family=AF_UNIX}, [128->2]) = 0 sendto(3, "\0\0\0O\0\3\0\0user\0andres\0database\0pos"..., 79, MSG_NOSIGNAL, NULL, 0) = 79 recvfrom(3, "R\0\0\0\10\0\0\0\0S\0\0\0,application_name\0t"..., 16384, 0, NULL, NULL) = 340 sendto(3, "Q\0\0\0\37SELECT $$psql-connected$$;\0", 32, MSG_NOSIGNAL, NULL, 0) = 32 recvfrom(3, "T\0\0\0!\0\1?column?\0\0\0\0\0\0\0\0\0\0\31\377\377\377\377\377\377"..., 16384, 0, NULL, NULL) = 79 sendto(3, "Q\0\0\0\33SELECT pg_sleep(3600);\0", 28, MSG_NOSIGNAL, NULL, 0) = 28 recvfrom(3, 0x555817dae2a0, 16384, 0, NULL, NULL) = -1 ECONNRESET (Connection reset by peer) +++ exited with 2 +++ So indeed, we got a connreset before receiving the proper error message. The corresponding server log (debug3): 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 730 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 716 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 715 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 717 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 718 2017-09-20 00:57:00.573 UTC [713] DEBUG: sending SIGQUIT to process 719 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl WARNING: terminating connection because of crash of another server process 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DETAIL: The postmaster has commanded this server process to roll back the current t ransaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl HINT: In a moment you should be able to reconnect to the database and repeat your c ommand. 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2017-09-20 00:57:00.573 UTC [720] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2017-09-20 00:57:00.573 UTC [730] t/013_crash_restart.pl DEBUG: proc_exit(-1): 0 callbacks to make ... 2017-09-20 00:57:00.577 UTC [713] DEBUG: server process (PID 730) exited with exit code 2 2017-09-20 00:57:00.577 UTC [713] DETAIL: Failed process
[COMMITTERS] pgsql: s/NULL byte/NUL byte/ in comment refering to C string terminator
s/NULL byte/NUL byte/ in comment refering to C string terminator. Reported-By: Robert Haas Discussion: https://postgr.es/m/CA+Tgmoa+YBvWgFST2NVoeXjVSohEpK=vqnvcsockhtvvxfl...@mail.gmail.com Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/896537f078ba4d709ce754ecaff8350fd55bdfd8 Modified Files -- src/backend/postmaster/pgstat.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.
On 2017-09-19 13:53:18 -0700, Andres Freund wrote: > On 2017-09-19 16:46:58 -0400, Tom Lane wrote: > > Have we forgotten an fflush() or something? > > > > Also, maybe problem is on client side. I vaguely recall a libpq bug > > wherein it would complain about socket EOF even though data remained > > to be processed. Maybe we reintroduced something like that? > > That seems quite possible. Preface: This is largely a as of yet unverified theory. But seems worthwhile to investigate whether it's the cause of this issue or not. By changing the error message I know that the "server closed the connection unexpectedly" ERROR is coming from pqsecure_raw_read(). The caller here has to be pqReadData(). Where have the following block: if (nread > 0) { conn->inEnd += nread; /* * Hack to deal with the fact that some kernels will only give us back * 1 packet per recv() call, even if we asked for more and there is * more available. If it looks like we are reading a long message, * loop back to recv() again immediately, until we run out of data or * buffer space. Without this, the block-and-restart behavior of * libpq's higher levels leads to O(N^2) performance on long messages. * * Since we left-justified the data above, conn->inEnd gives the * amount of data already read in the current message. We consider * the message "long" once we have acquired 32k ...B */ if (conn->inEnd > 32768 && (conn->inBufSize - conn->inEnd) >= 8192) { someread = 1; goto retry3; } return 1; } So imagine we've just read one block containing the error message from the server. That's going to be less than 32kb. So we go into the retry3 path. /* OK, try to read some data */ retry3: nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd, conn->inBufSize - conn->inEnd); if (nread < 0) { if (SOCK_ERRNO == EINTR) goto retry3; /* Some systems return EAGAIN/EWOULDBLOCK for no data */ #ifdef EAGAIN if (SOCK_ERRNO == EAGAIN) return someread; #endif #if defined(EWOULDBLOCK) && (!defined(EAGAIN) || (EWOULDBLOCK != EAGAIN)) if (SOCK_ERRNO == EWOULDBLOCK) return someread; #endif /* We might get ECONNRESET here if using TCP and backend died */ #ifdef ECONNRESET if (SOCK_ERRNO == ECONNRESET) goto definitelyFailed; #endif /* pqsecure_read set the error message for us */ return -1; } So if the connection actually was closed we: definitelyFailed: /* Do *not* drop any already-read data; caller still wants it */ pqDropConnection(conn, false); conn->status = CONNECTION_BAD; /* No more connection to backend */ return -1; and PQgetResult() will happily signal that upwards with an error: /* Wait for some more data, and load it. */ if (flushResult || pqWait(TRUE, FALSE, conn) || pqReadData(conn) < 0) { /* * conn->errorMessage has been set by pqWait or pqReadData. We * want to append it to any already-received error message. */ pqSaveErrorResult(conn); conn->asyncStatus = PGASYNC_IDLE; return pqPrepareAsyncResult(conn); ISTM, we need to react differently if we've already partially read data successfully? Am I missing something? Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Avoid use of non-portable strnlen() in pgstat_clip_activity().
Avoid use of non-portable strnlen() in pgstat_clip_activity(). The use of strnlen rather than strlen was just paranoia. Instead of giving up on the paranoia, just implement the safeguard differently. And add a comment explaining why we're careful. Author: Andres Freund Discussion: https://postgr.es/m/e1duokj-0001mc...@gemulon.postgresql.org Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/71edbb6f66f7139d6209334ef8734a122ba06b56 Modified Files -- src/backend/postmaster/pgstat.c | 25 + src/include/pgstat.h| 2 +- 2 files changed, 22 insertions(+), 5 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Speedup pgstat_report_activity by moving mb-aware truncation to
On 2017-09-19 16:53:09 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > On 2017-09-19 16:25:31 -0400, Tom Lane wrote: > >> Looks like you're going to need to not depend on strnlen(). > > > Yup noticed - we have pg_strnlen as a static function in > > src/port/snprintf. I'm tempted to just make that public, rather than > > open code the equivalent? > > No, we should not tie this to whether we're using port/snprintf.c. Oh, I would have moved it to src/common/string.c, sorry for not mentioning that. > Personally, I don't see why this isn't coded more like > > int rawlen = strlen(activity); > > rawlen = Min(rawlen, pgstat_track_activity_query_size - 1); Pure paranoia, but probably not justified paranoia. We take care that there's always a trailing NULL byte, even under concurrent modifications (which exist due to pgstat_get_backend_current_activity()). I'll just change things so that the pnstrdup() happens first, and then do a plain strlen() on that. Same paranoia, less cost. > It seems likely to me that strlen() is going to be optimized a lot > harder than strnlen() on most platforms, so that this way would win > not only for strings up to pgstat_track_activity_query_size but for > strings some way beyond that. Yes, for really really long queries > the strnlen way would win, but that's optimizing for the wrong case IMO. I wasn't thinking of optimizing this - and I've a hard time seing a case where even the naive snprintf pg_strnlen(), or a plain strlen() would be relevant performancewise. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.
On 2017-09-19 16:46:58 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > So this is geniuinely interesting. When the machine is really loaded (as > > in 6 animals running on a vm at the same time, incuding valgrind), psql > > sometimes doesn't get the WARNING message from a shutdown. Instead it > > gets > > # psql::3: server closed the connection unexpectedly > > # This probably means the server terminated abnormally > > # before or while processing the request. > > # psql::3: connection to server was lost > > That seems pretty weird. Maybe it's not the same case, but in > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae=2017-09-19%2020%3A10%3A02 > > you can see from the postmaster log that the backend *is* issuing > the message, or at least it's getting to the server log: > > 2017-09-19 20:20:34.476 UTC [6363] [unknown] LOG: connection received: > host=[local] > 2017-09-19 20:20:34.477 UTC [6363] [unknown] LOG: connection authorized: > user=andres database=postgres > 2017-09-19 20:20:34.478 UTC [6363] t/013_crash_restart.pl LOG: statement: > SELECT $$psql-connected$$; > ... > 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl WARNING: > terminating connection because of crash of another server process > 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl DETAIL: The > postmaster has commanded this server process to roll back the current > transaction and exit, because another server process exited abnormally and > possibly corrupted shared memory. > 2017-09-19 20:20:34.485 UTC [6363] t/013_crash_restart.pl HINT: In a moment > you should be able to reconnect to the database and repeat your command. I think it's likely the same - I've observed the same with the added instrumentation. > Have we forgotten an fflush() or something? > > Also, maybe problem is on client side. I vaguely recall a libpq bug > wherein it would complain about socket EOF even though data remained > to be processed. Maybe we reintroduced something like that? That seems quite possible. > > We can obviously easily make the test accept both - but are we ok with > > the client sometimes not getting the message? > > I'm not ... Same here. I'll see if I can spot the bug in an hour or two. If not I'll make the test temporarily accept both outputs while investigating? Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Speedup pgstat_report_activity by moving mb-aware truncation to
On 2017-09-19 16:25:31 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > Speedup pgstat_report_activity by moving mb-aware truncation to read side. > > Looks like you're going to need to not depend on strnlen(). Yup noticed - we have pg_strnlen as a static function in src/port/snprintf. I'm tempted to just make that public, rather than open code the equivalent? Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.
On 2017-09-19 15:24:49 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > Checkining on calliphoridae why that's not sufficient - the machine's > > busy, so the build & test will take a bit. > > FWIW, prairiedog got through the recovery tests this time --- run's > still going though. So this is geniuinely interesting. When the machine is really loaded (as in 6 animals running on a vm at the same time, incuding valgrind), psql sometimes doesn't get the WARNING message from a shutdown. Instead it gets # psql::3: server closed the connection unexpectedly # This probably means the server terminated abnormally # before or while processing the request. # psql::3: connection to server was lost We can obviously easily make the test accept both - but are we ok with the client sometimes not getting the message? Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Speedup pgstat_report_activity by moving mb-aware truncation to
Speedup pgstat_report_activity by moving mb-aware truncation to read side. Previously multi-byte aware truncation was done on every pgstat_report_activity() call - proving to be a bottleneck for workloads with long query strings that execute quickly. Instead move the truncation to the read side, which commonly is executed far less frequently. That's possible because all server encodings allow to determine the length of a multi-byte string from the first byte. Rename PgBackendStatus.st_activity to st_activity_raw so existing extension users of the field break - their code has to be adjusted to use pgstat_clip_activity(). Author: Andres Freund Tested-By: Khuntal Ghosh Reviewed-By: Robert Haas, Tom Lane Discussion: https://postgr.es/m/20170912071948.pa7igbpkkkvie...@alap3.anarazel.de Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/54b6cd589ac2f5635a42511236a5eb7299e2dcaf Modified Files -- src/backend/postmaster/pgstat.c | 63 - src/backend/utils/adt/pgstatfuncs.c | 17 +++--- src/include/pgstat.h| 12 +-- 3 files changed, 72 insertions(+), 20 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Make new crash restart test a bit more robust.
On 2017-09-19 17:40:20 +, Andres Freund wrote: > Make new crash restart test a bit more robust. > > Add timeouts in case psql doesn't deliver the expected output, and try > to cause the monitoring psql to be fully connected to a backend. This > isn't necessarily everything needed, but at least the timeouts should > reduce the pain for buildfarm owners. > > Author: Andres Freund > Reported-By: Tom Lane, BF animals prairiedog and calliphoridae > Discussion: https://postgr.es/m/e1du6zt-00043i...@gemulon.postgresql.org Checkining on calliphoridae why that's not sufficient - the machine's busy, so the build & test will take a bit. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.
On 2017-09-19 09:58:26 -0700, Andres Freund wrote: > > > On September 19, 2017 9:53:28 AM PDT, Tom Lane <t...@sss.pgh.pa.us> wrote: > >Well, please fix it ASAP, if you don't want to take it out pending > >the fixes. > > Will as soon as I finished my morning coffee. Uncaffeinated, which my phone > fittingly autocorrects to unvaccinated, commits aren't a good idea. Done. Survived ~100 cycles here locally, while running make -j16 -s check-world in two parallel branches. But I have a fast disk, so it's not comparable. If the buildfarm doesn't complain about the use of IPC::Run's timeout functionality, we should probably patch that into the other use of IPC::Run as well, but especially into the other user of the pump() until ... scheme. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Make new crash restart test a bit more robust.
Make new crash restart test a bit more robust. Add timeouts in case psql doesn't deliver the expected output, and try to cause the monitoring psql to be fully connected to a backend. This isn't necessarily everything needed, but at least the timeouts should reduce the pain for buildfarm owners. Author: Andres Freund Reported-By: Tom Lane, BF animals prairiedog and calliphoridae Discussion: https://postgr.es/m/e1du6zt-00043i...@gemulon.postgresql.org Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/1910353675bd149e1020b29c0fae02538fc358cd Modified Files -- src/test/recovery/t/013_crash_restart.pl | 34 1 file changed, 21 insertions(+), 13 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.
On September 19, 2017 9:53:28 AM PDT, Tom Lanewrote: >Well, please fix it ASAP, if you don't want to take it out pending >the fixes. Will as soon as I finished my morning coffee. Uncaffeinated, which my phone fittingly autocorrects to unvaccinated, commits aren't a good idea. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.
Hi, On 2017-09-19 12:13:54 -0400, Tom Lane wrote: > IOW, the "$monitor" instance of psql did not complete making its > connection until after the crash/restart cycle had occurred. That'd be easy enough to fix... Just something like $monitor_stdin .= q[ SELECT $$am-i-up$$; ]; $monitor->pump until $monitor_stdout =~ /am-i-up/; $monitor_stdout = ''; > So we're just sitting there waiting for a crash report that won't > come. Which is another very serious deficiency in this test: > lacking any sort of timeout, it will just freeze indefinitely > if anything doesn't happen exactly the way it expects. From a > buildfarm owner's standpoint, that's pretty damn unfriendly. > It means having to manually unwedge your animals from time to time. Note that I just copied the code for that from another test - this is isn't unique to this test. I agree that it'd be good to add a timeout to those pump calls. > I'd like to ask you to revert this test, at least pending making > it a whole lot more bulletproof. Hm. Ok. That seems like an overreaction to me - the failure rate isn't actualy that high so far. I'm happy to add both timeouts and "earlier startup" of the $monitor, but I'd prefer to do so in-tree - I'd run the test through 100+ iterations locally, without any of this showing up. > We don't really need crash recovery testing in the buildfarm IMO --- > we hackers crash the system plenty often enough to notice problems > there. I for one don't exercise that kind of crash restarts, my development scripts all work with restart_after_crash = false. What I find more concerning however is coverage of EXEC_BACKEND, which has far fewer developers actively running it constantly. Greetings, Andres Freund -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts.
On September 18, 2017 8:55:35 PM PDT, Tom Lane <t...@sss.pgh.pa.us> wrote: >Andres Freund <and...@anarazel.de> writes: >> Add test for postmaster crash restarts. > >Hm, calliphoridae doesn't like this. Yea. Not clear to me why yet. The machine ran a number of instances with nearly the same config successfully. Can't imagine that copyparse makes a difference here. I suspect it's somehow load related... Ran a good number of iterations locally, didn't reproduce, even under high load. Think I'll add bit more error reporting. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Rearm statement_timeout after each executed query.
Rearm statement_timeout after each executed query. Previously statement_timeout, in the extended protocol, affected all messages till a Sync message. For clients that pipeline/batch query execution that's problematic. Instead disable timeout after each Execute message, and enable, if necessary, the timer in start_xact_command(). As that's done only for Execute and not Parse / Bind, pipelining the latter two could still cause undesirable timeouts. But a survey of protocol implementations shows that all drivers issue Sync messages when preparing, and adding timeout rearming to both is fairly expensive for the common parse / bind / execute sequence. Author: Tatsuo Ishii, editorialized by Andres Freund Reviewed-By: Takayuki Tsunakawa, Andres Freund Discussion: https://postgr.es/m/20170222.115044.1665674502985097185.t-is...@sraoss.co.jp Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/f8e5f156b30efee5d0038b03e38735773abcb7ed Modified Files -- src/backend/tcop/postgres.c | 77 ++--- 1 file changed, 65 insertions(+), 12 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Fix uninitialized variable in dshash.c.
Fix uninitialized variable in dshash.c. A bugfix for commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20. The code would have crashed if hashtable->size_log2 ever had the same value as hashtable->control->size_log2 by coincidence. Per Valgrind. Author: Thomas Munro Reported-By: Tomas Vondra Discussion: https://postgr.es/m/e72fb33c-4f31-f276-e972-263d9b59554d%402ndquadrant.com Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/0fb9e4ace5ce4d479d839a720f32b99fdc87f455 Modified Files -- src/backend/lib/dshash.c | 9 + 1 file changed, 9 insertions(+) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers
[COMMITTERS] pgsql: Fix crash restart bug introduced in 8356753c212.
Fix crash restart bug introduced in 8356753c212. The bug was caused by not re-reading the control file during crash recovery restarts, which lead to an attempt to pfree() shared memory contents. The fix is to re-read the control file, which seems good anyway. It's unclear as of this moment, whether we want to keep the refactoring introduced in the commit referenced above, or come up with an alternative approach. But fixing the bug in the mean time seems like a good idea regardless. A followup commit will introduce regression test coverage for crash restarts. Reported-By: Tom Lane Discussion: https://postgr.es/m/14134.1505572...@sss.pgh.pa.us Branch -- master Details --- https://git.postgresql.org/pg/commitdiff/ec9e05b3c392ba9587f283507459737684539574 Modified Files -- src/backend/access/transam/xlog.c | 44 +++-- src/backend/postmaster/postmaster.c | 13 --- src/backend/tcop/postgres.c | 2 +- src/include/access/xlog.h | 2 +- 4 files changed, 39 insertions(+), 22 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers