Re: tableam scan-API patch broke foreign key validation

2019-04-07 Thread Andres Freund
Hi,

On 2019-04-06 14:43:26 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2019-04-06 14:34:34 -0400, Tom Lane wrote:
> >> Why should this code need to free anything?  That'd be the responsibility
> >> of the slot code, no?
> 
> > Well, not really. If a slot doesn't hold heap tuples internally,
> > ExecFetchSlotHeapTuple() will return a fresh heap tuple (but signal so
> > by setting *should_free = true if not NULL).
> 
> Ah, got it: ignoring should_free is indeed a potential issue here.

I've pushed a revised version of my earlier patch adding a memory
context that's reset after each tuple.
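
To make the shape of that concrete, here is a minimal sketch (illustrative
names only, not the committed code) of validating each row under a memory
context that is reset per tuple, freeing the tuple copy only when
ExecFetchSlotHeapTuple() says we own it, and checking for interrupts per row:

    MemoryContext perTupleCxt = AllocSetContextCreate(CurrentMemoryContext,
                                                      "validateForeignKeyConstraint",
                                                      ALLOCSET_SMALL_SIZES);

    while (table_scan_getnextslot(scan, ForwardScanDirection, slot))
    {
        MemoryContext oldcxt;
        bool        should_free;
        HeapTuple   tuple;

        CHECK_FOR_INTERRUPTS();

        oldcxt = MemoryContextSwitchTo(perTupleCxt);

        /* don't materialize; only free the copy if we were handed one */
        tuple = ExecFetchSlotHeapTuple(slot, false, &should_free);

        /* ... set up the trigger call and invoke RI_FKey_check() ... */

        if (should_free)
            heap_freetuple(tuple);

        MemoryContextSwitchTo(oldcxt);
        MemoryContextReset(perTupleCxt);    /* drop per-row leakage */
    }

    MemoryContextDelete(perTupleCxt);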

Greetings,

Andres Freund




Re: Shouldn't validateForeignKeyConstraint() reset memory context?

2019-04-07 Thread Andres Freund
Hi,

On 2018-07-11 11:56:28 -0700, Andres Freund wrote:
> while looking at the pluggable storage patch I noticed that
> validateForeignKeyConstraint() calls RI_FKey_check() for each row
> without resetting a memory / expression context.  There's not too much
> leakage in the called code, but there's some I think.

This has been fixed since. See 
https://www.postgresql.org/message-id/20190408055302.qirqcbnhjxwsxq3m%40alap3.anarazel.de

- Andres




Re: Assert failure when validating foreign keys

2019-04-07 Thread Andres Freund
Hi,

On 2019-03-25 11:04:05 -0700, Andres Freund wrote:
> Hi,
> 
> On 2019-03-24 23:54:53 +1300, David Rowley wrote:
> > This results in an Assert failure on master and an elog ERROR prior to
> > c2fe139c201:
> > 
> > create role test_role with login;
> > create table ref(a int primary key);
> > grant references on ref to test_role;
> > set role test_role;
> > create table t1(a int, b int);
> > insert into t1 values(1,1);
> > alter table t1 add constraint t1_b_key foreign key (b) references ref(a);
> > server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> > The connection to the server was lost. Attempting reset: Failed.
> > 
> > Fails in heapam_tuple_satisfies_snapshot() at
> > Assert(BufferIsValid(bslot->buffer));
> > 
> > c2fe139c201~1:
> > ERROR:  expected buffer tuple
> > 
> > The test case is just a variation of the case in [1], but a different
> > bug, so reporting it on a different thread.
> > 
> > I've not looked into the cause or when it started happening.
> 
> I think the cause is stupidity of mine. In
> validateForeignKeyConstraint() I passed true to the materialize argument
> of ExecFetchSlotHeapTuple(). Which therefore is made independent of
> buffers. Which this assert then notices.  Just changing that to false,
> which is correct, fixes the issue for me.
> 
> I'm a bit confused as to how we have no tests for this code?  Is it just
> that the left join codepath is "too good"?
> 
> I've also noticed that we should free the tuple - that doesn't matter
> for heap, but it sure does for other callers.  But uh, is it actually ok
> to validate an entire table's worth of foreign keys without a memory
> context reset? I.e. shouldn't we have a memory context that we reset
> after each iteration?
> 
> Also, why's there no CHECK_FOR_INTERRUPTS()? heap has some internally on
> a page level, but that doesn't seem all that granular?

Tom pushed a part of this earlier in
commit 46e3442c9ec858071d60a1c0fae2e9868aeaa0c8
Author: Tom Lane 
Date:   2019-04-06 15:09:09 -0400

Fix failures in validateForeignKeyConstraint's slow path.

I've now added a fixed version of the memory context portion of this
patch.

Greetings,

Andres Freund




Re: change password_encryption default to scram-sha-256?

2019-04-07 Thread Andres Freund
Hi,

On 2019-04-08 01:34:42 -0400, Tom Lane wrote:
> Michael Paquier  writes:
> > From what I can see, the major drivers that do not use libpq directly
> > support our SASL protocol: JDBC and npgsql.  However I can count three
> > of them which still don't support it: Crystal, pq (Go) and asyncpg.
> > pq and asyncpg are very popular on github, with at least 3000 stars
> > each, which is a lot I think.  I have also double-checked their source
> > code and I am seeing no trace of SASL or SCRAM, so it seems to me that
> > we may want to wait more before switching the default.
> 
> Perhaps we could reach out to the authors of those libraries,
> and encourage them to provide support in the next year or so?


Seems go/pq might get it soon-ish: https://github.com/lib/pq/pull/833

There doesn't appear to be much movement on the crystal front (
https://github.com/will/crystal-pg/issues/154 ), but I don't think it's
popular enough to really worry.  There's an issue for asyncpg
https://github.com/MagicStack/asyncpg/issues/314 - but not too much
movement either.

Greetings,

Andres Freund




Re: change password_encryption default to scram-sha-256?

2019-04-07 Thread Tom Lane
Michael Paquier  writes:
> From what I can see, the major drivers that do not use libpq directly
> support our SASL protocol: JDBC and npgsql.  However I can count three
> of them which still don't support it: Crystal, pq (Go) and asyncpg.
> pq and asyncpg are very popular on github, with at least 3000 stars
> each, which is a lot I think.  I have also double-checked their source
> code and I am seeing no trace of SASL or SCRAM, so it seems to me that
> we may want to wait more before switching the default.

Perhaps we could reach out to the authors of those libraries,
and encourage them to provide support in the next year or so?

I don't doubt that switching to scram-sha-256 is a good idea in
the long run.  The idea here was to give driver authors a reasonable
amount of time to update.  I don't really think that one year
counts as a "reasonable amount of time" given how slowly this
project moves overall ... but we don't want to wait forever ...

regards, tom lane




Re: change password_encryption default to scram-sha-256?

2019-04-07 Thread Michael Paquier
On Sun, Apr 07, 2019 at 08:23:06PM +0200, David Fetter wrote:
> Great idea!  Does it make sense to test all, or at least some
> significant fraction of the connectors listed in
> https://wiki.postgresql.org/wiki/Client_Libraries by default?

This is a more interesting list:
https://wiki.postgresql.org/wiki/List_of_drivers

From what I can see, the major drivers that do not use libpq directly
support our SASL protocol: JDBC and npgsql.  However I can count three
of them which still don't support it: Crystal, pq (Go) and asyncpg.
pq and asyncpg are very popular on github, with at least 3000 stars
each, which is a lot I think.  I have also double-checked their source
code and I am seeing no trace of SASL or SCRAM, so it seems to me that
we may want to wait more before switching the default.
--
Michael




RE: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Tsunakawa, Takayuki
From: Alvaro Herrera [mailto:alvhe...@2ndquadrant.com]
> "vacuum_truncate" gets my vote too.

+1


From: 'Andres Freund' [mailto:and...@anarazel.de]
> Personally I think the name just needs some committer to make a
> call. This largely is going to be used after encountering too many
> cancellations in production, and researching the cause. Minor spelling
> differences don't seem to particularly matter here.

Absolutely.  Thank you.


From: 'Andres Freund' [mailto:and...@anarazel.de]
> I think it needs to be an autovac option. The production issue is that
> autovacuums constantly cancel queries on the standbys despite
> hot_standby_feedback if you have a workload that does frequent
> truncations. If there's no way to configure it in a way that autovacuum
> takes it into account, people will just disable autovacuum and rely
> entirely on manual scripts. That's what already happens - leading to a
> much bigger foot-gun than disabling truncation.  FWIW, people in
> production already use the workaround of configuring snapshot_too_old as
> that, for undocumented reasons, disables truncations. That's not better
> either.

Right.  We don't want autovacuum to be considered a criminal.


Regards
Takayuki Tsunakawa




Re: Ordered Partitioned Table Scans

2019-04-07 Thread Amit Langote
On 2019/04/06 18:06, Julien Rouhaud wrote:
> On Sat, Apr 6, 2019 at 2:45 AM David Rowley
>  wrote:
>>
>> On Sat, 6 Apr 2019 at 12:26, Tom Lane  wrote:
>>> Pushed with some hacking, mostly trying to improve the comments.
>>
>> Great! Many thanks for improving those and pushing it.
>>
>> Many thanks to Julien, Antonin for their detailed reviews on this.
>> Thanks Amit for your input on this as well. Much appreciated.
> 
> Thanks!  I'm glad that we'll have this optimization for pg12.

+1, thank you for working on this.  It's nice to see partitioning being
useful to optimize query execution in even more cases.

Regards,
Amit





Re: lazy_scan_heap() forgets to mark buffer dirty when setting all frozen?

2019-04-07 Thread Andres Freund
Hi,

On 2019-04-08 00:48:20 -0400, Alvaro Herrera wrote:
> On 2019-Apr-07, Andres Freund wrote:
> 
> > lazy_scan_heap() contains the following block:
> > 
> >     /*
> >      * If the all-visible page is turned out to be all-frozen but not
> >      * marked, we should so mark it.  Note that all_frozen is only valid
> >      * if all_visible is true, so we must check both.
> >      */
> >     else if (all_visible_according_to_vm && all_visible && all_frozen &&
> >              !VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
> >     {
> >         /*
> >          * We can pass InvalidTransactionId as the cutoff XID here,
> >          * because setting the all-frozen bit doesn't cause recovery
> >          * conflicts.
> >          */
> >         visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
> >                           vmbuffer, InvalidTransactionId,
> >                           VISIBILITYMAP_ALL_FROZEN);
> >     }
> > 
> > but I'm afraid that's not quite enough.
> 
> Apparently the initial commit a892234f830e had MarkBufferDirty, but it
> was removed one week later by 77a1d1e79892.

Good catch. Kinda looks like it could have been an accidental removal?
Robert?

Greetings,

Andres Freund




Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Alvaro Herrera
On 2019-Apr-08, Tom Lane wrote:

> The closest match to that name, probably, is just "vacuum_truncate" ---
> which was proposed at the beginning of March, but apparently also
> rejected, because there's no subsequent reference.

"vacuum_truncate" gets my vote too.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: lazy_scan_heap() forgets to mark buffer dirty when setting all frozen?

2019-04-07 Thread Alvaro Herrera
On 2019-Apr-07, Andres Freund wrote:

> lazy_scan_heap() contains the following block:
> 
>     /*
>      * If the all-visible page is turned out to be all-frozen but not
>      * marked, we should so mark it.  Note that all_frozen is only valid
>      * if all_visible is true, so we must check both.
>      */
>     else if (all_visible_according_to_vm && all_visible && all_frozen &&
>              !VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
>     {
>         /*
>          * We can pass InvalidTransactionId as the cutoff XID here,
>          * because setting the all-frozen bit doesn't cause recovery
>          * conflicts.
>          */
>         visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
>                           vmbuffer, InvalidTransactionId,
>                           VISIBILITYMAP_ALL_FROZEN);
>     }
> 
> but I'm afraid that's not quite enough.

Apparently the initial commit a892234f830e had MarkBufferDirty, but it
was removed one week later by 77a1d1e79892.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: selecting from partitions and constraint exclusion

2019-04-07 Thread Amit Langote
Hi Thibaut,

On 2019/04/06 1:12, Thibaut wrote:
> On 25/03/2019 at 01:31, Amit Langote wrote:
>> On 2019/03/22 17:17, Amit Langote wrote:
>>> I'll add this to July fest to avoid forgetting about this.
>> I'd forgotten to do this, but done today. :)
>>
>> Thanks,
>> Amit
> 
> Hello Amit,
> 
> Just a quick information that your last patch does not apply on head:
> 
> $ git apply
> ~/Téléchargements/v2-0001-Fix-planner-to-load-partition-constraint-in-some-.patch
> error: patch failed: src/test/regress/expected/partition_prune.out:3637
> error: src/test/regress/expected/partition_prune.out: patch does not apply
> 
> Manually applying it on top of Hosoya's last 2 patches, it corrects the
> different cases we found so far.
> I will keep on testing next week.

Thanks for the heads up.

We are discussing this and another related matter on a different thread
(titled "speeding up planning with partitions" [1]).  Maybe, the problem
originally reported here will get resolved there once we reach consensus
first on what to do in the HEAD branch and what's back-patchable as a
bug-fix to the PG 11 branch.

[1]
https://www.postgresql.org/message-id/50415da6-0258-d135-2ba4-197041b57c5b%40lab.ntt.co.jp





Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Michael Paquier
On Mon, Apr 08, 2019 at 03:56:52AM +, Tsunakawa, Takayuki wrote:
> The consensus on the name seems to be to use truncate rather than
> shrink (a few people kindly said they like shrink, and I'm OK with either
> name.)  And there's no dispute on the behavior.  Do you see any
> other point?

I personally much prefer "truncate" over "shrink", which was my
initial opinion upthread as well.

Please note that the feature freeze will be effective in just a bit
more than 7 hours, at 0:00 on the 8th of April in the AoE timezone
(Anywhere on Earth).
--
Michael




Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread 'Andres Freund'
Hi,

On 2019-04-08 00:38:44 -0400, Tom Lane wrote:
> "Tsunakawa, Takayuki"  writes:
> > From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> >> And, as far as I can see from a quick review of the thread,
> >> we don't really have consensus on the names and behaviors.

Personally I think the name just needs some committer to make a
call. This largely is going to be used after encountering too many
cancellations in production, and researching the cause. Minor spelling
differences don't seem to particularly matter here.


> My own dog in this fight is that we shouldn't have the option at all,
> never mind what the name is.  A persistent reloption to disable truncation
> seems like a real foot-gun.  I'd be okay with a VACUUM command option,
> but for some reason that isn't there at all.

I think it needs to be an autovac option. The production issue is that
autovacuums constantly cancel queries on the standbys despite
hot_standby_feedback if you have a workload that does frequent
truncations. If there's no way to configure it in a way that autovacuum
takes it into account, people will just disable autovacuum and rely
entirely on manual scripts. That's what already happens - leading to a
much bigger foot-gun than disabling truncation.  FWIW, people in
production already use the workaround of configuring snapshot_too_old as
that, for undocumented reasons, disables truncations. That's not better
either.

Greetings,

Andres Freund




Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Tom Lane
"Tsunakawa, Takayuki"  writes:
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
>> And, as far as I can see from a quick review of the thread,
>> we don't really have consensus on the names and behaviors.

> The consensus on the name seems to be to use truncate rather than shrink (a few
> people kindly said they like shrink, and I'm OK with either name.)  And there's
> no dispute on the behavior.  Do you see any other point?

The last patch uses the name vacuum_truncate_enabled, which so far
as I can see never appeared in the thread before today.  How can
you claim there's consensus for that?

I see references back in February to truncate_enabled and vacuum_enabled,
but there was certainly no consensus for either, seeing how long the
thread has dragged on since then (those references are barely halfway
down the thread).  Pasting them together to make a carpal-tunnel-inducing
name isn't automatically going to satisfy people.

Also, it looks like one of the main bones of contention is whether
the option is named consistently with the index-scan-disable option,
which seems to have ended up named "vacuum_index_cleanup".  I submit
that "vacuum_truncate_enabled" is not consistent with that; it's not
even the same part of speech.

The closest match to that name, probably, is just "vacuum_truncate" ---
which was proposed at the beginning of March, but apparently also
rejected, because there's no subsequent reference.

My own dog in this fight is that we shouldn't have the option at all,
never mind what the name is.  A persistent reloption to disable truncation
seems like a real foot-gun.  I'd be okay with a VACUUM command option,
but for some reason that isn't there at all.

regards, tom lane




Re: Back-branch bugs with fully-prunable UPDATEs

2019-04-07 Thread Amit Langote
On 2019/04/08 1:57, Tom Lane wrote:
> Amit Langote  writes:
>> On Sun, Apr 7, 2019 at 5:28 AM Tom Lane  wrote:
>>> This test script works fine in HEAD:
>>> In v11, it suffers an assertion failure in ExecSetupPartitionTupleRouting.
>>> In v10, it doesn't crash, but we do get
>>> WARNING:  relcache reference leak: relation "parttbl" not closed
> 
>> What we did in the following commit is behind this:
>> commit 58947fbd56d1481a86a03087c81f728fdf0be866
>> Before this commit, partitioning related code in the executor could
>> always rely on the fact that ModifyTableState.resultRelInfo[] only
>> contains *leaf* partitions.  As of this commit, it may contain the
>> root partitioned table in some cases, which breaks that assumption.
> 
> Ah.  Thanks for the diagnosis and patches; pushed.

Thank you.

> I chose to patch HEAD similarly to v11, even though no bug manifests
> right now; it seems safer that way.  We should certainly have the
> test case in HEAD, now that we realize there wasn't coverage for this.

Agreed, thanks for taking care of that.

Regards,
Amit





RE: Timeout parameters

2019-04-07 Thread Nagaura, Ryohei
Hi, Michael-san.

> From: Michael Paquier [mailto:mich...@paquier.xyz]
> I have just committed the GUC and libpq portion for TCP_USER_TIMEOUT after a
> last lookup, and I have cleaned up a couple of places.  It is a bit 
> disappointing
> to see the option supported on Linux, but not on Windows, Solaris and BSD as
> far as I can see.  Still that's useful in itself for more control of the TCP
> connection.  Hopefully I forgot nobody in the review credits.
I have confirmed your commit.  Thank you.

> I am marking the CF entry as committed.  In the future, it would be better to
> not propose multiple concepts on the same thread, and if the socket_timeout
> business is resubmitted, I would suggest a completely new CF entry, and a new
> thread.
Thank you for your suggestion.
I think it would be better too.

Best regards,
-
Ryohei Nagaura







Re: ToDo: show size of partitioned table

2019-04-07 Thread Pavel Stehule
On Sun, 7 Apr 2019 at 21:13, Alvaro Herrera wrote:

> On 2019-Apr-07, Pavel Stehule wrote:
>
> > On Sun, 7 Apr 2019 at 20:27, Alvaro Herrera <alvhe...@2ndquadrant.com>
> > wrote:
>
> > > In order for this to display sanely, I added the "Parent name" column if
> > > either the "n" flag is shown or a pattern is given (previously it only
> > > appeared in the former case).
> >
> > I am thinking about it, and both the original behavior and this new
> > behavior can be expected (and unexpected too). We can go this way - I have
> > no counter-arguments, and yes, it is more consistent with some other
> > commands, where a pattern disables some other constraints.
> >
> > It should be documented - using any pattern in this case forces the 'n' flag.
>
> Added to the docs, and pushed.
>

Thank you very much


> I couldn't resist tweaking the ORDER BY clause, too.  I think listing
> all tables first, followed by all indexes, and sorting by parent in each
> category, is much easier to read.  (Maybe this can use additional
> tweaking, but it's a minor thing anyway -- for example putting together
> all indexes that correspond to some particular table?)
>
> I noticed that \d never seems to use pg_total_relation_size, so toast
> size is never shown.  I did likewise here too and used pg_table_size
> everywhere.  I'm not 100% sure this is the most convenient thing.  Maybe
> we need yet another column, and/or yet another flag ...?
>

I prefer some flag - both raw size and total size make sense, but another
column would be less readable

>
> Also, I think the new \dP should gain a new flag (maybe "l") to make it
> list leaf tables/indexes too with their local sizes, and remove those
> from the standard \d listing.
>

+1

Pavel

>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


RE: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Tsunakawa, Takayuki
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> And, as far as I can see from a quick review of the thread,
> we don't really have consensus on the names and behaviors.

The consensus on the name seems to be to use truncate rather than shrink (a few
people kindly said they like shrink, and I'm OK with either name.)  And there's no
dispute on the behavior.  Do you see any other point?


Regards
Takayuki Tsunakawa









Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Tom Lane
"Tsunakawa, Takayuki"  writes:
> From: Andres Freund [mailto:and...@anarazel.de]
>> I hope you realize feature freeze is in a few hours...

> Ouch!  Could you take care of committing the patch, please?  I couldn't
> express the sadness and tiredness if this were pushed to 13 just because
> of the parameter name...

As far as I can see, the entire intellectual content of this patch
is in the names and behaviors of the user-visible options; there's
certainly no significant amount of new logic outside of that.

And, as far as I can see from a quick review of the thread,
we don't really have consensus on the names and behaviors.

So I think forcing this in a few hours before feature freeze
is a bad idea.  That isn't going to create consensus where
there was none before; it will just annoy people.

regards, tom lane




lazy_scan_heap() forgets to mark buffer dirty when setting all frozen?

2019-04-07 Thread Andres Freund
Hi,

lazy_scan_heap() contains the following block:

    /*
     * If the all-visible page is turned out to be all-frozen but not
     * marked, we should so mark it.  Note that all_frozen is only valid
     * if all_visible is true, so we must check both.
     */
    else if (all_visible_according_to_vm && all_visible && all_frozen &&
             !VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
    {
        /*
         * We can pass InvalidTransactionId as the cutoff XID here,
         * because setting the all-frozen bit doesn't cause recovery
         * conflicts.
         */
        visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
                          vmbuffer, InvalidTransactionId,
                          VISIBILITYMAP_ALL_FROZEN);
    }

but I'm afraid that's not quite enough. As an earlier comment explains:


    /*
     * NB: If the heap page is all-visible but the VM bit is not set,
     * we don't need to dirty the heap page.  However, if checksums
     * are enabled, we do need to make sure that the heap page is
     * dirtied before passing it to visibilitymap_set(), because it
     * may be logged.  Given that this situation should only happen in
     * rare cases after a crash, it is not worth optimizing.
     */
    PageSetAllVisible(page);
    MarkBufferDirty(buf);
    visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
                      vmbuffer, visibility_cutoff_xid, flags);

don't we need to do that here too? visibilitymap_set() does:

    /*
     * If data checksums are enabled (or wal_log_hints=on), we
     * need to protect the heap page from being torn.
     */
    if (XLogHintBitIsNeeded())
    {
        Page        heapPage = BufferGetPage(heapBuf);

        /* caller is expected to set PD_ALL_VISIBLE first */
        Assert(PageIsAllVisible(heapPage));
        PageSetLSN(heapPage, recptr);
    }

i.e. it actually modifies the page when checksums/wal hint bits are
enabled, setting a different LSN. Without it being dirtied that won't
persist. Which doesn't seem good?
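
Concretely, one possible shape of a fix (just a sketch, not a polished patch)
would be to dirty the buffer before that visibilitymap_set() call, the same
way the plain all-visible branch does, so the LSN update can actually reach
disk:

    else if (all_visible_according_to_vm && all_visible && all_frozen &&
             !VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
    {
        /*
         * The page is already marked all-visible, but visibilitymap_set()
         * will update its LSN when checksums or wal_log_hints are enabled,
         * so the buffer needs to be dirtied for that change to persist.
         */
        MarkBufferDirty(buf);
        visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
                          vmbuffer, InvalidTransactionId,
                          VISIBILITYMAP_ALL_FROZEN);
    }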


visibilitymap_set()'s comment header doesn't explain this well. Nor is
 * Call visibilitymap_pin first to pin the right one. This function doesn't do
 * any I/O.
actually true, given it does XLogInsert(). I think that should've been
adjusted in 503c7305a1e3 ("Make the visibility map crash-safe.").

Greetings,

Andres Freund




RE: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread Tsunakawa, Takayuki
From: David Rowley [mailto:david.row...@2ndquadrant.com]
> It would be good to get your view on the
> shrink_bloated_locallocktable_v3.patch I worked on last night. I was
> unable to measure any overhead to solving the problem that way.

Thanks, it looks super simple and good.  I understood the idea behind your 
patch is:

* Transactions that touch many partitions and/or tables are a special event and 
not normal, and the hash table bloat is an unlucky accident.  So it's 
reasonable to revert the bloated hash table back to the original size.

* Repeated transactions that acquire many locks have to enlarge the hash table 
every time.  However, the overhead of hash table expansion should be hidden
behind various other processing (acquiring/releasing locks, reading/writing the
relations, accessing the catalogs of those relations).


TBH, I think the linked list approach feels more intuitive because the
resulting code shows what it wants to do (efficiently iterate over acquired
locks) and is based on the classic book.  But your approach seems to address
people's concerns.  So I'm OK with your patch.

I'm registering you as another author and me as a reviewer, and marking this 
ready for committer.


Regards
Takayuki Tsunakawa






Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-04-07 Thread Amit Kapila
On Mon, Apr 8, 2019 at 7:54 AM Jamison, Kirk  wrote:
>
> On Monday, April 8, 2019 9:04 AM (GMT+9), Haribabu Kommi wrote:
>
> >On Thu, Apr 4, 2019 at 3:29 PM Amit Kapila  wrote:
>
> > The patch looks good to me.  I have changed the commit message and ran
> > the pgindent in the attached patch.  Can you once see if that looks
> > fine to you?  Also, we should backpatch this till 9.6.  So, can you
> > once verify if the change is fine in all back branches?   Also, test
> > with a force_parallel_mode option.  I have already tested it with
> > force_parallel_mode = 'regress' in HEAD, please test it in back
> > branches as well.
> >
> > Thanks for the updated patch.
> > I tested in the back branches even with force_parallel_mode and it is working
> > as expected. But the patch fails to apply in the back branches, so I have
> > attached the patches for those branches. For v11 it applies with some hunk offsets.
>
> There are 3 patches for this thread:
> _v5: for PG v11 to current head
> _10: for PG10 branch
> _96: for PG9.6
>
> I have also tried applying these latest patches.
> The patch set works correctly from patch application, build to compilation.
> I also tested with force_parallel_mode, and it works as intended.
>
> So I am marking this thread as “Ready for Committer”.
>

Thanks, Hari and Jamison, for the verification.  The patches for the
back branches look good to me.  I will once again verify them before
commit. I will commit this patch tomorrow unless someone has
objections.  Robert/Alvaro, do let me know if you see any problem with
this fix?

> I hope this makes it on v12 before the feature freeze.
>

Yes,  I think fixing bugs should be fine unless we delay too much.

I see one typo in the commit message ("transactions as we that is what
we do in .." should be "transactions as that is what we do in .."); will
fix it before commit.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Back-branch bugs with fully-prunable UPDATEs

2019-04-07 Thread Etsuro Fujita

(2019/04/07 16:54), Amit Langote wrote:

On Sun, Apr 7, 2019 at 5:28 AM Tom Lane  wrote:



This seems to be related to what Amit Langote complained of in
https://www.postgresql.org/message-id/21e7eaa4-0d4d-20c2-a1f7-c7e96f4ce...@lab.ntt.co.jp
but since there's no foreign tables involved at all, either it's
a different bug or he misdiagnosed what he was seeing.


I think that one is a different bug, but maybe I haven't looked closely enough.


I started working on that last Friday (though I didn't work over the
weekend).  I agree with Amit's reasoning stated in that post, and I think
that's my fault.  Sorry for the delay.


Best regards,
Etsuro Fujita





Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 14:54, Tsunakawa, Takayuki
 wrote:
>
> From: 'Andres Freund' [mailto:and...@anarazel.de]
> > Did you see that people measured slowdowns?
>
> Yeah, 0.5% decrease with pgbench -M prepared -S (select-only), which feels 
> like a somewhat extreme test case.  And that might be within noise as was 
> mentioned.
>
> If we want to remove even the noise, we may have to think of removing the 
> LocalLockHash completely.  But it doesn't seem feasible...

It would be good to get your view on the
shrink_bloated_locallocktable_v3.patch I worked on last night. I was
unable to measure any overhead to solving the problem that way.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




RE: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread Tsunakawa, Takayuki
From: 'Andres Freund' [mailto:and...@anarazel.de]
> On 2019-04-08 02:28:12 +, Tsunakawa, Takayuki wrote:
> > I think the linked list of LOCALLOCK approach is natural, simple, and
> > good.
> 
> Did you see that people measured slowdowns?

Yeah, 0.5% decrease with pgbench -M prepared -S (select-only), which feels like 
a somewhat extreme test case.  And that might be within noise as was mentioned.

If we want to remove even the noise, we may have to think of removing the 
LocalLockHash completely.  But it doesn't seem feasible...


Regards
Takayuki Tsunakawa









Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread 'Andres Freund'
Hi,

On 2019-04-08 02:28:12 +, Tsunakawa, Takayuki wrote:
> I think the linked list of LOCALLOCK approach is natural, simple, and
> good.

Did you see that people measured slowdowns?

Greetings,

Andres Freund




RE: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread Tsunakawa, Takayuki
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> On the whole I don't think there's an adequate case for committing
> this patch.

From: Andres Freund [mailto:and...@anarazel.de]
> On 2019-04-05 23:03:11 -0400, Tom Lane wrote:
> > If I reduce the number of partitions in Amit's example from 8192
> > to something more real-world, like 128, I do still measure a
> > performance gain, but it's ~ 1.5% which is below what I'd consider
> > a reproducible win.  I'm accustomed to seeing changes up to 2%
> > in narrow benchmarks like this one, even when "nothing changes"
> > except unrelated code.
> 
> I'm not sure it's actually that narrow these days. With all the
> partitioning improvements happening, the numbers of locks commonly held
> are going to rise. And while 8192 partitions is maybe on the more
> extreme side, it's a workload with only a single table, and plenty
> workloads touch more than a single partitioned table.

I would feel happy if I could say such a many-partitions use case is narrow or 
impractical and ignore it, but it's not narrow.  Two of our customers are 
actually requesting such usage: one uses 5,500 partitions and is trying to 
migrate from a commercial database on Linux, and the other requires 200,000 
partitions to migrate from a legacy database on a mainframe.  At first, I
thought so many partitions indicated a bad application design, but the use cases
sounded valid (or at least I can't insist they're bad).  PostgreSQL is now expected to
handle such huge workloads.


From: Andres Freund [mailto:and...@anarazel.de]
> I'm not sure I'm quite that concerned. For one, a good bit of that space
> was up for grabs until the recent reordering of LOCALLOCK and nobody
> complained. But more importantly, I think commonly the amount of locks
> around is fairly constrained, isn't it? We can't really have that many
> concurrently held locks, due to the shared memory space, and the size of
> a LOCALLOCK isn't that big compared to say relcache entries.  We also
> probably fairly easily could win some space back - e.g. make nLocks 32
> bits.

+1



From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> I'd also point out that this is hardly the only place where we've
> seen hash_seq_search on nearly-empty hash tables become a bottleneck.
> So I'm not thrilled about attacking that with one-table-at-time patches.
> I'd rather see us do something to let hash_seq_search win across
> the board.
> 
> I spent some time wondering whether we could adjust the data structure
> so that all the live entries in a hashtable are linked into one chain,
> but I don't quite see how to do it without adding another list link to
> struct HASHELEMENT, which seems pretty expensive.

I think the linked list of LOCALLOCK approach is natural, simple, and good.  In 
the Jim Gray's classic book "Transaction processing: concepts and techniques", 
we can find the following sentence in "8.4.5 Lock Manager Internal Logic."  The 
sample implementation code in the book uses a similar linked list to remember 
and release a transaction's acquired locks.

"All the locks of a transaction are kept in a list so they can be quickly found 
and released at commit or rollback."

And handling this issue with the LOCALLOCK linked list is more natural than
with the hash table resize.  We just want to quickly find all grabbed locks, so
we use a linked list.  A hash table is a mechanism to find a particular item
quickly, so it was simply the wrong structure for iterating over all grabbed
locks.  Also, the hash table got big because some operation in the session
needed it, and some subsequent operations in the same session may need it
again.  So shrinking the hash table wouldn't bring much relief.
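
As a rough illustration of that idea (a sketch only, with made-up helper
names, not the actual patch), each LOCALLOCK could carry a list link so that
lock release only walks the locks actually held:

    /* sketch: names are illustrative */
    typedef struct LOCALLOCK
    {
        /* ... existing fields ... */
        dlist_node  chain;      /* links all LOCALLOCKs owned by this backend */
    } LOCALLOCK;

    static dlist_head session_locallocks = DLIST_STATIC_INIT(session_locallocks);

    static void
    remember_locallock(LOCALLOCK *locallock)
    {
        /* called when the LOCALLOCK hash entry is first created */
        dlist_push_tail(&session_locallocks, &locallock->chain);
    }

    static void
    release_all_locallocks(void)
    {
        dlist_mutable_iter iter;

        /* walk only the locks actually held, not the whole hash table */
        dlist_foreach_modify(iter, &session_locallocks)
        {
            LOCALLOCK  *locallock = dlist_container(LOCALLOCK, chain, iter.cur);

            /* ... release the underlying lock and remove the hash entry ... */
            dlist_delete(&locallock->chain);
        }
    }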


Regards
Takayuki Tsunakawa









RE: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-04-07 Thread Jamison, Kirk
On Monday, April 8, 2019 9:04 AM (GMT+9), Haribabu Kommi wrote:

>On Thu, Apr 4, 2019 at 3:29 PM Amit Kapila 
>mailto:amit.kapil...@gmail.com>> wrote:
>On Wed, Apr 3, 2019 at 10:45 AM Haribabu Kommi 
>mailto:kommi.harib...@gmail.com>> wrote:
>>
>> Thanks for the review.
>>
>> While changing the approach to use the is_parallel_worker_flag, I thought
>> that the rest of the stats are also not required to be updated and also those
>> are any way write operations and those values are zero anyway for parallel
>> workers.
>>
>> Instead of expanding the patch scope, I just changed to avoid the commit
>> or rollback stats as discussed, and later we can target the handling of all 
>> the
>> internal transactions and their corresponding stats.
>>

> The patch looks good to me.  I have changed the commit message and ran
> the pgindent in the attached patch.  Can you once see if that looks
> fine to you?  Also, we should backpatch this till 9.6.  So, can you
> once verify if the change is fine in all bank branches?   Also, test
> with a force_parallel_mode option.  I have already tested it with
> force_parallel_mode = 'regress' in HEAD, please test it in back
> branches as well.
>
>
> Thanks for the updated patch.
> I tested in the back branches even with force_parallel_mode and it is working
> as expected. But the patch fails to apply in the back branches, so I have
> attached the patches for those branches. For v11 it applies with some hunk offsets.


There are 3 patches for this thread:
_v5: for PG v11 to current head
_10: for PG10 branch
_96: for PG9.6

I have also tried applying these latest patches.
The patch set works correctly from patch application, build to compilation.
I also tested with force_parallel_mode, and it works as intended.

So I am marking this thread as “Ready for Committer”.
I hope this makes it on v12 before the feature freeze.


Regards,
Kirk Jamison


RE: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Tsunakawa, Takayuki
Hi Andres, Fujii-san, any committer,

From: Andres Freund [mailto:and...@anarazel.de]
> On 2019-04-08 09:52:27 +0900, Fujii Masao wrote:
> > I'm thinking to commit this patch at first.  We can change the term
> > and add the support of "TRUNCATE" option for VACUUM command later.
> 
> I hope you realize feature freeze is in a few hours...

Ouch!  Could you take care of committing the patch, please?  I couldn't express
the sadness and tiredness if this were pushed to 13 just because of the
parameter name...


Regards
Takayuki Tsunakawa







Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Andres Freund
Hi,

On 2019-04-08 09:52:27 +0900, Fujii Masao wrote:
> I'm thinking to commit this patch at first.  We can change the term
> and add the support of "TRUNCATE" option for VACUUM command later.

I hope you realize feature freeze is in a few hours...

Greetings,

Andres Freund




Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

2019-04-07 Thread Andres Freund
Hi,

On 2019-04-04 21:04:49 -0700, Andres Freund wrote:
> On 2019-04-04 23:57:58 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > I think the right approach would be to do all of this in heap_insert and
> > > heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> > > is specified, remember whether it is either currently empty, or is
> > > already marked as all-visible. If previously empty, mark it as all
> > > visible at the end. If already all visible, there's no need to change
> > > that. If not yet all-visible, no need to do anything, since it can't
> > > have been inserted with COPY FREEZE.  Do you see any problem doing it
> > > that way?
> > 
> > Do we want to add overhead to these hot-spot routines for this purpose?
> 
> For heap_multi_insert I can't see it being a problem - it's only used
> from copy.c, and the cost should be "smeared" over many tuples.  I'd
> assume that compared with locking a page, WAL logging, etc, it'd not
> even meaningfully show up for heap_insert. Especially because we already
> have codepaths for options & HEAP_INSERT_FROZEN in
> heap_prepare_insert(), and I'd assume those could be combined.
> 
> I think we should measure it, but I don't think that one or two
> additional, very well predicted, branches are going to be measurable
> in those routines.
> 
> The patch, as implemented, has modifications in
> RelationGetBufferForTuple(), that seems like they'd be more expensive.

Here's a *prototype* patch for this.  It only implements what I
described for heap_multi_insert, not for plain inserts. I wanted to see
what others think before investing additional time.

I don't think it's possible to see the overhead of the changed code in
heap_multi_insert(), and probably - with less confidence - that it's
also going to be ok for heap_insert(). But we gotta measure that.


This avoids an extra WAL record for setting empty pages to all visible,
by adding XLH_INSERT_ALL_VISIBLE_SET & XLH_INSERT_ALL_FROZEN_SET, and
setting those when appropriate in heap_multi_insert.  Unfortunately,
visibilitymap_set() currently doesn't really allow doing this properly,
as it has embedded WAL logging for heap.

I think we should remove the WAL logging from visibilitymap_set(), and
move it to a separate, heap specific, function. Right now different
tableams e.g. would have to reimplement visibilitymap_set(), so that's a
second need to separate that functionality.  Let me try to come up with
a proposal.


The patch currently does a vmbuffer_pin() while holding an exclusive
lwlock on the page. That's something we normally try to avoid - but I
think it's probably OK here, because INSERT_FROZEN can only be used to
insert into a new relfilenode (which thus no other session can see). I
think it's preferable to have this logic be specific to the
INSERT_FROZEN path, rather than adding nontrivial complications to
RelationGetBufferForTuple().
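
To make that concrete, the heap_multi_insert() side would look roughly like
this (a simplified sketch of the description above, not the attached
prototype):

    bool    starting_with_empty_page = (PageGetMaxOffsetNumber(page) == 0);

    if ((options & HEAP_INSERT_FROZEN) && starting_with_empty_page)
        visibilitymap_pin(relation, BufferGetBlockNumber(buffer), &vmbuffer);

    /* ... place this batch of tuples on the page, build the WAL record ... */

    if ((options & HEAP_INSERT_FROZEN) && starting_with_empty_page)
    {
        /* the page only contains the frozen tuples we just wrote */
        PageSetAllVisible(page);

        /*
         * In the real thing the VM change would piggyback on the insert's
         * WAL record via XLH_INSERT_ALL_VISIBLE_SET/ALL_FROZEN_SET rather
         * than being logged separately by visibilitymap_set().
         */
        visibilitymap_set(relation, BufferGetBlockNumber(buffer), buffer,
                          InvalidXLogRecPtr, vmbuffer, InvalidTransactionId,
                          VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
    }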

I noticed that, before this patch, we do a
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
after every filled page - that doesn't strike me as particularly smart -
it's pretty likely that the next heap page to be filled is going to be
on the same vm page as the previous iteration.


I noticed one small oddity that I think is common to all the approaches
presented in this thread so far. After

BEGIN;
TRUNCATE foo;
COPY foo(i) FROM '/tmp/foo' WITH FREEZE;
COPY foo(i) FROM '/tmp/foo' WITH FREEZE;
COPY foo(i) FROM '/tmp/foo' WITH FREEZE;
COMMIT;

we currently end up with pages like:
 blkno │    lsn    │ checksum │ flags │ lower │ upper │ special │ pagesize │ version │ prune_xid
───────┼───────────┼──────────┼───────┼───────┼───────┼─────────┼──────────┼─────────┼───────────
     0 │ 0/50B5488 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
     1 │ 0/50B6360 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
     2 │ 0/50B71B8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
     3 │ 0/50B8028 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
     4 │ 0/50B8660 │        0 │     4 │   408 │  5120 │    8192 │     8192 │       4 │         0
     5 │ 0/50B94B8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
     6 │ 0/50BA328 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
     7 │ 0/50BB180 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
     8 │ 0/50BBFD8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
     9 │ 0/50BCF88 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
    10 │ 0/50BDDE0 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0
    11 │ 0/50BEC50 │        0 │     4 │   928 │

Re: Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Masahiko Sawada
On Mon, Apr 8, 2019 at 9:52 AM Fujii Masao  wrote:
>
> On Sat, Apr 6, 2019 at 2:04 AM Robert Haas  wrote:
> >
> > On Thu, Apr 4, 2019 at 9:19 PM Masahiko Sawada  
> > wrote:
> > > As INDEX_CLEANUP option has been added by commit a96c41f, the new
> > > option for this feature could also accept zero or one boolean
> > > argument, that is SHRINK_TABLE [true|false] and true by default.
> > > Explicit options on VACUUM command overwrite options set by
> > > reloptions. And if the boolean argument is omitted the option depends
> > > on the reloptions.
> >
> > Yes, I think that's how it should work, because that's how the other
> > option works, and there's no compelling reason to be inconsistent.
> >
> > My preference is for "truncate" over "shrink".
>
> +1
>
> Attached is the updated version of the patch.
> I just replaced "shrink" with "truncate" and rebased the patch
> on the master.

Thank you for updating the patch! A "vacuum_truncate" option would be
more consistent with the vacuum_index_cleanup option than a
"vacuum_truncate_enabled" option, wouldn't it?

>  I'm thinking to commit this patch at first.
> We can change the term and add the support of "TRUNCATE" option
> for VACUUM command later.

+1

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-04-07 Thread Fujii Masao
On Sat, Apr 6, 2019 at 2:04 AM Robert Haas  wrote:
>
> On Thu, Apr 4, 2019 at 9:19 PM Masahiko Sawada  wrote:
> > As INDEX_CLEANUP option has been added by commit a96c41f, the new
> > option for this feature could also accept zero or one boolean
> > argument, that is SHRINK_TABLE [true|false] and true by default.
> > Explicit options on VACUUM command overwrite options set by
> > reloptions. And if the boolean argument is omitted the option depends
> > on the reloptions.
>
> Yes, I think that's how it should work, because that's how the other
> option works, and there's no compelling reason to be inconsistent.
>
> My preference is for "truncate" over "shrink".

+1

Attached is the updated version of the patch.
I just replaced "shrink" with "truncate" and rebased the patch
on the master. I'm thinking to commit this patch at first.
We can change the term and add the support of "TRUNCATE" option
for VACUUM command later.

Regards,

-- 
Fujii Masao


disable-vacuum-truncation_v7.patch
Description: Binary data


Cleanup/remove/update references to OID column

2019-04-07 Thread Justin Pryzby
Cleanup/remove/update references to OID column...

..in wake of 578b229718e8f.

See also
93507e67c9ca54026019ebec3026de35d30370f9
1464755fc490a9911214817fe83077a3689250ab
---
 doc/src/sgml/ddl.sgml  |  9 -
 doc/src/sgml/ref/insert.sgml   | 12 +---
 doc/src/sgml/ref/psql-ref.sgml |  3 +++
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 9e761db..db044c5 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3672,11 +3672,10 @@ VALUES ('Albany', NULL, NULL, 'NY');
   
Partitions cannot have columns that are not present in the parent.  It
is not possible to specify columns when creating partitions with
-   CREATE TABLE, nor is it possible to add columns to
-   partitions after-the-fact using ALTER TABLE.  Tables may be
-   added as a partition with ALTER TABLE ... ATTACH PARTITION
-   only if their columns exactly match the parent, including any
-   oid column.
+   CREATE TABLE, to add columns to
+   partitions after-the-fact using ALTER TABLE, nor to
+   add a partition with ALTER TABLE ... ATTACH PARTITION
+   if its columns would not exactly match those of the parent.
   
  
 
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 62e142f..3e1be4c 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -552,13 +552,11 @@ INSERT INTO table_name [ AS oid count
 
The count is the
-   number of rows inserted or updated.  If count is exactly one, and the
-   target table has OIDs, then oid is the OID
-   assigned to the inserted row.  The single row must have been
-   inserted rather than updated.  Otherwise oid is zero.
+   number of rows inserted or updated.
+   oid used to be the object ID of the inserted row
+   if rows was 1 and the target table had OIDs, but
+   OIDs system columns are not supported anymore; therefore
+   oid is always 0.
   
 
   
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 08f4bab..0e6e792 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3794,6 +3794,9 @@ bar
 command. This variable is only guaranteed to be valid until
 after the result of the next SQL command has
 been displayed.
+PostgreSQL servers since version 12 do not
+support OID system columns in user tables, and LASTOID will always be 0
+following INSERT.
 
 
   
-- 
2.1.4


Re: clean up pg_checksums.sgml

2019-04-07 Thread Justin Pryzby
On Sat, Mar 30, 2019 at 10:51:23AM +0900, Michael Paquier wrote:
> On Fri, Mar 29, 2019 at 09:32:10AM -0500, Justin Pryzby wrote:
> > PFA patch with minor improvements to documentation.
> 
> Patch does not apply, and I have reworded the last paragraph about
> failures while operating.

Sorry, the patch was on top of a brief effort I made to rename "check
checksums" to "verify checksums", before asking about the idea.

PFA patch to master.

Justin

> > Also, what do you think about changing user-facing language from
> > "check checksum" to "verify checksum" ?  I see that commit ed308d78
> > actually moved in the other direction, but I preferred "verify".
> 
> Yes, that's a debate that we had during the discussion for the new
> switches, and we have decided to use --check over --verify for the
> default option.  On the one hand, "Check checksums" is rather
> redundant, but that's more consistent with the option name.  "Verify
> checksums" is perhaps more elegant.  My opinion is that having some
> consistency between the option names and the docs is nicer.
>From f151f8200c65f4a3ae61afe444f213e190e94013 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 28 Mar 2019 19:20:52 -0500
Subject: [PATCH] Clean up pg_checksums.sgml

---
 doc/src/sgml/ref/pg_checksums.sgml | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 47d4a62..01c65fe 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -39,15 +39,16 @@ PostgreSQL documentation
pg_checksums checks, enables or disables data
checksums in a PostgreSQL cluster.  The server
must be shut down cleanly before running
-   pg_checksums. The exit status is zero if there
-   are no checksum errors when checking them, and nonzero if at least one
-   checksum failure is detected. If enabling or disabling checksums, the
+   pg_checksums. When verifying checksums, the exit
+   status is zero if there are no checksum errors, and nonzero if at least one
+   checksum failure is detected. When enabling or disabling checksums, the
exit status is nonzero if the operation failed.
   
 
   
-   While checking or enabling checksums needs to scan or write every file in
-   the cluster, disabling checksums will only update the file
+   When verifying checksums, every file in the cluster is scanned;
+   When enabling checksums, every file in the cluster is also rewritten.
+   Disabling checksums only updates the file
pg_control.
   
  
@@ -218,10 +219,9 @@ PostgreSQL documentation
   
   
If pg_checksums is aborted or killed while
-   enabling or disabling checksums, the cluster will keep the same
-   configuration for data checksums as before the operation attempted.
-   pg_checksums can be restarted to
-   attempt again the same operation.
+   enabling or disabling checksums, the cluster's checksum state will be
+   unchanged, and pg_checksums would need to be
+   rerun and start its operation from scratch.
   
  
 
-- 
2.1.4



RE: Timeout parameters

2019-04-07 Thread Tsunakawa, Takayuki
From: Michael Paquier [mailto:mich...@paquier.xyz]
> I have just committed the GUC and libpq portion for TCP_USER_TIMEOUT after
> a last lookup, and I have cleaned up a couple of places. 

Thank you for further cleanup and committing.


> For the socket_timeout stuff, its way of solving the problem it thinks is
> solves does not seem right to me, and this thread has not reached a consensus
> anyway, so I have discarded the issue.
> 
> I am marking the CF entry as committed.  In the future, it would be better
> to not propose multiple concepts on the same thread, and if the
> socket_timeout business is resubmitted, I would suggest a completely new
> CF entry, and a new thread.

Understood.  Looking back at the review process, it seems that tcp_user_timeout
and socket_timeout should have been handled in separate threads.


Regards
Takayuki Tsunakawa






Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-04-07 Thread Haribabu Kommi
On Thu, Apr 4, 2019 at 3:29 PM Amit Kapila  wrote:

> On Wed, Apr 3, 2019 at 10:45 AM Haribabu Kommi 
> wrote:
> >
> > Thanks for the review.
> >
> > While changing the approach to use the is_parallel_worker_flag, I thought
> > that the rest of the stats are also not required to be updated and also
> those
> > are any way write operations and those values are zero anyway for
> parallel
> > workers.
> >
> > Instead of expanding the patch scope, I just changed to avoid the commit
> > or rollback stats as discussed, and later we can target the handling of
> all the
> > internal transactions and their corresponding stats.
> >
>
> The patch looks good to me.  I have changed the commit message and ran
> the pgindent in the attached patch.  Can you once see if that looks
> fine to you?  Also, we should backpatch this till 9.6.  So, can you
> once verify if the change is fine in all back branches?   Also, test
> with a force_parallel_mode option.  I have already tested it with
> force_parallel_mode = 'regress' in HEAD, please test it in back
> branches as well.
>


Thanks for the updated patch.
I tested in the back branches even with force_parallel_mode and it is working
as expected. But the patch fails to apply in the back branches, so I have
attached the patches for those branches. For v11 it applies with some hunk offsets.

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Avoid-counting-transaction-stats-for-parallel-worker_10.patch
Description: Binary data


0001-Avoid-counting-transaction-stats-for-parallel-worker_96.patch
Description: Binary data


Re: Emacs vs pg_indent's weird indentation for function declarations

2019-04-07 Thread Thomas Munro
On Mon, Jan 28, 2019 at 9:48 PM Thomas Munro
 wrote:
> On Mon, Jan 28, 2019 at 8:08 PM Michael Paquier  wrote:
> > If you could get pgindent smarter in this area, it would be really
> > nice..
>
> Ah, it's not indent doing it, it's pgindent's post_indent subroutine
> trying to correct the effects of the (implied) -psl option, but not
> doing a complete job of it (it should adjust the indentation lines of
> later lines if it changes the first line).
>
> One idea I had was to tell indent not to do that by using -npsl when
> processing headers, like in the attached.  That fixes all the headers
> I looked at, though of course it doesn't fix the static function
> declarations that appear in .c files, so it's not quite the right
> answer.

I tried teaching pgindent's post_indent subroutine to unmangle the
multi-line declarations it mangles.  That produces correct
indentation!  But can also produce lines that exceed the column limit
we would normally wrap at (of course, because pg_bsd_indent had less
whitespace on the left when it made wrapping decisions).  Doh.
Attached for posterity, but it's useless.

So I think pg_bsd_indent itself needs to be fixed.  I think I know
where the problem is.  lexi.c isn't looking far ahead enough to
recognise multi-line function declarations:

    if (*buf_ptr == '(' && state->tos <= 1 && state->ind_level == 0 &&
        state->in_parameter_declaration == 0 && state->block_init == 0) {
        char *tp = buf_ptr;
        while (tp < buf_end)
            if (*tp++ == ')' && (*tp == ';' || *tp == ','))
                goto not_proc;
        strncpy(state->procname, token, sizeof state->procname - 1);
        if (state->in_decl)
            state->in_parameter_declaration = 1;
        return (funcname);
not_proc:;
    }

That loop that looks for ')' followed by ';' is what causes the lexer
to conclude that the "foo" is an "ident" rather than a "funcname",
given the following input:

  extern void foo(int i);

But because buf_end ends at the newline, it can't see the
semi-colon and concludes that "foo" is a "funcname" here:

  extern void foo(int i,
  int j);

That determination causes indent.c's procnames_start_line (-psl) mode
to put "extern void" on its own line here and stop thinking of it as a
declaration:

    case funcname:
    case ident:                 /* got an identifier or constant */
        if (ps.in_decl) {
            if (type_code == funcname) {
                ps.in_decl = false;
                if (procnames_start_line && s_code != e_code) {
                    *e_code = '\0';
                    dump_line();
                }
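
To make the effect concrete, here is a made-up before/after (not taken from
any real header).  Given this input:

    extern void foo(int i,
                    int j);

the misclassification plus -psl computes the continuation indentation as if
"foo(" started its own line; post_indent later re-joins "extern void foo("
but leaves the second line alone, so you end up with something like:

    extern void foo(int i,
        int j);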

I guess it'd need something smarter than fill_buffer() to see far
enough, but simply loading N lines at once wouldn't be enough because
you could still happen to be looking at the final line in the buffer;
you'd probably need a sliding window.  I'm not planning on trying to
fix that myself in the short term, but since it annoys me every time I
commit anything, I couldn't resist figuring out where it's coming from
at least...

-- 
Thomas Munro
https://enterprisedb.com


0001-Fix-pgindent-s-postprocessing-of-multi-line-prototyp.patch
Description: Binary data


Re: ToDo: show size of partitioned table

2019-04-07 Thread Justin Pryzby
PFA patch intended to have been previously attached...

Hmm..maybe it should say

|RELATION'S partitions is also displayed, along with the RELATION'S
|description.   
  
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 3312570..8d50aef 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1682,8 +1682,8 @@ testdb=
 
 
 If + is appended to the command, the sum of sizes of
-table's partitions (including that of their indexes) is also displayed,
-along with the associated description.
+table's partitions is also displayed, along with the associated
+description.
 If n is combined with +, two
 sizes are shown: one including the total size of directly-attached
 leaf partitions, and another showing the total size of all partitions,
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 22fb3fb..1ca30fb 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3860,7 +3860,7 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 
 	if (showNested || pattern)
 		appendPQExpBuffer(&buf,
-		  ",\n  c3.oid::regclass as \"%s\"",
+		  ",\n  inh.inhparent::regclass as \"%s\"",
 		  gettext_noop("Parent name"));
 
 	if (showIndexes)
@@ -3901,8 +3901,7 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 
 	if (showNested || pattern)
 		appendPQExpBufferStr(&buf,
-			 "\n LEFT JOIN pg_catalog.pg_inherits inh ON c.oid = inh.inhrelid"
-			 "\n LEFT JOIN pg_catalog.pg_class c3 ON c3.oid = inh.inhparent");
+			 "\n LEFT JOIN pg_catalog.pg_inherits inh ON c.oid = inh.inhrelid");
 
 	if (verbose)
 	{


Re: ToDo: show size of partitioned table

2019-04-07 Thread Justin Pryzby
On Sun, Apr 07, 2019 at 03:13:36PM -0400, Alvaro Herrera wrote:
> On 2019-Apr-07, Pavel Stehule wrote:
> 
> > ne 7. 4. 2019 v 20:27 odesílatel Alvaro Herrera 
> > napsal:
> 
> > > In order for this to display sanely, I added the "Parent name" column if
> > > either the "n" flag is shown or a pattern is given (previously it only
> > > appeared in the former case).
> > 
> > I am thinking about it and original behave and this new behave should be
> > expected (and unexpected too). We can go this way - I have not
> > counter-arguments, but yes, it is more consistent with some other commands,
> > pattern disables some other constraints.
> > 
> > It should be documented - Using any pattern in this case forces 'n' flag.
> 
> Added to the docs, and pushed.

Thanks for helping it across the finish line.

I see you changed from relname to oid::regclass, which is good, thanks.

Then, it's unnecessary to join against pg_class again (and I don't know why
it was a left join) ?

Also docs are wrong re: indices.

Justin




What is the correct behaviour for a wCTE UPDATE followed by a DELETE?

2019-04-07 Thread Andres Freund
Hi,

While fixing the report at https://postgr.es/m/19321.1554567...@sss.pgh.pa.us
I noticed that our behaviour for deleting (or updating albeit less
drastically) a row previously modified in the same query isn't
particularly useful:

DROP TABLE IF EXISTS blarg;
CREATE TABLE blarg(data text, count int);
INSERT INTO blarg VALUES('row', '1');
WITH upd AS (UPDATE blarg SET count = count + 1 RETURNING *)
DELETE FROM blarg USING upd RETURNING *;
SELECT * FROM blarg;
┌──┬───┐
│ data │ count │
├──┼───┤
│ row  │ 2 │
└──┴───┘
(1 row)

I.e. the delete is plainly ignored. That's because it falls under:

/*
 * The target tuple was already updated or deleted by the current command,
 * or by a later command in the current transaction.  The former case is
 * possible in a join DELETE where multiple tuples join to the same target
 * tuple. This is somewhat questionable, but Postgres has always allowed
 * it: we just ignore additional deletion attempts.
 *
 * The latter case arises if the tuple is modified by a command in a BEFORE
 * trigger, or perhaps by a command in a volatile function used in the
 * query.  In such situations we should not ignore the deletion, but it is
 * equally unsafe to proceed.  We don't want to discard the original DELETE
 * while keeping the triggered actions based on its deletion; and it would
 * be no better to allow the original DELETE while discarding updates that
 * it triggered.  The row update carries some information that might be
 * important according to business rules; so throwing an error is the only
 * safe course.
 *
 * If a trigger actually intends this type of interaction, it can
 * re-execute the DELETE and then return NULL to cancel the outer delete.
 */
if (tmfd.cmax != estate->es_output_cid)
    ereport(ERROR,
            (errcode(ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION),
             errmsg("tuple to be deleted was already modified by an operation triggered by the current command"),
             errhint("Consider using an AFTER trigger instead of a BEFORE trigger to propagate changes to other rows.")));

/* Else, already deleted by self; nothing to do */


I'm not sure what the right behaviour is. But it feels to me like the
current behaviour wasn't particularly intentional, it's just what
happened. And certainly the "already deleted by self" comment doesn't
indicate understanding that it could just as well be an update. Nor does
the comment above it refer to the possibility that the update might have
been from a [different] wCTE in a different ModifyTable node, rather
than just a redundant update/delete by the same node.

Nor do I feel is there proper tests attesting to what the behaviour
should be.

Marko, Hitoshi, Tom, was there some intended behaviour in
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=389af951552ff2209eae3e62fa147fef12329d4f
?

Kevin, did you know that that could happen when writing
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6868ed7491b7ea7f0af6133bb66566a2f5fe5a75
?

Anyone, do you have a concrete and doable proposal of how we should
actually handle this?

Greetings,

Andres Freund




Re: [GSoC 2019] Proposal: Develop Performance Farm Database and Website

2019-04-07 Thread Ila B.
Thank you so much for your answer; it provided a clearer understanding of the
task and was really useful for completing my proposal, which I've now submitted.
I really hope to keep on working with Postgres.

Best of luck to all GSoC students :)

Ilaria

> On 4 Apr 2019, at 18:59, Mark Wong  wrote:
> 
> Hi Ilaria,
> 
> Edited for bottom posting. :)
> 
> On Fri, Mar 29, 2019 at 03:01:05PM +0100, Ilaria wrote:
>>> Am 29.03.2019 um 13:52 schrieb Peter Eisentraut 
>>> :
>>> 
 On 2019-03-29 13:04, Robert Haas wrote:
> On Tue, Mar 26, 2019 at 9:10 AM Ila B.  wrote:
> I am Ilaria Battiston, an aspiring GSoC student, and I would love to have 
> a feedback on the first draft of my Google Summer of Code proposal. The 
> project is "Develop Performance Farm Database and Website”. You can find 
> any other detail in the attached PDF file :)
 
 I think there's probably a very large amount of work to be done in
 this area.  Nobody is going to finish it in a summer.  Still, there's
 probably some useful things you could get done in a summer.  I think a
 lot will depend on finding a good mentor who is familiar with these
 areas (which I am not).  Has anyone expressed an interest?
>>> 
>>> Moreover, I have a feeling that have been hearing about work on a
>>> performance farm for many years.  Perhaps it should be investigated what
>>> became of that work and what the problems were getting it to a working
>>> state.
> 
>> Hello,
>> 
>> Thanks for the answer. This project is on the official PostgreSQL project 
>> list of GSoC 2019, and potential mentors are stated there. 
>> 
>> I trust mentors’ judgement on outlining the work and the tasks to be done in 
>> three months, and there is the previous student’s work to use as example if 
>> needed. The project consists in building a database and a website on top of 
>> it for users to browse performance data. 
>> 
>> Let me know whether there are any specific issues you’re concerned about. 
> 
> Hongyuan, our student last summer, put together a summary of his
> progress in a GitHub issue:
> 
> https://github.com/PGPerfFarm/pgperffarm/issues/22
> 
> 
> We have systems for proofing (from OSUOSL) and you can also see the
> prototype here:
> 
> http://140.211.168.111/
> 
> 
> For Phase 1, I'd recommend getting familiar with the database schema in
> place now.  Perhaps it can use some tweaking, but I just mean to suggest
> that it might not be necessary to rebuild it from scratch.
> 
> In Phase 2, we had some difficulty last year about getting the
> authentication/authorization completely integrated.  I think the main
> issue was how to integrate this app while using resources outside of the
> community infrastructure.  We may have to continue working around that.
> 
> Otherwise, I think the rest make sense.  Let us know if you have any
> more questions.
> 
> Regards,
> Mark
> 
> -- 
> Mark Wong
> 2ndQuadrant - PostgreSQL Solutions for the Enterprise
> https://www.2ndQuadrant.com/





Re: ToDo: show size of partitioned table

2019-04-07 Thread Alvaro Herrera
On 2019-Apr-07, Pavel Stehule wrote:

> ne 7. 4. 2019 v 20:27 odesílatel Alvaro Herrera 
> napsal:

> > In order for this to display sanely, I added the "Parent name" column if
> > either the "n" flag is shown or a pattern is given (previously it only
> > appeared in the former case).
> 
> I am thinking about it and original behave and this new behave should be
> expected (and unexpected too). We can go this way - I have not
> counter-arguments, but yes, it is more consistent with some other commands,
> pattern disables some other constraints.
> 
> It should be documented - Using any pattern in this case forces 'n' flag.

Added to the docs, and pushed.

I couldn't resist tweaking the ORDER BY clause, too.  I think listing
all tables first, followed by all indexes, and sorting by parent in each
category, is much easier to read.  (Maybe this can use additional
tweaking, but it's a minor thing anyway -- for example putting together
all indexes that correspond to some particular table?)

I noticed that \d never seems to use pg_total_relation_size, so toast
size is never shown.  I did likewise here too and used pg_table_size
everywhere.  I'm not 100% sure this is the most convenient thing.  Maybe
we need yet another column, and/or yet another flag ...?

Also, I think the new \dP should gain a new flag (maybe "l") to make it
list leaf tables/indexes too with their local sizes, and remove those
from the standard \d listing.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: ToDo: show size of partitioned table

2019-04-07 Thread Pavel Stehule
ne 7. 4. 2019 v 20:27 odesílatel Alvaro Herrera 
napsal:

> On 2019-Apr-07, Pavel Stehule wrote:
>
> > ne 7. 4. 2019 v 19:16 odesílatel Alvaro Herrera <
> alvhe...@2ndquadrant.com>
> > napsal:
> >
> > > Here.
> >
> > + 1
>
> BTW I'm not sure if you noticed, but I removed the error message "no
> partitioned relations found" when the result was empty.  This was
> mimicking \d behavior (which was justified on historical grounds), and
> \drds behavior (which was justified on pattern ordering grounds); but I
> see nothing that justifies it for \dP, so let's make it behave like all
> the other commands and display an empty table.
>
> And there's an additional change to make.  In the regression database,
> this returns an empty table:
>
> regression=# \dPi regress_indexing.pk5_pkey
>  List of partitioned indexes
>  Esquema | Nombre | Dueño | On table
> -++---+--
> (0 filas)
>
> but the index does exist, and it is a partitioned one.  So it should be
> displayed.  In fact, if I add the "n" flag, it shows:
>
> regression=# \dPin regress_indexing.pk5_pkey
>List of partitioned indexes
>  Esquema  |  Nombre  |  Dueño   |   Parent name|
>  On table
>
> --+--+--+--+--
>  regress_indexing | pk5_pkey | alvherre | regress_indexing.pk_pkey |
> regress_indexing.pk5
> (1 fila)
>
> I think the fact that \dPi doesn't show it is broken, and this fixes it:
>
> @@ -3946,7 +3946,8 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
> 	appendPQExpBufferStr(&buf, "''");	/* dummy */
> 	appendPQExpBufferStr(&buf, ")\n");
>
> -	appendPQExpBufferStr(&buf, !showNested ? " AND NOT c.relispartition\n" : "\n");
> +	appendPQExpBufferStr(&buf, !showNested && !pattern ?
> +						 " AND NOT c.relispartition\n" : "");
>
> 	if (!pattern)
> 		appendPQExpBufferStr(&buf, "  AND n.nspname <> 'pg_catalog'\n"
>
> In order for this to display sanely, I added the "Parent name" column if
> either the "n" flag is shown or a pattern is given (previously it only
> appeared in the former case).
>

I am thinking about it and original behave and this new behave should be
expected (and unexpected too). We can go this way - I have not
counter-arguments, but yes, it is more consistent with some other commands,
pattern disables some other constraints.

It should be documented - Using any pattern in this case forces 'n' flag.



> Note that this changes the expected output in the test; now every test
> that uses a pattern displays *all* partitioned relations that match the
> pattern, not just top-level ones.  I'm about +0.2 convinced that this is
> desirable, but my first example above tilts the scales to changing it as
> described.
>
> I noticed this while testing after messing with the tab completion as
> suggested by Justin: we already had Query_for_list_of_partitioned_tables
> (which you could have used), but since \dP works for both indexes and
> tables, I instead modified your new
> Query_for_list_of_partitioned_relations to list both relation kinds.
>
>
has sense.



> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: ToDo: show size of partitioned table

2019-04-07 Thread Alvaro Herrera
On 2019-Apr-07, Pavel Stehule wrote:

> ne 7. 4. 2019 v 19:16 odesílatel Alvaro Herrera 
> napsal:
> 
> > Here.
> 
> + 1

BTW I'm not sure if you noticed, but I removed the error message "no
partitioned relations found" when the result was empty.  This was
mimicking \d behavior (which was justified on historical grounds), and
\drds behavior (which was justified on pattern ordering grounds); but I
see nothing that justifies it for \dP, so let's make it behave like all
the other commands and display an empty table.

And there's an additional change to make.  In the regression database,
this returns an empty table:

regression=# \dPi regress_indexing.pk5_pkey 
 List of partitioned indexes
 Esquema | Nombre | Dueño | On table 
-++---+--
(0 filas)

but the index does exist, and it is a partitioned one.  So it should be
displayed.  In fact, if I add the "n" flag, it shows:

regression=# \dPin regress_indexing.pk5_pkey 
   List of partitioned indexes
 Esquema  |  Nombre  |  Dueño   |   Parent name|   On 
table   
--+--+--+--+--
 regress_indexing | pk5_pkey | alvherre | regress_indexing.pk_pkey | 
regress_indexing.pk5
(1 fila)

I think the fact that \dPi doesn't show it is broken, and this fixes it:

@@ -3946,7 +3946,8 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 	appendPQExpBufferStr(&buf, "''");	/* dummy */
 	appendPQExpBufferStr(&buf, ")\n");
 
-	appendPQExpBufferStr(&buf, !showNested ? " AND NOT c.relispartition\n" : "\n");
+	appendPQExpBufferStr(&buf, !showNested && !pattern ?
+						 " AND NOT c.relispartition\n" : "");
 
 	if (!pattern)
 		appendPQExpBufferStr(&buf, "  AND n.nspname <> 'pg_catalog'\n"

In order for this to display sanely, I added the "Parent name" column if
either the "n" flag is shown or a pattern is given (previously it only
appeared in the former case).

Note that this changes the expected output in the test; now every test
that uses a pattern displays *all* partitioned relations that match the
pattern, not just top-level ones.  I'm about +0.2 convinced that this is
desirable, but my first example above tilts the scales to changing it as
described.

I noticed this while testing after messing with the tab completion as
suggested by Justin: we already had Query_for_list_of_partitioned_tables
(which you could have used), but since \dP works for both indexes and
tables, I instead modified your new
Query_for_list_of_partitioned_relations to list both relation kinds.
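
Roughly, the selcondition becomes (sketch from memory, not the committed
text):

    .selcondition = "c.relkind IN (" CppAsString2(RELKIND_PARTITIONED_TABLE) ", "
                    CppAsString2(RELKIND_PARTITIONED_INDEX) ")",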

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: change password_encryption default to scram-sha-256?

2019-04-07 Thread David Fetter
On Sun, Apr 07, 2019 at 12:59:05PM -0400, Tom Lane wrote:
> Peter Eisentraut  writes:
> > Should we change the default of the password_encryption setting to
> > 'scram-sha-256' in PG12?
> 
> I thought we were going to wait a bit longer --- that just got added
> last year, no?  What do we know about the state of support in client
> libraries?

Great idea!  Does it make sense to test all, or at least some
significant fraction of the connectors listed in
https://wiki.postgresql.org/wiki/Client_Libraries by default?
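
For context, a minimal illustration (not from this thread) of what the
setting controls for newly set passwords:

    SET password_encryption = 'scram-sha-256';
    CREATE ROLE app_user LOGIN PASSWORD 'secret';
    -- the password is now stored as a SCRAM-SHA-256 verifier, so the client
    -- library must support SASL/SCRAM to authenticate, which is why
    -- client-library support matters here.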

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: Contribution to Perldoc for TestLib module in Postgres

2019-04-07 Thread Ramanarayana
Hi,
Please find the updated patch. Added to the commitfest as well
Regards,
Ram.


v2_perldoc_testlib.patch.patch
Description: Binary data


Re: ToDo: show size of partitioned table

2019-04-07 Thread Pavel Stehule
ne 7. 4. 2019 v 19:16 odesílatel Alvaro Herrera 
napsal:

> Here.
>

+ 1

Pavel

>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: ToDo: show size of partitioned table

2019-04-07 Thread Alvaro Herrera
Here.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
commit 39766894a30d2cbfea7904cb04e9bb0ed398b76c
Author: Alvaro Herrera 
AuthorDate: Sun Apr 7 07:59:12 2019 -0400
CommitDate: Sun Apr 7 12:51:13 2019 -0400

psql \dP: list partitioned tables and indexes

The new command lists partitioned relations (tables and/or indexes),
possibly with their sizes, possibly including partitioned partitions;
their parents (if not top-level); if indexes show the tables they belong
to; and their descriptions.

While there are various possible improvements to this, having it in this
form is already a great improvement than not having any way to obtain
this report.

Author: Pavel Stěhule, with help from Mathias Brossard, Amit Langote and
	Justin Pryzby.
Reviewed-by: Amit Langote, Mathias Brossard, Melanie Plageman,
	Michaël Paquier

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 2bc8bbc2a74..3587abce8b2 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1659,6 +1659,36 @@ testdb=
 
   
 
+
+  
+\dP[itn+] [ pattern ]
+
+
+Lists partitioned relations.  If pattern
+is specified, only entries whose name matches the pattern are listed.
+The modifiers t (tables) and i
+(indexes) can be used, filtering the kind of relations to list.  By
+default, partitioned tables and indexes are listed.
+
+
+
+If the modifier n (nested) is used,
+then non-root partitioned tables are included, and a column is shown
+displaying the parent of each partitioned relation.
+
+
+
+If + is appended to the command name, the sum of
+sizes of table's partitions (including that of their indexes) is also
+displayed, along with the associated description.
+If n is combined with +, two
+sizes are shown: one including the total size of directly-attached
+leaf partitions, and another showing the total size of all partitions,
+including indirectly attached sub-partitions.
+
+
+  
+
   
 \drds [ role-pattern [ database-pattern ] ]
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index bc3d10e5158..8254d610999 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -784,6 +784,23 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			case 'p':
 success = permissionsList(pattern);
 break;
+			case 'P':
+{
+	switch (cmd[2])
+	{
+		case '\0':
+		case '+':
+		case 't':
+		case 'i':
+		case 'n':
+			success = listPartitionedTables(&cmd[2], pattern, show_verbose);
+			break;
+		default:
+			status = PSQL_CMD_UNKNOWN;
+			break;
+	}
+}
+break;
 			case 'T':
 success = describeTypes(pattern, show_verbose, show_system);
 break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index f7f7285acca..54f5d3fb663 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3777,6 +3777,217 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 	return true;
 }
 
+/*
+ * \dP
+ * Takes an optional regexp to select particular relations
+ *
+ * As with \d, you can specify the kinds of relations you want:
+ *
+ * t for tables
+ * i for indexes
+ *
+ * And there's additional flags:
+ *
+ * n to list non-leaf partitioned tables
+ *
+ * and you can mix and match these in any order.
+ */
+bool
+listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
+{
+	bool		showTables = strchr(reltypes, 't') != NULL;
+	bool		showIndexes = strchr(reltypes, 'i') != NULL;
+	bool		showNested = strchr(reltypes, 'n') != NULL;
+	PQExpBufferData buf;
+	PQExpBufferData title;
+	PGresult   *res;
+	printQueryOpt myopt = pset.popt;
+	bool translate_columns[] = {false, false, false, false, false, false, false, false, false};
+	const char *tabletitle;
+	bool		mixed_output = false;
+
+	/*
+	 * Note: Declarative table partitioning is only supported as of Pg 10.0.
+	 */
+	if (pset.sversion < 10)
+	{
+		char		sverbuf[32];
+
+		pg_log_error("The server (version %s) does not support declarative table partitioning.",
+	 formatPGVersionNumber(pset.sversion, false,
+		   sverbuf, sizeof(sverbuf)));
+		return true;
+	}
+
+	/* If no relation kind was selected, show them all */
+	if (!showTables && !showIndexes)
+		showTables = showIndexes = true;
+
+	if (showIndexes && !showTables)
+		tabletitle = _("List of partitioned indexes");	/* \dPi */
+	else if (showTables && !showIndexes)
+		tabletitle = _("List of partitioned tables");	/* \dPt */
+	else
+	{
+		/* show all kinds */
+		tabletitle = _("List of partitioned relations");
+		mixed_output = true;
+	}
+
+	

Re: change password_encryption default to scram-sha-256?

2019-04-07 Thread Tom Lane
Peter Eisentraut  writes:
> Should we change the default of the password_encryption setting to
> 'scram-sha-256' in PG12?

I thought we were going to wait a bit longer --- that just got added
last year, no?  What do we know about the state of support in client
libraries?

regards, tom lane




Re: Back-branch bugs with fully-prunable UPDATEs

2019-04-07 Thread Tom Lane
Amit Langote  writes:
> On Sun, Apr 7, 2019 at 5:28 AM Tom Lane  wrote:
>> This test script works fine in HEAD:
>> In v11, it suffers an assertion failure in ExecSetupPartitionTupleRouting.
>> In v10, it doesn't crash, but we do get
>> WARNING:  relcache reference leak: relation "parttbl" not closed

> What we did in the following commit is behind this:
> commit 58947fbd56d1481a86a03087c81f728fdf0be866
> Before this commit, partitioning related code in the executor could
> always rely on the fact that ModifyTableState.resultRelInfo[] only
> contains *leaf* partitions.  As of this commit, it may contain the
> root partitioned table in some cases, which breaks that assumption.

Ah.  Thanks for the diagnosis and patches; pushed.

I chose to patch HEAD similarly to v11, even though no bug manifests
right now; it seems safer that way.  We should certainly have the
test case in HEAD, now that we realize there wasn't coverage for this.

regards, tom lane




Trailing whitespaces in various documentations

2019-04-07 Thread Julien Rouhaud
Hi,

While working on an unrelated documentation change, I noticed some
trailing whitespace in various documentation files.  PFA a simple
patch to get rid of it (I didn't remove the instances corresponding to
psql output), if that helps.
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 45a3cf3def..ffed887ab7 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -671,7 +671,7 @@ hostnogssenc database  usererror fields indicate problems in the
corresponding lines of the file.
   
- 
+
   

 To connect to a particular database, a user must not only pass the
@@ -747,9 +747,9 @@ hostall all .example.comscram-sha-256
 # reject all connections from 192.168.54.1 (since that entry will be
 # matched first), but allow GSSAPI-encrypted connections from anywhere else
 # on the Internet.  The zero mask causes no bits of the host IP address to
-# be considered, so it matches any host.  Unencrypted GSSAPI connections 
+# be considered, so it matches any host.  Unencrypted GSSAPI connections
 # (which "fall through" to the third line since "hostgssenc" only matches
-# encrypted GSSAPI connections) are allowed, but only from 192.168.12.10.  
+# encrypted GSSAPI connections) are allowed, but only from 192.168.12.10.
 #
 # TYPE  DATABASEUSERADDRESS METHOD
 hostall all 192.168.54.1/32 reject
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d2da1abe61..75f9471ff6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -953,7 +953,7 @@ include_dir 'conf.d'
 This parameter is supported only on systems that support
 TCP_USER_TIMEOUT; on other systems, it must be zero.
 In sessions connected via a Unix-domain socket, this parameter is
-ignored and always reads as zero. 
+ignored and always reads as zero.


 
diff --git a/doc/src/sgml/ecpg.sgml b/doc/src/sgml/ecpg.sgml
index d5ee6a58e7..1bd7cf4ebd 100644
--- a/doc/src/sgml/ecpg.sgml
+++ b/doc/src/sgml/ecpg.sgml
@@ -312,7 +312,7 @@ current=testdb1 (should be testdb1)
   
 
   
-   The third option is to declare a sql identifier linked to 
+   The third option is to declare a sql identifier linked to
the connection, for example:
 
 EXEC SQL AT connection-name DECLARE statement-name STATEMENT;
@@ -558,7 +558,7 @@ EXEC SQL COMMIT;

   
  
- 
+
  
   EXEC SQL SET AUTOCOMMIT TO ON
   
@@ -6843,7 +6843,7 @@ EXEC SQL [ AT connection_name ] DEC
 A database connection name established by the CONNECT command.


-If AT clause is omitted, an SQL statement identifier is associated with the DEFAULT connection.   
+If AT clause is omitted, an SQL statement identifier is associated with the DEFAULT connection.

   
  
@@ -6862,10 +6862,10 @@ EXEC SQL [ AT connection_name ] DEC

 

-Notes
+Notes
 
  AT clause can be used at other dynamic SQL statements. The following table
- gives the connected database when AT clause is used at DECLARE STATEMENT 
+ gives the connected database when AT clause is used at DECLARE STATEMENT
  and other dynamic statements.
 
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5c3724ab9e..6211ff3927 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17397,7 +17397,7 @@ SET search_path TO schema , sc
 because it needs access to the predicate lock manager's shared
 state for a short time.

-   
+

 version

@@ -21225,7 +21225,7 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
   Name Return Type Description
  
 
-  
+ 
   

 pg_partition_tree
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 2b4dcd03c8..543691dad4 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1518,7 +1518,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
 processing would request a file from the WAL archive, reporting failure
 if the file was unavailable.  For standby processing it is normal for
 the next WAL file to be unavailable, so the standby must wait for
-it to appear. For files ending in 
+it to appear. For files ending in
 .history there is no need to wait, and a non-zero return
 code must be returned. A waiting restore_command can be
 written as a custom script that loops after polling for the existence of
diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 3e0e92a785..8c5df6f0bb 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -625,7 +625,7 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
  
 
  
-  jsonpath Type 
+  jsonpath Type
 
   
jsonpath
diff --git 

Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 04:09, Tom Lane  wrote:
> Um ... I don't see where you're destroying the old hash?

In CreateLocalLockHash.

> Also, I entirely dislike wiring in assumptions about hash_seq_search's
> private state structure here.  I think it's worth having an explicit
> entry point in dynahash.c to get the current number of buckets.

Okay. Added hash_get_max_bucket()
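
For anyone not reading the attached patch, the new entry point is just a thin
accessor over dynahash's existing bookkeeping; a rough sketch (the attached
version may differ):

    uint32
    hash_get_max_bucket(HTAB *hashp)
    {
        return hashp->hctl->max_bucket;
    }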

> Also, I would not define "significantly bloated" as "the table has
> grown at all".  I think the threshold ought to be at least ~100
> buckets, if we're starting at 16.

I wouldn't either. I don't think the comment says that. It says there
can be slowdowns when it's significantly bloated, and then goes on to
say that we just resize when it's bigger than standard.

> Probably we ought to try to gather some evidence to inform the
> choice of cutoff here.  Maybe instrument the regression tests to
> see how big the table typically gets?

In partition_prune.sql I see use of a bucket as high as 285 on my machine with:

drop table lp, coll_pruning, rlp, mc3p, mc2p, boolpart, rp,
coll_pruning_multi, like_op_noprune, lparted_by_int2, rparted_by_int2;

I've not added any sort of cut-off though as I benchmarked it and
surprisingly I don't see any slowdown with the worst case. So I'm
thinking there might not be any point.

alter system set plan_cache_mode = 'force_generic_plan';
create table hp (a int primary key) partition by hash (a);
select 'create table hp' || x::text || ' partition of hp for values
with (modulus 32, remainder ' || (x)::text || ');' from
generate_series(0,31) x;
\gexec

select.sql
\set p 1
select * from hp where a = :p

Master
$ pgbench -n -M prepared -f select.sql -T 60 postgres
tps = 11834.764309 (excluding connections establishing)
tps = 12279.212223 (excluding connections establishing)
tps = 12007.263547 (excluding connections establishing)

Patched:
$ pgbench -n -M prepared -f select.sql -T 60 postgres
tps = 13380.684817 (excluding connections establishing)
tps = 12790.999279 (excluding connections establishing)
tps = 12568.892558 (excluding connections establishing)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


shrink_bloated_locallocktable_v3.patch
Description: Binary data


Re: Checksum errors in pg_stat_database

2019-04-07 Thread Julien Rouhaud
Thanks for looking it it!

On Sun, Apr 7, 2019 at 4:36 PM Magnus Hagander  wrote:
>
> I'm not sure I like the idea of using "" as the database 
> name. It's not very likely that somebody would be using that as a name for 
> their database, but it's not impossible. But it also just looks strange. 
> Wouldn't NULL be a more appropriate choice?
>
> Likewise, shouldn't we return NULL as the number of backends for the shared 
> counters, rather than 0?
I wanted to make things more POLA-compliant, but maybe it was a bad
idea.  I changed it for NULL here and for numbackends.
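
With that, the shared-objects row can be fetched with something like this
(usage sketch, assuming the column names from the patch below):

    SELECT datid, datname, numbackends,
           checksum_failures, checksum_last_failure
      FROM pg_stat_database
     WHERE datname IS NULL;

where datname and numbackends both come back as NULL for that row.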

> Micro-nit:
> + Time at which the last data page checksum failures was detected 
> in
> s/failures/failure/

Oops.

v5 attached.
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4eb43f2de9..6bad265413 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2498,20 +2498,22 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 
  datid
  oid
- OID of a database
+ OID of this database, or 0 for objects belonging to a shared
+ relation
 
 
  datname
  name
- Name of this database
+ Name of this database, or NULL for the shared
+ objects.
 
 
  numbackends
  integer
- Number of backends currently connected to this database.
- This is the only column in this view that returns a value reflecting
- current state; all other columns return the accumulated values since
- the last reset.
+ Number of backends currently connected to this database, or
+ NULL for the shared objects.  This is the only column
+ in this view that returns a value reflecting current state; all other
+ columns return the accumulated values since the last reset.
 
 
  xact_commit
@@ -2597,7 +2599,14 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 
  checksum_failures
  bigint
- Number of data page checksum failures detected in this database
+ Number of data page checksum failures detected in this
+ database
+
+
+ checksum_last_failure
+ timestamp with time zone
+ Time at which the last data page checksum failure was detected in
+ this database, or on a shared object.
 
 
  blk_read_time
@@ -2622,7 +2631,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 
   
The pg_stat_database view will contain one row
-   for each database in the cluster, showing database-wide statistics.
+   for each database in the cluster, plus one for the shared objects, showing
+   database-wide statistics.
   
 
   
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 7f32310308..7fc3152c6d 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -218,7 +218,9 @@ PostgreSQL documentation
 I/O system that would otherwise be silent. Enabling checksums
 may incur a noticeable performance penalty. This option can only
 be set during initialization, and cannot be changed later. If
-set, checksums are calculated for all objects, in all databases.
+set, checksums are calculated for all objects, in all databases. All
+checksum failures will be reported in the  view.

   
  
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index dd6bce57d2..3c7baf1fb4 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -531,7 +531,8 @@ PostgreSQL documentation
 By default, checksums are verified and checksum failures will result
 in a non-zero exit status. However, the base backup will not be
 removed in such a case, as if the --no-clean option
-had been used.
+had been used.  Checksum verifications failures will also be reported
+in the  view.

   
  
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 16e456a7d9..e90d251a2c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -816,7 +816,10 @@ CREATE VIEW pg_stat_database AS
 SELECT
 D.oid AS datid,
 D.datname AS datname,
-pg_stat_get_db_numbackends(D.oid) AS numbackends,
+CASE
+WHEN (D.oid = (0)::oid) THEN NULL::integer
+ELSE pg_stat_get_db_numbackends(D.oid)
+END AS numbackends,
 pg_stat_get_db_xact_commit(D.oid) AS xact_commit,
 pg_stat_get_db_xact_rollback(D.oid) AS xact_rollback,
 pg_stat_get_db_blocks_fetched(D.oid) -
@@ -832,10 +835,15 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_temp_bytes(D.oid) AS temp_bytes,
 pg_stat_get_db_deadlocks(D.oid) AS deadlocks,
 pg_stat_get_db_checksum_failures(D.oid) AS 

Re: pgbench - add minimal stats on initialization

2019-04-07 Thread Fabien COELHO


 done in 0.68 s (drop 0.06 s, create table 0.02 s, generate 0.34 s, vacuum 
0.13 s, primary keys 0.13 s).


See the durations on the last line.


It's even better with working TAP tests.

--
Fabien.

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 99529de52a..76a5b87fe8 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3941,32 +3941,48 @@ checkInitSteps(const char *initialize_steps)
 static void
 runInitSteps(const char *initialize_steps)
 {
-	PGconn	   *con;
-	const char *step;
+	PQExpBufferData	stats;
+	PGconn			   *con;
+	const char		   *step;
+	doublerun_time = 0.0;
+	boolfirst = true;
+
+	initPQExpBuffer(&stats);
 
 	if ((con = doConnect()) == NULL)
 		exit(1);
 
 	for (step = initialize_steps; *step != '\0'; step++)
 	{
+		instr_time	start;
+		char 	   *op = NULL;
+
+		INSTR_TIME_SET_CURRENT(start);
+
 		switch (*step)
 		{
 			case 'd':
+				op = "drop";
 				initDropTables(con);
 				break;
 			case 't':
+				op = "create table";
 				initCreateTables(con);
 				break;
 			case 'g':
+				op = "generate",
 				initGenerateData(con);
 				break;
 			case 'v':
+				op = "vacuum";
 				initVacuum(con);
 				break;
 			case 'p':
+				op = "primary keys";
 				initCreatePKeys(con);
 				break;
 			case 'f':
+				op = "foreign keys";
 				initCreateFKeys(con);
 				break;
 			case ' ':
@@ -3977,10 +3993,28 @@ runInitSteps(const char *initialize_steps)
 PQfinish(con);
 exit(1);
 		}
+
+		if (op != NULL)
+		{
+			instr_time	diff;
+			double		elapsed_sec;
+
+			INSTR_TIME_SET_CURRENT(diff);
+			INSTR_TIME_SUBTRACT(diff, start);
+			elapsed_sec = INSTR_TIME_GET_DOUBLE(diff);
+
+			if (!first)
+				appendPQExpBufferStr(&stats, ", ");
+
+			first = false;
+			appendPQExpBuffer(&stats, "%s %.2f s", op, elapsed_sec);
+			run_time += elapsed_sec;
+		}
 	}
 
-	fprintf(stderr, "done.\n");
+	fprintf(stderr, "done in %.2f s (%s).\n", run_time, stats.data);
 	PQfinish(con);
+	termPQExpBuffer(&stats);
 }
 
 /*
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 62906d5e55..696dc2b36c 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -101,7 +101,7 @@ pgbench(
 	[qr{^$}],
 	[
 		qr{creating tables},   qr{vacuuming},
-		qr{creating primary keys}, qr{done\.}
+		qr{creating primary keys}, qr{done in \d+\.\d\d s }
 	],
 	'pgbench scale 1 initialization',);
 
@@ -116,7 +116,7 @@ pgbench(
 		qr{vacuuming},
 		qr{creating primary keys},
 		qr{creating foreign keys},
-		qr{done\.}
+		qr{done in \d+\.\d\d s }
 	],
 	'pgbench scale 1 initialization');
 
@@ -131,7 +131,7 @@ pgbench(
 		qr{creating primary keys},
 		qr{.* of .* tuples \(.*\) done},
 		qr{creating foreign keys},
-		qr{done\.}
+		qr{done in \d+\.\d\d s }
 	],
 	'pgbench --init-steps');
 


Re: ToDo: show size of partitioned table

2019-04-07 Thread Pavel Stehule
ne 7. 4. 2019 v 18:07 odesílatel Alvaro Herrera 
napsal:

> On 2019-Apr-07, Pavel Stehule wrote:
>
> > ne 7. 4. 2019 v 17:27 odesílatel Justin Pryzby 
> > napsal:
>
> > > I think there's an issue with showing indices.  You said that \dP
> should be
> > > same as \dPti, no?  Right now, indices are not shown in \dP, unless a
> > > pattern is given.  I see you add that behavior in the regression
> > > tests; is that really what's intended ?  Also, right now adding a
> > > pattern affects how sizes are computed, I don't see why that's
> > > desirable or, if so, how to resolve that inconsistency, or how to
> > > document it.
> >
> > That depends. If there are not pattern, then \dP show only tables, but
> with
> > total relation size (so size of indexes are nested). It is different than
> > \dPti, but I think so it is useful - when you don't specify object type,
> > then usually you would to see a tables, but with total size.
> >
> > I don't see a benefit from \dP == \dPti. When there are a pattern (that
> can
> > choose some index, then, indexes are displayed and \dP == \dPti.
>
> Well, I think Justin has it right --- \dP should be just like \df, which
> means to list "everything".  If you add the "t" or the "i", that means
> to list only those kinds of things (just like adding one of a, n, p, t,
> w does for \df).  You can add both, and then it list both kinds, just
> like \dfanptw list the same things that \df does.
>
> That's also what I changed the docs to say, but I failed to update the
> code correctly, and didn't verify the expected output closely either.
> So I'm due to resubmit this ...
>

ok

Pavel


> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread Tom Lane
David Rowley  writes:
> Okay.  Here's another version with all the average locks code removed
> that only recreates the table when it's completely empty.

Um ... I don't see where you're destroying the old hash?

Also, I entirely dislike wiring in assumptions about hash_seq_search's
private state structure here.  I think it's worth having an explicit
entry point in dynahash.c to get the current number of buckets.

Also, I would not define "significantly bloated" as "the table has
grown at all".  I think the threshold ought to be at least ~100
buckets, if we're starting at 16.

Probably we ought to try to gather some evidence to inform the
choice of cutoff here.  Maybe instrument the regression tests to
see how big the table typically gets?

regards, tom lane




Re: query logging of prepared statements

2019-04-07 Thread Alvaro Herrera
On 2019-Apr-07, Justin Pryzby wrote:

> [...] Since I've been using log_statement=all, and not
> log_min_duration_statement, I don't have a strong opinion about it.

Ah, so your plan for this in production is to use the sample-based
logging facilities, I see!  Did you get Adrien a beer already?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: ToDo: show size of partitioned table

2019-04-07 Thread Alvaro Herrera
On 2019-Apr-07, Pavel Stehule wrote:

> ne 7. 4. 2019 v 17:27 odesílatel Justin Pryzby 
> napsal:

> > I think there's an issue with showing indices.  You said that \dP should be
> > same as \dPti, no?  Right now, indices are not shown in \dP, unless a
> > pattern is given.  I see you add that behavior in the regression
> > tests; is that really what's intended ?  Also, right now adding a
> > pattern affects how sizes are computed, I don't see why that's
> > desirable or, if so, how to resolve that inconsistency, or how to
> > document it.
> 
> That depends. If there are not pattern, then \dP show only tables, but with
> total relation size (so size of indexes are nested). It is different than
> \dPti, but I think so it is useful - when you don't specify object type,
> then usually you would to see a tables, but with total size.
> 
> I don't see a benefit from \dP == \dPti. When there are a pattern (that can
> choose some index, then, indexes are displayed and \dP == \dPti.

Well, I think Justin has it right --- \dP should be just like \df, which
means to list "everything".  If you add the "t" or the "i", that means
to list only those kinds of things (just like adding one of a, n, p, t,
w does for \df).  You can add both, and then it list both kinds, just
like \dfanptw list the same things that \df does.

That's also what I changed the docs to say, but I failed to update the
code correctly, and didn't verify the expected output closely either.
So I'm due to resubmit this ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 03:47, Andres Freund  wrote:
> Could you benchmark your adversarial case?

Which case?

I imagine the worst case for v2 is a query that just constantly asks
for over 16 locks. Perhaps a prepared plan, so not to add planner
overhead.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: ToDo: show size of partitioned table

2019-04-07 Thread Pavel Stehule
ne 7. 4. 2019 v 14:15 odesílatel Alvaro Herrera 
napsal:

> So how about the attached version?
>

 +1

Pavel

>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: ToDo: show size of partitioned table

2019-04-07 Thread Pavel Stehule
ne 7. 4. 2019 v 17:27 odesílatel Justin Pryzby 
napsal:

> On Sun, Apr 07, 2019 at 08:15:06AM -0400, Alvaro Herrera wrote:
> > So how about the attached version?
>
> +1
>
> I found a few issues.
>
> \dP+ didn't work.  Fix attached.
>
> +static const SchemaQuery Query_for_list_of_partitioned_relations = {
>
> +   .catname = "pg_catalog.pg_class c",
>
> +   .selcondition = "c.relkind = "
> CppAsString2(RELKIND_PARTITIONED_TABLE),
>
>
> => Should it be called Query_for_list_of_partitioned_tables ?  Or should
> c.relkind match indices, too ?
>
> On Sat, Apr 06, 2019 at 01:36:23AM -0300, Alvaro Herrera wrote:
> > Maybe the only behavior change I'd do to the submitted patch is to have
> > \dP show both tables and indexes, while \dPt shows only tables and \dPi
> > shows only indexes.  Maybe have \dPti show both tables and indexes? (
> > identical to \dP)  That would be consistent with \d itself.
>
> I think there's an issue with showing indices.  You said that \dP should be
> same as \dPti, no?  Right now, indices are not shown in \dP, unless a
> pattern
> is given.  I see you add that behavior in the regression tests; is that
> really
> what's intended ?  Also, right now adding a pattern affects how sizes are
> computed, I don't see why that's desirable or, if so, how to resolve that
> inconsistency, or how to document it.
>

That depends. If there is no pattern, then \dP shows only tables, but with
the total relation size (so the size of indexes is included). It is different
from \dPti, but I think it is useful - when you don't specify an object type,
you usually want to see the tables, but with the total size.

I don't see a benefit from \dP == \dPti. When there is a pattern (one that can
match some index), indexes are displayed and \dP == \dPti.

I think Alvaro's version is correct, and I prefer it.

Regards

Pavel


> Justin
>


Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread Andres Freund
Hi,

On 2019-04-08 03:40:52 +1200, David Rowley wrote:
> On Mon, 8 Apr 2019 at 03:20, Tom Lane  wrote:
> >
> > David Rowley  writes:
> > > The reason I thought it was a good idea to track some history there
> > > was to stop the lock table constantly being shrunk back to the default
> > > size every time a simple single table query was executed.
> >
> > I think that's probably gilding the lily, considering that this whole
> > issue is pretty new.  There's no evidence that expanding the local
> > lock table is a significant drag on queries that need a lot of locks.
> 
> Okay.  Here's another version with all the average locks code removed
> that only recreates the table when it's completely empty.

Could you benchmark your adversarial case?

- Andres




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 03:20, Tom Lane  wrote:
>
> David Rowley  writes:
> > The reason I thought it was a good idea to track some history there
> > was to stop the lock table constantly being shrunk back to the default
> > size every time a simple single table query was executed.
>
> I think that's probably gilding the lily, considering that this whole
> issue is pretty new.  There's no evidence that expanding the local
> lock table is a significant drag on queries that need a lot of locks.

Okay.  Here's another version with all the average locks code removed
that only recreates the table when it's completely empty.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


shrink_bloated_locallocktable_v2.patch
Description: Binary data


Re: query logging of prepared statements

2019-04-07 Thread Justin Pryzby
Hi,

Thanks both for thinking about this.

On Fri, Apr 05, 2019 at 06:16:38PM -0700, Andres Freund wrote:
> On 2019-04-04 16:01:26 -0300, Alvaro Herrera wrote:
> > Also, if you parse once and bind/execute many times, IMO the statement
> > should be logged exactly once.  I think you could that with the flag I
> > propose.

> I think deduplicating the logging between bind and execute has less of
> that hazard.

Do you mean the bind parameters, which are only duplicated in the case of
log_min_duration_statement ?

I actually implemented that, using Alvaro's suggestion of a flag in the Portal,
and decided that if duration is exceeded, for bind or execute, then it's likely
desirable to log the params, even if it's duplicative.  Since I've been using
log_statement=all, and not log_min_duration_statement, I don't have a strong
opinion about it.
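
For the archives, the shape of what I tried was roughly this (the flag name
is illustrative, not taken from an actual patch):

    /* sketch: in the bind/execute message handlers */
    if (check_log_statement(portal->stmts) && !portal->stmt_logged)
    {
        ereport(LOG, (errmsg("statement: %s", portal->sourceText)));
        portal->stmt_logged = true;    /* don't log it again at execute */
    }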

Perhaps you're right (and perhaps I should switch to
log_min_duration_statement).  I'll tentatively plan to withdraw the patch.

Thanks,
Justin




Re: ToDo: show size of partitioned table

2019-04-07 Thread Justin Pryzby
On Sun, Apr 07, 2019 at 08:15:06AM -0400, Alvaro Herrera wrote:
> So how about the attached version?

+1

I found a few issues.

\dP+ didn't work.  Fix attached.

+static const SchemaQuery Query_for_list_of_partitioned_relations = {   
  
+   .catname = "pg_catalog.pg_class c", 
  
+   .selcondition = "c.relkind = " CppAsString2(RELKIND_PARTITIONED_TABLE), 
  

=> Should it be called Query_for_list_of_partitioned_tables ?  Or should
c.relkind match indices, too ?

On Sat, Apr 06, 2019 at 01:36:23AM -0300, Alvaro Herrera wrote:
> Maybe the only behavior change I'd do to the submitted patch is to have
> \dP show both tables and indexes, while \dPt shows only tables and \dPi
> shows only indexes.  Maybe have \dPti show both tables and indexes? (
> identical to \dP)  That would be consistent with \d itself.

I think there's an issue with showing indices.  You said that \dP should be
same as \dPti, no?  Right now, indices are not shown in \dP, unless a pattern
is given.  I see you add that behavior in the regression tests; is that really
what's intended ?  Also, right now adding a pattern affects how sizes are
computed, I don't see why that's desirable or, if so, how to resolve that
inconsistency, or how to document it.

Justin
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 89f08fc..8254d61 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -789,6 +789,7 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 	switch (cmd[2])
 	{
 		case '\0':
+		case '+':
 		case 't':
 		case 'i':
 		case 'n':
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 40f7531..936439e 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3804,7 +3804,6 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 	printQueryOpt myopt = pset.popt;
 	bool translate_columns[] = {false, false, false, false, false, false, false, false, false};
 	const char *size_function;
-	const char *relkind_str;
 	const char *tabletitle;
 	bool		mixed_output = false;
 
@@ -3813,7 +3812,7 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 		showTables = showIndexes = true;
 
 	/*
-	 * Note: Declarative table partitions are only supported as of Pg 10.0.
+	 * Note: Declarative table partitioning is only supported as of Pg 10.0.
 	 */
 	if (pset.sversion < 10)
 	{
@@ -3829,14 +3828,12 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 	{
 		/* \dPi */
 		size_function = "pg_table_size";
-		relkind_str = CppAsString2(RELKIND_PARTITIONED_INDEX);
 		tabletitle = gettext_noop("List of partitioned indexes");
 	}
 	else if (showTables && !showIndexes)
 	{
 		/* \dPt */
 		size_function = "pg_table_size";
-		relkind_str = CppAsString2(RELKIND_PARTITIONED_TABLE);
 		tabletitle = gettext_noop("List of partitioned tables");
 	}
 	else
@@ -3844,17 +3841,14 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 		/* show all kinds */
 		tabletitle = gettext_noop("List of partitioned tables and indexes");
 		mixed_output = true;
+		size_function = "pg_table_size";
 		if (!pattern)
 		{
-			size_function = "pg_total_relation_size";
-			relkind_str = CppAsString2(RELKIND_PARTITIONED_TABLE);
+			// why ??? size_function = "pg_total_relation_size";
+			// why this too ??? relkind_str = CppAsString2(RELKIND_PARTITIONED_TABLE);
 		}
 		else
-		{
 			size_function = "pg_table_size";
-			relkind_str = CppAsString2(RELKIND_PARTITIONED_TABLE)
-", " CppAsString2(RELKIND_PARTITIONED_INDEX);
-		}
 	}
 
 	initPQExpBuffer(&buf);
@@ -3963,7 +3957,14 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 		}
 	}
 
-	appendPQExpBuffer(&buf, "\nWHERE c.relkind IN (%s)", relkind_str);
+	appendPQExpBufferStr(&buf, "\nWHERE c.relkind IN (");
+	if (showTables)
+		appendPQExpBufferStr(&buf, CppAsString2(RELKIND_PARTITIONED_TABLE) ",");
+	if (showIndexes)
+		appendPQExpBufferStr(&buf, CppAsString2(RELKIND_PARTITIONED_INDEX) ",");
+	appendPQExpBufferStr(&buf, "''");	/* dummy */
+	appendPQExpBufferStr(&buf, ")\n");
+
+	appendPQExpBufferStr(&buf, !showNested ? " AND NOT c.relispartition\n" : "\n");
 
 	if (!pattern)


Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread Tom Lane
David Rowley  writes:
> On Mon, 8 Apr 2019 at 02:59, Tom Lane  wrote:
>> We *should* be using hash_get_num_entries(), but only to verify
>> that the table is empty before resetting it.  The additional bit
>> that is needed is to see whether the number of buckets is large
>> enough to justify calling the table bloated.

> The reason I thought it was a good idea to track some history there
> was to stop the lock table constantly being shrunk back to the default
> size every time a simple single table query was executed.

I think that's probably gilding the lily, considering that this whole
issue is pretty new.  There's no evidence that expanding the local
lock table is a significant drag on queries that need a lot of locks.

regards, tom lane




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 02:59, Tom Lane  wrote:
>
> David Rowley  writes:
> > hash_get_num_entries() looks cheap enough to me. Can you explain why
> > you think that's too expensive?
>
> What I objected to cost-wise was counting the number of lock
> acquisitions/releases, which seems entirely beside the point.
>
> We *should* be using hash_get_num_entries(), but only to verify
> that the table is empty before resetting it.  The additional bit
> that is needed is to see whether the number of buckets is large
> enough to justify calling the table bloated.

The reason I thought it was a good idea to track some history there
was to stop the lock table constantly being shrunk back to the default
size every time a simple single table query was executed. For example,
a workload repeatably doing:

SELECT * FROM table_with_lots_of_partitions;
SELECT * FROM non_partitioned_table;

I was worried that obtaining locks on the partitioned table would
become a little slower because it would have to expand the hash table
each time the query is executed.

> > As cheap as possible sounds good, but I'm confused at why you think
> > the table will always be empty at the end of transaction.
>
> It's conceivable that it won't be, which is why we need a test.
> I'm simply arguing that if it is not, we can just postpone de-bloating
> till it is.  Session-level locks are so rarely used that there's no
> need to sweat about that case.

That seems fair.  It would certainly simplify the patch.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread Tom Lane
David Rowley  writes:
> On Mon, 8 Apr 2019 at 02:20, Tom Lane  wrote:
>> I like the concept ... but the particular implementation, not so much.
>> It seems way overcomplicated.  In the first place, why should we
>> add code to copy entries?  Just don't do it except when the table
>> is empty.  In the second, I think we could probably have a far
>> cheaper test for how big the table is --- maybe we'd need to
>> expose some function in dynahash.c, but the right way here is just
>> to see how many buckets there are.  I don't like adding statistics
>> counting for this, because it's got basically nothing to do with
>> what the actual problem is.  (If you acquire and release one lock,
>> and do that over and over, you don't have a bloat problem no
>> matter how many times you do it.)

> hash_get_num_entries() looks cheap enough to me. Can you explain why
> you think that's too expensive?

What I objected to cost-wise was counting the number of lock
acquisitions/releases, which seems entirely beside the point.

We *should* be using hash_get_num_entries(), but only to verify
that the table is empty before resetting it.  The additional bit
that is needed is to see whether the number of buckets is large
enough to justify calling the table bloated.

> As cheap as possible sounds good, but I'm confused at why you think
> the table will always be empty at the end of transaction.

It's conceivable that it won't be, which is why we need a test.
I'm simply arguing that if it is not, we can just postpone de-bloating
till it is.  Session-level locks are so rarely used that there's no
need to sweat about that case.

regards, tom lane




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 02:36, David Rowley  wrote:
> > LockMethodLocalHash is special in that it predictably goes to empty
> > at the end of every transaction, so that de-bloating at that point
> > is a workable strategy.  I think we'd probably need something more
> > robust if we were trying to fix this generally for all hash tables.
> > But if we're going to go with the one-off hack approach, we should
> > certainly try to keep that hack as simple as possible.
>
> As cheap as possible sounds good, but I'm confused at why you think
> the table will always be empty at the end of transaction. It's my
> understanding and I see from debugging that session level locks remain
> in there. If I don't copy those into the new table they'll be lost.

Or we could just skip the table recreation if there are any
session-level locks.  That would require calling hash_get_num_entries() on
the table again and just recreating the table if there are 0 locks.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 02:20, Tom Lane  wrote:
> I like the concept ... but the particular implementation, not so much.
> It seems way overcomplicated.  In the first place, why should we
> add code to copy entries?  Just don't do it except when the table
> is empty.  In the second, I think we could probably have a far
> cheaper test for how big the table is --- maybe we'd need to
> expose some function in dynahash.c, but the right way here is just
> to see how many buckets there are.  I don't like adding statistics
> counting for this, because it's got basically nothing to do with
> what the actual problem is.  (If you acquire and release one lock,
> and do that over and over, you don't have a bloat problem no
> matter how many times you do it.)

hash_get_num_entries() looks cheap enough to me. Can you explain why
you think that's too expensive?

> LockMethodLocalHash is special in that it predictably goes to empty
> at the end of every transaction, so that de-bloating at that point
> is a workable strategy.  I think we'd probably need something more
> robust if we were trying to fix this generally for all hash tables.
> But if we're going to go with the one-off hack approach, we should
> certainly try to keep that hack as simple as possible.

As cheap as possible sounds good, but I'm confused at why you think
the table will always be empty at the end of transaction. It's my
understanding and I see from debugging that session level locks remain
in there. If I don't copy those into the new table they'll be lost.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Checksum errors in pg_stat_database

2019-04-07 Thread Magnus Hagander
On Thu, Apr 4, 2019 at 2:52 PM Julien Rouhaud  wrote:

> On Thu, Apr 4, 2019 at 1:25 PM Magnus Hagander 
> wrote:
> >
> > On Thu, Apr 4, 2019 at 10:47 AM Julien Rouhaud 
> wrote:
> >>
> >> Actually we do track counters for shared relations (see
> >> pgstat_report_stat), we just don't expose them in any view.  But it's
> >> still possible to get the counters manually:
> >>
> >> # select pg_stat_get_db_blocks_hit(0);
> >>  pg_stat_get_db_blocks_hit
> >> ---
> >>2710329
> >> (1 row)
> >
> >
> > Oh, right, we do actually collect it, we just don't show it. So that's
> another argument *for* having it in pg_stat_database. Or at least not for
> having it in a checksum specific view, because then we should really make a
> separate view for this as well.
>
> Ok, so let's expose all the shared counters in pg_stat_database and
> remove the pg_stat_checksum view.
>
> >> My main concern is that pg_stat_get_db_numbackends(0) reports something
> >> like the total number of backends (though it seems that there's an
> >> extra connection accounted for, I don't know which process it is), so
> >> if we expose it in pg_stat_database, sum(numbackends) won't make sense
> >> anymore.
> >
> > We could also just hardcode it so that one always shows 0?
>
> That's a bit hacky, but that's probably the best compromise.  Attached
> v4 with all those changes.
>

I'm not sure I like the idea of using "" as the database
name. It's not very likely that somebody would be using that as a name for
their database, but it's not impossible. But it also just looks strange.
Wouldn't NULL be a more appropriate choice?

Likewise, shouldn't we return NULL as the number of backends for the shared
counters, rather than 0?
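
In any case, assuming the counter columns end up being named
checksum_failures and checksum_last_failure, I'd expect monitoring to
boil down to something like:

SELECT datname,
       checksum_failures,
       checksum_last_failure
FROM pg_stat_database
ORDER BY checksum_failures DESC NULLS LAST;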

Micro-nit:
+ Time at which the last data page checksum failures was detected in
s/failures/failure/

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: pg_rewind vs superuser

2019-04-07 Thread Magnus Hagander
On Fri, Apr 5, 2019 at 1:05 PM Michael Paquier  wrote:

> On Fri, Apr 05, 2019 at 10:39:26AM +0200, Michael Banck wrote:
> > Ok, so the problem is that that checkpoint might be still ongoing when
> > you quickly issue a pg_rewind from the other side?
>
> The end-of-recovery checkpoint may not have even begun.
>

So can we *detect* that this is the case? Because if so, we could perhaps
just wait for it to be done? Because there will always be one?

The main point is -- we know from experience that it's pretty fragile to
assume the user read the documentation :) So if we can find *any* way to
handle this in code rather than docs, that'd be great. We would still
absolutely want the docs change for back branches of course.


> I think it might be useful to specify more exactly which of the two
> > servers (the remote one AIUI) needs a CHECKPOINT in the above. Also, if
> > it is the case that a CHECKPOINT is done automatically (see above), that
> > paragraph could be rewritten to say something like "pg_rewind needs to
> > wait for the checkpoint on the remote server to finish. This can be
> > ensured by issuing an explicit checkpoint on the remote server prior to
> > running pg_rewind."
>
> Well, the target server needs to be cleanly shut down, so it seems
> pretty clear to me which one needs to have a checkpoint :)
>

Clear to you and us of course, but quite possibly not to everybody. I'm
sure there are a *lot* of users out there who do not realize that "cleanly
shut down" means "ran a checkpoint just before it shut down".


> Finally, (and still, if I got the above correctly), to the suggestion of
> > Magnus of pg_rewind running the checkpoint itself on the remote: would
> > that again mean that pg_rewind needs SUPERUSER rights or is there
> > a(nother) GRANTable function that could be added to the list in this
> > case?
>
> pg_rewind would require again a superuser.  So this could be
>

Ugh, you are right of course.



> optional.  In one HA workflow I maintain, what I actually do is to
> enforce directly a checkpoint immediately after the promotion is done
> to make sure that the data is up-to-date, and I don't meddle with
> pg_rewind workflow.
>

Sure. And every other HA setup also has to take care of it. That's why it
would make sense to centralize it into the tool itself when it's
*mandatory* to deal with it somehow.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread Tom Lane
David Rowley  writes:
> On Sat, 6 Apr 2019 at 16:03, Tom Lane  wrote:
>> My own thought about how to improve this situation was just to destroy
>> and recreate LockMethodLocalHash at transaction end (or start)
>> if its size exceeded $some-value.  Leaving it permanently bloated seems
>> like possibly a bad idea, even if we get rid of all the hash_seq_searches
>> on it.

> Which I thought was an okay idea.  I think the one advantage that
> would have over making hash_seq_search() faster for large and mostly
> empty tables is that over-sized hash tables are just not very cache
> efficient, and if we don't need it to be that large then we should
> probably consider making it smaller again.

> I've had a go at implementing this and using Amit's benchmark the
> performance looks pretty good. I can't detect any slowdown for the
> general case.

I like the concept ... but the particular implementation, not so much.
It seems way overcomplicated.  In the first place, why should we
add code to copy entries?  Just don't do it except when the table
is empty.  In the second, I think we could probably have a far
cheaper test for how big the table is --- maybe we'd need to
expose some function in dynahash.c, but the right way here is just
to see how many buckets there are.  I don't like adding statistics
counting for this, because it's got basically nothing to do with
what the actual problem is.  (If you acquire and release one lock,
and do that over and over, you don't have a bloat problem no
matter how many times you do it.)

LockMethodLocalHash is special in that it predictably goes to empty
at the end of every transaction, so that de-bloating at that point
is a workable strategy.  I think we'd probably need something more
robust if we were trying to fix this generally for all hash tables.
But if we're going to go with the one-off hack approach, we should
certainly try to keep that hack as simple as possible.

regards, tom lane




Re: [PATCH] Implement uuid_version()

2019-04-07 Thread David Fetter
On Sat, Apr 06, 2019 at 12:35:47PM -0400, Tom Lane wrote:
> Jose Luis Tallon  writes:
> >      While working on an application, the need arose to be able to 
> > efficiently differentiate v4/v5 UUIDs (for use in partial indexes, among 
> > others)
> > ... so please find attached a trivial patch which adds the 
> > functionality.
> 
> No particular objection...
> 
> >      I'm not sure whether this actually would justify a version bump for 
> > the OSSP-UUID extension
> 
> Yes.  Basically, once we've shipped a given version of an extension's
> SQL script, that version is *frozen*.  Anything at all that you want
> to do to it has to be done in an extension update script, because
> otherwise there's no clean migration path for users.
> 
> So basically, leave uuid-ossp--1.1.sql as it stands, and put the
> new CREATE FUNCTION in a new uuid-ossp--1.1--1.2.sql script.
> See any recent patch that updated an extension for an example, eg
> commit eb6f29141bed9dc95cb473614c30f470ef980705.
> 
> (We do allow exceptions when somebody's already updated the extension
> in the current devel cycle, but that doesn't apply here.)
> 
> > Another matter, which I'd like to propose in a later thread, is 
> > whether it'd be interesting to include the main UUID functionality 
> > directly in core
> 
> We've rejected that before, and I don't see any reason to think
> the situation has changed since prior discussions.

I see some.

UUIDs turn out to be super useful in distributed systems to give good
guarantees of uniqueness without coordinating with a particular node.
Such systems have become a good bit more common since the most recent
time this was discussed.
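
A trivial example of the pattern, using nothing beyond what contrib
already provides (the table here is made up):

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE events (
    id      uuid DEFAULT uuid_generate_v4() PRIMARY KEY,
    payload jsonb
);

-- any node can insert without coordinating key allocation with the others
INSERT INTO events (payload) VALUES ('{"source": "node-a"}');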

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread David Rowley
On Sat, 6 Apr 2019 at 16:03, Tom Lane  wrote:
> I'd also point out that this is hardly the only place where we've
> seen hash_seq_search on nearly-empty hash tables become a bottleneck.
> So I'm not thrilled about attacking that with one-table-at-time patches.
> I'd rather see us do something to let hash_seq_search win across
> the board.

Rewinding back to mid-Feb:

You wrote:
> My own thought about how to improve this situation was just to destroy
> and recreate LockMethodLocalHash at transaction end (or start)
> if its size exceeded $some-value.  Leaving it permanently bloated seems
> like possibly a bad idea, even if we get rid of all the hash_seq_searches
> on it.

Which I thought was an okay idea.  I think the one advantage that
would have over making hash_seq_search() faster for large and mostly
empty tables is that over-sized hash tables are just not very cache
efficient, and if we don't need it to be that large then we should
probably consider making it smaller again.

I've had a go at implementing this and using Amit's benchmark the
performance looks pretty good. I can't detect any slowdown for the
general case.

master:

plan_cache_mode = auto:

$ pgbench -n -M prepared -T 60 -f select.sql postgres
tps = 9373.698212 (excluding connections establishing)
tps = 9356.993148 (excluding connections establishing)
tps = 9367.579806 (excluding connections establishing)

plan_cache_mode = force_custom_plan:

$ pgbench -n -M prepared -T 60 -f select.sql postgres
tps = 12863.758185 (excluding connections establishing)
tps = 12787.766054 (excluding connections establishing)
tps = 12817.878940 (excluding connections establishing)

shrink_bloated_locallocktable.patch:

plan_cache_mode = auto:

$ pgbench -n -M prepared -T 60 -f select.sql postgres
tps = 12756.021211 (excluding connections establishing)
tps = 12800.939518 (excluding connections establishing)
tps = 12804.501977 (excluding connections establishing)

plan_cache_mode = force_custom_plan:

$ pgbench -n -M prepared -T 60 -f select.sql postgres
tps = 12763.448836 (excluding connections establishing)
tps = 12901.673271 (excluding connections establishing)
tps = 12856.512745 (excluding connections establishing)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


shrink_bloated_locallocktable.patch
Description: Binary data


Re: [PATCH] Implement uuid_version()

2019-04-07 Thread Jose Luis Tallon

On 6/4/19 18:35, Tom Lane wrote:

Jose Luis Tallon  writes:

      While working on an application, the need arose to be able to
efficiently differentiate v4/v5 UUIDs (for use in partial indexes, among
others)
... so please find attached a trivial patch which adds the
functionality.

No particular objection...


      I'm not sure whether this actually would justify a version bump for
the OSSP-UUID extension

Yes.  Basically, once we've shipped a given version of an extension's
SQL script, that version is *frozen*.  Anything at all that you want
to do to it has to be done in an extension update script, because
otherwise there's no clean migration path for users.


Got it, and done. Please find attached a v2 patch with the upgrade 
script included.
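
Once this ships, existing installations can pick the function up and use
it for the partial-index case I mentioned, roughly like this (the table
and column names are just an example):

ALTER EXTENSION "uuid-ossp" UPDATE TO '1.2';

-- e.g. index only the v4 (random) UUIDs in a column holding mixed versions
CREATE TABLE docs (id uuid PRIMARY KEY, payload text);
CREATE INDEX docs_v4_ids ON docs (id) WHERE uuid_version(id) = 4;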



Thank you for taking a look. Your time is much appreciated :)


    J.L.


diff --git a/contrib/uuid-ossp/Makefile b/contrib/uuid-ossp/Makefile
index c52c583d64..7f29bec535 100644
--- a/contrib/uuid-ossp/Makefile
+++ b/contrib/uuid-ossp/Makefile
@@ -4,7 +4,7 @@ MODULE_big = uuid-ossp
 OBJS = uuid-ossp.o $(UUID_EXTRA_OBJS) $(WIN32RES)
 
 EXTENSION = uuid-ossp
-DATA = uuid-ossp--1.1.sql uuid-ossp--1.0--1.1.sql uuid-ossp--unpackaged--1.0.sql
+DATA = uuid-ossp--1.1.sql uuid-ossp--1.0--1.1.sql uuid-ossp--1.1--1.2.sql uuid-ossp--unpackaged--1.0.sql
 PGFILEDESC = "uuid-ossp - UUID generation"
 
 REGRESS = uuid_ossp
diff --git a/contrib/uuid-ossp/uuid-ossp--1.1--1.2.sql b/contrib/uuid-ossp/uuid-ossp--1.1--1.2.sql
new file mode 100644
index 00..8e47ca60a1
--- /dev/null
+++ b/contrib/uuid-ossp/uuid-ossp--1.1--1.2.sql
@@ -0,0 +1,9 @@
+/* contrib/uuid-ossp/uuid-ossp--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION uuid-ossp UPDATE TO '1.2'" to load this file. \quit
+
+CREATE FUNCTION uuid_version(namespace uuid)
+RETURNS int4
+AS 'MODULE_PATHNAME', 'uuid_version'
+IMMUTABLE STRICT LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/uuid-ossp/uuid-ossp.c b/contrib/uuid-ossp/uuid-ossp.c
index f5ae915f24..b4997281c0 100644
--- a/contrib/uuid-ossp/uuid-ossp.c
+++ b/contrib/uuid-ossp/uuid-ossp.c
@@ -122,6 +122,7 @@ PG_FUNCTION_INFO_V1(uuid_generate_v1mc);
 PG_FUNCTION_INFO_V1(uuid_generate_v3);
 PG_FUNCTION_INFO_V1(uuid_generate_v4);
 PG_FUNCTION_INFO_V1(uuid_generate_v5);
+PG_FUNCTION_INFO_V1(uuid_version);
 
 #ifdef HAVE_UUID_OSSP
 
@@ -531,3 +532,16 @@ uuid_generate_v5(PG_FUNCTION_ARGS)
   VARDATA_ANY(name), VARSIZE_ANY_EXHDR(name));
 #endif
 }
+
+Datum
+uuid_version(PG_FUNCTION_ARGS)
+{
+	pg_uuid_t  *arg = PG_GETARG_UUID_P(0);
+	dce_uuid_t uu;
+
+	/* function is marked STRICT, so arg can't be NULL */
	memcpy(&uu, arg, UUID_LEN);
+	UUID_TO_NETWORK(uu);
+
+	PG_RETURN_INT32(uu.time_hi_and_version >> 12);
+}
diff --git a/contrib/uuid-ossp/uuid-ossp.control b/contrib/uuid-ossp/uuid-ossp.control
index 657476c182..3479b06eff 100644
--- a/contrib/uuid-ossp/uuid-ossp.control
+++ b/contrib/uuid-ossp/uuid-ossp.control
@@ -1,5 +1,5 @@
 # uuid-ossp extension
 comment = 'generate universally unique identifiers (UUIDs)'
-default_version = '1.1'
+default_version = '1.2'
 module_pathname = '$libdir/uuid-ossp'
 relocatable = true
diff --git a/doc/src/sgml/uuid-ossp.sgml b/doc/src/sgml/uuid-ossp.sgml
index b3b816c372..43dd565886 100644
--- a/doc/src/sgml/uuid-ossp.sgml
+++ b/doc/src/sgml/uuid-ossp.sgml
@@ -156,6 +156,22 @@ SELECT uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org');
 

   
+
+  
+   Functions Returning UUID attributes
+   
+
+ 
+  uuid_version()
+  
+   
+Returns the UUID version (1,3,4,5). Assumes variant 1 (RFC4122).
+   
+  
+ 
+
+   
+  
  
 
  


Re: Fix foreign key constraint check for partitioned tables

2019-04-07 Thread Alvaro Herrera
On 2019-Apr-06, Tom Lane wrote:

> Hadi Moshayedi  writes:

> > This patch also changed the output of some of tests, i.e. previously
> > foreign key constraint failures errored on the partitioned table itself,
> > but now it shows the child table's name in the error message. I hope it is
> > ok.
> 
> Yeah, I think that's OK.  Interestingly, no such changes appear in
> HEAD's version of the regression test --- probably Alvaro's earlier
> changes had the same effect.

Yeah, they did.


-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: ToDo: show size of partitioned table

2019-04-07 Thread Alvaro Herrera
So how about the attached version?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 2bc8bbc2a74..3587abce8b2 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1659,6 +1659,36 @@ testdb=
 
   
 
+
+  
+\dP[itn+] [ pattern ]
+
+
+Lists partitioned relations.  If pattern
+is specified, only entries whose name matches the pattern are listed.
+The modifiers t (tables) and i
+(indexes) can be used, filtering the kind of relations to list.  By
+default, partitioned tables and indexes are listed.
+
+
+
+If the modifier n (nested) is used,
+then non-root partitioned tables are included, and a column is shown
+displaying the parent of each partitioned relation.
+
+
+
+If + is appended to the command name, the sum of
+sizes of table's partitions (including that of their indexes) is also
+displayed, along with the associated description.
+If n is combined with +, two
+sizes are shown: one including the total size of directly-attached
+leaf partitions, and another showing the total size of all partitions,
+including indirectly attached sub-partitions.
+
+
+  
+
   
 \drds [ role-pattern [ database-pattern ] ]
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index bc3d10e5158..89f08fc0eda 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -784,6 +784,22 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			case 'p':
 success = permissionsList(pattern);
 break;
+			case 'P':
+{
+	switch (cmd[2])
+	{
+		case '\0':
+		case 't':
+		case 'i':
+		case 'n':
			success = listPartitionedTables(&cmd[2], pattern, show_verbose);
+			break;
+		default:
+			status = PSQL_CMD_UNKNOWN;
+			break;
+	}
+}
+break;
 			case 'T':
 success = describeTypes(pattern, show_verbose, show_system);
 break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index f7f7285acca..40f753120b6 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3777,6 +3777,235 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 	return true;
 }
 
+/*
+ * \dP
+ * Takes an optional regexp to select particular relations
+ *
+ * As with \d, you can specify the kinds of relations you want:
+ *
+ * t for tables
+ * i for indexes
+ *
+ * And there's additional flags:
+ *
+ * n to list non-leaf partitioned tables
+ *
+ * and you can mix and match these in any order.
+ */
+bool
+listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
+{
+	bool		showTables = strchr(reltypes, 't') != NULL;
+	bool		showIndexes = strchr(reltypes, 'i') != NULL;
+	bool		showNested = strchr(reltypes, 'n') != NULL;
+	PQExpBufferData buf;
+	PQExpBufferData title;
+	PGresult   *res;
+	printQueryOpt myopt = pset.popt;
+	bool translate_columns[] = {false, false, false, false, false, false, false, false, false};
+	const char *size_function;
+	const char *relkind_str;
+	const char *tabletitle;
+	bool		mixed_output = false;
+
+	/* If no relation kind was selected, show them all */
+	if (!showTables && !showIndexes)
+		showTables = showIndexes = true;
+
+	/*
+	 * Note: Declarative table partitions are only supported as of Pg 10.0.
+	 */
	if (pset.sversion < 100000)
+	{
+		char		sverbuf[32];
+
+		pg_log_error("The server (version %s) does not support declarative table partitioning.",
+	 formatPGVersionNumber(pset.sversion, false,
+		   sverbuf, sizeof(sverbuf)));
+		return true;
+	}
+
+	if (showIndexes && !showTables)
+	{
+		/* \dPi */
+		size_function = "pg_table_size";
+		relkind_str = CppAsString2(RELKIND_PARTITIONED_INDEX);
+		tabletitle = gettext_noop("List of partitioned indexes");
+	}
+	else if (showTables && !showIndexes)
+	{
+		/* \dPt */
+		size_function = "pg_table_size";
+		relkind_str = CppAsString2(RELKIND_PARTITIONED_TABLE);
+		tabletitle = gettext_noop("List of partitioned tables");
+	}
+	else
+	{
+		/* show all kinds */
+		tabletitle = gettext_noop("List of partitioned tables and indexes");
+		mixed_output = true;
+		if (!pattern)
+		{
+			size_function = "pg_total_relation_size";
+			relkind_str = CppAsString2(RELKIND_PARTITIONED_TABLE);
+		}
+		else
+		{
+			size_function = "pg_table_size";
+			relkind_str = CppAsString2(RELKIND_PARTITIONED_TABLE)
+", " CppAsString2(RELKIND_PARTITIONED_INDEX);
+		}
+	}
+
+	initPQExpBuffer(&buf);
+
+	printfPQExpBuffer(&buf,
+	  "SELECT n.nspname as \"%s\",\n"
+	  "  c.relname as \"%s\",\n"
+	  "  pg_catalog.pg_get_userbyid(c.relowner) as \"%s\"",
+	  gettext_noop("Schema"),
+	  

Re: Augment every test postgresql.conf

2019-04-07 Thread Andrew Dunstan
On Sun, Apr 7, 2019 at 2:41 AM Noah Misch  wrote:
>
> On Sun, Dec 30, 2018 at 10:32:31AM -0500, Andrew Dunstan wrote:
> > On 12/30/18 12:53 AM, Noah Misch wrote:
> > > 2. stats_temp_directory is incompatible with TAP suites that start more than
> > >    one node simultaneously.
>
> > The obvious quick fix would be to have PostgresNode.pm set this to the
> > default after inserting the TEMP_CONFIG file.
>
> I'd like to get $SUBJECT in place for variables other than
> stats_temp_directory, using your quick fix idea.  Attached.  When its time
> comes, your stats_temp_directory work can delete that section.

Looks good.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: monitoring CREATE INDEX [CONCURRENTLY]

2019-04-07 Thread Peter Eisentraut
On 2019-04-06 06:40, Alvaro Herrera wrote:
> On 2019-Apr-05, Peter Eisentraut wrote:
> 
>> I've reworded the phases a bit.  There was a bit of a mixup of waiting
>> for snapshots and waiting for lockers.  Perhaps not so important from a
>> user's perspective, but at least now it's more consistent with the
>> source code comments.
> 
> No disagreement with that.  Looks reasonable.
> 
> I didn't test the patch, but it seems OK in a quick once-over.

committed
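
For anyone following along at home, the view can be polled from a second
session while an index build runs; the column names below are how I
understand them from the committed patch:

SELECT pid, phase,
       lockers_done, lockers_total,
       blocks_done, blocks_total,
       tuples_done, tuples_total
FROM pg_stat_progress_create_index;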

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-04-07 Thread Peter Eisentraut
On 2019-04-06 05:03, Tom Lane wrote:
> Trying a standard pgbench test case (pgbench -M prepared -S with
> one client and an -s 10 database), it seems that the patch is about
> 0.5% slower than HEAD.  Again, that's below the noise threshold,
> but it's not promising for the net effects of this patch on workloads
> that aren't specifically about large and prunable partition sets.

In my testing, I've also noticed that it seems to be slightly on the
slower side for these simple tests.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




change password_encryption default to scram-sha-256?

2019-04-07 Thread Peter Eisentraut
Should we change the default of the password_encryption setting to
'scram-sha-256' in PG12?
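
For context, the setting only affects how newly set passwords are stored;
today you have to opt in explicitly, e.g.:

SET password_encryption = 'scram-sha-256';
CREATE ROLE app_user LOGIN PASSWORD 'secret';

-- the stored verifier should then start with 'SCRAM-SHA-256$'
-- (pg_authid is superuser-only)
SELECT rolname, left(rolpassword, 13) AS verifier_type
FROM pg_authid
WHERE rolname = 'app_user';

(pg_hba.conf still needs a scram-sha-256 or md5 entry for such a role to
be able to log in.)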

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Back-branch bugs with fully-prunable UPDATEs

2019-04-07 Thread Amit Langote
On Sun, Apr 7, 2019 at 5:28 AM Tom Lane  wrote:
>
> This test script works fine in HEAD:
>
> drop table if exists parttbl cascade;
> CREATE TABLE parttbl (a int, b int) PARTITION BY LIST (a);
> CREATE TABLE parttbl_1 PARTITION OF parttbl FOR VALUES IN (NULL,500,501,502);
> UPDATE parttbl SET a = NULL, b = NULL WHERE a = 1600 AND b = 999;
>
> In v11, it suffers an assertion failure in ExecSetupPartitionTupleRouting.
>
> In v10, it doesn't crash, but we do get
>
> WARNING:  relcache reference leak: relation "parttbl" not closed
>
> which is surely a bug as well.
>
> (This is a boiled-down version of the script I mentioned in
> https://www.postgresql.org/message-id/13344.1554578...@sss.pgh.pa.us)

What we did in the following commit is behind this:

commit 58947fbd56d1481a86a03087c81f728fdf0be866
Author: Tom Lane 
Date:   Fri Feb 22 12:23:00 2019 -0500

Fix plan created for inherited UPDATE/DELETE with all tables excluded.

Before this commit, partitioning related code in the executor could
always rely on the fact that ModifyTableState.resultRelInfo[] only
contains *leaf* partitions.  As of this commit, it may contain the
root partitioned table in some cases, which breaks that assumption.

I've attached fixes for PG 10 and 11, modifying ExecInitModifyTable()
and inheritance_planner(), respectively.

> This seems to be related to what Amit Langote complained of in
> https://www.postgresql.org/message-id/21e7eaa4-0d4d-20c2-a1f7-c7e96f4ce...@lab.ntt.co.jp
> but since there's no foreign tables involved at all, either it's
> a different bug or he misdiagnosed what he was seeing.

I think that one is a different bug, but maybe I haven't looked closely enough.

Thanks,
Amit


pg11-empty-ModifyTable-no-routing.patch
Description: Binary data


pg10-ExecInitModifyTable-root-table-fix.patch
Description: Binary data


Re: Augment every test postgresql.conf

2019-04-07 Thread Noah Misch
On Sun, Dec 30, 2018 at 10:32:31AM -0500, Andrew Dunstan wrote:
> On 12/30/18 12:53 AM, Noah Misch wrote:
> > 2. stats_temp_directory is incompatible with TAP suites that start more than
> >one node simultaneously.

> The obvious quick fix would be to have PostgresNode.pm set this to the
> default after inserting the TEMP_CONFIG file.

I'd like to get $SUBJECT in place for variables other than
stats_temp_directory, using your quick fix idea.  Attached.  When its time
comes, your stats_temp_directory work can delete that section.
diff --git a/src/bin/pg_ctl/t/001_start_stop.pl b/src/bin/pg_ctl/t/001_start_stop.pl
index 50a57d0..a1143e3 100644
--- a/src/bin/pg_ctl/t/001_start_stop.pl
+++ b/src/bin/pg_ctl/t/001_start_stop.pl
command_ok([ $ENV{PG_REGRESS}, '--config-auth', "$tempdir/data" ],
'configure authentication');
 open my $conf, '>>', "$tempdir/data/postgresql.conf";
 print $conf "fsync = off\n";
+print $conf TestLib::slurp_file($ENV{TEMP_CONFIG})
+  if defined $ENV{TEMP_CONFIG};
 if (!$windows_os)
 {
print $conf "listen_addresses = ''\n";
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 81deed9..f576336 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -447,7 +447,17 @@ sub init
print $conf "log_statement = all\n";
print $conf "log_replication_commands = on\n";
print $conf "wal_retrieve_retry_interval = '500ms'\n";
-   print $conf "port = $port\n";
+
+   # If a setting tends to affect whether tests pass or fail, print it after
+   # TEMP_CONFIG.  Otherwise, print it before TEMP_CONFIG, thereby permitting
+   # overrides.  Settings that merely improve performance or ease debugging
+   # belong before TEMP_CONFIG.
+   print $conf TestLib::slurp_file($ENV{TEMP_CONFIG})
+ if defined $ENV{TEMP_CONFIG};
+
+   # XXX Neutralize any stats_temp_directory in TEMP_CONFIG.  Nodes running
+   # concurrently must not share a stats_temp_directory.
+   print $conf "stats_temp_directory = 'pg_stat_tmp'\n";
 
if ($params{allows_streaming})
{
@@ -473,6 +483,7 @@ sub init
print $conf "max_wal_senders = 0\n";
}
 
+   print $conf "port = $port\n";
if ($TestLib::windows_os)
{
print $conf "listen_addresses = '$host'\n";


Re: Speed up build on Windows by generating symbol definition in batch

2019-04-07 Thread Noah Misch
On Sat, Mar 30, 2019 at 03:42:39PM +0900, Peifeng Qiu wrote:
> When I watched the whole build process with a task manager, I discovered
> that a lot of time was spent on generating export symbol definitions,
> without consuming much CPU or IO.
> The script that does this is src/tools/msvc/gendef.pl; it enumerates the
> whole directory for ".obj" files and call dumpbin utility to generate
> ".sym" files one by one like this:
> 
> dumpbin /symbols /out:a.sym a.obj >NUL
> 
> Actually the dumpbin utility accepts a wildcard file name, so we can
> generate the export symbols of all ".obj" files in batch.
> 
> dumpbin /symbols /out:all.sym *.obj >NUL
> 
> This will avoid wasting time by creating and destroying the dumpbin process
> repeatedly and can speed up the build process considerably.
> I've tested on my 4-core 8-thread Intel i7 CPU. I've set MSBFLAGS=/m to
> ensure it can utilize all CPU cores.
> Building without this patch takes about 370 seconds. Building with this
> patch takes about 200 seconds. That's almost 2x speed up.

I, too, get a strong improvement, from 201s to 149s.  I can confirm it yields
identical *.def files.  Thanks for identifying this improvement.

> - my ($objfile, $symfile) = @_;
> - my ($symvol, $symdirs, $symbase) = splitpath($symfile);
> - my $tmpfile = catpath($symvol, $symdirs, "symbols.out");

You removed the last use of File::Spec::Functions, so remove its "use"
statement.

> - system("dumpbin /symbols /out:$tmpfile $_ >NUL")
> -   && die "Could not call dumpbin";

This error handling was crude, but don't replace it with zero error handling.

> - rename($tmpfile, $symfile);

Keep the use of a temporary file, too.

> +system("dumpbin /symbols /out:$symfile $ARGV[0]/*obj >NUL");

That should be *.obj, not *obj.