Re: Reply: Re: Cache relation sizes?

2021-07-14 Thread Anastasia Lubennikova
On Wed, Jun 16, 2021 at 9:24 AM Thomas Munro  wrote:

> No change yet, just posting a rebase to keep cfbot happy.
>
>
Hi, Thomas

I think that the proposed feature is pretty cool not only because it fixes
some old issues with lseek() performance and reliability, but also because
it opens the door to some new features, such as disk size quota management
[1]. Cheap up-to-date size tracking is one of the major problems for it.

I've only read the 1st patch so far. Here are some thoughts about it:

- SMgrSharedRelation - what about calling it SMgrShmemRelation? It looks
weird too, but "...SharedRelation" makes me think of shared catalog tables.

- We can exclude temp relations from consideration, as they are never
shared, and use RelFileNode instead of RelFileNodeBackend in
SMgrSharedRelationMapping.
Not only will it save us some space, it will also help to avoid a
situation where temp relations occupy the whole cache (which may easily
happen in some workloads).
I assume you were trying to make a generic solution, but in practice, temp
rels differ from regular ones and may require different optimizations.
Local caching will be enough for them if ever needed; for example, we can
leave smgr_cached_nblocks[] for temp relations.

>  int smgr_shared_relations = 1000;
- How bad would it be to keep cache entries for all relations?  Let's test
memory overhead on some realistic datasets. I suppose this default of 1000
was chosen just for the WIP patch.
I wonder if we can use DSM instead of regular shared memory?

- I'd expect that CREATE DATABASE and ALTER DATABASE SET TABLESPACE require
special treatment, because they bypass smgr, just like dropdb. I don't see
that in the patch.

[1] https://github.com/greenplum-db/diskquota

-- 
Best regards,
Lubennikova Anastasia


Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb

2021-07-05 Thread Anastasia Lubennikova
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

I looked through the patch. Looks good to me. 

CFbot tests are passing and, as I understand from the thread, nobody opposes this
refactoring, so I'm moving it to RFC status.

The new status of this patch is: Ready for Committer


Extensible storage manager API - smgr hooks

2021-06-29 Thread Anastasia Lubennikova
Hi, hackers!

Many recently discussed features can make use of an extensible storage
manager API: storage-level compression and encryption [1], [2], [3], the
disk quota feature [4], SLRU storage changes [5], and any other feature
that may want to substitute the PostgreSQL storage layer with its own
implementation (e.g., lazy_restore [6]).

Attached is a proposal to change the smgr API to make it extensible.  The idea
is to add a hook that lets plugins get control in smgr and define custom
storage managers. The patch replaces the smgrsw[] array and smgr_sw selector
with an smgr() function that loads the f_smgr implementation.

As before, there is only one implementation - smgr_md - which is wrapped into
smgr_standard().

To create a custom implementation, a developer needs to implement the smgr API
functions:

    static const struct f_smgr smgr_custom =
    {
        .smgr_init = custominit,
        ...
    };

create a hook function:

    const f_smgr *
    smgr_custom_hook(BackendId backend, RelFileNode rnode)
    {
        /*
         * Here we can also add some logic and choose which smgr to use,
         * based on rnode and backend.
         */
        return &smgr_custom;
    }

and finally set the hook:

    smgr_hook = smgr_custom_hook;
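
For illustration only, here is a minimal sketch of how a module could put
these pieces together in _PG_init(), simply delegating everything to the
existing md* routines. The file and function names are hypothetical and this
is not part of the attached patch; such a module would presumably be loaded
via shared_preload_libraries so the hook is installed before the first
smgropen() call:

/* custom_smgr.c - illustration only */
#include "postgres.h"

#include "fmgr.h"
#include "storage/md.h"
#include "storage/smgr.h"

PG_MODULE_MAGIC;

void _PG_init(void);

/* delegate everything to md, just as smgr_standard() does */
static const f_smgr smgr_custom = {
    .smgr_init = mdinit,
    .smgr_shutdown = NULL,
    .smgr_open = mdopen,
    .smgr_close = mdclose,
    .smgr_create = mdcreate,
    .smgr_exists = mdexists,
    .smgr_unlink = mdunlink,
    .smgr_extend = mdextend,
    .smgr_prefetch = mdprefetch,
    .smgr_read = mdread,
    .smgr_write = mdwrite,
    .smgr_writeback = mdwriteback,
    .smgr_nblocks = mdnblocks,
    .smgr_truncate = mdtruncate,
    .smgr_immedsync = mdimmedsync,
};

static const f_smgr *
smgr_custom_hook(BackendId backend, RelFileNode rnode)
{
    /* could look at backend/rnode here and return a different smgr */
    return &smgr_custom;
}

void
_PG_init(void)
{
    smgr_hook = smgr_custom_hook;
}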

[1]
https://www.postgresql.org/message-id/flat/11996861554042...@iva4-dd95b404a60b.qloud-c.yandex.net
[2]
https://www.postgresql.org/message-id/flat/272dd2d9.e52a.17235f2c050.Coremail.chjischj%40163.com
[3] https://postgrespro.com/docs/enterprise/9.6/cfs
[4]
https://www.postgresql.org/message-id/flat/CAB0yre%3DRP_ho6Bq4cV23ELKxRcfhV2Yqrb1zHp0RfUPEWCnBRw%40mail.gmail.com
[5]
https://www.postgresql.org/message-id/flat/20180814213500.GA74618%4060f81dc409fc.ant.amazon.com
[6] https://wiki.postgresql.org/wiki/PGCon_2021_Fun_With_WAL#Lazy_Restore


-- 
Best regards,
Lubennikova Anastasia
From 90085398f5ecc90d6b7caa318bd3d5f2867ef95c Mon Sep 17 00:00:00 2001
From: anastasia 
Date: Tue, 29 Jun 2021 22:16:26 +0300
Subject: [PATCH] smgr_api.patch

Make smgr API pluggable. Add smgr_hook that can be used to define custom storage managers.
Remove smgrsw[] array and smgr_sw selector. Instead, smgropen() uses smgr() function to load
f_smgr implementation using smgr_hook.

Also add smgr_init_hook and smgr_shutdown_hook.
And a lot of mechanical changes in smgr.c functions.
---
 src/backend/storage/smgr/smgr.c | 136 ++--
 src/include/storage/smgr.h  |  56 -
 2 files changed, 116 insertions(+), 76 deletions(-)

diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 4dc24649df..5f1981a353 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -26,47 +26,8 @@
 #include "utils/hsearch.h"
 #include "utils/inval.h"
 
-
-/*
- * This struct of function pointers defines the API between smgr.c and
- * any individual storage manager module.  Note that smgr subfunctions are
- * generally expected to report problems via elog(ERROR).  An exception is
- * that smgr_unlink should use elog(WARNING), rather than erroring out,
- * because we normally unlink relations during post-commit/abort cleanup,
- * and so it's too late to raise an error.  Also, various conditions that
- * would normally be errors should be allowed during bootstrap and/or WAL
- * recovery --- see comments in md.c for details.
- */
-typedef struct f_smgr
-{
-	void		(*smgr_init) (void);	/* may be NULL */
-	void		(*smgr_shutdown) (void);	/* may be NULL */
-	void		(*smgr_open) (SMgrRelation reln);
-	void		(*smgr_close) (SMgrRelation reln, ForkNumber forknum);
-	void		(*smgr_create) (SMgrRelation reln, ForkNumber forknum,
-bool isRedo);
-	bool		(*smgr_exists) (SMgrRelation reln, ForkNumber forknum);
-	void		(*smgr_unlink) (RelFileNodeBackend rnode, ForkNumber forknum,
-bool isRedo);
-	void		(*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
-BlockNumber blocknum, char *buffer, bool skipFsync);
-	bool		(*smgr_prefetch) (SMgrRelation reln, ForkNumber forknum,
-  BlockNumber blocknum);
-	void		(*smgr_read) (SMgrRelation reln, ForkNumber forknum,
-			  BlockNumber blocknum, char *buffer);
-	void		(*smgr_write) (SMgrRelation reln, ForkNumber forknum,
-			   BlockNumber blocknum, char *buffer, bool skipFsync);
-	void		(*smgr_writeback) (SMgrRelation reln, ForkNumber forknum,
-   BlockNumber blocknum, BlockNumber nblocks);
-	BlockNumber (*smgr_nblocks) (SMgrRelation reln, ForkNumber forknum);
-	void		(*smgr_truncate) (SMgrRelation reln, ForkNumber forknum,
-  BlockNumber nblocks);
-	void		(*smgr_immedsync) (SMgrRelation reln, ForkNumber forknum);
-} f_smgr;
-
-static const f_smgr smgrsw[] = {
+static const f_smgr smgr_md = {
 	/* magnetic disk */
-	{
 		.smgr_init = mdinit,
 		.smgr_shutdown = NULL,
 		.smgr_open = mdopen,
@@ -82,11 +43,8 @@ static const f_smgr smgrsw[] = {
 		.smgr_nblocks = mdnblocks,
 		.smgr_truncate = mdtruncate,
 		.smgr_immedsync = mdimmedsync,
-	}
 };
 
-static const int NSmgr = lengthof(smgrsw);
-
 /*
  * Each backend has a 

Re: Testing autovacuum wraparound (including failsafe)

2021-06-10 Thread Anastasia Lubennikova
On Thu, Jun 10, 2021 at 10:52 AM Andres Freund  wrote:

>
> I started to write a test for $Subject, which I think we sorely need.
>
> Currently my approach is to:
> - start a cluster, create a few tables with test data
> - acquire SHARE UPDATE EXCLUSIVE in a prepared transaction, to prevent
>   autovacuum from doing anything
> - cause dead tuples to exist
> - restart
> - run pg_resetwal -x 227648
> - do things like acquiring pins on pages that block vacuum from progressing
> - commit prepared transaction
> - wait for template0, template1 datfrozenxid to increase
> - wait for relfrozenxid for most relations in postgres to increase
> - release buffer pin
> - wait for postgres datfrozenxid to increase
>
>
Cool. Thank you for working on that!
Could you please share a WIP patch for the $subj? I'd be happy to help with
it.

> So far so good. But I've encountered a few things that stand in the way of
> enabling such a test by default:
>
> 1) During startup StartupSUBTRANS() zeroes out all pages between
>    oldestActiveXID and nextXid. That takes 8s on my workstation, but only
>    because I have plenty memory - pg_subtrans ends up 14GB as I currently do
>    the test. Clearly not something we could do on the BF.
>
>
> 3) pg_resetwal -x requires to carefully choose an xid: It needs to be the
>    first xid on a clog page. It's not hard to determine which xids are but it
>    depends on BLCKSZ and a few constants in clog.c. I've for now hardcoded a
>    value appropriate for 8KB, but ...
>
Maybe we can add a new pg_resetwal option?  Something like pg_resetwal
--xid-near-wraparound, which would ask pg_resetwal to calculate the exact xid
value using values from pg_control and the clog macros?
I think it might come in handy for manual testing too.
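
To sketch the arithmetic such an option would need (illustration only,
mirroring the private constants of clog.c - 2 status bits per transaction, so
one 8KB clog page covers BLCKSZ * 4 = 32768 xids - not actual pg_resetwal
code):

#include "postgres.h"
#include "access/transam.h"

/* mirror the private defines in access/transam/clog.c */
#define CLOG_BITS_PER_XACT  2
#define CLOG_XACTS_PER_BYTE (BITS_PER_BYTE / CLOG_BITS_PER_XACT)
#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)

/*
 * Round the requested xid down to the first xid of its clog page, which is
 * the kind of value pg_resetwal -x wants to be given.
 */
static TransactionId
first_xid_of_clog_page(TransactionId target)
{
    uint32      pageno = target / CLOG_XACTS_PER_PAGE;
    TransactionId xid = (TransactionId) (pageno * CLOG_XACTS_PER_PAGE);

    /* never go below the first normal xid */
    if (xid < FirstNormalTransactionId)
        xid = FirstNormalTransactionId;

    return xid;
}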


> I have 2 1/2 ideas about addressing 1):
>
> - We could expose functionality to advance nextXid to a future value at
>   runtime, without filling in clog/subtrans pages. Would probably have to
>   live in varsup.c and be exposed via regress.so or such?
>

This option looks scary to me. Several functions rely on the fact that
StartupSUBTRANS() has zeroed the pages.
And if we make it conditional just for tests, it means that we won't be
testing the real code path.

> - The only reason StartupSUBTRANS() does that work is because of the
>   prepared transaction holding back oldestActiveXID. That transaction in
>   turn exists to prevent autovacuum from doing anything before we do test
>   setup steps.
>


>
>   Perhaps it'd be sufficient to set autovacuum_naptime really high initially,
>   perform the test setup, set naptime to something lower, reload config. But
>   I'm worried that might not be reliable: If something ends up allocating an
>   xid we'd potentially reach the path in GetNewTransactionId() that wakes up
>   the launcher?  But probably there wouldn't be anything doing so?
>
>   Another aspect that might not make this a good choice is that it actually
>   seems relevant to be able to test cases where there are very old still
>   running transactions...
>

Maybe this exact scenario can be covered with a separate long-running test,
not included in the buildfarm test suite?

-- 
Best regards,
Lubennikova Anastasia


Re: A test for replay of regression tests

2021-06-10 Thread Anastasia Lubennikova
On Tue, Jun 8, 2021 at 2:44 AM Anastasia Lubennikova  wrote:

>
> On Tue, Jun 8, 2021 at 2:25 AM Thomas Munro  wrote:
>
>> Ok, here's a new version incorporating feedback so far.
>>
>> 1.  Invoke pg_regress directly (no make).
>>
>> 2.  Use PG_TEST_EXTRA="wal_consistency_checking" as a way to opt in to
>> the more expensive test.
>>
>> 3.  Use parallel schedule rather than serial.  It's faster but also
>> the non-determinism might discover more things.  This required
>> changing the TAP test max_connections setting from 10 to 25.
>>
>> 4.  Remove some extraneous print statements and
>> check-if-data-is-replicated-using-SELECT tests that are technically
>> not needed (I had copied those from 001_stream_rep.pl).
>>
>
> Thank you for working on this test set!
> I was especially glad to see the skip-tests option for pg_regress. I think
> it will become a very handy tool for hackers.
>
> To try the patch I had to resolve a few merge conflicts, see a rebased
> version in attachments.
>
> >   auth_extra   => [ '--create-role', 'repl_role' ]);
> This line and the comment above it look like some copy-paste artifacts.
> Did I get it right? If so, I suggest removing them.
> Other than that, the patch looks good to me.
>

For some reason, it failed on cfbot, so I'll switch it back to 'Waiting on
author'.
BTW, do I get it right that cfbot CI will need some adjustments to print
regression.out for this test?

See one more revision of the patch attached. It contains the following
changes:
- rebased on recent main
- removed the 'auth_extra' piece that I mentioned above
- added the missing "make clean" and .gitignore changes

-- 
Best regards,
Lubennikova Anastasia
commit 9630f6367d2b5079976bef0a27bb925d00cb798d
Author: anastasia 
Date:   Tue Jun 8 02:01:35 2021 +0300

v4-0001-Test-replay-of-regression-tests.patch

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index cb401a45b3..7a10c83d8a 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -289,6 +289,17 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl'
   
  
 
+
+
+ wal_consistency_checking
+ 
+  
+   Uses wal_consistency_checking=all while running
+   some of the tests under src/test/recovery.  Not
+   enabled by default because it is resource intensive.
+  
+ 
+

 
Tests for features that are not supported by the current build
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 45d1636128..7b0af9fd78 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -484,7 +484,7 @@ sub init
 		print $conf "hot_standby = on\n";
 		# conservative settings to ensure we can run multiple postmasters:
 		print $conf "shared_buffers = 1MB\n";
-		print $conf "max_connections = 10\n";
+		print $conf "max_connections = 25\n";
 		# limit disk space consumption, too:
 		print $conf "max_wal_size = 128MB\n";
 	}
diff --git a/src/test/recovery/.gitignore b/src/test/recovery/.gitignore
index 871e943d50..03410b70a3 100644
--- a/src/test/recovery/.gitignore
+++ b/src/test/recovery/.gitignore
@@ -1,2 +1,10 @@
 # Generated by test suite
 /tmp_check/
+/results/
+/expected/
+/sql/
+/log/
+
+# Note: regression.* are only left behind on a failure; that's why they're not ignored
+#/regression.diffs
+#/regression.out
diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile
index 96442ceb4e..0b0d6094cc 100644
--- a/src/test/recovery/Makefile
+++ b/src/test/recovery/Makefile
@@ -15,6 +15,10 @@ subdir = src/test/recovery
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
+# These are created by pg_regress.
+# So we need rules to clean them.
+pg_regress_clean_files += sql/ expected/
+
 check:
 	$(prove_check)
 
@@ -22,4 +26,4 @@ installcheck:
 	$(prove_installcheck)
 
 clean distclean maintainer-clean:
-	rm -rf tmp_check
+	rm -rf $(pg_regress_clean_files)
diff --git a/src/test/recovery/t/026_stream_rep_regress.pl b/src/test/recovery/t/026_stream_rep_regress.pl
new file mode 100644
index 00..0043405739
--- /dev/null
+++ b/src/test/recovery/t/026_stream_rep_regress.pl
@@ -0,0 +1,59 @@
+# Run the standard regression tests with streaming replication
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 2;
+
+# Initialize primary node
+my $node_primary = get_new_node('primary');
+$node_primary->init(
+	allows_streaming => 1);
+
+# WAL consistency checking is resource intensive so require opt-in with the
+# PG_TEST_EXTRA environment variable.
+if ($ENV{PG_TEST_EXTRA} &&
+	$ENV{PG_TEST_EXTRA} =~ m/\bwal_consistency_checking\b/) {
+	$node_primary->append_conf('postgresql.conf',
+		'wal_consistency_checking = all');
+}
+
+$node_primary->start;
+is( $node_primary->psq

Re: A test for replay of regression tests

2021-06-07 Thread Anastasia Lubennikova
On Tue, Jun 8, 2021 at 2:25 AM Thomas Munro  wrote:

> Ok, here's a new version incorporating feedback so far.
>
> 1.  Invoke pg_regress directly (no make).
>
> 2.  Use PG_TEST_EXTRA="wal_consistency_checking" as a way to opt in to
> the more expensive test.
>
> 3.  Use parallel schedule rather than serial.  It's faster but also
> the non-determinism might discover more things.  This required
> changing the TAP test max_connections setting from 10 to 25.
>
> 4.  Remove some extraneous print statements and
> check-if-data-is-replicated-using-SELECT tests that are technically
> not needed (I had copied those from 001_stream_rep.pl).
>

Thank you for working on this test set!
I was especially glad to see the skip-tests option for pg_regress. I think
it will become a very handy tool for hackers.

To try the patch I had to resolve a few merge conflicts, see a rebased
version in attachments.

>   auth_extra   => [ '--create-role', 'repl_role' ]);
This line and the comment above it look like some copy-paste artifacts. Did
I get it right? If so, I suggest removing them.
Other than that, the patch looks good to me.

-- 
Best regards,
Lubennikova Anastasia
commit 2eeaf5802c060612831b3fdeb33401a07c230f83
Author: anastasia 
Date:   Tue Jun 8 02:01:35 2021 +0300

v3-0001-Test-replay-of-regression-tests.patch

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index cb401a45b3..7a10c83d8a 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -289,6 +289,17 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl'
   
  
 
+
+
+ wal_consistency_checking
+ 
+  
+   Uses wal_consistency_checking=all while running
+   some of the tests under src/test/recovery.  Not
+   enabled by default because it is resource intensive.
+  
+ 
+

 
Tests for features that are not supported by the current build
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 45d1636128..7b0af9fd78 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -484,7 +484,7 @@ sub init
 		print $conf "hot_standby = on\n";
 		# conservative settings to ensure we can run multiple postmasters:
 		print $conf "shared_buffers = 1MB\n";
-		print $conf "max_connections = 10\n";
+		print $conf "max_connections = 25\n";
 		# limit disk space consumption, too:
 		print $conf "max_wal_size = 128MB\n";
 	}
diff --git a/src/test/recovery/t/025_stream_rep_regress.pl b/src/test/recovery/t/025_stream_rep_regress.pl
new file mode 100644
index 00..8b125a1d67
--- /dev/null
+++ b/src/test/recovery/t/025_stream_rep_regress.pl
@@ -0,0 +1,62 @@
+# Run the standard regression tests with streaming replication
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 2;
+
+# Initialize primary node
+my $node_primary = get_new_node('primary');
+# A specific role is created to perform some tests related to replication,
+# and it needs proper authentication configuration.
+$node_primary->init(
+	allows_streaming => 1,
+	auth_extra   => [ '--create-role', 'repl_role' ]);
+
+# WAL consistency checking is resource intensive so require opt-in with the
+# PG_TEST_EXTRA environment variable.
+if ($ENV{PG_TEST_EXTRA} &&
+	$ENV{PG_TEST_EXTRA} =~ m/\bwal_consistency_checking\b/) {
+	$node_primary->append_conf('postgresql.conf',
+		'wal_consistency_checking = all');
+}
+
+$node_primary->start;
+is( $node_primary->psql(
+'postgres',
+qq[SELECT pg_create_physical_replication_slot('standby_1');]),
+0,
+'physical slot created on primary');
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = get_new_node('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf('postgresql.conf',
+"primary_slot_name = standby_1");
+$node_standby_1->start;
+
+# XXX The tablespace tests don't currently work when the standby shares a
+# filesystem with the primary, due to colliding absolute paths.  We'll skip
+# that for now.
+
+# Run the regression tests against the primary.
+system_or_bail("../regress/pg_regress",
+			   "--port=" . $node_primary->port,
+			   "--schedule=../regress/parallel_schedule",
+			   "--dlpath=../regress",
+			   "--inputdir=../regress",
+			   "--skip-tests=tablespace");
+
+# Wait for standby to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+ok(1, "caught up");
+
+$node_standby_1->stop;
+$node_primary->stop;
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index e04d365258..510afaadbf 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -83,6 +83,7 @@ static int	max_connections = 0;
 static int	max_concurrent_tests = 0;
 static char *encoding = NULL;
 static _stringlist *schedulelist = NULL;

Re: Performing partition pruning using row value

2021-02-16 Thread Anastasia Lubennikova

On 21.07.2020 11:24, kato-...@fujitsu.com wrote:

So, after looking at these functions and modifying this patch, I would like to 
add this patch to the next

I updated this patch and registered it for the next CF.

https://commitfest.postgresql.org/29/2654/

regards,
sho kato


Thank you for working on this improvement. I took a look at the code.

1) This piece of code is unneeded:

            switch (get_op_opfamily_strategy(opno, partopfamily))
            {
                case BTLessStrategyNumber:
                case BTLessEqualStrategyNumber:
                case BTGreaterEqualStrategyNumber:
                case BTGreaterStrategyNumber:

See the comment for RowCompareExpr, which states that "A RowCompareExpr 
node is only generated for the < <= > >= cases".


2) It's worth adding a regression test for this feature.

Other than that, the patch looks good to me.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: CREATE INDEX CONCURRENTLY on partitioned index

2021-02-15 Thread Anastasia Lubennikova

On 28.01.2021 17:30, Justin Pryzby wrote:

On Thu, Jan 28, 2021 at 09:51:51PM +0900, Masahiko Sawada wrote:

On Mon, Nov 30, 2020 at 5:22 AM Justin Pryzby  wrote:

On Sat, Oct 31, 2020 at 01:31:17AM -0500, Justin Pryzby wrote:

Forking this thread, since the existing CFs have been closed.
https://www.postgresql.org/message-id/flat/20200914143102.GX18552%40telsasoft.com#58b1056488451f8594b0f0ba40996afd

The strategy is to create catalog entries for all tables with indisvalid=false,
and then process them like REINDEX CONCURRENTLY.  If it's interrupted, it
leaves INVALID indexes, which can be cleaned up with DROP or REINDEX, same as
CIC on a plain table.

On Sat, Aug 08, 2020 at 01:37:44AM -0500, Justin Pryzby wrote:

On Mon, Jun 15, 2020 at 09:37:42PM +0900, Michael Paquier wrote:
Note that the mentioned problem wasn't serious: there was a missing index on a
child table, therefore the parent index was invalid, as intended.  However I
agree that it's not nice that the command can fail so easily and leave behind
some indexes created successfully, some failed, and some not created at all.

But I took your advice initially creating invalid inds.

...

That gave me the idea to layer CIC on top of Reindex, since I think it does
exactly what's needed.

On Sat, Sep 26, 2020 at 02:56:55PM -0500, Justin Pryzby wrote:

On Thu, Sep 24, 2020 at 05:11:03PM +0900, Michael Paquier wrote:

It would be good also to check if
we have a partition index tree that maps partially with a partition
table tree (aka no all table partitions have a partition index), where
these don't get clustered because there is no index to work on.

This should not happen, since an incomplete partitioned index is "invalid".



I had been waiting to rebase since there haven't been any review comments and I
expected additional, future conflicts.



I attempted to review this feature, but the last patch conflicts with 
the recent refactoring, so I wasn't able to test it properly.

Could you please send a new version?

Meanwhile, here are my questions about the patch:

1) I don't see a reason to change the logic here. We don't skip counting
existing indexes when creating the parent index. Why should we skip them in
CONCURRENTLY mode?


            // If concurrent, maybe this should be done after excluding 
indexes which already exist ?

pgstat_progress_update_param(PROGRESS_CREATEIDX_PARTITIONS_TOTAL,
                                         nparts);

2) Here we access relation field after closing the relation. Is it safe?

    /* save lockrelid and locktag for below */
    heaprelid = rel->rd_lockInfo.lockRelId;

3) The leaf_partitions() function only handles indexes, so I suggest naming
it more specifically and adding a comment about the meaning of the 'options'
parameter.


4) I don't quite understand the idea of the regression test. Why do we 
expect to see invalid indexes there?

+    "idxpart_a_idx1" UNIQUE, btree (a) INVALID

5) Speaking of documentation, I think we need to add a paragraph about
CIC on partitioned indexes which will explain that invalid indexes may
appear and what the user should do to fix them.


6) ReindexIndexesConcurrently() needs some code cleanup.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company






Re: WIP: document the hook system

2021-02-12 Thread Anastasia Lubennikova

On 17.01.2021 16:53, Magnus Hagander wrote:

On Fri, Jan 15, 2021 at 8:28 AM Peter Eisentraut
 wrote:

On 2020-12-31 04:28, David Fetter wrote:

This could probably use a lot of filling in, but having it in the
actual documentation beats needing to know folklore even to know
that the capability is there.

This patch seems quite short of a state where one could begin to
evaluate it.  Documenting the hooks better seems a worthwhile goal.   I
think the question is whether we can develop documentation that is
genuinely useful by itself without studying the relevant source code.
This submission does not address that question.

Even just having a list of available hooks would be a nice improvement though :)

But maybe it's something that belongs better in a README file instead,
since as you say it's unlikely to be properly useful without looking
at the source anyway. But just a list of hooks and a *very* high
overview of where each of them hooks in would definitely be useful to
have somewhere, I think. Having to find with "grep" whether there may
or may not exist a hook for approximately what it is you're looking
for is definitely a process to improve on.


+1 for README.
Hooks are intended for developers and can be quite dangerous without 
proper understanding of the internal code.


I also want to remind you about the README gathered by mentees [1]. It was
done under the PostgreSQL license, so we can use it.
By the way, is there any agreement on the plain-text format of
PostgreSQL README files, or can we use Markdown?


[1] https://github.com/AmatanHead/psql-hooks/blob/master/Detailed.md

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: some pointless HeapTupleHeaderIndicatesMovedPartitions calls

2021-02-12 Thread Anastasia Lubennikova
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

I wonder why this patch has been hanging around on the commitfest for so long.
The idea of the fix is clear, the code is correct and all tests pass, so I am
moving it to Ready for Committer status.

The new status of this patch is: Ready for Committer


Re: pg_upgrade fails with non-standard ACL

2021-01-25 Thread Anastasia Lubennikova

On 24.01.2021 11:39, Noah Misch wrote:

On Thu, Jan 21, 2021 at 01:03:58AM +0300, Anastasia Lubennikova wrote:

On 03.01.2021 14:29, Noah Misch wrote:

Overall, this patch predicts a subset of cases where pg_dump will emit a
failing GRANT or REVOKE that targets a pg_catalog object.  Can you write a
code comment stating what it does and does not detect?  I think it's okay to
not predict every failure, but let's record the intended coverage.  Here are a
few examples I know; there may be more.  The above query only finds GRANTs to
non-pinned roles.  pg_dump dumps the following, but the above query doesn't
find them:

   REVOKE ALL ON FUNCTION pg_last_wal_replay_lsn FROM public;
   GRANT EXECUTE ON FUNCTION pg_reload_conf() TO pg_signal_backend;

I see a new comment listing object types.  Please also explain the lack of
preventing REVOKE failures (first example query above) and the limitation
around non-pinned roles (second example).



1) Could you please clarify what you mean by REVOKE failures?

I tried following example:

Start 9.6 cluster.

REVOKE ALL ON function pg_switch_xlog() FROM public;
REVOKE ALL ON function pg_switch_xlog() FROM backup;

The upgrade to the current master passes with and without the patch.
It seems that the current implementation of pg_upgrade doesn't take
REVOKE ACLs into account.


2) As for pinned roles, I think we should fix this behavior rather than
adding a comment, because such roles can have grants on system objects.


In earlier versions of the patch, I gathered ACLs directly from the system
catalogs: pg_proc.proacl, pg_class.relacl, pg_attribute.attacl and
pg_type.typacl.


The only downside of this approach is that it cannot be easily extended
to other object types, as we need to handle every object type separately.
I don't think it is a big deal, as we already do it in
check_for_changed_signatures().


And also the query to gather non-standard ACLs won't look as nice as 
now, because of the need to parse arrays of aclitems.


What do you think?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: pg_upgrade fails with non-standard ACL

2021-01-20 Thread Anastasia Lubennikova

On 03.01.2021 14:29, Noah Misch wrote:

On Thu, Jun 11, 2020 at 07:58:43PM +0300, Anastasia Lubennikova wrote:

On 08.06.2020 19:31, Alvaro Herrera wrote:

I'm thinking what's a good way to have a test that's committable.  Maybe
it would work to add a TAP test to pg_upgrade that runs initdb, does a
few GRANTS as per your attachment, then runs pg_upgrade?  Currently the
pg_upgrade tests are not TAP ... we'd have to revive
https://postgr.es/m/20180126080026.gi17...@paquier.xyz
(Some progress was already made on the buildfarm front to permit this)

I would be glad to add some test, but it seems to me that the infrastructure
changes for cross-version pg_upgrade test is much more complicated task than
this modest bugfix.

Agreed.

Thank you for the review.
New version of the patch is attached, though I haven't tested it
properly yet. I am planning to do so in a couple of days.

--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
+static void
+check_for_changed_signatures(void)
+{
+   snprintf(output_path, sizeof(output_path), "revoke_objects.sql");

If an upgrade deleted function pg_last_wal_replay_lsn, upgrading a database in
which "REVOKE ALL ON FUNCTION pg_last_wal_replay_lsn FROM public" happened
requires a GRANT.  Can you use a file name that will still make sense if
someone enhances pg_upgrade to generate such GRANT statements?
I changed the name to 'fix_system_objects_ACL.sql'. Does it look good to 
you?





+		if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+			pg_fatal("could not open file \"%s\": %s\n",
+					 output_path, strerror(errno));

Use %m instead of passing strerror(errno) to %s.

Done.

+   }
+
+   /* Handle columns separately */
+		if (strstr(aclinfo->obj_type, "column") != NULL)
+		{
+			char	   *pdot = last_dot_location(aclinfo->obj_ident);

I'd write strrchr(aclinfo->obj_ident, '.') instead of introducing
last_dot_location().

This assumes column names don't contain '.'; how difficult would it be to
remove that assumption?  If it's difficult, a cheap approach would be to
ignore any entry containing too many dots.  We're not likely to create such
column names at initdb time, but v13 does allow:

   ALTER VIEW pg_locks RENAME COLUMN objid TO "a.b";
   GRANT SELECT ("a.b") ON pg_locks TO backup;

After which revoke_objects.sql has:

   psql:./revoke_objects.sql:9: ERROR:  syntax error at or near "") ON pg_catalog.pg_locks.""
   LINE 1: REVOKE ALL (b") ON pg_catalog.pg_locks."a FROM "backup";

While that ALTER VIEW probably should have required allow_system_table_mods,
we need to assume such databases exist.


I've fixed it by using pg_identify_object_as_address(). Now we don't 
have to parse obj_identity.




Overall, this patch predicts a subset of cases where pg_dump will emit a
failing GRANT or REVOKE that targets a pg_catalog object.  Can you write a
code comment stating what it does and does not detect?  I think it's okay to
not predict every failure, but let's record the intended coverage.  Here are a
few examples I know; there may be more.  The above query only finds GRANTs to
non-pinned roles.  pg_dump dumps the following, but the above query doesn't
find them:

   REVOKE ALL ON FUNCTION pg_last_wal_replay_lsn FROM public;
   GRANT EXECUTE ON FUNCTION pg_reload_conf() TO pg_signal_backend;

The above query should test refclassid.




This function should not run queries against servers older than 9.6, because
pg_dump emits GRANT/REVOKE for pg_catalog objects only when targeting 9.6 or
later.  An upgrade from 8.4 to master is failing on the above query:


Fixed.


The patch adds many lines wider than 78 columns, this being one example.  In
general, break such lines.  (Don't break string literal arguments of
ereport(), though.)

Fixed.

nm



--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 43fc297eb6..429e4468f2 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -16,6 +16,7 @@
 
 static void check_new_cluster_is_empty(void);
 static void check_databases_are_compatible(void);
+static void check_for_changed_signatures(void);
 static void check_locale_and_encoding(DbInfo *olddb, DbInfo *newdb);
 static bool equivalent_locale(int category, const char *loca, const char *locb);
 static void check_is_install_user(ClusterInfo *cluster);
@@ -151,6 +152,8 @@ check_and_dump_old_cluster(bool live_check)
 	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
 		new_9_0_populate_pg

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

2021-01-16 Thread Anastasia Lubennikova

On 12.01.2021 22:30, Tomas Vondra wrote:
Thanks. These patches seem to resolve the TOAST table issue, freezing 
it as expected. I think the code duplication is not an issue, but I 
wonder why heap_insert uses this condition:


    /*
 * ...
 *
 * No need to update the visibilitymap if it had all_frozen bit set
 * before this insertion.
 */
    if (all_frozen_set && ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0))

while heap_multi_insert only does this:

    if (all_frozen_set) { ... }

I haven't looked at the details, but shouldn't both do the same thing?



I decided to add this check for heap_insert() to avoid unneeded calls of 
visibilitymap_set(). If we insert tuples one by one, we can only call 
this once per page.
In my understanding, heap_multi_insert() inserts tuples in batches, so 
it doesn't need this optimization.





However, I've also repeated the test counting all-frozen pages in both 
the main table and TOAST table, and I get this:


patched
===

select count(*) from pg_visibility((select reltoastrelid from pg_class 
where relname = 't'));


 count

 12
(1 row)


select count(*) from pg_visibility((select reltoastrelid from pg_class 
where relname = 't')) where not all_visible;


 count

  0
(1 row)

That is - all TOAST pages are frozen (as expected, which is good). But 
now there are 12 pages, not just 10 pages. That is, we're now 
creating 2 extra pages, for some reason. I recall Pavan reported 
similar issue with every 32768-th page not being properly filled, but 
I'm not sure if that's the same issue.



regards



As Pavan correctly figured out before, the problem is that
RelationGetBufferForTuple() moves to the next page, losing free space in
the block:


> ... I see that a relcache invalidation arrives
> after 1st and then after every 32672th block is filled. That clears the
> rel->rd_smgr field and we lose the information about the saved target
> block. The code then moves to extend the relation again and thus 
skips the
> previously less-than-half filled block, losing the free space in that 
block.


The reason for this cache invalidation is the vm_extend() call, which happens
every 32672 blocks.


RelationGetBufferForTuple() tries to use the last page, but for some
reason this code is under the 'use_fsm' check, and COPY FROM doesn't use the
FSM (see TABLE_INSERT_SKIP_FSM).



        /*
         * If the FSM knows nothing of the rel, try the last page before we
         * give up and extend.  This avoids one-tuple-per-page syndrome during
         * bootstrapping or in a recently-started system.
         */
        if (targetBlock == InvalidBlockNumber)
        {
            BlockNumber nblocks = RelationGetNumberOfBlocks(relation);

            if (nblocks > 0)
                targetBlock = nblocks - 1;
        }


I think we can use this code without regard to 'use_fsm'. With this 
change, the number of toast rel pages is correct. The patch is attached.
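
For illustration, a schematic helper showing where that fallback would sit
once it no longer depends on use_fsm. This is only my sketch of the idea; the
attached patch against hio.c is the actual proposal, and the helper name is
hypothetical:

#include "postgres.h"
#include "storage/bufmgr.h"
#include "storage/freespace.h"
#include "utils/rel.h"

/*
 * Schematic version of the target-block selection in
 * RelationGetBufferForTuple(): consult the FSM only when allowed, but
 * always try the last existing page before extending the relation.
 */
static BlockNumber
choose_target_block(Relation relation, bool use_fsm, Size spaceNeeded)
{
    BlockNumber targetBlock = RelationGetTargetBlock(relation);

    if (targetBlock == InvalidBlockNumber && use_fsm)
        targetBlock = GetPageWithFreeSpace(relation, spaceNeeded);

    /*
     * Even when the caller skips the FSM (COPY FREEZE passes
     * TABLE_INSERT_SKIP_FSM), fall back to the last existing page, so a
     * relcache invalidation doesn't leave a half-filled block behind.
     */
    if (targetBlock == InvalidBlockNumber)
    {
        BlockNumber nblocks = RelationGetNumberOfBlocks(relation);

        if (nblocks > 0)
            targetBlock = nblocks - 1;
    }

    return targetBlock;
}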


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/pg_visibility/expected/pg_visibility.out b/contrib/pg_visibility/expected/pg_visibility.out
index ca4b6e186b..0017e3415c 100644
--- a/contrib/pg_visibility/expected/pg_visibility.out
+++ b/contrib/pg_visibility/expected/pg_visibility.out
@@ -179,6 +179,69 @@ select pg_truncate_visibility_map('test_partition');
  
 (1 row)
 
+-- test copy freeze
+create table copyfreeze (a int, b char(1500));
+-- load all rows via COPY FREEZE and ensure that all pages are set all-visible
+-- and all-frozen.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen 
+---+-+
+ 0 | t   | t
+ 1 | t   | t
+ 2 | t   | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid 
+
+(0 rows)
+
+-- load half the rows via regular COPY and rest via COPY FREEZE. The pages
+-- which are touched by regular COPY must not be set all-visible/all-frozen. On
+-- the other hand, pages allocated by COPY FREEZE should be marked
+-- all-frozen/all-visible.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen 
+---+-+
+ 0 | f   | f
+ 1 | f   | f
+ 2 | t   | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid 
+
+(0 rows)
+
+-- Try a mix of regular COPY and COPY FREEZE.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+copy copyfreeze from stdin;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen 
+---+-+
+ 0 | t   | t
+   

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

2021-01-12 Thread Anastasia Lubennikova

On 12.01.2021 00:51, Tomas Vondra wrote:



On 1/11/21 10:00 PM, Anastasia Lubennikova wrote:

On 11.01.2021 01:35, Tomas Vondra wrote:

Hi,

I started looking at this patch again, hoping to get it committed in 
this CF, but I think there's a regression in handling TOAST tables 
(compared to the v3 patch submitted by Pavan in February 2019).


The test I'm running is a very simple one (see test.sql):

1) start a transaction
2) create a table with a text column
3) copy freeze data into it
4) use pg_visibility to see how many blocks are all_visible both in the
   main table and it's TOAST table

For v3 patch (applied on top of 278584b526 and 
s/hi_options/ti_options) I get this:


            pages   NOT all_visible
  -----------------------------------
  main        637                 0
  toast     50001                 3

There was some discussion about relcache invalidations causing a 
couple TOAST pages not be marked as all_visible, etc.


However, for this patch on master I get this

            pages   NOT all_visible
  -----------------------------------
  main        637                 0
  toast     50001             50001

So no pages in TOAST are marked as all_visible. I haven't 
investigated what's causing this, but IMO that needs fixing to make 
this patch RFC.


Attached is the test script I'm using, and a v5 of the patch - 
rebased on current master, with some minor tweaks to comments etc.




Thank you for attaching the test script. I reproduced the problem. 
This regression occurs because TOAST internally uses heap_insert().

You have asked upthread about adding this optimization to heap_insert().

I wrote a quick fix, see the attached patch 0002. The TOAST test 
passes now, but I haven't tested performance or any other use-cases yet.

I'm going to test it properly in a couple of days and share results.



Thanks. I think it's important to make this work for TOAST tables - it 
often stores most of the data, and it was working in v3 of the patch. 
I haven't looked into the details, but if it's really just due to 
TOAST using heap_insert, I'd say it just confirms the importance of 
tweaking heap_insert too. 



I've tested performance. All tests were run on my laptop, on the latest master
with and without the patches, with all default settings except that autovacuum
was disabled and the pg_stat_statements extension was installed.


The VACUUM is significantly faster with the patch, as it only checks
the visibility map. COPY speed fluctuates a lot between tests, but I didn't
notice any trend.
I would expect a minor slowdown with the patch, as we need to handle
visibility map pages during COPY FREEZE. But in some runs, the patched
version was faster than current master, so the impact of the patch is
insignificant.


I run 3 different tests:

1) Regular table (final size 5972 MB)

patched   |   master

COPY FREEZE data 3 times:

33384,544 ms   31135,037 ms
31666,226 ms   31158,294 ms
32802,783 ms   33599,234 ms

VACUUM
54,185 ms         48445,584 ms


2) Table with TOAST (final size 1207 MB where 1172 MB is in toast table)

patched |   master

COPY FREEZE data 3 times:

368166,743 ms    383231,077 ms
398695,018 ms    454572,630 ms
410174,682 ms    567847,288 ms

VACUUM
43,159 ms    6648,302 ms


3) Table with a trivial BEFORE INSERT trigger (final size 5972 MB)

patched |   master

COPY FREEZE data 3 times:

73544,225 ms  64967,802 ms
90960,584 ms  71826,362 ms
81356,025 ms  80023,041 ms

VACUUM
49,626 ms    40100,097 ms

I took another look at yesterday's patch and it looks fine to me. So
now I am waiting for your review.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

2021-01-11 Thread Anastasia Lubennikova

On 11.01.2021 01:35, Tomas Vondra wrote:

Hi,

I started looking at this patch again, hoping to get it committed in 
this CF, but I think there's a regression in handling TOAST tables 
(compared to the v3 patch submitted by Pavan in February 2019).


The test I'm running is a very simple one (see test.sql):

1) start a transaction
2) create a table with a text column
3) copy freeze data into it
4) use pg_visibility to see how many blocks are all_visible both in the
   main table and it's TOAST table

For v3 patch (applied on top of 278584b526 and 
s/hi_options/ti_options) I get this:


            pages   NOT all_visible
  -----------------------------------
  main        637                 0
  toast     50001                 3

There was some discussion about relcache invalidations causing a 
couple of TOAST pages not to be marked as all_visible, etc.


However, for this patch on master I get this

            pages   NOT all_visible
  -----------------------------------
  main        637                 0
  toast     50001             50001

So no pages in TOAST are marked as all_visible. I haven't investigated 
what's causing this, but IMO that needs fixing to make this patch RFC.


Attached is the test script I'm using, and a v5 of the patch - rebased 
on current master, with some minor tweaks to comments etc.




Thank you for attaching the test script. I reproduced the problem. This 
regression occurs because TOAST internally uses heap_insert().

You have asked upthread about adding this optimization to heap_insert().

I wrote a quick fix, see the attached patch 0002. The TOAST test passes 
now, but I haven't tested performance or any other use-cases yet.

I'm going to test it properly in a couple of days and share results.

With this change a lot of new code is repeated in heap_insert() and 
heap_multi_insert(). I think it's fine, because these functions already 
have a lot in common.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From 6ccaa4f526b38fbdd2f3a38ac3bd51e96fb140b6 Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Sun, 10 Jan 2021 20:30:29 +0100
Subject: [PATCH] Set PD_ALL_VISIBLE and visibility map bits in COPY FREEZE

Make sure COPY FREEZE marks the pages as PD_ALL_VISIBLE and updates the
visibility map. Until now it only marked individual tuples as frozen,
but page-level flags were not updated.

This is a fairly old patch, and multiple people worked on it. The first
version was written by Jeff Janes, and then reworked by Pavan Deolasee
and Anastasia Lubennikova.

Author: Pavan Deolasee, Anastasia Lubennikova, Jeff Janes
Reviewed-by: Kuntal Ghosh, Jeff Janes, Tomas Vondra, Masahiko Sawada, Andres Freund, Ibrar Ahmed, Robert Haas, Tatsuro Ishii
Discussion: https://postgr.es/m/caboikdn-ptgv0mzntrk2q8otfuuajqaymgmkdu1dckftuxv...@mail.gmail.com
Discussion: https://postgr.es/m/CAMkU%3D1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ%40mail.gmail.com
---
 .../pg_visibility/expected/pg_visibility.out  | 64 +++
 contrib/pg_visibility/sql/pg_visibility.sql   | 77 +++
 src/backend/access/heap/heapam.c  | 76 --
 src/backend/access/heap/hio.c | 17 
 src/include/access/heapam_xlog.h  |  3 +
 5 files changed, 229 insertions(+), 8 deletions(-)

diff --git a/contrib/pg_visibility/expected/pg_visibility.out b/contrib/pg_visibility/expected/pg_visibility.out
index ca4b6e186b..0017e3415c 100644
--- a/contrib/pg_visibility/expected/pg_visibility.out
+++ b/contrib/pg_visibility/expected/pg_visibility.out
@@ -179,6 +179,69 @@ select pg_truncate_visibility_map('test_partition');
  
 (1 row)
 
+-- test copy freeze
+create table copyfreeze (a int, b char(1500));
+-- load all rows via COPY FREEZE and ensure that all pages are set all-visible
+-- and all-frozen.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen 
+---+-+
+ 0 | t   | t
+ 1 | t   | t
+ 2 | t   | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid 
+
+(0 rows)
+
+-- load half the rows via regular COPY and rest via COPY FREEZE. The pages
+-- which are touched by regular COPY must not be set all-visible/all-frozen. On
+-- the other hand, pages allocated by COPY FREEZE should be marked
+-- all-frozen/all-visible.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen 
+---+-+
+ 0 | f   | f
+ 1 | f   | f
+ 2 | t   | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid 
+
+(0 rows)
+
+-- Try a mix of regular COPY and COPY FREEZE.
+begin;
+trunc

Re: Commitfest 2020-11 is closed

2020-12-03 Thread Anastasia Lubennikova

On 02.12.2020 23:59, Tom Lane wrote:

Anastasia Lubennikova  writes:

Commitfest 2020-11 is officially closed now.
Many thanks to everyone who participated by posting patches, reviewing
them, committing and sharing ideas in discussions!

Thanks for all the hard work!


Today, Georgios and I will move the remaining items to the next CF or
return them with feedback. We're planning to leave Ready For Committer
till the end of the week, to make them more visible and let them get the
attention they deserve.

This is actually a bit problematic, because now the cfbot is ignoring
those patches (or if it's not, I don't know where it's displaying the
results).  Please go ahead and move the remaining open patches, or
else re-open the CF if that's possible.

regards, tom lane


Oh, I wasn't aware of that. Thank you for the reminder.

Now all patches are moved to the next CF.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: BUG #15383: Join Filter cost estimation problem in 10.5

2020-12-01 Thread Anastasia Lubennikova

On 30.10.2020 19:33, David G. Johnston wrote:
On Fri, Oct 30, 2020 at 9:16 AM Anastasia Lubennikova  wrote:


Status update for a commitfest entry.

It looks like there was no real progress on this issue since
April. I see only an experimental patch. What kind of review does
it need right now? Do we need more testing or maybe production
statistics for such queries?

David, are you going to continue working on it?
Can you, please, provide a short summary of the problem and list
open questions for reviewers.

The new status of this patch is: Waiting on Author


I agree that updating a wiki seems like an unappealing conclusion to 
this entry.  I'm on the fence whether we just leave it in the cf and 
get a periodic nag when it gets bumped.  As we are describing real 
problems or potential improvements to live portions of the codebase 
ISTM that adding code comments near to those locations would be 
warranted.  The commit message can reference this thread.  There seems 
to already be a convention for annotating code comments in such a way 
that they can be searched for - which basically is what sticking it in 
the wiki would accomplish but the searcher would have to look in the 
codebase and not on a website.  For the target audience of this 
specific patch that seems quite reasonable.


David J.


Here we are again.
The commitfest is closed now and the discussion hasn't moved from where
it was a month ago.
I don't see any use in moving it to the next CF as is and I don't want 
to return it either, as this is clearly a bug.


I think we have a few options:

1. We can add it to the TODO list until better times.
2. We can write a patch to address this problem in code comments.
3. We can maybe CC this thread to someone and ask for help. Do you know
people who have expertise in this area?


None of these options is perfect, but the second option will perhaps be
a good compromise.

David, can you please submit a patch?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Reduce the time required for a database recovery from archive.

2020-12-01 Thread Anastasia Lubennikova

On 09.11.2020 19:31, Stephen Frost wrote:

Greetings,

* Dmitry Shulga (d.shu...@postgrespro.ru) wrote:

On 19 Oct 2020, at 23:25, Stephen Frost  wrote:

process finishes a WAL file but then just sit around doing nothing while
waiting for the applying process to finish another segment.

I believe that for a typical set-up the parameter max_restore_command_workers
would have a value of 2 or 3, in order to supply a delivered WAL file just in
time, before its processing starts.

This use case is for environments where the time required for delivering a WAL
file from the archive is greater than the time required for applying the
records contained in the WAL file.
If the time required for delivering a WAL file is less than the time required
for handling the records contained in it, then max_restore_command_workers
shouldn't be specified at all.

That's certainly not correct at all- the two aren't really all that
related, because any time spent waiting for a WAL file to be delivered
is time that the applying process *could* be working to apply WAL
instead of waiting.  At a minimum, I'd expect us to want to have, by
default, at least one worker process running out in front of the
applying process to hopefully eliminate most, if not all, time where the
applying process is waiting for a WAL to show up.  In cases where the
applying process is faster than a single fetching process, a user might
want to have two or more restore workers, though ultimately I still
contend that what they really want is as many workers as needed to make
sure that the applying process doesn't ever need to wait- up to some
limit based on the amount of space that's available.

And back to the configuration side of this- have you considered the
challenge that a user who is using very large WAL files might run
into with the proposed approach that doesn't allow them to control the
amount of space used?  If I'm using 1G WAL files, then I need to have
16G available to have *any* pre-fetching done with this proposed
approach, right?  That doesn't seem great.

Thanks,

Stephen


Status update for a commitfest entry.

The commitfest is closed now. As this entry has been Waiting on Author 
for a while, I've marked it as returned with feedback. Dmitry, feel free 
to resubmit an updated version to a future commitfest.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Corruption during WAL replay

2020-12-01 Thread Anastasia Lubennikova

On 06.11.2020 14:40, Masahiko Sawada wrote:


So I agree to
proceed with the patch that adds a critical section, independently of
fixing the other related things discussed in this thread. If Teja doesn't
seem to be working on this, I'll write the patch.

Regards,


--
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/



Status update for a commitfest entry.

The commitfest is closed now. As this entry is a bug fix, I am moving it 
to the next CF.

Are you planning to continue working on it?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Corner-case bug in pg_rewind

2020-12-01 Thread Anastasia Lubennikova

On 16.11.2020 05:49, Ian Lawrence Barwick wrote:


Note that the patch  may require reworking for HEAD due to changes in
commit 9c4f5192f6. I'll try to take another look this week.


Regards

Ian Barwick


--
EnterpriseDB: https://www.enterprisedb.com




Status update for a commitfest entry.

The patch is Waiting on Author for some time. As this is a bug fix, I am 
moving it to the next CF.

Ian, are you planning to continue working on it?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Unnecessary delay in streaming replication due to replay lag

2020-12-01 Thread Anastasia Lubennikova

On 20.11.2020 11:21, Michael Paquier wrote:

On Tue, Sep 15, 2020 at 05:30:22PM +0800, lchch1...@sina.cn wrote:

I read the code and tested the patch; it runs well on my side, and I have several
issues with the patch.

+   RequestXLogStreaming(ThisTimeLineID,
+startpoint,
+PrimaryConnInfo,
+PrimarySlotName,
+wal_receiver_create_temp_slot);

This patch thinks that it is fine to request streaming even if
PrimaryConnInfo is not set, but that's not fine.

Anyway, I don't quite understand what you are trying to achieve here.
"startpoint" is used to request the beginning of streaming.  It is
roughly the consistency LSN + some alpha with some checks on WAL
pages (those WAL page checks are not acceptable as they make
maintenance harder).  What about the case where consistency is
reached but there are many segments still ahead that need to be
replayed?  Your patch would cause streaming to begin too early, and
a manual copy of segments is not a rare thing as in some environments
a bulk copy of segments can make the catchup of a standby faster than
streaming.

It seems to me that what you are looking for here is some kind of
pre-processing before entering the redo loop to determine the LSN
that could be reused for the fast streaming start, which should match
the end of the WAL present locally.  In short, you would need a
XLogReaderState that begins a scan of WAL from the redo point until it
cannot find anything more, and use the last LSN found as a base to
begin requesting streaming.  The question of timeline jumps can also
be very tricky, but it could also be possible to not allow this option
if a timeline jump happens while attempting to guess the end of WAL
ahead of time.  Another thing: could it be useful to have an extra
mode to begin streaming without waiting for consistency to finish?
--
Michael
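
To illustrate the kind of pre-scan being described, here is a rough sketch (the
function name is invented; the XLogReaderState is assumed to be allocated
elsewhere with a local page-read callback, and the exact reader API differs
between major versions):

static XLogRecPtr
find_local_end_of_wal(XLogReaderState *reader, XLogRecPtr redo_ptr)
{
	XLogRecPtr	last_lsn = redo_ptr;
	char	   *errormsg;

	/* start at the redo point and follow whatever WAL is present locally */
	XLogBeginRead(reader, redo_ptr);

	for (;;)
	{
		XLogRecord *record = XLogReadRecord(reader, &errormsg);

		if (record == NULL)
			break;				/* no more complete records readable locally */
		last_lsn = reader->EndRecPtr;
	}

	/* use this as the base LSN when requesting streaming */
	return last_lsn;
}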



Status update for a commitfest entry.

This entry was "Waiting On Author" during this CF, so I've marked it as 
returned with feedback. Feel free to resubmit an updated version to a 
future commitfest.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Strange behavior with polygon and NaN

2020-12-01 Thread Anastasia Lubennikova

On 25.11.2020 11:14, Kyotaro Horiguchi wrote:

At Wed, 25 Nov 2020 11:39:39 +0900 (JST), Kyotaro Horiguchi 
 wrote in

So that line of thought prompts me to tread *very* carefully when
trying to dodge NaN results.  We need to be certain that we
introduce only logically-defensible special cases.  Something like
float8_coef_mul() seems much more likely to lead us into errors
than away from them.

Agreed on that point.  I'm going to rewrite the patch in that
direction.

Removed the function float8_coef_mul().


I noticed that the check you proposed to add to line_closept_point
doesn't work for the following case:

select line('{1,-1,0}') <-> point(1e300, 'Infinity');

Ax + By + C = 1 * 1e300 + -1 * Inf + 0 = -Inf is not NaN so we go on
the following steps.

derive the perpendicular line: => line(-1, -1, Inf}
derive the cross point   : => point(Inf, Inf)
calculate the distance   : => NaN  (which should be Infinity)

So I left the check for whether the distance is NaN in this version. In the previous
version the check is done before directly calculating the distance, but since
we already have the result of Ax+By+C, I decided not to use point_dt() in this
version.

Although I wrote that it should be wrong to apply FPzero() to
coefficients, there are some places already doing that, so I followed
those precedents.


Reverted the change of pg_hypot().


While checking the regression results, I noticed the following
calculation, which seems wrong.

select line('{3,NaN,5}') = line('{3,NaN,5}');
  ?column?
--
  t

But after looking at point_eq(), I decided to leave the behavior alone
since I'm not sure about the reason for the behavior of these functions. At
least the comment for point_eq() says that it is the deliberate
behavior. box_same and poly_same are based on point_eq_point, so they behave
the same way.


By the way, '=' doesn't compare the shape but compares the area.
However, what is the area of a line?  That should always be 0 even if
we considered it. And it is also strange that we don't have
corresponding comparison ('<' and so on) operators.  It seems to me like
a mistake for '~='.  If that is correct, I should revert the change of
line_eq() along with fixing the operator assignment.

regards.



Status update for a commitfest entry.

The commitfest is closed now and this entry is "Waiting on author".
As far as I see, part of the fixes is already committed. Is there
anything left to work on, or does this patch need review / is it ready for
committer now?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Terminate the idle sessions

2020-12-01 Thread Anastasia Lubennikova

On 25.11.2020 05:18, Li Japin wrote:




On Nov 24, 2020, at 11:20 PM, David G. Johnston <david.g.johns...@gmail.com> wrote:


On Mon, Nov 23, 2020 at 11:22 PM Li Japin <japi...@hotmail.com> wrote:



How about using “foreign-data wrapper” to replace “postgres_fdw”?


I don't see much value in avoiding mentioning that specific term - my 
proposal turned it into an example instead of being exclusive.



-         This parameter should be set to zero if you use some connection-pooling software,
-         or pg servers used by postgres_fdw, because connections might be closed unexpectedly.
+         This parameter should be set to zero if you use connection-pooling software,
+         or PostgreSQL servers connected to using foreign-data
+         wrapper, because connections might be closed unexpectedly.
         


Maybe:

+ or your PostgreSQL server receives connections from postgres_fdw or 
similar middleware.

+ Such software is expected to self-manage its connections.


Thank you for your suggestion and patience! Fixed.

```
+        
+         This parameter should be set to zero if you use connection-pooling software,
+         or PostgreSQL servers connected to using postgres_fdw
+         or similar middleware (such software is expected to self-manage its connections),
+         because connections might be closed unexpectedly.
+        
```

--
Best regards
Japin Li



Status update for a commitfest entry.
As far as I see, all recommendations from reviewers were addressed in 
the last version of the patch.


It passes CFbot successfully, so I move it to Ready For Committer.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Commitfest 2020-11 is closed

2020-12-01 Thread Anastasia Lubennikova

Hi, hackers!

Commitfest 2020-11 is officially closed now.
Many thanks to everyone who participated by posting patches, reviewing 
them, committing and sharing ideas in discussions!


Today, Georgios and I will move the remaining items to the next CF or 
return them with feedback. We're planning to leave the Ready For Committer 
entries till the end of the week, to make them more visible and let them get the 
attention they deserve. And finally, over the weekend, I'll gather and share 
some statistics.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: enable_incremental_sort changes query behavior

2020-12-01 Thread Anastasia Lubennikova

On 01.12.2020 03:08, James Coleman wrote:

On Tue, Nov 3, 2020 at 4:39 PM Tomas Vondra
 wrote:

I've pushed the 0001 part, i.e. the main fix. Not sure about the other
parts (comments and moving the code back to postgres_fdw) yet.

I noticed the CF entry [1] got moved to the next CF; I'm thinking this
entry should be marked as committed since the fix for the initial bug
reported on this thread has been pushed. We have the parallel safety
issue outstanding, but there's a separate thread and patch for that,
so I'll make a new CF entry for that.

I can mark it as committed, but I'm not sure how to "undo" (or if
that's desirable) the move to the next CF.

Thoughts?

James

1: https://commitfest.postgresql.org/30/2754/



Oops...
I must have rushed with this one, thank you for noticing.
I don't see how to move it back either. I think it's fine to mark it as 
Committed where it is now.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [patch] [doc] Introduce view updating options more succinctly

2020-11-30 Thread Anastasia Lubennikova
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:tested, passed

I wonder why this patch didn't get a review during the CF.
This minor improvement looks good to me, so I mark it Ready for Committer.

The new status of this patch is: Ready for Committer


Re: [DOC] Document concurrent index builds waiting on each other

2020-11-30 Thread Anastasia Lubennikova
Status update for a commitfest entry.

The commitfest is nearing the end and I wonder what this discussion is waiting 
for.
It looks like the proposed patch received its fair share of review, so I mark 
it as Ready for Committer and leave the final decision to the committer.

The new status of this patch is: Ready for Committer


Re: Change JOIN tutorial to focus more on explicit joins

2020-11-30 Thread Anastasia Lubennikova
Status update for a commitfest entry.

The commitfest is nearing the end and this thread was inactive for a while. As 
far as I see, something got committed, and now the discussion is stuck arguing 
about parentheses.
FWIW, I think it is a matter of personal taste. Maybe we can compromise on 
simply leaving this part unchanged.

If you are planning to continue working on it, please move it to the next CF.

Re: [PATCH] remove deprecated v8.2 containment operators

2020-11-30 Thread Anastasia Lubennikova

On 16.11.2020 23:55, Justin Pryzby wrote:

On Fri, Nov 13, 2020 at 10:03:43AM -0500, Stephen Frost wrote:

* Magnus Hagander (mag...@hagander.net) wrote:

On Thu, Nov 12, 2020 at 11:28 PM Tom Lane  wrote:

The changes to the contrib modules appear to be incomplete in some ways.
   In cube, hstore, and seg, there are no changes to the extension
scripts to remove the operators.  All you're doing is changing the C
code to no longer recognize the strategy, but that doesn't explain what
will happen if the operator is still used.  In intarray, by contrast,
you're editing an existing extension script, but that should be done by
an upgrade script instead.

In the contrib modules, I'm afraid what you gotta do is remove the
SQL operator definitions but leave the opclass code support in place.
That's because there's no guarantee that users will update the extension's
SQL version immediately, so a v14 build of the .so might still be used
with the old SQL definitions.  It's not clear how much window we need
give for people to do that update, but I don't think "zero" is an
acceptable answer.

Based on my experience from the field, the answer is "never".

As in, most people have no idea they are even *supposed* to do such an
upgrade, so they don't do it. Until we solve that problem, I think
we're basically stuck with keeping them "forever". (and even if/when
we do, "zero" is probably not going to cut it, no)

Yeah, this is a serious problem and one that we should figure out a way
to fix or at least improve on- maybe by having pg_upgrade say something
about extensions that could/should be upgraded..?

I think what's needed are:

1) a way to *warn* users about deprecation.  CREATE EXTENSION might give an
elog(WARNING), but it's probably not enough.  It only happens once, and if it's
in pg_restore/pg_upgrade, it may be wrapped by vendor upgrade scripts.  It needs to
be more like a warning on the first function call in every session.  Or more likely, put it in
the documentation for 10 years.

2) a way to *enforce* it, like making CREATE EXTENSION fail when run against an
excessively old server, including by pg_restore/pg_upgrade (which ought to also
handle it in --check).

Are there any contrib modules for which (1) is done, and are we anywhere near doing (2)?
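
As a purely illustrative sketch of option (1), the kind of once-per-session
warning a contrib module's C support function could emit (the function name and
messages are made up for the example; nothing like this exists in the patches
discussed here):

static bool deprecation_warned = false;

static void
warn_deprecated_operator(const char *opname)
{
	/* warn only on the first use in each session, as suggested above */
	if (!deprecation_warned)
	{
		ereport(WARNING,
				(errmsg("operator \"%s\" is deprecated and will be removed", opname),
				 errhint("Update the extension with ALTER EXTENSION ... UPDATE.")));
		deprecation_warned = true;
	}
}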



Status update for a commitfest entry.

The commitfest is nearing the end and this thread is "Waiting on 
Author". As far as I see, we don't have a patch here and the discussion is 
a bit stuck.

So, I am planning to return it with feedback. Any objections?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Dumping/restoring fails on inherited generated column

2020-11-30 Thread Anastasia Lubennikova

On 09.11.2020 13:43, Peter Eisentraut wrote:

On 2020-11-06 04:55, Masahiko Sawada wrote:

Both of these result in the same change to the dump output.  Both of
them have essentially the same idea.  The first one adds the
conditionals during the information gathering phase of pg_dump, the
second one adds the conditionals during the output phase.

Any further thoughts?

I think the first one is better than the second (mine) because it can
reduce the number of intermediate objects.


I was hoping to wrap this issue up this week, but I found more 
problems with how these proposed changes interact with 
--binary-upgrade mode.  I think I need to formalize my findings into 
pg_dump test cases as a next step.  Then we can figure out what 
combination of tweaks will make them all work.


I am moving this patch to the next CF, but it looks like the discussion 
is a bit stuck.



Peter, can you please share your concerns about the interaction of the 
patch with --binary-upgrade mode? If you don't have time to write tests, 
you can just describe problems.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY

2020-11-30 Thread Anastasia Lubennikova

On 04.11.2020 02:56, Alvaro Herrera wrote:

Here's an updated version of this patch.

Apart from rebasing to current master, I made the following changes:

* On the first transaction (the one that marks the partition as
detached), the partition is locked with ShareLock rather than
ShareUpdateExclusiveLock.  This means that DML is not allowed anymore.
This seems a reasonable restriction to me; surely by the time you're
detaching the partition you're not inserting data into it anymore.

* In ExecInitPartitionDispatchInfo, the previous version always
excluded detached partitions.  Now it does include them in isolation
level repeatable read and higher.  Considering the point above, this
sounds a bit contradictory: you shouldn't be inserting new tuples in
partitions being detached.  But if you do, it makes more sense: in RR
two queries that insert tuples in the same partition would not fail
mid-transaction.  (In a read-committed transaction, the second query
does fail, but to me that does not sound surprising.)

* ALTER TABLE .. DETACH PARTITION FINALIZE now waits for concurrent old
snapshots, as previously discussed. This should ensure that the user
doesn't just cancel the wait during the second transaction by Ctrl-C and
run FINALIZE immediately afterwards, which I claimed would bring
inconsistency.

* Avoid creating the CHECK constraint if an identical one already
exists.

(Note I do not remove the constraint on ATTACH.  That seems pointless.)

Still to do: test this using the new hook added by 6f0b632f083b.


Status update for a commitfest entry.

The commitfest is nearing the end and this thread is "Waiting on Author".
As far as I see, the last message contains a patch. Is there anything 
left to work on, or does it need review now? Alvaro, are you planning to 
continue working on it?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Asymmetric partition-wise JOIN

2020-11-30 Thread Anastasia Lubennikova

On 09.11.2020 13:53, Anastasia Lubennikova wrote:

On 21.08.2020 09:02, Andrey V. Lepikhov wrote:

On 7/1/20 2:10 PM, Daniel Gustafsson wrote:

On 27 Dec 2019, at 08:34, Kohei KaiGai  wrote:


The attached v2 fixed the problem, and regression test finished 
correctly.


This patch no longer applies to HEAD, please submit a rebased version.
Marking the entry Waiting on Author in the meantime.

Rebased version of the patch on current master (d259afa736).

I rebased it because it is the base of my experimental feature: we 
don't break a partitionwise join of a relation with a foreign partition 
and a local relation if we have info that the remote server has a foreign 
table linked to the local relation (by analogy with shippable extensions).


Maybe mark as 'Needs review'?


Status update for a commitfest entry.

According to cfbot, the patch fails to apply. Could you please send a 
rebased version?


This thread was inactive for quite some time. Is anyone going to 
continue working on it?


I see some interest in the idea of sharable hash, but I don't see even 
a prototype in this thread. So, probably, it is a matter of a separate 
discussion.


Also, I took a look at the code. It looks like it needs some extra 
work. I am not a big expert in this area, so I'm sorry if questions 
are obvious.


1. What would happen if this assumption is not met?

+         * MEMO: We assume this pathlist keeps at least one 
AppendPath that

+         * represents partitioned table-scan, symmetric or asymmetric
+         * partition-wise join. It is not correct right now, however, 
a hook
+         * on add_path() to give additional decision for path removel 
allows

+         * to retain this kind of AppendPath, regardless of its cost.

2. Why do we wrap extract_asymmetric_partitionwise_subjoin() call into 
PG_TRY/PG_CATCH? What errors do we expect?


3. It looks like a crutch. If it isn't, I'd like to see a better 
comment about why "dynamic programming" is not applicable here.

And shouldn't we also handle a root->join_cur_level?

+                /* temporary disables "dynamic programming" algorithm */
+                root->join_rel_level = NULL;

4. This change looks like it can lead to a memory leak for old code. 
Maybe it is never the case, but again I think it is worth a comment.


-    /* If there's nothing to adjust, don't call this function. */
-    Assert(nappinfos >= 1 && appinfos != NULL);
+    /* If there's nothing to adjust, just return a duplication */
+    if (nappinfos == 0)
+        return copyObject(node);

5. extract_asymmetric_partitionwise_subjoin() lacks a comment

The new status of this patch is: Waiting on Author


Status update for a commitfest entry.

This entry was inactive during this CF, so I've marked it as returned 
with feedback. Feel free to resubmit an updated version to a future 
commitfest.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [PATCH] Runtime control of CLOBBER_CACHE_ALWAYS

2020-11-30 Thread Anastasia Lubennikova

On 27.10.2020 11:34, Peter Eisentraut wrote:

On 2020-09-25 09:40, Craig Ringer wrote:
While working on extensions I've often wanted to enable cache 
clobbering for a targeted piece of code, without paying the price of 
running it for all backends during postgres startup and any initial 
setup tasks that are required.


So here's a patch that, when CLOBBER_CACHE_ALWAYS or 
CLOBBER_CACHE_RECURSIVE are defined, adds a GUC named 
clobber_cache_depth . It defaults to 1 for CLOBBER_CACHE_ALWAYS or 3 
for CLOBBER_CACHE_RECURSIVE to match the existing compiled-in 
behaviour. But with this change it's now possible to run Pg with 
clobber_cache_depth=0 then set it to 1 only for targeted tests.


clobber_cache_depth is treated as an unknown GUC if Pg was not built 
with CLOBBER_CACHE_ALWAYS or CLOBBER_CACHE_RECURSIVE defined.


I think it would be great if the cache clobber code were always compiled 
in (but turned off) when building with assertions.  That would reduce 
the number of hoops to jump through to actually use this locally and 
reduce the number of surprises from the build farm.


For backward compatibility, you can arrange it so that the built-in 
default of clobber_cache_depth is 1 or 3 if CLOBBER_CACHE_ALWAYS or 
CLOBBER_CACHE_RECURSIVE are defined.
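
For illustration only, a minimal sketch of what such a developer GUC entry in
guc.c could look like, with the built-in default following the compile-time
macros as suggested above (the name, context and limits here are assumptions,
not the actual patch):

/* hypothetical guc.c entry */
{
	{"clobber_cache_depth", PGC_SUSET, DEVELOPER_OPTIONS,
		gettext_noop("Invalidates system caches on every access, to the given recursion depth."),
		NULL,
		GUC_NOT_IN_SAMPLE
	},
	&clobber_cache_depth,
#if defined(CLOBBER_CACHE_RECURSIVE)
	3,				/* keep the old compiled-in behaviour */
#elif defined(CLOBBER_CACHE_ALWAYS)
	1,
#else
	0,
#endif
	0, 5,
	NULL, NULL, NULL
},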




Status update for a commitfest entry.

This thread was inactive during the commitfest, so I am going to mark it 
as "Returned with Feedback".
Craig, are you planning to continue working on it? The proposed idea is 
great and it looks like the patch needs only a minor improvement.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: pgbench - test whether a variable exists

2020-11-30 Thread Anastasia Lubennikova

On 30.11.2020 14:53, Fabien COELHO wrote:



The new status of this patch is: Waiting on Author


This patch was inactive during the commitfest, so I am going to mark 
it as "Returned with Feedback".

Fabien, are you planning to continue working on it?


Not in the short term, but probably for the next CF. Can you park it 
there?



Sure, I'll move it to the next CF then.
I also noticed that the first message mentions the idea of refactoring 
to use some code in both pgbench and psql. Can you please 
share a link to the thread, if it exists?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Improving psql slash usage help message

2020-11-29 Thread Anastasia Lubennikova

On 02.11.2020 18:02, Anastasia Lubennikova wrote:

Status update for a commitfest entry.
This thread was inactive for a while. Is anyone going to continue working on it?

My two cents on the topic:
I don’t see it as a big problem in the first place. In the source code, \dE 
refers to foreign tables and \de refers to foreign servers. So, it seems more 
reasonable to me to rename the latter.
   \dE[S+] [PATTERN]  list foreign tables
   \det[+] [PATTERN]  list foreign servers

I also do not support merging the entries for different commands. I think that 
current help output is easier to read.

The new status of this patch is: Waiting on Author



Status update for a commitfest entry.

This entry was inactive during this CF, so I've marked it as returned 
with feedback. Feel free to resubmit an updated version to a future 
commitfest.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: proposal: function pg_setting_value_split() to parse shared_preload_libraries etc.

2020-11-29 Thread Anastasia Lubennikova

On 23.10.2020 05:06, Ian Lawrence Barwick wrote:

Having just submitted this, I realised I'm focussing on the GUCs which call
"SplitDirectoriesString()" (as my specific uses case is for
"shared_preload_libraries")
but the patch does not consider the other GUC_LIST_INPUT settings, which
call "SplitIdentifierString()", so as is, it might produce unexpected
results for those.

I'll rethink and submit an updated version.


Status update for a commitfest entry.

This entry was "Waiting on author" during this CF. As I see, the patch 
needs more work before review, so I changed it to "Withdrawn".

Feel free to resubmit an updated version to a future commitfest.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: archive status ".ready" files may be created too early

2020-11-29 Thread Anastasia Lubennikova

On 14.10.2020 03:06, Kyotaro Horiguchi wrote:

The patch includes a fix for the primary->standby case. But I'm not sure
we can do that in the cascaded case. A standby is not aware of the
structure of a WAL blob and has no idea up to where to send the
received blobs. However, if we can rely on the behavior of CopyData,
that we always receive a blob as a whole, sent from the sender at once,
the cascaded standbys are free from the trouble (as long as the
cascaded standby doesn't crash just before writing the last half of a
record into pg_wal and after archiving the last full segment, which
seems unlikely).

regards.



Status update for a commitfest entry.

This entry was "Waiting on author" during this CF. As I see, the latest 
message contains a new version of the patch.

Does it need more work? Are you going to continue working on it?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: pgbench - test whether a variable exists

2020-11-29 Thread Anastasia Lubennikova

On 20.10.2020 15:22, Ibrar Ahmed wrote:

The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   tested, failed
Documentation:not tested

I am not very convinced we should have that, but I have performed some testing on 
master ().

I found a crash

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `pgbench postgres -T 60 -f a.sql'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x55bfb572a4b2 in expr_yylex (yylval_param=0x7fff29a60870, 
yyscanner=0x55bfb6867a90) at exprscan.c:1107
1107*yy_cp = yyg->yy_hold_char;
(gdb) bt
#0  0x55bfb572a4b2 in expr_yylex (yylval_param=0x7fff29a60870, 
yyscanner=0x55bfb6867a90) at exprscan.c:1107
#1  0x55bfb572cb63 in expr_lex_one_word (state=0x55bfb6867a10, 
word_buf=0x7fff29a608d0, offset=0x7fff29a608a8) at exprscan.l:340
#2  0x55bfb57358fd in process_backslash_command (sstate=0x55bfb6867a10, 
source=0x55bfb68653a0 "a.sql") at pgbench.c:4540
#3  0x55bfb5736528 in ParseScript (
 script=0x55bfb68655f0 "\\set nbranches :scale\n\\set ntellers 10 * :scale\n\\set 
naccounts 10 * :scale\n\\setrandom aid 1 :naccounts\n\\setrandom bid 1 :nbranches\n\\setrandom 
tid 1 :ntellers\n\\setrandom delta -5000 5000\nBEGIN;\nUPD"..., desc=0x55bfb68653a0 
"a.sql", weight=1) at pgbench.c:4826
#4  0x55bfb5736967 in process_file (filename=0x55bfb68653a0 "a.sql", 
weight=1) at pgbench.c:4962
#5  0x55bfb5738628 in main (argc=6, argv=0x7fff29a610d8) at pgbench.c:5641

The new status of this patch is: Waiting on Author


This patch was inactive during the commitfest, so I am going to mark it 
as "Returned with Feedback".

Fabien, are you planning to continue working on it?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Extending range type operators to cope with elements

2020-11-27 Thread Anastasia Lubennikova

On 31.10.2020 01:08, Tomas Vondra wrote:

Hi,

On Fri, Oct 30, 2020 at 04:01:27PM +, Georgios Kokolatos wrote:

Hi,

thank you for your contribution.

I did notice that the cfbot [1] is failing for this patch.
Please try to address the issues if you can for the upcoming commitfest.



I took a look at the patch today - the regression failure was trivial,
the expected output for one query was added to the wrong place, a couple
lines off the proper place. Attached is an updated version of the patch,
fixing that.

I also reviewed the code - it seems pretty clean and in line with the
surrounding code in rangetypes.c. Good job Esteban! I'll do a bit more
review next week, and I'll see if I can get it committed.

regards



CFM reminder, just in case you forgot about this thread :)
The commitfest is nearing the end. Tomas, will you have time to push 
this patch?


The patch still applies and passes all cfbot checks. I also took a quick 
look at the code and everything looks good to me.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Removing unneeded self joins

2020-11-27 Thread Anastasia Lubennikova

On 31.10.2020 12:26, Andrey V. Lepikhov wrote:

Thank you for this partial review, I included your changes:

On 9/23/20 9:23 AM, David Rowley wrote:
On Fri, 3 Apr 2020 at 17:43, Andrey Lepikhov 
 wrote:

Doing things the way I describe will allow you to get rid of all the
UniqueRelInfo stuff.

Thanks for the review and sorry for the late reply.
I fixed the small mistakes mentioned in your letter.
Also, I rewrote this patch at your suggestion [1].
Because of the many changes, this patch can be viewed as a sketch.

To change the self-join detection algorithm I used your delta patch from 
[2]. I added complex expression handling to the split_selfjoin_quals 
routine for demonstration. But it is not very useful with the current 
infrastructure, I think.


Also, I implemented one additional way for the self-join detection 
algorithm: if the join target list doesn't contain vars from the inner 
relation, then we can detect a self-join with only quals like a1.x=a2.y, 
provided the innerrel_is_unique check is true.
Analysis of the target list is contained in the new routine - 
tlist_contains_rel_exprs - a rewritten version of the 
build_joinrel_tlist routine.


Also, the changes to the join_is_removable() routine are removed from the 
patch. I couldn't understand why they are needed here.


Note that this patch causes a change in one join.sql regression test output. 
It is not a bug, but it could perhaps be fixed.


Applied over commit 4a071afbd0.

> [1] 
https://www.postgresql.org/message-id/CAKJS1f8p-KiEujr12k-oa52JNWWaQUjEjNg%2Bo1MGZk4mHBn_Rg%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CAKJS1f8cJOCGyoxi7a_LG7eu%2BWKF9%2BHTff3wp1KKS5gcUg2Qfg%40mail.gmail.com



Status update for a commitfest entry.

This entry was "Waiting on author" during this CF. As I see, the latest 
message contains a new version of the patch. Does it need more work? Are 
you going to continue working on it?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [proposal] de-TOAST'ing using a iterator

2020-11-27 Thread Anastasia Lubennikova

On 02.11.2020 22:08, John Naylor wrote:



On Mon, Nov 2, 2020 at 1:30 PM Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote:


On 2020-Nov-02, Anastasia Lubennikova wrote:

> Status update for a commitfest entry.
>
> This entry was inactive for a very long time.
> John, are you going to continue working on this?


Not in the near future. For background, this was a 2019 GSoC project 
where I was reviewer of record, and the patch is mostly good, but 
there is some architectural awkwardness. I have tried to address that, 
but have not had success.


The commitfest is nearing the end and as this thread has stalled, I've 
marked it Returned with Feedback. Feel free to open a new entry if you 
return to this patch.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: fixing old_snapshot_threshold's time->xid mapping

2020-11-26 Thread Anastasia Lubennikova

On 06.10.2020 08:32, Thomas Munro wrote:

On Fri, Sep 25, 2020 at 2:00 PM Michael Paquier  wrote:

On Thu, Sep 24, 2020 at 03:46:14PM -0400, Robert Haas wrote:


Thomas, with respect to your part of this patch set, I wonder if we
can make the functions that you're using to write tests safe enough
that we could add them to contrib/old_snapshot and let users run them
if they want. As you have them, they are hedged around with vague and
scary warnings, but is that really justified? And if so, can it be
fixed? It would be nicer not to end up with two loadable modules here,
and maybe the right sorts of functions could even have some practical
use.

Yeah, you may be right.  I am thinking about that.  In the meantime,
here is a rebase.  A quick recap of these remaining patches:

0001 replaces the current "magic test mode" that didn't really test
anything with a new test mode that verifies pruning and STO behaviour.
0002 fixes a separate bug that Andres reported: the STO XID map
suffers from wraparound-itis.
0003 adds a simple smoke test for Robert's commit 55b7e2f4.  Before
that fix landed, it failed.


I have switched this item as waiting on author in the CF app then, as
we are not completely done yet.

Thanks.  For the record, I think there is still one more complaint
from Andres that remains unaddressed even once these are in the tree:
there are thought to be some more places that lack
TestForOldSnapshot() calls.


Status update for a commitfest entry.

This entry is "Waiting on author" and the thread was inactive for a 
while.  As far as I see, part of the fixes is already committed. Is 
there anything left to work on, or does this patch set need review now?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: cleanup temporary files after crash

2020-11-26 Thread Anastasia Lubennikova

On 01.11.2020 04:25, Michael Paquier wrote:

On Sat, Oct 31, 2020 at 09:01:15PM -0300, Euler Taveira wrote:

I thought about not providing a GUC at all or providing it in the developer
section. I've never heard anyone say that they use those temporary
files to investigate an issue. Regarding a crash, all information is already
available and temporary files don't provide extra details. This new
GUC is just to keep the previous behavior. I'm fine without the GUC, though.

The original behavior is as old as 4a5f38c4, and last time we talked
about that there were arguments about still keeping the existing
behavior to not cleanup files during a restart-after-crash scenario
for the sake of being useful just "in case".  I have never used that
property myself, TBH, and I have seen much more cases of users caring
about the data folder not facing an ENOSPC particularly if they don't
use different partitions for pg_wal/ and the main data folder.
--
Michael


Thank you, Euler, for submitting this.
+1 for the feature. One of the platforms we support uses temp files a 
lot and we faced this problem, while we never actually used these orphan 
files for debugging purposes.


I also think that the GUC is not needed here. This 'feature' was 
internal from the very beginning, so users shouldn't care about 
preserving the old behavior. Without the GUC the patch is very simple, 
please see the attached version. I also omitted the test, because I am not sure 
it would be stable, given that RemovePgTempFiles() allows the 
possibility of failure.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 417581b287f4060178595cd96465adc639639290
Author: anastasia 
Date:   Thu Nov 26 11:45:05 2020 +0300

Remove temporary files after a backend crash

Successive crashes could result in useless storage usage until service
is restarted. It could be the case on host with inadequate resources.
This manual intervention for some environments is not desirable.

The only reason we kept temp files was that someone might want to examine
them for debugging purposes. With this patch we favor service continuity
over debugging.

Author: Euler Taveira
Reviewer: Anastasia Lubennikova

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index b7799ed1d24..a151250734f 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -3999,6 +3999,9 @@ PostmasterStateMachine(void)
 		ereport(LOG,
 (errmsg("all server processes terminated; reinitializing")));
 
+		/* remove leftover temporary files after a crash */
+		RemovePgTempFiles();
+
 		/* allow background workers to immediately restart */
 		ResetBackgroundWorkerCrashTimes();
 
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 05abcf72d68..d30bfc28541 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2992,15 +2992,15 @@ CleanupTempFiles(bool isCommit, bool isProcExit)
  * remove any leftover files created by OpenTemporaryFile and any leftover
  * temporary relation files created by mdcreate.
  *
- * NOTE: we could, but don't, call this during a post-backend-crash restart
- * cycle.  The argument for not doing it is that someone might want to examine
- * the temp files for debugging purposes.  This does however mean that
- * OpenTemporaryFile had better allow for collision with an existing temp
- * file name.
+ * This is also called during a post-backend-crash restart cycle. The argument
+ * for not doing it is that crashes of queries using temp files can result in
+ * useless storage usage that can only be reclaimed by a service restart.
  *
  * NOTE: this function and its subroutines generally report syscall failures
  * with ereport(LOG) and keep going.  Removing temp files is not so critical
  * that we should fail to start the database when we can't do it.
+ * Because this routine is not strict, OpenTemporaryFile allows for collision
+ * with an existing temp file name.
  */
 void
 RemovePgTempFiles(void)


Re: PoC: custom signal handler for extensions

2020-11-25 Thread Anastasia Lubennikova

On 03.09.2020 14:25, Pavel Borisov wrote:


Is there a chance of seeing you restart work on this patch?


I noticed that despite interest in the matter, the discussion in this 
CF thread stopped quite a while ago. So I'd like to push here a patch 
revised according to the previous discussion. Actually, the patch has been 
used as part of the pg_query_state extension across several 
major PostgreSQL releases. So I'd like to raise the matter of 
reviewing this updated patch and including it in version 14, as I see 
much use of this change.


I welcome all your considerations,

I took a look at the patch. It looks fine and I see that it contains 
fixes for the latest questions in the thread.


I think we should provide a test module for this feature that will 
serve as both a test and an example of its use.


This is a feature for extension developers, so I don't know where we 
should document it. At the very least we can improve comments. For 
example, describe the fact that custom signals are handled after 
receiving SIGUSR1.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Improper use about DatumGetInt32

2020-11-25 Thread Anastasia Lubennikova

On 02.11.2020 18:59, Peter Eisentraut wrote:

I have committed 0003.

For 0001, normal_rand(), I think you should reject negative arguments 
with an error.


I've updated 0001. The change is trivial, see attached.



For 0002, I think you should change the block number arguments to 
int8, same as other contrib modules do.


Agreed. It will need a bit more work, though. Probably a new version of 
the pageinspect contrib module, as the public API will change.

Ashutosh, are you going to continue working on it?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit b42df63c757bf5ff715052a401aba37691a7e2b8
Author: anastasia 
Date:   Wed Nov 25 17:19:33 2020 +0300

Handle negative number of tuples passed to normal_rand()

The function converts the first argument i.e. the number of tuples to
return into an unsigned integer which turns out to be huge number when a
negative value is passed. This causes the function to take much longer
time to execute. Instead throw an error about invalid parameter value.

While at it, improve SQL test to test the number of tuples returned by
this function.

Authors: Ashutosh Bapat, Anastasia Lubennikova

diff --git a/contrib/tablefunc/expected/tablefunc.out b/contrib/tablefunc/expected/tablefunc.out
index fffadc6e1b4..97bfd690841 100644
--- a/contrib/tablefunc/expected/tablefunc.out
+++ b/contrib/tablefunc/expected/tablefunc.out
@@ -3,10 +3,20 @@ CREATE EXTENSION tablefunc;
 -- normal_rand()
 -- no easy way to do this for regression testing
 --
-SELECT avg(normal_rand)::int FROM normal_rand(100, 250, 0.2);
- avg 
--
- 250
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(100, 250, 0.2);
+ avg | count 
+-+---
+ 250 |   100
+(1 row)
+
+-- negative number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(-1, 250, 0.2);
+ERROR:  invalid number of tuples: -1
+-- zero number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(0, 250, 0.2);
+ avg | count 
+-+---
+ | 0
 (1 row)
 
 --
diff --git a/contrib/tablefunc/sql/tablefunc.sql b/contrib/tablefunc/sql/tablefunc.sql
index ec375b05c63..5341c005f91 100644
--- a/contrib/tablefunc/sql/tablefunc.sql
+++ b/contrib/tablefunc/sql/tablefunc.sql
@@ -4,7 +4,11 @@ CREATE EXTENSION tablefunc;
 -- normal_rand()
 -- no easy way to do this for regression testing
 --
-SELECT avg(normal_rand)::int FROM normal_rand(100, 250, 0.2);
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(100, 250, 0.2);
+-- negative number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(-1, 250, 0.2);
+-- zero number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(0, 250, 0.2);
 
 --
 -- crosstab()
diff --git a/contrib/tablefunc/tablefunc.c b/contrib/tablefunc/tablefunc.c
index 02f02eab574..c6d32d46f01 100644
--- a/contrib/tablefunc/tablefunc.c
+++ b/contrib/tablefunc/tablefunc.c
@@ -174,6 +174,7 @@ normal_rand(PG_FUNCTION_ARGS)
 	FuncCallContext *funcctx;
 	uint64		call_cntr;
 	uint64		max_calls;
+	int32		num_tuples;
 	normal_rand_fctx *fctx;
 	float8		mean;
 	float8		stddev;
@@ -193,7 +194,14 @@ normal_rand(PG_FUNCTION_ARGS)
 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
 
 		/* total number of tuples to be returned */
-		funcctx->max_calls = PG_GETARG_UINT32(0);
+		num_tuples = PG_GETARG_INT32(0);
+
+		if (num_tuples < 0)
+			ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid number of tuples: %d", num_tuples)));
+
+		funcctx->max_calls = num_tuples;
 
 		/* allocate memory for user context */
 		fctx = (normal_rand_fctx *) palloc(sizeof(normal_rand_fctx));


Ready For Committer patches (CF 2020-11)

2020-11-24 Thread Anastasia Lubennikova

Hi,

with this message, I want to draw attention to the RFC patches on the 
current commitfest. It would be good if committers could take a look at 
them.


While doing a sweep through the CF, I have kicked a couple of entries 
back to Waiting on author, so now the list is correct. Now we have 17 
entries.


https://commitfest.postgresql.org/30/?status=3

The oldest RFC patches are listed below. These entries are quite large 
and complex. Still, they got a good amount of review, and it looks like, 
after a long discussion, they are in good shape.


Generic type subscripting <https://commitfest.postgresql.org/30/1062/>

schema variables, LET command <https://commitfest.postgresql.org/30/1608/>

pgbench - add pseudo-random permutation function
<https://commitfest.postgresql.org/30/2138/>
Allow REINDEX, CLUSTER and VACUUM FULL to rebuild on new 
TABLESPACE/INDEX_TABLESPACE <https://commitfest.postgresql.org/30/2269/>


range_agg / multiranges <https://commitfest.postgresql.org/30/2112/>

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Libpq support to connect to standby server as priority

2020-11-24 Thread Anastasia Lubennikova

On 30.09.2020 10:57, Greg Nancarrow wrote:

Thanks for your thoughts, patches and all the pointers.
I'll be looking at all of them.
(And yes, the comma instead of bitwise OR is of course an error,
somehow made and gone unnoticed; the next field in the struct is an
enum, so accepts any int value).

Regards,
Greg Nancarrow
Fujitsu Australia


CFM reminder.

Hi, this entry is "Waiting on Author" and the thread was inactive for a 
while. As far as I see, the patch needs some further work.
Are you going to continue working on it, or should I mark it as 
"returned with feedback" until a better time?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: pgbench and timestamps (bounced)

2020-11-24 Thread Anastasia Lubennikova

On 11.09.2020 16:59, Fabien COELHO wrote:


Hello Tom,


It requires a mutex around the commands, I tried to do some windows
implementation which may or may not work.


Ugh, I'd really rather not do that.  Even disregarding the effects
of a mutex, though, my initial idea for fixing this has a big problem:
if we postpone PREPARE of the query until first execution, then it's
happening during timed execution of the benchmark scenario and thus
distorting the timing figures.  (Maybe if we'd always done it like
that, it'd be okay, but I'm quite against changing the behavior now
that it's stood for a long time.)


Hmmm.

Prepare is done *once* per client; ISTM that the impact on any 
statistically significant benchmark is nil in practice, or it would 
mean that the benchmark settings are too low.


Second, the mutex is only used when absolutely necessary, only for the 
substitution part of the query (replacing :stuff by ?), because 
scripts are shared between threads. This is just once, in an unlikely 
case occurring at the beginning.



However, perhaps there's more than one way to fix this.  Once we've
scanned all of the script and seen all the \set commands, we know
(in principle) the set of all variable names that are in use.
So maybe we could fix this by

(1) During the initial scan of the script, make variable-table
entries for every \set argument, with the values shown as undefined
for the moment.  Do not try to parse SQL commands in this scan,
just collect them.


The issue with this approach is

  SELECT 1 AS one \gset pref_

which will generate a "pref_one" variable, and these names cannot be 
guessed without SQL parsing and possibly execution. That is why the
preparation is delayed to when the variables are actually known.


(2) Make another scan in which we identify variable references
in the SQL commands and issue PREPAREs (if enabled).



(3) Perform the timed run.

This avoids any impact of this bug fix on the semantics or timing
of the benchmark proper.  I'm not sure offhand whether this
approach makes any difference for the concerns you had about
identifying/suppressing variable references inside quotes.


I do not think this plan is workable, because of the \gset issue.

I do not see that the conditional mutex and delayed PREPARE would have 
any significant (measurable) impact on an actual (reasonable) 
benchmark run.


A workable solution would be for each client to actually execute each 
script once before starting the actual benchmark. It would still need 
a mutex and also a sync barrier (which I'm proposing in some other 
thread). However, this may raise some other issues because then some 
operations would be triggered outside the benchmarking run, which may or 
may not be desirable.


So I'm not too keen to go that way, and I think the proposed solution 
is reasonable from a benchmarking point of view as the impact is 
minimal, although not zero.
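
To make the idea concrete, a rough sketch of the "rewrite under a mutex, then
prepare once per client" scheme (the structure and function names are invented
for the example and do not match the actual pgbench code):

static pthread_mutex_t scripts_mutex = PTHREAD_MUTEX_INITIALIZER;

static void
prepare_command_once(PGconn *con, Command *cmd)
{
	/* scripts are shared between threads, so guard the one-time rewrite */
	if (!cmd->rewritten)
	{
		pthread_mutex_lock(&scripts_mutex);
		if (!cmd->rewritten)
		{
			rewrite_variable_references(cmd);	/* replace :var with $n */
			cmd->rewritten = true;
		}
		pthread_mutex_unlock(&scripts_mutex);
	}

	/* the PREPARE itself is per-connection, so no lock is needed here */
	if (!cmd->prepared)
	{
		PGresult   *res = PQprepare(con, cmd->name, cmd->query, 0, NULL);

		if (PQresultStatus(res) != PGRES_COMMAND_OK)
			pg_log_error("PREPARE failed: %s", PQerrorMessage(con));
		PQclear(res);
		cmd->prepared = true;
	}
}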



CFM reminder.

Hi, this entry is "Waiting on Author" and the thread was inactive for a 
while. I see this discussion still has some open questions. Are you 
going to continue working on it, or should I mark it as "returned with 
feedback" until a better time?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: LogwrtResult contended spinlock

2020-11-24 Thread Anastasia Lubennikova

On 04.09.2020 20:13, Andres Freund wrote:

Hi,

On 2020-09-04 10:05:45 -0700, Andres Freund wrote:

On 2020-09-03 14:34:52 -0400, Alvaro Herrera wrote:

Looking at patterns like this

if (XLogCtl->LogwrtRqst.Write < EndPos)
XLogCtl->LogwrtRqst.Write = EndPos;

It seems possible to implement with

 do {
XLogRecPtr  currwrite;

 currwrite = pg_atomic_read_u64(LogwrtRqst.Write);
if (currwrite > EndPos)
 break;  // already done by somebody else
 if (pg_atomic_compare_exchange_u64(LogwrtRqst.Write,
   currwrite, EndPos))
 break;  // successfully updated
 } while (true);

This assumes that LogwrtRqst.Write never goes backwards, so it doesn't
seem good material for a general routine.

This *seems* correct to me, though this is muddy territory to me.  Also,
are there better ways to go about this?

Hm, I was thinking that we'd first go for reading it without a spinlock,
but continuing to write it as we currently do.

But yea, I can't see an issue with what you propose here. I personally
find do {} while () weird and avoid it when not explicitly useful, but
that's extremely minor, obviously.

Re general routine: On second thought, it might actually be worth having
it. Even just for LSNs - there's plenty places where it's useful to
ensure a variable is at least a certain size.  I think I would be in
favor of a general helper function.

Do you mean by general helper function something like this?

void
swap_lsn(pg_atomic_uint64 *old_value, XLogRecPtr new_value, bool to_largest)
{
  while (true)
  {
    XLogRecPtr  currwrite;

    currwrite = pg_atomic_read_u64(old_value);

    if (to_largest)
    {
      if (currwrite > new_value)
        break;  /* already done by somebody else */
    }
    else
    {
      if (currwrite < new_value)
        break;  /* already done by somebody else */
    }

    if (pg_atomic_compare_exchange_u64(old_value,
                                       &currwrite, new_value))
      break;  /* successfully updated */
  }
}


which will be called like
swap_lsn(&XLogCtl->LogwrtRqst.Write, EndPos, true);



Greetings,

Andres Freund


This CF entry was inactive for a while. Alvaro, are you going to 
continue working on it?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: psql: add \si, \sm, \st and \sr functions to show CREATE commands for indexes, matviews, triggers and tables

2020-11-24 Thread Anastasia Lubennikova

On 18.08.2020 17:25, Tom Lane wrote:

a.pervush...@postgrespro.ru writes:

[ si_st_sm_sr_v2.patch ]

I hadn't particularly noticed this thread before, but I happened to
look through this patch, and I've got to say that this proposed feature
seems like an absolute disaster from a maintenance standpoint.  There
will be no value in an \st command that is only 90% accurate; the produced
DDL has to be 100% correct.  This means that, if we accept this feature,
psql will have to know everything pg_dump knows about how to construct the
DDL describing tables, indexes, views, etc.  That is a lot of code, and
it's messy, and it changes nontrivially on a very regular basis.  I can't
accept that we want another copy in psql --- especially one that looks
nothing like what pg_dump has.

There've been repeated discussions about somehow extracting pg_dump's
knowledge into a library that would also be available to other client
programs (see e.g. the concurrent thread at [1]).  That's quite a tall
order, which is why it's not happened yet.  But I think we really need
to have something like that before we can accept this feature for psql.

BTW, as an example of why this is far more difficult than it might
seem at first glance, this patch doesn't even begin to meet the
expectation stated at the top of describe.c:

  * Support for the various \d ("describe") commands.  Note that the current
  * expectation is that all functions in this file will succeed when working
  * with servers of versions 7.4 and up.  It's okay to omit irrelevant
  * information for an old server, but not to fail outright.

It might be okay for this to cut off at 8.0 or so, as I think pg_dump
does, but not to just fail on older servers.

Another angle, which I'm not even sure how we want to think about it, is
security.  It will not do for "\et" to allow some attacker to replace
function calls appearing in the table's CHECK constraints, for instance.
So this means you've got to be very aware of CVE-2018-1058-style attacks.
Our answer to that for pg_dump has partially depended on restricting the
search_path used at both dump and restore time ... but I don't think \et
gets to override the search path that the psql user is using.  I'm not
sure what that means in practice but it certainly requires some thought
before we add the feature, not after.

Anyway, I can see the attraction of having psql commands like these,
but "write a bunch of new code that we'll have to maintain" does not
seem like a desirable way to get them.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/9df8a3d3-13d2-116d-26ab-6a273c1ed38c%402ndquadrant.com




Since there has been no activity on this thread since before the CF and
no response from the author I have marked this "returned with feedback".

Alexandra, feel free to resubmit it to the next commitfest, when you 
have time to address the issues raised in the review.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: New default role- 'pg_read_all_data'

2020-11-23 Thread Anastasia Lubennikova

On 29.10.2020 17:19, Stephen Frost wrote:

Greetings,

* Georgios Kokolatos (gkokola...@protonmail.com) wrote:

this patch is in "Ready for committer" state and the Cfbot is happy.

Glad that's still the case. :)


Is there any committer that is available for taking a look at it?

If there aren't any objections or further comments, I'll take another
look through it and will commit it during the upcoming CF.

Thanks!

Stephen


CFM reminder, just in case you forgot about this thread :)
The commitfest is nearing the end, and there was plenty of time for 
anyone to object.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: deferred primary key and logical replication

2020-11-23 Thread Anastasia Lubennikova

On 27.10.2020 13:46, Amit Kapila wrote:

On Sun, Oct 25, 2020 at 9:39 PM Euler Taveira
 wrote:

On Mon, 5 Oct 2020 at 08:34, Amit Kapila  wrote:

On Mon, May 11, 2020 at 2:41 AM Euler Taveira
 wrote:

Hi,

While looking at an old wal2json issue, I stumbled on a scenario where a table
with a deferred primary key is not updatable in logical replication. AFAICS it
has been like that since the beginning of logical decoding and seems to be an
oversight in the design of logical decoding.


I am not sure if it is an oversight because we document that the index
must be non-deferrable, see "USING INDEX records the old values of the
columns covered by the named index, which must be unique, not partial,
not deferrable, and include only columns marked NOT NULL." in docs
[1].


Inspecting this patch again, I forgot to consider that RelationGetIndexList()
is called by other backend modules. Since logical decoding deals with finished
transactions, it is ok to use a deferrable primary key.


But starting PG-14, we do support logical decoding of in-progress
transactions as well.



Commitfest entry status update.
As far as I see, this patch needs some further work, so I move it to 
"Waiting on author".

Euler, are you going to continue working on it?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Online verification of checksums

2020-11-23 Thread Anastasia Lubennikova

On 23.11.2020 18:35, Stephen Frost wrote:

Greetings,

* Anastasia Lubennikova (a.lubennik...@postgrespro.ru) wrote:

On 21.11.2020 04:30, Michael Paquier wrote:

The only method I can think as being really
reliable is based on two facts:
- Do a check only on pd_checksums, as that validates the full contents
of the page.
- When doing a retry, make sure that there is no concurrent I/O
activity in the shared buffers.  This requires an API we don't have
yet.

It seems reasonable to me to rely on checksums only.

As for retry, I think that API for concurrent I/O will be complicated.
Instead, we can introduce a function to read the page directly from shared
buffers after PAGE_RETRY_THRESHOLD attempts. It looks like a bullet-proof
solution to me. Do you see any possible problems with it?

We might end up reading pages back in that have been evicted, for one
thing, which doesn't seem great,
TBH, I think it is highly unlikely that the page that was just updated 
will be evicted.

and this also seems likely to be
awkward for cases which aren't using the replication protocol, unless
every process maintains a connection to PG the entire time, which also
doesn't seem great.
Have I missed something? Now pg_basebackup has only one process + one 
child process for streaming. Anyway, I totally agree with your argument. 
The need to maintain connection(s) to PG is the most unpleasant part of 
the proposed approach.



Also- what is the point of reading the page from shared buffers
anyway..?
Well... Reading a page from shared buffers is a reliable way to get a 
correct page from postgres under any concurrent load. So it just seems 
natural to me.

All we need to do is prove that the page will be rewritten
during WAL replay.
Yes, and this is a tricky part. Until you explained it in your latest 
message, I wasn't sure how we could distinguish a concurrent update from 
page header corruption. Now I agree that if the page LSN has changed and 
increased between rereads, it is safe enough to conclude that we are 
dealing with concurrent load.



  If we can prove that, we don't actually care what
the contents of the page are.  We certainly can't calculate the
checksum on a page we plucked out of shared buffers since we only
calculate the checksum when we go to write the page out.
Good point. I was thinking that we could recalculate the checksum. Or even 
save the page without it, as we have checked the LSN and know for sure that 
it will be rewritten by WAL replay.



To sum up, I agree with your proposal to reread the page and rely on 
ascending LSNs. Can you submit a patch?

You can write it on top of the latest attachment in this thread:
v8-master-0001-Fix-page-verifications-in-base-backups.patch 
<https://www.postgresql.org/message-id/attachment/115403/v8-master-0001-Fix-page-verifications-in-base-backups.patch> 
from this message 
https://www.postgresql.org/message-id/20201030023028.gc1...@paquier.xyz
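
To make the idea more concrete, here is a rough sketch of the retry logic I 
have in mind (the function name and the read_block() helper are made up, and 
PAGE_RETRY_THRESHOLD is the constant mentioned earlier in the thread; this is 
not meant as the actual patch):

static bool
verify_page_with_retries(const char *readfilename, BlockNumber blkno,
                         int segmentno, XLogRecPtr startptr, char *page)
{
    XLogRecPtr  prev_lsn = InvalidXLogRecPtr;
    int         attempt;

    for (attempt = 0; attempt < PAGE_RETRY_THRESHOLD; attempt++)
    {
        uint16      checksum;

        /* read_block() is an assumed helper that rereads the page from disk */
        read_block(readfilename, blkno, page);

        /* new pages have no checksum yet */
        if (PageIsNew(page))
            return true;

        checksum = pg_checksum_page(page, blkno + segmentno * RELSEG_SIZE);
        if (checksum == ((PageHeader) page)->pd_checksum)
            return true;        /* checksum matches, page is fine */

        /*
         * If the page LSN keeps moving forward between rereads (and is newer
         * than the backup start pointer), the page is being modified
         * concurrently and WAL replay will rewrite it, so skip it instead of
         * reporting corruption.
         */
        if (!XLogRecPtrIsInvalid(prev_lsn) &&
            PageGetLSN(page) > prev_lsn &&
            PageGetLSN(page) >= startptr)
            return true;

        prev_lsn = PageGetLSN(page);
    }

    return false;               /* repeated mismatch: report corruption */
}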


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: DROP relation IF EXISTS Docs and Tests - Bug Fix

2020-11-23 Thread Anastasia Lubennikova

On 30.09.2020 05:00, David G. Johnston wrote:
On Tue, Sep 15, 2020 at 3:48 PM Alexander Korotkov 
mailto:aekorot...@gmail.com>> wrote:


Hi!

I've skimmed through the thread and checked the patchset. Everything
looks good, except one paragraph, which doesn't look completely clear.

+  
+   This emulates the functionality provided by
+    but sets the created object's
+   type
definition
+   to domain.
+  

As I get it states that CREATE DOMAIN somehow "emulates" CREATE TYPE.
Could you please, rephrase it?  It looks confusing to me yet.


v5 attached, looking at this fresh and with some comments to consider.

I ended up just combining both patches into one.

I did away with the glossary changes altogether, and the invention of 
the new term.  I ended up limiting "type's type" to just domain usage 
but did a couple of a additional tweaks that tried to treat domains as 
not being actual types even though, at least in PostgreSQL, they are 
(at least as far as DROP TYPE is concerned - and since I don't have 
any understanding of the SQL Standard's decision to separate out 
create domain and create type I'll just stick to the implementation in 
front of me.


David J.


Reminder from a CF manager, as this thread was inactive for a while.
Alexander, I see you signed up as a committer for this entry. Are you 
going to continue this work?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Online verification of checksums

2020-11-23 Thread Anastasia Lubennikova

On 21.11.2020 04:30, Michael Paquier wrote:

The only method I can think as being really
reliable is based on two facts:
- Do a check only on pd_checksums, as that validates the full contents
of the page.
- When doing a retry, make sure that there is no concurrent I/O
activity in the shared buffers.  This requires an API we don't have
yet.


It seems reasonable to me to rely on checksums only.

As for retry, I think that API for concurrent I/O will be complicated. 
Instead, we can introduce a function to read the page directly from 
shared buffers after PAGE_RETRY_THRESHOLD attempts. It looks like a 
bullet-proof solution to me. Do you see any possible problems with it?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Use of "long" in incremental sort code

2020-11-23 Thread Anastasia Lubennikova

On 05.11.2020 02:53, James Coleman wrote:

On Tue, Nov 3, 2020 at 4:42 PM Tomas Vondra
 wrote:

On Tue, Nov 03, 2020 at 03:53:53AM +0100, Tomas Vondra wrote:

Hi,

I took another look at this, and 99% of the patch (the fixes to sort
debug messages) seems fine to me.  Attached is the part I plan to get
committed, including commit message etc.


I've pushed this part. Thanks for the patch, Haiying Tang.


The one change I decided to remove is this change in tuplesort_free:

-   longspaceUsed;
+   int64   spaceUsed;

The reason why I think this variable should be 'long' is that we're
using it for this:

spaceUsed = LogicalTapeSetBlocks(state->tapeset);

and LogicalTapeSetBlocks is defined like this:

extern long LogicalTapeSetBlocks(LogicalTapeSet *lts);

FWIW the "long" is not introduced by incremental sort - it used to be in
tuplesort_end, the incremental sort patch just moved it to a different
function. It's a bit confusing that tuplesort_updatemax has this:

int64   spaceUsed;

But I'd argue this is actually wrong, and should be "long" instead. (And
this actually comes from the incremental sort patch, by me.)


FWIW while looking at what the other places calling LogicalTapeSetBlocks
do, and I noticed this:

uint64  disk_used = LogicalTapeSetBlocks(...);

in the disk-based hashagg patch. So that's a third data type ...


IMHO this should simply switch the current int64 variable to long, as it
was before. Not sure about about the hashagg uint64 variable.

Is there anything that actually limits tape code to using at most 4GB
on 32-bit systems?


At first glance, I haven't found anything that could limit the tape code. It 
uses BufFile, which is not limited by the OS file size limit.
Still, if we want to change 'long' in LogicalTapeSetBlocks, we should 
probably also update nBlocksWritten and other variables.
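
As a side note on the choice of type (a generic observation, not specific to 
this patch): 'long' is only 32 bits on 64-bit Windows, so a block counter kept 
in a long can overflow there long before an int64 would. A trivial check:

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
    /*
     * On LP64 platforms (Linux, macOS) sizeof(long) == 8, but on 64-bit
     * Windows (LLP64) it is 4, so a block counter kept in a long wraps
     * around after ~2^31 blocks (about 16 TB of 8 kB blocks) there, while
     * int64_t behaves the same everywhere.
     */
    printf("sizeof(long) = %zu, sizeof(int64_t) = %zu\n",
           sizeof(long), sizeof(int64_t));
    return 0;
}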


As far as I see, the major part of the patch was committed, so l update 
the status of the CF entry to "Committed". Feel free to create a new 
entry, if you're going to continue working on the remaining issue.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Commitfest 2020-11

2020-11-17 Thread Anastasia Lubennikova
Now that we are halfway through the commitfest, the status breakdown 
looks like this:


Needs review: 116
Waiting on Author: 45
Ready for Committer: 22
Committed: 51
Returned with Feedback: 1
Withdrawn: 8
Rejected: 1
Total: 244

which means we have reached closure on a quarter of the patches, and 
many discussions have moved forward significantly. Keep it up and don't 
wait for the commitfest deadline)


Most inactive discussions and patches which didn't apply were notified 
on their respective threads. If there is no response by the last days of 
the CF, they will be considered stalled and returned with feedback.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Skip ExecCheckRTPerms in CTAS with no data

2020-11-09 Thread Anastasia Lubennikova

On 29.09.2020 14:39, Bharath Rupireddy wrote:

On Mon, Sep 28, 2020 at 7:48 PM Tom Lane  wrote:

Bharath Rupireddy  writes:

In case of CTAS with no data, we actually do not insert the tuples
into the created table, so we can skip checking for the insert
permissions. Anyways, the insert permissions will be checked when the
tuples are inserted into the table.

I'd argue this is wrong.  You don't get to skip permissions checks
in ordinary queries just because, say, there's a LIMIT 0 on the
query.


Right, when there's a select with limit 0 clause, we do check for the
select permissions. But my point is, we don't check insert
permissions(or select or update etc.) when we create a plain table
using CREATE TABLE test_tbl(a1 INT). Of course, we do check create
permissions. Going by the similar point, shouldn't we also check only
create permission(which is already being done as part of
DefineRelation) and skip the insert permission(the change this patch
does) for the new table being created as part of CREATE TABLE test_tbl
AS SELECT * FROM test_tbl2? However select permission will be checked
for test_tbl2. The insert permissions will be checked anyways before
inserting rows into the table created in CTAS.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

I see Tom's objection above. Still, I tend to agree that if 'WITH NO 
DATA' was specified explicitly, CREATE TABLE AS should behave more like a 
utility statement than a regular query. So I think that this patch can be 
useful in some use cases, and I definitely don't see any harm it could 
cause. Even the comment in the current code suggests that it is an option.


I took a look at the patch. It is quite simple, so I have no comments about 
the code. It would be good to add a test to select_into.sql to show that it 
only applies to 'WITH NO DATA' and that subsequent insertions will fail 
if permissions are not set.
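
Something along these lines, perhaps (purely illustrative, with made-up names; 
whether revoking the owner's own INSERT privilege is the best way to provoke 
the failure is up to you):

-- Illustrative sketch of a possible select_into.sql addition.
BEGIN;
CREATE TABLE ctas_nodata AS SELECT 1 AS a WITH NO DATA;  -- no rows inserted
REVOKE INSERT ON ctas_nodata FROM CURRENT_USER;
INSERT INTO ctas_nodata VALUES (1);                      -- should fail now
ROLLBACK;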


Maybe we should also mention it in the documentation, but I haven't found any 
specific paragraph about permissions on CTAS.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Prevent printing "next step instructions" in initdb and pg_upgrade

2020-11-09 Thread Anastasia Lubennikova

On 02.11.2020 16:23, Magnus Hagander wrote:

On Tue, Oct 27, 2020 at 11:35:25AM +0100, Peter Eisentraut wrote:

On 2020-10-06 12:26, Magnus Hagander wrote:

I went with the name --no-instructions to have the same name for both
initdb and pg_upgrade. The downside is that "no-instructions" also
causes the scripts not to be written in pg_upgrade, which arguably is a
different thing. We could go with "--no-instructions" and
"--no-scripts", but that would leave the parameters different. I also
considered "--no-next-step", but that one didn't quite have the right
ring to me. I'm happy for other suggestions on the parameter names.

What scripts are left after we remove the analyze script, as discussed in a
different thread?

There is still delete_old_cluster.sh.
I wonder if we should just not do it and just say "you should delete
the old cluster". Then we can leave it up to platform integrations to
enhance that, based on their platform knowledge (such as for example
knowing if the platform runs systemd or not). That would leave us with
both pg_upgrade and initdb printing instructions, and not scripts, and
solve that part of the issue.

Do we only care about .sh scripts? There are also reindex_hash.sql and 
pg_largeobject.sql in src/bin/pg_upgrade/version.c with instructions.

How should we handle them?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Asymmetric partition-wise JOIN

2020-11-09 Thread Anastasia Lubennikova

On 21.08.2020 09:02, Andrey V. Lepikhov wrote:

On 7/1/20 2:10 PM, Daniel Gustafsson wrote:

On 27 Dec 2019, at 08:34, Kohei KaiGai  wrote:


The attached v2 fixed the problem, and regression test finished 
correctly.


This patch no longer applies to HEAD, please submit an rebased version.
Marking the entry Waiting on Author in the meantime.

Rebased version of the patch on current master (d259afa736).

I rebased it because it is a base of my experimental feature than we 
don't break partitionwise join of a relation with foreign partition 
and a local relation if we have info that remote server has foreign 
table link to the local relation (by analogy with shippable extensions).


Maybe mark as 'Needs review'?


Status update for a commitfest entry.

According to cfbot, the patch fails to apply. Could you please send a 
rebased version?


This thread was inactive for quite some time. Is anyone going to 
continue working on it?


I see some interest in the idea of a shareable hash, but I don't see even a 
prototype in this thread. So it is probably a matter for a separate 
discussion.


Also, I took a look at the code. It looks like it needs some extra work. 
I am not a big expert in this area, so I'm sorry if the questions are obvious.


1. What would happen if this assumption is not met?

+         * MEMO: We assume this pathlist keeps at least one AppendPath that
+         * represents partitioned table-scan, symmetric or asymmetric
+         * partition-wise join. It is not correct right now, however, a 
hook
+         * on add_path() to give additional decision for path removel 
allows

+         * to retain this kind of AppendPath, regardless of its cost.

2. Why do we wrap extract_asymmetric_partitionwise_subjoin() call into 
PG_TRY/PG_CATCH? What errors do we expect?


3. It looks like a crutch. If it isn't, I'd like to see a better comment 
about why "dynamic programming" is not applicable here.

And shouldn't we also handle a root->join_cur_level?

+                /* temporary disables "dynamic programming" algorithm */
+                root->join_rel_level = NULL;

4. This change looks like it could lead to a memory leak in the old code. 
Maybe that never happens, but again I think it is worth a comment.


-    /* If there's nothing to adjust, don't call this function. */
-    Assert(nappinfos >= 1 && appinfos != NULL);
+    /* If there's nothing to adjust, just return a duplication */
+    if (nappinfos == 0)
+        return copyObject(node);

5. extract_asymmetric_partitionwise_subjoin() lacks a comment

The new status of this patch is: Waiting on Author

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit

2020-11-06 Thread Anastasia Lubennikova
Status update for a commitfest entry.

This thread was inactive for a while and from the latest messages, I see that 
the patch needs some further work.
So I move it to "Waiting on Author".

The new status of this patch is: Waiting on Author


Re: A problem about partitionwise join

2020-11-06 Thread Anastasia Lubennikova
Status update for a commitfest entry.

According to CFbot this patch fails to apply. Richard, can you send an update, 
please?

Also, I see that the thread was inactive for a while.
Are you going to continue this work? I think it would be helpful if you could 
write a short recap of the current state of the patch and list the open 
questions for reviewers.

The new status of this patch is: Waiting on Author


Re: bitmaps and correlation

2020-11-06 Thread Anastasia Lubennikova
Status update for a commitfest entry

According to cfbot, the patch fails to apply.  Could you please send a rebased 
version?

I wonder why this patch has been waiting so long without a review. Maybe it 
will help to move the discussion forward if you provide more examples of 
queries that can benefit from this improvement?

The first patch is simply a refactoring, and I don't see any possible 
objections against it.
The second patch also looks fine to me. The logic is understandable and the 
code is neat.

It wouldn't hurt to add a comment for this computation, though.
+   pages_fetched = pages_fetchedMAX + 
indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
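
Something like this might do, since it mirrors the interpolation that 
cost_index() already does between min_IO_cost and max_IO_cost (the wording is 
just a suggestion):

/*
 * Interpolate between the worst case (uncorrelated heap order,
 * pages_fetchedMAX) and the best case (perfectly correlated heap order,
 * pages_fetchedMIN), using the squared index correlation, just like
 * cost_index() does for min_IO_cost/max_IO_cost.
 */
pages_fetched = pages_fetchedMAX +
    indexCorrelation * indexCorrelation * (pages_fetchedMIN - pages_fetchedMAX);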

The new status of this patch is: Waiting on Author


Re: WIP: BRIN multi-range indexes

2020-11-02 Thread Anastasia Lubennikova
Status update for a commitfest entry.

According to cfbot the patch no longer compiles.
Tomas, can you send an update, please?

I also see that the last few messages mention a data corruption bug. Sounds 
pretty serious.
Alvaro, have you had a chance to look at it? I don't see anything committed 
yet, nor any active discussion in other threads.

Re: RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-11-02 Thread Anastasia Lubennikova
Status update for a commitfest entry.

This thread was inactive for a while. The latest review suggests that it is 
Ready For Committer.
I also took a quick look at the patch and agree that it looks sensible. Maybe 
add a comment before the _bt_compare_inl() to explain the need for this code 
change.

The new status of this patch is: Ready for Committer


Re: [proposal] de-TOAST'ing using a iterator

2020-11-02 Thread Anastasia Lubennikova
Status update for a commitfest entry.

This entry was inactive for a very long time. 
John, are you going to continue working on this?

The last message mentions some open issues, namely backend crashes, so I move 
it to "Waiting on author".

The new status of this patch is: Waiting on Author


Re: Feedback on table expansion hook (including patch)

2020-11-02 Thread Anastasia Lubennikova
Status update for a commitfest entry.

This patch implements a useful improvement and the reviewer approved the code. 
It lacks a test, but looking at previously committed hooks, I think that is not 
mandatory. 
So, I move it to RFC.

The new status of this patch is: Ready for Committer


Re: Improving psql slash usage help message

2020-11-02 Thread Anastasia Lubennikova
Status update for a commitfest entry.
This thread was inactive for a while. Is anyone going to continue working on it?

My two cents on the topic: 
I don't see it as a big problem in the first place. In the source code, \dE 
refers to foreign tables and \de refers to foreign servers. So it seems more 
reasonable to me to rename the latter.
  \dE[S+] [PATTERN]  list foreign tables
  \det[+] [PATTERN]  list foreign servers

I also do not support merging the entries for different commands. I think that 
current help output is easier to read.

The new status of this patch is: Waiting on Author


Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

2020-11-02 Thread Anastasia Lubennikova

On 30.10.2020 03:42, Tomas Vondra wrote:

Hi,

I might be somewhat late to the party, but I've done a bit of
benchmarking too ;-) I used TPC-H data from a 100GB test, and tried
different combinations of COPY [FREEZE] and VACUUM [FREEZE], both on
current master and with the patch.

So I don't really observe any measurable slowdowns in the COPY part (in
fact there seems to be a tiny speedup, but it might be just noise). In
the VACUUM part, there's clear speedup when the data was already frozen
by COPY (Yes, those are zeroes, because it took less than 1 second.)

So that looks pretty awesome, I guess.

For the record, these tests were run on a server with NVMe SSD, so
hopefully reliable and representative numbers.


Thank you for the review.


A couple minor comments about the code:

2) I find the "if (all_frozen_set)" block a bit strange. It's a matter
of personal preference, but I'd just use a single level of nesting, i.e.
something like this:

    /* if everything frozen, the whole page has to be visible */
    Assert(!(all_frozen_set && !PageIsAllVisible(page)));

    /*
 * If we've frozen everything on the page, and if we're already
 * holding pin on the vmbuffer, record that in the visibilitymap.
 * If we're not holding the pin, it's OK to skip this - it's just
 * an optimization.
 *
 * It's fine to use InvalidTransactionId here - this is only used
 * when HEAP_INSERT_FROZEN is specified, which intentionally
 * violates visibility rules.
 */
    if (all_frozen_set &&
    visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer))
visibilitymap_set(...);

IMHO it's easier to read, but YMMV. I've also reworded the comment a bit
to say what we're doing etc. And I've moved the comment from inside the
if block into the main comment - that was making it harder to read too.

I agree that it's a matter of taste. I've updated comments and left 
nesting unchanged to keep assertions simple.




3) I see RelationGetBufferForTuple does this:

    /*
 * This is for COPY FREEZE needs. If page is empty,
 * pin vmbuffer to set all_frozen bit
 */
    if ((options & HEAP_INSERT_FROZEN) &&
    (PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0))
    visibilitymap_pin(relation, BufferGetBlockNumber(buffer), 
vmbuffer);


so is it actually possible to get into the (all_frozen_set) without
holding a pin on the visibilitymap? I haven't investigated this so
maybe there are other ways to get into that code block. But the new
additions to hio.c get the pin too.

I was thinking that GetVisibilityMapPins() can somehow unset the pin. I 
gave it a second look. And now I don't think it's possible to get into 
this code block without a pin.  So, I converted this check into an 
assertion.




4) In heap_xlog_multi_insert we now have this:

    if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
    PageClearAllVisible(page);
    if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
    PageSetAllVisible(page);

IIUC it makes no sense to have both flags at the same time, right? So
what about adding

    Assert(!(XLH_INSERT_ALL_FROZEN_SET && 
XLH_INSERT_ALL_VISIBLE_CLEARED));


to check that?


Agree.

I placed this assertion at the very beginning of the function. It also 
helped to simplify the code a bit.
I also noticed that we were not updating the visibility map for all_frozen 
from heap_xlog_multi_insert. Fixed.




I wonder what to do about the heap_insert - I know there are concerns it
would negatively impact regular insert, but is it really true? I suppose
this optimization would be valuable even for cases where multi-insert is
not possible.

Do we have something like INSERT .. FREEZE? I only see 
TABLE_INSERT_FROZEN set for COPY FREEZE and for matview operations. Can 
you explain what use case we are trying to optimize by extending this 
patch to heap_insert()?


The new version is attached.
I've also fixed a typo in the comment, as suggested by Tatsuo Ishii.
Also, I tested this patch with replication and found no issues.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 395716e277ac4be22b9b61311c301a69db0f9101
Author: anastasia 
Date:   Mon Nov 2 16:27:48 2020 +0300

Teach COPY FREEZE to set PD_ALL_VISIBLE and visibility map bits.

diff --git a/contrib/pg_visibility/expected/pg_visibility.out b/contrib/pg_visibility/expected/pg_visibility.out
index ca4b6e186b..0017e3415c 100644
--- a/contrib/pg_visibility/expected/pg_visibility.out
+++ b/contrib/pg_visibility/expected/pg_visibility.out
@@ -179,6 +179,69 @@ select pg_truncate_visibility_map('test_partition');
  
 (1 row)
 
+-- test copy freeze
+create table copyfreeze (a int, b char(1500));
+-- load all rows via COPY FREEZE and ensure that all pages are set all-visible
+-- and all-frozen.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visi

Re: Commitfest 2020-11

2020-11-02 Thread Anastasia Lubennikova

Hi everyone,

November commitfest is now in progress!
Georgios Kokolatos and I are happy to volunteer to manage it.

During this CF, I want to pay more attention to long-standing issues.

Current state for the Commitfest is:

Needs review: 154
Waiting on Author: 32
Ready for Committer: 20
Committed: 32
Withdrawn: 5
Rejected: 1
Total: 244

We have quite a few ReadyForCommitter patches of varying size and 
complexity.


If you have submitted patches in this CF, please make sure that you are 
also reviewing a similar number of patches of similar complexity. The CF 
cannot move forward without patch review.


Also, check the state of your patch at http://cfbot.cputube.org/

Happy hacking!

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Corruption during WAL replay

2020-10-30 Thread Anastasia Lubennikova
Status update for a commitfest entry.

I see quite a few unanswered questions in the thread since the last patch 
version was sent. So, I move it to "Waiting on Author".

The new status of this patch is: Waiting on Author


Re: BUG #15383: Join Filter cost estimation problem in 10.5

2020-10-30 Thread Anastasia Lubennikova
Status update for a commitfest entry.

It looks like there has been no real progress on this issue since April. I see 
only an experimental patch. What kind of review does it need right now? Do we 
need more testing, or maybe production statistics for such queries?

David, are you going to continue working on it?
Could you please provide a short summary of the problem and list the open 
questions for reviewers?

The new status of this patch is: Waiting on Author


Re: Implementing Incremental View Maintenance

2020-10-28 Thread Anastasia Lubennikova
ср, 28 окт. 2020 г. в 08:02, Yugo NAGATA :

> Hi Anastasia Lubennikova,
>
> I am writing this to you because I would like to ask the commitfest
> manager something.
>
> The status of the patch was changed to "Waiting on Author" from
> "Ready for Committer" at the beginning of this montfor the reason
> that rebase was necessary. Now I updated the patch, so can I change
> the status back to "Ready for Committer"?
>
> Regards,
> Yugo Nagata
>
>
Yes, go ahead. As far as I can see, the patch is in good shape and there are
no unanswered questions from reviewers.
Feel free to change the status of CF entries when it seems reasonable to
you.

P.S. Please, avoid top-posting, It makes it harder to follow the
discussion, in-line replies are customary in pgsql mailing lists.
See https://en.wikipedia.org/wiki/Posting_style#Top-posting for details.
-- 
Best regards,
Lubennikova Anastasia


Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-27 Thread Anastasia Lubennikova

On 26.10.2020 04:13, Michael Paquier wrote:

On Fri, Oct 23, 2020 at 08:00:08AM +0900, Michael Paquier wrote:

Yeah, we could try to make the logic a bit more complicated like
that.  However, for any code path relying on a page read without any
locking insurance, we cannot really have a lot of trust in any of the
fields assigned to the page as this could just be random corruption
garbage, and the only thing I am ready to trust here a checksum
mismatch check, because that's the only field on the page that's
linked to its full contents on the 8k page.  This also keeps the code
simpler.

A small update here.  I have extracted the refactored part for
PageIsVerified() and committed it as that's independently useful.
This makes the patch proposed here simpler on HEAD, leading to the
attached.
--
Michael


Thank you for committing the first part.

In case you need a second opinion on the remaining patch, it still looks 
good to me.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: SEARCH and CYCLE clauses

2020-10-27 Thread Anastasia Lubennikova

On 10.10.2020 08:25, Pavel Stehule wrote:

Hi

pá 9. 10. 2020 v 12:17 odesílatel Pavel Stehule 
mailto:pavel.steh...@gmail.com>> napsal:




pá 9. 10. 2020 v 11:40 odesílatel Peter Eisentraut
mailto:peter.eisentr...@2ndquadrant.com>> napsal:

On 2020-09-22 20:29, Pavel Stehule wrote:
> The result is correct. When I tried to use UNION instead
UNION ALL, the
> pg crash

I fixed the crash, but UNION [DISTINCT] won't actually work
here because
row/record types are not hashable.  I'm leaving the partial
support in,
but I'm documenting it as currently not supported.

 I think so UNION is a common solution against the cycles. So
missing support for this specific case is not a nice thing. How
much work is needed for hashing rows. It should not be too much code.


> looks so clause USING in cycle detection is unsupported for
DB2 and
> Oracle - the examples from these databases doesn't work on
PG without
> modifications

Yeah, the path clause is actually not necessary from a user's
perspective, but it's required for internal bookkeeping.  We
could
perhaps come up with a mechanism to make it invisible coming
out of the
CTE (maybe give the CTE a target list internally), but that
seems like a
separate project.

The attached patch fixes the issues you have reported (also
the view
issue from the other email).  I have also moved the whole rewrite
support to a new file to not blow up rewriteHandler.c so much.

-- 
Peter Eisentraut http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training &
Services


This patch is based on transformation CYCLE and SEARCH clauses to 
specific expressions - it is in agreement with ANSI SQL


There is not a problem with compilation
Nobody had objections in discussion
There are enough regress tests and documentation
check-world passed
doc build passed

I'll mark this patch as ready for committer

Possible enhancing for this feature (can be done in next steps)

1. support UNION DISTINCT
2. better compatibility with Oracle and DB2 (USING clause can be optional)

Regards

Pavel





Status update for a commitfest entry.
According to cfbot, the patch no longer applies, so I moved it to Waiting 
on Author.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

2020-10-27 Thread Anastasia Lubennikova
Status update for a commitfest entry.

This patch is ReadyForCommitter. It applies and passes the CI. There are no 
unanswered questions in the discussion. 

The discussion started in 2015 with a patch by Jeff Janes. Later it was revived 
by Pavan Deolasee, then picked up by Ibrar Ahmed, and finally it was 
rewritten by me, so I moved myself from reviewers to authors as well.

The latest version was reviewed and tested by Ibrar Ahmed. The patch doesn't 
affect COPY FREEZE performance and significantly decreases the time of the 
following VACUUM.

Commitfest 2020-11

2020-10-26 Thread Anastasia Lubennikova

Hello, hackers!

November commitfest will start just in a few days.
I'm happy to volunteer to be the CFM for this one. With a help of 
Georgios Kokolatos [1].


It's time to register your patch in the commitfest, if you haven't already.

If you already have a patch in the commitfest, update its status and 
make sure it still applies and that the tests pass. Check the state at  
http://cfbot.cputube.org/


If there is a long-running stale discussion, please send a short summary 
update about its current state, open questions, and TODOs. I hope it 
will encourage reviewers to pay more attention to the thread.


[1] 
https://www.postgresql.org/message-id/AxH0n_zLwwJ0MBN3uJpHfYDkV364diOGhtpLAv0OC0qHLN8ClyPsbRi1fSUAJLJZzObZE_y1qc-jqGravjIMoxVrdtLm74HmTUeIPWWkmSg%3D%40pm.me


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Commitfest manager 2020-11

2020-10-25 Thread Anastasia Lubennikova

On 20.10.2020 10:30, gkokola...@pm.me wrote:





‐‐‐ Original Message ‐‐‐

[snip]

Wow, that was well in advance) I am willing to assist if you need any help.


Indeed it was a bit early. I left for vacation after that. For the record, I am 
newly active to the community. In our PUG, in Stockholm, we held a meetup 
during which a contributor presented ways to contribute to the community, one 
of which is becoming CFM. So, I thought of picking up the recommendation.

I have taken little part in CFs as reviewer/author and I have no idea how a CF 
is actually run. A contributor from Stockholm has been willing to mentor me to 
the part.

Since you have both the knowledge and specific ideas on improving the CF, how 
about me assisting you? I could shadow you and you can groom me to the part, so 
that I can take the lead to a future CF more effectively.

This is just a suggestion of course. I am happy with anything that can help the 
community as a whole.

Even though I've worked a lot with the community, I have never been a CFM 
before either. So I think we can just follow these articles:


https://www.2ndquadrant.com/en/blog/managing-a-postgresql-commitfest/
https://wiki.postgresql.org/wiki/CommitFest_Checklist

Some parts are a bit outdated, but in general the checklist is clear. 
I've already requested CFM privileges in pgsql-www, and I'm going to 
spend next week sending pings and updates to the patches in the commitfest.


There are already 219 patches, so I would appreciate it if you joined me in 
this task.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-22 Thread Anastasia Lubennikova

On 22.10.2020 04:25, Michael Paquier wrote:

On Thu, Oct 22, 2020 at 12:47:03AM +0300, Anastasia Lubennikova wrote:

We can also read such pages via shared buffers to be 100% sure.

Yeah, but this has its limits as well.  One can use
ignore_checksum_failure, but this can actually be very dangerous as
you can finish by loading into shared buffers a page that has a header
thought as sane but with a large portion of its page randomly
corrupted, spreading corruption around and leading to more fancy
logic failures in Postgres, with more panic from customers.  Not using
ignore_checksum_failure is safer, but it makes an analyze of the
damages for a given file harder as things would stop at the first
failure of a file with a seqscan.  pg_prewarm can help here, but
that's not the goal of the tool to do that either.

I was thinking about applying this only to pages with LSN > startLSN.

Most such pages are valid and already in memory, because they were 
changed just recently, so there is no need for pg_prewarm here. If such an 
LSN appeared because of data corruption, page verification from inside 
ReadBuffer() will report an error first. In the proposed function, we can 
handle this error in any fashion we want. Something like:


if (PageGetLSN(page) > startptr)
{
    if (!read_page_via_buffercache())
    {
        /* throw a warning about the corrupted page */
        /* handle the checksum error as needed */
    }
    else
    {
        /* page is valid, no worries */
    }
}

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Allow some recovery parameters to be changed with reload

2020-10-21 Thread Anastasia Lubennikova

On 10.08.2020 23:20, Robert Haas wrote:

On Sun, Aug 9, 2020 at 1:21 AM Michael Paquier  wrote:

Sorry for the late reply.  I have been looking at that stuff again,
and restore_command can be called in the context of a WAL sender
process within the page_read callback of logical decoding via
XLogReadDetermineTimeline(), as readTimeLineHistory() could look for a
timeline history file.  So restore_command is not used only in the
startup process.

Hmm, interesting. But, does that make this change wrong, apart from
the comments? Like, in the case of primary_conninfo, maybe some
confusion could result if the startup process decided whether to ask
for a WAL receiver based on thinking primary_conninfo being set, while
that process thought that it wasn't actually set after all, as
previously discussed in
http://postgr.es/m/ca+tgmozvmjx1+qtww2tsnphrnkwkzxc3zsrynfb-fpzm1ox...@mail.gmail.com
... but what's the corresponding hazard here, exactly? It doesn't seem
that there's any way in which the decision one process makes affects
the decision the other process makes. There's still a race condition:
it's possible for a walsender

Did you mean walreceiver here?

  to use the old restore_command after the
startup process had already used the new one, or the other way around.
However, it doesn't seem like that should confuse anything inside the
server, and therefore I'm not sure we need to code around it.
I came up with the following scenario. Let's say we have xlog files 1, 2, 3 in 
dir1 and files 4, 5 in dir2. If the startup process has only handled files 1 
and 2 before we switch restore_command from reading dir1 to reading dir2, it 
will fail to find the next file. IIUC, it will assume that recovery is done 
and start the server and the walreceiver. The walreceiver will fail as 
well. I don't know how realistic this case is, though.
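
For illustration, with the patch such a switch could be done like this, 
assuming hot standby connections are available (the paths are made up; 
otherwise one would edit postgresql.conf and reload):

-- Illustrative only: the archive is split across two directories.
ALTER SYSTEM SET restore_command = 'cp /mnt/archive_dir2/%f "%p"';
SELECT pg_reload_conf();   -- previously this required a restart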


In general, this feature looks useful and consistent with previous 
changes, so I am interested in pushing it forward.
Sergey, could you please attach this thread to the upcoming CF if 
you're going to continue working on it?


 A few more questions:
- RestoreArchivedFile() is also used by pg_rewind. I don't see any 
particular problem with it; I just want to note that we should test it too.
- How will it interact with possible future optimizations of archive 
restore? For example, WAL prefetch [1].


 [1] 
https://www.postgresql.org/message-id/flat/601ee1f5-0b78-47e1-9aae-c15f74a1c...@postgrespro.ru


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-21 Thread Anastasia Lubennikova

On 20.10.2020 09:24, Michael Paquier wrote:

On Fri, Oct 16, 2020 at 06:02:52PM +0300, Anastasia Lubennikova wrote:

In the current patch, PageIsVerifed called from pgbasebackup simply doesn't
Should we change this too? I don't see any difference.

I considered that, but now that does not seem worth bothering here.


Done.

Thanks for the updated patch.  I have played with dd and some fake
files to check the validity of the zero-case (as long as pd_lsn does
not go wild).  Here are some comments, with an updated patch attached:
- Added a PageIsVerifiedExtended to make this patch back-patchable,
with the original routine keeping its original shape.
Thank you. I always forget about this. Do we have any checklist for such 
changes that patch authors and reviewers can use?

- I did not like much to show the WARNING from PageIsVerified() for
the report, and we'd lose some context related to the base backups.
The important part is to know which blocks from which files are found
as problematic.
- Switched the checks to have PageIsVerified() placed first in the
hierarchy, so as we do all the basic validity checks before a look at
the LSN.  This is not really change things logically.
- The meaning of block_retry is also the opposite of what it should be
in the original code?  IMO, the flag should be set to true if we still
are fine to retry a block, and set it to false once the one-time retry
has failed.


Agree.


- The error strings are not really consistent with the project style
in this area.  These are usually not spawned across multiple lines to
ease searches with grep or such.
Same question as above. I don't see this info about formatting in the 
error message style guide in the documentation. Is it mentioned somewhere else?

Anastasia, Michael B, does that look OK to you?

The final patch looks good to me.

NB: This patch fixes only one problem, the zero-page case, as it was
the intention of Michael B to split this part into its own thread.  We
still have, of course, a second problem when it comes to a random LSN
put into the page header which could mask an actual checksum failure
so this only takes care of half the issues.  Having a correct LSN
approximation is a tricky problem IMO, but we could improve things by
having some delta with an extra WAL page on top of GetInsertRecPtr().
And this function cannot be used for base backups taken from
standbys.


We can also read such pages via shared buffers to be 100% sure.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [PATCH] Add extra statistics to explain for Nested Loop

2020-10-16 Thread Anastasia Lubennikova

On 16.10.2020 12:07, Julien Rouhaud wrote:
Le ven. 16 oct. 2020 à 16:12, Pavel Stehule <mailto:pavel.steh...@gmail.com>> a écrit :




pá 16. 10. 2020 v 9:43 odesílatel mailto:e.sokol...@postgrespro.ru>> napsal:

Hi, hackers.
For some distributions of data in tables, different loops in
nested loop
joins can take different time and process different amounts of
entries.
It makes average statistics returned by explain analyze not
very useful
for DBA.
To fix it, here is the patch that add printing of min and max
statistics
for time and rows across all loops in Nested Loop to EXPLAIN
ANALYSE.
Please don't hesitate to share any thoughts on this topic!


+1

This is great feature - sometimes it can be pretty messy current
limited format


+1, this can be very handy!


Cool.
I have added your patch to the commitfest, so it won't get lost.
https://commitfest.postgresql.org/30/2765/

I will review the code next week. Unfortunately, I cannot give any 
feedback about the usability of this feature.


The user-visible change is:

-   ->  Nested Loop (actual rows=N loops=N)
+  ->  Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2)

Pavel, Julien, could you please say if it looks good?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Commitfest manager 2020-11

2020-10-16 Thread Anastasia Lubennikova

On 16.10.2020 21:57, Tom Lane wrote:

Anastasia Lubennikova  writes:

On the other hand, I noticed a lot of stale threads that haven't been
updated in months. Some of them seem to pass through several CFs without any
activity at all. I believe that this is wrong for many reasons, the major
one of which IMHO is the frustration of the authors. Can we come up with
something to improve this situation?

Yeah, that's a perennial problem.  Part of the issue is just a shortage
of people --- there are always more patches than we can review and
commit in one month.  IMO, another cause is that we have a hard time
saying "no".  If a particular patch isn't too well liked, we tend to
just let it slide to the next CF rather than making the uncomfortable
decision to reject it.  If you've got thoughts about that, or any other
ways to improve the process, for sure speak up.



From a CFM perspective, we can try the following things:

- Write recaps for long-running threads, listing open questions and TODOs.
This one is my personal pain. Some threads do look scary and it is less 
likely that someone will even start a review if they have to catch up 
with a year-long discussion of 10 people.


- Mark patches from first-time contributors with some tag.
Probably, these entries are simple enough to be handled faster. 
It will also be a good reminder to be a bit less demanding with 
beginners. See Dmitry's statistics about how many people have sent a patch 
only once [1].


- Proactively ask committers whether they are going to work on the upcoming 
CF and whether they will need any specific help.
Maybe we can also ask about their preferred code areas and check what is 
left uncovered. It's really bad if there is no one working on, let's say, 
WAL internals during the CF. TBH, I have no idea what we are going to do 
with this knowledge, but I think it's better to know.


- From time to time call a council of several committers and make tough 
decisions about patches that are in discussion for too long (let's say 4 
commitfests).
Hopefully, it will be easier to reach a consensus in a "real-time" 
discussion, or we can toss a coin. This problem was raised in previous 
discussions too [2].


[1] 
https://www.postgresql.org/message-id/CA+q6zcXtg7cFwX-c+BoOwk65+jtR-sQGZ=1mqg-vgmvzuh8...@mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/CAA8%3DA7-owFLugBVZ0JjehTZJue7brEs2qTjVyZFRDq-B%3D%2BNwNg%40mail.gmail.com



--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Commitfest manager 2020-11

2020-10-16 Thread Anastasia Lubennikova

On 24.08.2020 16:08, gkokola...@pm.me wrote:

Hi all,

Admittedly quite ahead of time, I would like to volunteer as Commitfest manager 
for 2020-11.

If the role is not filled and there are no objections, I can reach out again in 
October for confirmation.

//Georgios


Wow, that was well in advance) I am willing to assist if you need any help.

I was looking for this message, to find out who is the current CFM. 
Apparently, the November commitfest is not in progress yet.


Still, I have a question. Should we also maintain the statuses of the 
patches in the "Open" commitfest? 21 patches have already been committed 
during this CF, which shows that even the "open" CF is quite active. I've 
updated a few patches that were sent by my colleagues. If there are no 
objections, I can do that for other entries too.


On the other hand, I noticed a lot of stale threads that haven't been 
updated in months. Some of them seem to pass through several CFs without any 
activity at all. I believe that this is wrong for many reasons, the major 
one of which IMHO is the frustration of the authors. Can we come up with 
something to improve this situation?


P.S. I have a few more ideas about CF management. I suppose they are 
usually discussed at PGCon meetings, but those won't happen anytime soon. 
Is there a special place for such discussions, or may I continue in this 
thread?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-16 Thread Anastasia Lubennikova

On 07.10.2020 11:18, Michael Paquier wrote:

On Fri, Sep 25, 2020 at 08:53:27AM +0200, Michael Banck wrote:

Oh right, I've fixed up the commit message in the attached V4.

Not much a fan of what's proposed here, for a couple of reasons:
- If the page is not new, we should check if the header is sane or
not.
- It may be better to actually count checksum failures if these are
repeatable in a given block, and report them.
- The code would be more simple with a "continue" added for a block
detected as new or with a LSN newer than the backup start.
- The new error messages are confusing, and I think that these are
hard to translate in a meaningful way.

I am interested in moving this patch forward, so here is the updated v5.
This version uses PageIsVerified. All error messages are left unchanged.

So I think that we should try to use PageIsVerified() directly in the
code path of basebackup.c, and this requires a couple of changes to
make the routine more extensible:
- Addition of a dboid argument for pgstat_report_checksum_failure(),
whose call needs to be changed to use the *_in_db() flavor.


In the current patch, PageIsVerified called from pg_basebackup simply 
doesn't report failures to pgstat, because in basebackup.c we already 
report checksum failures separately, once per file:

        pgstat_report_checksum_failures_in_db(dboid, checksum_failures);

Should we change this too? I don't see any difference.


- Addition of an option to decide if a log should be generated or
not.
- Addition of an option to control if a checksum failure should be
reported to pgstat or not, even if this is actually linked to the
previous point, I have seen cases where being able to control both
separately would be helpful, particularly the log part.

For the last two ones, I would use an extensible argument based on
bits32 with a set of flags that the caller can set at will.

Done.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 553d62dbcdec1a00139f3c5ab6a325ed857b6c9d
Author: anastasia 
Date:   Fri Oct 16 17:36:02 2020 +0300

Fix checksum verification in base backups for zero page headers

We currently do not checksum a page if it is considered new by PageIsNew().
However, this means that if the whole page header is zero, PageIsNew() will
consider that page new due to the pd_upper field being zero and no subsequent
checksum verification is done.

To fix, use PageIsVerified() in pg_basebackup code.

Reported-By: Andres Freund
Author: Michael Banck
Reviewed-By: Asif Rehman, Anastasia Lubennikova
Discussion: https://postgr.es/m/20190319200050.ncuxejradurja...@alap3.anarazel.de

diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index dbbd3aa31f..b4e050c9b8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -443,7 +443,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
 
 		smgrread(src, forkNum, blkno, buf.data);
 
-		if (!PageIsVerified(page, blkno))
+		if (!PageIsVerified(page, blkno, PIV_LOG_WARNING | PIV_REPORT_STAT))
 			ereport(ERROR,
 	(errcode(ERRCODE_DATA_CORRUPTED),
 	 errmsg("invalid page in block %u of relation %s",
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..aa8d52c1aa 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1682,11 +1682,14 @@ sendFile(const char *readfilename, const char *tarfilename,
  * this case. We also skip completely new pages, since they
  * don't have a checksum yet.
  */
-if (!PageIsNew(page) && PageGetLSN(page) < startptr)
+if (PageGetLSN(page) < startptr)
 {
-	checksum = pg_checksum_page((char *) page, blkno + segmentno * RELSEG_SIZE);
-	phdr = (PageHeader) page;
-	if (phdr->pd_checksum != checksum)
+	int piv_flags = (checksum_failures < 5) ? (PIV_LOG_WARNING) : 0;
+	/*
+	 * Do not report checksum failures to pgstats from
+	 *  PageIsVerified, since we will do it later.
+	 */
+	if (!PageIsVerified(page, blkno, piv_flags))
 	{
 		/*
 		 * Retry the block on the first failure.  It's
@@ -1732,13 +1735,6 @@ sendFile(const char *readfilename, const char *tarfilename,
 
 		checksum_failures++;
 
-		if (checksum_failures <= 5)
-			ereport(WARNING,
-	(errmsg("checksum verification failed in "
-			"file \"%s\", block %d: calculated "
-			"%X but expected %X",
-			readfilename, blkno, checksum,
-			phdr->pd_checksum)));
 		if (checksum_failures == 5)
 			ereport(WARNING,
 	(errmsg("further checksum verification "
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e549fa1d30..971c96c7de 100644
--- a/src/backend/storage/buffer/bufmg

Re: Deleting older versions in unique indexes to avoid page splits

2020-10-14 Thread Anastasia Lubennikova

On 08.10.2020 02:48, Peter Geoghegan wrote:

On Tue, Jun 30, 2020 at 5:03 PM Peter Geoghegan  wrote:

Attached is a POC patch that teaches nbtree to delete old duplicate
versions from unique indexes. The optimization targets non-HOT
duplicate version bloat. Although the patch is rather rough, it
nevertheless manages to more or less eliminate a whole class of index
bloat: Unique index bloat from non-HOT updates in workloads where no
transaction lasts for more than a few seconds.

I'm slightly surprised that this thread didn't generate more interest
back in June. After all, maintaining the pristine initial state of
(say) a primary key index even after many high throughput non-HOT
updates (i.e. avoiding "logically unnecessary" page splits entirely)
is quite appealing. It arguably goes some way towards addressing long
held criticisms of our approach to MVCC. Especially if it can be
generalized to all b-tree indexes -- the Uber blog post mentioned
tables that have several indexes, which presumably means that there
can be no HOT updates (the author of the blog post didn't seem to be
aware of HOT at all).
The idea seems very promising, especially when extended to handle 
non-unique indexes too.

I've been trying to generalize my approach to work with all indexes. I
think that I can find a strategy that is largely effective at
preventing version churn page splits that take place with workloads
that have many non-HOT updates, without any serious downsides for
workloads that do not benefit. I want to get feedback on that now,
since I expect that it will be controversial. Teaching indexes about
how tuples are versioned or chaining tuples seems like a non-starter,
so the trick will be to come up with something that behaves in
approximately the same way as that in cases where it helps.

The approach I have in mind is to pass down a hint from the executor
to btinsert(), which lets nbtree know that the index tuple insert is
in fact associated with a non-HOT update. This hint is only given when
the update did not logically modify the columns that the index covers
That's exactly what I wanted to discuss after your first message. If we 
could make (non-)HOT updates index-specific, I think it could improve 
performance a lot.
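
For example (an illustrative schema only), in a workload like this every 
update is non-HOT because it changes an indexed column, yet the primary key 
value never changes, so the PK index accumulates duplicate versions it does 
not really need:

CREATE TABLE accounts (
    id         int PRIMARY KEY,
    balance    numeric,
    updated_at timestamptz
);
CREATE INDEX ON accounts (updated_at);

-- updated_at is indexed, so this update cannot be HOT, and a new,
-- logically unchanged entry is added to the primary key index as well
UPDATE accounts SET balance = balance + 1, updated_at = now() WHERE id = 1;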

Here is the maybe-controversial part: The algorithm initially assumes
that all indexes more or less have the same properties as unique
indexes from a versioning point of view, even though that's clearly
not true. That is, it initially assumes that the only reason why there
can be duplicates on any leaf page it looks at is because some
previous transaction also did a non-HOT update that added a new,
unchanged index tuple. The new algorithm only runs when the hint is
passed down from the executor and when the only alternative is to
split the page (or have a deduplication pass), so clearly there is
some justification for this assumption -- it really is highly unlikely
that this update (which is on the verge of splitting the page) just so
happened to be the first such update that affected the page.
To be blunt: It may be controversial that we're accessing multiple
heap pages while holding an exclusive lock on a leaf page, in the
hopes that we can avoid a page split, but without any certainty that
it'll work out.

Sometimes (maybe even most of the time), this assumption turns out to
be mostly correct, and we benefit in the obvious way -- no
"unnecessary" page splits for affected non-unique indexes. Other times
it won't work out, usually because the duplicates are in fact logical
duplicates from distinct logical rows.
I think that this optimization can affect low-cardinality indexes 
negatively, but it is hard to estimate the impact without tests. Maybe it 
won't be a big deal, given that we attempt to eliminate old copies rather 
infrequently and that low-cardinality b-trees are already not very useful. 
Besides, we can always make this optional, so that users can tune it to 
their workload.



I wonder how this new feature will interact with physical replication. 
A replica may have quite a different performance profile. For example, there 
can be long-running queries that currently prevent vacuum from removing 
recently-dead rows. How will we handle the same situation with this 
optimized deletion?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: MultiXact\SLRU buffers configuration

2020-10-07 Thread Anastasia Lubennikova

On 28.09.2020 17:41, Andrey M. Borodin wrote:

Hi, Anastasia!


28 авг. 2020 г., в 23:08, Anastasia Lubennikova  
написал(а):

1) The first patch is sensible and harmless, so I think it is ready for 
committer. I haven't tested the performance impact, though.

2) I like the initial proposal to make various SLRU buffers configurable, 
however, I doubt if it really solves the problem, or just moves it to another 
place?

The previous patch you sent was based on some version that contained changes for other 
slru buffers numbers: 'multixact_offsets_slru_buffers' and 
'multixact_members_slru_buffers'. Have you just forgot to attach them? The patch message 
"[PATCH v2 2/4]" hints that you had 4 patches)
Meanwhile, I attach the rebased patch to calm down the CFbot. The changes are 
trivial.

2.1) I think that both min and max values for this parameter are too extreme. 
Have you tested them?

+   _local_cache_entries,
+   256, 2, INT_MAX / 2,

2.2) MAX_CACHE_ENTRIES is not used anymore, so it can be deleted.

3) No changes for third patch. I just renamed it for consistency.

Thank you for your review.

Indeed, I had 4th patch with tests, but these tests didn't work well: I still 
did not manage to stress SLRUs to reproduce problem from production...

You are absolutely correct in point 2: I did only tests with sane values. And 
observed extreme performance degradation with values ~ 64 megabytes. I do not 
know which highest values should we pick? 1Gb? Or highest possible functioning 
value?


I would go with the values that we consider adequate for this setting. 
As far as I can see, there is no strict rule about it in guc.c, and many 
variables have very wide bounds. Anyway, we need to explain the choice at 
least in the documentation and code comments.


It seems that the default was conservative enough, so it could serve as 
the minimum value too. As for the maximum, can you provide any benchmark 
results? If we see a peak followed by a noticeable performance degradation, 
we can use that to calculate the preferable maximum value.
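
For reference, here is roughly how the bounds would end up looking in the 
guc.c entry (the variable name, default, and the upper limit are purely 
illustrative, not a recommendation):

{
    {"multixact_local_cache_entries", PGC_POSTMASTER, RESOURCES_MEM,
        gettext_noop("Sets the size of the local MultiXact cache."),
        NULL
    },
    &multixact_local_cache_entries,
    256, 2, 1024 * 1024,    /* boot value, min, max */
    NULL, NULL, NULL
},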


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [PATCH] Automatic HASH and LIST partition creation

2020-10-06 Thread Anastasia Lubennikova

On 06.10.2020 00:21, Pavel Borisov wrote:

Hi, hackers!
I added some extra tests for different cases of use of automatic 
partition creation.
v3-0002 can be applied on top of the original v2 patch for correct 
work with some corner cases with constraints included in this test.



Thank you for the tests. I've added them and the fix into the patch.

I also noticed that some table parameters, such as persistence, were not 
propagated to auto-generated partitions. This is fixed now. The test cases 
for temp and unlogged auto-partitioned tables are updated accordingly.
Besides, I slightly refactored the code and fixed the documentation typos 
that were reported by Rahila.


With my recent changes, one test statement that you had added as 
failing now works:


CREATE TABLE list_parted_fail (a int) PARTITION BY LIST (a) CONFIGURATION
(VALUES IN ('1' collate "POSIX"));

It simply ignores the COLLATE "POSIX" part and creates a table with the 
following structure:



   Partitioned table "public.list_parted_fail"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats 
target | Description

+-+---+--+-+-+--+-
 a  | integer |   |  | | plain 
|  |

Partition key: LIST (a)
Partitions: list_parted_fail_0 FOR VALUES IN (1)

Do you think that it is a bug? For now, I removed this statement from the 
tests just to calm down the CI.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 4a387cfbd43e93b2c2e363307fb9c9ca53c3f56e
Author: anastasia 
Date:   Tue Oct 6 20:23:22 2020 +0300

Auto generated HASH and LIST partitions.

New syntax:

CREATE TABLE tbl_hash (i int) PARTITION BY HASH (i)
CONFIGURATION (modulus 3);

CREATE TABLE tbl_list (i int) PARTITION BY LIST (i)
CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION tbl_default);

With tests and documentation draft

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 28f844071b..4501e81bb5 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -29,6 +29,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 ] )
 [ INHERITS ( parent_table [, ... ] ) ]
 [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
+[ CONFIGURATION ( partition_bound_auto_spec ) ]
 [ USING method ]
 [ WITH ( storage_parameter [= value] [, ... ] ) | WITHOUT OIDS ]
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
@@ -41,6 +42,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 [, ... ]
 ) ]
 [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
+[ CONFIGURATION ( partition_bound_auto_spec ) ]
 [ USING method ]
 [ WITH ( storage_parameter [= value] [, ... ] ) | WITHOUT OIDS ]
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
@@ -53,6 +55,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 [, ... ]
 ) ] { FOR VALUES partition_bound_spec | DEFAULT }
 [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
+[ CONFIGURATION ( partition_bound_auto_spec ) ]
 [ USING method ]
 [ WITH ( storage_parameter [= value] [, ... ] ) | WITHOUT OIDS ]
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
@@ -96,6 +99,11 @@ FROM ( { partition_bound_expr | MIN
   TO ( { partition_bound_expr | MINVALUE | MAXVALUE } [, ...] ) |
 WITH ( MODULUS numeric_literal, REMAINDER numeric_literal )
 
+and partition_bound_auto_spec is:
+
+VALUES IN ( partition_bound_expr [, ...] ), [( partition_bound_expr [, ...] )] [, ...] [DEFAULT PARTITION default_part_name]
+MODULUS numeric_literal
+
 index_parameters in UNIQUE, PRIMARY KEY, and EXCLUDE constraints are:
 
 [ INCLUDE ( column_name [, ... ] ) ]
@@ -383,6 +391,11 @@ WITH ( MODULUS numeric_literal, REM
   however, you can define these constraints on individual partitions.
  
 
+ 
+  Hash and list partitioning also support automatic creation of partitions
+  with an optional CONFIGURATION clause.
+
+
  
   See  for more discussion on table
   partitioning.
@@ -391,6 +404,42 @@ WITH ( MODULUS numeric_literal, REM
 

 
+   
+CONFIGURATION ( partition_bound_auto_spec ) ] 
+
+ 
+  The optional CONFIGURATION clause used together
+  with PARTITION BY specifies a rule of generating bounds
+  for partitions of the partitioned table. All partitions are created automatically
+  along with the parent table.
+  
+  Any indexes, constraints and user-defined row-level triggers that exist
+  in the parent table are cloned on the new pa

Re: [PATCH] Automatic HASH and LIST partition creation

2020-10-01 Thread Anastasia Lubennikova

On 30.09.2020 22:58, Rahila Syed wrote:

Hi Anastasia,

I tested the syntax with some basic commands and it works fine; 
regression tests also pass.



Thank you for your review.

Couple of comments:
1. The syntax used omits the { IMMEDIATE | DEFERRED } keywords suggested in 
the earlier discussions. I think it is intuitive to include IMMEDIATE with 
the current implementation, so that the syntax can be extended with a 
DEFERRED clause in the future for dynamic partitions.


  CREATE TABLE tbl_lst (i int) PARTITION BY LIST (i)
 CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION
tbl_default);

After some consideration, I decided that we don't actually need to 
introduce the IMMEDIATE | DEFERRED keywords. For hash and list partitions, 
creation will always be immediate, as the number of partitions cannot change 
after we initially set it. For range partitions, on the contrary, it 
doesn't make much sense to create partitions immediately, because in many 
use cases one bound will be open.


2. One suggestion for the generation of partition names is to append a 
unique id to avoid conflicts.


Can you please give an example of such a conflict? I agree that the current 
naming scheme is far from perfect, but I think that 'tablename'_partnum 
provides a unique name for each partition.
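
For illustration, under the current scheme a statement like the following 
(a hypothetical example, not taken from the regression tests) generates 
partition names by appending the partition number to the table name:

    CREATE TABLE tbl_hash (i int) PARTITION BY HASH (i)
        CONFIGURATION (modulus 3);
    -- auto-generated partitions: tbl_hash_0, tbl_hash_1, tbl_hash_2

A conflict would require a pre-existing relation that already uses one of 
those generated names.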




3. Probably, here you meant to write "list and hash" instead of "range and 
list", as per the current state.

     
     Range and list partitioning also support automatic creation
of partitions
      with an optional CONFIGURATION clause.
    

4. Typo in default_part_name

+VALUES IN ( partition_bound_expr [, ...] ), [(
partition_bound_expr
[, ...] )] [, ...] [DEFAULT PARTITION defailt_part_name]
+MODULUS numeric_literal



Yes, you're right. I will fix these typos in the next version of the patch.


Thank you,
Rahila Syed



--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [PATCH] Automatic HASH and LIST partition creation

2020-09-24 Thread Anastasia Lubennikova

On 24.09.2020 06:27, Michael Paquier wrote:

On Mon, Sep 14, 2020 at 02:38:56PM +0300, Anastasia Lubennikova wrote:

Fixed. This was also caught by cfbot. This version should pass it clean.

Please note that regression tests are failing, because of 6b2c4e59.
--
Michael


Thank you. The updated patch is attached.

Open issues for review:
- new syntax;
- generation of partition names;
- overall patch review and testing, especially with complex partitioning 
clauses.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit faa5805b839effd9d8220eff787fb0995276c370
Author: anastasia 
Date:   Mon Sep 14 11:34:42 2020 +0300

Auto generated HASH and LIST partitions.

New syntax:

CREATE TABLE tbl_hash (i int) PARTITION BY HASH (i)
CONFIGURATION (modulus 3);

CREATE TABLE tbl_list (i int) PARTITION BY LIST (i)
CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION tbl_default);

With documentation draft.

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 087cad184c..ff9a7eda09 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -29,6 +29,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 ] )
 [ INHERITS ( parent_table [, ... ] ) ]
 [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
+[ CONFIGURATION ( partition_bound_auto_spec ) ]
 [ USING method ]
 [ WITH ( storage_parameter [= value] [, ... ] ) | WITHOUT OIDS ]
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
@@ -41,6 +42,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 [, ... ]
 ) ]
 [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
+[ CONFIGURATION ( partition_bound_auto_spec ) ]
 [ USING method ]
 [ WITH ( storage_parameter [= value] [, ... ] ) | WITHOUT OIDS ]
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
@@ -53,6 +55,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 [, ... ]
 ) ] { FOR VALUES partition_bound_spec | DEFAULT }
 [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
+[ CONFIGURATION ( partition_bound_auto_spec ) ]
 [ USING method ]
 [ WITH ( storage_parameter [= value] [, ... ] ) | WITHOUT OIDS ]
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
@@ -96,6 +99,11 @@ FROM ( { partition_bound_expr | MIN
   TO ( { partition_bound_expr | MINVALUE | MAXVALUE } [, ...] ) |
 WITH ( MODULUS numeric_literal, REMAINDER numeric_literal )
 
+and partition_bound_auto_spec is:
+
+VALUES IN ( partition_bound_expr [, ...] ), [( partition_bound_expr [, ...] )] [, ...] [DEFAULT PARTITION defailt_part_name]
+MODULUS numeric_literal
+
 index_parameters in UNIQUE, PRIMARY KEY, and EXCLUDE constraints are:
 
 [ INCLUDE ( column_name [, ... ] ) ]
@@ -383,6 +391,11 @@ WITH ( MODULUS numeric_literal, REM
   however, you can define these constraints on individual partitions.
  
 
+ 
+  Range and list partitioning also support automatic creation of partitions
+  with an optional CONFIGURATION clause.
+
+
  
   See  for more discussion on table
   partitioning.
@@ -391,6 +404,38 @@ WITH ( MODULUS numeric_literal, REM
 

 
+   
+CONFIGURATION ( partition_bound_auto_spec ) ] 
+
+ 
+  The optional CONFIGURATION clause used together
+  with PARTITION BY specifies a rule of generating bounds
+  for partitions of the partitioned table. All partitions are created automatically
+  along with the parent table.
+  
+  Any indexes, constraints and user-defined row-level triggers that exist
+  in the parent table are cloned on the new partitions.
+ 
+
+ 
+  The partition_bound_auto_spec
+  must correspond to the partitioning method and partition key of the
+  parent table, and must not overlap with any existing partition of that
+  parent.  The form with VALUES IN is used for list partitioning 
+  and the form with MODULUS is used for hash partitioning.
+  List partitioning can also provide a default partition using 
+  DEFAULT PARTITION.
+ 
+
+ 
+ Automatic range partitioning is not supported yet.
+ 
+
+
+
+
+   
+

 PARTITION OF parent_table { FOR VALUES partition_bound_spec | DEFAULT }
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 0409a40b82..6893fa5495 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4628,6 +4628,7 @@ _copyPartitionSpec(const PartitionSpec *from)
 
 	COPY_STRING_FIELD(strategy);
 	COPY_NODE_FIELD(partParams);
+	COPY_NODE_FIELD(autopart);
 	COPY_LOCATION_FIELD(location);
 
 	return newnode;
@@ -4650,6

Re: [patch] Fix checksum verification in base backups for zero page headers

2020-09-24 Thread Anastasia Lubennikova



On 22.09.2020 17:30, Michael Banck wrote:

Hi,

Am Dienstag, den 22.09.2020, 16:26 +0200 schrieb Michael Banck:

Am Mittwoch, den 02.09.2020, 16:50 +0300 schrieb Anastasia Lubennikova:

I've looked through the previous discussion. As far as I understood it, most of
the controversy was about online checksum improvements.

The warning about pd_upper inconsistency that you've added is a good
addition. The patch is a bit messy, though, because a huge code block
was shifted.

Will it be different if you just leave the
  "if (!PageIsNew(page) && PageGetLSN(page) < startptr)"
block as it was, and add
  "else if (PageIsNew(page) && !PageIsZero(page))" ?

Thanks, that does indeed look better as a patch, and I think it's fine
as-is for the code as well; I've attached a v2.

Sorry, I forgot to add you as a reviewer in the proposed commit message;
I've fixed that up now in V3.


Michael


Great. This version looks good to me.
Thank you for answering my questions; I agree that we can work on them 
in separate threads.


So I mark this one as ReadyForCommitter.

The only minor problem is a typo (?) in the proposed commit message.
"If a page is all zero, consider that a checksum failure." It should be 
"If a page is NOT all zero...".


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [PATCH] Automatic HASH and LIST partition creation

2020-09-14 Thread Anastasia Lubennikova
and I'd like to meet 
this functionality in v14.
I also hope that this patch will make it to v14, but for now, I don't 
see a consensus on the syntax and some details, so I wouldn't rush.


Besides, it definitely needs more testing. I haven't thoroughly tested the 
following cases yet:

- how triggers and constraints are propagated to partitions;
- how it handles some tricky clauses in the list partitioning expr_list;
and so on.

Also, there is an open question about partition naming. Currently, the 
patch implements dummy %tbl_%partnum name generation, which is far from 
user-friendly. I think we must provide some hook or trigger function to 
rename partitions after they are created.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 08c5f8b35c4ffcaf09b8189cd8d0dc27ce76d715
Author: anastasia 
Date:   Mon Sep 14 11:34:42 2020 +0300

Auto generated HASH and LIST partitions.

New syntax:

CREATE TABLE tbl_hash (i int) PARTITION BY HASH (i)
CONFIGURATION (modulus 3);

CREATE TABLE tbl_list (i int) PARTITION BY LIST (i)
CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION tbl_default);

With documentation draft.

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 087cad184c..ff9a7eda09 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -29,6 +29,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 ] )
 [ INHERITS ( parent_table [, ... ] ) ]
 [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
+[ CONFIGURATION ( partition_bound_auto_spec ) ]
 [ USING method ]
 [ WITH ( storage_parameter [= value] [, ... ] ) | WITHOUT OIDS ]
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
@@ -41,6 +42,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 [, ... ]
 ) ]
 [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
+[ CONFIGURATION ( partition_bound_auto_spec ) ]
 [ USING method ]
 [ WITH ( storage_parameter [= value] [, ... ] ) | WITHOUT OIDS ]
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
@@ -53,6 +55,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 [, ... ]
 ) ] { FOR VALUES partition_bound_spec | DEFAULT }
 [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
+[ CONFIGURATION ( partition_bound_auto_spec ) ]
 [ USING method ]
 [ WITH ( storage_parameter [= value] [, ... ] ) | WITHOUT OIDS ]
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
@@ -96,6 +99,11 @@ FROM ( { partition_bound_expr | MIN
   TO ( { partition_bound_expr | MINVALUE | MAXVALUE } [, ...] ) |
 WITH ( MODULUS numeric_literal, REMAINDER numeric_literal )
 
+and partition_bound_auto_spec is:
+
+VALUES IN ( partition_bound_expr [, ...] ), [( partition_bound_expr [, ...] )] [, ...] [DEFAULT PARTITION defailt_part_name]
+MODULUS numeric_literal
+
 index_parameters in UNIQUE, PRIMARY KEY, and EXCLUDE constraints are:
 
 [ INCLUDE ( column_name [, ... ] ) ]
@@ -383,6 +391,11 @@ WITH ( MODULUS numeric_literal, REM
   however, you can define these constraints on individual partitions.
  
 
+ 
+  Range and list partitioning also support automatic creation of partitions
+  with an optional CONFIGURATION clause.
+
+
  
   See  for more discussion on table
   partitioning.
@@ -391,6 +404,38 @@ WITH ( MODULUS numeric_literal, REM
 

 
+   
+CONFIGURATION ( partition_bound_auto_spec ) ] 
+
+ 
+  The optional CONFIGURATION clause used together
+  with PARTITION BY specifies a rule of generating bounds
+  for partitions of the partitioned table. All partitions are created automatically
+  along with the parent table.
+  
+  Any indexes, constraints and user-defined row-level triggers that exist
+  in the parent table are cloned on the new partitions.
+ 
+
+ 
+  The partition_bound_auto_spec
+  must correspond to the partitioning method and partition key of the
+  parent table, and must not overlap with any existing partition of that
+  parent.  The form with VALUES IN is used for list partitioning 
+  and the form with MODULUS is used for hash partitioning.
+  List partitioning can also provide a default partition using 
+  DEFAULT PARTITION.
+ 
+
+ 
+ Automatic range partitioning is not supported yet.
+ 
+
+
+
+
+   
+

 PARTITION OF parent_table { FOR VALUES partition_bound_spec | DEFAULT }
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 0409a40b82..6893fa5495 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.

Re: Making index_set_state_flags() transactional

2020-09-11 Thread Anastasia Lubennikova

On 03.09.2020 11:04, Michael Paquier wrote:

Hi all,

For a couple of things I looked at lately, it would be really useful
to make index_state_set_flags() transactional and replace its use of
heap_inplace_update() by CatalogTupleUpdate():
- When dropping an index used in a replica identity, we could make the
reset of relreplident for the parent table consistent with the moment
the index is marked as invalid:
https://www.postgresql.org/message-id/20200831062304.ga6...@paquier.xyz
- In order to make CREATE INDEX CONCURRENTLY work for partitioned
tables, we need to be able to switch indisvalid for all the index
partitions in one final transaction so as we have a consistent
partition tree at any state of the operation.  Obviously,
heap_inplace_update() is a problem here.

The restriction we have in place now can be lifted thanks to 813fb03,
that removed SnapshotNow, and as far as I looked at the code paths
doing scans of pg_index, I don't see any reasons why we could not make
this stuff transactional.  Perhaps that's just because nobody had use
cases where it would be useful, and I have two of them lately.  Note
that 3c84046 was the original commit that introduced the use of
heap_inplace_update() and index_set_state_flags(), but we don't have
SnapshotNow anymore.  Note as well that we already make the index
swapping transactional in REINDEX CONCURRENTLY for the pg_index
entries.

I got the feeling that this is an issue different than the other
threads where this was discussed, which is why I am beginning a new
thread.  Any thoughts?  Attached is a proposal of patch.

Thanks,
--
Michael

As this patch is needed to continue the work started in the 
"reindex/cic/cluster of partitioned tables" thread, I took a look at it.


I agree that a transactional update in index_set_state_flags() is good for 
both cases that you mentioned, and I don't see any restrictions. It seems 
that not doing this before was just a loose end, as the comment from the 
813fb03 patch clearly stated that it is safe to make this update 
transactional.
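
Just to spell out the pattern being agreed on, here is a minimal sketch of 
the transactional write-back, written from the usual syscache-copy idiom 
rather than quoted from the patch (variable names are illustrative):

    Relation      pg_index;
    HeapTuple     indexTuple;
    Form_pg_index indexForm;

    pg_index = table_open(IndexRelationId, RowExclusiveLock);

    /* fetch a writable copy instead of updating the tuple in place */
    indexTuple = SearchSysCacheCopy1(INDEXRELID, ObjectIdGetDatum(indexId));
    if (!HeapTupleIsValid(indexTuple))
        elog(ERROR, "cache lookup failed for index %u", indexId);
    indexForm = (Form_pg_index) GETSTRUCT(indexTuple);

    /* ... flip indisvalid / indisready / indislive on indexForm as requested ... */

    /* transactional update, replacing heap_inplace_update() */
    CatalogTupleUpdate(pg_index, &indexTuple->t_self, indexTuple);

    heap_freetuple(indexTuple);
    table_close(pg_index, RowExclusiveLock);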


The patch itself looks clear and committable.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: 回复:how to create index concurrently on partitioned table

2020-09-07 Thread Anastasia Lubennikova

On 07.09.2020 04:58, Michael Paquier wrote:

On Fri, Sep 04, 2020 at 09:51:13AM +0900, Michael Paquier wrote:

Glad to hear that, please take the time you need.

Attached is a rebased version to address the various conflicts after
844c05a.
--
Michael


Thank you.

With the fix for the cycle in reindex_partitions() and new comments 
added, I think this patch is ready for commit.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: history file on replica and double switchover

2020-09-03 Thread Anastasia Lubennikova

On 27.08.2020 16:02, Grigory Smolkin wrote:

Hello!

I've noticed that when switching a replica over to master and back 
to replica, the new history file is streamed to the replica but not archived, 
which is not great, because it breaks PITR if archiving is running on the 
replica. The fix looks trivial.

A bash script to reproduce the problem and the patch are attached.


Thanks for the report. I agree that it looks like a bug.

For some reason, the patch failed to apply on the current master, even though 
I don't see any difference in the code.

I'll add this thread to the next commitfest, so it doesn't get lost.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: 回复:how to create index concurrently on partitioned table

2020-09-03 Thread Anastasia Lubennikova

On 02.09.2020 04:39, Michael Paquier wrote:


The problem with dropped relations in REINDEX has been addressed by
1d65416, so I have gone through this patch again and simplified the
use of session locks, these being taken only when doing a REINDEX
CONCURRENTLY for a given partition.  This part is in a rather
committable shape IMO, so I would like to get it done first, before
looking more at the other cases with CIC and CLUSTER.  I am still
planning to go through it once again.
--
Michael


Thank you for advancing this work.

I was reviewing the previous version, but the review became outdated 
before I sent it. The overall design is fine, but I see a bunch of things 
that need to be fixed before commit.


First of all, this patch fails at cfbot:

indexcmds.c:2848:7: error: variable ‘parentoid’ set but not used [-Werror=unused-but-set-variable]
  Oid parentoid;
      ^


It seems to be just a typo. With this minor fix the patch compiles and 
passes tests.


diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 75008eebde..f5b3c53a83 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -2864,7 +2864,7 @@ reindex_partitions(Oid relid, int options, bool 
concurrent,
 
/* Save partition OID */

old_context = MemoryContextSwitchTo(reindex_context);
-   partitions = lappend_oid(partitions, partoid);
+   partitions = lappend_oid(partitions, parentoid);
MemoryContextSwitchTo(old_context);
}


If this guessed fix is correct, I see a problem in the patch logic. In 
reindex_partitions() we collect parent relations to pass them to 
reindex_multiple_internal(). This implicitly changes the logic from 
REINDEX INDEX to REINDEX RELATION, which is not the same thing if the table 
has more than one index. For example, if I add one more index to a 
partitioned table from the create_index.sql test:


create index idx on concur_reindex_part_0_2 using btree (c2);

and call

REINDEX INDEX CONCURRENTLY concur_reindex_part_index;

idx will be reindexed as well. I doubt that it is the desired behavior.


A few more random issues that I noticed at first glance:

1) in create_index.sql

Are these two lines intentional checks that must fail? If so, I propose 
to add a comment.


REINDEX TABLE concur_reindex_part_index;
REINDEX TABLE CONCURRENTLY concur_reindex_part_index;

A few lines around also look like they were copy-pasted and need a 
second look.


2)  This part of ReindexRelationConcurrently() is misleading.

        case RELKIND_PARTITIONED_TABLE:
        case RELKIND_PARTITIONED_INDEX:
        default:
            /* Return error if type of relation is not supported */
            ereport(ERROR,
                    (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                     errmsg("cannot reindex this type of relation 
concurrently")));


Maybe an assertion is enough. It seems that we should never reach this 
code, because both ReindexTable and ReindexIndex handle those relkinds 
separately. Which leads me to the next question.


3) Is there a reason why reindex_partitions() is not inside 
ReindexRelationConcurrently()? I think that it logically belongs there.


4) I haven't tested multi-level partitions yet. In any case, it would be 
good to document this behavior explicitly.



I need a bit more time to review this patch more thoroughly. Please wait 
for it before committing.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [patch] Fix checksum verification in base backups for zero page headers

2020-09-02 Thread Anastasia Lubennikova

On 01.09.2020 13:22, Michael Banck wrote:

Hi,

as a continuation of [1], I've split off the zero page header case from
the last patch, as this one seems less contentious.


Michael

[1] https://commitfest.postgresql.org/28/2308/

I've looked through the previous discussion. As far as I understood it, most of 
the controversy was about online checksum improvements.


The warning about pd_upper inconsistency that you've added is a good 
addition. The patch is a bit messy, though, because a huge code block 
was shifted.


Will it be different if you just leave the
    "if (!PageIsNew(page) && PageGetLSN(page) < startptr)"
block as it was, and add
    "else if (PageIsNew(page) && !PageIsZero(page))" ?


While at it, I also have a few questions about the surrounding code:

1) Maybe we can use other header sanity checks from PageIsVerified() as 
well? Or even the function itself.


2) > /* Reset loop to validate the block again */
How many times do we try to reread the block? Is one reread enough?
Speaking of which, the 'reread_cnt' name looks confusing to me. I would 
expect this variable to hold the number of attempts, not the number 
of bytes read.


If everyone agrees that for basebackup purposes it's enough to rely on a 
single reread, I'm OK with it.
Another approach is to read the page directly from shared buffers to 
ensure that the page is fine. This is more complicated, but should be 
almost bullet-proof. It would also let us check pages with lsn >= 
startptr.


3) Judging by the warning messages, we count checksum failures per file, not 
per page, and stop reporting after the fifth failure. Why is that? Is it a 
common case that so many pages of one file are corrupted?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: REINDEX SCHEMA/DATABASE/SYSTEM weak with dropped relations

2020-09-01 Thread Anastasia Lubennikova

On 01.09.2020 04:38, Michael Paquier wrote:

I have added some extra comments.  There is one in
ReindexRelationConcurrently() to mention that there should be no extra
use of MISSING_OK once the list of indexes is built as session locks
are taken where needed.
Great; it took me a moment to understand the logic around the index list 
check on the first pass.

Does the version attached look fine to you?  I have done one round of
indentation while on it.


Yes, this version is good.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: REINDEX SCHEMA/DATABASE/SYSTEM weak with dropped relations

2020-08-31 Thread Anastasia Lubennikova

On 13.08.2020 07:38, Michael Paquier wrote:

Hi all,

While working on support for REINDEX for partitioned relations, I have
noticed an old bug in the logic of ReindexMultipleTables(): the list
of relations to process is built in a first transaction, and then each
table is done in an independent transaction, but we don't actually
check that the relation still exists when doing the real work.  I
think that a complete fix involves two things:
1) Addition of one SearchSysCacheExists1() at the beginning of the
loop processing each item in the list in ReindexMultipleTables().
This would protect from a relation dropped, but that would not be
enough if ReindexMultipleTables() is looking at a relation dropped
before we lock it in reindex_table() or ReindexRelationConcurrently().
Still that's simple, cheaper, and would protect from most problems.
2) Be completely water-proof and adopt a logic close to what we do for
VACUUM where we try to open a relation, but leave if it does not
exist.  This can be achieved with just some try_relation_open() calls
with the correct lock used, and we also need to have a new
REINDEXOPT_* flag to control this behavior conditionally.

2) and 1) are complementary, but 2) is invasive, so based on the lack
of complaints we have seen that does not seem really worth
backpatching to me, and I think that we could also just have 1) on
stable branches as a minimal fix, to take care of most of the
problems that could show up to users.

Attached is a patch to fix all that, with a cheap isolation test that
fails on HEAD with a cache lookup failure.  I am adding that to the
next CF for now, and I would rather fix this issue before moving on
with REINDEX for partitioned relations as fixing this issue reduces
the use of session locks for partition trees.

Any thoughts?
--
Michael


Hi,
I reviewed the patch. It does work, and the code is clean and sane. It 
implements logic that we already have in CLUSTER, so I think this was 
simply an oversight. Thank you for fixing this.
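
For reference, the skip-if-dropped behaviour being added is essentially the 
pattern below (only a sketch of the idiom, with an illustrative lock level 
and variable names, not code quoted from the patch):

    /* open the relation only if it still exists; skip it otherwise */
    rel = try_relation_open(relid, ShareLock);
    if (rel == NULL)
        continue;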


I noticed that REINDEXOPT_MISSING_OK can be passed to the TOAST table 
reindex. I think it would be better to reset the flag in this 
reindex_relation() call, as we don't expect a concurrent DROP here.


    /*
     * If the relation has a secondary toast rel, reindex that too while we
     * still hold the lock on the main table.
     */
    if ((flags & REINDEX_REL_PROCESS_TOAST) && OidIsValid(toast_relid))
        result |= reindex_relation(toast_relid, flags, options);
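
What I have in mind is roughly the following (just a sketch, assuming the 
REINDEXOPT_MISSING_OK bit lives in the options bitmask at this point):

    /*
     * The TOAST relation cannot be dropped concurrently while we hold the
     * lock on the main table, so don't pass the missing-ok behaviour down.
     */
    if ((flags & REINDEX_REL_PROCESS_TOAST) && OidIsValid(toast_relid))
        result |= reindex_relation(toast_relid, flags,
                                   options & ~REINDEXOPT_MISSING_OK);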


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




