RE: Bug in query rewriter - hasModifyingCTE not getting set

2021-05-19 Thread houzj.f...@fujitsu.com
From: Tom Lane 
> Greg Nancarrow  writes:
> > On Sun, Feb 7, 2021 at 10:03 AM Tom Lane  wrote:
> >> I think either the bit about rule_action is unnecessary, or most of
> >> the code immediately above this is wrong, because it's only updating
> >> flags in sub_action.  Why do you think it's necessary to change
> >> rule_action in addition to sub_action?
> 
> > I believe that the bit about rule_action IS necessary, as it's needed
> > for the case of INSERT...SELECT, so that hasModifyingCTE is set on the
> > rewritten INSERT (see comment above the call to
> > getInsertSelectQuery(), and the "KLUDGE ALERT" comment within that
> > function).
> 
> Hm.  So after looking at this more, the problem is that the rewrite is 
> producing
> something equivalent to
> 
> INSERT INTO bug6051_2
> (WITH t1 AS (DELETE FROM bug6051 RETURNING *) SELECT * FROM t1);
> 
> If you try to do that directly, the parser will give you the raspberry:
> 
> ERROR:  WITH clause containing a data-modifying statement must be at the
> top level LINE 2: (WITH t1 AS (DELETE FROM bug6051 RETURNING *) SELECT *
> FROM ...
>   ^
> 
> The code throwing that error, in analyzeCTE(), explains
> 
> /*
>  * We disallow data-modifying WITH except at the top level of a query,
>  * because it's not clear when such a modification should be executed.
>  */
> 
> That semantic issue doesn't get any less pressing just because the query was
> generated by rewrite.  So I now think that what we have to do is throw an 
> error
> if we have a modifying CTE and sub_action is different from rule_action.  Not
> quite sure how to phrase the error though.

I am +1 for throwing an error if we have a modifying CTE and sub_action is different
from rule_action. Since we already disallow data-modifying CTEs that are not at the top
level of a query, it will be safe and consistent to disallow the same case here.

Maybe we can output a message like the following?
"DO INSTEAD INSERT ... SELECT rules are not supported for an INSERT that contains
data-modifying statements in WITH."
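
For reference, the scenario under discussion looks roughly like the bug6051
regression test referenced upthread (paraphrased sketch, not the exact test file):

CREATE TABLE bug6051 (i int);
CREATE TABLE bug6051_2 (i int);
CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD
  INSERT INTO bug6051_2 SELECT NEW.i;

-- The WITH is legal at the top level here, but after the DO INSTEAD rule is
-- applied, the rewriter effectively pushes it below the INSERT ... SELECT,
-- producing the disallowed form shown above.
WITH t1 AS (DELETE FROM bug6051 RETURNING *)
INSERT INTO bug6051 SELECT * FROM t1;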

Best regards,
houzj





Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-19 Thread Andy Fan
On Wed, May 19, 2021 at 8:15 PM David Rowley  wrote:

> On Mon, 17 May 2021 at 14:52, Andy Fan  wrote:
> > Would marking the new added RestrictInfo.norm_selec > 1 be OK?
>
> There would be cases you'd want to not count the additional clauses in
> the selectivity estimation and there would be cases you would want to.
>
> For example:
>
> SELECT ... FROM t1 INNER JOIN t2 ON t1.dt = t2.dt WHERE t1.dt BETWEEN
> 'date1' AND 'date2';
>
> If you derived that t2.dt is also BETWEEN 'date1' AND 'date2' then
> you'd most likely want to include those quals for scans feeding merge,
> hash and non-parameterized nested loop joins, so you'd also want to
> count them in your selectivity estimations, else you'd feed junk
> values into the join selectivity estimations.
>
>
Yes, you are correct.
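
To spell out the derivation being discussed (illustrative dates; the second
query shows the qual that equivclass.c would derive, not something the user
writes by hand):

SELECT *
FROM t1
INNER JOIN t2 ON t1.dt = t2.dt
WHERE t1.dt BETWEEN '2021-01-01' AND '2021-01-31';

-- with the derived clause included, the scans behave as if the query had been:
SELECT *
FROM t1
INNER JOIN t2 ON t1.dt = t2.dt
WHERE t1.dt BETWEEN '2021-01-01' AND '2021-01-31'
  AND t2.dt BETWEEN '2021-01-01' AND '2021-01-31';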


> Parameterized nested loop joins might be different, as if you were
> looking up t1.dt values on some index on t2.dt, then
> you'd likely not want to bother also filtering out the between clause
> values too. They're redundant in that case.
>
>
I do not fully understand this.


> I imagined we'd have some functions in equivclass.c that allows you to
> choose if you wanted the additional filters or not.
>

Sounds like a good idea.


>
> Tom's example, WHERE a = b AND a IN (1,2,3), if a and b were in the
> same relation then you'd likely never want to include the additional
> quals.  The only reason I could think that it would be a good idea is
> if "b" had an index but "a" didn't.  I've not checked the code, but
> the index matching code might already allow that to work anyway.
>
>
+1 for this feature overall.

-- 
Best Regards
Andy Fan (https://www.aliyun.com/)


Re[3]: On login trigger: take three

2021-05-19 Thread Ivan Panchenko

Hi,
I have updated the patch for version 14.
>Tuesday, March 16, 2021, 14:32 +03:00 from Ivan Panchenko :
> 
>Hi,
>
>Thank you, Konstantin, for this very good feature with numerous use cases.
>Please find the modified patch attached.
>
>I’ve added the ‘enable_client_connection_trigger’ GUC to the sample config 
>file and also an additional example page to the docs.
>make check-world has passed and the patch is ready for a committer.
> 
>>Thursday, January 28, 2021, 12:04 +03:00 from Konstantin Knizhnik < 
>>k.knizh...@postgrespro.ru >:
>> 
>>
>>
>>On 28.01.2021 5:47, Amit Kapila wrote:
>>> On Mon, Dec 28, 2020 at 5:46 PM Masahiko Sawada < sawada.m...@gmail.com > 
>>> wrote:
 On Sat, Dec 26, 2020 at 4:04 PM Pavel Stehule < pavel.steh...@gmail.com > 
 wrote:
>
>
> On Sat, Dec 26, 2020 at 8:00, Pavel Stehule < pavel.steh...@gmail.com 
> > wrote:
>> Hi
>>
>>
>>> Thank you.
>>> I have applied all your fixes in on_connect_event_trigger-12.patch.
>>>
>>> Concerning enable_client_connection_trigger GUC, I think that it is 
>>> really useful: it is the fastest and simplest way to disable login 
>>> triggers in case
>>> of some problems with them (not only for superuser itself, but for all 
>>> users). Yes, it can be also done using "ALTER EVENT TRIGGER DISABLE".
>>> But assume that you have a lot of databases with different login 
>>> policies enforced by on-login event triggers. And you want to temporarily 
>>> disable them all, for example for testing purposes.
>>> In this case the GUC is the most convenient way to do it.
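
For reference, the two approaches contrasted above look roughly like this (the
trigger name is hypothetical; the GUC is the one added by this patch):

-- per trigger, per database:
ALTER EVENT TRIGGER my_login_trigger DISABLE;
-- the proposed GUC, e.g. set in postgresql.conf, switches them all off at once:
SET enable_client_connection_trigger = off;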
>>>
>> There was a typo in the patch
>>
>> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
>> index f810789..8861f1b 100644
>> --- a/doc/src/sgml/config.sgml
>> +++ b/doc/src/sgml/config.sgml
>> @@ -1,4 +1,4 @@
>> -
>> +\
>>
>> I don't have any objections against the functionality or design. I tested the 
>> performance, and there is no negative impact when this feature is not 
>> used. There is significant overhead related to plpgsql runtime 
>> initialization, but when this trigger is used, then probably some 
>> other PL/pgSQL procedures and functions will be used too, and then this 
>> overhead can be ignored.
>>
>> * make without warnings
>> * make check-world passed
>> * doc build passed
>>
>> Possible ToDo:
>>
>> The documentation could contain a note that using connect triggers in 
>> environments with short-lived sessions and very short, fast queries that 
>> don't otherwise use PL/pgSQL functions or procedures can have a negative 
>> impact on performance, due to the overhead of initializing the PL/pgSQL engine.
>>
>> I'll mark this patch as ready for committers
>
> It looks like this patch has no entry in the commitfest app for 2021-01
>
 Yeah, please register this patch before the next CommitFest[1] starts,
 2021-01-01 AoE[2].

>>> Konstantin, did you register this patch in any CF? Even though the
>>> reviewer seems to be happy with the patch, I am afraid that we might
>>> lose track of this unless we register it.
>>Yes, certainly:
>>https://commitfest.postgresql.org/31/2900/
>>
>>--
>>Konstantin Knizhnik
>>Postgres Professional:  http://www.postgrespro.com
>>The Russian Postgres Company
>>
>>  
> 
> 
> 
>  
 
 
 
 diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index b33e59d5e4..2bb5804e76 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -182,7 +182,7 @@
 { oid = '1', oid_symbol = 'TemplateDbOid',
   descr = 'database\'s default template',
   datname = 'template1', encoding = 'ENCODING', datcollate = 'LC_COLLATE',
-  datctype = 'LC_CTYPE', datistemplate = 't', datallowconn = 't',
+  datctype = 'LC_CTYPE', datistemplate = 't', datallowconn = 't', dathaslogontriggers = 'f',
   datconnlimit = '-1', datlastsysoid = '0', datfrozenxid = '0',
   datminmxid = '1', dattablespace = 'pg_default', datacl = '_null_' },
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 6d06ad22b9..7e0113f1e8 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2968,6 +2968,17 @@ SCRAM-SHA-256$iteration count:
   
  
 
+ 
+  
+   dathaslogontriggers bool
+  
+  
+Indicates that there are client connection triggers defined for this database.
+This flag is used to avoid an extra lookup of the pg_event_trigger table on each backend startup.
+This flag is used internally by Postgres and should not be changed manually by the DBA or by applications.
+  
+ 
+
  
   
datconnlimit int4
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7e32b0686c..d6d9b3eb34 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1035,6 +1035,24 @@ include_dir 'conf.d'
   
  
 
+ 
+  enable_client_connection_trigger (boolean)
+  
+   enable_client_connection_trigger 

Re: Transactions involving multiple postgres foreign servers, take 2

2021-05-19 Thread Masahiro Ikeda


On 2021/05/11 13:37, Masahiko Sawada wrote:
> I've attached the updated patches that incorporated comments from
> Zhihong and Ikeda-san.

Thanks for updating the patches!


I have other comments including trivial things.


a. about "foreign_transaction_resolver_timeout" parameter

Currently, the default value of "foreign_transaction_resolver_timeout" is 60 seconds.
Is there any reason for that? Although the following is a minor case, it may confuse
some users.

An example case:

1. A client executes a transaction with 2PC while the resolver is processing
FdwXactResolverProcessInDoubtXacts().

2. The resolution of the 1st transaction has to wait until the other 2PC
transactions are executed or the timeout expires.

3. If the client checks the 1st transaction's result, it has to wait until the
resolution is finished to get atomic visibility (although this depends on how
atomic visibility is realized). The client may end up waiting up to
"foreign_transaction_resolver_timeout". Users may think it's stalled.

A situation like this can be observed after testing with pgbench: some
unresolved transactions remain after benchmarking.

I assume that this default value follows wal_sender, archiver, and so on.
But I think this parameter is more like "commit_delay"; if so, 60 seconds
seems too large.


b. about performance bottleneck (just share my simple benchmark results)

The resolver process can easily become a performance bottleneck, although I think
some users want this feature even if the performance is not so good.

I tested with a very simple workload on my laptop.

The test conditions are:
* two remote foreign partitions, and one transaction inserts an entry into each
partition.
* local connections only. If the network latency became higher, the performance
became worse.
* pgbench with 8 clients.
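
A rough sketch of this kind of setup (illustrative only; remote1/remote2 are
assumed to be postgres_fdw servers defined beforehand):

CREATE TABLE t (id int, v text) PARTITION BY RANGE (id);
CREATE FOREIGN TABLE t_p0 PARTITION OF t
  FOR VALUES FROM (0) TO (1000000) SERVER remote1;
CREATE FOREIGN TABLE t_p1 PARTITION OF t
  FOR VALUES FROM (1000000) TO (2000000) SERVER remote2;

-- each transaction touches both remote servers, so the commit needs 2PC:
BEGIN;
INSERT INTO t VALUES (1, 'a');        -- routed to remote1
INSERT INTO t VALUES (1000001, 'b');  -- routed to remote2
COMMIT;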

The test results are the following. The performance with 2PC is only about 10%
of the performance without 2PC.

* with foreign_twophase_commit = required
-> If loaded with more than 10 TPS, the number of unresolved foreign transactions
keeps increasing and the run stops with the warning "Increase
max_prepared_foreign_transactions".

* with foreign_twophase_commit = disabled
-> 122 TPS in my environment.


c. v36-0001-Introduce-transaction-manager-for-foreign-transa.patch

* typo: s/tranasction/transaction/

* Would it be better to move AtEOXact_FdwXact() in AbortTransaction() to before "if
(IsInParallelMode())", to make the order the same as in CommitTransaction()?

* function names in fdwxact.c

This may just be my feeling, but "xact" means transaction. If you feel the
same, the function names FdwXactRegisterXact and so on look odd to me.
Would FdwXactRegisterEntry or FdwXactRegisterParticipant be better?

* Would the following be better?

- s/to register the foreign transaction by/to register the foreign transaction
participant by/

- s/The registered foreign transactions/The registered participants/

- s/given foreign transaction/given foreign transaction participant/

- s/Foreign transactions involved in the current transaction/Foreign
transaction participants involved in the current transaction/


Regards,

-- 
Masahiro Ikeda
NTT DATA CORPORATION




Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-05-19 Thread Bharath Rupireddy
On Wed, May 19, 2021 at 6:13 PM Bharath Rupireddy
 wrote:
> Thanks. I will work on the new structure ParseSubOption only for
> subscription options.

PSA v2 patch that has changes for 1) the new ParseSubOptions structure and 2)
the error-reporting code refactoring.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From b72336b7dc803c4ebd49d25850c0b15d8fbdbbed Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 20 May 2021 09:03:37 +0530
Subject: [PATCH v2] Refactor parse_subscription_options

Currently the parse_subscription_options function receives a lot (14) of input
parameters, which makes it hard to extend with new parameters. So, it is better
to wrap all the input parameters in a ParseSubOptions structure to which new
parameters can be added easily.

Also, remove the near-duplicate code (differing only in the option name) used
when throwing errors for mutually exclusive options. Have a variable for the
option and use it in the error message. It saves some LOC and makes the
code look better, with fewer ereport(ERROR) statements.
---
 src/backend/commands/subscriptioncmds.c | 348 +---
 1 file changed, 187 insertions(+), 161 deletions(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 8aa6de1785..feb9074436 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -51,6 +51,26 @@ static void check_duplicates_in_publist(List *publist, Datum *datums);
 static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
 static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
 
+/*
+ * Structure to hold subscription options for parsing
+ */
+typedef struct ParseSubOptions
+{
+	List 	*stmt_options;
+	bool 	*connect;
+	bool 	*enabled_given;
+	bool 	*enabled;
+	bool 	*create_slot;
+	bool 	*slot_name_given;
+	char 	**slot_name;
+	bool 	*copy_data;
+	char 	**synchronous_commit;
+	bool 	*refresh;
+	bool 	*binary_given;
+	bool 	*binary;
+	bool 	*streaming_given;
+	bool 	*streaming;
+} ParseSubOptions;
 
 /*
  * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -60,16 +80,7 @@ static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname,
  * accommodate that.
  */
 static void
-parse_subscription_options(List *options,
-		   bool *connect,
-		   bool *enabled_given, bool *enabled,
-		   bool *create_slot,
-		   bool *slot_name_given, char **slot_name,
-		   bool *copy_data,
-		   char **synchronous_commit,
-		   bool *refresh,
-		   bool *binary_given, bool *binary,
-		   bool *streaming_given, bool *streaming)
+parse_subscription_options(ParseSubOptions *opts)
 {
 	ListCell   *lc;
 	bool		connect_given = false;
@@ -78,45 +89,46 @@ parse_subscription_options(List *options,
 	bool		refresh_given = false;
 
 	/* If connect is specified, the others also need to be. */
-	Assert(!connect || (enabled && create_slot && copy_data));
+	Assert(!opts->connect || (opts->enabled && opts->create_slot &&
+			  opts->copy_data));
 
-	if (connect)
-		*connect = true;
-	if (enabled)
+	if (opts->connect)
+		*opts->connect = true;
+	if (opts->enabled)
 	{
-		*enabled_given = false;
-		*enabled = true;
+		*opts->enabled_given = false;
+		*opts->enabled = true;
 	}
-	if (create_slot)
-		*create_slot = true;
-	if (slot_name)
+	if (opts->create_slot)
+		*opts->create_slot = true;
+	if (opts->slot_name)
 	{
-		*slot_name_given = false;
-		*slot_name = NULL;
+		*opts->slot_name_given = false;
+		*opts->slot_name = NULL;
 	}
-	if (copy_data)
-		*copy_data = true;
-	if (synchronous_commit)
-		*synchronous_commit = NULL;
-	if (refresh)
-		*refresh = true;
-	if (binary)
+	if (opts->copy_data)
+		*opts->copy_data = true;
+	if (opts->synchronous_commit)
+		*opts->synchronous_commit = NULL;
+	if (opts->refresh)
+		*opts->refresh = true;
+	if (opts->binary)
 	{
-		*binary_given = false;
-		*binary = false;
+		*opts->binary_given = false;
+		*opts->binary = false;
 	}
-	if (streaming)
+	if (opts->streaming)
 	{
-		*streaming_given = false;
-		*streaming = false;
+		*opts->streaming_given = false;
+		*opts->streaming = false;
 	}
 
 	/* Parse options */
-	foreach(lc, options)
+	foreach(lc, opts->stmt_options)
 	{
 		DefElem*defel = (DefElem *) lfirst(lc);
 
-		if (strcmp(defel->defname, "connect") == 0 && connect)
+		if (strcmp(defel->defname, "connect") == 0 && opts->connect)
 		{
 			if (connect_given)
 ereport(ERROR,
@@ -124,19 +136,19 @@ parse_subscription_options(List *options,
 		 errmsg("conflicting or redundant options")));
 
 			connect_given = true;
-			*connect = defGetBoolean(defel);
+			*opts->connect = defGetBoolean(defel);
 		}
-		else if (strcmp(defel->defname, "enabled") == 0 && enabled)
+		else if (strcmp(defel->defname, "enabled") == 0 && opts->enabled)
 		{
-			if (*enabled_given)
+			if (*opts->enabled_given)
 ereport(ERROR,
 		

Adaptive Plan Sharing for PreparedStmt

2021-05-19 Thread Andy Fan
Currently we use the custom/generic plan strategy to handle the data skew
issue. However, it doesn't work well all the time. For example: SELECT *
FROM t WHERE a BETWEEN $1 AND $2. We assume the selectivity is 0.0025,
but users may provide a large range every time. Per our current strategy,
a generic plan will be chosen, and an index scan on a will be chosen. Oops.
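
A minimal illustration of the problem (table and index names are made up):

CREATE TABLE t (a int, b text);
CREATE INDEX ON t (a);
PREPARE q(int, int) AS SELECT * FROM t WHERE a BETWEEN $1 AND $2;

-- once the generic plan is chosen, it assumes a fixed, small selectivity:
EXECUTE q(1, 10);         -- narrow range: the index scan on a is a good fit
EXECUTE q(1, 10000000);   -- wide range: the same index-scan plan can be a poor fit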

I think something like Oracle's Adaptive Cursor Sharing should work. It first
calculates the selectivity with the real bind values and then generates/reuses
different plans based on the similarity of the selectivities. The challenges I
can think of now are:
a). How to define the similarity.
b). How to adjust the similarity during the real run. For example, say we treat
[1% ~ 10%] as similar, but we find that a selectivity of 20% uses the same plan
as 10%. What should be done here?


I am searching for the best place to invest effort on the optimizer side, and
the above idea is the best one I can think of now. Any thoughts?


Thanks
-- 
Best Regards
Andy Fan (https://www.aliyun.com/)


Re: Query about time zone patterns in to_char

2021-05-19 Thread Suraj Kharage
+1 for the change.

I quickly reviewed the patch and overall it looks good to me.
Few cosmetic suggestions:

1:
+RESET timezone;
+
+
 CREATE TABLE TIMESTAMPTZ_TST (a int , b timestamptz);

Extra line.

2:
+SET timezone = '00:00';
+SELECT to_char(now(), 'of') as "Of", to_char(now(), 'tzh:tzm') as
"tzh:tzm";

The 'O' should be lowercase in the alias, just for consistency.

I am not sure whether we should backport this or not but I don't see any
issues with back-patching.
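
For reference, the behaviour under review can be compared like this (the
lower-case spellings only produce the offset with the proposed patch applied):

SET timezone = '00:00';
SELECT to_char(now(), 'TZH:TZM') AS "TZH:TZM",   -- upper case: documented, works today
       to_char(now(), 'tzh:tzm') AS "tzh:tzm";   -- lower case: the variant being discussed
RESET timezone;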

On Sun, May 16, 2021 at 9:43 PM Nitin Jadhav 
wrote:

> > AFAICS, table 9.26 specifically shows which case-variants are supported.
> > If there are some others that happen to work, we probably shouldn't
> > remove them for fear of breaking poorly-written apps ... but that does
> > not imply that we need to support every case-variant.
>
> Thanks for the explanation. I also feel that we may not support every
> case-variant. But the other reason which triggered me to think in the
> other way is, as mentioned in commit [1] where this feature was added,
> says that these format patterns are compatible with Oracle. Whereas
> Oracle supports both upper case and lower case patterns. I just wanted
> to get it confirmed with this point before concluding.
>
> [1] -
> commit 11b623dd0a2c385719ebbbdd42dd4ec395dcdc9d
> Author: Andrew Dunstan 
> Date:   Tue Jan 9 14:25:05 2018 -0500
>
> Implement TZH and TZM timestamp format patterns
>
> These are compatible with Oracle and required for the datetime template
> language for jsonpath in an upcoming patch.
>
> Nikita Glukhov and Andrew Dunstan, reviewed by Pavel Stehule.
>
> Thanks & Regards,
> Nitin Jadhav
>
> On Sun, May 16, 2021 at 8:40 PM Tom Lane  wrote:
> >
> > Nitin Jadhav  writes:
> > > While understanding the behaviour of the to_char() function as
> > > explained in [1], I observed that some patterns related to time zones
> > > do not display values if we mention in lower case. As shown in the
> > > sample output [2], time zone related patterns TZH, TZM and OF outputs
> > > proper values when specified in upper case but does not work if we
> > > mention in lower case. But other patterns like TZ, HH, etc works fine
> > > with upper case as well as lower case.
> >
> > > I would like to know whether the current behaviour of TZH, TZM and OF
> > > is done intentionally and is as expected.
> >
> > AFAICS, table 9.26 specifically shows which case-variants are supported.
> > If there are some others that happen to work, we probably shouldn't
> > remove them for fear of breaking poorly-written apps ... but that does
> > not imply that we need to support every case-variant.
> >
> > regards, tom lane
>
>
>

-- 
Thanks & Regards,
Suraj Kharage,
edbpostgres.com


Re: Installation of regress.so?

2021-05-19 Thread Tom Lane
Michael Paquier  writes:
> Could it be possible to install regress.so at least in the same
> location as pg_regress?

I don't think this is a great idea.  Aside from the fact that
we'd be littering the install tree with a .so of no use to end
users, I'm failing to see how it really gets you anywhere unless
you want to further require regress.so from back versions to be
loadable into the current server.

regards, tom lane




Installation of regress.so?

2021-05-19 Thread Michael Paquier
Hi all,

While diving into a transformation of the tests of pg_upgrade to TAP,
I am getting annoyed by the fact that regress.so is needed if you
upgrade an older instance that holds the regression objects from the
main regression test suite.  The buildfarm code is using a trick to
copy regress.so from the source code tree of the old instance into its
installation.  See in PGBuild/Modules/TestUpgradeXversion.pm:
# at some stage we stopped installing regress.so
copy "$self->{pgsql}/src/test/regress/regress.so",
 "$installdir/lib/postgresql/regress.so"
 unless (-e "$installdir/lib/postgresql/regress.so");

This creates a hard dependency on the source code of the old
instance when attempting to create an old instance based on a dump,
which is what the buildfarm does, and something that I'd like to get
support for in the TAP tests of pg_upgrade in the tree.

Could it be possible to install regress.so at least in the same
location as pg_regress?  This would still require the test to
move regress.so into a location from which the backend can load the
library, but at least the library would be accessible without a
dependency on the source tree of the old instance being upgraded from.  To
make that really usable, this would require a backpatch, though..

Thoughts?
--
Michael


signature.asc
Description: PGP signature


RE: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump

2021-05-19 Thread Pengchengliu
Hi Greg,
   Thanks a lot for your explanation and your fix.

   I think your fix can resolve the core dump issue. With your fix, the parallel
process resets the transaction xmin from the ActiveSnapshot.
   But it will change the transaction snapshot for all parallel scenarios, and I
don't know whether that brings in other issues.
   For testing only, I think it is enough.

   So, can anybody explain what exactly the difference is between the
ActiveSnapshot and the TransactionSnapshot in a parallel worker process?
   Then maybe we can find a better solution and fix it properly.

Thanks
Pengcheng

-Original Message-
From: Greg Nancarrow  
Sent: May 18, 2021 17:15
To: Pengchengliu 
Cc: Andres Freund ; PostgreSQL-development 

Subject: Re: Re: Parallel scan with SubTransGetTopmostTransaction assert 
coredump

On Tue, May 18, 2021 at 11:27 AM Pengchengliu  wrote:
>
> Hi Greg,
>
>Actually I am very confused about ActiveSnapshot and TransactionSnapshot. 
> I don't know why the main process sends the ActiveSnapshot and the TransactionSnapshot 
> separately.  And what is the exact difference between them?
>If you know, could you explain that to me? It would be very much 
> appreciated.

In the context of a parallel-worker, I am a little confused too, so I can't 
explain it either.
It is not really explained in the file
"src\backend\access\transam\README.parallel", it only mentions the following as 
part of the state that needs to be copied to each worker:

 - The transaction snapshot.
 - The active snapshot, which might be different from the transaction snapshot.

So they might be different, but exactly when and why?

When I debugged a typical parallel-SELECT case, I found that prior to plan 
execution, GetTransactionSnapshot() was called and its return value was stored 
in both the QueryDesc and the estate (es_snapshot), which was then pushed on 
the ActiveSnapshot stack. So by the time
InitializeParallelDSM() was called, the (top) ActiveSnapshot was the last 
snapshot returned from GetTransactionSnapshot().
So why InitializeParallelDSM() calls GetTransactionSnapshot() again is not 
clear to me (because isn't then the ActiveSnapshot a potentially earlier 
snapshot? - which it shouldn't be, AFAIK. And also, it's then different to the 
non-parallel case).

>Before we know them exactly, I think we should not modify the 
> TransactionSnapshot to ActiveSnapshot in main process. If it is, the main 
> process should send ActiveSnapshot only.

I think it would be worth you trying my suggested change (if you have a 
development environment, which I assume you have). Sure, IF it was deemed a 
proper solution, you'd only send the one snapshot, and adjust accordingly in 
ParallelWorkerMain(), but we need not worry about that in order to test it.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Tom Lane
Michael Paquier  writes:
> On Wed, May 19, 2021 at 04:23:55PM -0400, Tom Lane wrote:
>> * Replace the edata->resultRelInfo field with two fields, one for
>> the original parent and one for the actual/current target.  Perhaps
>> this is worth doing, not sure.

> This one sounds more natural to me, though.

OK, I'll give that a try tomorrow.

> May I ask why you are not moving the snapshot pop and push into the
> finish() and create() routines for this patch?

That does need to happen, but I figured I'd leave it to the other
patch, since there are other things to change too for that snapshot
problem.  Obviously, whichever patch goes in second will need trivial
adjustments, but I think it's logically clearer that way.

> Also, any thoughts
> about adding the trigger tests from 013_partition.pl to REL_13_STABLE?

Yeah, if this is a pre-existing problem then we should back-port the
tests that revealed it.

regards, tom lane




Re: wal stats questions

2021-05-19 Thread Masahiro Ikeda


On 2021/05/19 11:40, Fujii Masao wrote:
> Thanks for updating the patch! I modified some comments slightly and
> pushed that version of the patch.

Thanks a lot!

Regards,

-- 
Masahiro Ikeda
NTT DATA CORPORATION




Re: Commitfest app vs. pgsql-docs

2021-05-19 Thread Michael Paquier
On Wed, May 19, 2021 at 03:35:00PM -0400, Tom Lane wrote:
> Magnus Hagander  writes:
>> Changing that to look globally can certainly be done. It takes a bit
>> of work I think, as there are no API endpoints today that will do
>> that, but those could be added.
> 
> Ah.  Personally, I'd settle for it checking -hackers, -docs and -bugs.
> Perhaps there's some case for -general as well.

FWIW, I have seen cases for -general in the past.
--
Michael


signature.asc
Description: PGP signature


Re: Skip partition tuple routing with constant partition key

2021-05-19 Thread David Rowley
On Thu, 20 May 2021 at 12:20, tsunakawa.ta...@fujitsu.com
 wrote:
> Yes, I want to make/keep it possible that application developers can be 
> unaware of partitions.  I believe that's why David-san, Alvaro-san, and you 
> have made great efforts to improve partitioning performance.  So, I'm +1 for 
> what Hou-san is trying to achieve.
>
> Is there something you're concerned about?  The amount and/or complexity of 
> added code?

It would be good to see how close Amit's patch gets to the performance
of the original patch on this thread.  As far as I can see, the
difference is, aside from the setup code to determine if the partition
is constant, that Amit's patch just requires an additional
ExecPartitionCheck() call per row.  That should be pretty cheap when
compared to the binary search to find the partition for a RANGE or
LIST partitioned table.

Houzj didn't mention how the table in the test was partitioned, so
it's hard to speculate how many comparisons would be done during a
binary search. Or maybe it was HASH partitioned and there was no
binary search.

David




Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Amit Langote
On Thu, May 20, 2021 at 5:23 AM Tom Lane  wrote:
> Amit Langote  writes:
> > IOW, the patch you posted earlier seems like the way to go.
>
> I really dislike that patch.  I think it's doubling down on the messy,
> unstructured coding patterns that got us into this situation to begin
> with.  I'd prefer to expend a little effort on refactoring so that
> the ExecCleanupTupleRouting call can be moved to the cleanup function
> where it belongs.
>
> So, I propose the attached, which invents a new struct to carry
> the stuff we've discovered to be necessary.  This makes the APIs
> noticeably cleaner IMHO.

Larger footprint, but definitely cleaner.  Thanks.

> I did not touch the APIs of the apply_XXX_internal functions,
> as it didn't really seem to offer any notational advantage.
> We can't simply collapse them to take an "edata" as I did for
> apply_handle_tuple_routing, because the ResultRelInfo they're
> supposed to operate on could be different from the original one.
> I considered a couple of alternatives:
>
> * Replace their estate arguments with edata, but keep the separate
> ResultRelInfo arguments.  This might be worth doing in future, if we
> add more fields to ApplyExecutionData.  Right now it'd save nothing,
> and it'd create a risk of confusion about when to use the
> ResultRelInfo argument vs. edata->resultRelInfo.
>
> * Allow apply_handle_tuple_routing to overwrite edata->resultRelInfo
> with the partition child's RRI, then simplify the apply_XXX_internal
> functions to take just edata instead of separate estate and
> resultRelInfo args.  I think this would work right now, but it seems
> grotty, and it might cause problems in future.
>
> * Replace the edata->resultRelInfo field with two fields, one for
> the original parent and one for the actual/current target.  Perhaps
> this is worth doing, not sure.
>
> Thoughts?

IMHO, it would be better to keep the lowest-level
apply_handle_XXX_internal() out of this, because presumably we're only
cleaning up the mess in higher-level callers.  Somewhat related, one
of the intentions behind a04daa97a43, which removed
es_result_relation_info in favor of passing the ResultRelInfo
explicitly to the executor's lower-level functions, was to avoid bugs
caused by failing to set/reset that global field correctly in
higher-level callers.  Now worker.c is pretty small compared with the
executor, but still it seems like a good idea to follow the same
principle here.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Skip partition tuple routing with constant partition key

2021-05-19 Thread David Rowley
On Thu, 20 May 2021 at 01:17, Amit Langote  wrote:
> I gave a shot to implementing your idea and ended up with the attached
> PoC patch, which does pass make check-world.

I only had a quick look at this.

+ if ((dispatch->key->strategy == PARTITION_STRATEGY_RANGE ||
+ dispatch->key->strategy == PARTITION_STRATEGY_RANGE))
+ dispatch->lastPartInfo = rri;

I think you must have meant to have one of these as PARTITION_STRATEGY_LIST?

Wondering what your thoughts are on, instead of caching the last used
ResultRelInfo from the last call to ExecFindPartition(), instead
caching the last looked-up partition index in PartitionDescData? That
way we could cache lookups between statements.  Right now your caching
is not going to help for single-row INSERTs, for example.

For multi-level partition hierarchies that would still require looping
and checking the cached value at each level.

I've not studied the code that builds and rebuilds PartitionDescData,
so there may be some reason that we shouldn't do that. I know that's
changed a bit recently with DETACH CONCURRENTLY.  However, provided
the cached index is not outside the bounds of the oids array, it
shouldn't really matter if the cached value happens to end up pointing
to some other partition. If that happens, we'll just fail the
ExecPartitionCheck() and have to look for the correct partition.

David




RE: Skip partition tuple routing with constant partition key

2021-05-19 Thread tsunakawa.ta...@fujitsu.com
From: Amit Langote 
> On Tue, May 18, 2021 at 11:11 AM houzj.f...@fujitsu.com
>  wrote:
> > For some big data scenarios, we sometimes transfer data from one table (which only
> > stores non-expired data) to another table (historical data) for future analysis.
> > In this case, we import data into the historical table regularly (it could be one day
> > or half a day), and the data is likely to be imported with a date label specified, so
> > all of the data imported at one time belongs to the same partition of a table
> > partitioned by time range.
> 
> Is directing that data directly into the appropriate partition not an
> acceptable solution to address this particular use case?  Yeah, I know
> we should avoid encouraging users to perform DML directly on
> partitions, but...

Yes, I want to make/keep it possible that application developers can be unaware 
of partitions.  I believe that's why David-san, Alvaro-san, and you have made 
great efforts to improve partitioning performance.  So, I'm +1 for what Hou-san 
is trying to achieve.

Is there something you're concerned about?  The amount and/or complexity of 
added code?
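
To illustrate the use case Hou-san describes (sketch only; table and column
names are made up):

CREATE TABLE hist (log_date date, payload text) PARTITION BY RANGE (log_date);
CREATE TABLE hist_20210519 PARTITION OF hist
  FOR VALUES FROM ('2021-05-19') TO ('2021-05-20');

-- daily transfer job: every row carries the same date label, so every row
-- routes to the same partition, which is what makes the caching worthwhile.
INSERT INTO hist
SELECT DATE '2021-05-19', payload FROM live_data;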


Regards
Takayuki Tsunakawa



Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Michael Paquier
On Wed, May 19, 2021 at 02:36:03PM -0400, Andrew Dunstan wrote:
> Yeah, this area needs substantial improvement. I have seen similar sorts
> of nasty hangs, where the script is waiting forever for some process
> that hasn't got the shutdown message. At least we probably need some way
> of making sure the END handler doesn't abort early. Maybe
> PostgresNode::stop() needs a mode that handles failure more gracefully.
> Maybe it needs to try shutting down all the nodes and only calling
> BAIL_OUT after trying all of them and getting a failure. But that might
> still leave us work to do on failures occuring pre-END.

For that, we could just make the END block call run_log() directly
as well, as this catches stderr and an error code.  What about making
the shutdown a two-phase logic, by the way?  Trigger an immediate stop,
and if it fails fall back to an extra kill9() to be on the safe side.

Have you seen this being a problem even in cases where the tests all
passed?  If yes, it may be worth using the more aggressive flow even
in the case where the tests pass.
--
Michael


signature.asc
Description: PGP signature


Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Michael Paquier
On Wed, May 19, 2021 at 04:23:55PM -0400, Tom Lane wrote:
> I really dislike that patch.  I think it's doubling down on the messy,
> unstructured coding patterns that got us into this situation to begin
> with.  I'd prefer to expend a little effort on refactoring so that
> the ExecCleanupTupleRouting call can be moved to the cleanup function
> where it belongs.

Okay.

> I did not touch the APIs of the apply_XXX_internal functions,
> as it didn't really seem to offer any notational advantage.
> We can't simply collapse them to take an "edata" as I did for
> apply_handle_tuple_routing, because the ResultRelInfo they're
> supposed to operate on could be different from the original one.
> I considered a couple of alternatives:
> 
> * Replace their estate arguments with edata, but keep the separate
> ResultRelInfo arguments.  This might be worth doing in future, if we
> add more fields to ApplyExecutionData.  Right now it'd save nothing,
> and it'd create a risk of confusion about when to use the
> ResultRelInfo argument vs. edata->resultRelInfo.

Not sure about this one.  It may be better to wait until this gets
more expanded, if it gets expanded.

> * Allow apply_handle_tuple_routing to overwrite edata->resultRelInfo
> with the partition child's RRI, then simplify the apply_XXX_internal
> functions to take just edata instead of separate estate and
> resultRelInfo args.  I think this would work right now, but it seems
> grotty, and it might cause problems in future.

Agreed that it does not seem like a good idea to blindly overwrite
resultRelInfo with the partition targeted by the apply.

> * Replace the edata->resultRelInfo field with two fields, one for
> the original parent and one for the actual/current target.  Perhaps
> this is worth doing, not sure.

This one sounds more natural to me, though.

> Thoughts?

May I ask why you are not moving the snapshot pop and push into the
finish() and create() routines for this patch?  Also, any thoughts
about adding the trigger tests from 013_partition.pl to REL_13_STABLE?
--
Michael


signature.asc
Description: PGP signature


Re: Commitfest app vs. pgsql-docs

2021-05-19 Thread Alvaro Herrera
On 2021-May-19, Andrew Dunstan wrote:

> It's just a reference after all. So someone supplies a reference to an
> email on an out of the way list. What's the evil that will occur? Not
> much really  AFAICT.

... as long as it doesn't leak data from private lists ...

-- 
Álvaro Herrera               39°49'30"S 73°17'W




Re: Freenode woes

2021-05-19 Thread Robert Treat
On Wed, May 19, 2021 at 10:19 AM Christoph Berg  wrote:
>
> Fwiw, if the PostgreSQL project is considering moving the #postgresql
> IRC channel(s) elsewhere given [1,2], I'm a member of OFTC.net's network
> operations committee and would be happy to help.
>
> [1] https://gist.github.com/aaronmdjones/1a9a93ded5b7d162c3f58bdd66b8f491
> [2] https://fuchsnet.ch/freenode-resign-letter.txt
>

I've been wondering the same thing; given our relationship with SPI,
OFTC seems like an option worthy of consideration.
For those unfamiliar, there is additional info about the network at
https://www.oftc.net


Robert Treat
PostgreSQL Project SPI Liaison
https://xzilla.net




Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Tom Lane
Amit Langote  writes:
> IOW, the patch you posted earlier seems like the way to go.

I really dislike that patch.  I think it's doubling down on the messy,
unstructured coding patterns that got us into this situation to begin
with.  I'd prefer to expend a little effort on refactoring so that
the ExecCleanupTupleRouting call can be moved to the cleanup function
where it belongs.

So, I propose the attached, which invents a new struct to carry
the stuff we've discovered to be necessary.  This makes the APIs
noticeably cleaner IMHO.

I did not touch the APIs of the apply_XXX_internal functions,
as it didn't really seem to offer any notational advantage.
We can't simply collapse them to take an "edata" as I did for
apply_handle_tuple_routing, because the ResultRelInfo they're
supposed to operate on could be different from the original one.
I considered a couple of alternatives:

* Replace their estate arguments with edata, but keep the separate
ResultRelInfo arguments.  This might be worth doing in future, if we
add more fields to ApplyExecutionData.  Right now it'd save nothing,
and it'd create a risk of confusion about when to use the
ResultRelInfo argument vs. edata->resultRelInfo.

* Allow apply_handle_tuple_routing to overwrite edata->resultRelInfo
with the partition child's RRI, then simplify the apply_XXX_internal
functions to take just edata instead of separate estate and
resultRelInfo args.  I think this would work right now, but it seems
grotty, and it might cause problems in future.

* Replace the edata->resultRelInfo field with two fields, one for
the original parent and one for the actual/current target.  Perhaps
this is worth doing, not sure.

Thoughts?

regards, tom lane

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 1432554d5a..c51a83f797 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -178,6 +178,14 @@ static HTAB *xidhash = NULL;
 /* BufFile handle of the current streaming file */
 static BufFile *stream_fd = NULL;
 
+typedef struct ApplyExecutionData
+{
+	EState	   *estate;			/* always needed */
+	ResultRelInfo *resultRelInfo;	/* always needed */
+	ModifyTableState *mtstate;	/* used for partitioned target rel */
+	PartitionTupleRouting *proute;	/* used for partitioned target rel */
+} ApplyExecutionData;
+
 typedef struct SubXactInfo
 {
 	TransactionId xid;			/* XID of the subxact */
@@ -239,8 +247,7 @@ static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
 	LogicalRepRelation *remoterel,
 	TupleTableSlot *remoteslot,
 	TupleTableSlot **localslot);
-static void apply_handle_tuple_routing(ResultRelInfo *relinfo,
-	   EState *estate,
+static void apply_handle_tuple_routing(ApplyExecutionData *edata,
 	   TupleTableSlot *remoteslot,
 	   LogicalRepTupleData *newtup,
 	   LogicalRepRelMapEntry *relmapentry,
@@ -336,20 +343,21 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
 
 /*
  * Executor state preparation for evaluation of constraint expressions,
- * indexes and triggers.
+ * indexes and triggers for the specified relation.
  *
- * resultRelInfo is a ResultRelInfo for the relation to be passed to the
- * executor routines.  The caller must open and close any indexes to be
- * updated independently of the relation registered here.
+ * Note that the caller must open and close any indexes to be updated.
  */
-static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel,
-		   ResultRelInfo **resultRelInfo)
+static ApplyExecutionData *
+create_edata_for_relation(LogicalRepRelMapEntry *rel)
 {
+	ApplyExecutionData *edata;
 	EState	   *estate;
 	RangeTblEntry *rte;
+	ResultRelInfo *resultRelInfo;
 
-	estate = CreateExecutorState();
+	edata = (ApplyExecutionData *) palloc0(sizeof(ApplyExecutionData));
+
+	edata->estate = estate = CreateExecutorState();
 
 	rte = makeNode(RangeTblEntry);
 	rte->rtekind = RTE_RELATION;
@@ -358,13 +366,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel,
 	rte->rellockmode = AccessShareLock;
 	ExecInitRangeTable(estate, list_make1(rte));
 
-	*resultRelInfo = makeNode(ResultRelInfo);
+	edata->resultRelInfo = resultRelInfo = makeNode(ResultRelInfo);
 
 	/*
 	 * Use Relation opened by logicalrep_rel_open() instead of opening it
 	 * again.
 	 */
-	InitResultRelInfo(*resultRelInfo, rel->localrel, 1, NULL, 0);
+	InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
 
 	/*
 	 * We put the ResultRelInfo in the es_opened_result_relations list, even
@@ -377,29 +385,38 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel,
 	 * an apply operation being responsible for that.
 	 */
 	estate->es_opened_result_relations =
-		lappend(estate->es_opened_result_relations, *resultRelInfo);
+		lappend(estate->es_opened_result_relations, resultRelInfo);
 
 	estate->es_output_cid = GetCurrentCommandId(true);
 
 	/* Prepare to 

Re: Commitfest app vs. pgsql-docs

2021-05-19 Thread Andrew Dunstan


On 5/19/21 3:07 PM, Magnus Hagander wrote:
> On Wed, May 19, 2021 at 7:39 PM Tom Lane  wrote:
>> Stephen Frost  writes:
>>> * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
 Since we have a "documentation" section in the commitfest, it would
 be useful to allow links to the -docs archives.
>>> ... or get rid of the pgsql-docs mailing list, as has been suggested
>>> before.
>> IIRC, the CF app also rejects threads on pgsql-bugs, which is even
>> more pointlessly annoying.  Couldn't we just remove that restriction
>> altogether, and allow anything posted to some pgsql list?
> It's not technically rejecting anything, it's just explicitly looking
> in -hackers and doesn't even know the others exist :)
>
> Changing that to look globally can certainly be done. It takes a bit
> of work I think, as there are no API endpoints today that will do
> that, but those could be added.
>
> But just to be clear -- "some pgsql list" would include things like
> pgsql-general, the pgadmin lists, the non-english regional lists, etc.
> That may be fine, I just want to be sure everybody realizes that's
> what it means. Basically everything on
> https://www.postgresql.org/list/
>

It's just a reference, after all. So someone supplies a reference to an
email on an out-of-the-way list. What's the evil that will occur? Not
much really, AFAICT.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Commitfest app vs. pgsql-docs

2021-05-19 Thread Tom Lane
Magnus Hagander  writes:
> On Wed, May 19, 2021 at 7:39 PM Tom Lane  wrote:
>> IIRC, the CF app also rejects threads on pgsql-bugs, which is even
>> more pointlessly annoying.  Couldn't we just remove that restriction
>> altogether, and allow anything posted to some pgsql list?

> It's not technically rejecting anything, it's just explicitly looking
> in -hackers and doesn't even know the others exist :)

> Changing that to look globally can certainly be done. It takes a bit
> of work I think, as there are no API endpoints today that will do
> that, but those could be added.

Ah.  Personally, I'd settle for it checking -hackers, -docs and -bugs.
Perhaps there's some case for -general as well.

regards, tom lane




Re: pg_rewind fails if there is a read only file.

2021-05-19 Thread Andrew Dunstan


On 5/19/21 6:43 AM, Paul Guo wrote:
> Several weeks ago I saw this issue in a production environment. The
> read-only file looks like a credential file. Michael told me that
> such kinds of files should usually be kept in non-pgdata
> directories in production environments. Thinking about it further, it seems
> that pg_rewind should be more user friendly and tolerate such scenarios.
>
> The failure error is "Permission denied" after open(). The reason is that
> open() fails with the below mode in open_target_file():
>
>   mode = O_WRONLY | O_CREAT | PG_BINARY;
>   if (trunc)
>   mode |= O_TRUNC;
>   dstfd = open(dstpath, mode, pg_file_create_mode);
>
> The fix should be quite simple: if open() fails with "Permission denied",
> we try to call chmod() to add S_IWUSR just before open(), and call
> chmod() again to reset the flags soon after open(). A stat() call to get the
> previous st_mode flags is needed.
>

Presumably the user has a reason for adding the file read-only to the
data directory, and we shouldn't lightly ignore that.

Michael's advice is reasonable. This seems like a case of:

Patient: Doctor, it hurts when I do this.

Doctor: Stop doing that.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Removed extra memory allocations from create_list_bounds

2021-05-19 Thread Justin Pryzby
On Tue, May 18, 2021 at 01:49:12PM -0400, Robert Haas wrote:
> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.

I see this as a code cleanup more than a performance optimization.
I couldn't see a measurable difference in my tests, involving CREATE TABLE and
SELECT.

I think some of my patches could *increase* memory use, due to power-of-two
allocation logic.  I think it's still a good idea, since it doesn't seem to be
the dominant memory allocation.

On Thu, May 20, 2021 at 12:21:19AM +0530, Nitin Jadhav wrote:
> > I see that you have made a theoretical argument for why this should be
> > good for performance, but it would be better to have some test results
> > that show that it works out in practice. Sometimes things seem like
> > they ought to be more efficient but turn out to be less efficient when
> > they are actually tried.
> 
> After this I tried to create 10 partitions and observed the time taken
> to execute. Here is the data for 'with patch'.
> 
> postgres@34077=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec

I think you should be sure to do this within a transaction, without cassert,
and maybe with fsync=off full_page_writes=off.
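
For example, the partition-creation step could be rerun like this (sketch;
fsync/full_page_writes are postgresql.conf settings and cassert is a
configure-time option, so they are only noted in a comment):

-- postgresql.conf: fsync = off, full_page_writes = off
BEGIN;
SELECT format('CREATE TABLE t_%s PARTITION OF t FOR VALUES IN (%s)', i, i)
FROM generate_series(10001, 10010) i
\gexec
COMMIT;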

> If we observe above data, we can see the improvement with the patch.
> The average time taken to execute for the last 10 partitions is.

It'd be interesting to know which patch(es) contributed to the improvement.
It's possible that (say) patch 0001 alone gives all the gain, or that 0002
diminishes the gains.

I think there'll be an interest in committing the smallest possible patch to
realize the gains, to minimize code churn and unrelated changes.

LIST and RANGE might need to be checked separately..

> With respect to memory usage, AFAIK the allocated memory gets cleaned
> during deallocation of the memory used by the memory context. So at
> the end of the query, we may see no difference in the memory usage but
> during the query execution it tries to get less memory than before.

You can check MAXRSS (at least on linux) if you enable log_executor_stats,
like:

\set QUIET \\ SET client_min_messages=debug; SET log_executor_stats=on; DROP TABLE p; CREATE TABLE p(i int) PARTITION BY LIST(i); CREATE TABLE pd PARTITION OF p DEFAULT;
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES IN (%s)', a,a) FROM generate_series(1,999)a;\gexec \\ SELECT;

-- 
Justin




Re: Removed extra memory allocations from create_list_bounds

2021-05-19 Thread Nitin Jadhav
> Created a table with one column of type 'int' and partitioned by that
> column. Created 1 million partitions using following queries.

Sorry, it's not 1 million; it's 10,000 partitions.

--
Thanks & Regards,
Nitin Jadhav

On Thu, May 20, 2021 at 12:21 AM Nitin Jadhav
 wrote:
>
> > 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
> >
> > Also, use 'git diff --check' or 'git log --check' before generating
> > patches to send, and fix any whitespace errors before submitting.
>
> Thanks for the suggestion. I will follow these.
>
> > I see that you have made a theoretical argument for why this should be
> > good for performance, but it would be better to have some test results
> > that show that it works out in practice. Sometimes things seem like
> > they ought to be more efficient but turn out to be less efficient when
> > they are actually tried.
>
> Created a table with one column of type 'int' and partitioned by that
> column. Created 1 million partitions using following queries.
>
> create table t(a int) partition by list(a);
> select 'create table t_' || i || ' partition of t for
> values in (' || i || ');'
> from generate_series(1, 10000) i
> \gexec
>
> After this I tried to create 10 partitions and observed the time taken
> to execute. Here is the data for 'with patch'.
>
> postgres@34077=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec
> CREATE TABLE
> Time: 36.863 ms
> CREATE TABLE
> Time: 46.645 ms
> CREATE TABLE
> Time: 44.915 ms
> CREATE TABLE
> Time: 39.660 ms
> CREATE TABLE
> Time: 42.188 ms
> CREATE TABLE
> Time: 43.163 ms
> CREATE TABLE
> Time: 44.374 ms
> CREATE TABLE
> Time: 45.117 ms
> CREATE TABLE
> Time: 40.340 ms
> CREATE TABLE
> Time: 38.604 ms
>
> The data for 'without patch' looks like this.
>
> postgres@31718=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec
> CREATE TABLE
> Time: 45.917 ms
> CREATE TABLE
> Time: 46.815 ms
> CREATE TABLE
> Time: 44.180 ms
> CREATE TABLE
> Time: 48.163 ms
> CREATE TABLE
> Time: 45.884 ms
> CREATE TABLE
> Time: 48.330 ms
> CREATE TABLE
> Time: 48.614 ms
> CREATE TABLE
> Time: 48.376 ms
> CREATE TABLE
> Time: 46.380 ms
> CREATE TABLE
> Time: 48.233 ms
>
> If we observe above data, we can see the improvement with the patch.
> The average time taken to execute for the last 10 partitions is.
> With patch - 42.1869 ms
> Without patch - 47.0892 ms.
>
> With respect to memory usage, AFAIK the allocated memory gets cleaned
> during deallocation of the memory used by the memory context. So at
> the end of the query, we may see no difference in the memory usage but
> during the query execution it tries to get less memory than before.
> Maybe during some worst case scenario, if there is less memory
> available, we may see 'out of memory' errors without the patch but it
> may work with the patch. I have not done experiments in this angle. I
> am happy to do it if required.
>
> Please share your thoughts.
>
> --
> Thanks & Regards,
> Nitin Jadhav
>
> On Tue, May 18, 2021 at 11:19 PM Robert Haas  wrote:
> >
> > On Tue, May 18, 2021 at 1:29 PM Nitin Jadhav
> >  wrote:
> > > > The CFBOT had no issues with the patches, so I suspect an issue on your 
> > > > side.
> > > > http://cfbot.cputube.org/nitin-jadhav.html
> > >
> > > I am getting the following error when I try to apply in my machine.
> > >
> > > $ git apply 
> > > ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch
> > > ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:18:
> > > trailing whitespace.
> >
> > 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
> >
> > Also, use 'git diff --check' or 'git log --check' before generating
> > patches to send, and fix any whitespace errors before submitting.
> >
> > I see that you have made a theoretical argument for why this should be
> > good for performance, but it would be better to have some test results
> > that show that it works out in practice. Sometimes things seem like
> > they ought to be more efficient but turn out to be less efficient when
> > they are actually tried.
> >
> > --
> > Robert Haas
> > EDB: http://www.enterprisedb.com




Re: Commitfest app vs. pgsql-docs

2021-05-19 Thread Magnus Hagander
On Wed, May 19, 2021 at 7:39 PM Tom Lane  wrote:
>
> Stephen Frost  writes:
> > * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> >> Since we have a "documentation" section in the commitfest, it would
> >> be useful to allow links to the -docs archives.
>
> > ... or get rid of the pgsql-docs mailing list, as has been suggested
> > before.
>
> IIRC, the CF app also rejects threads on pgsql-bugs, which is even
> more pointlessly annoying.  Couldn't we just remove that restriction
> altogether, and allow anything posted to some pgsql list?

It's not technically rejecting anything, it's just explicitly looking
in -hackers and doesn't even know the others exist :)

Changing that to look globally can certainly be done. It takes a bit
of work I think, as there are no API endpoints today that will do
that, but those could be added.

But just to be clear -- "some pgsql list" would include things like
pgsql-general, the pgadmin lists, the non-english regional lists, etc.
That may be fine, I just want to be sure everybody realizes that's
what it means. Basically everything on
https://www.postgresql.org/list/

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: SSL Tests for sslinfo extension

2021-05-19 Thread Andrew Dunstan


On 5/19/21 1:01 PM, Dagfinn Ilmari Mannsåker wrote:
> Daniel Gustafsson  writes:
>
>> In order to be able to test extensions with SSL connections, allow
>> configure_test_server_for_ssl to create any extensions passed as
>> a comma-separated list. Each extension is created in all the test
>> databases, which may or may not be useful.
> Why the comma-separated string, rather than an array reference,
> i.e. `extensions => [qw(foo bar baz)]`?  Also, should it use `CREATE
> EXTENSION .. CASCADE`, in case the specified extensions depend on
> others?
>


Also, instead of one line per db, there should be an inner loop over the
db names.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Commitfest app vs. pgsql-docs

2021-05-19 Thread Andrew Dunstan


On 5/19/21 1:39 PM, Tom Lane wrote:
> Stephen Frost  writes:
>> * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
>>> Since we have a "documentation" section in the commitfest, it would
>>> be useful to allow links to the -docs archives.
>> ... or get rid of the pgsql-docs mailing list, as has been suggested
>> before.
> IIRC, the CF app also rejects threads on pgsql-bugs, which is even
> more pointlessly annoying.  Couldn't we just remove that restriction
> altogether, and allow anything posted to some pgsql list?
>
>   



+several


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Removed extra memory allocations from create_list_bounds

2021-05-19 Thread Nitin Jadhav
> 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
>
> Also, use 'git diff --check' or 'git log --check' before generating
> patches to send, and fix any whitespace errors before submitting.

Thanks for the suggestion. I will follow these.

> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.

Created a table with one column of type 'int' and partitioned by that
column. Created 1 million partitions using following queries.

create table t(a int) partition by list(a);
select 'create table t_' || i || ' partition of t for
values in (' || i || ');'
from generate_series(1, 1000000) i
\gexec

After this I tried to create 10 more partitions and observed the time taken
to execute. Here is the data for 'with patch'.

postgres@34077=#select 'create table t_' || i || ' partition of t for
postgres'# values in (' || i || ');'
postgres-# from generate_series(1000001, 1000010) i
postgres-# \gexec
CREATE TABLE
Time: 36.863 ms
CREATE TABLE
Time: 46.645 ms
CREATE TABLE
Time: 44.915 ms
CREATE TABLE
Time: 39.660 ms
CREATE TABLE
Time: 42.188 ms
CREATE TABLE
Time: 43.163 ms
CREATE TABLE
Time: 44.374 ms
CREATE TABLE
Time: 45.117 ms
CREATE TABLE
Time: 40.340 ms
CREATE TABLE
Time: 38.604 ms

The data for 'without patch' looks like this.

postgres@31718=#select 'create table t_' || i || ' partition of t for
postgres'# values in (' || i || ');'
postgres-# from generate_series(1000001, 1000010) i
postgres-# \gexec
CREATE TABLE
Time: 45.917 ms
CREATE TABLE
Time: 46.815 ms
CREATE TABLE
Time: 44.180 ms
CREATE TABLE
Time: 48.163 ms
CREATE TABLE
Time: 45.884 ms
CREATE TABLE
Time: 48.330 ms
CREATE TABLE
Time: 48.614 ms
CREATE TABLE
Time: 48.376 ms
CREATE TABLE
Time: 46.380 ms
CREATE TABLE
Time: 48.233 ms

If we observe the above data, we can see the improvement with the patch.
The average time taken to execute for the last 10 partitions is:
With patch - 42.1869 ms
Without patch - 47.0892 ms.

With respect to memory usage, AFAIK the allocated memory gets cleaned up
when the memory context it belongs to is deallocated. So at the end of the
query we may see no difference in memory usage, but during query execution
it requests less memory than before. In some worst-case scenario, if little
memory is available, we might see 'out of memory' errors without the patch
while it still works with the patch. I have not done experiments along
these lines; I am happy to do so if required.

Please share your thoughts.

--
Thanks & Regards,
Nitin Jadhav

On Tue, May 18, 2021 at 11:19 PM Robert Haas  wrote:
>
> On Tue, May 18, 2021 at 1:29 PM Nitin Jadhav
>  wrote:
> > > The CFBOT had no issues with the patches, so I suspect an issue on your 
> > > side.
> > > http://cfbot.cputube.org/nitin-jadhav.html
> >
> > I am getting the following error when I try to apply in my machine.
> >
> > $ git apply 
> > ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch
> > ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:18:
> > trailing whitespace.
>
> 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
>
> Also, use 'git diff --check' or 'git log --check' before generating
> patches to send, and fix any whitespace errors before submitting.
>
> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com




Re: pgbench test failing on 14beta1 on Debian/i386

2021-05-19 Thread Andrew Dunstan


On 5/19/21 6:32 AM, Fabien COELHO wrote:
>
>
>> Confirmed, thanks for looking. I can reproduce it on my machine with
>> -m32. It's somewhat annoying that the buildfarm didn't pick it up
>> sooner :-(
>>
>> On Wed, 19 May 2021 at 08:28, Michael Paquier 
>> wrote:
>>>
>>> On Wed, May 19, 2021 at 09:06:16AM +0200, Fabien COELHO wrote:
 I see two simple approaches:

 (1) use another PRNG inside pgbench, eg Knuth's which was used in some
 previous submission and is very simple and IMHO better than the rand48
 stuff.

 (2) extend pg_*rand48() to provide an unsigned 64 bits out of the
 48 bits
 state.
>>>
>>> Or, (3) remove this test?  I am not quite sure what there is to gain
>>> with this extra test considering all the other tests with permute()
>>> already present in this script.
>>
>> Yes, I think removing the test is the best option. It was originally
>> added because there was a separate code path for larger permutation
>> sizes that needed testing, but that's no longer the case so the test
>> really isn't adding anything.
>
> Hmmm…
>
> It is the one test which worked in actually detecting an issue, so I
> would not say that it is not adding anything, on the contrary, it did
> prove its value! The permute function is expected to be deterministic
> on different platforms and architectures, and it is not.
>
> I agree that removing the test will hide the issue effectively:-) but
> ISTM more appropriate to solve the underlying issue and keep the test.
>
> I'd agree with a two phases approach: drop the test in the short term
> and deal with the PRNG later. I'm so unhappy with this 48 bit PRNG
> that I may be motivated enough to attempt to replace it, or at least
> add a better (faster?? larger state?? same/better quality?) alternative.
>

Yeah, this does seem to be something that should be fixed rather than
hidden.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Andrew Dunstan


On 5/18/21 11:03 PM, Michael Paquier wrote:
>
>> 3. Once the subscriber1 postmaster has exited, the TAP
>> test will eventually time out, and then this happens:
>>
>> [.. logs ..]
>>
>> That is, because we failed to shut down subscriber1, the
>> test script neglects to shut down subscriber2, and now
>> things just sit indefinitely.  So that's a robustness
>> problem in the TAP infrastructure, rather than a bug in
>> PG proper; but I still say it's a bug that needs fixing.
> This one comes down to teardown_node() that uses system_or_bail(),
> leaving things unfinished.  I guess that we could be more aggressive
> and ignore failures if we have a non-zero error code and that not all
> the tests have passed within the END block of PostgresNode.pm.



Yeah, this area needs substantial improvement. I have seen similar sorts
of nasty hangs, where the script is waiting forever for some process
that hasn't got the shutdown message. At least we probably need some way
of making sure the END handler doesn't abort early. Maybe
PostgresNode::stop() needs a mode that handles failure more gracefully.
Maybe it needs to try shutting down all the nodes and only calling
BAIL_OUT after trying all of them and getting a failure. But that might
still leave us work to do on failures occurring pre-END.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Commitfest app vs. pgsql-docs

2021-05-19 Thread Daniel Gustafsson
> On 19 May 2021, at 19:39, Tom Lane  wrote:
> 
> Stephen Frost  writes:
>> * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
>>> Since we have a "documentation" section in the commitfest, it would
>>> be useful to allow links to the -docs archives.
> 
>> ... or get rid of the pgsql-docs mailing list, as has been suggested
>> before.
> 
> IIRC, the CF app also rejects threads on pgsql-bugs, which is even
> more pointlessly annoying.  Couldn't we just remove that restriction
> altogether, and allow anything posted to some pgsql list?

+1.  Regardless of the fate of any individual list I think this is the most
sensible thing for the CF app.

--
Daniel Gustafsson   https://vmware.com/





Re: cutting down the TODO list thread

2021-05-19 Thread John Naylor
Hi,

I let this drop off my radar a few months ago, but I'm going to try to get
back into the habit of looking at a few items a week. As before, let me
know in the next few days if anyone has thoughts or objections.

(Optimizer / Executor)

- Consider increasing the default values of from_collapse_limit,
join_collapse_limit, and/or geqo_threshold

http://archives.postgresql.org/message-id/4136ffa0905210551u22eeb31bn5655dbe7c9a3a...@mail.gmail.com

This seems to have been rejected.

- Improve use of expression indexes for ORDER BY

http://archives.postgresql.org/pgsql-hackers/2009-08/msg01553.php

Skimming the thread, I'm not quite sure if index-only scans (not available
at the time) solve the problem, or are orthogonal to it?

- Modify the planner to better estimate caching effects

http://archives.postgresql.org/pgsql-performance/2010-11/msg00117.php

Huge discussion. This sounds like a research project, and maybe a risky one.

- Allow shared buffer cache contents to affect index cost computations

http://archives.postgresql.org/pgsql-hackers/2011-06/msg01140.php

Related to the above, but has a more specific approach in mind. The
discussion thread is not useful for getting one's head around how to think
about the problem, much less to decide if it's worth working on -- it's
mostly complaining about the review process. Independent of that, the idea
of inspecting the buffer cache seems impractical.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Commitfest app vs. pgsql-docs

2021-05-19 Thread Tom Lane
Stephen Frost  writes:
> * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
>> Since we have a "documentation" section in the commitfest, it would
>> be useful to allow links to the -docs archives.

> ... or get rid of the pgsql-docs mailing list, as has been suggested
> before.

IIRC, the CF app also rejects threads on pgsql-bugs, which is even
more pointlessly annoying.  Couldn't we just remove that restriction
altogether, and allow anything posted to some pgsql list?

regards, tom lane




Re: libpq_pipeline in tmp_install

2021-05-19 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-May-19, Tom Lane wrote:
>> +1, except that you should add documentation for NO_INSTALL to the
>> list of definable symbols at the head of pgxs.mk, and to the list
>> in extend.sgml (compare that for NO_INSTALLCHECK).

> I propose this.

WFM.

regards, tom lane




Re: Improve documentation for pg_upgrade, standbys and rsync

2021-05-19 Thread Laurenz Albe
On Wed, 2021-05-19 at 10:31 -0400, Stephen Frost wrote:
> * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> > I recently tried to upgrade a standby following the documentation,
> > but I found it hard to understand, [...]
>
> Haven't had a chance to look at this in depth but improving things here
> would be good.
> 
> An additional thing that we should really be mentioning is to tell
> people to go in and TRUNCATE all of their UNLOGGED tables before going
> through this process, otherwise the rsync will end up spending a bunch
> of time copying the files for UNLOGGED relations which you really don't
> want.

Thanks for the feedback and the suggestion.
CCing -hackers so that I can add it to the commitfest.

Yours,
Laurenz Albe





Re: SSL Tests for sslinfo extension

2021-05-19 Thread Dagfinn Ilmari Mannsåker
Daniel Gustafsson  writes:

> In order to be able to test extensions with SSL connections, allow
> configure_test_server_for_ssl to create any extensions passed as
> comma separated list. Each extension is created in all the test
> databases which may or may not be useful.

Why the comma-separated string, rather than an array reference,
i.e. `extensions => [qw(foo bar baz)]`?  Also, should it use `CREATE
EXTENSION .. CASCADE`, in case the specified extensions depend on
others?

- ilmari
-- 
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
 the consequences of."  -- Skud's Meta-Law




Re: Commitfest app vs. pgsql-docs

2021-05-19 Thread Stephen Frost
Greetings,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> I would like to add a thread on pgsql-docs to the commitfest, but I
> found that that cannot be done.
> 
> What is the best way to proceed?
> Since we have a "documentation" section in the commitfest, it would
> be useful to allow links to the -docs archives.

... or get rid of the pgsql-docs mailing list, as has been suggested
before.

Thanks,

Stephen




Commitfest app vs. pgsql-docs

2021-05-19 Thread Laurenz Albe
I would like to add a thread on pgsql-docs to the commitfest, but I
found that that cannot be done.

What is the best way to proceed?
Since we have a "documentation" section in the commitfest, it would
be useful to allow links to the -docs archives.

Yours,
Laurenz Albe





Re: Inaccurate error message when set fdw batch_size to 0

2021-05-19 Thread Bharath Rupireddy
On Wed, May 19, 2021 at 5:20 PM Fujii Masao  wrote:
>
> ereport(ERROR,
> (errcode(ERRCODE_SYNTAX_ERROR),
> -errmsg("%s requires a non-negative numeric value",
> +errmsg("%s must be greater than or equal to zero",
> def->defname)));
> }
> else if (strcmp(def->defname, "extensions") == 0)
> @@ -142,7 +142,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
> if (fetch_size <= 0)
> ereport(ERROR,
> (errcode(ERRCODE_SYNTAX_ERROR),
> -errmsg("%s requires a non-negative integer value",
> +errmsg("%s must be greater than zero",
> I'm fine to convert "non-negative" word to "greater than" or "greater than
> or equal to" in the messages. But this change also seems to get rid of
> the information about the data type of the option from the message.
> I'm not sure if this is an improvement. Probably isn't it better to
> convert "requires a non-negative integer value" to "must be an integer value
> greater than zero"?

Thanks for the comments. Done that way. PSA v3 patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v3-0001-Disambiguate-error-messages-that-use-non-negative.patch
Description: Binary data


Add PortalDrop in exec_execute_message

2021-05-19 Thread Yura Sokolov

Hi, hackers.

I've been playing with "autoprepared" patch, and have got isolation
"freeze-the-dead" test stuck on first VACUUM FREEZE statement.
After some research I found issue is reproduced with unmodified master
branch if extended protocol used. I've prepared ruby script for
demonstration (cause ruby-pg has simple interface to PQsendQueryParams).

Further investigation showed it happens due to portal is not dropped
inside of exec_execute_message, and it is kept in third session till
COMMIT is called. Therefore heap page remains pinned, and VACUUM FREEZE
became locked inside LockBufferForCleanup.

It seems that this is usually invisible to common users since either:
- the command is called standalone and then the transaction is closed
  immediately, or
- the next PQsendQueryParams will initiate another unnamed portal using
  CreatePortal("", true, true), and this action will drop the previous
  one.

But "freeze-the-dead" remains locked since third session could not
send COMMIT until VACUUM FULL finished.

I propose to add PortalDrop at the 'if (completed)' branch of
exec_execute_message.

--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2209,6 +2209,8 @@ exec_execute_message(const char *portal_name, long 
max_rows)


if (completed)
{
+   PortalDrop(portal, false);
+
if (is_xact_command)
{

With this change 'make check-world' runs without flaws (at least
on empty configure with enable-cassert and enable-tap-tests).

There is a small chance that applications exist which abuse seekable
portals with the 'execute' protocol message, so not every completed
portal can be safely dropped. But I believe there are some sane
conditions that cover the common case. For example, isn't an empty-name
check enough? Can a client reset or seek a portal with an empty
name?

regards,
Sokolov Yura aka funny_falcon

require 'pg'

c1 = PG.connect(host: "localhost", dbname: "postgres")
c2 = PG.connect(host: "localhost", dbname: "postgres")
c3 = PG.connect(host: "localhost", dbname: "postgres")

class PG::Connection
  def simple(sql)
puts sql
exec(sql)
  end
  def extended(sql)
puts "#{sql}"
exec_params(sql, [])
  end
end

c1.simple "DROP TABLE IF EXISTS tab_freeze" 
c1.simple "
CREATE TABLE tab_freeze (
  id int PRIMARY KEY,
  name char(3),
  x int);
INSERT INTO tab_freeze VALUES (1, '111', 0);
INSERT INTO tab_freeze VALUES (3, '333', 0);
"

c1.simple "BEGIN"
c2.simple "BEGIN"
c3.simple "BEGIN"
c1.simple "UPDATE tab_freeze SET x = x + 1 WHERE id = 3"
c2.extended "SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE"
c3.extended "SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE"
c1.simple "COMMIT"
c2.simple "COMMIT"
c2.simple "VACUUM FREEZE tab_freeze"
c1.simple "
BEGIN;
SET LOCAL enable_seqscan = false;
SET LOCAL enable_bitmapscan = false;
SELECT * FROM tab_freeze WHERE id = 3;
COMMIT;
"
c3.simple "COMMIT"
c2.simple "VACUUM FREEZE tab_freeze"
c1.simple "SELECT * FROM tab_freeze ORDER BY name, id"


Re: postgres_fdw - should we tighten up batch_size, fetch_size options against non-numeric values?

2021-05-19 Thread Bharath Rupireddy
On Wed, May 19, 2021 at 5:02 PM Fujii Masao  wrote:
> On 2021/05/19 14:34, Bharath Rupireddy wrote:
> > On Wed, May 19, 2021 at 8:28 AM Fujii Masao  
> > wrote:
> > I agree with throwing an error for non-numeric junk though.
> > Allowing that on the grounds of backwards compatibility
> > seems like too much of a stretch.
> 
>  +1.
> >>>
> >>> +1.
> >>
> >> +1
> >
> > Thanks all for your inputs. PSA which uses parse_int for
> > batch_size/fech_size and parse_real for fdw_startup_cost and
> > fdw_tuple_cost.
>
> Thanks for updating the patch! It basically looks good to me.
>
> -   val = strtod(defGetString(def), &endp);
> -   if (*endp || val < 0)
> +   is_parsed = parse_real(defGetString(def), &val, 0, NULL);
> +   if (!is_parsed || val < 0)
> ereport(ERROR,
> (errcode(ERRCODE_SYNTAX_ERROR),
>  errmsg("%s requires a non-negative numeric value",
>
> Isn't it better to check "!is_parsed" and "val < 0" separately like
> reloptions.c does? That is, we should throw different error messages
> for them?
>
> ERRCODE_SYNTAX_ERROR seems strange for this type of error?
> ERRCODE_INVALID_PARAMETER_VALUE is better and more proper?

Thanks for the comments. I added separate messages, changed the error
code from ERRCODE_SYNTAX_ERROR to ERRCODE_INVALID_PARAMETER_VALUE and
also quoted the option name in the error message. PSA v3 patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v3-0001-Tighten-up-batch_size-fetch_size-options-against-.patch
Description: Binary data


Re: libpq_pipeline in tmp_install

2021-05-19 Thread Alvaro Herrera
On 2021-May-19, Tom Lane wrote:

> Peter Eisentraut  writes:
> > On 10.05.21 20:26, Peter Eisentraut wrote:
> >> The reason this is there is that the test suite uses PGXS to build the 
> >> test program, and so things get installed automatically.  I suggest that 
> >> we should either write out the build system by hand to avoid this, or 
> >> maybe extend PGXS to support building programs but not installing them. 
> 
> > Here is a patch that implements the second solution, which turned out to 
> > be very easy.

Great, thank you.

> +1, except that you should add documentation for NO_INSTALL to the
> list of definable symbols at the head of pgxs.mk, and to the list
> in extend.sgml (compare that for NO_INSTALLCHECK).

I propose this.

-- 
Álvaro Herrera   Valdivia, Chile
diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml
index ec95b4eb01..1a4f208692 100644
--- a/doc/src/sgml/extend.sgml
+++ b/doc/src/sgml/extend.sgml
@@ -1672,6 +1672,16 @@ include $(PGXS)
   
  
 
+ 
+  NO_INSTALL
+  
+   
+don't define an install target, useful for test
+modules that don't need their build products to be installed.
+   
+  
+ 
+
  
   NO_INSTALLCHECK
   
diff --git a/src/makefiles/pgxs.mk b/src/makefiles/pgxs.mk
index 271e7eaba8..224eac069c 100644
--- a/src/makefiles/pgxs.mk
+++ b/src/makefiles/pgxs.mk
@@ -49,6 +49,8 @@
 #   TAP_TESTS -- switch to enable TAP tests
 #   ISOLATION -- list of isolation test cases
 #   ISOLATION_OPTS -- additional switches to pass to pg_isolation_regress
+#   NO_INSTALL -- don't define an install target, useful for test modules
+# that don't need their build products to be installed
 #   NO_INSTALLCHECK -- don't define an installcheck target, useful e.g. if
 # tests require special configuration, or don't use pg_regress
 #   EXTRA_CLEAN -- extra files to remove in 'make clean'


Re: pgbench test failing on 14beta1 on Debian/i386

2021-05-19 Thread Fabien COELHO



Or, (3) remove this test?  I am not quite sure what there is to gain
with this extra test considering all the other tests with permute()
already present in this script.


Yes, I think removing the test is the best option. It was originally
added because there was a separate code path for larger permutation
sizes that needed testing, but that's no longer the case so the test
really isn't adding anything.


Hmmm…

It is the one test which worked in actually detecting an issue, so I would
not say that it is not adding anything, on the contrary, it did prove its
value! The permute function is expected to be deterministic on different
platforms and architectures, and it is not.



In fact what it demonstrates is that the results from permute(), like
all the other pgbench random functions, will vary by platform for
sufficiently large size parameters.


Indeed, it is the case if the underlying math use doubles & large numbers. 
For integer-only computations it should be safe though, and permute should 
be in this category.



I'd agree with a two phases approach: drop the test in the short term and
deal with the PRNG later. I'm so unhappy with this 48 bit PRNG that I
may be motivated enough to attempt to replace it, or at least add a better
(faster?? larger state?? same/better quality?) alternative.


I don't necessarily have a problem with that provided the replacement
is well-chosen and has a proven track record (i.e., let's not invent
our own PRNG).


Yes, obviously, I'm not daft enough to reinvent a PRNG. The question is to 
chose one, motivate the choice, and build the relevant API for what pg 
needs, possibly with some benchmarking.



For now though, I'll go remove the test.


This also removes the reminder…
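
For reference only, here is a tiny self-contained example of the sort of
small, well-established, integer-only 64-bit generator such an evaluation
might start from (splitmix64, shown purely as an illustration; nothing here
implies any choice for pgbench):

/*
 * splitmix64: a public-domain PRNG with 64-bit state and integer-only
 * arithmetic, so its output does not depend on platform floating-point
 * behaviour or on the width of "long".
 */
#include <stdint.h>
#include <stdio.h>

static uint64_t
splitmix64_next(uint64_t *state)
{
	uint64_t	z = (*state += UINT64_C(0x9E3779B97F4A7C15));

	z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
	z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
	return z ^ (z >> 31);
}

int
main(void)
{
	uint64_t	state = 42;		/* seed */

	for (int i = 0; i < 3; i++)
		printf("%llu\n", (unsigned long long) splitmix64_next(&state));
	return 0;
}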

--
Fabien.

Re: libpq_pipeline in tmp_install

2021-05-19 Thread Tom Lane
Peter Eisentraut  writes:
> On 10.05.21 20:26, Peter Eisentraut wrote:
>> The reason this is there is that the test suite uses PGXS to build the 
>> test program, and so things get installed automatically.  I suggest that 
>> we should either write out the build system by hand to avoid this, or 
>> maybe extend PGXS to support building programs but not installing them. 

> Here is a patch that implements the second solution, which turned out to 
> be very easy.

+1, except that you should add documentation for NO_INSTALL to the
list of definable symbols at the head of pgxs.mk, and to the list
in extend.sgml (compare that for NO_INSTALLCHECK).

regards, tom lane




Re: PG 14 release notes, first draft

2021-05-19 Thread Justin Pryzby
These sound weird since markup was added in 6a5bde7d4:
https://www.postgresql.org/docs/devel/release-14.html
| Remove server and Chapter 34 support for the version 2 wire protocol (Heikki 
Linnakangas)
...
| Pass doubled quote marks in Chapter 36 SQL command strings literally (Tom 
Lane)

-Remove server and libpq support for the version 2 wire protocol (Heikki 
Linnakangas)
+Remove server and  support for the version 2 wire protocol (Heikki Linnakangas)

> Force custom server variable names to match the pattern used for unquoted SQL 
> identifiers (Tom Lane)
Say "Require" not force?

> This setting was disabled in PostgreSQL version 13.3.
"disabled" sounds like it was set to "off".  Maybe say it was ignored.

> Add long-running queries to be canceled if the client disconnects (Sergey 
> Cherkashin, Thomas Munro)
Should say: Allow

> The server variable client_connection_check_interval allows supporting 
> operating systems, e.g., Linux, to automatically cancel queries by 
> disconnected clients.
say "some operating systems" ?

> This can be disabled by turning client option "sslsni" off.
"turning off"

| Add %P to log_line_prefix to report the parallel group leader (Justin Pryzby)

Maybe it should say "Allow %P in log_line_prefix to ...", otherwise it sounds
like the default was changed.

| Reduce the default value of vacuum_cost_page_miss (Peter Geoghegan) 
|  This new default better reflects current hardware capabilities. 
Also say: the previous default was 10.




Re: Added missing tab completion for alter subscription set option

2021-05-19 Thread vignesh C
On Tue, May 18, 2021 at 9:20 PM Alvaro Herrera  wrote:
>
> On 2021-May-14, vignesh C wrote:
>
> > While I was reviewing one of the logical decoding features, I found
> > Streaming and binary options were missing in tab completion for the
> > alter subscription set option, the attached patch has the changes for
> > the same.
> > Thoughts?
>
> I wish we didn't have to keep knowledge in the psql source on which
> option names are to be used for each command.  If we had some function
>  SELECT pg_completion_options('alter subscription set');
> that returned the list of options usable for each command, we wouldn't
> have to ... psql would just retrieve the list of options for the current
> command.
>
> Maintaining such a list does not seem hard -- for example we could just
> have a function alongside parse_subscription_option() that returns the
> names that are recognized by that one.  If we drive the implementation
> of both off a single struct, it would never be outdated.
>

I like the idea of maintaining a common list, that will also prevent
options getting missed in the future. I will work on this and provide
a patch for it.

Regards,
Vignesh




Freenode woes

2021-05-19 Thread Christoph Berg
Fwiw, if the PostgreSQL project is considering moving the #postgresql
IRC channel(s) elsewhere given [1,2], I'm a member of OFTC.net's network
operations committee and would be happy to help.

[1] https://gist.github.com/aaronmdjones/1a9a93ded5b7d162c3f58bdd66b8f491
[2] https://fuchsnet.ch/freenode-resign-letter.txt

Christoph




SSL Tests for sslinfo extension

2021-05-19 Thread Daniel Gustafsson
While playing around with the recent SSL testharness changes I wrote a test
suite for sslinfo as a side effect, which seemed valuable in its own right as
we currently have no coverage of this code.  The small change needed to the
testharness is to support installing modules, which is broken out into 0001 for
easier reading.

I'll park this in the next commitfest for now.

--
Daniel Gustafsson   https://vmware.com/



0002-Add-tests-for-sslinfo.patch
Description: Binary data


0001-Extend-configure_test_server_for_ssl-to-add-extensio.patch
Description: Binary data


Re: Skip partition tuple routing with constant partition key

2021-05-19 Thread Amit Langote
On Tue, May 18, 2021 at 11:11 AM houzj.f...@fujitsu.com
 wrote:
> > > Hmm, does this seem common enough for the added complexity to be
> > worthwhile?
> >
> > I'd also like to know if there's some genuine use case for this. For testing
> > purposes does not seem to be quite a good enough reason.
>
> Thanks for the response.
>
> For some big data scenario, we sometimes transfer data from one table(only 
> store not expired data)
> to another table(historical data) for future analysis.
> In this case, we import data into historical table regularly(could be one day 
> or half a day),
> And the data is likely to be imported with date label specified, then all of 
> the data to be
> imported this time belong to the same partition which partition by time range.

Is directing that data directly into the appropriate partition not an
acceptable solution to address this particular use case?  Yeah, I know
we should avoid encouraging users to perform DML directly on
partitions, but...

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Skip partition tuple routing with constant partition key

2021-05-19 Thread Amit Langote
On Tue, May 18, 2021 at 10:28 AM David Rowley  wrote:
> On Tue, 18 May 2021 at 01:31, Amit Langote  wrote:
> > Hmm, does this seem common enough for the added complexity to be worthwhile?
>
> I'd also like to know if there's some genuine use case for this. For
> testing purposes does not seem to be quite a good enough reason.
>
> A slightly different optimization that I have considered and even
> written patches before was to have ExecFindPartition() cache the last
> routed to partition and have it check if the new row can go into that
> one on the next call.  I imagined there might be a use case for
> speeding that up for RANGE partitioned tables since it seems fairly
> likely that most use cases, at least for time series ranges will
> always hit the same partition most of the time.   Since RANGE requires
> a binary search there might be some savings there.  I imagine that
> optimisation would never be useful for HASH partitioning since it
> seems most likely that we'll be routing to a different partition each
> time and wouldn't save much since routing to hash partitions are
> cheaper than other types.  LIST partitioning I'm not so sure about. It
> seems much less likely than RANGE to hit the same partition twice in a
> row.
>
> IIRC, the patch did something like call ExecPartitionCheck() on the
> new tuple with the previously routed to ResultRelInfo. I think the
> last used partition was cached somewhere like relcache (which seems a
> bit questionable).   Likely this would speed up the example case here
> a bit. Not as much as the proposed patch, but it would likely apply in
> many more cases.
>
> I don't think I ever posted the patch to the list, and if so I no
> longer have access to it, so it would need to be done again.

I gave a shot to implementing your idea and ended up with the attached
PoC patch, which does pass make check-world.

I do see some speedup:

-- creates a range-partitioned table with 1000 partitions
create unlogged table foo (a int) partition by range (a);
select 'create unlogged table foo_' || i || ' partition of foo for
values from (' || (i-1)*100000+1 || ') to (' || i*100000+1 || ');'
from generate_series(1, 1000) i;
\gexec

-- generates a 100 million record file
copy (select generate_series(1, 100000000)) to '/tmp/100m.csv' csv;

Times for loading that file compare as follows:

HEAD:

postgres=# copy foo from '/tmp/100m.csv' csv;
COPY 100000000
Time: 31813.964 ms (00:31.814)
postgres=# copy foo from '/tmp/100m.csv' csv;
COPY 100000000
Time: 31972.942 ms (00:31.973)
postgres=# copy foo from '/tmp/100m.csv' csv;
COPY 100000000
Time: 32049.046 ms (00:32.049)

Patched:

postgres=# copy foo from '/tmp/100m.csv' csv;
COPY 100000000
Time: 26151.158 ms (00:26.151)
postgres=# copy foo from '/tmp/100m.csv' csv;
COPY 100000000
Time: 28161.082 ms (00:28.161)
postgres=# copy foo from '/tmp/100m.csv' csv;
COPY 100000000
Time: 26700.908 ms (00:26.701)

I guess it would be nice if we could fit in a solution for the use
case that houjz mentioned as a special case.  BTW, houjz, could you
please check if a patch like this one helps the case you mentioned?
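
Purely to illustrate the shape of the idea being measured here, below is a
toy, self-contained sketch with made-up names; it is not the attached PoC,
whose real code caches a ResultRelInfo and re-validates it with
ExecPartitionCheck before falling back to the usual routing:

/*
 * Toy model of "remember the last range partition we routed to and try it
 * first": a cheap bounds re-check usually avoids the binary search when
 * consecutive rows fall into the same partition, as in bulk COPY loads.
 * Assumes all input values fall within the defined ranges.
 */
#include <stdio.h>

#define NPARTS 1000
#define WIDTH  100000			/* rows per partition in this toy example */

static int	last_part = -1;		/* cached index of last matched partition */

static int
route_row(int a)
{
	int			lo,
				hi;

	/* fast path: does the row fit the previously used partition? */
	if (last_part >= 0 &&
		a > last_part * WIDTH && a <= (last_part + 1) * WIDTH)
		return last_part;

	/* slow path: binary search over partition upper bounds */
	lo = 0;
	hi = NPARTS - 1;
	while (lo < hi)
	{
		int			mid = (lo + hi) / 2;

		if (a <= (mid + 1) * WIDTH)
			hi = mid;
		else
			lo = mid + 1;
	}
	last_part = lo;
	return lo;
}

int
main(void)
{
	/* consecutive values mostly hit the cached partition */
	printf("%d %d %d\n", route_row(1), route_row(2), route_row(100001));
	return 0;
}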




--
Amit Langote
EDB: http://www.enterprisedb.com


ExecFindPartition-cache-partition-PoC.patch
Description: Binary data


Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Tomas Vondra

On 5/19/21 1:42 AM, Tom Lane wrote:

I discovered $SUBJECT after wondering why hyrax hadn't reported
in recently, and trying to run check-world under CCA to see if
anything got stuck.  Indeed it did --- although this doesn't
explain the radio silence from hyrax, because that animal doesn't
run any TAP tests.  (Neither does avocet, which I think is the
only other active CCA critter.  So this could have been broken
for a very long time.)



There are three CCA animals on the same box (avocet, husky and 
trilobite) with different compilers, running in a round-robin manner. 
One cycle took about 14 days, but about a month ago the machine got 
stuck, requiring a hard reboot about a week ago (no idea why it got 
stuck). It has more CPU power now (8 cores instead of 2), so it should 
take less time to run one test cycle.


avocet already ran all the tests, husky is running HEAD at the moment 
and then it'll be trilobite's turn ... AFAICS none of those runs seems 
to have failed or got stuck so far.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-05-19 Thread Bharath Rupireddy
On Wed, May 19, 2021 at 5:56 PM Amit Kapila  wrote:
>
> On Wed, May 19, 2021 at 4:42 PM Bharath Rupireddy
>  wrote:
> >
> > On Wed, May 19, 2021 at 4:10 PM Amit Kapila  wrote:
> > >
> > > On Wed, May 19, 2021 at 3:08 PM Bharath Rupireddy
> > >  wrote:
> > > >
> > > > On Wed, May 19, 2021 at 2:33 PM Amul Sul  wrote:
> > > > >
> > > > > On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy
> > > > >  wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > parse_subscription_options function has some similar code when
> > > > > > throwing errors [with the only difference in the option]. I feel we
> > > > > > could just use a variable for the option and use it in the error.
> > >
> > > I am not sure how much it helps to just refactor this part of the code
> > > alone unless we need to add/change it more. Having said that, this
> > > function is being modified by one of the proposed patches for logical
> > > decoding of 2PC and I noticed that the proposed patch is adding more
> > > parameters to this function which already takes 14 input parameters,
> > > so I suggested refactoring it. See comment 11 in email[1]. See, if
> > > that makes sense to you then we can refactor this function such that
> > > it can be enhanced easily by future patches.
> >
> > Thanks Amit for the comments. I agree to move the parse options to a
> > new structure ParseSubOptions as suggested. Then the function can just
> > be parse_subscription_options(ParseSubOptions opts); I wonder if we
> > should also have a structure for parse_publication_options as we might
> > add new options there in the future?
> >
>
> That function has just 5 parameters so not sure if that needs the same
> treatment. Let's leave it for now.

Thanks. I will work on the new structure ParseSubOption only for
subscription options.
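
For illustration, one possible shape of that struct is sketched below; the
field names are guesses that simply mirror the existing subscription
options, and the eventual patch may well look different:

/*
 * Rough sketch only: bundle the options parsed by
 * parse_subscription_options() into one struct so the function takes a
 * single argument instead of ~14 out-parameters.
 */
#include <stdbool.h>
#include <stddef.h>

typedef struct ParseSubOptions
{
	bool		connect;
	bool		enabled;
	bool		create_slot;
	char	   *slot_name;
	bool		slot_name_given;
	bool		copy_data;
	char	   *synchronous_commit;
	bool		refresh;
	bool		binary;
	bool		binary_given;
	bool		streaming;
	bool		streaming_given;
} ParseSubOptions;

int
main(void)
{
	ParseSubOptions opts = {0};	/* callers would fill defaults, then parse */

	opts.enabled = true;
	(void) opts;
	return 0;
}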

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Alias collision in `refresh materialized view concurrently`

2021-05-19 Thread Bharath Rupireddy
On Wed, May 19, 2021 at 5:33 PM Mathis Rudolf  wrote:
>
> Hello,
>
> we had a Customer-Report in which `refresh materialized view
> CONCURRENTLY` failed with: `ERROR: column reference "mv" is ambiguous`
>
> They're using `mv` as an alias for one column and this is causing a
> collision with an internal alias. They also made it reproducible like this:
> ```
> create materialized view testmv as select 'asdas' mv; --ok
> create unique index on testmv (mv); --ok
> refresh materialized view testmv; --ok
> refresh materialized view CONCURRENTLY testmv; ---BAM!
> ```
>
> ```
> ERROR: column reference "mv" is ambiguous
> LINE 1: ...alog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE ...
>^
> QUERY:  CREATE TEMP TABLE pg_temp_4.pg_temp_218322_2 AS SELECT mv.ctid
> AS tid, newdata FROM public.testmv mv FULL JOIN pg_temp_4.pg_temp_218322
> newdata ON (newdata.mv OPERATOR(pg_catalog.=) mv.mv AND newdata
> OPERATOR(pg_catalog.*=) mv) WHERE newdata IS NULL OR mv IS NULL ORDER BY tid
> ```
>
> The corresponding Code is in `matview.c` in function
> `refresh_by_match_merge`. With adding a prefix like `_pg_internal_` we
> could make collisions pretty unlikely, without intrusive changes.
>
> The appended patch does this change for the aliases `mv`, `newdata` and
> `newdata2`.

I think it's better to have some random name, see below. We could
either use "OIDNewHeap" or "MyBackendId" to make those column names
unique and almost unguessable. So, something like "pg_temp1_",
"pg_temp2_" or "pg_temp3_" and so on would be better IMO.

snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap);
snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d", MyBackendId);
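
As a purely illustrative sketch of that idea (the naming scheme below is an
assumption for discussion, not what matview.c does today):

/*
 * Build collision-resistant query aliases from an OID, along the lines of
 * the pg_temp_%u naming above.
 */
#include <stdio.h>

int
main(void)
{
	unsigned int matviewOid = 218322;	/* hypothetical example OID */
	char		mv_alias[64];
	char		newdata_alias[64];

	snprintf(mv_alias, sizeof(mv_alias), "pg_temp_mv_%u", matviewOid);
	snprintf(newdata_alias, sizeof(newdata_alias), "pg_temp_newdata_%u",
			 matviewOid);
	printf("%s %s\n", mv_alias, newdata_alias);
	return 0;
}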

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-05-19 Thread Amit Kapila
On Wed, May 19, 2021 at 4:42 PM Bharath Rupireddy
 wrote:
>
> On Wed, May 19, 2021 at 4:10 PM Amit Kapila  wrote:
> >
> > On Wed, May 19, 2021 at 3:08 PM Bharath Rupireddy
> >  wrote:
> > >
> > > On Wed, May 19, 2021 at 2:33 PM Amul Sul  wrote:
> > > >
> > > > On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > parse_subscription_options function has some similar code when
> > > > > throwing errors [with the only difference in the option]. I feel we
> > > > > could just use a variable for the option and use it in the error.
> >
> > I am not sure how much it helps to just refactor this part of the code
> > alone unless we need to add/change it more. Having said that, this
> > function is being modified by one of the proposed patches for logical
> > decoding of 2PC and I noticed that the proposed patch is adding more
> > parameters to this function which already takes 14 input parameters,
> > so I suggested refactoring it. See comment 11 in email[1]. See, if
> > that makes sense to you then we can refactor this function such that
> > it can be enhanced easily by future patches.
>
> Thanks Amit for the comments. I agree to move the parse options to a
> new structure ParseSubOptions as suggested. Then the function can just
> be parse_subscription_options(ParseSubOptions opts); I wonder if we
> should also have a structure for parse_publication_options as we might
> add new options there in the future?
>

That function has just 5 parameters so not sure if that needs the same
treatment. Let's leave it for now.

-- 
With Regards,
Amit Kapila.




Re: Race condition in recovery?

2021-05-19 Thread Dilip Kumar
On Tue, May 18, 2021 at 12:22 PM Kyotaro Horiguchi
 wrote:

> And finally I think I could reach the situation the commit wanted to fix.
>
> I took a basebackup from a standby just before replaying the first
> checkpoint of the new timeline (by using debugger), without copying
> pg_wal.  In this backup, the control file contains checkPointCopy of
> the previous timeline.
>
> I modified StartXLOG so that expectedTLEs is set just after first
> determining recoveryTargetTLI, then started the grandchild node.  I
> have the following error and the server fails to continue replication.

> [postmaster] LOG:  starting PostgreSQL 14beta1 on x86_64-pc-linux-gnu...
> [startup] LOG:  database system was interrupted while in recovery at log...
> [startup] LOG:  set expectedtles tli=6, length=1
> [startup] LOG:  Probing history file for TLI=7
> [startup] LOG:  entering standby mode
> [startup] LOG:  scanning segment 3 TLI 6, source 0
> [startup] LOG:  Trying fetching history file for TLI=6
> [walreceiver] LOG:  fetching timeline history file for timeline 5 from pri...
> [walreceiver] LOG:  fetching timeline history file for timeline 6 from pri...
> [walreceiver] LOG:  started streaming ... primary at 0/300 on timeline 5
> [walreceiver] DETAIL:  End of WAL reached on timeline 5 at 0/30006E0.
> [startup] LOG:  unexpected timeline ID 1 in log segment 
> 00050003, offset 0
> [startup] LOG:  Probing history file for TLI=7
> [startup] LOG:  scanning segment 3 TLI 6, source 0
> (repeats forever)

So IIUC, this log shows that
"ControlFile->checkPointCopy.ThisTimeLineID" is 6 but the
"ControlFile->checkPoint" record is on TLI 5?  I think if you had the
old version of the code (before the commit), or the code below [1] right
after initializing expectedTLEs, then you would have hit the FATAL error
that the patch fixed.

While debugging did you check what was the "ControlFile->checkPoint"
LSN vs the first LSN of the first segment with TL6?

expectedTLEs = readTimeLineHistory(recoveryTargetTLI);
[1]
if (tliOfPointInHistory(ControlFile->checkPoint, expectedTLEs) !=
ControlFile->checkPointCopy.ThisTimeLineID)
{
report(FATAL..
}

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-19 Thread David Rowley
On Mon, 17 May 2021 at 14:52, Andy Fan  wrote:
> Would marking the new added RestrictInfo.norm_selec > 1 be OK?

There would be cases you'd want to not count the additional clauses in
the selectivity estimation and there would be cases you would want to.

For example:

SELECT ... FROM t1 INNER JOIN t2 ON t1.dt = t2.dt WHERE t1.dt BETWEEN
'date1' AND 'date2';

If you derived that t2.dt is also BETWEEN 'date1' AND 'date2' then
you'd most likely want to include those quals for scans feeding merge,
hash and non-parameterized nested loop joins, so you'd also want to
count them in your selectivity estimations, else you'd feed junk
values into the join selectivity estimations.

Parameterized nested loop joins might be different as if you were
looping up an index for t1.dt values on some index on t2.dt, then
you'd likely not want to bother also filtering out the between clause
values too. They're redundant in that case.

I imagined we'd have some functions in equivclass.c that allows you to
choose if you wanted the additional filters or not.

Tom's example, WHERE a = b AND a IN (1,2,3), if a and b were in the
same relation then you'd likely never want to include the additional
quals.  The only reason I could think that it would be a good idea is
if "b" had an index but "a" didn't.  I've not checked the code, but
the index matching code might already allow that to work anyway.

David




Re: pgbench test failing on 14beta1 on Debian/i386

2021-05-19 Thread Dean Rasheed
On Wed, 19 May 2021 at 12:07, Fabien COELHO  wrote:
>
> Attached patch deactivates the test with comments to outline that there
> is an issue to fix… so it is *not* removed.
>

I opted to just remove the test rather than comment it out, since the
issue highlighted isn't specific to permute(). Also changing the PRNG
will completely change the results, so all the test values would
require rewriting, rather than it just being a case of uncommenting
the test and expecting it to work.

> I'm obviously okay with providing an alternate PRNG, let me know if this
> is the prefered option.
>

That's something for consideration in v15. If we do decide we want a
new PRNG, it should apply across the board to all pgbench random
functions.

Regards,
Dean




Alias collision in `refresh materialized view concurrently`

2021-05-19 Thread Mathis Rudolf

Hello,

we had a Customer-Report in which `refresh materialized view 
CONCURRENTLY` failed with: `ERROR: column reference "mv" is ambiguous`


They're using `mv` as an alias for one column and this is causing a 
collision with an internal alias. They also made it reproducible like this:

```
create materialized view testmv as select 'asdas' mv; --ok
create unique index on testmv (mv); --ok
refresh materialized view testmv; --ok
refresh materialized view CONCURRENTLY testmv; ---BAM!
```

```
ERROR: column reference "mv" is ambiguous
LINE 1: ...alog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE ...
  ^
QUERY:  CREATE TEMP TABLE pg_temp_4.pg_temp_218322_2 AS SELECT mv.ctid 
AS tid, newdata FROM public.testmv mv FULL JOIN pg_temp_4.pg_temp_218322 
newdata ON (newdata.mv OPERATOR(pg_catalog.=) mv.mv AND newdata 
OPERATOR(pg_catalog.*=) mv) WHERE newdata IS NULL OR mv IS NULL ORDER BY tid

```

The corresponding Code is in `matview.c` in function 
`refresh_by_match_merge`. With adding a prefix like `_pg_internal_` we 
could make collisions pretty unlikely, without intrusive changes.


The appended patch does this change for the aliases `mv`, `newdata` and 
`newdata2`.


Kind regards,
Mathis

From 121b06258b80e5097c963335794e9a89f7e6d3ec Mon Sep 17 00:00:00 2001
From: Mathis Rudolf 
Date: Mon, 17 May 2021 14:55:49 +0200
Subject: [PATCH] Fix alias collision for REFRESH MV CONCURRENTLY

This patch adds the prefix _pg_internal_ to aliases like 'mv' and
'newdata' in 'refresh_by_match_merge()', which makes it unlikely to cause
any collisions with user-created MVs.
---
 src/backend/commands/matview.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 172ec6e982..04602dba80 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -629,12 +629,12 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 	 */
 	resetStringInfo(&querybuf);
 	appendStringInfo(&querybuf,
-	 "SELECT newdata FROM %s newdata "
-	 "WHERE newdata IS NOT NULL AND EXISTS "
-	 "(SELECT 1 FROM %s newdata2 WHERE newdata2 IS NOT NULL "
-	 "AND newdata2 OPERATOR(pg_catalog.*=) newdata "
-	 "AND newdata2.ctid OPERATOR(pg_catalog.<>) "
-	 "newdata.ctid)",
+	 "SELECT pg_internal_newdata FROM %s pg_internal_newdata "
+	 "WHERE pg_internal_newdata IS NOT NULL AND EXISTS "
+	 "(SELECT 1 FROM %s pg_internal_newdata2 WHERE pg_internal_newdata2 IS NOT NULL "
+	 "AND pg_internal_newdata2 OPERATOR(pg_catalog.*=) pg_internal_newdata "
+	 "AND pg_internal_newdata2.ctid OPERATOR(pg_catalog.<>) "
+	 "pg_internal_newdata.ctid)",
 	 tempname, tempname);
 	if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT)
 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
@@ -662,8 +662,8 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 	resetStringInfo(&querybuf);
 	appendStringInfo(&querybuf,
 	 "CREATE TEMP TABLE %s AS "
-	 "SELECT mv.ctid AS tid, newdata "
-	 "FROM %s mv FULL JOIN %s newdata ON (",
+	 "SELECT pg_internal_mv.ctid AS tid, pg_internal_newdata "
+	 "FROM %s pg_internal_mv FULL JOIN %s pg_internal_newdata ON (",
 	 diffname, matviewname, tempname);
 
 	/*
@@ -756,9 +756,9 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 if (foundUniqueIndex)
 	appendStringInfoString(, " AND ");
 
-leftop = quote_qualified_identifier("newdata",
+leftop = quote_qualified_identifier("pg_internal_newdata",
 	NameStr(attr->attname));
-rightop = quote_qualified_identifier("mv",
+rightop = quote_qualified_identifier("pg_internal_mv",
 	 NameStr(attr->attname));
 
 generate_operator_clause(&querybuf,
@@ -786,8 +786,8 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 	Assert(foundUniqueIndex);
 
 	appendStringInfoString(&querybuf,
-		   " AND newdata OPERATOR(pg_catalog.*=) mv) "
-		   "WHERE newdata IS NULL OR mv IS NULL "
+		   " AND pg_internal_newdata OPERATOR(pg_catalog.*=) pg_internal_mv) "
+		   "WHERE pg_internal_newdata IS NULL OR pg_internal_mv IS NULL "
 		   "ORDER BY tid");
 
 	/* Create the temporary "diff" table. */
@@ -813,10 +813,10 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 	/* Deletes must come before inserts; do them first. */
 	resetStringInfo(&querybuf);
 	appendStringInfo(&querybuf,
-	 "DELETE FROM %s mv WHERE ctid OPERATOR(pg_catalog.=) ANY "
+	 "DELETE FROM %s pg_internal_mv WHERE ctid OPERATOR(pg_catalog.=) ANY "
 	 "(SELECT diff.tid FROM %s diff "
 	 "WHERE diff.tid IS NOT NULL "
-	 "AND diff.newdata IS NULL)",
+	 "AND diff.pg_internal_newdata IS NULL)",
 	 matviewname, diffname);
 	if (SPI_exec(querybuf.data, 0) != SPI_OK_DELETE)
 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);
@@ -824,7 +824,7 @@ refresh_by_match_merge(Oid matviewOid, Oid 

Re: Parallel INSERT SELECT take 2

2021-05-19 Thread Greg Nancarrow
On Fri, May 14, 2021 at 6:24 PM houzj.f...@fujitsu.com
 wrote:
>
> Thanks for the comments, I have posted new version patches with this change.
>
> > How about reorganisation of the patches like the following?
> > 0001: CREATE ALTER TABLE PARALLEL DML
> > 0002: parallel-SELECT-for-INSERT (planner changes,
> > max_parallel_hazard() update, XID changes)
> > 0003: pg_get_parallel_safety()
> > 0004: regression test updates
>
> Thanks, it looks good and I reorganized the latest patchset in this way.
>
> Attaching new version patches with the following change.
>
> 0003
> Change functions arg type to regclass.
>
> 0004
> remove updates for "serial_schedule".
>

I've got some comments for the V4 set of patches:

(0001)

(i) Patch comment needs a little updating (suggested change is below):

Enable users to declare a table's parallel data-modification safety
(SAFE/RESTRICTED/UNSAFE).

Add a table property that represents parallel safety of a table for
DML statement execution.
It may be specified as follows:

CREATE TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE };
ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE };

This property is recorded in pg_class's relparallel column as 'u',
'r', or 's', just like pg_proc's proparallel.
The default is UNSAFE.

The planner assumes that all of the table, its descendant partitions,
and their ancillary objects have,
at worst, the specified parallel safety. The user is responsible for
its correctness.

---

NOTE: The following sentence was removed from the original V4 0001
patch comment (since this version of the patch is not doing runtime
parallel-safety checks on functions):.

If the parallel processes
find an object that is less safer than the assumed parallel safety during
statement execution, it throws an ERROR and abort the statement execution.


(ii) Update message to say "a foreign ...":

BEFORE:
+ errmsg("cannot support parallel data modification on foreign or
temporary table")));

AFTER:
+ errmsg("cannot support parallel data modification on a foreign or
temporary table")));


(iii) strVal() macro already casts to "Value *", so the cast can be
removed from the following:

+ char*parallel = strVal((Value *) def);


(0003)

(i) Suggested updates to the patch comment:

Provide a utility function "pg_get_parallel_safety(regclass)" that
returns records of
(objid, classid, parallel_safety) for all parallel unsafe/restricted
table-related objects
from which the table's parallel DML safety is determined. The user can
use this information
during development in order to accurately declare a table's parallel
DML safety, or to
identify any problematic objects if a parallel DML fails or behaves
unexpectedly.

When the use of an index-related parallel unsafe/restricted function
is detected, both the
function oid and the index oid are returned.

Provide a utility function "pg_get_max_parallel_hazard(regclass)" that
returns the worst
parallel DML safety hazard that can be found in the given relation.
Users can use this
function to do a quick check without caring about specific
parallel-related objects.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Inaccurate error message when set fdw batch_size to 0

2021-05-19 Thread Fujii Masao




On 2021/05/19 20:01, Bharath Rupireddy wrote:

On Mon, May 17, 2021 at 4:23 PM Amit Kapila  wrote:


On Tue, May 11, 2021 at 11:28 AM Bharath Rupireddy
 wrote:


On Mon, May 10, 2021 at 7:39 PM Tom Lane  wrote:


Bharath Rupireddy  writes:

On Mon, May 10, 2021 at 12:00 PM Tom Lane  wrote:

Yeah, this error message seems outright buggy.  However, it's a minor
matter.  Also, some people think "positive" is the same thing as
"non-negative", so maybe we need less ambiguous wording?



Since value 0 can't be considered as either a positive or negative
integer, I think we can do as following(roughly):



if (value < 0) "requires a zero or positive integer value"
if (value <= 0) "requires a positive integer value"


I was thinking of avoiding the passive voice and writing

 "foo must be greater than zero"


+1 for "foo must be greater than zero" if (foo <= 0) kind of errors.
But, we also have some values for which zero is accepted, see below
error messages. How about the error message "foo must be greater than
or equal to zero"?



+1 for your proposed message for the cases where we have a check if
(foo < 0). Tom, Michael, do you see any problem with the proposed
message? We would like to make a similar change at another place [1]
so wanted to be consistent.

[1] - 
https://www.postgresql.org/message-id/CALj2ACWGB9oHCR5ygkc8u6_QDqecObf9j2MxtOgsjZMMKsLj%3DQ%40mail.gmail.com


Thanks all for your inputs. PSA v2 patch that uses the new convention.


Thanks for the patch!

ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
-errmsg("%s requires a non-negative numeric value",
+errmsg("%s must be greater than or equal to zero",
def->defname)));
}
else if (strcmp(def->defname, "extensions") == 0)
@@ -142,7 +142,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
if (fetch_size <= 0)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
-errmsg("%s requires a non-negative integer value",
+errmsg("%s must be greater than zero",

I'm fine to convert "non-negative" word to "greater than" or "greater than
or equal to" in the messages. But this change also seems to get rid of
the information about the data type of the option from the message.
I'm not sure if this is an improvement. Probably isn't it better to
convert "requires a non-negative integer value" to "must be an integer value
greater than zero"?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: postgres_fdw - should we tighten up batch_size, fetch_size options against non-numeric values?

2021-05-19 Thread Fujii Masao




On 2021/05/19 14:34, Bharath Rupireddy wrote:

On Wed, May 19, 2021 at 8:28 AM Fujii Masao  wrote:

I agree with throwing an error for non-numeric junk though.
Allowing that on the grounds of backwards compatibility
seems like too much of a stretch.


+1.


+1.


+1


Thanks all for your inputs. PSA which uses parse_int for
batch_size/fech_size and parse_real for fdw_startup_cost and
fdw_tuple_cost.


Thanks for updating the patch! It basically looks good to me.

-   val = strtod(defGetString(def), &endp);
-   if (*endp || val < 0)
+   is_parsed = parse_real(defGetString(def), &val, 0, NULL);
+   if (!is_parsed || val < 0)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
 errmsg("%s requires a non-negative numeric value",

Isn't it better to check "!is_parsed" and "val < 0" separately like
reloptions.c does? That is, we should throw different error messages
for them?

ERRCODE_SYNTAX_ERROR seems strange for this type of error?
ERRCODE_INVALID_PARAMETER_VALUE is better and more proper?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Incorrect usage of strtol, atoi for non-numeric junk inputs

2021-05-19 Thread Bharath Rupireddy
Hi,

While working on [1], I found that some parts of the code are using
strtol and atoi without checking for non-numeric junk input strings. I
found this strange. Most of the time users provide proper numeric
strings but there can be some scenarios where these strings are not
user-supplied but generated by some other code which may contain
non-numeric strings. Shouldn't the code use strtol or atoi
appropriately and error out in such cases?  One way to fix this once
and for all is to have a common API something like int
pg_strtol/pg_str_convert_to_int(char *opt_name, char *opt_value) which
returns a generic message upon invalid strings ("invalid value \"%s\"
is provided for option \"%s\", opt_name, opt_value) and returns
integers on successful parsing.

Although this is not a critical issue, I would like to seek thoughts.

[1] - 
https://www.postgresql.org/message-id/CALj2ACVMO6wY5Pc4oe1OCgUOAtdjHuFsBDw8R5uoYR86eWFQDA%40mail.gmail.com
[2] strtol:
vacuumlo.c --> ./vacuumlo -l '2323adfd'  postgres -p '5432ERE'
libpq_pipeline.c --> ./libpq_pipeline -r '2232adf' tests

atoi:
pg_amcheck.c  --> ./pg_amcheck -j '1211efe'
pg_basebackup --> ./pg_basebackup -Z 'EFEF' -s 'a$##'
pg_receivewal.c --> ./pg_receivewal -p '54343GDFD' -s '54343GDFD'
pg_recvlogical.c --> ./pg_recvlogical -F '-$#$#' -p '5432$$$' -s '100$$$'
pg_checksums.c. --> ./pg_checksums -f '1212abc'
pg_ctl.c --> ./pg_ctl -t 'efc'
pg_dump.c --> ./pg_dump -j '454adc' -Z '4adc' --extra-float-digits '-14adc'
pg_upgrade/option.c
pgbench.c
reindexdb.c
vacuumdb.c
pg_regress.c
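
A minimal, self-contained sketch of the kind of common helper suggested
above (the function name, message wording, and exit-on-error behaviour are
placeholders for discussion, not an existing or proposed PostgreSQL API):

/*
 * Parse an integer option with full junk/range checking, so a frontend
 * tool rejects values such as "1211efe" instead of silently using the
 * partial result that atoi() would return.
 */
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

static int
pg_str_to_int(const char *opt_name, const char *opt_value)
{
	char	   *endptr;
	long		val;

	errno = 0;
	val = strtol(opt_value, &endptr, 10);

	if (endptr == opt_value || *endptr != '\0' ||
		errno == ERANGE || val < INT_MIN || val > INT_MAX)
	{
		fprintf(stderr, "invalid value \"%s\" provided for option \"%s\"\n",
				opt_value, opt_name);
		exit(1);
	}
	return (int) val;
}

int
main(void)
{
	printf("%d\n", pg_str_to_int("-j", "8"));			/* ok */
	printf("%d\n", pg_str_to_int("-j", "1211efe"));		/* reports an error */
	return 0;
}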

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: pgbench test failing on 14beta1 on Debian/i386

2021-05-19 Thread Dean Rasheed
On Wed, 19 May 2021 at 11:32, Fabien COELHO  wrote:
>
> >> Or, (3) remove this test?  I am not quite sure what there is to gain
> >> with this extra test considering all the other tests with permute()
> >> already present in this script.
> >
> > Yes, I think removing the test is the best option. It was originally
> > added because there was a separate code path for larger permutation
> > sizes that needed testing, but that's no longer the case so the test
> > really isn't adding anything.
>
> Hmmm…
>
> It is the one test which worked in actually detecting an issue, so I would
> not say that it is not adding anything, on the contrary, it did prove its
> value! The permute function is expected to be deterministic on different
> platforms and architectures, and it is not.
>

In fact what it demonstrates is that the results from permute(), like
all the other pgbench random functions, will vary by platform for
sufficiently large size parameters.

> I'd agree with a two phases approach: drop the test in the short term and
> deal with the PRNG later. I'm so unhappy with this 48 bit PRNG that I
> may be motivated enough to attempt to replace it, or at least add a better
> (faster?? larger state?? same/better quality?) alternative.
>

I don't necessarily have a problem with that provided the replacement
is well-chosen and has a proven track record (i.e., let's not invent
our own PRNG).

For now though, I'll go remove the test.

Regards,
Dean




Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-05-19 Thread Bharath Rupireddy
On Wed, May 19, 2021 at 4:10 PM Amit Kapila  wrote:
>
> On Wed, May 19, 2021 at 3:08 PM Bharath Rupireddy
>  wrote:
> >
> > On Wed, May 19, 2021 at 2:33 PM Amul Sul  wrote:
> > >
> > > On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > parse_subscription_options function has some similar code when
> > > > throwing errors [with the only difference in the option]. I feel we
> > > > could just use a variable for the option and use it in the error.
>
> I am not sure how much it helps to just refactor this part of the code
> alone unless we need to add/change it more. Having said that, this
> function is being modified by one of the proposed patches for logical
> decoding of 2PC and I noticed that the proposed patch is adding more
> parameters to this function which already takes 14 input parameters,
> so I suggested refactoring it. See comment 11 in email[1]. See, if
> that makes sense to you then we can refactor this function such that
> it can be enhanced easily by future patches.

Thanks Amit for the comments. I agree with moving the parse options into a
new structure ParseSubOptions as suggested. Then the function can just
be parse_subscription_options(ParseSubOptions opts). I wonder whether we
should also have a structure for parse_publication_options, as we might
add new options there in the future?

If okay, I can work on these changes and attach it along with these
error message changes. Thoughts?
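
Just to illustrate what I have in mind, a very rough sketch (field names
are illustrative only, not final; the raw options list could also live
inside the struct):

typedef struct ParseSubOptions
{
	bool		connect;
	bool		enabled_given;
	bool		enabled;
	bool		create_slot_given;
	bool		create_slot;
	bool		slot_name_given;
	char	   *slot_name;
	bool		copy_data_given;
	bool		copy_data;
	char	   *synchronous_commit;
	bool		refresh;
	bool		binary_given;
	bool		binary;
	bool		streaming_given;
	bool		streaming;
} ParseSubOptions;

/* The fourteen-plus output parameters then collapse into one argument: */
extern void parse_subscription_options(List *options, ParseSubOptions *opts);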

> > > > While this has no benefit at all, it saves some LOC and makes the code
> > > > look better with lesser ereport(ERROR statements. PSA patch.
> > > >
> > > > Thoughts?
> > >
> > > I don't have a strong opinion on this, but the patch should add
> > > __translator__ help comment for the error msg.
> >
> > Is the "/*- translator:" help comment something visible to the user or
> > some other tool?
> >
>
> We use similar comments at other places. So, it makes sense to retain
> the comment as it might help translation tools.

I will retain it.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: pgbench test failing on 14beta1 on Debian/i386

2021-05-19 Thread Fabien COELHO


Hello Dean,


Or, (3) remove this test?  I am not quite sure what there is to gain
with this extra test considering all the other tests with permute()
already present in this script.


Yes, I think removing the test is the best option. It was originally
added because there was a separate code path for larger permutation
sizes that needed testing, but that's no longer the case so the test
really isn't adding anything.


Hmmm…

It is the one test which worked in actually detecting an issue, so I would 
not say that it is not adding anything, on the contrary, it did prove its 
value! The permute function is expected to be deterministic on different 
platforms and architectures, and it is not.


I agree that removing the test will hide the issue effectively:-) but ISTM 
more appropriate to solve the underlying issue and keep the test.


I'd agree with a two phases approach: drop the test in the short term and 
deal with the PRNG later. I'm so unhappy with this 48 bit PRNG that I may 
be motivated enough to attempt to replace it, or at least add a better 
(faster?? larger state?? same/better quality?) alternative.


The attached patch deactivates the test, with comments to outline that there 
is an issue to fix… so it is *not* removed.


I'm obviously okay with providing an alternate PRNG; let me know if this 
is the preferred option.


--
Fabien.
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2eaf9ab4c2..db65988f76 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -645,13 +645,15 @@ SELECT :v0, :v1, :v2, :v3;
  permute(0, 2, 5435) = 1 and permute(1, 2, 5435) = 0)
 -- 63 bits tests
 \set size debug(:max - 10)
-\set t debug(permute(:size-1, :size, 5432) = 5301702756001087507 and \
- permute(:size-2, :size, 5432) = 8968485976055840695 and \
- permute(:size-3, :size, 5432) = 6708495591295582115 and \
- permute(:size-4, :size, 5432) = 2801794404574855121 and \
- permute(:size-5, :size, 5432) = 1489011409218895840 and \
- permute(:size-6, :size, 5432) = 2267749475878240183 and \
- permute(:size-7, :size, 5432) = 1300324176838786780)
+-- FIXME non deterministic issue on i386 due to floating point different behavior
+\set t debug(true)
+-- \set t debug(permute(:size-1, :size, 5432) = 5301702756001087507 and \
+--  permute(:size-2, :size, 5432) = 8968485976055840695 and \
+--  permute(:size-3, :size, 5432) = 6708495591295582115 and \
+--  permute(:size-4, :size, 5432) = 2801794404574855121 and \
+--  permute(:size-5, :size, 5432) = 1489011409218895840 and \
+--  permute(:size-6, :size, 5432) = 2267749475878240183 and \
+--  permute(:size-7, :size, 5432) = 1300324176838786780)
 }
 	});
 


Re: Inaccurate error message when set fdw batch_size to 0

2021-05-19 Thread Bharath Rupireddy
On Mon, May 17, 2021 at 4:23 PM Amit Kapila  wrote:
>
> On Tue, May 11, 2021 at 11:28 AM Bharath Rupireddy
>  wrote:
> >
> > On Mon, May 10, 2021 at 7:39 PM Tom Lane  wrote:
> > >
> > > Bharath Rupireddy  writes:
> > > > On Mon, May 10, 2021 at 12:00 PM Tom Lane  wrote:
> > > >> Yeah, this error message seems outright buggy.  However, it's a minor
> > > >> matter.  Also, some people think "positive" is the same thing as
> > > >> "non-negative", so maybe we need less ambiguous wording?
> > >
> > > > Since value 0 can't be considered as either a positive or negative
> > > > integer, I think we can do as following(roughly):
> > >
> > > > if (value < 0) "requires a zero or positive integer value"
> > > > if (value <= 0) "requires a positive integer value"
> > >
> > > I was thinking of avoiding the passive voice and writing
> > >
> > > "foo must be greater than zero"
> >
> > +1 for "foo must be greater than zero" if (foo <= 0) kind of errors.
> > But, we also have some values for which zero is accepted, see below
> > error messages. How about the error message "foo must be greater than
> > or equal to zero"?
> >
>
> +1 for your proposed message for the cases where we have a check if
> (foo < 0). Tom, Michael, do you see any problem with the proposed
> message? We would like to make a similar change at another place [1]
> so wanted to be consistent.
>
> [1] - 
> https://www.postgresql.org/message-id/CALj2ACWGB9oHCR5ygkc8u6_QDqecObf9j2MxtOgsjZMMKsLj%3DQ%40mail.gmail.com

Thanks all for your inputs. PSA v2 patch that uses the new convention.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v2-0001-Disambiguate-error-messages-that-use-non-negative.patch
Description: Binary data


pg_rewind fails if there is a read only file.

2021-05-19 Thread Paul Guo
Several weeks ago I saw this issue in a production environment. The
read-only file looked like a credential file. Michael told me that
such files are usually better kept in non-pgdata directories in
production environments. Thinking about it further, it seems that
pg_rewind should be more user-friendly and tolerate such scenarios.

The failure is "Permission denied" from open(). The reason is that
open() fails with the below mode in open_target_file():

  mode = O_WRONLY | O_CREAT | PG_BINARY;
  if (trunc)
  mode |= O_TRUNC;
  dstfd = open(dstpath, mode, pg_file_create_mode);

The fix should be quite simple: if open() fails with "Permission denied",
we call chmod() to add S_IWUSR just before retrying open(), and call
chmod() again to restore the flags soon after open(). A stat() call is
needed to get the previous st_mode flags.

Any other suggestions or thoughts?
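
A rough sketch of the idea in open_target_file() (error handling trimmed,
not the final patch):

	dstfd = open(dstpath, mode, pg_file_create_mode);
	if (dstfd < 0 && errno == EACCES)
	{
		struct stat st;

		if (stat(dstpath, &st) == 0)
		{
			/* temporarily grant owner write permission and retry */
			chmod(dstpath, st.st_mode | S_IWUSR);
			dstfd = open(dstpath, mode, pg_file_create_mode);
			/* restore the original permission bits */
			chmod(dstpath, st.st_mode);
		}
	}
	if (dstfd < 0)
		pg_fatal("could not open target file \"%s\": %m", dstpath);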

Thanks,

-- 
Paul Guo (Vmware)




Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-05-19 Thread Amit Kapila
On Wed, May 19, 2021 at 3:08 PM Bharath Rupireddy
 wrote:
>
> On Wed, May 19, 2021 at 2:33 PM Amul Sul  wrote:
> >
> > On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy
> >  wrote:
> > >
> > > Hi,
> > >
> > > parse_subscription_options function has some similar code when
> > > throwing errors [with the only difference in the option]. I feel we
> > > could just use a variable for the option and use it in the error.

I am not sure how much it helps to just refactor this part of the code
alone unless we need to add/change it more. Having said that, this
function is being modified by one of the proposed patches for logical
decoding of 2PC and I noticed that the proposed patch is adding more
parameters to this function which already takes 14 input parameters,
so I suggested refactoring it. See comment 11 in email[1]. See, if
that makes sense to you then we can refactor this function such that
it can be enhanced easily by future patches.

> > > While this has no benefit at all, it saves some LOC and makes the code
> > > look better with lesser ereport(ERROR statements. PSA patch.
> > >
> > > Thoughts?
> >
> > I don't have a strong opinion on this, but the patch should add
> > __translator__ help comment for the error msg.
>
> Is the "/*- translator:" help comment something visible to the user or
> some other tool?
>

We use similar comments at other places. So, it makes sense to retain
the comment as it might help translation tools.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1Jz64rwLyB6H7Z_SmEDouJ41KN42%3DVkVFp6JTpafJFG8Q%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




RE: Forget close an open relation in ReorderBufferProcessTXN()

2021-05-19 Thread osumi.takami...@fujitsu.com
On Wednesday, May 19, 2021 1:52 PM Amit Kapila  wrote:
> > > I am not sure but I
> > > > think we should prohibit truncate on user_catalog_tables as we
> > > > prohibit truncate on system catalog tables (see below [1]) if we
> > > > want plugin to lock them, otherwise, as you said it might lead to
> deadlock.
> > > > For the matter, I think we should once check all other operations
> > > > where we can take an exclusive lock on [user]_catalog_table, say
> > > > Cluster command, and compare the behavior of same on system
> > > > catalog tables.
> > > >
> > > > [1]
> > > > postgres=# truncate pg_class;
> > > > ERROR:  permission denied: "pg_class" is a system catalog
> > > > postgres=# cluster pg_class;
> > > > ERROR:  there is no previously clustered index for table "pg_class"
> > > >
> > >
> > > Please ignore the cluster command as we need to use 'using index'
> > > with that command to make it successful. I just want to show the
> > > truncate command behavior for which you have asked the question.
> > Thank you so much for clarifying the direction.
> > I agree with changing the TRUNCATE side.
> > I'll make a patch based on this.
> >
> 
> Isn't it a better idea to start a new thread where you can summarize whatever
> we have discussed here about user_catalog_tables? We might get the opinion
> from others about the behavior change you are proposing.
You are right. So, I've launched it with the patch for this.


Best Regards,
Takamichi Osumi



Re: pgbench test failing on 14beta1 on Debian/i386

2021-05-19 Thread Fabien COELHO




Confirmed, thanks for looking. I can reproduce it on my machine with
-m32. It's somewhat annoying that the buildfarm didn't pick it up
sooner :-(

On Wed, 19 May 2021 at 08:28, Michael Paquier  wrote:


On Wed, May 19, 2021 at 09:06:16AM +0200, Fabien COELHO wrote:

I see two simple approaches:

(1) use another PRNG inside pgbench, eg Knuth's which was used in some
previous submission and is very simple and IMHO better than the rand48
stuff.

(2) extend pg_*rand48() to provide an unsigned 64 bits out of the 48 bits
state.


Or, (3) remove this test?  I am not quite sure what there is to gain
with this extra test considering all the other tests with permute()
already present in this script.


Yes, I think removing the test is the best option. It was originally
added because there was a separate code path for larger permutation
sizes that needed testing, but that's no longer the case so the test
really isn't adding anything.


Hmmm…

It is the one test which worked in actually detecting an issue, so I would 
not say that it is not adding anything, on the contrary, it did prove its 
value! The permute function is expected to be deterministic on different 
platforms and architectures, and it is not.


I agree that removing the test will hide the issue effectively:-) but 
ISTM more appropriate to solve the underlying issue and keep the test.


I'd agree with a two phases approach: drop the test in the short term and 
deal with the PRNG later. I'm so unhappy with this 48 bit PRNG that I 
may be motivated enough to attempt to replace it, or at least add a better 
(faster?? larger state?? same/better quality?) alternative.


--
Fabien.

Deadlock concern caused by TRUNCATE on user_catalog_table in synchronous mode

2021-05-19 Thread osumi.takami...@fujitsu.com
Hi


I've been discussing user_catalog_table
and the possibility of deadlock in synchronous mode
of logical replication in [1]. I'm launching this new thread
and summarizing the contents so that anyone who is
interested in this topic can join the discussion.

We don't have any example of user_catalog_tables
in the core code, so any responses and ideas related to this area are helpful.

Currently, we don't disallow an output plugin from taking a lock
on a user_catalog_table. So we can consider a deadlock scenario like the one below.

1. A TRUNCATE command is performed on a user_catalog_table.
2. The TRUNCATE command locks the table and its indexes with ACCESS EXCLUSIVE LOCK.
3. The TRUNCATE waits for the subscriber's synchronization
   when synchronous_standby_names is set.
4. Here, the walsender hangs *if* it tries to acquire a lock on the
   user_catalog_table, because the table it wants to look at is already
   locked by the TRUNCATE.

(Here, we don't talk about LOCK command because
the discussion is in progress in another thread independently - [2])

Another important point here is that, in general, we can *not*
know how and when a plugin does read-only access to a user_catalog_table,
because it depends on the purpose of the plugin.
Therefore, I'm thinking that changing the behavior on the TRUNCATE side
to error out when TRUNCATE is performed on a user_catalog_table
will make the concern disappear. Kindly have a look at the attached
patch.
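
The idea is roughly the following check during TRUNCATE's relation checks
(a simplified sketch, not the attached patch verbatim), reusing the existing
RelationIsUsedAsCatalogTable() macro:

	/* Reject TRUNCATE on tables marked with user_catalog_table = true,
	 * mirroring the existing restriction on system catalogs. */
	if (RelationIsUsedAsCatalogTable(rel))
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("cannot truncate user catalog table \"%s\"",
						RelationGetRelationName(rel))));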

[1] - 
https://www.postgresql.org/message-id/MEYP282MB166933B1AB02B4FE56E82453B64D9%40MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM

[2] - 
https://www.postgresql.org/message-id/CALDaNm1UB==gl9poad4etjfcygdjbphwezezocodnbd--kj...@mail.gmail.com


Best Regards,
Takamichi Osumi



disallow_TRUNCATE_on_user_catalog_table_v01.patch
Description: disallow_TRUNCATE_on_user_catalog_table_v01.patch


Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.

2021-05-19 Thread Fujii Masao



On 2021/05/19 16:43, Kyotaro Horiguchi wrote:

+1 for adding some tests for pg_wal_replay_pause() but the test seems
like checking only that pg_get_wal_replay_pause_state() returns the
expected state value.  Don't we need to check that the recovery is
actually paused and that the promotion happens at expected LSN?


Sounds good. Attached is the updated version of the patch.
I added such checks into the test.

BTW, while reading some recovery regression tests, I found that
013_crash_restart.pl has "use Time::HiRes qw(usleep)" but it seems
unnecessary. Can we safely remove that? Patch attached.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/test/recovery/t/005_replay_delay.pl 
b/src/test/recovery/t/005_replay_delay.pl
index 7f177afaed..496fa40fe1 100644
--- a/src/test/recovery/t/005_replay_delay.pl
+++ b/src/test/recovery/t/005_replay_delay.pl
@@ -1,13 +1,13 @@
 
 # Copyright (c) 2021, PostgreSQL Global Development Group
 
-# Checks for recovery_min_apply_delay
+# Checks for recovery_min_apply_delay and recovery pause
 use strict;
 use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 1;
+use Test::More tests => 3;
 
 # Initialize primary node
 my $node_primary = get_new_node('primary');
@@ -55,3 +55,58 @@ $node_standby->poll_query_until('postgres',
 # the configured apply delay.
 ok(time() - $primary_insert_time >= $delay,
"standby applies WAL only after replication delay");
+
+
+# Check that recovery can be paused or resumed expectedly.
+my $node_standby2 = get_new_node('standby2');
+$node_standby2->init_from_backup($node_primary, $backup_name,
+   has_streaming => 1);
+$node_standby2->start;
+
+# Recovery is not yet paused.
+is($node_standby2->safe_psql('postgres',
+   "SELECT pg_get_wal_replay_pause_state()"),
+   'not paused', 'pg_get_wal_replay_pause_state() reports not paused');
+
+# Request to pause recovery and wait until it's actually paused.
+$node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()");
+$node_primary->safe_psql('postgres',
+   "INSERT INTO tab_int VALUES (generate_series(21,30))");
+$node_standby2->poll_query_until('postgres',
+   "SELECT pg_get_wal_replay_pause_state() = 'paused'")
+   or die "Timed out while waiting for recovery to be paused";
+
+# Even if new WAL records are streamed from the primary,
+# recovery in the paused state doesn't replay them.
+my $receive_lsn = $node_standby2->safe_psql('postgres',
+   "SELECT pg_last_wal_receive_lsn()");
+my $replay_lsn = $node_standby2->safe_psql('postgres',
+   "SELECT pg_last_wal_replay_lsn()");
+$node_primary->safe_psql('postgres',
+   "INSERT INTO tab_int VALUES (generate_series(31,40))");
+$node_standby2->poll_query_until('postgres',
+   "SELECT '$receive_lsn'::pg_lsn < pg_last_wal_receive_lsn()")
+   or die "Timed out while waiting for new WAL to be streamed";
+is($node_standby2->safe_psql('postgres',
+   "SELECT pg_last_wal_replay_lsn()"),
+   qq($replay_lsn), 'no WAL is replayed in the paused state');
+
+# Request to resume recovery and wait until it's actually resumed.
+$node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_resume()");
+$node_standby2->poll_query_until('postgres',
+   "SELECT pg_get_wal_replay_pause_state() = 'not paused' AND 
pg_last_wal_replay_lsn() > '$replay_lsn'::pg_lsn")
+   or die "Timed out while waiting for recovery to be resumed";
+
+# Check that the paused state ends and promotion continues if a promotion
+# is triggered while recovery is paused.
+$node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()");
+$node_primary->safe_psql('postgres',
+   "INSERT INTO tab_int VALUES (generate_series(41,50))");
+$node_standby2->poll_query_until('postgres',
+   "SELECT pg_get_wal_replay_pause_state() = 'paused'")
+  or die "Timed out while waiting for recovery to be paused";
+
+$node_standby2->promote;
+$node_standby2->poll_query_until('postgres',
+   "SELECT NOT pg_is_in_recovery()")
+  or die "Timed out while waiting for promotion to finish";
diff --git a/src/test/recovery/t/013_crash_restart.pl 
b/src/test/recovery/t/013_crash_restart.pl
index 66e43ffbe8..e1c36abe97 100644
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -17,7 +17,6 @@ use PostgresNode;
 use TestLib;
 use Test::More;
 use Config;
-use Time::HiRes qw(usleep);
 
 plan tests => 18;
 


Re: libpq_pipeline in tmp_install

2021-05-19 Thread Peter Eisentraut

On 10.05.21 20:26, Peter Eisentraut wrote:
The test program libpq_pipeline produced by the test suite in 
src/test/modules/libpq_pipeline/ is installed into tmp_install as part 
of make check.  This isn't a real problem by itself, but I think it 
creates a bit of an asymmetric situation that might be worth cleaning up.


Before, the contents of tmp_install exactly matched an actual 
installation.  There were no extra test programs installed.


Also, the test suite code doesn't actually use that installed version, 
so it's not of any use, and it creates confusion about which copy is in 
use.


The reason this is there is that the test suite uses PGXS to build the 
test program, and so things get installed automatically.  I suggest that 
we should either write out the build system by hand to avoid this, or 
maybe extend PGXS to support building programs but not installing them. 


Here is a patch that implements the second solution, which turned out to 
be very easy.
From 0c2f11034d66c072ffd2141bc5724c211ffb6ae8 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 19 May 2021 08:12:15 +0200
Subject: [PATCH] Add NO_INSTALL option to pgxs

Apply in libpq_pipeline test makefile, so that the test file is not
installed into tmp_install.

Discussion: 
https://www.postgresql.org/message-id/flat/cb9d16a6-760f-cd44-28d6-b091d5fb6ca7%40enterprisedb.com
---
 src/makefiles/pgxs.mk| 4 
 src/test/modules/libpq_pipeline/Makefile | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/src/makefiles/pgxs.mk b/src/makefiles/pgxs.mk
index 271e7eaba8..95d0441e9b 100644
--- a/src/makefiles/pgxs.mk
+++ b/src/makefiles/pgxs.mk
@@ -227,6 +227,8 @@ all: all-lib
 endif # MODULE_big
 
 
+ifndef NO_INSTALL
+
 install: all installdirs
 ifneq (,$(EXTENSION))
$(INSTALL_DATA) $(addprefix $(srcdir)/, $(addsuffix .control, 
$(EXTENSION))) '$(DESTDIR)$(datadir)/extension/'
@@ -336,6 +338,8 @@ endif # with_llvm
 uninstall: uninstall-lib
 endif # MODULE_big
 
+endif # NO_INSTALL
+
 
 clean:
 ifdef MODULES
diff --git a/src/test/modules/libpq_pipeline/Makefile 
b/src/test/modules/libpq_pipeline/Makefile
index b798f5fbbc..c9c5ae1beb 100644
--- a/src/test/modules/libpq_pipeline/Makefile
+++ b/src/test/modules/libpq_pipeline/Makefile
@@ -3,6 +3,8 @@
 PROGRAM = libpq_pipeline
 OBJS = libpq_pipeline.o
 
+NO_INSTALL = 1
+
 PG_CPPFLAGS = -I$(libpq_srcdir)
 PG_LIBS_INTERNAL += $(libpq_pgport)
 
-- 
2.31.1



Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-05-19 Thread Bharath Rupireddy
On Wed, May 19, 2021 at 2:33 PM Amul Sul  wrote:
>
> On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy
>  wrote:
> >
> > Hi,
> >
> > parse_subscription_options function has some similar code when
> > throwing errors [with the only difference in the option]. I feel we
> > could just use a variable for the option and use it in the error.
> > While this has no benefit at all, it saves some LOC and makes the code
> > look better with lesser ereport(ERROR statements. PSA patch.
> >
> > Thoughts?
>
> I don't have a strong opinion on this, but the patch should add
> __translator__ help comment for the error msg.

Is the "/*- translator:" help comment something visible to the user or
some other tool? If not, I don't think that's necessary as the meaning
of the error message is evident by looking at the error message
itself. IMO, anyone who looks at that part of the code can understand
it.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Different compression methods for FPI

2021-05-19 Thread Michael Paquier
On Tue, May 18, 2021 at 10:06:18PM -0500, Justin Pryzby wrote:
> On Mon, May 17, 2021 at 04:44:11PM +0900, Michael Paquier wrote:
>> On Sun, Mar 21, 2021 at 02:30:04PM -0500, Justin Pryzby wrote:
>> 
>> For this patch, this is going to require a bit more in terms of library
>> linking as the block decompression is done in xlogreader.c, so that's one
>> thing to worry about.
> 
> I'm not sure what you mean here ?

I was wondering if anything else was needed in terms of linking here.

>> Don't think you need a hook here, but zlib, or any other method which
>> is not supported by the build, had better not be listed instead.  This
>> ensures that the value cannot be assigned if the binaries don't
>> support that.
> 
> I think you're confusing the walmethods struct (which is unconditional) with
> wal_compression_options, which is conditional.

Indeed I was.  For the note, walmethods is not necessary either as a
new structure.  You could just store a list of strings in xlogreader.c
and make a note to keep the entries in a given order.  That's simpler
as well.

>> The patch set is a gathering of various things, and not only things
>> associated to the compression method used for FPIs.
>> What is the point of that in patch 0002?
>>> Subject: [PATCH 03/12] Make sure published XIDs are persistent
>> Patch 0003 looks unrelated to this thread.
> 
> ..for the reason that I gave:
> 
> | And 2ndary patches from another thread to allow passing recovery tests.
> |These two patches are a prerequisite for this patch to progress:
> | * Run 011_crash_recovery.pl with wal_level=minimal
> | * Make sure published XIDs are persistent

I still don't understand why XID consistency has anything to do with
the compression of FPIs.  There is nothing preventing the testing of
compression of FPIs, and please note this argument:
https://www.postgresql.org/message-id/bef3b1e0-0b31-4f05-8e0a-f681cb918...@yandex-team.ru

For example, I can just revert from my tree 0002 and 0003, and still
perform tests of the various compression methods.  I do agree that we
are going to need to do something about this problem, but let's drop
this stuff from the set of patches of this thread and just discuss
them where they are needed.


>> +   {
>> +   {"wal_compression_method", PGC_SIGHUP, WAL_SETTINGS,
>> +   gettext_noop("Set the method used to compress full page images in the WAL."),
>> +   NULL
>> +   },
>> +   &wal_compression_method,
>> +   WAL_COMPRESSION_PGLZ, wal_compression_options,
>> +   NULL, NULL, NULL
>> +   },
>> The interface is not quite right to me.  I think that we should just
>> change wal_compression to become an enum, with extra values for pglz
>> and the new method.  "on" would be a synonym for "pglz".
> 
> Andrey gave a reason in March:
> 
> | I hope one day we will compress all WAL, not just FPIs. Advanced archive 
> management tools already do so, why not compress it in walwriter?
> | When this will be implemented, we could have wal_compression = {off, fpi, 
> all}.
>
> [...]

> I don't see why we'd add a guc for configuration compression but not include
> the 30 lines of code needed to support a 3rd method that we already used by 
> the
> core server.

Because that makes things confusing.  We have no idea if we'll ever
reach a point or even if it makes sense to have compression applied to
multiple parts of WAL.  So, for now, let's just use one single GUC and
be done with it.  Your argument is not tied to what's proposed on this
thread either, and could go the other way around.  If we were to
invent more compression concepts in WAL records, we could as well just
go with a new GUC that lists the parts of the WAL where compression
needs to be applied.  I'd say to keep it to a minimum for now, that's
an interface less confusing than what's proposed here.
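
To be clear about the shape I am thinking of, a rough sketch of the enum
entries (the symbols are illustrative, reusing the patch's naming):

static const struct config_enum_entry wal_compression_options[] = {
	{"off", WAL_COMPRESSION_NONE, false},
	{"pglz", WAL_COMPRESSION_PGLZ, false},
	{"lz4", WAL_COMPRESSION_LZ4, false},
	{"on", WAL_COMPRESSION_PGLZ, false},	/* backward compatibility */
	{"true", WAL_COMPRESSION_PGLZ, true},
	{"false", WAL_COMPRESSION_NONE, true},
	{NULL, 0, false}
};

That keeps a single user-visible knob while still letting us add methods
later.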

And you have not replaced BKPIMAGE_IS_COMPRESSED by a PGLZ-equivalent,
so your patch set is eating more bits for BKPIMAGE_* than it needs
to.

By the way, it would be really useful for the user if pg_waldump -b
printed the type of compression used :)

A last point, and I think that this should be part of the study of the
choice to be made for an extra compression method: there is no discussion
yet about the compression level applied, which is something that can be
tuned for zstd, lz4 or even zlib.  Perhaps there is an argument for a
different GUC controlling that, so more benchmarking is a first step
needed for this thread to move on.  Benchmarking can happen with what's
currently posted, of course.
--
Michael


signature.asc
Description: PGP signature


Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-05-19 Thread Amul Sul
On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy
 wrote:
>
> Hi,
>
> parse_subscription_options function has some similar code when
> throwing errors [with the only difference in the option]. I feel we
> could just use a variable for the option and use it in the error.
> While this has no benefit at all, it saves some LOC and makes the code
> look better with lesser ereport(ERROR statements. PSA patch.
>
> Thoughts?

I don't have a strong opinion on this, but the patch should add
__translator__ help comment for the error msg.

Regards,
Amul




Refactor "mutually exclusive options" error reporting code in parse_subscription_options

2021-05-19 Thread Bharath Rupireddy
Hi,

parse_subscription_options function has some similar code when
throwing errors [with the only difference in the option]. I feel we
could just use a variable for the option and use it in the error.
While this has no benefit at all, it saves some LOC and makes the code
look better with lesser ereport(ERROR statements. PSA patch.

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From 6e4824653d2849631cba3cfe1fe562e1b2823c4a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Sat, 1 May 2021 20:26:30 +0530
Subject: [PATCH v1] Refactoring of error code in parse_subscription_options

---
 src/backend/commands/subscriptioncmds.c | 50 -
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 517c8edd3b..c4c9b4bd8c 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -227,25 +227,21 @@ parse_subscription_options(List *options,
 	 */
 	if (connect && !*connect)
 	{
+		char *option = NULL;
+
 		/* Check for incompatible options from the user. */
 		if (enabled && *enabled_given && *enabled)
-			ereport(ERROR,
-	(errcode(ERRCODE_SYNTAX_ERROR),
-			/*- translator: both %s are strings of the form "option = value" */
-	 errmsg("%s and %s are mutually exclusive options",
-			"connect = false", "enabled = true")));
-
-		if (create_slot && create_slot_given && *create_slot)
-			ereport(ERROR,
-	(errcode(ERRCODE_SYNTAX_ERROR),
-	 errmsg("%s and %s are mutually exclusive options",
-			"connect = false", "create_slot = true")));
+			option = "enabled = true";
+		else if (create_slot && create_slot_given && *create_slot)
+			option = "create_slot = true";
+		else if (copy_data && copy_data_given && *copy_data)
+			option = "copy_data = true";
 
-		if (copy_data && copy_data_given && *copy_data)
+		if (option)
 			ereport(ERROR,
 	(errcode(ERRCODE_SYNTAX_ERROR),
 	 errmsg("%s and %s are mutually exclusive options",
-			"connect = false", "copy_data = true")));
+			"connect = false", option)));
 
 		/* Change the defaults of other options. */
 		*enabled = false;
@@ -259,31 +255,31 @@ parse_subscription_options(List *options,
 	 */
 	if (slot_name && *slot_name_given && !*slot_name)
 	{
+		char *option = NULL;
+
 		if (enabled && *enabled_given && *enabled)
-			ereport(ERROR,
-	(errcode(ERRCODE_SYNTAX_ERROR),
-			/*- translator: both %s are strings of the form "option = value" */
-	 errmsg("%s and %s are mutually exclusive options",
-			"slot_name = NONE", "enabled = true")));
+			option = "enabled = true";
+		else if (create_slot && create_slot_given && *create_slot)
+			option = "create_slot = true";
 
-		if (create_slot && create_slot_given && *create_slot)
+		if (option)
 			ereport(ERROR,
 	(errcode(ERRCODE_SYNTAX_ERROR),
 	 errmsg("%s and %s are mutually exclusive options",
-			"slot_name = NONE", "create_slot = true")));
+			"slot_name = NONE", option)));
+
+		option = NULL;
 
 		if (enabled && !*enabled_given && *enabled)
-			ereport(ERROR,
-	(errcode(ERRCODE_SYNTAX_ERROR),
-			/*- translator: both %s are strings of the form "option = value" */
-	 errmsg("subscription with %s must also set %s",
-			"slot_name = NONE", "enabled = false")));
+			option = "enabled = false";
+		else if (create_slot && !create_slot_given && *create_slot)
+			option = "create_slot = false";
 
-		if (create_slot && !create_slot_given && *create_slot)
+		if (option)
 			ereport(ERROR,
 	(errcode(ERRCODE_SYNTAX_ERROR),
 	 errmsg("subscription with %s must also set %s",
-			"slot_name = NONE", "create_slot = false")));
+			"slot_name = NONE", option)));
 	}
 }
 
-- 
2.25.1



Re: Added missing tab completion for alter subscription set option

2021-05-19 Thread Bharath Rupireddy
On Tue, May 18, 2021 at 9:21 PM Alvaro Herrera  wrote:
>
> On 2021-May-14, vignesh C wrote:
>
> > While I was reviewing one of the logical decoding features, I found
> > Streaming and binary options were missing in tab completion for the
> > alter subscription set option, the attached patch has the changes for
> > the same.
> > Thoughts?
>
> I wish we didn't have to keep knowledge in the psql source on which
> option names are to be used for each command.  If we had some function
>  SELECT pg_completion_options('alter subscription set');
> that returned the list of options usable for each command, we wouldn't
> have to ... psql would just retrieve the list of options for the current
> command.
>
> Maintaining such a list does not seem hard -- for example we could just
> have a function alongside parse_subscription_option() that returns the
> names that are recognized by that one.  If we drive the implementation
> of both off a single struct, it would never be outdated.

Yeah, having something similar to table_storage_parameters works better.

While on this, I found that not all the options are listed for the CREATE
SUBSCRIPTION command in tab-complete.c; the missing ones are binary and
streaming:
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("copy_data", "connect", "create_slot", "enabled",
  "slot_name", "synchronous_commit");

Similarly, CREATE and ALTER PUBLICATION don't have the
publish_via_partition_root option:
else if (HeadMatches("CREATE", "PUBLICATION") && TailMatches("WITH", "("))
COMPLETE_WITH("publish");

I think having some structures like below in subscriptioncmds.h,
publicationcmds.h and using them in tab-complete.c would make more
sense.

static const char *const create_subscription_params[] = {
"copy_data",
"create_slot",
"enabled",
"slot_name",
"synchronous_commit",
"binary",
"connect",
"streaming",
NULL
};

static const char *const alter_subscription_set_params[] = {
"binary",
"slot_name",
"streaming",
"synchronous_commit",
NULL
};

static const char *const create_or_alter_publication_params[] = {
"publish",
"publish_via_partition_root",
NULL
};

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: pgbench test failing on 14beta1 on Debian/i386

2021-05-19 Thread Dean Rasheed
On Wed, 19 May 2021 at 00:35, Thomas Munro  wrote:
>
> FWIW this is reproducible on my local Debian/gcc box with -m32,

Confirmed, thanks for looking. I can reproduce it on my machine with
-m32. It's somewhat annoying that the buildfarm didn't pick it up
sooner :-(

On Wed, 19 May 2021 at 08:28, Michael Paquier  wrote:
>
> On Wed, May 19, 2021 at 09:06:16AM +0200, Fabien COELHO wrote:
> > I see two simple approaches:
> >
> > (1) use another PRNG inside pgbench, eg Knuth's which was used in some
> > previous submission and is very simple and IMHO better than the rand48
> > stuff.
> >
> > (2) extend pg_*rand48() to provide an unsigned 64 bits out of the 48 bits
> > state.
>
> Or, (3) remove this test?  I am not quite sure what there is to gain
> with this extra test considering all the other tests with permute()
> already present in this script.

Yes, I think removing the test is the best option. It was originally
added because there was a separate code path for larger permutation
sizes that needed testing, but that's no longer the case so the test
really isn't adding anything.

Regards,
Dean




Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.

2021-05-19 Thread Dilip Kumar
On Wed, May 19, 2021 at 11:55 AM Kyotaro Horiguchi
 wrote:
>
> At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar  wrote 
> in
> > On Wed, May 19, 2021 at 10:16 AM Fujii Masao
> >  wrote:
> > >
> > > On 2021/05/18 15:46, Michael Paquier wrote:
> > > > On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote:
> > > >> Currently a promotion causes all available WAL to be replayed before
> > > >> a standby becomes a primary whether it was in paused state or not.
> > > >> OTOH, something like immediate promotion (i.e., standby becomes
> > > >> a primary without replaying outstanding WAL) might be useful for
> > > >> some cases. I don't object to that.
> > > >
> > > > Sounds like a "promotion immediate" mode.  It does not sound difficult
> > > > nor expensive to add a small test for that in one of the existing
> > > > recovery tests triggerring a promotion.  Could you add one based on
> > > > pg_get_wal_replay_pause_state()?
> > >
> > > You're thinking to add the test like the following?
> > > #1. Pause the recovery
> > > #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused'
> > > #3. Trigger standby promotion
> > > #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused'
> > >
> > > It seems not easy to do the test #4 stably because
> > > pg_get_wal_replay_pause_state() needs to be executed
> > > before the promotion finishes.
> >
> > Even for #2, we can not ensure that whether it will be 'paused' or
> > 'pause requested'.
>
> We often use poll_query_until() to make sure some desired state is
> reached.  And, as Michael suggested, the function
> pg_get_wal_replay_pause_state() still works at the time of
> recovery_end_command.  So a bit more detailed steps are:

Right, if we are polling for the state change in #2 then that makes sense.

> #0. Equip the server with recovery_end_command that waits for some
> trigger then start the server.
> #1. Pause the recovery
> #2. Wait until pg_get_wal_replay_pause_state() returns 'paused'
> #3. Trigger standby promotion
> #4. Wait until pg_get_wal_replay_pause_state() returns 'not paused'
> #5. Trigger recovery_end_command to let promotion proceed.

+1

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Bug fix for tab completion of ALTER TABLE ... VALIDATE CONSTRAINT ...

2021-05-19 Thread Michael Paquier
On Tue, Apr 27, 2021 at 12:58:52PM +0300, Aleksander Alekseev wrote:
> I've noticed there is no tab completion for ALTER TABLE xxx ADD. Here
> is an alternative version of the patch that fixes this as well. Not
> sure if this should be in the same commit though.

-   /* If we have ALTER TABLE  DROP, provide COLUMN or CONSTRAINT */
-   else if (Matches("ALTER", "TABLE", MatchAny, "DROP"))
+   /* If we have ALTER TABLE  ADD|DROP, provide COLUMN or CONSTRAINT */
+   else if (Matches("ALTER", "TABLE", MatchAny, "ADD|DROP"))
Seems to me that the behavior of not completing with COLUMN or
CONSTRAINT for ADD is intentional, as it is possible to specify a
constraint or column name without the object type first.  This
introduces an inconsistent behavior compared with what we do for columns
with ADD, for one.  So a more consistent approach would be to list columns,
constraints, COLUMN and CONSTRAINT in the list of options available
after ADD.
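
Perhaps something like the following (an illustrative sketch only; a
dedicated query would be needed to also offer constraint names):

	/* After ALTER TABLE <name> ADD, offer the table's columns plus the
	 * COLUMN and CONSTRAINT keywords.  Constraint names would need an
	 * extra UNION against pg_constraint, omitted here. */
	else if (Matches("ALTER", "TABLE", MatchAny, "ADD"))
		COMPLETE_WITH_ATTR(prev2_wd,
						   " UNION SELECT 'COLUMN'"
						   " UNION SELECT 'CONSTRAINT'");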

+   else if (Matches("ALTER", "TABLE", MatchAny, "VALIDATE", "CONSTRAINT"))
+   {
+   completion_info_charp = prev3_wd;
+   COMPLETE_WITH_QUERY(Query_for_nonvalid_constraint_of_table);
+   }
Specifying valid constraints is an authorized grammar, so it does not
seem that bad to keep things as they are, either.  I would leave that
alone.
--
Michael


signature.asc
Description: PGP signature


Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.

2021-05-19 Thread Kyotaro Horiguchi
At Wed, 19 May 2021 16:21:58 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2021/05/19 15:25, Kyotaro Horiguchi wrote:
> > At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar
> >  wrote in
> >> On Wed, May 19, 2021 at 10:16 AM Fujii Masao
> >>  wrote:
> >>>
> >>> On 2021/05/18 15:46, Michael Paquier wrote:
>  On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote:
> > Currently a promotion causes all available WAL to be replayed before
> > a standby becomes a primary whether it was in paused state or not.
> > OTOH, something like immediate promotion (i.e., standby becomes
> > a primary without replaying outstanding WAL) might be useful for
> > some cases. I don't object to that.
> 
>  Sounds like a "promotion immediate" mode.  It does not sound difficult
>  nor expensive to add a small test for that in one of the existing
>  recovery tests triggerring a promotion.  Could you add one based on
>  pg_get_wal_replay_pause_state()?
> >>>
> >>> You're thinking to add the test like the following?
> >>> #1. Pause the recovery
> >>> #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused'
> >>> #3. Trigger standby promotion
> >>> #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused'
> >>>
> >>> It seems not easy to do the test #4 stably because
> >>> pg_get_wal_replay_pause_state() needs to be executed
> >>> before the promotion finishes.
> >>
> >> Even for #2, we can not ensure that whether it will be 'paused' or
> >> 'pause requested'.
> > We often use poll_query_until() to make sure some desired state is
> > reached.
> 
> Yes.
> 
> >  And, as Michael suggested, the function
> > pg_get_wal_replay_pause_state() still works at the time of
> > recovery_end_command.  So a bit more detailed steps are:
> 
> IMO this idea is tricky and fragile, so I'm inclined to avoid that if

Agreed, the recovery_end_command would be something like the following
avoiding dependency on sh. However, I'm not sure it works as well on
Windows..

recovery_end_command='perl -e "while( -f \'$trigfile\') {sleep 0.1;}"'

> possible.
> Attached is the POC patch to add the following tests.
> 
> #1. Check that pg_get_wal_replay_pause_state() reports "not paused" at
> #first.
> #2. Request to pause archive recovery and wait until it's actually
> #paused.
> #3. Request to resume archive recovery and wait until it's actually
> #resumed.
> #4. Request to pause archive recovery and wait until it's actually
> #paused.
>Then, check that the paused state ends and promotion continues
>if a promotion is triggered while recovery is paused.
> 
> In #4, pg_get_wal_replay_pause_state() is not executed while promotion
> is ongoing. #4 checks that pg_is_in_recovery() returns false and
> the promotion finishes expectedly in that case. Isn't this test enough
> for now?

+1 for adding some tests for pg_wal_replay_pause() but the test seems
like checking only that pg_get_wal_replay_pause_state() returns the
expected state value.  Don't we need to check that the recovery is
actually paused and that the promotion happens at expected LSN?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pgbench test failing on 14beta1 on Debian/i386

2021-05-19 Thread Michael Paquier
On Wed, May 19, 2021 at 09:06:16AM +0200, Fabien COELHO wrote:
> I see two simple approaches:
> 
> (1) use another PRNG inside pgbench, eg Knuth's which was used in some
> previous submission and is very simple and IMHO better than the rand48
> stuff.
> 
> (2) extend pg_*rand48() to provide an unsigned 64 bits out of the 48 bits
> state.

Or, (3) remove this test?  I am not quite sure what there is to gain
with this extra test considering all the other tests with permute()
already present in this script.
--
Michael


signature.asc
Description: PGP signature


Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.

2021-05-19 Thread Fujii Masao



On 2021/05/19 15:25, Kyotaro Horiguchi wrote:

At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar  wrote in

On Wed, May 19, 2021 at 10:16 AM Fujii Masao
 wrote:


On 2021/05/18 15:46, Michael Paquier wrote:

On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote:

Currently a promotion causes all available WAL to be replayed before
a standby becomes a primary whether it was in paused state or not.
OTOH, something like immediate promotion (i.e., standby becomes
a primary without replaying outstanding WAL) might be useful for
some cases. I don't object to that.


Sounds like a "promotion immediate" mode.  It does not sound difficult
nor expensive to add a small test for that in one of the existing
recovery tests triggerring a promotion.  Could you add one based on
pg_get_wal_replay_pause_state()?


You're thinking to add the test like the following?
#1. Pause the recovery
#2. Confirm that pg_get_wal_replay_pause_state() returns 'paused'
#3. Trigger standby promotion
#4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused'

It seems not easy to do the test #4 stably because
pg_get_wal_replay_pause_state() needs to be executed
before the promotion finishes.


Even for #2, we can not ensure that whether it will be 'paused' or
'pause requested'.


We often use poll_query_until() to make sure some desired state is
reached.


Yes.


 And, as Michael suggested, the function
pg_get_wal_replay_pause_state() still works at the time of
recovery_end_command.  So a bit more detailed steps are:


IMO this idea is tricky and fragile, so I'm inclined to avoid that if possible.
Attached is the POC patch to add the following tests.

#1. Check that pg_get_wal_replay_pause_state() reports "not paused" at first.
#2. Request to pause archive recovery and wait until it's actually paused.
#3. Request to resume archive recovery and wait until it's actually resumed.
#4. Request to pause archive recovery and wait until it's actually paused.
   Then, check that the paused state ends and promotion continues
   if a promotion is triggered while recovery is paused.

In #4, pg_get_wal_replay_pause_state() is not executed while promotion
is ongoing. #4 checks that pg_is_in_recovery() returns false and
the promotion finishes expectedly in that case. Isn't this test enough for now?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/test/recovery/t/002_archiving.pl 
b/src/test/recovery/t/002_archiving.pl
index c675c0886c..8db7e47d13 100644
--- a/src/test/recovery/t/002_archiving.pl
+++ b/src/test/recovery/t/002_archiving.pl
@@ -6,7 +6,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 3;
+use Test::More tests => 4;
 use File::Copy;
 
 # Initialize primary node, doing archives
@@ -75,3 +75,42 @@ ok( !-f "$node_standby2_data/pg_wal/RECOVERYHISTORY",
"RECOVERYHISTORY removed after promotion");
 ok( !-f "$node_standby2_data/pg_wal/RECOVERYXLOG",
"RECOVERYXLOG removed after promotion");
+
+# Check that archive recovery can be paused or resumed expectedly.
+my $node_standby3 = get_new_node('standby3');
+$node_standby3->init_from_backup($node_primary, $backup_name,
+   has_restoring => 1);
+$node_standby3->start;
+
+# Archive recovery is not yet paused.
+is($node_standby3->safe_psql('postgres',
+   "SELECT pg_get_wal_replay_pause_state()"),
+   'not paused', 'pg_get_wal_replay_pause_state() reports not paused');
+
+# Request to pause archive recovery and wait until it's actually paused.
+$node_standby3->safe_psql('postgres', "SELECT pg_wal_replay_pause()");
+$node_primary->safe_psql('postgres',
+   "INSERT INTO tab_int VALUES (generate_series(2001,2010))");
+$node_standby3->poll_query_until('postgres',
+   "SELECT pg_get_wal_replay_pause_state() = 'paused'")
+   or die "Timed out while waiting for archive recovery to be paused";
+
+# Request to resume archive recovery and wait until it's actually resumed.
+$node_standby3->safe_psql('postgres', "SELECT pg_wal_replay_resume()");
+$node_standby3->poll_query_until('postgres',
+   "SELECT pg_get_wal_replay_pause_state() = 'not paused'")
+   or die "Timed out while waiting for archive recovery to be resumed";
+
+# Check that the paused state ends and promotion continues if a promotion
+# is triggered while recovery is paused.
+$node_standby3->safe_psql('postgres', "SELECT pg_wal_replay_pause()");
+$node_primary->safe_psql('postgres',
+   "INSERT INTO tab_int VALUES (generate_series(2011,2020))");
+$node_standby3->poll_query_until('postgres',
+   "SELECT pg_get_wal_replay_pause_state() = 'paused'")
+  or die "Timed out while waiting for archive recovery to be paused";
+
+$node_standby3->promote;
+$node_standby3->poll_query_until('postgres',
+   "SELECT NOT pg_is_in_recovery()")
+  or die "Timed out while waiting for promotion to finish";


Re: pgbench test failing on 14beta1 on Debian/i386

2021-05-19 Thread Fabien COELHO




Forgot to post the actual values:
  r = 2563421694876090368
  r = 2563421694876090365
Smells a bit like a precision problem in the workings of pg_erand48(),
but as soon as I saw floating point numbers I closed my laptop and ran
for the door.


Yup.  This test has a touching, but entirely unwarranted, faith in
pg_erand48() producing bit-for-bit the same values everywhere.


Indeed.

I argued against involving any floating-point computation on principle, but
Dean was confident it could work, and it did simplify the code, so it did
not look that bad an option.


I see two simple approaches:

(1) use another PRNG inside pgbench, eg Knuth's which was used in some 
previous submission and is very simple and IMHO better than the rand48 
stuff.


(2) extend pg_*rand48() to provide an unsigned 64 bits out of the 48 bits
state.

Any preference?
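
For (2), an untested sketch of what I mean, building the 64 bits from two
successive outputs (function name is hypothetical; note that the generator
still has only 48 bits of state, this merely avoids the float rounding):

static uint64
pgbench_urand64(unsigned short xseed[3])
{
	/* pg_jrand48() returns 32 (signed) bits per call */
	uint64		hi = (uint64) pg_jrand48(xseed) & UINT64CONST(0xFFFFFFFF);
	uint64		lo = (uint64) pg_jrand48(xseed) & UINT64CONST(0xFFFFFFFF);

	return (hi << 32) | lo;
}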

--
Fabien.




Re: subscriptioncheck failure

2021-05-19 Thread vignesh C
On Wed, May 19, 2021 at 10:28 AM Amit Kapila  wrote:
>
> On Tue, May 18, 2021 at 11:25 AM vignesh C  wrote:
> >
> > Thanks for the comments, the attached patch has the changes for the same.
> >
>
> Thanks, I have pushed your patch after making minor changes in the comments.

Thanks for pushing this patch.

Regards,
Vignesh




Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Amit Langote
On Wed, May 19, 2021 at 2:05 PM Michael Paquier  wrote:
> On Wed, May 19, 2021 at 10:26:28AM +0530, Amit Kapila wrote:
> > How about moving AfterTriggerEndQuery() to apply_handle_*_internal
> > calls? That way, we might not even need to change Push/Pop calls.
>
> Isn't that going to be a problem when a tuple is moved to a new
> partition in the tuple routing?  This does a DELETE followed by an
> INSERT, but the operation is an UPDATE.

That indeed doesn't work.  Once AfterTriggerEndQuery() gets called
for the DELETE from apply_handle_delete_internal(), the after triggers
of the subsequent INSERT can't be processed, instead causing:

ERROR:  AfterTriggerSaveEvent() called outside of query

IOW, the patch you posted earlier seems like the way to go.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.

2021-05-19 Thread Kyotaro Horiguchi
At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar  wrote 
in 
> On Wed, May 19, 2021 at 10:16 AM Fujii Masao
>  wrote:
> >
> > On 2021/05/18 15:46, Michael Paquier wrote:
> > > On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote:
> > >> Currently a promotion causes all available WAL to be replayed before
> > >> a standby becomes a primary whether it was in paused state or not.
> > >> OTOH, something like immediate promotion (i.e., standby becomes
> > >> a primary without replaying outstanding WAL) might be useful for
> > >> some cases. I don't object to that.
> > >
> > > Sounds like a "promotion immediate" mode.  It does not sound difficult
> > > nor expensive to add a small test for that in one of the existing
> > > recovery tests triggerring a promotion.  Could you add one based on
> > > pg_get_wal_replay_pause_state()?
> >
> > You're thinking to add the test like the following?
> > #1. Pause the recovery
> > #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused'
> > #3. Trigger standby promotion
> > #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused'
> >
> > It seems not easy to do the test #4 stably because
> > pg_get_wal_replay_pause_state() needs to be executed
> > before the promotion finishes.
> 
> Even for #2, we can not ensure that whether it will be 'paused' or
> 'pause requested'.

We often use poll_query_until() to make sure some desired state is
reached.  And, as Michael suggested, the function
pg_get_wal_replay_pause_state() still works at the time of
recovery_end_command.  So a bit more detailed steps are:

#0. Equip the server with recovery_end_command that waits for some
trigger then start the server.
#1. Pause the recovery
#2. Wait until pg_get_wal_replay_pause_state() returns 'paused'
#3. Trigger standby promotion
#4. Wait until pg_get_wal_replay_pause_state() returns 'not paused'
#5. Trigger recovery_end_command to let promotion proceed.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center