from:"Sergei Kornilov"

Re:pg_waldump

2023-12-19 Thread Sergei Kornilov

Hello

It is something like

rmgr: Heaplen (rec/tot):143/   143, tx:748, lsn: 
0/01530AF0, prev 0/01530AB8, desc: INSERT off: 17, flags: 0x00, blkref #0: rel 
1664/0/1260 blk 0

just insertion into pg_authid (oid=1260)

regards, Sergei

Re:Making auto_explain more useful / convenient

2023-11-11 Thread Sergei Kornilov

Hello

auto_explain.log_level is available since postgresql 12.

postgres=# load 'auto_explain';
LOAD
postgres=# set auto_explain.log_min_duration to 0;
SET
postgres=# set auto_explain.log_level to 'notice';
SET
postgres=# select 1;
NOTICE:  duration: 0.010 ms  plan:
Query Text: select 1;
Result  (cost=0.00..0.01 rows=1 width=4)
 ?column? 
--
1

regards, Sergei

Doubled test for SET statements in pg_stat_statements tests

2023-11-07 Thread Sergei Kornilov

Hello!

I noticed that the same block

-- SET statements.
-- These use two different strings, still they count as one entry.
SET work_mem = '1MB';
Set work_mem = '1MB';
SET work_mem = '2MB';
RESET work_mem;
SET enable_seqscan = off;
SET enable_seqscan = on;
RESET enable_seqscan;

is checked twice in contrib/pg_stat_statements/sql/utility.sql on lines 278-286 
and 333-341. Is this on any purpose? I think the second set of tests is not 
needed and can be removed, as in the attached patch.

regards, Sergeidiff --git a/contrib/pg_stat_statements/expected/utility.out b/contrib/pg_stat_statements/expected/utility.out
index cc6e898cdf..a913421290 100644
--- a/contrib/pg_stat_statements/expected/utility.out
+++ b/contrib/pg_stat_statements/expected/utility.out
@@ -664,30 +664,3 @@ SELECT pg_stat_statements_reset();
  
 (1 row)
 
--- SET statements.
--- These use two different strings, still they count as one entry.
-SET work_mem = '1MB';
-Set work_mem = '1MB';
-SET work_mem = '2MB';
-RESET work_mem;
-SET enable_seqscan = off;
-SET enable_seqscan = on;
-RESET enable_seqscan;
-SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
- calls | rows |   query   
+--+---
- 1 |0 | RESET enable_seqscan
- 1 |0 | RESET work_mem
- 1 |1 | SELECT pg_stat_statements_reset()
- 1 |0 | SET enable_seqscan = off
- 1 |0 | SET enable_seqscan = on
- 2 |0 | SET work_mem = '1MB'
- 1 |0 | SET work_mem = '2MB'
-(7 rows)
-
-SELECT pg_stat_statements_reset();
- pg_stat_statements_reset 
---
- 
-(1 row)
-
diff --git a/contrib/pg_stat_statements/sql/utility.sql b/contrib/pg_stat_statements/sql/utility.sql
index 04598e5ae4..3fb8dde68e 100644
--- a/contrib/pg_stat_statements/sql/utility.sql
+++ b/contrib/pg_stat_statements/sql/utility.sql
@@ -329,16 +329,3 @@ DROP TABLE pgss_ctas;
 DROP TABLE pgss_select_into;
 
 SELECT pg_stat_statements_reset();
-
--- SET statements.
--- These use two different strings, still they count as one entry.
-SET work_mem = '1MB';
-Set work_mem = '1MB';
-SET work_mem = '2MB';
-RESET work_mem;
-SET enable_seqscan = off;
-SET enable_seqscan = on;
-RESET enable_seqscan;
-
-SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
-SELECT pg_stat_statements_reset();

whether to unlink the existing state.tmp file in SaveSlotToPath

2023-09-05 Thread Sergei Kornilov

Hello
I encountered a very lucky logical decoding error on the publisher:

2023-09-05 09:58:38.955 UTC 28316 melkij@postgres from [local] [vxid:3/0 
txid:0] [START_REPLICATION] LOG:  starting logical decoding for slot "pubsub"
2023-09-05 09:58:38.955 UTC 28316 melkij@postgres from [local] [vxid:3/0 
txid:0] [START_REPLICATION] DETAIL:  Streaming transactions committing after 
0/16AD5F8, reading WAL from 0/16AD5F8.
2023-09-05 09:58:38.955 UTC 28316 melkij@postgres from [local] [vxid:3/0 
txid:0] [START_REPLICATION] STATEMENT:  START_REPLICATION SLOT "pubsub" LOGICAL 
0/16AD5F8 (proto_version '4', origin 'any', publication_names '"testpub"')
2023-09-05 09:58:38.956 UTC 28316 melkij@postgres from [local] [vxid:3/0 
txid:0] [START_REPLICATION] LOG:  logical decoding found consistent point at 
0/16AD5F8
2023-09-05 09:58:38.956 UTC 28316 melkij@postgres from [local] [vxid:3/0 
txid:0] [START_REPLICATION] DETAIL:  There are no running transactions.
2023-09-05 09:58:38.956 UTC 28316 melkij@postgres from [local] [vxid:3/0 
txid:0] [START_REPLICATION] STATEMENT:  START_REPLICATION SLOT "pubsub" LOGICAL 
0/16AD5F8 (proto_version '4', origin 'any', publication_names '"testpub"')
2023-09-05 09:58:39.187 UTC 28316 melkij@postgres from [local] [vxid:3/0 
txid:0] [START_REPLICATION] ERROR:  could not create file 
"pg_replslot/pubsub/state.tmp": File exists

As I found out, the disk with the database ran out of space, but it was so 
lucky that postgresql did not go into crash recovery. Doubly lucky that logical 
walsender was able to create state.tmp, but could not write the contents and 
got "ERROR: could not write to file "pg_replslot/pubsub/state.tmp": No space 
left on device". The empty state.tmp remained on disk. When the problem with 
free disk space was solved, the publication remained inoperative. To fix it, 
one need to restart the database (RestoreSlotFromDisk always deletes state.tmp) 
or delete state.tmp manually.

Maybe in SaveSlotToPath (src/backend/replication/slot.c) it's also worth 
deleting state.tmp if it already exists? All operations are performed under 
LWLock and there should be no parallel access.

PS: I reproduced the error on HEAD by adding pg_usleep to SaveSlotToPath before 
writing to file. At this time, I filled up the virtual disk.

regards, Sergei

Re: doc: clarify the limitation for logical replication when REPILICA IDENTITY is FULL

2023-07-11 Thread Sergei Kornilov

Hello

I think it's appropriate to add on the restrictions page. (But mentioning that 
this restriction is only for subscriber)

If the list were larger, then the restrictions page could be divided into 
publisher and subscriber restrictions. But not for one very specific 
restriction.

regards, Sergei

Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value

2023-07-10 Thread Sergei Kornilov

Hello

My +1 to have such a function in core or in some contrib at least (pg_surgery? 
amcheck?).

In the past, more than once I needed to find a damaged tuple knowing only chunk 
id and toastrelid. This feature would help a lot.

regards, Sergei

Re: doc: clarify the limitation for logical replication when REPILICA IDENTITY is FULL

2023-07-10 Thread Sergei Kornilov

>> Is this restriction only for the subscriber?
>>
>> If we have not changed the replica identity and there is no primary key, 
>> then we forbid update and delete on the publication side (a fairly common 
>> usage error at the beginning of using publications).
>> If we have replica identity FULL (the table has such a column), then on the 
>> subscription side, update and delete will be performed.
> 
> In the above sentence, do you mean the publisher side?

Yep, sorry.

> But we will not be able to apply them on a subscription. Right?
> 
> If your previous sentence talks about the publisher and this sentence
> about the subscriber then what you are saying is correct. You can see
> the example in the email [1].

Thank you

>> This is an important difference for real use, when the subscriber is not 
>> necessarily postgresql - for example, debezium.
> 
> Can you explain the difference and problem you are seeing? As per my
> understanding, this is the behavior from the time logical replication
> has been introduced.

The difference is that if it's a subscriber-only restriction, then it won't 
automatically apply to anyone with a non-postgresql subscriber.
But if suddenly this would be a limitation of the publisher - then it will 
automatically apply to everyone, regardless of which subscriber is used.
(and it's a completely different problem if the restriction affects the 
update/delete themselves, not only their replication. Like as default replica 
identity on table without primary key, not in this case)

So, I suggest to mention subscriber explicitly:

+ class of Btree, then UPDATE and 
DELETE
-  operations cannot be replicated.
+ operations cannot be applied on subscriber.

Another example of difference:
Debezium users sometimes ask to set identity to FULL to get access to old 
values: https://stackoverflow.com/a/59820210/10983392
However, identity FULL is described in the documentation as: 
https://www.postgresql.org/docs/current/logical-replication-publication.html

> If the table does not have any suitable key, then it can be set to replica 
> identity “full”, which means the entire row becomes the key. This, however, 
> is very inefficient and should only be used as a fallback if no other 
> solution is possible.

But not mentioned, this would only be "very inefficient" for the subscriber, or 
would have an huge impact on the publisher too (besides writing more WAL).

regards, Sergei

Re:doc: clarify the limitation for logical replication when REPILICA IDENTITY is FULL

2023-07-10 Thread Sergei Kornilov

Hello

Is this restriction only for the subscriber?

If we have not changed the replica identity and there is no primary key, then 
we forbid update and delete on the publication side (a fairly common usage 
error at the beginning of using publications).
If we have replica identity FULL (the table has such a column), then on the 
subscription side, update and delete will be performed. But we will not be able 
to apply them on a subscription. Right?

This is an important difference for real use, when the subscriber is not 
necessarily postgresql - for example, debezium.

regards, Sergei

Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements

2023-03-21 Thread Sergei Kornilov

Hello!

The documentation still describes the function pg_stat_statements_reset like 
this

>   By default, this function can only be executed by superusers.

But unfortunately, this part was lost somewhere.

-- Don't want this to be available to non-superusers.
REVOKE ALL ON FUNCTION pg_stat_statements_reset(Oid, Oid, bigint, boolean) FROM 
PUBLIC;

should be added to the upgrade script

Also, shouldn't we first do:

/* First we have to remove them from the extension */
ALTER EXTENSION pg_stat_statements DROP VIEW ..
ALTER EXTENSION pg_stat_statements DROP FUNCTION ..

like in previous upgrade scripts?

> +   Time at which min/max statistics gathering started for this
> +   statement

I think it would be better to explicitly mention in the documentation all 4 
fields for which minmax_stats_since displays the time.

regards, Sergei

Re:pg_stat_statements and "IN" conditions

2023-02-07 Thread Sergei Kornilov

Hello!

Unfortunately, rebase is needed again due to recent changes in queryjumblefuncs 
( 9ba37b2cb6a174b37fc51d0649ef73e56eae27fc )

It seems a little strange to me that with const_merge_threshold = 1, such a 
test case gives the same result as with const_merge_threshold = 2

select pg_stat_statements_reset();
set const_merge_threshold to 1;
select * from test where i in (1,2,3);
select * from test where i in (1,2);
select * from test where i in (1);
select query, calls from pg_stat_statements order by query;

query| calls 
-+---
 select * from test where i in (...) | 2
 select * from test where i in ($1)  | 1

Probably const_merge_threshold = 1 should produce only "i in (...)"?

const_merge_threshold is "the minimal length of an array" (more or equal) or 
"array .. length is larger than" (not equals)? I think the documentation is 
ambiguous in this regard.

I also noticed a typo in guc_tables.c: "Sets the minimal numer of constants in 
an array" -> number

regards, Sergei

Re:Patch: Global Unique Index

2022-11-18 Thread Sergei Kornilov

Hello
Do we need new syntax actually? I think that a global unique index can be 
created automatically instead of raising an error "unique constraint on 
partitioned table must include all partitioning columns"

regards, Sergei

Re:Ubuntu 16.04: Xenial: Why was it removed from the apt repo?

2022-09-19 Thread Sergei Kornilov

Hi
Ubuntu 16.04 is EOL from April 2021, over a year ago.

https://wiki.postgresql.org/wiki/Apt/FAQ#Where_are_older_versions_of_the_packages.3F

regards, Sergei

Re:pg_stat_statements and "IN" conditions

2022-09-16 Thread Sergei Kornilov

Hello!

Unfortunately the patch needs another rebase due to the recent split of guc.c 
(0a20ff54f5e66158930d5328f89f087d4e9ab400)

I'm reviewing a patch on top of a previous commit and noticed a failed test:

#   Failed test 'no parameters missing from postgresql.conf.sample'
#   at t/003_check_guc.pl line 82.
#  got: '1'
# expected: '0'
# Looks like you failed 1 test of 3.
t/003_check_guc.pl .. 

The new option has not been added to the postgresql.conf.sample

PS: I would also like to have such a feature. It's hard to increase 
pg_stat_statements.max or lose some entries just because some ORM sends 
requests with a different number of parameters.

regards, Sergei

Re:Remove an undefined function CalculateMaxmumSafeLSN

2022-05-21 Thread Sergei Kornilov

Hello

Yes, I already wrote about this artifact. And created CF app entry so it 
wouldn't get lost: https://commitfest.postgresql.org/38/3616/

regards, Sergei

Re:Proposal: array_unique_agg() function

2022-03-01 Thread Sergei Kornilov

Hello

select array_agg(distinct mycolumn) from 

from the very beginning? Even the 7.1 manual describes such a syntax: 
https://www.postgresql.org/docs/7.1/sql-expressions.html

regards, Sergei

definition of CalculateMaxmumSafeLSN

2022-02-28 Thread Sergei Kornilov

Hello
I just spotted in src/include/access/xlog.h:
extern XLogRecPtr CalculateMaxmumSafeLSN(void);

This function doesn't seem to be used anywhere or even defined? "git grep 
CalculateMaxmumSafeLSN" shows only this line.
Was added in c6550776394e25c1620bc8258427c8f1d448080d "Allow users to limit 
storage reserved by replication slots"

regards, Sergei

Re:unexpected plan with id = any('{}') condition

2021-11-11 Thread Sergei Kornilov

Hello

Thank you for the explanation!

> unreadable HTML mess

ouch, sorry. "Nobody uses plain text mail, we dropped this thing in the 
interface” said yandex team. (I know that some members of the Yandex team read 
mailing lists, could you ping your colleagues?)

> but I'm not convinced we want to expend
> extra cycles looking for such cases.

Agree. This is an application bug and needs to be fixed there. There is no 
point in slowing down all queries for the sake of improving an inherently 
erroneous query condition.

regards, Sergei

unexpected plan with id = any('{}') condition

2021-11-11 Thread Sergei Kornilov

Hello I have such case: create table test (id int not null, status text);insert into test select i, 'foo' from generate_series(1,100) i;update test set status = 'bar' where id <= 10;create index test_id on test (id );create index test_status_partial on test (status) where status = 'bar';analyze test ;explain (analyze) select * from test where id = any('{}'); Gives query plan:    QUERY PLAN    -- Index Scan using test_status_partial on test  (cost=0.12..4.14 rows=1 width=8) (actual time=0.024..0.025 rows=0 loops=1)   Filter: (id = ANY ('{}'::integer[]))   Rows Removed by Filter: 10 Planning Time: 0.327 ms Execution Time: 0.048 ms I don't understand why the planner chose such an unrelated partial index. I expected "One-Time Filter: false" here or use of test_id index. I agree, a strange condition, the application should avoid such condition, but why use such an index? Initially was spotted on 13.3 production system (slow query due too much Rows Removed by Filter), then I checked this behaviour on 14.0 and fresh HEAD (db9f287711ac49d9799f93f664d6d101ff8f5891) regards, Sergei

Re: Confused with PostgreSQL on Synology NAS

2021-11-10 Thread Sergei Kornilov

Hello postgresql uses hba_file configuration parameter: https://www.postgresql.org/docs/current/runtime-config-file-locations.htmlSo could be changed in postgresql.conf regards, Sergei

Re: extensible options syntax for replication parser?

2021-09-22 Thread Sergei Kornilov

Hello

Thanks, I missed this thread.

> +CHECKPOINT { 'fast' | 'spread' 
> }

Unpaired  tag in docs.

That was all I noticed in 0001. Still not sure where is the difference between 
"change NOWAIT to WAIT" and "change NOWAIT to something else descriptive". But 
fine, I can live with WAIT. (one note: the exact command is visible to the user 
when log_replication_commands is enabled, not completely hidden)

0002 breaks "create subscription (with create_slot true)" when the publish 
server is an earlier version:

postgres=# create subscription test CONNECTION 'host=127.0.0.1 user=postgres' 
PUBLICATION test with (create_slot = true);
ERROR:  could not create replication slot "test": ERROR:  syntax error

regards, Sergei

Re: refactoring basebackup.c

2021-09-14 Thread Sergei Kornilov

Hello

I found that in 0001 you propose to rename few options. Probably we could 
rename another option for clarify? I think FAST (it's about some bw limits?) 
and WAIT (wait for what? checkpoint?) option names are confusing.
Could we replace FAST with "CHECKPOINT [fast|spread]" and WAIT to 
WAIT_WAL_ARCHIVED? I think such names would be more descriptive.

-   if (PQserverVersion(conn) >= 10)
-   /* pg_recvlogical doesn't use an exported snapshot, so 
suppress */
-   appendPQExpBufferStr(query, " NOEXPORT_SNAPSHOT");
+   /* pg_recvlogical doesn't use an exported snapshot, so suppress 
*/
+   if (use_new_option_syntax)
+   AppendStringCommandOption(query, use_new_option_syntax,
+  
"SNAPSHOT", "nothing");
+   else
+   AppendPlainCommandOption(query, use_new_option_syntax,
+
"NOEXPORT_SNAPSHOT");

In 0002, it looks like condition for 9.x releases was lost?

Also my gcc version 8.3.0 is not happy with 
v5-0007-Support-base-backup-targets.patch and produces:

basebackup.c: In function ‘parse_basebackup_options’:
basebackup.c:970:7: error: ‘target_str’ may be used uninitialized in this 
function [-Werror=maybe-uninitialized]
   errmsg("target '%s' does not accept a target detail",
   ^~

regards, Sergei

Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)

2021-06-03 Thread Sergei Kornilov

Hello

I build gcc version 12.0.0 20210603 (experimental) locally, then tried 
REL_13_STABLE with same CFLAGS as serinus
./configure --prefix=/home/melkij/tmp/pgdev/inst CFLAGS="-O3 -ggdb -g3 -Wall 
-Wextra -Wno-unused-parameter -Wno-sign-compare 
-Wno-missing-field-initializers" --enable-tap-tests --enable-cassert 
--enable-debug
check-world passed.

regards, Sergei

Re: hint Consider using pg_file_read()

2021-03-18 Thread Sergei Kornilov

I noticed that the fix has been committed, thank you!

regards, Sergei

hint Consider using pg_file_read()

2021-03-18 Thread Sergei Kornilov

Hello
In src/backend/utils/adt/genfile.c in pg_read_file we have errhint

> errhint("Consider using %s, which is part of core, instead.",
>"pg_file_read()")

Shouldn't pg_read_file() be written here?

regards, Sergei

Re: Improve handling of parameter differences in physical replication

2021-01-15 Thread Sergei Kornilov

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

Hello
Look good for me. I think the patch is ready for commiter.

The new status of this patch is: Ready for Committer

Lock level of create table partition of

2020-12-17 Thread Sergei Kornilov

Hello

In commit 898e5e3290a72d288923260143930fb32036c00c [1] we lowered the lock 
level on the parent relation. I found in discussion [2]:

> David Rowley recently pointed out that we can modify
> CREATE TABLE ..  PARTITION OF to likewise not obtain AEL anymore.
> Apparently it just requires removal of three lines in MergeAttributes.

But on current HEAD "create table ... partition of" still require 
AccessExclusiveLock on the parent relation. It's neccessary?

regards, Sergei

[1]: 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=898e5e3290a72d288923260143930fb32036c00c
[2]: 
https://www.postgresql.org/message-id/20181025202622.d3x4y4ch7m4pxwnn%40alvherre.pgsql

Re: pg_stat_statements oddity with track = all

2020-12-04 Thread Sergei Kornilov

Hello

Seems we need also change PGSS_FILE_HEADER.

regards, Sergei

Re: pg_stat_statements oddity with track = all

2020-12-03 Thread Sergei Kornilov

Hello

> To get an increase in the number of records that means that the same
> statement
> would appear at top level AND nested level. This seems a corner case with
> very low
> (neglectible) occurence rate.

+1
I think splitting fields into plans_toplevel / plans_nested will be less 
convenient. And more code with higher chance of copypaste errors

regards, Sergei

Re: pg_stat_statements oddity with track = all

2020-12-02 Thread Sergei Kornilov

Hello

> - add a parent_statement_id column that would be NULL for top level queries

Will generate too much entries... Every FK for each different delete/insert, 
for example.
But very useful for databases with a lot of stored procedures to find where 
this query is called. May be new mode track = tree? Use NULL to indicate a 
top-level query (same as with track=tree) and some constant for any nested 
queries when track = all.

Also, currently a top statement will account buffers usage for underlying 
statements?

regards, Sergei

Re: Allow some recovery parameters to be changed with reload

2020-12-01 Thread Sergei Kornilov

Hello

> OK, so I pushed the patch. Thanks!

Thank you!

regards, Sergei

Re: Stronger safeguard for archive recovery not to miss data

2020-11-25 Thread Sergei Kornilov

Hello

> First of all, I confirmed the server started up without this patch.

It is possible only with manually configured hot_standby = off, right?
We have ERROR in hot standby mode just below in CheckRequiredParameterValues.

regards, Sergei

Re: Improve handling of parameter differences in physical replication

2020-11-20 Thread Sergei Kornilov

Hello

> I think I like "unpaused" better here, because "resumed" would seem to
> imply that recovery can actually continue.

Good, I agree.

> One thing that has not been added to my patch is the equivalent of
> 496ee647ecd2917369ffcf1eaa0b2cdca07c8730, which allows promotion while
> recovery is paused. I'm not sure that would be necessary, and it
> doesn't look easy to add either.

Hmm... Good question. How about putting CheckForStandbyTrigger() in a wait 
loop, but reporting FATAL with an appropriate message, such as "promotion is 
not possible because of insufficient parameter settings"?
Also it suits me if we only document that we ignore promote here. I don't think 
this is an important case. And yes, it's not easy to allow promotion, since we 
have already updated control file.

Probably we need pause only after we reached consistency?

2020-11-20 18:10:23.617 MSK 19722 @ from  [vxid: txid:0] [] LOG:  entering 
standby mode
2020-11-20 18:10:23.632 MSK 19722 @ from  [vxid: txid:0] [] WARNING:  hot 
standby is not possible because of insufficient parameter settings
2020-11-20 18:10:23.632 MSK 19722 @ from  [vxid: txid:0] [] DETAIL:  
max_connections = 100 is a lower setting than on the primary server, where its 
value was 150.
2020-11-20 18:10:23.632 MSK 19722 @ from  [vxid: txid:0] [] LOG:  recovery has 
paused
2020-11-20 18:10:23.632 MSK 19722 @ from  [vxid: txid:0] [] DETAIL:  If 
recovery is unpaused, the server will shut down.
2020-11-20 18:10:23.632 MSK 19722 @ from  [vxid: txid:0] [] HINT:  You can then 
restart the server after making the necessary configuration changes.
2020-11-20 18:13:09.767 MSK 19755 melkij@postgres from [local] [vxid: txid:0] 
[] FATAL:  the database system is starting up

regards, Sergei

Re: Improve handling of parameter differences in physical replication

2020-11-19 Thread Sergei Kornilov

Hello

Thank you! I'm on vacation, so I was finally able to review the patch.

Seems WAIT_EVENT_RECOVERY_PAUSE addition was lost during patch simplification.

>   ereport(FATAL,
>   (errmsg("recovery aborted because of 
> insufficient parameter settings"),
>errhint("You can restart the server after 
> making the necessary configuration changes.")));

I think we should repeat here conflicted param_name and minValue. 
pg_wal_replay_resume can be called days after recovery being paused. The 
initial message can be difficult to find.

> errmsg("recovery will be paused")

May be use the same "recovery has paused" as in recoveryPausesHere? It doesn't 
seem to make any difference since we set pause right after that, but there will 
be a little less work translators.

Not sure about "If recovery is unpaused". The word "resumed" seems to have been 
usually used in docs.

regards, Sergei

generated columns in MergeAttributesIntoExisting

2020-11-12 Thread Sergei Kornilov

Hello

I found a simple test case:

CREATE TABLE test(id int NOT NULL, gen int GENERATED ALWAYS AS (id + 1) STORED) 
partition by range (id);
create table test_part_create partition of test for values from ( 0) to (10);
create table test_part_attach (id int NOT NULL, gen int);
alter table test attach partition test_part_attach for values from (10) to (20);

postgres=# \d test_part_create 
Table "public.test_part_create"
 Column |  Type   | Collation | Nullable |   Default   
+-+---+--+-
 id | integer |   | not null | 
 gen| integer |   |  | generated always as (id + 1) stored
Partition of: test FOR VALUES FROM (0) TO (10)

postgres=# \d test_part_attach 
  Table "public.test_part_attach"
 Column |  Type   | Collation | Nullable | Default 
+-+---+--+-
 id | integer |   | not null | 
 gen| integer |   |  | 
Partition of: test FOR VALUES FROM (10) TO (20)

Both partitions are attached, but alter table attach patition did not check nor 
enforce a generated column. Same for table inheritance stuff. While looking at 
MergeAttributesIntoExisting in src/backend/commands/tablecmds.c I did not 
notice any special handling or comments for attgenerated. It's an oversight and 
a bug?

Also,
postgres=# alter table test alter COLUMN gen drop expression ;
ERROR:  column "gen" of relation "test_part_attach" is not a stored generated 
column

Regards, Sergei

Re: Allow some recovery parameters to be changed with reload

2020-11-11 Thread Sergei Kornilov

Hello

> Anyway, for now I think that your first patch would be enough, i.e.,
> just change the context of restore_command to PGC_SIGHUP.

Glad to hear. Attached a rebased version of the original proposal.

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f043433e31..ec74cb43ad 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3542,7 +3542,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows

 

-This parameter can only be set at server start.
+This parameter can only be set in the postgresql.conf
+file or on the server command line.

   
  
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index bb34630e8e..a732279f52 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3699,7 +3699,7 @@ static struct config_string ConfigureNamesString[] =
 	},
 
 	{
-		{"restore_command", PGC_POSTMASTER, WAL_ARCHIVE_RECOVERY,
+		{"restore_command", PGC_SIGHUP, WAL_ARCHIVE_RECOVERY,
 			gettext_noop("Sets the shell command that will be called to retrieve an archived WAL file."),
 			NULL
 		},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9cb571f7cc..9c9091e601 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -253,7 +253,6 @@
 # placeholders: %p = path of file to restore
 #   %f = file name only
 # e.g. 'cp /mnt/server/archivedir/%f %p'
-# (change requires restart)
 #archive_cleanup_command = ''	# command to execute at every restartpoint
 #recovery_end_command = ''	# command to execute at completion of recovery

Re: Allow some recovery parameters to be changed with reload

2020-11-10 Thread Sergei Kornilov

Hello

> Even if PITR is commanded, crash recovery can run before starting
> archive recovery if the server was not gracefully shut down.

Hmm... Still not sure how it's possible. Both readRecoverySignalFile and 
validateRecoveryParameters are called early in StartupXLOG. If PITR was 
commanded - we follow PITR logic. If requested recovery stop point is before 
consistent recovery point we shutdown the database with another FATAL.
I mean such place:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/transam/xlog.c;h=9d3f1c12fc56f61da4d2b9bf08c54d31b9757ef7;hb=29be9983a64c011eac0b9ee29895cce71e15ea77#l6891
If we start recovery by any reason and a archive recovery was requested - we 
start archive recovery instead of crash recovery.

> I don't know. I just think that it is not proper that "ALTER SYSTEM" +
> config-reload causes server stop.

I got your point. How about pause the recovery process? Like proposed in 
https://commitfest.postgresql.org/30/2489/
For example,
* restore_command become empty on SIGHUP while PITR was requested
* we set recovery to pause
* if user call pg_wal_replay_resume and restore_command is still empty - we 
shutdown the database
* if user fix restore_command - we continue restore.

But it seems complicated if we just don't need special handling here. We still 
require restore_command to be set to start recovery. In case the user later 
wants to set the restore_command to empty - let's assume that's correct (FATAL 
if PITR target is after the end of local pg_wal, promote otherwise).

>>  Why not use local pg_wal? There may be already enough WAL.
>
> Mmm. If the file to read is in pg_wal, restore_command won't be
> executed in the first place?

Startup process will call restore_command in any case regardless of pg_wal 
content. (xlogarchive.c, RestoreArchivedFile)

> * When doing archive recovery, we always prefer an archived log file even
> * if a file of the same name exists in XLOGDIR.  The reason is that the
> * file in XLOGDIR could be an old, un-filled or partly-filled version
> * that was copied and restored as part of backing up $PGDATA.

regards, Sergei

Re: Allow some recovery parameters to be changed with reload

2020-11-10 Thread Sergei Kornilov

Hello

> If someone changes restore_command to '' then reload while crash
> recovery is running, the server stops for no valid reason.

While *crash* recovery is running? It's possible only during Point-in-Time 
Recovery, no?
At the beginning of validateRecoveryParameters we check 
ArchiveRecoveryRequested, which can be set in two cases:

* if recovery.signal found - same check on recovery start. Otherwise it is 
possible to early end recovery because of empty restore_command. So we want to 
protect the user from such misconfiguration? I am fine if we decide that no 
additional handling is needed.
* if standby.signal found - this FATAL is not reachable because 
StandbyModeRequested is also set.

During crash recovery validateRecoveryParameters does nothing.

> If restore_command is set to 'hoge' (literally:p, that is, anything
> unexecutable) and send SIGHUP while archive recovery is running, the
> server stops. I think we need to handle these cases more gracefully,

I think we can not perform such check reliable. As in my example earlier:

> restore_command = '. /etc/wal-g/WALG_AWS_ENV; wal-g wal-fetch "%f" "%p"'

How do we find the commands first? For any shell? And even: we learned that the 
binary is unexecutable. But what to do next?

> If someone changes restore_command by mistake to something executable
> but fails to offer the specfied file even if it exists, the running
> archive recovery finishes then switches timeline unexpectedly.

Or executable file was just removed. Which is clearly a pilot error. Is this 
differs from changing restore_command?

>>  I do not know the history of this fatal ereport. It looks like "must 
>> specify restore_command when standby mode is not enabled" check is only 
>> intended to protect the user from misconfiguration and the rest code will 
>> treat empty restore_command correctly, just like /bin/false. Did not notice 
>> anything around StandbyMode conditions.
>
> If restore_command is not changable after server-start, it would be
> valid for startup to stop for inexecutable content for the variable
> since there's no way to proceed recovery.

Why not use local pg_wal? There may be already enough WAL.

regards, Sergei

Re: Allow some recovery parameters to be changed with reload

2020-11-06 Thread Sergei Kornilov

Hello

> I'm wondering if it's safe to allow restore_command to be emptied during
> archive recovery. Even when it's emptied, archive recovery can proceed
> by reading WAL files from pg_wal directory. This is the same behavior as
> when restore_command is set to, e.g., /bin/false.

I am always confused by this implementation detail. restore_command fails? 
Fine, let's just read file from pg_wal. But this is different topic...

I do not know the history of this fatal ereport. It looks like "must specify 
restore_command when standby mode is not enabled" check is only intended to 
protect the user from misconfiguration and the rest code will treat empty 
restore_command correctly, just like /bin/false. Did not notice anything around 
StandbyMode conditions.

regards, Sergei

Re: Allow some recovery parameters to be changed with reload

2020-11-06 Thread Sergei Kornilov

Hello

> Currently when restore_command is not set, archive recovery fails
> at the beginning. With the patch, how should we treat the case where
> retore_command is reset to empty during archive recovery? We should
> reject that change of restore_command?

Good point. I think we should reject that change. But (AFAIC) I cannot use GUC 
check callback for this purpose, as only the startup process knows 
StandbyModeRequested. I think it would be appropriate to call 
validateRecoveryParameters from StartupRereadConfig. As side effect this add 
warning/hint "specified neither primary_conninfo nor restore_command" in 
standby mode in appropriate configuration state. Not sure about the rest checks 
in validateRecoveryParameters, maybe it's a wrong idea to recheck them here and 
I need to separate these checks into another function.

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f043433e31..ec74cb43ad 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3542,7 +3542,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows

 

-This parameter can only be set at server start.
+This parameter can only be set in the postgresql.conf
+file or on the server command line.

   
  
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a1078a7cfc..148dd34633 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -883,7 +883,6 @@ static MemoryContext walDebugCxt = NULL;
 #endif
 
 static void readRecoverySignalFile(void);
-static void validateRecoveryParameters(void);
 static void exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog);
 static bool recoveryStopsBefore(XLogReaderState *record);
 static bool recoveryStopsAfter(XLogReaderState *record);
@@ -5458,7 +5457,7 @@ readRecoverySignalFile(void)
  errmsg("standby mode is not supported by single-user servers")));
 }
 
-static void
+void
 validateRecoveryParameters(void)
 {
 	if (!ArchiveRecoveryRequested)
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index eab9c8c4ed..bf7dc866db 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -114,6 +114,11 @@ StartupRereadConfig(void)
 
 	if (conninfoChanged || slotnameChanged || tempSlotChanged)
 		StartupRequestWalReceiverRestart();
+
+	/*
+	 * Check the combination of new parameters
+	 */
+	validateRecoveryParameters();
 }
 
 /* Handle various signals that might be sent to the startup process */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index a62d64eaa4..87fd593924 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3699,7 +3699,7 @@ static struct config_string ConfigureNamesString[] =
 	},
 
 	{
-		{"restore_command", PGC_POSTMASTER, WAL_ARCHIVE_RECOVERY,
+		{"restore_command", PGC_SIGHUP, WAL_ARCHIVE_RECOVERY,
 			gettext_noop("Sets the shell command that will be called to retrieve an archived WAL file."),
 			NULL
 		},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9cb571f7cc..9c9091e601 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -253,7 +253,6 @@
 # placeholders: %p = path of file to restore
 #   %f = file name only
 # e.g. 'cp /mnt/server/archivedir/%f %p'
-# (change requires restart)
 #archive_cleanup_command = ''	# command to execute at every restartpoint
 #recovery_end_command = ''	# command to execute at completion of recovery
 
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 221af87e71..965ed109b3 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -347,6 +347,7 @@ extern void WakeupRecovery(void);
 extern void SetWalWriterSleeping(bool sleeping);
 
 extern void StartupRequestWalReceiverRestart(void);
+extern void validateRecoveryParameters(void);
 extern void XLogRequestWalReceiverReply(void);
 
 extern void assign_max_wal_size(int newval, void *extra);

Re: Allow some recovery parameters to be changed with reload

2020-10-28 Thread Sergei Kornilov

Hello

Sorry for late response.

>>  > ... but what's the corresponding hazard here, exactly? It doesn't seem
>>  > that there's any way in which the decision one process makes affects
>>  > the decision the other process makes. There's still a race condition:
>>  > it's possible for a walsender
>>  Did you mean walreceiver here?
>
> It's logical walsender. restore_command is used within
> logical_read_xlog_page() via XLogReadDetermineTimeline().

Still have no idea what's the corresponding hazard here.

>>  > to use the old restore_command after the
>>  > startup process had already used the new one, or the other way around.
>>  > However, it doesn't seem like that should confuse anything inside the
>>  > server, and therefore I'm not sure we need to code around it.
>>  I came up with following scenario. Let's say we have xlog files 1,2,3
>>  in dir1 and files 4,5 in dir2. If startup process had only handled
>>  files 1 and 2, before we switched restore_command from reading dir1 to
>>  reading dir2, it will fail to find next file. IIUC, it will assume
>>  that recovery is done, start server and walreceiver. The walreceiver
>>  will fail as well. I don't know, how realistic is this case, though.
>
> That operation is somewhat bogus, if the server is not in standby
> mode. In standby mode, startup waits for the next segment safely.

I think it's pilot error. It is already possible to change anything in 
restore_command by wrapping real command into some script:

> restore_command = '/bin/restore_wal.sh "%f" "%p"'

And one can simple replace this file with something else with different logic. 
Or even by using some command with separate own settings. Real world example ( 
https://github.com/wal-g/wal-g ):

> restore_command = '. /etc/wal-g/WALG_AWS_ENV; wal-g wal-fetch "%f" "%p"'

And it is possible to change the real WAL source in ENV script without changing 
the restore_command. We can't track this, so I not see new issues here.

>>  Sergey, could you please attach this thread to the upcoming CF, if
>>  you're going to continue working on it.

Sure, I created one: https://commitfest.postgresql.org/30/2802/

>>  - How will it interact with possible future optimizations of archive
>>  - restore? For example, WAL prefetch [1].

Shouldn't we ask the author of such a patch and not me? In particular, does 
this patch rely on the restore_command not being changed? Probably some form of 
synchronisation would be neccesary in infrastructure for parallel executing 
restore commands. On / off handling of restore_command will most likely be 
required. I did not review this patch.

regards, Sergei

Re: allow online change primary_conninfo

2020-10-16 Thread Sergei Kornilov

Hello

Yep, I think it's useful and I already posted patch in this thread: 
https://www.postgresql.org/message-id/flat/3090621585393698%40myt5-1466095fe4e5.qloud-c.yandex.net#ee6574e93982b5628d140f15cb44
Currently without consensus

regards, Sergei

Re: feature idea: use index when checking for NULLs before SET NOT NULL

2020-05-29 Thread Sergei Kornilov

Hi

> Sergei, a few questions:
>
> - Just to be clear, your recipe does not require any indexes, right? Because 
> the constraint check table scan is inherently concurrent?

Right. "alter table validate constraint" can not use indexes, but does not 
block concurrent read/write queries. Other queries in this scenario can not use 
indexes too, but should be fast.

> - Was this new behavior mentioned in the release nose?

Yes, small note in documentation and small note in release notes 
https://www.postgresql.org/docs/12/release-12.html this one:

> Allow ALTER TABLE ... SET NOT NULL to avoid unnecessary table scans (Sergei 
> Kornilov)
> This can be optimized when the table's column constraints can be recognized 
> as disallowing nulls.

> - Do you know if there are any blog posts etc. discussing this? (I'm 
> definitely going to write one!!)

I do not know. Personally, I mentioned this feature in only a few 
Russian-speaking communities. And right now I wrote answer in dba.SO.

regards, Sergei

Re: feature idea: use index when checking for NULLs before SET NOT NULL

2020-05-29 Thread Sergei Kornilov

Hello

Correct index lookup is a difficult task. I tried to implement this 
previously...

But the answer in SO is a bit incomplete for recent postgresql releases. 
Seqscan is not the only possible way to set not null in pg12+. My patch was 
commited ( https://commitfest.postgresql.org/22/1389/ ) and now it's possible 
to do this way:

alter table foos 
 add constraint foos_not_null 
 check (bar1 is not null) not valid; -- short-time exclusive lock

alter table foos validate constraint foos_not_null; -- still seqscan entire 
table but without exclusive lock

An then another short lock:
alter table foos alter column bar1 set not null;
alter table foos drop constraint foos_not_null;

regards, Sergei

Re: Strange decreasing value of pg_last_wal_receive_lsn()

2020-05-08 Thread Sergei Kornilov

Hello

Yes, this is expected. Walreceiver always start streaming from beginning of the 
wal segment.
./src/backend/replication/walreceiverfuncs.c in RequestXLogStreaming:

 * We always start at the beginning of the segment. That prevents a 
broken
 * segment (i.e., with no records in the first half of a segment) from
 * being created by XLOG streaming, which might cause trouble later on 
if
 * the segment is e.g archived.

regards, Sergei

Re: Why no "array_sort" function?

2020-05-08 Thread Sergei Kornilov

Hello

> mostly in an aggregation to show
> in a compact way what items were grouped together.

Aggregate functions have syntax for ordering: just "select array_agg(i order by 
i) from "
Described here: 
https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES

regards, Sergei

Re: recovery_target_action=pause with confusing hint

2020-03-31 Thread Sergei Kornilov

    My proposal does not change the behavior after this commit, only 
 changing the lines in the logs.
>>>
>>>  Yes. What's your opinion about the latest patch based on Robert's idea?
>>>  Barring any ojection, I'd like to commit that.
>>
>>  Oh, sorry. Looks good and solves my issue
>
> Thanks for reviewing the patch! Pushed!

Thank you!

regards, Sergei

Re: recovery_target_action=pause with confusing hint

2020-03-31 Thread Sergei Kornilov

Hello

>>  My proposal does not change the behavior after this commit, only changing 
>> the lines in the logs.
>
> Yes. What's your opinion about the latest patch based on Robert's idea?
> Barring any ojection, I'd like to commit that.

Oh, sorry. Looks good and solves my issue

regards, Sergei

Re: recovery_target_action=pause with confusing hint

2020-03-31 Thread Sergei Kornilov

Hello

> When I test the patch, I find an issue: I start a stream with 
> 'promote_trigger_file'
> GUC valid, and exec pg_wal_replay_pause() during recovery and as below it
> shows success to pause at the first time. I think it use a initialize
> 'SharedPromoteIsTriggered' value first time I exec the pg_wal_replay_pause().

hm. Are you sure this is related to this patch? Could you explain the exact 
timing? I mean log_statement = all and relevant logs.
Most likely this is expected change by 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=496ee647ecd2917369ffcf1eaa0b2cdca07c8730

My proposal does not change the behavior after this commit, only changing the 
lines in the logs.

regards, Sergei

Re: Allow some recovery parameters to be changed with reload

2020-03-28 Thread Sergei Kornilov

Hello

I want to return to this discussion, since primary_conninfo is now PGC_SIGHUP 
(and I hope will not be reverted)

> On 2019-02-08 09:19:31 +0900, Michael Paquier wrote:
>>  On Thu, Feb 07, 2019 at 11:06:27PM +0100, Peter Eisentraut wrote:
>>  > Probably right. I figured it would be useful to see what the outcome is
>>  > with primary_conninfo, so they can be treated similarly.
>>
>>  The interactions with waiting for WAL to be available and the WAL
>>  receiver stresses me a bit for restore_command, as you could finish
>>  with the startup process switching to use restore_command with a WAL
>>  receiver still working behind and overwriting partially the recovered
>>  segment, which could lead to corruption. We should be *very* careful
>>  about that.
>
> I'm not clear on the precise mechanics you're imagining here, could you
> expand a bit? We kill the walreceiver when switching from receiver to
> restore command, and wait for it to acknowledge that, no?
> C.F. ShutdownWalRcv() call in the lastSourceFailed branch of
> WaitForWALToBecomeAvailable().

So...
We call restore_command only when walreceiver is stopped.
We use restore_command only in startup process - so we have no race condition 
between processes.
We have some issues here? Or we can just make restore_command reloadable as 
attached?

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2de21903a1..454bf95d9b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3303,7 +3303,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows

 

-This parameter can only be set at server start.
+This parameter can only be set in the postgresql.conf
+file or on the server command line.

   
  
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 79bc7ac8ca..f340369dcf 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3641,7 +3641,7 @@ static struct config_string ConfigureNamesString[] =
 	},
 
 	{
-		{"restore_command", PGC_POSTMASTER, WAL_ARCHIVE_RECOVERY,
+		{"restore_command", PGC_SIGHUP, WAL_ARCHIVE_RECOVERY,
 			gettext_noop("Sets the shell command that will retrieve an archived WAL file."),
 			NULL
 		},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e9f8ca775d..8078249e1f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -247,7 +247,6 @@
 # placeholders: %p = path of file to restore
 #   %f = file name only
 # e.g. 'cp /mnt/server/archivedir/%f %p'
-# (change requires restart)
 #archive_cleanup_command = ''	# command to execute at every restartpoint
 #recovery_end_command = ''	# command to execute at completion of recovery

Re: allow online change primary_conninfo

2020-03-28 Thread Sergei Kornilov

Hello

Thank you very much!

I attached updated patch: walreceiver will use configured primary_slot_name as 
temporary slot name if wal_receiver_create_temp_slot is enabled.

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2de21903a1..8983cb5f5e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4170,8 +4170,8 @@ ANY num_sync ( ).
+slot on the remote instance. The slot name can be configured
+(using ), otherwise it will be generated.
 The default is off.  This parameter can only be set in the 
 postgresql.conf file or on the server command line.
 If this parameter is changed while the WAL receiver process is running,
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index fd9ac35dac..134600db7d 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -116,13 +116,8 @@ StartupRereadConfig(void)
 
 	conninfoChanged = strcmp(conninfo, PrimaryConnInfo) != 0;
 	slotnameChanged = strcmp(slotname, PrimarySlotName) != 0;
+	tempSlotChanged = (tempSlot != wal_receiver_create_temp_slot);
 
-	/*
-	 * wal_receiver_create_temp_slot is used only when we have no slot
-	 * configured.  We do not need to track this change if it has no effect.
-	 */
-	if (!slotnameChanged && strcmp(PrimarySlotName, "") == 0)
-		tempSlotChanged = tempSlot != wal_receiver_create_temp_slot;
 	pfree(conninfo);
 	pfree(slotname);
 
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 760e3c7ab0..303e739b5b 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -245,12 +245,6 @@ WalReceiverMain(void)
 	startpoint = walrcv->receiveStart;
 	startpointTLI = walrcv->receiveStartTLI;
 
-	/*
-	 * At most one of is_temp_slot and slotname can be set; otherwise,
-	 * RequestXLogStreaming messed up.
-	 */
-	Assert(!is_temp_slot || (slotname[0] == '\0'));
-
 	/* Initialise to a sanish value */
 	walrcv->lastMsgSendTime =
 		walrcv->lastMsgReceiptTime = walrcv->latestWalEndTime = now;
@@ -365,14 +359,14 @@ WalReceiverMain(void)
 
 		/*
 		 * Create temporary replication slot if requested, and update slot
-		 * name in shared memory.  (Note the slot name cannot already be set
-		 * in this case.)
+		 * name in shared memory.
 		 */
 		if (is_temp_slot)
 		{
-			snprintf(slotname, sizeof(slotname),
-	 "pg_walreceiver_%lld",
-	 (long long int) walrcv_get_backend_pid(wrconn));
+			if (slotname[0] == '\0')
+snprintf(slotname, sizeof(slotname),
+		"pg_walreceiver_%lld",
+		(long long int) walrcv_get_backend_pid(wrconn));
 
 			walrcv_create_slot(wrconn, slotname, true, 0, NULL);
 
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 21d1823607..28117883e1 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -218,7 +218,7 @@ ShutdownWalRcv(void)
  * "recptr" indicates the position where streaming should begin.  "conninfo"
  * is a libpq connection string to use.  "slotname" is, optionally, the name
  * of a replication slot to acquire.  "create_temp_slot" indicates to create
- * a temporary slot when no "slotname" is given.
+ * a temporary slot.
  *
  * WAL receivers do not directly load GUC parameters used for the connection
  * to the primary, and rely on the values passed down by the caller of this
@@ -254,23 +254,17 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
 	else
 		walrcv->conninfo[0] = '\0';
 
-	/*
-	 * Use configured replication slot if present, and ignore the value
-	 * of create_temp_slot as the slot name should be persistent.  Otherwise,
-	 * use create_temp_slot to determine whether this WAL receiver should
-	 * create a temporary slot by itself and use it, or not.
-	 */
 	if (slotname != NULL && slotname[0] != '\0')
 	{
 		strlcpy((char *) walrcv->slotname, slotname, NAMEDATALEN);
-		walrcv->is_temp_slot = false;
 	}
 	else
 	{
 		walrcv->slotname[0] = '\0';
-		walrcv->is_temp_slot = create_temp_slot;
 	}
 
+	walrcv->is_temp_slot = create_temp_slot;
+
 	if (walrcv->walRcvState == WALRCV_STOPPED)
 	{
 		launch = true;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 79bc7ac8ca..e702e3eddc 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2051,7 +2051,7 @@ static struct config_bool ConfigureNamesBool[] =
 
 	{
 		{"wal_receiver_create_temp_slot", PGC_SIGHUP, REPLICATION_STANDBY,
-			gettext_noop("Sets whether a WAL receiver should create a temporary replication slot if no permanent slot is configured."),
+			gettext_noop("Sets whether a WAL receiver should create a temporary replication slot."),
 		},
 		_receiver_create_temp_slot,
 		false,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e9f8ca775d..e09be86940 100644
---

Re: allow online change primary_conninfo

2020-03-27 Thread Sergei Kornilov

Hello

Thank you!

> I think I should set aside your new draft for now

I agree, this patch definitely needs a bit more time to review. (currently it 
applies on top of v13 patch cleanly)

> but I think we should still get it in pg13 to avoid having the change the 
> semantics of the
> walreceiver temp slot in the next release.

Good point.

regards, Sergei

Re: Improve handling of parameter differences in physical replication

2020-03-27 Thread Sergei Kornilov

Hello

> I think we can set wait event WAIT_EVENT_RECOVERY_PAUSE here.

+1, since we added this in recoveryPausesHere.

PS: do we need to add a prototype for the RecoveryRequiredIntParameter function 
in top of xlog.c?

regards, Sergei

Re: allow online change primary_conninfo

2020-03-27 Thread Sergei Kornilov

Hello

In other words, patches in reverse order:
0001 will change primary_conninfo and primary_slot_name to reloadable
0002 will move wal_receiver_create_temp_slot logic to startup process (without 
changing to PGC_POSTMASTER)
0003 is new draft patch: wal_receiver_create_temp_slot will use the given name 
or generate new one.

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c5160fe907..d18921725c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4171,9 +4171,9 @@ ANY num_sync ( ).
-The default is off.  This parameter can only be set in the 
+slot on the remote instance. The slot will be created with the name
+configured by  setting, otherwise
+with the generated name. The default is off.  This parameter can only be set in the 
 postgresql.conf file or on the server command line.


diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 0aa0c52c49..af9b9d586c 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -362,9 +362,10 @@ WalReceiverMain(void)
 		 */
 		if (is_temp_slot)
 		{
-			snprintf(slotname, sizeof(slotname),
-	 "pg_walreceiver_%lld",
-	 (long long int) walrcv_get_backend_pid(wrconn));
+			if (slotname[0] == '\0')
+snprintf(slotname, sizeof(slotname),
+		"pg_walreceiver_%lld",
+		(long long int) walrcv_get_backend_pid(wrconn));
 
 			walrcv_create_slot(wrconn, slotname, true, 0, NULL);
 
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 21d1823607..28117883e1 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -218,7 +218,7 @@ ShutdownWalRcv(void)
  * "recptr" indicates the position where streaming should begin.  "conninfo"
  * is a libpq connection string to use.  "slotname" is, optionally, the name
  * of a replication slot to acquire.  "create_temp_slot" indicates to create
- * a temporary slot when no "slotname" is given.
+ * a temporary slot.
  *
  * WAL receivers do not directly load GUC parameters used for the connection
  * to the primary, and rely on the values passed down by the caller of this
@@ -254,23 +254,17 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
 	else
 		walrcv->conninfo[0] = '\0';
 
-	/*
-	 * Use configured replication slot if present, and ignore the value
-	 * of create_temp_slot as the slot name should be persistent.  Otherwise,
-	 * use create_temp_slot to determine whether this WAL receiver should
-	 * create a temporary slot by itself and use it, or not.
-	 */
 	if (slotname != NULL && slotname[0] != '\0')
 	{
 		strlcpy((char *) walrcv->slotname, slotname, NAMEDATALEN);
-		walrcv->is_temp_slot = false;
 	}
 	else
 	{
 		walrcv->slotname[0] = '\0';
-		walrcv->is_temp_slot = create_temp_slot;
 	}
 
+	walrcv->is_temp_slot = create_temp_slot;
+
 	if (walrcv->walRcvState == WALRCV_STOPPED)
 	{
 		launch = true;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 53665971f5..51ef6987b5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2051,7 +2051,7 @@ static struct config_bool ConfigureNamesBool[] =
 
 	{
 		{"wal_receiver_create_temp_slot", PGC_SIGHUP, REPLICATION_STANDBY,
-			gettext_noop("Sets whether a WAL receiver should create a temporary replication slot if no permanent slot is configured."),
+			gettext_noop("Sets whether a WAL receiver should create a temporary replication slot."),
 		},
 		_receiver_create_temp_slot,
 		false,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f01e43b818..5308d602f4 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -319,8 +319,7 @@
 #max_standby_streaming_delay = 30s	# max delay before canceling queries
 	# when reading streaming WAL;
 	# -1 allows indefinite delay
-#wal_receiver_create_temp_slot = off	# Create temp slot if primary_slot_name
-	# is not set.
+#wal_receiver_create_temp_slot = off	# Create temp replication slot
 #wal_receiver_status_interval = 10s	# send replies at least this often
 	# 0 disables
 #hot_standby_feedback = off		# send info from standby to prevent
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 1cf88e953d..c5160fe907 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4173,11 +4173,11 @@ ANY num_sync ( ).
-The default is on.  The only reason to turn this off would be if the
-remote instance is currently out of available replication slots.  This
-parameter can only be set in the postgresql.conf
-file or on the server command line.  Changes only take effect when the
-WAL receiver process starts a new connection.
+The default is off.  This parameter

Re: allow online change primary_conninfo

2020-03-27 Thread Sergei Kornilov

Hello

> I realized that the reason the tests broke after Sergei's patch is that
> recovery/t/001_stream_rep.pl's get_slot_xmins() is broken for temp
> walreceiver slots, since it's using the non-temp name it tries to give
> to the slot, rather than the temp name under which it is actually
> created. The workaround proposed by 0002 is to edit standby_1's config
> to set walreceiver's slot to be non-temp.

This is bug in behavior, not in tests.
We need walrcv->is_temp_slot = false; somewhere in RequestXLogStreaming to 
works correctly.

HEAD is not affected since primary_slot_name cannot be changed online.

> (The thing is: if I specify primary_slot_name in the config, why is the
> temp walreceiver slot code not obeying that name? I think walreceiver
> should create a temp slot, sure, but using the given name rather than
> coming up with a random name.)

Hm, interesting idea.

regards, Sergei

Re: replay pause vs. standby promotion

2020-03-26 Thread Sergei Kornilov

Hello

> If we don't ignore walreceiver and does try to connect to the master,
> a promotion and recovery cannot end forever since new WAL data can
> be streamed. You think this behavior is more consistent?

We have no simple point to stop replay.
Well, except for "immediately" - just one easy stop. But I agree that this is 
not the best option. Simple and clear, but not best one for data when we want 
to replay as much as possible from archive.

> IMO it's valid to replay all the WAL data available to avoid data loss
> before a promotion completes.

But in case of still working primary (with archive_command) we choose quite 
random time to promote. A random time when the primary did not save the new wal 
segment.
or even when a temporary error of restore_command occurs? We mention just cp 
command in docs. I know users uses cp (e.g. from NFS) without further error 
handling.

regards, Sergei

Re: allow online change primary_conninfo

2020-03-26 Thread Sergei Kornilov

Hello

Thank you! You were ahead of me. I wanted to double-check my changes this 
morning before posting.

> Sergei included LOG messages to indicate which setting has been changed.
> I put these in "#if 0" for now, but I'm thinking to remove these
> altogether;

No objections.

> Not sure if we can or should adjust the FATAL line; probably not worth the 
> trouble.

I think it's ok. walreceiver is terminated exactly due to administrator command.

regards, Sergei

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-25 Thread Sergei Kornilov

Hello

> WAL usage patch [1] increments this version to 1_4 instead of 1_8.
> I *guess* that's because previously this version was maintained
> independently from pg_stat_statements' version. For example,
> pg_stat_statements 1.4 seems to have used PGSS_V1_3.

As far as I remember, this was my proposed change in review a year ago.
I think that having a clear analogy between the extension version and the 
function name would be more clear than sequential numbering of PGSS_V with 
different extension versions.
For pgss 1.4 it was fine to use PGSS_V1_3, because there were no changes in 
pg_stat_statements_internal.
pg_stat_statements 1.3 will call pg_stat_statements_1_3
pg_stat_statements 1.4 - 1.7 will still call pg_stat_statements_1_3. In my 
opinion, this is the correct naming, since we did not need a new function.
but pg_stat_statements 1.8 will call pg_stat_statements_1_4. It's not confusing?

Well, no strong opinion.

regards, Sergei

Re: replay pause vs. standby promotion

2020-03-25 Thread Sergei Kornilov

Hi

>>  Could we add a few words in func.sgml to clarify the behavior? Especially 
>> for users from my example above. Something like:
>>
>>>  If a promotion is triggered while recovery is paused, the paused state 
>>> ends, replay of any WAL immediately available in the archive or in pg_wal 
>>> will be continued and then a promotion will be completed.
>
> This description is true if pause is requested by pg_wal_replay_pause(),
> but not if recovery target is reached and pause is requested by
> recovery_target_action=pause. In the latter case, even if there are WAL data
> avaiable in pg_wal or archive, they are not replayed, i.e., the promotion
> completes immediately. Probably we should document those two cases
> explicitly to avoid the confusion about a promotion and recovery pause?

This is description for pg_wal_replay_pause, but actually we suggest to call 
pg_wal_replay_resume in recovery_target_action=pause... So, I agree, we need to 
document both cases.

PS: I think we have inconsistent behavior here... Read wal during promotion 
from local pg_wal AND call restore_command, but ignore walreceiver also seems 
strange for my DBA hat...

regards, Sergei

Re: replay pause vs. standby promotion

2020-03-24 Thread Sergei Kornilov

Hello

> I pushed the latest version of the patch. If you have further opinion
> about immediate promotion, let's keep discussing that!

Thank you!

Honestly, I forgot that the promotion is documented in high-availability.sgml 
as:

> Before failover, any WAL immediately available in the archive or in pg_wal 
> will be
> restored, but no attempt is made to connect to the master.

I mistakenly thought that promote should be "immediately"...

> If a promotion is triggered while recovery is paused, the paused state ends 
> and a promotion continues.

Could we add a few words in func.sgml to clarify the behavior? Especially for 
users from my example above. Something like:

> If a promotion is triggered while recovery is paused, the paused state ends, 
> replay of any WAL immediately available in the archive or in pg_wal will be 
> continued and then a promotion will be completed.

regards, Sergei

Re: replay pause vs. standby promotion

2020-03-23 Thread Sergei Kornilov

Hello

> You meant that the promotion request should cause the recovery
> to finish immediately even if there are still outstanding WAL records,
> and cause the standby to become the master?

Oh, I get your point. But yes, I expect that in case of promotion request 
during a pause, the user (me too) will want to have exactly the current state, 
not latest available in WALs.

Real usercase from my experience:
The user wants to update a third-party application. In case of problems, he 
wants to return to the old version of the application and the unchanged 
replica. Thus, it sets a pause on standby and performs an update. If all is ok 
- he will resume replay. In case of some problems he plans to promote standby.
But oops, standby will ignore promote signals during pause and we need get 
currect LSN from standby and restart it with recovery_target_lsn = ? and 
recovery_target_action = promote to achieve this state.

regards, Sergei

Re: replay pause vs. standby promotion

2020-03-23 Thread Sergei Kornilov

Hello

(I am trying to find an opportunity to review this patch...)

Consider test case with streaming replication:

on primary: create table foo (i int);
on standby:

postgres=# select pg_wal_replay_pause();
 pg_wal_replay_pause 
-
 
(1 row)

postgres=# select pg_is_wal_replay_paused();
 pg_is_wal_replay_paused 
-
 t
(1 row)

postgres=# table foo;
 i 
---
(0 rows)

Execute "insert into foo values (1);" on primary

postgres=# select pg_promote ();
 pg_promote 

 t
(1 row)

postgres=# table foo;
 i 
---
 1

And we did replay one additional change during promote. I think this is wrong 
behavior. Possible can be fixed by

+if (PromoteIsTriggered()) break;
/* Setup error traceback support for ereport() */
errcallback.callback = rm_redo_error_callback;

regards, Sergei

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-20 Thread Sergei Kornilov

Hello

I was inactive for a while... Sorry.

>>  BTW, I recheck the patchset.
>>  I think codes are ready for committer but should we modify the 
>> documentation?
>>  {min,max,mean,stddev}_time is now obsoleted so it is better to modify it to
>>  {min,max,mean,stddev}_exec_time and add {min,max,mean,stddev}_plan_time.
>
> Oh indeed, I totally forgot about this. I'm attaching v8 with updated
> documentation that should match what was implemented since some versions.

Yet another is missed in docs: total_time

I specifically verified that the new loaded library works with the old version 
of the extension in the database. I have not noticed issues here.

> 2.2% is a bit large but I think it is still acceptable because people using 
> this feature
> might take account that some overhead will happen for additional calling of a
> gettime function.

I will be happy even with 10% overhead due to enabled track_planning... (but in 
this case disabled by default) log_min_duration_statement = 0 with log parsing 
is much more expensive.
I think 1-2% is acceptable and we can set track_planning = on by default as 
patch does.

> * Rows for executor time are changed from {total/min/max/mean/stddev}_time to 
> {total/min/max/mean/stddev}_exec_time.

Maybe release it as 2.0 version instead of minor update 1.18?

regards, Sergei

Re: pgsql: walreceiver uses a temporary replication slot by default

2020-03-17 Thread Sergei Kornilov

Hello

> I have reworked that part, adding more comments about the use of GUC
> parameters when establishing the connection to the primary for a WAL
> receiver. And also I have added an extra comment to walreceiver.c
> about the use of GUcs in general, to avoid this stuff again in the
> future. There were some extra nits with the format of
> postgresql.conf.sample.

Thank you! I just noticed that you removed my proposed change to this condition 
in RequestXLogStreaming

-   if (slotname != NULL)
+   if (slotname != NULL && slotname[0] != '\0')

We need this change to set is_temp_slot properly. PrimarySlotName GUC can 
usually be an empty string, so just "slotname != NULL" is not enough.

I attached patch with this change.

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 672bf6f1ee..8b742c83c5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4160,11 +4160,7 @@ ANY num_sync ( ).
-The default is on.  The only reason to turn this off would be if the
-remote instance is currently out of available replication slots.  This
-parameter can only be set in the postgresql.conf
-file or on the server command line.  Changes only take effect when the
-WAL receiver process starts a new connection.
+The default is off.  This parameter can only be set at server start.

   
  
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index de2d4ee582..483fedb218 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -284,6 +284,7 @@ bool		StandbyModeRequested = false;
 char	   *PrimaryConnInfo = NULL;
 char	   *PrimarySlotName = NULL;
 char	   *PromoteTriggerFile = NULL;
+bool		wal_receiver_create_temp_slot = false;
 
 /* are we currently in standby mode? */
 bool		StandbyMode = false;
@@ -11951,7 +11952,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 		}
 		curFileTLI = tli;
 		RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
-			 PrimarySlotName);
+			 PrimarySlotName, wal_receiver_create_temp_slot);
 		receivedUpto = 0;
 	}
 
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 25e0333c9e..779d19f1c1 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -15,6 +15,12 @@
  * WalRcv->receivedUpto variable in shared memory, to inform the startup
  * process of how far it can proceed with XLOG replay.
  *
+ * WAL receivers cannot load directly GUC parameters used when establishing
+ * their connection to the primary, and rely on parameter values passed down
+ * by the startup process when WAL streaming is requested.  This applies
+ * to for example the replication slot creation and the connection string to
+ * use for the connection with the primary.
+ *
  * If the primary server ends streaming, but doesn't disconnect, walreceiver
  * goes into "waiting" mode, and waits for the startup process to give new
  * instructions. The startup process will treat that the same as
@@ -74,7 +80,6 @@
 
 
 /* GUC variables */
-bool		wal_receiver_create_temp_slot;
 int			wal_receiver_status_interval;
 int			wal_receiver_timeout;
 bool		hot_standby_feedback;
@@ -349,42 +354,23 @@ WalReceiverMain(void)
 		WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
 
 		/*
-		 * Create temporary replication slot if no slot name is configured or
-		 * the slot from the previous run was temporary, unless
-		 * wal_receiver_create_temp_slot is disabled.  We also need to handle
-		 * the case where the previous run used a temporary slot but
-		 * wal_receiver_create_temp_slot was changed in the meantime.  In that
-		 * case, we delete the old slot name in shared memory.  (This would
+		 * Create temporary replication slot if requested.  In that
+		 * case, we update slot name in shared memory.  (This would
 		 * all be a bit easier if we just didn't copy the slot name into
 		 * shared memory, since we won't need it again later, but then we
 		 * can't see the slot name in the stats views.)
 		 */
-		if (slotname[0] == '\0' || is_temp_slot)
+		if (is_temp_slot)
 		{
-			bool		changed = false;
+			snprintf(slotname, sizeof(slotname),
+	 "pg_walreceiver_%lld",
+	 (long long int) walrcv_get_backend_pid(wrconn));
 
-			if (wal_receiver_create_temp_slot)
-			{
-snprintf(slotname, sizeof(slotname),
-		 "pg_walreceiver_%lld",
-		 (long long int) walrcv_get_backend_pid(wrconn));
+			walrcv_create_slot(wrconn, slotname, true, 0, NULL);
 
-walrcv_create_slot(wrconn, slotname, true, 0, NULL);
-changed = true;
-			}
-			else if (slotname[0] != '\0')
-			{
-slotname[0] = '\0';
-changed = true;
-			}
-
-			if (changed)
-			{
-SpinLockAcquire(>mutex);
-strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
-walrcv->is_temp_slot = wal_receiver_create_temp_slot;
-

Re: allow online change primary_conninfo

2020-03-17 Thread Sergei Kornilov

Hello

>>  Well, it seems better to move this patch to next commitfest?
>
> What? You want to make wal_receiver_create_temp_slot unchangeable and
> default to off in pg13, and delay the patch that fixes those things to
> pg14? That makes no sense to me.

I want to handle similar things in a similar way.
wal_receiver_create_temp_slot has good design? I will change my patch in same 
way in this case. But something like that was strongly rejected a year ago.

> Please keep them both here so that we can get things to a usable state.

Yes, of course.

Here I attached 3 patches:
0001 is copy from https://commitfest.postgresql.org/27/2456/ It changes 
wal_receiver_create_temp_slot to PGC_POSTMASTER, changes the default value to 
off, and moves the logic to the startup process.
0002 changes primary_conninfo and primary_slot_name to be PGC_SIGHUP
0003 changes wal_receiver_create_temp_slot back to be PGC_SIGHUP. Michael 
Paquier asks to remove this from 0002, you ask to leave it in this thread. So, 
I made separate patch on top of 0002.

Thank you

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f1d2a77043..eb6b3fb876 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4170,7 +4170,11 @@ ANY num_sync ( ).
-The default is off.  This parameter can only be set at server start.
+The default is off.  This parameter can only be set in the 
+postgresql.conf file or on the server command line.
+   
+   
+The WAL receiver is restarted after an update of wal_receiver_create_temp_slot.

   
  
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e1f3e905bb..47d69d92d2 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -12278,8 +12278,10 @@ ProcessStartupSigHup(void)
 {
 	char	   *conninfo = pstrdup(PrimaryConnInfo);
 	char	   *slotname = pstrdup(PrimarySlotName);
+	bool		tempSlot = wal_receiver_create_temp_slot;
 	bool		conninfoChanged;
 	bool		slotnameChanged;
+	bool		tempSlotChanged = false;
 
 	ProcessConfigFile(PGC_SIGHUP);
 
@@ -12289,10 +12291,17 @@ ProcessStartupSigHup(void)
 	conninfoChanged = (strcmp(conninfo, PrimaryConnInfo) != 0);
 	slotnameChanged = (strcmp(slotname, PrimarySlotName) != 0);
 
+	/*
+	 * wal_receiver_create_temp_slot is used only when we have no slot
+	 * configured. We do not need to track this change if it has no effect.
+	 */
+	if (!slotnameChanged && strcmp(PrimarySlotName, "") == 0)
+		tempSlotChanged = (tempSlot != wal_receiver_create_temp_slot);
+
 	pfree(conninfo);
 	pfree(slotname);
 
-	if ((conninfoChanged || slotnameChanged) &&
+	if ((conninfoChanged || slotnameChanged || tempSlotChanged) &&
 		currentSource == XLOG_FROM_STREAM
 		&& WalRcvRunning())
 	{
@@ -12300,14 +12309,16 @@ ProcessStartupSigHup(void)
 			ereport(LOG,
 	(errmsg("The WAL receiver is going to be shut down due to change of %s",
 			"primary_conninfo")));
-		else if (conninfoChanged && slotnameChanged)
+		else if (conninfoChanged && (slotnameChanged || tempSlotChanged))
 			ereport(LOG,
 	(errmsg("The WAL receiver is going to be restarted due to change of %s and %s",
-			"primary_conninfo", "primary_slot_name")));
+			"primary_conninfo",
+			slotnameChanged ? "primary_slot_name" : "wal_receiver_create_temp_slot")));
 		else
 			ereport(LOG,
 	(errmsg("The WAL receiver is going to be restarted due to change of %s",
-			conninfoChanged ? "primary_conninfo" : "primary_slot_name")));
+			conninfoChanged ? "primary_conninfo"
+			: (slotnameChanged ? "primary_slot_name" : "wal_receiver_create_temp_slot";
 
 		pendingWalRcvRestart = true;
 	}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 8d9bdc0eaf..a3650518c4 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2030,7 +2030,7 @@ static struct config_bool ConfigureNamesBool[] =
 	},
 
 	{
-		{"wal_receiver_create_temp_slot", PGC_POSTMASTER, REPLICATION_STANDBY,
+		{"wal_receiver_create_temp_slot", PGC_SIGHUP, REPLICATION_STANDBY,
 			gettext_noop("Sets whether a WAL receiver should create a temporary replication slot if no permanent slot is configured."),
 		},
 		_receiver_create_temp_slot,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 31096725fd..f01e43b818 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -321,7 +321,6 @@
 	# -1 allows indefinite delay
 #wal_receiver_create_temp_slot = off	# Create temp slot if primary_slot_name
 	# is not set.
-	# (change requires restart)
 #wal_receiver_status_interval = 10s	# send replies at least this often
 	# 0 disables
 #hot_standby_feedback = off		# send info from standby to prevent
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8b742c83c5..f1d2a77043 100644
---

Re: allow online change primary_conninfo

2020-03-17 Thread Sergei Kornilov

Hello

Sorry for late replies.

> Yes. In my opinion, patch 0002 should not change the GUC mode of
> wal_receiver_create_temp_slot as the discussion here is about
> primary_conninfo, even if both may share some logic regarding WAL
> receiver shutdown and its restart triggered by the startup process.

Ok, I removed related changes from main patch. Along with minor merge conflict.

> Patch 0001 has actually been presented on this thread first:
> https://www.postgresql.org/message-id/753391579708...@iva3-77ae5995f07f.qloud-c.yandex.net
> And there is an independent patch registered in this CF:
> https://commitfest.postgresql.org/27/2456/

Yep, 0001 is separate patch. I will post copy of this patch here to make cfbot 
works.  Main patch 0002 requires resetting of is_temp_slot in 
RequestXLogStreaming to works properly.

> Should we add patch 0001 as an open item for v13 as there is a risk of
> forgetting this issue?

I think yes.

Well, it seems better to move this patch to next commitfest?

regards, Sergeidiff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 98b033fc20..12362421d7 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -129,6 +129,7 @@ extern int	recoveryTargetAction;
 extern int	recovery_min_apply_delay;
 extern char *PrimaryConnInfo;
 extern char *PrimarySlotName;
+extern bool wal_receiver_create_temp_slot;
 
 /* indirectly set via GUC system */
 extern TransactionId recoveryTargetXid;
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index e08afc6548..cf3e43128c 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -23,7 +23,6 @@
 #include "utils/tuplestore.h"
 
 /* user-settable parameters */
-extern bool wal_receiver_create_temp_slot;
 extern int	wal_receiver_status_interval;
 extern int	wal_receiver_timeout;
 extern bool hot_standby_feedback;
@@ -321,7 +320,8 @@ extern void ShutdownWalRcv(void);
 extern bool WalRcvStreaming(void);
 extern bool WalRcvRunning(void);
 extern void RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr,
- const char *conninfo, const char *slotname);
+ const char *conninfo, const char *slotname,
+ bool create_temp_slot);
 extern XLogRecPtr GetWalRcvWriteRecPtr(XLogRecPtr *latestChunkStart, TimeLineID *receiveTLI);
 extern int	GetReplicationApplyDelay(void);
 extern int	GetReplicationTransferLatency(void);
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3813eadfb4..e0f3ed5c2a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -283,6 +283,7 @@ bool		StandbyModeRequested = false;
 char	   *PrimaryConnInfo = NULL;
 char	   *PrimarySlotName = NULL;
 char	   *PromoteTriggerFile = NULL;
+bool		wal_receiver_create_temp_slot = false;
 
 /* are we currently in standby mode? */
 bool		StandbyMode = false;
@@ -11901,7 +11902,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 		}
 		curFileTLI = tli;
 		RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
-			 PrimarySlotName);
+			 PrimarySlotName, wal_receiver_create_temp_slot);
 		receivedUpto = 0;
 	}
 
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 2ab15c3cbb..ff45482faa 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -15,6 +15,12 @@
  * WalRcv->receivedUpto variable in shared memory, to inform the startup
  * process of how far it can proceed with XLOG replay.
  *
+ * WAL receivers cannot load directly GUC parameters used when establishing
+ * their connection to the primary, and rely on parameter values passed down
+ * by the startup process when WAL streaming is requested.  This applies
+ * to for example the replication slot creation and the connection string to
+ * use for the connection with the primary.
+ *
  * If the primary server ends streaming, but doesn't disconnect, walreceiver
  * goes into "waiting" mode, and waits for the startup process to give new
  * instructions. The startup process will treat that the same as
@@ -73,7 +79,6 @@
 
 
 /* GUC variables */
-bool		wal_receiver_create_temp_slot;
 int			wal_receiver_status_interval;
 int			wal_receiver_timeout;
 bool		hot_standby_feedback;
@@ -348,42 +353,23 @@ WalReceiverMain(void)
 		WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
 
 		/*
-		 * Create temporary replication slot if no slot name is configured or
-		 * the slot from the previous run was temporary, unless
-		 * wal_receiver_create_temp_slot is disabled.  We also need to handle
-		 * the case where the previous run used a temporary slot but
-		 * wal_receiver_create_temp_slot was changed in the meantime.  In that
-		 * case, we delete the old slot name in shared memory.  (This would
+		 * Create temporary replication slot if requested.  In that
+		 * case, we update slot name in shared memory.  (This would

Re: replay pause vs. standby promotion

2020-03-04 Thread Sergei Kornilov

Hello

> I want to start this discussion because this is related to the patch
> (propoesd at the thread [1]) that I'm reviewing. It does that partially,
> i.e., prefers the promotion only when the pause is requested by
> recovery_target_action=pause. But I think that it's reasonable and
> more consistent to do that whether whichever the pause is requested
> by pg_wal_replay_pause() or recovery_target_action.

+1.
I'm just not sure if this is safe for replay logic, so I did not touch this 
behavior in my proposal. (hmm, I wanted to mention this, but apparently forgot)

regards, Sergei

Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

2020-03-02 Thread Sergei Kornilov

Hello

I reviewed a recently published patch. Looks good for me.
One small note: the values for the new definitions in progress.h seems not to 
be aligned vertically. However, pgindent doesn't objects.

regards, Sergei

Re: Improve handling of parameter differences in physical replication

2020-02-27 Thread Sergei Kornilov

Hello

Thank you for working on this!

> Where this becomes a serious problem is if you have many standbys and you do 
> a failover.

+1
Several times my team would like to pause recovery instead of panic after 
change settings on primary. (same thing for create_tablespace_directories 
replay errors too...)

We documented somewhere (excluding code) shutting down the standby immediately 
upon receipt of the parameter change? doc/src/sgml/high-availability.sgml says 
only about "refuse to start".

regards, Sergei

Re: pgsql: walreceiver uses a temporary replication slot by default

2020-02-17 Thread Sergei Kornilov

Hello

> Thanks for posting this patch, Sergei. Here is a review to make
> things move on.

Thank you, here is updated patch

> The set of comments you are removing from walreceiver.c to decide if a
> temporary slot needs to be created or not should be moved to
> walreceiverfuncs.c as you move the logic from the WAL receiver startup
> phase to the moment the WAL receiver spawn is requested.

I changed this comments because they describes behavior during change value of 
wal_receiver_create_temp_slot.
But yes, I need to add some comments to RequestXLogStreaming.

> It would be more consistent with primary_conn_info and
> primary_slot_name if wal_receiver_create_temp_slot is passed down as
> an argument of RequestXLogStreaming().

Yep, I thought about that. Changed.

> As per the discussion done on this thread, let's also switch the
> parameter default to be disabled.

Done (my vote is also for disabling this option by default).

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c1128f89ec..683d87c491 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4140,11 +4140,7 @@ ANY num_sync ( ).
-The default is on.  The only reason to turn this off would be if the
-remote instance is currently out of available replication slots.  This
-parameter can only be set in the postgresql.conf
-file or on the server command line.  Changes only take effect when the
-WAL receiver process starts a new connection.
+The default is off.  This parameter can only be set at server start.

   
  
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3813eadfb4..e0f3ed5c2a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -283,6 +283,7 @@ bool		StandbyModeRequested = false;
 char	   *PrimaryConnInfo = NULL;
 char	   *PrimarySlotName = NULL;
 char	   *PromoteTriggerFile = NULL;
+bool		wal_receiver_create_temp_slot = false;
 
 /* are we currently in standby mode? */
 bool		StandbyMode = false;
@@ -11901,7 +11902,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 		}
 		curFileTLI = tli;
 		RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
-			 PrimarySlotName);
+			 PrimarySlotName, wal_receiver_create_temp_slot);
 		receivedUpto = 0;
 	}
 
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 2ab15c3cbb..01452a175f 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -73,7 +73,6 @@
 
 
 /* GUC variables */
-bool		wal_receiver_create_temp_slot;
 int			wal_receiver_status_interval;
 int			wal_receiver_timeout;
 bool		hot_standby_feedback;
@@ -348,42 +347,23 @@ WalReceiverMain(void)
 		WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
 
 		/*
-		 * Create temporary replication slot if no slot name is configured or
-		 * the slot from the previous run was temporary, unless
-		 * wal_receiver_create_temp_slot is disabled.  We also need to handle
-		 * the case where the previous run used a temporary slot but
-		 * wal_receiver_create_temp_slot was changed in the meantime.  In that
-		 * case, we delete the old slot name in shared memory.  (This would
+		 * Create temporary replication slot if requested.  In that
+		 * case, we update slot name in shared memory.  (This would
 		 * all be a bit easier if we just didn't copy the slot name into
 		 * shared memory, since we won't need it again later, but then we
 		 * can't see the slot name in the stats views.)
 		 */
-		if (slotname[0] == '\0' || is_temp_slot)
+		if (is_temp_slot)
 		{
-			bool		changed = false;
+			snprintf(slotname, sizeof(slotname),
+	 "pg_walreceiver_%lld",
+	 (long long int) walrcv_get_backend_pid(wrconn));
 
-			if (wal_receiver_create_temp_slot)
-			{
-snprintf(slotname, sizeof(slotname),
-		 "pg_walreceiver_%lld",
-		 (long long int) walrcv_get_backend_pid(wrconn));
-
-walrcv_create_slot(wrconn, slotname, true, 0, NULL);
-changed = true;
-			}
-			else if (slotname[0] != '\0')
-			{
-slotname[0] = '\0';
-changed = true;
-			}
+			walrcv_create_slot(wrconn, slotname, true, 0, NULL);
 
-			if (changed)
-			{
-SpinLockAcquire(>mutex);
-strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
-walrcv->is_temp_slot = wal_receiver_create_temp_slot;
-SpinLockRelease(>mutex);
-			}
+			SpinLockAcquire(>mutex);
+			strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
+			SpinLockRelease(>mutex);
 		}
 
 		/*
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 89c903e45a..ee0dd693b2 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -216,12 +216,13 @@ ShutdownWalRcv(void)
  * Request postmaster to start walreceiver.
  *
  * recptr indicates the position where streaming should begin,

Re: pg_basebackup -F plain -R overwrites postgresql.auto.conf

2020-02-10 Thread Sergei Kornilov

Hello

Seems bug was introduced in caba97a9d9f4d4fa2531985fd12d3cd823da06f3 - in HEAD 
only

In REL_12_STABLE we have:

boolis_recovery_guc_supported = true;

if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_RECOVERY_GUC)
is_recovery_guc_supported = false;

snprintf(filename, MAXPGPATH, "%s/%s", basedir,
 is_recovery_guc_supported ? "postgresql.auto.conf" : 
"recovery.conf");

cf = fopen(filename, is_recovery_guc_supported ? "a" : "w");

It looks correct: append mode for postgresql.auto.conf

In HEAD version is_recovery_guc_supported variable was replaced to inversed 
use_recovery_conf without change fopen mode.

regards, Sergei

Re: allow online change primary_conninfo

2020-01-31 Thread Sergei Kornilov

Hello
Small rebase due to merge conflict of the tests. No functional changes since v7.

PS: also it is end of current CF, I will mark patch entry as moved to the next 
CF.diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2e2af9e96e..7887391bbb 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4005,9 +4005,15 @@ ANY num_sync ( 
@@ -4022,9 +4028,13 @@ ANY num_sync ( ).
-  This parameter can only be set at server start.
+  This parameter can only be set in the postgresql.conf
+  file or on the server command line.
   This setting has no effect if primary_conninfo is not
-  set.
+  set or the server is not in standby mode.
+ 
+ 
+  The WAL receiver is restarted after an update of primary_slot_name.
  
 

@@ -4142,7 +4152,11 @@ ANY num_sync ( ).
 The default is on.  The only reason to turn this off would be if the
 remote instance is currently out of available replication slots.  This
-parameter can only be set at server start.
+parameter can only be set in the postgresql.conf
+file or on the server command line.
+   
+   
+The WAL receiver is restarted after an update of wal_receiver_create_temp_slot.

   
  
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 408b9b489a..c24b93332e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -803,6 +803,12 @@ static XLogSource readSource = 0;	/* XLOG_FROM_* code */
 static XLogSource currentSource = 0;	/* XLOG_FROM_* code */
 static bool lastSourceFailed = false;
 
+/*
+ * Need for restart running WalReceiver due the configuration change.
+ * Suitable only for XLOG_FROM_STREAM source
+ */
+static bool pendingWalRcvRestart = false;
+
 typedef struct XLogPageReadPrivate
 {
 	int			emode;
@@ -11831,6 +11837,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	for (;;)
 	{
 		int			oldSource = currentSource;
+		bool		startWalReceiver = false;
 
 		/*
 		 * First check if we failed to read from the current source, and
@@ -11864,54 +11871,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	if (!StandbyMode)
 		return false;
 
-	/*
-	 * If primary_conninfo is set, launch walreceiver to try
-	 * to stream the missing WAL.
-	 *
-	 * If fetching_ckpt is true, RecPtr points to the initial
-	 * checkpoint location. In that case, we use RedoStartLSN
-	 * as the streaming start position instead of RecPtr, so
-	 * that when we later jump backwards to start redo at
-	 * RedoStartLSN, we will have the logs streamed already.
-	 */
-	if (PrimaryConnInfo && strcmp(PrimaryConnInfo, "") != 0)
-	{
-		XLogRecPtr	ptr;
-		TimeLineID	tli;
-
-		if (fetching_ckpt)
-		{
-			ptr = RedoStartLSN;
-			tli = ControlFile->checkPointCopy.ThisTimeLineID;
-		}
-		else
-		{
-			ptr = RecPtr;
-
-			/*
-			 * Use the record begin position to determine the
-			 * TLI, rather than the position we're reading.
-			 */
-			tli = tliOfPointInHistory(tliRecPtr, expectedTLEs);
-
-			if (curFileTLI > 0 && tli < curFileTLI)
-elog(ERROR, "according to history file, WAL location %X/%X belongs to timeline %u, but previous recovered WAL file came from timeline %u",
-	 (uint32) (tliRecPtr >> 32),
-	 (uint32) tliRecPtr,
-	 tli, curFileTLI);
-		}
-		curFileTLI = tli;
-		RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
-			 PrimarySlotName);
-		receivedUpto = 0;
-	}
-
 	/*
 	 * Move to XLOG_FROM_STREAM state in either case. We'll
 	 * get immediate failure if we didn't launch walreceiver,
 	 * and move on to the next state.
 	 */
 	currentSource = XLOG_FROM_STREAM;
+	startWalReceiver = true;
 	break;
 
 case XLOG_FROM_STREAM:
@@ -12057,7 +12023,69 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	Assert(StandbyMode);
 
 	/*
-	 * Check if WAL receiver is still active.
+	 * shutdown WAL receiver if restart is requested.
+	 */
+	if (!startWalReceiver && pendingWalRcvRestart)
+	{
+		if (WalRcvRunning())
+			ShutdownWalRcv();
+
+		/*
+		 * Re-scan for possible new timelines if we were
+		 * requested to recover to the latest timeline.
+		 */
+		if (recoveryTargetTimeLineGoal ==
+			RECOVERY_TARGET_TIMELINE_LATEST)
+			rescanLatestTimeLine();
+
+		startWalReceiver = true;
+	}
+	pendingWalRcvRestart = false;
+
+	/*
+	 * Launch walreceiver if needed.
+	 *
+	 * If fetching_ckpt is true, RecPtr points to the initial
+	 * checkpoint location. In that case, we use RedoStartLSN
+	 * as the streaming start position instead of RecPtr, so
+	 * that when we later jump backwards to start redo

recovery_target_action=pause with confusing hint

2020-01-30 Thread Sergei Kornilov

Hello

Currently during point-in-time recovery with recovery_target_action = 'pause' 
we print log lines:

> LOG: recovery has paused
> HINT: Execute pg_wal_replay_resume() to continue.

My colleague told me that this is a terrible moment: to continue what exactly? 
It sounds like "to continue replay", similar to normal 
pg_wal_replay_pause/pg_wal_replay_resume behavior. We have just small note in 
documentation:

> The paused state can be resumed by using pg_wal_replay_resume() (see Table 
> 9.81), which then causes recovery to end.

But I think this is important place and can be improved.

Also the database does not respond to the promote signals at this stage. 
Attached patch 0001 with the test will fail.

0002 patch contains my proposed ideas:
- introduce separate message for pause due pg_wal_replay_pause call and for 
recovery_target_action.
- check for standby triggers only for recovery_target_action - I am not sure 
this would be safe for pg_wal_replay_pause() call case

Maybe more verbose hint would be appropriate:

> Execute pg_promote() to end recovery or shut down the server, change the 
> recovery target settings to a later target and restart to continue recovery

Thoughts?

regards, Sergeidiff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3813eadfb4..5ab09917c7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -860,7 +860,7 @@ static void validateRecoveryParameters(void);
 static void exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog);
 static bool recoveryStopsBefore(XLogReaderState *record);
 static bool recoveryStopsAfter(XLogReaderState *record);
-static void recoveryPausesHere(void);
+static void recoveryPausesHere(bool isRecoveryTargetAction);
 static bool recoveryApplyDelay(XLogReaderState *record);
 static void SetLatestXTime(TimestampTz xtime);
 static void SetCurrentChunkStartTime(TimestampTz xtime);
@@ -5912,20 +5912,28 @@ recoveryStopsAfter(XLogReaderState *record)
  * anyone cares about server power consumption in.
  */
 static void
-recoveryPausesHere(void)
+recoveryPausesHere(bool isRecoveryTargetAction)
 {
 	/* Don't pause unless users can connect! */
 	if (!LocalHotStandbyActive)
 		return;
 
-	ereport(LOG,
-			(errmsg("recovery has paused"),
-			 errhint("Execute pg_wal_replay_resume() to continue.")));
+	if (isRecoveryTargetAction)
+		ereport(LOG,
+(errmsg("recovery has paused"),
+errhint("Execute pg_wal_replay_resume() to promote.")));
+	else
+		ereport(LOG,
+(errmsg("recovery has paused"),
+errhint("Execute pg_wal_replay_resume() to continue.")));
 
 	while (RecoveryIsPaused())
 	{
 		pg_usleep(100L);	/* 1000 ms */
 		HandleStartupProcInterrupts();
+		/* handle promote requests */
+		if (isRecoveryTargetAction && CheckForStandbyTrigger())
+			SetRecoveryPause(false);
 	}
 }
 
@@ -7096,7 +7104,7 @@ StartupXLOG(void)
  * adding another spinlock cycle to prevent that.
  */
 if (((volatile XLogCtlData *) XLogCtl)->recoveryPause)
-	recoveryPausesHere();
+	recoveryPausesHere(false);
 
 /*
  * Have we reached our recovery target?
@@ -7121,7 +7129,7 @@ StartupXLOG(void)
 	 * work.
 	 */
 	if (((volatile XLogCtlData *) XLogCtl)->recoveryPause)
-		recoveryPausesHere();
+		recoveryPausesHere(false);
 }
 
 /* Setup error traceback support for ereport() */
@@ -7295,7 +7303,7 @@ StartupXLOG(void)
 
 	case RECOVERY_TARGET_ACTION_PAUSE:
 		SetRecoveryPause(true);
-		recoveryPausesHere();
+		recoveryPausesHere(true);
 
 		/* drop into promote */
 
diff --git a/src/test/recovery/t/003_recovery_targets.pl b/src/test/recovery/t/003_recovery_targets.pl
index fd14bab208..85afc71c66 100644
--- a/src/test/recovery/t/003_recovery_targets.pl
+++ b/src/test/recovery/t/003_recovery_targets.pl
@@ -167,3 +167,22 @@ foreach my $i (0..100)
 $logfile = slurp_file($node_standby->logfile());
 ok($logfile =~ qr/FATAL:  recovery ended before configured recovery target was reached/,
 	'recovery end before target reached is a fatal error');
+
+# react to promote on recovery_target_action = pause
+
+$node_standby = get_new_node('standby_9');
+$node_standby->init_from_backup($node_master, 'my_backup',
+has_restoring => 1, standby => 1);
+$node_standby->append_conf('postgresql.conf',
+		   "recovery_target_name = '$recovery_name'");
+$node_standby->append_conf('postgresql.conf',
+		   "recovery_target_action = 'pause'");
+$node_standby->start;
+
+# Wait until standby has replayed enough data
+my $caughtup_query =
+  "SELECT '$lsn4'::pg_lsn <= pg_last_wal_replay_lsn()";
+$node_standby->poll_query_until('postgres', $caughtup_query)
+  or die "Timed out while waiting for standby to catch up";
+
+$node_standby->promote;

Re: pgsql: walreceiver uses a temporary replication slot by default

2020-01-22 Thread Sergei Kornilov

Hello

> In short, the following things:
> - wal_receiver_create_temp_slot should be made PGC_POSTMASTER,
> similarly to primary_slot_name and primary_conninfo.
> - WalReceiverMain() should not load the parameter from the GUC context
> by itself.
> - RequestXLogStreaming(), called by the startup process, should be in
> charge of defining if a temp slot should be used or not.

I would like to cross-post here a patch with such changes that I posted in 
"allow online change primary_conninfo" thread.
This thread is more appropriate for discussion about 
wal_receiver_create_temp_slot.

PS: I posted this patch in both threads mostly to make cfbot happy.

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e07dc01e80..14992a08d7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4137,9 +4137,7 @@ ANY num_sync ( ).
 The default is on.  The only reason to turn this off would be if the
 remote instance is currently out of available replication slots.  This
-parameter can only be set in the postgresql.conf
-file or on the server command line.  Changes only take effect when the
-WAL receiver process starts a new connection.
+parameter can only be set at server start.

   
  
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7f4f784c0e..55e9294ae3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -283,6 +283,7 @@ bool		StandbyModeRequested = false;
 char	   *PrimaryConnInfo = NULL;
 char	   *PrimarySlotName = NULL;
 char	   *PromoteTriggerFile = NULL;
+bool		wal_receiver_create_temp_slot = true;
 
 /* are we currently in standby mode? */
 bool		StandbyMode = false;
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index a5e85d32f3..264b544194 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -73,7 +73,6 @@
 
 
 /* GUC variables */
-bool		wal_receiver_create_temp_slot;
 int			wal_receiver_status_interval;
 int			wal_receiver_timeout;
 bool		hot_standby_feedback;
@@ -349,42 +348,23 @@ WalReceiverMain(void)
 		WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
 
 		/*
-		 * Create temporary replication slot if no slot name is configured or
-		 * the slot from the previous run was temporary, unless
-		 * wal_receiver_create_temp_slot is disabled.  We also need to handle
-		 * the case where the previous run used a temporary slot but
-		 * wal_receiver_create_temp_slot was changed in the meantime.  In that
-		 * case, we delete the old slot name in shared memory.  (This would
+		 * Create temporary replication slot if requested.  In that
+		 * case, we update slot name in shared memory.  (This would
 		 * all be a bit easier if we just didn't copy the slot name into
 		 * shared memory, since we won't need it again later, but then we
 		 * can't see the slot name in the stats views.)
 		 */
-		if (slotname[0] == '\0' || is_temp_slot)
+		if (is_temp_slot)
 		{
-			bool		changed = false;
+			snprintf(slotname, sizeof(slotname),
+	 "pg_walreceiver_%lld",
+	 (long long int) walrcv_get_backend_pid(wrconn));
 
-			if (wal_receiver_create_temp_slot)
-			{
-snprintf(slotname, sizeof(slotname),
-		 "pg_walreceiver_%lld",
-		 (long long int) walrcv_get_backend_pid(wrconn));
-
-walrcv_create_slot(wrconn, slotname, true, 0, NULL);
-changed = true;
-			}
-			else if (slotname[0] != '\0')
-			{
-slotname[0] = '\0';
-changed = true;
-			}
+			walrcv_create_slot(wrconn, slotname, true, 0, NULL);
 
-			if (changed)
-			{
-SpinLockAcquire(>mutex);
-strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
-walrcv->is_temp_slot = wal_receiver_create_temp_slot;
-SpinLockRelease(>mutex);
-			}
+			SpinLockAcquire(>mutex);
+			strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
+			SpinLockRelease(>mutex);
 		}
 
 		/*
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 89c903e45a..a36bc83d2c 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -248,10 +248,16 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
 	else
 		walrcv->conninfo[0] = '\0';
 
-	if (slotname != NULL)
+	if (slotname != NULL && slotname[0] != '\0')
+	{
 		strlcpy((char *) walrcv->slotname, slotname, NAMEDATALEN);
+		walrcv->is_temp_slot = false;
+	}
 	else
+	{
 		walrcv->slotname[0] = '\0';
+		walrcv->is_temp_slot = wal_receiver_create_temp_slot;
+	}
 
 	if (walrcv->walRcvState == WALRCV_STOPPED)
 	{
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9f179a9129..c4b36eb2ad 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1994,7 +1994,7 @@ static struct config_bool ConfigureNamesBool[] =
 	},
 
 	{
-		{"wal_receiver_create_temp_slot", PGC_SIGHUP,

Re: allow online change primary_conninfo

2020-01-22 Thread Sergei Kornilov

Hello

> Yeah, you are right. I was not paying much attention but something
> does not stick here. My understanding is that we should have the WAL
> receiver receive the value it needs to use from the startup process
> (aka via RequestXLogStreaming from xlog.c), and that we ought to make
> this new parameter PGC_POSTMASTER instead of PGC_SIGHUP. HEAD is
> inconsistent here.

Thank you!

I attached two patches:
- first changes wal_receiver_create_temp_slot to PGC_POSTMASTER and moved the 
logic to RequestXLogStreaming
- second is based on last published v6 version of main patch. It changes 
wal_receiver_create_temp_slot back to PGC_SIGHUP along with primary_conninfo 
and primary_slot_name and will restart walreceiver if need.

Regarding the main patch: we have several possibilities for moving 
RequestXLogStreaming call. We need to choose one.
And, of course, changes in the text...

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 14992a08d7..19f36a3318 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4000,9 +4000,15 @@ ANY num_sync ( 
@@ -4017,9 +4023,13 @@ ANY num_sync ( ).
-  This parameter can only be set at server start.
+  This parameter can only be set in the postgresql.conf
+  file or on the server command line.
   This setting has no effect if primary_conninfo is not
-  set.
+  set or the server is not in standby mode.
+ 
+ 
+  The WAL receiver is restarted after an update of primary_slot_name.
  
 

@@ -4137,7 +4147,11 @@ ANY num_sync ( ).
 The default is on.  The only reason to turn this off would be if the
 remote instance is currently out of available replication slots.  This
-parameter can only be set at server start.
+parameter can only be set in the postgresql.conf
+file or on the server command line.
+   
+   
+The WAL receiver is restarted after an update of wal_receiver_create_temp_slot.

   
  
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 55e9294ae3..2861b9f22a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -803,6 +803,12 @@ static XLogSource readSource = 0;	/* XLOG_FROM_* code */
 static XLogSource currentSource = 0;	/* XLOG_FROM_* code */
 static bool lastSourceFailed = false;
 
+/*
+ * Need for restart running WalReceiver due the configuration change.
+ * Suitable only for XLOG_FROM_STREAM source
+ */
+static bool pendingWalRcvRestart = false;
+
 typedef struct XLogPageReadPrivate
 {
 	int			emode;
@@ -11818,6 +11824,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	for (;;)
 	{
 		int			oldSource = currentSource;
+		bool		startWalReceiver = false;
 
 		/*
 		 * First check if we failed to read from the current source, and
@@ -11851,54 +11858,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	if (!StandbyMode)
 		return false;
 
-	/*
-	 * If primary_conninfo is set, launch walreceiver to try
-	 * to stream the missing WAL.
-	 *
-	 * If fetching_ckpt is true, RecPtr points to the initial
-	 * checkpoint location. In that case, we use RedoStartLSN
-	 * as the streaming start position instead of RecPtr, so
-	 * that when we later jump backwards to start redo at
-	 * RedoStartLSN, we will have the logs streamed already.
-	 */
-	if (PrimaryConnInfo && strcmp(PrimaryConnInfo, "") != 0)
-	{
-		XLogRecPtr	ptr;
-		TimeLineID	tli;
-
-		if (fetching_ckpt)
-		{
-			ptr = RedoStartLSN;
-			tli = ControlFile->checkPointCopy.ThisTimeLineID;
-		}
-		else
-		{
-			ptr = RecPtr;
-
-			/*
-			 * Use the record begin position to determine the
-			 * TLI, rather than the position we're reading.
-			 */
-			tli = tliOfPointInHistory(tliRecPtr, expectedTLEs);
-
-			if (curFileTLI > 0 && tli < curFileTLI)
-elog(ERROR, "according to history file, WAL location %X/%X belongs to timeline %u, but previous recovered WAL file came from timeline %u",
-	 (uint32) (tliRecPtr >> 32),
-	 (uint32) tliRecPtr,
-	 tli, curFileTLI);
-		}
-		curFileTLI = tli;
-		RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
-			 PrimarySlotName);
-		receivedUpto = 0;
-	}
-
 	/*
 	 * Move to XLOG_FROM_STREAM state in either case. We'll
 	 * get immediate failure if we didn't launch walreceiver,
 	 * and move on to the next state.
 	 */
 	currentSource = XLOG_FROM_STREAM;
+	startWalReceiver = true;
 	break;
 
 case XLOG_FROM_STREAM:
@@ -12044,7 +12010,69 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	Assert(StandbyMode);
 
 	/*
-	 * Check if WAL receiver is still active.
+	 * shutdown WAL receiver if restart is requested.
+	 */
+	if

Re: allow online change primary_conninfo

2020-01-21 Thread Sergei Kornilov

Hello

"Waiting on Author" is the right status. I found logical conflict with recently 
added wal_receiver_create_temp_slot GUC and my tests are currently broken. Will 
fix it and post new version.

PS: also, I surprised why it's ok for wal_receiver_create_temp_slot to be 
PGC_SIGHUP and ignore change of this setting until walreceiver will reconnect 
by unrelated reason. I means walreceiver does nothing special on SIGHUP. In 
common case change of wal_receiver_create_temp_slot setting will have effect 
only during restart of walreceiver process. And therefore we will switch to 
archive recovery. But such design was strongly rejected for my patch year ago.

regards, Sergei

Re: Rearranging ALTER TABLE to avoid multi-operations bugs

2020-01-15 Thread Sergei Kornilov

Hello

Thank you!

I am clearly not a good reviewer for such changes... But for a note: I read the 
v4 patch and have no useful comments. Good new tests, reasonable code changes 
to fix multiple bug reports.

The patch is proposed only for the master branch, right?

regards, Sergei

Re: ALTER TABLE support for dropping generation expression

2020-01-13 Thread Sergei Kornilov

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

Thank you!
Looks good to me. I have no further comments. I'll mark as ready for committer.

The new status of this patch is: Ready for Committer

Re: [HACKERS] Block level parallel vacuum

2020-01-13 Thread Sergei Kornilov

Hello

> I just thought they were concerned
> that the variable name skip_index might be confusing because we skip
> if skip_index is NOT true.

Right.

>>  > - bool skip_index = (get_indstats(lps->lvshared, i) == NULL ||
>>  > - skip_parallel_vacuum_index(Irel[i], lps->lvshared));
>>  > + bool can_parallel = (get_indstats(lps->lvshared, i) == NULL ||
>>  > + skip_parallel_vacuum_index(Irel[i],
>>  > + lps->lvshared));
>>  >
>>  > The above condition is true when the index can *not* do parallel index 
>> vacuum.

Ouch, right. I was wrong. (or the variable name and the comment really confused 
me)

> Okay, would it better if we get rid of this variable and have code like below?
>
> /* Skip the indexes that can be processed by parallel workers */
> if ( !(get_indstats(lps->lvshared, i) == NULL ||
> skip_parallel_vacuum_index(Irel[i], lps->lvshared)))
> continue;

Complex condition... Not sure.

> How about changing it to skipped_index and change the comment to something 
> like “We are interested in only index skipped parallel vacuum”?

I prefer this idea.

> Today, again thinking about it, it seems
> the idea Mahendra is suggesting that is giving an error if the
> parallel degree is not specified seems reasonable to me.

+1

regards, Sergei

Re: [HACKERS] Block level parallel vacuum

2020-01-10 Thread Sergei Kornilov

Hello

> Yes, we should improve this. I tried to fix this. Attaching a delta
> patch that is fixing both the comments.

Thank you, I have no objections.

I think that status of CF entry is outdated and the most appropriate status for 
this patch is "Ready to Commiter". Changed. I also added an annotation with a 
link to recently summarized results.

regards, Sergei

Re: ALTER TABLE support for dropping generation expression

2020-01-10 Thread Sergei Kornilov

Hello

Thank you, but I am late: patch has another merge conflict.

Conflict seems trivial and patch looks fine for me.

regards, Sergei

Re: [HACKERS] Block level parallel vacuum

2020-01-10 Thread Sergei Kornilov

Hi
Thank you for update! I looked again

(vacuum_indexes_leader)
+   /* Skip the indexes that can be processed by parallel workers */
+   if (!skip_index)
+   continue;

Does the variable name skip_index not confuse here? Maybe rename to something 
like can_parallel?

Another question about behavior on temporary tables. Use case: the user 
commands just "vacuum;" to vacuum entire database (and has enough maintenance 
workers). Vacuum starts fine in parallel, but on first temporary table we hit:

+   if (RelationUsesLocalBuffers(onerel) && params->nworkers >= 0)
+   {
+   ereport(WARNING,
+   (errmsg("disabling parallel option of vacuum on 
\"%s\" --- cannot vacuum temporary tables in parallel",
+   
RelationGetRelationName(onerel;
+   params->nworkers = -1;
+   }

And therefore we turn off the parallel vacuum for the remaining tables... Can 
we improve this case?

regards, Sergei

Re: [HACKERS] Block level parallel vacuum

2020-01-09 Thread Sergei Kornilov

Hello

I noticed that parallel vacuum uses min_parallel_index_scan_size GUC to skip 
small indexes but this is not mentioned in documentation for both vacuum 
command and GUC itself.

+   /* Determine the number of parallel workers to launch */
+   if (lps->lvshared->for_cleanup)
+   {
+   if (lps->lvshared->first_time)
+   nworkers = lps->nindexes_parallel_cleanup +
+   lps->nindexes_parallel_condcleanup - 1;
+   else
+   nworkers = lps->nindexes_parallel_cleanup - 1;
+
+   }
+   else
+   nworkers = lps->nindexes_parallel_bulkdel - 1;

(lazy_parallel_vacuum_indexes)
Perhaps we need to add a comment for future readers, why we reduce the number 
of workers by 1. Maybe this would be cleaner?

+   /* Determine the number of parallel workers to launch */
+   if (lps->lvshared->for_cleanup)
+   {
+   if (lps->lvshared->first_time)
+   nworkers = lps->nindexes_parallel_cleanup +
+   lps->nindexes_parallel_condcleanup;
+   else
+   nworkers = lps->nindexes_parallel_cleanup;
+
+   }
+   else
+   nworkers = lps->nindexes_parallel_bulkdel;
+
+   /* The leader process will participate */
+   nworkers--;

I have no more comments after reading the patches.

regards, Sergei

Re: Expose lock group leader pid in pg_stat_activity

2019-12-27 Thread Sergei Kornilov

Hello

> As I understand it, lock group is some infrastructure that is used by
> parallel queries, but could be used for something else too. So if
> more documentation is needed, we should say something like "For now,
> only parallel queries can have a lock group" or something like that.

If lockGroupLeader will be used in some way for non-parallel query, then the 
name leader_pid could be confusing. No?
I treat pg_stat_activity as view for user. We have document somewhere what is 
"lock group leader" (excepts README in source tree)? I meant user going to read 
documentation, "ok, this field is process ID of the lock group leader, but what 
is it?". Expose a leader pid for parallel worker will be clear improvement for 
user. And seems lockGroupLeader->pid is exactly this stuff. Therefore, I would 
like to see such description and meaning of the field.

> The fact that leader_pid == pid for the leader and different for the
> other member should be obvious, I'm not sure that it's worth
> documenting that.

It may be not obvious that leader_pid is not null in this case. But ok, no 
objections.

regards, Sergei

Re: Expose lock group leader pid in pg_stat_activity

2019-12-26 Thread Sergei Kornilov

Hello

I doubt that "Process ID of the lock group leader" is enough for user 
documentation. I think we need note:
- this field is related to parallel query execution
- leader_pid = pid if process is parallel leader
- leader_pid would point to pid of the leader if process is parallel worker
- leader_pid will be NULL for non-parallel queries or idle sessions

Also patch has no tests. Possible this is normal, not sure how to write a 
reliable test for this feature.
Patch applies, compiles, pass tests

regards, Sergei

Re: Online checksums patch - once again

2019-12-25 Thread Sergei Kornilov

Hello

> Attached is a v15 of the online checksums patchset (minus 0005), rebased on 
> top
> of your v3 ProcSignalBarrier patch rather than Andres' PoC GlobalBarrier 
> patch.
> It does take the, perhaps, controversial approach of replacing the SAMPLE
> barrier with the CHECKSUM barrier. The cfbot will be angry since this email
> doesn't contain the procsignalbarrier patch, but it sounded like that would go
> in shortly so opted for that.

ProcSignalBarrier was committed, so online checksums patchset has no other 
pending dependencies and should be applied cleanly on master. Right? The 
patchset needs another rebase in this case, does not apply...

regards, Sergei

Re: ALTER TABLE support for dropping generation expression

2019-12-25 Thread Sergei Kornilov

The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   tested, failed
Spec compliant:   not tested
Documentation:tested, passed

Hello

Patch does not apply to master. Could you rebase?

Code looks good and very similar to "ALTER TABLE ... ALTER COLUMN ... DROP 
IDENTITY"

I noticed one bug:

create table testdrop (i int, b int, m int GENERATED ALWAYS AS ( i*2) stored);
insert into testdrop(i,b) values (3,4);
alter table testdrop alter COLUMN m drop expression ;
alter table testdrop drop column i;

Here is no "m" column anymore. Possible due some forgotten dependency?

regards, Sergei

The new status of this patch is: Waiting on Author

Re: global / super barriers (for checksums)

2019-12-17 Thread Sergei Kornilov

Hi

> Andrew Gierth complained about this too over on -committers, and I saw
> his message first and pushed a fix. It includes the first and third
> hunks from your proposed patch, but not the second one.

Yep, I received his email just after sending mine. Thanks, my build is clean 
now.

regards, Sergei

Re: global / super barriers (for checksums)

2019-12-17 Thread Sergei Kornilov

Hello

> Stellar. If nobody objects in the meantime, I plan to commit 0001-0003
> next week.

My compiler (gcc 8.3.0) is not happy with recent 
5910d6c7e311f0b14e3d3cb9ce3597c01d3a3cde commit:

autovacuum.c:831:1: error: ‘AutoVacLauncherShutdown’ was used with no prototype 
before its definition [-Werror=missing-prototypes]
checkpointer.c:524:1: error: ‘HandleCheckpointerInterrupts’ was used with no 
prototype before its definition [-Werror=missing-prototypes]

I think definition should looks as in attached patch. With this change build is 
clean

regards, Sergeidiff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1792008ebe..47b4bacadb 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -311,7 +311,7 @@ NON_EXEC_STATIC void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_no
 
 static Oid	do_start_worker(void);
 static void HandleAutoVacLauncherInterrupts(void);
-static void AutoVacLauncherShutdown() pg_attribute_noreturn();
+static void AutoVacLauncherShutdown(void) pg_attribute_noreturn();
 static void launcher_determine_sleep(bool canlaunch, bool recursing,
 	 struct timeval *nap);
 static void launch_worker(TimestampTz now);
@@ -828,7 +828,7 @@ HandleAutoVacLauncherInterrupts(void)
  * Perform a normal exit from the autovac launcher.
  */
 static void
-AutoVacLauncherShutdown()
+AutoVacLauncherShutdown(void)
 {
 	ereport(DEBUG1,
 			(errmsg("autovacuum launcher shutting down")));
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 3f35b324c3..df527ac021 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -161,7 +161,7 @@ static pg_time_t last_xlog_switch_time;
 
 /* Prototypes for private functions */
 
-static void HandleCheckpointerInterrupts();
+static void HandleCheckpointerInterrupts(void);
 static void CheckArchiveTimeout(void);
 static bool IsCheckpointOnSchedule(double progress);
 static bool ImmediateCheckpointRequested(void);

Re: Sketch of a fix for that truncation data corruption issue

2019-12-05 Thread Sergei Kornilov

Hello

>>  > Also, I'm not entirely sure whether there's anything in our various
>>  > replication logic that's dependent on vacuum truncation taking AEL.
>>  > Offhand I'd expect the reduced use of AEL to be a plus, but maybe
>>  > I'm missing something.
>>
>>  It'd be a *MAJOR* plus. One of the biggest operational headaches for
>>  using a HS node for querying is that there'll often be conflicts due to
>>  vacuum truncating relations (which logs an AEL), even if
>>  hot_standby_feedback is used. There's been multiple proposals to
>>  allow disabling truncations just because of that.
>
> Huge +1 from me here, we've seen this too. Getting rid of the conflict
> when using a HS node for querying would be fantastic.

One small ping... This topic has been inactive for a long time. But that would 
be a great improvement for any future release. I observe such problems from 
time to time... (so far, we have at least a workaround with the vacuum_trunk 
option)

regards, Sergei

Re: [HACKERS] Block level parallel vacuum

2019-12-01 Thread Sergei Kornilov

Hi

> I think I got your point. Your proposal is that it's more efficient if
> we make the leader process vacuum the index that can be processed only
> the leader process (i.e. indexes not supporting parallel index vacuum)
> while workers are processing indexes supporting parallel index vacuum,
> right? That way, we can process indexes in parallel as much as
> possible.

Right

> So maybe we can call vacuum_or_cleanup_skipped_indexes first
> and then call vacuum_or_cleanup_indexes_worker. But I'm not sure that
> there are parallel-safe remaining indexes after the leader finished
> vacuum_or_cleanup_indexes_worker, as described on your proposal.

I meant that after processing missing indexes (not supporting parallel index 
vacuum), the leader can start processing indexes that support the parallel 
index vacuum, along with parallel workers.
Exactly call vacuum_or_cleanup_skipped_indexes after start parallel workers but 
before vacuum_or_cleanup_indexes_worker or something with similar effect.
If we have 0 missed indexes - parallel vacuum will run as in current 
implementation, with leader participation.

Sorry for my unclear english...

regards, Sergei

Re: [HACKERS] Block level parallel vacuum

2019-12-01 Thread Sergei Kornilov

Hi

> I think the advantage of the current approach is that once the parallel 
> workers are launched, the leader can process indexes that don't support 
> parallelism.  So, both type of indexes can be processed at the same time.

In lazy_parallel_vacuum_or_cleanup_indexes I see:

/*
 * Join as a parallel worker. The leader process alone does that in
 * case where no workers launched.
 */
if (lps->leaderparticipates || lps->pcxt->nworkers_launched == 0)
vacuum_or_cleanup_indexes_worker(Irel, nindexes, stats, 
lps->lvshared,

 vacrelstats->dead_tuples);

/*
 * Here, the indexes that had been skipped during parallel index 
vacuuming
 * are remaining. If there are such indexes the leader process does 
vacuum
 * or cleanup them one by one.
 */
vacuum_or_cleanup_skipped_indexes(vacrelstats, Irel, nindexes, stats,
  lps);

So parallel leader will process parallel indexes first along with parallel 
workers and skip non-parallel ones. Only after end of the index list parallel 
leader will process non-parallel indexes one by one. In case of equal index 
processing time parallel leader will process (count of parallel 
indexes)/(nworkers+1) + all non-parallel, while parallel workers will process 
(count of parallel indexes)/(nworkers+1).  I am wrong here?

regards, Sergei

Re: [HACKERS] Block level parallel vacuum

2019-11-30 Thread Sergei Kornilov

Hello

Its possible to change order of index processing by parallel leader? In v35 
patchset I see following order:
- start parallel processes
- leader and parallel workers processed index lixt and possible skip some 
entries
- after that parallel leader recheck index list and process the skipped indexes
- WaitForParallelWorkersToFinish

I think it would be better to:
- start parallel processes
- parallel leader goes through index list and process only indexes which are 
skip_parallel_index_vacuum = true
- parallel workers processes indexes with skip_parallel_index_vacuum = false
- parallel leader start participate with remainings parallel-safe index 
processing
- WaitForParallelWorkersToFinish

This would be less running time and better load balance across leader and 
workers in case of few non-parallel and few parallel indexes.
(if this is expected and required by some reason, we need a comment in code)

Also few notes to vacuumdb:
Seems we need version check at least in vacuum_one_database and 
prepare_vacuum_command. Similar to SKIP_LOCKED or DISABLE_PAGE_SKIPPING 
features.
discussion question: difference between --parallel and --jobs parameters will 
be confusing? We need more description for this options?

regards, Sergei

Re: could not stat promote trigger file leads to shutdown

2019-11-15 Thread Sergei Kornilov

Hello

> Maybe we need a new elevel category for that.
> SYSTEM_WARNING or LOG_WARNING, perhaps?

I think a separate levels for user warnings and system warnings (and errors) 
would be great for log analytics. Error due to user typo in query is not the 
same as cache lookup error (for example).

regards, Sergei

Re: base backup client as auxiliary backend process

2019-11-15 Thread Sergei Kornilov

Hello

Could you rebase patch please? I have errors during patch apply. CFbot checks 
latest demonstration patch.

> I looked into this. It seems trivial to make walsender create and use a
> temporary replication slot by default if no permanent replication slot
> is specified. This is basically the logic that pg_basebackup has but
> done server-side. See attached patch for a demonstration. Any reason
> not to do that?

Seems this would break pg_basebackup --no-slot option?

> +  Do not copy configuration files, that is, files that end in
> +  .conf.

possible we need ignore *.signal files too?

> +/*
> + * XXX copied from pg_basebackup.c
> + */
> +
> +unsigned long long totaldone;
> +unsigned long long totalsize_kb;
> +int tablespacenum;
> +int tablespacecount;

Variable declaration in the middle of file is correct for coding style? Not a 
problem for me, I just want to clarify.
Should not be declared "static"?
Also how about tablespacedone instead of tablespacenum?

> The updated has support for tablespaces without mapping.  I'm thinking 
> about putting the mapping specification into a GUC list somehow. 
> Shouldn't be too hard.

I think we can leave tablespace mapping for pg_basebackup only. More powerful 
tool for less common scenarios. Or for another future patch.

regards, Sergei

Re: allow online change primary_conninfo

2019-10-31 Thread Sergei Kornilov

Hello

> So, I'd like to propose to move the stuff to the second switch().
> (See the attached incomplete patch.) This is rather similar to
> Sergei's previous proposal, but the structure of the state
> machine is kept.

Very similar to my v4 proposal (also move RequestXLogStreaming call), but 
closer to currentSource reading. No objections from me, attached patch is 
changed this way.
I renamed start_wal_receiver to startWalReceiver - this style looks more 
consistent to near code.

> + /*
> +  * Else, check if WAL receiver is still 
> active.
> +  */
> + else if (!WalRcvStreaming())

I think we still need wait WalRcvStreaming after RequestXLogStreaming call. So 
I remove else branch and leave separate condition.

> In ProcessStartupSigHup, conninfo and slotname don't need to be
> retained until the end of the function.

Agreed, I move pfree

> The log message in the function seems to be too detailed. On the
> other hand, if we changed primary_conninfo to '' (stop) or vise
> versa (start), the message (restart) looks strange.

I have no strong opinion here. These messages was changed many times during 
this thread lifetime, can be changed anytime. I think this is not issue since 
we have no consensus about overall design.
Write detailed messages was proposed here: 
https://www.postgresql.org/message-id/20190216151025.GJ2240%40paquier.xyz

> or vise versa (start)

I explicitly check currentSource and WalRcvRunning, so we have no such messages 
if user had no walreceiver before.

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0191ec84b1..587031dea8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3983,9 +3983,15 @@ ANY num_sync ( 
@@ -4000,9 +4006,13 @@ ANY num_sync ( ).
-  This parameter can only be set at server start.
+  This parameter can only be set in the postgresql.conf
+  file or on the server command line.
   This setting has no effect if primary_conninfo is not
-  set.
+  set or the server is not in standby mode.
+ 
+ 
+  The WAL receiver is restarted after an update of primary_slot_name.
  
 

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2e3cc51006..8eee079eb2 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -803,6 +803,12 @@ static XLogSource readSource = 0;	/* XLOG_FROM_* code */
 static XLogSource currentSource = 0;	/* XLOG_FROM_* code */
 static bool lastSourceFailed = false;
 
+/*
+ * Need for restart running WalReceiver due the configuration change.
+ * Suitable only for XLOG_FROM_STREAM source
+ */
+static bool pendingWalRcvRestart = false;
+
 typedef struct XLogPageReadPrivate
 {
 	int			emode;
@@ -11773,6 +11779,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	for (;;)
 	{
 		int			oldSource = currentSource;
+		bool		startWalReceiver = false;
 
 		/*
 		 * First check if we failed to read from the current source, and
@@ -11806,54 +11813,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	if (!StandbyMode)
 		return false;
 
-	/*
-	 * If primary_conninfo is set, launch walreceiver to try
-	 * to stream the missing WAL.
-	 *
-	 * If fetching_ckpt is true, RecPtr points to the initial
-	 * checkpoint location. In that case, we use RedoStartLSN
-	 * as the streaming start position instead of RecPtr, so
-	 * that when we later jump backwards to start redo at
-	 * RedoStartLSN, we will have the logs streamed already.
-	 */
-	if (PrimaryConnInfo && strcmp(PrimaryConnInfo, "") != 0)
-	{
-		XLogRecPtr	ptr;
-		TimeLineID	tli;
-
-		if (fetching_ckpt)
-		{
-			ptr = RedoStartLSN;
-			tli = ControlFile->checkPointCopy.ThisTimeLineID;
-		}
-		else
-		{
-			ptr = RecPtr;
-
-			/*
-			 * Use the record begin position to determine the
-			 * TLI, rather than the position we're reading.
-			 */
-			tli = tliOfPointInHistory(tliRecPtr, expectedTLEs);
-
-			if (curFileTLI > 0 && tli < curFileTLI)
-elog(ERROR, "according to history file, WAL location %X/%X belongs to timeline %u, but previous recovered WAL file came from timeline %u",
-	 (uint32) (tliRecPtr >> 32),
-	 (uint32) tliRecPtr,
-	 tli, curFileTLI);
-		}
-		curFileTLI = tli;
-		RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
-			 PrimarySlotName);
-		receivedUpto = 0;
-	}
-
 	/*
 	 * Move to XLOG_FROM_STREAM state in either case. We'll
 	 * get immediate failure if we didn't launch walreceiver,
 	 * and move on to the next state.
 	 */
 	currentSource = XLOG_FROM_STREAM;
+	startWalReceiver = true;
 	break;
 
 case

Re: allow online change primary_conninfo

2019-09-19 Thread Sergei Kornilov

Hello

Thank you for review!

> - This parameter can only be set at server start.
> + This parameter can only be set in the postgresql.conf
> + file or on the server command line.
>
> I'm not sure it's good to change the context of this
> description. This was mentioning that changing of this parameter
> requires server (re)start. So if we want to be on the same
> context after rewriting, it would be like "This parameter can be
> set any time and causes WAL receiver restart with the new setting
> if the server is in standby mode."

Was written such way after this review: 
https://www.postgresql.org/message-id/20181125214313.lydvmrraqjfrb3s2%40alap3.anarazel.de

> And If I'm not missing something, I don't find an (explict)
> paramter of postmaster for setting primary_conninfo.

Well, we have common -c option: -c name=value

> Couldn't we do the same thing by just skipping the wait and
> setting lastSourceFailed to true in the case of intentional
> walreceiver restart?

Yes, it's possible. Let's see... Done in attached variant.
We need check pendingWalRcvRestart before rescanLatestTimeLine lines.

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6612f95f9f..1fa48058e8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3929,9 +3929,15 @@ ANY num_sync ( 
@@ -3946,9 +3952,13 @@ ANY num_sync ( ).
-  This parameter can only be set at server start.
+  This parameter can only be set in the postgresql.conf
+  file or on the server command line.
   This setting has no effect if primary_conninfo is not
-  set.
+  set or the server is not in standby mode.
+ 
+ 
+  The WAL receiver is restarted after an update of primary_slot_name.
  
 

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b7ff004234..f0e47cc663 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -803,6 +803,12 @@ static XLogSource readSource = 0;	/* XLOG_FROM_* code */
 static XLogSource currentSource = 0;	/* XLOG_FROM_* code */
 static bool lastSourceFailed = false;
 
+/*
+ * Need for restart running WalReceiver due the configuration change.
+ * Suitable only for XLOG_FROM_STREAM source
+ */
+static bool pendingWalRcvRestart = false;
+
 typedef struct XLogPageReadPrivate
 {
 	int			emode;
@@ -11756,12 +11762,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 		int			oldSource = currentSource;
 
 		/*
-		 * First check if we failed to read from the current source, and
+		 * First check if we failed to read from the current source or if
+		 * we want to restart wal receiver, and
 		 * advance the state machine if so. The failure to read might've
 		 * happened outside this function, e.g when a CRC check fails on a
 		 * record, or within this loop.
 		 */
-		if (lastSourceFailed)
+		if (lastSourceFailed || pendingWalRcvRestart)
 		{
 			switch (currentSource)
 			{
@@ -11862,6 +11869,17 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	if (WalRcvStreaming())
 		ShutdownWalRcv();
 
+	/*
+	 * If wal receiver is requested to restart, we skip the
+	 * next XLOG_FROM_ARCHIVE to immediately starting it.
+	 */
+	if (pendingWalRcvRestart)
+	{
+		lastSourceFailed = true;
+		currentSource = XLOG_FROM_ARCHIVE;
+		continue;
+	}
+
 	/*
 	 * Before we sleep, re-scan for possible new timelines if
 	 * we were requested to recover to the latest timeline.
@@ -11881,7 +11899,8 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	 * obtaining the requested WAL. We're going to loop back
 	 * and retry from the archive, but if it hasn't been long
 	 * since last attempt, sleep wal_retrieve_retry_interval
-	 * milliseconds to avoid busy-waiting.
+	 * milliseconds to avoid busy-waiting. We don't wait if
+	 * explicitly requested to restart.
 	 */
 	now = GetCurrentTimestamp();
 	if (!TimestampDifferenceExceeds(last_fail_time, now,
@@ -11922,16 +11941,17 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 currentSource = XLOG_FROM_ARCHIVE;
 		}
 
-		if (currentSource != oldSource)
+		if (currentSource != oldSource && !pendingWalRcvRestart)
 			elog(DEBUG2, "switched WAL source from %s to %s after %s",
  xlogSourceNames[oldSource], xlogSourceNames[currentSource],
  lastSourceFailed ? "failure" : "success");
 
 		/*
-		 * We've now handled possible failure. Try to read from the chosen
-		 * source.
+		 * We've now handled possible failure and configuration change. Try to
+		 * read from the chosen source.
 		 */
 		lastSourceFailed = false;
+		pendingWalRcvRestart = false;
 
 		switch (currentSource)
 		{
@@ -12096,6 +12116,45 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	return false;/* not reached */
 }
 
+/*
+ * Re-read config file and plan to

Re: allow online change primary_conninfo

2019-09-19 Thread Sergei Kornilov

Hello

Thank you for review!

> ISTM that you need to update the above parts in postgresql.conf.sample.

Good catch, I forgot about conf sample.

> ISTM that you need to update the above comment in walreceiver.c.

Changed

> If primary_conninfo is set to an empty string, walreceiver just shuts down,
> not restarts. The above description in the doc is not always true.
> So I'm thinking something like the following description is better.
> Thought?
>
> If primary_conninfo is changed while WAL receiver is running,
> the WAL receiver shuts down and then restarts with new setting,
> except when primary_conninfo is an empty string.

Ok, changed.
I leave primary_slot_name description as before.

> There is the case where walreceiver doesn't shut down immediately
> after the change of primary_conninfo. If the change happens while
> the startup process in paused state (e.g., by pg_wal_replay_pause(),
> recovery_min_apply_delay, recovery conflict, etc), the startup
> process tries to terminate walreceiver after it gets out of such state.
> Shouldn't this fact be documented as a note?

Hmm. Is somewhere documented that walreceiver will receive WAL while the 
startup process in paused state?
(didn't add such note in current version)

regards, Sergeidiff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6612f95f9f..1fa48058e8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3929,9 +3929,15 @@ ANY num_sync ( 
@@ -3946,9 +3952,13 @@ ANY num_sync ( ).
-  This parameter can only be set at server start.
+  This parameter can only be set in the postgresql.conf
+  file or on the server command line.
   This setting has no effect if primary_conninfo is not
-  set.
+  set or the server is not in standby mode.
+ 
+ 
+  The WAL receiver is restarted after an update of primary_slot_name.
  
 

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b7ff004234..743cae079e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -803,6 +803,12 @@ static XLogSource readSource = 0;	/* XLOG_FROM_* code */
 static XLogSource currentSource = 0;	/* XLOG_FROM_* code */
 static bool lastSourceFailed = false;
 
+/*
+ * Need for restart running WalReceiver due the configuration change.
+ * Suitable only for XLOG_FROM_STREAM source
+ */
+static bool pendingWalRcvRestart = false;
+
 typedef struct XLogPageReadPrivate
 {
 	int			emode;
@@ -11787,48 +11793,6 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	if (!StandbyMode)
 		return false;
 
-	/*
-	 * If primary_conninfo is set, launch walreceiver to try
-	 * to stream the missing WAL.
-	 *
-	 * If fetching_ckpt is true, RecPtr points to the initial
-	 * checkpoint location. In that case, we use RedoStartLSN
-	 * as the streaming start position instead of RecPtr, so
-	 * that when we later jump backwards to start redo at
-	 * RedoStartLSN, we will have the logs streamed already.
-	 */
-	if (PrimaryConnInfo && strcmp(PrimaryConnInfo, "") != 0)
-	{
-		XLogRecPtr	ptr;
-		TimeLineID	tli;
-
-		if (fetching_ckpt)
-		{
-			ptr = RedoStartLSN;
-			tli = ControlFile->checkPointCopy.ThisTimeLineID;
-		}
-		else
-		{
-			ptr = RecPtr;
-
-			/*
-			 * Use the record begin position to determine the
-			 * TLI, rather than the position we're reading.
-			 */
-			tli = tliOfPointInHistory(tliRecPtr, expectedTLEs);
-
-			if (curFileTLI > 0 && tli < curFileTLI)
-elog(ERROR, "according to history file, WAL location %X/%X belongs to timeline %u, but previous recovered WAL file came from timeline %u",
-	 (uint32) (tliRecPtr >> 32),
-	 (uint32) tliRecPtr,
-	 tli, curFileTLI);
-		}
-		curFileTLI = tli;
-		RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
-			 PrimarySlotName);
-		receivedUpto = 0;
-	}
-
 	/*
 	 * Move to XLOG_FROM_STREAM state in either case. We'll
 	 * get immediate failure if we didn't launch walreceiver,
@@ -11928,10 +11892,66 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
  lastSourceFailed ? "failure" : "success");
 
 		/*
-		 * We've now handled possible failure. Try to read from the chosen
-		 * source.
+		 * Request walreceiver to start if we switch from another source or if
+		 * we need to change walreceiver connection configuration.
+		 */
+		if (currentSource == XLOG_FROM_STREAM && (lastSourceFailed || pendingWalRcvRestart))
+		{
+			/*
+			 * Ensure walreceiver is not running
+			 */
+			if (WalRcvRunning())
+ShutdownWalRcv();
+
+			/*
+			 * If primary_conninfo is set, launch walreceiver to try to stream
+			 * the missing WAL.
+			 *
+			 * If fetching_ckpt is true, RecPtr points to the initial
+			 * checkpoint location. In that case, we

Re: Change ereport level for QueuePartitionConstraintValidation

2019-09-08 Thread Sergei Kornilov

Hello

> Hearing no comments, I've pushed that patch, and marked the v12
> open item closed.

Thank you!

regards, Sergei

Re: Planning counters in pg_stat_statements (using pgss_store)

2019-09-04 Thread Sergei Kornilov

Hello

I think the most important question for this topic is performance penalty.
It was a long story, first test on my desktop was too volatile. I setup 
separate PC with DB only and test few cases.

PC spec: 2-core Intel Core 2 Duo E6550, 4GB ram, mechanical HDD
All tests on top 7dedfd22b79822b7f4210e6255b672ea82db6678 commit, build via 
./configure  --prefix=/home/melkij/tmp/ --enable-tap-tests
DB settings:
  listen_addresses = '*'
  log_line_prefix = '%m %p %u@%d from %h [vxid:%v txid:%x] [%i] '
  lc_messages = 'C'
  shared_buffers = 512MB

pgbench runned from different host, in same L2 network.
Database was generated by: pgbench -s 10 -i -h hostname postgres
After database start I run:
  create extension if not exists pg_prewarm;
  select count(*), sum(pg_prewarm) from pg_tables join 
pg_prewarm(tablename::regclass) on true where schemaname= 'public';
  select count(*), sum(pg_prewarm) from pg_indexes join 
pg_prewarm(indexname::regclass) on true where schemaname= 'public';
So all data was in buffers.

Load generated by command: pgbench --builtin=select-only --time=300 -n -c 10 -h 
hostname postgres -M (vary)

Tests are:
head_no_pgss - unpatched version, empty shared_preload_libraries
head_track_none - unpatched version with:
  shared_preload_libraries = 'pg_stat_statements'
  pg_stat_statements.max = 5000
  pg_stat_statements.track = none
  pg_stat_statements.save = off
  pg_stat_statements.track_utility = off
head_track_top - the same but with pg_stat_statements.track=top
5-times runned in every mode -M: simple, extended, prepared

patch_not_loaded - build with latest published patches, empty 
shared_preload_libraries
patch_track_none - patched build with
  shared_preload_libraries = 'pg_stat_statements'
  pg_stat_statements.max = 5000
  pg_stat_statements.track = none
  pg_stat_statements.save = off
  pg_stat_statements.track_utility = off
  pg_stat_statements.track_planning = off
patch_track_top - the same but with pg_stat_statements.track=top
patch_track_planning - with:
  shared_preload_libraries = 'pg_stat_statements'
  pg_stat_statements.max = 5000
  pg_stat_statements.track = top
  pg_stat_statements.save = off
  pg_stat_statements.track_utility = off
  pg_stat_statements.track_planning = on

10-times runned in every mode -M: simple, extended, prepared

Results:

 test |   mode   | average_tps | degradation_perc 
--+--+-+--
 head_no_pgss | extended |   13816 |1.000
 patch_not_loaded | extended |   13755 |0.996
 head_track_none  | extended |   13607 |0.985
 patch_track_none | extended |   13560 |0.981
 head_track_top   | extended |   13277 |0.961
 patch_track_top  | extended |   13189 |0.955
 patch_track_planning | extended |   12983 |0.940
 head_no_pgss | prepared |   29101 |1.000
 head_track_none  | prepared |   28510 |0.980
 patch_track_none | prepared |   28481 |0.979
 patch_not_loaded | prepared |   28382 |0.975
 patch_track_planning | prepared |   28046 |0.964
 head_track_top   | prepared |   28035 |0.963
 patch_track_top  | prepared |   27973 |0.961
 head_no_pgss | simple   |   16733 |1.000
 patch_not_loaded | simple   |   16552 |0.989
 head_track_none  | simple   |   16452 |0.983
 patch_track_none | simple   |   16365 |0.978
 head_track_top   | simple   |   15867 |0.948
 patch_track_top  | simple   |   15820 |0.945
 patch_track_planning | simple   |   15739 |0.941

So I found slight slowdown with track_planning = off compared to HEAD. Possibly 
just at the level of measurement error. I think this is ok.
track_planning = on also has no dramatic impact. In my opinion proposed design 
with pgss_store call is acceptable.

regards, Sergei

Re: pg_get_databasebyid(oid)

2019-09-04 Thread Sergei Kornilov

Hello

Thank you for attention! I marked CF entry as returned with feedback.

regards, Sergei

1 2 3 >

1 - 100 of 298 matches

Mail list logo