Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-11-03 Thread Vladimir Borodin

> 3 нояб. 2015 г., в 11:38, Andres Freund  написал(а):
> 
> On 2015-11-02 15:37:57 -0500, Robert Haas wrote:
>> On Fri, Oct 30, 2015 at 9:49 AM, Vladimir Borodin  wrote:
>>> I’ve tried two ways - bare SELECT in autocommit mode and BEGIN; SELECT;
>>> ROLLBACK. I first described the problem in thread on pgsql-admin@ [0], there
>>> is copy-paste from psql there, but during conversation initial description
>>> was lost.
>>> 
>>> [0]
>>> http://www.postgresql.org/message-id/7f74c5ea-6741-44fc-b6c6-e96f18d76...@simply.name
>> 
>> Hmm.  That behavior seems unexpected to me, but I might be missing something.
> 
> The conflict is because of a relation lock, not because of
> visibility. Hot-Standby feedback changes nothing about that.
> 
> I presume all the other conflicts are all because of relation level
> locks? Check pg_stat_database_conflicts and the server logs to verify.

Oh, good point, thank you, it gives the answer. Actually I’ve already done a 
switchover in this cluster, so pg_stat_database_conflicts started from scratch 
:( But the logs haven’t been rotated yet:

root@rpopdb01e ~ # fgrep -e 562f9ef0.23df,6 -e 562fa107.451a -e 562fa1d9.5146 
-e 562f9ef0.23df,10 -e 562fa259.56d1 
/var/lib/pgsql/9.4/data/pg_log/postgresql-2015-10-27_185736.csv
2015-10-27 19:06:28.656 MSK,,,9183,,562f9ef0.23df,6,,2015-10-27 18:57:36 
MSK,,0,LOG,0,"parameter ""hot_standby_feedback"" changed to 
""off""",""
2015-10-27 19:10:05.039 
MSK,"postgres","rpopdb",17690,"[local]",562fa107.451a,1,"",2015-10-27 19:06:31 
MSK,12/54563,0,ERROR,40001,"canceling statement due to conflict with 
recovery","User query might have needed to see row versions that must be 
removed.","select count(*) from rpop.rpop_imap_uidls;",,,"psql"
2015-10-27 19:10:05.995 
MSK,"monitor","rpopdb",20806,"localhost:51794",562fa1d9.5146,1,"",2015-10-27 
19:10:01 MSK,15/24192,0,ERROR,40001,"canceling statement due to conflict with 
recovery","User was holding shared buffer pin for too long.""SQL function 
""to_timestamp"" statement 1","select cnt from monitor.bad_rpop_total",,,""
2015-10-27 19:12:06.878 MSK,,,9183,,562f9ef0.23df,10,,2015-10-27 18:57:36 
MSK,,0,LOG,0,"parameter ""hot_standby_feedback"" changed to 
""on""",""
2015-10-27 19:17:57.056 
MSK,"postgres","rpopdb",5,"[local]",562fa259.56d1,1,"",2015-10-27 19:12:09 
MSK,3/35442,0,FATAL,40001,"terminating connection due to conflict with 
recovery","User was holding a relation lock for too long.","select count(*) 
from rpop.rpop_imap_uidls;",,,"psql"
root@rpopdb01e ~ #

So FATAL is due to relation lock and one ERROR is due to pinned buffers (this 
is actually from another user) but there is also one ERROR due to old snapshots 
(first line). But I actually turned off hs_feedback before first ERROR and 
turned it on after it. So it seems to work expectedly.

Does it actually mean that I could get such conflicts (due to relation locks, 
for example) even in repeatable read or serializable? I mean, is there any 
dependency between transaction isolation level on standby and conflicts with 
recovery?

And am I right that the only way not to have confl_lock is to increase 
max_standby_streaming_delay?


> 
> Andres
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


--
May the force be with you…
https://simply.name



Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-11-03 Thread Andres Freund
On 2015-11-02 15:37:57 -0500, Robert Haas wrote:
> On Fri, Oct 30, 2015 at 9:49 AM, Vladimir Borodin  wrote:
> > I’ve tried two ways - bare SELECT in autocommit mode and BEGIN; SELECT;
> > ROLLBACK. I first described the problem in thread on pgsql-admin@ [0], there
> > is copy-paste from psql there, but during conversation initial description
> > was lost.
> >
> > [0]
> > http://www.postgresql.org/message-id/7f74c5ea-6741-44fc-b6c6-e96f18d76...@simply.name
> 
> Hmm.  That behavior seems unexpected to me, but I might be missing something.

The conflict is because of a relation lock, not because of
visibility. Hot-Standby feedback changes nothing about that.

I presume all the other conflicts are all because of relation level
locks? Check pg_stat_database_conflicts and the server logs to verify.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-11-02 Thread Vladimir Borodin

> 2 нояб. 2015 г., в 23:37, Robert Haas  написал(а):
> 
> On Fri, Oct 30, 2015 at 9:49 AM, Vladimir Borodin  wrote:
>> I’ve tried two ways - bare SELECT in autocommit mode and BEGIN; SELECT;
>> ROLLBACK. I first described the problem in thread on pgsql-admin@ [0], there
>> is copy-paste from psql there, but during conversation initial description
>> was lost.
>> 
>> [0]
>> http://www.postgresql.org/message-id/7f74c5ea-6741-44fc-b6c6-e96f18d76...@simply.name
> 
> Hmm.  That behavior seems unexpected to me, but I might be missing something.

Me too. That’s why I started the thread. One small detail that might have a 
value is that the big table being queried is partitioned into 64 inhereted 
tables. Now I’m trying to write a simple script to reproduce the problem, but 
that is not so easy because AFAIK VACUUM on master should happen while single 
query on standby is running and it should vacuum those rows that have not been 
accessed by the query on standby yet.

> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


--
May the force be with you…
https://simply.name



Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-11-02 Thread Robert Haas
On Fri, Oct 30, 2015 at 9:49 AM, Vladimir Borodin  wrote:
> I’ve tried two ways - bare SELECT in autocommit mode and BEGIN; SELECT;
> ROLLBACK. I first described the problem in thread on pgsql-admin@ [0], there
> is copy-paste from psql there, but during conversation initial description
> was lost.
>
> [0]
> http://www.postgresql.org/message-id/7f74c5ea-6741-44fc-b6c6-e96f18d76...@simply.name

Hmm.  That behavior seems unexpected to me, but I might be missing something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-30 Thread Vladimir Borodin

> 30 окт. 2015 г., в 16:04, Robert Haas  написал(а):
> 
> On Fri, Oct 30, 2015 at 12:40 PM, Vladimir Borodin  wrote:
>> I still don’t fully understand why is it so (the problem occurs while
>> running only one SELECT-statement in READ COMMITED so only one snapshot is
>> taken), but if is expected behavior shouldn’t the documentation mention that
>> using READ COMMITED (which is the default) you may still get conflicts with
>> recovery while using replication slots?
> 
> Are you doing BEGIN / one or more SELECT statements / END?
> 
> Or just a bare SELECT with no explicit transaction control?

I’ve tried two ways - bare SELECT in autocommit mode and BEGIN; SELECT; 
ROLLBACK. I first described the problem in thread on pgsql-admin@ [0], there is 
copy-paste from psql there, but during conversation initial description was 
lost.

[0] 
http://www.postgresql.org/message-id/7f74c5ea-6741-44fc-b6c6-e96f18d76...@simply.name

> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> 
> 
> -- 
> Sent via pgsql-admin mailing list (pgsql-ad...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin


--
Да пребудет с вами сила…
https://simply.name/ru



Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-30 Thread Robert Haas
On Fri, Oct 30, 2015 at 12:40 PM, Vladimir Borodin  wrote:
> I still don’t fully understand why is it so (the problem occurs while
> running only one SELECT-statement in READ COMMITED so only one snapshot is
> taken), but if is expected behavior shouldn’t the documentation mention that
> using READ COMMITED (which is the default) you may still get conflicts with
> recovery while using replication slots?

Are you doing BEGIN / one or more SELECT statements / END?

Or just a bare SELECT with no explicit transaction control?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-30 Thread Andres Freund
On 2015-10-30 13:42:19 +0100, Michael Paquier wrote:
> On Fri, Oct 30, 2015 at 12:40 PM, Vladimir Borodin wrote:
> > On Thu, Oct 29, 2015 at 3:29 PM, Oleksii Kliukin wrote:
> >> Could it be a consequence of how REPEATABLE READ transactions handle
> >> snapshots? With REPEATABLE READ the snapshot is acquired only once at the
> >> beginning of a transaction; a READ COMMITTED transaction re-evaluates its
> >> snapshot with each new command.
> >
> > I still don’t fully understand why is it so (the problem occurs while
> > running only one SELECT-statement in READ COMMITED so only one snapshot is
> > taken), but if is expected behavior shouldn’t the documentation mention that
> > using READ COMMITTED (which is the default) you may still get conflicts with
> > recovery while using replication slots?
> 
> Replication slots and hot_standby_feedback are two different unrelated
> concepts, slots being aimed at retaining WAL.

Uh. Slots also retain the xmin horizon if hot_standby_feedback is
enabled on the standby?

> I guess that's the origin of your confusion:
> http://www.postgresql.org/message-id/20150616192141.gd2...@alap3.anarazel.de

That just says what I said above, I don't see how this makes replication
slots and hs feedback unrelated?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-30 Thread Michael Paquier
On Fri, Oct 30, 2015 at 12:40 PM, Vladimir Borodin wrote:
> On Thu, Oct 29, 2015 at 3:29 PM, Oleksii Kliukin wrote:
>> Could it be a consequence of how REPEATABLE READ transactions handle
>> snapshots? With REPEATABLE READ the snapshot is acquired only once at the
>> beginning of a transaction; a READ COMMITTED transaction re-evaluates its
>> snapshot with each new command.
>
> I still don’t fully understand why is it so (the problem occurs while
> running only one SELECT-statement in READ COMMITED so only one snapshot is
> taken), but if is expected behavior shouldn’t the documentation mention that
> using READ COMMITTED (which is the default) you may still get conflicts with
> recovery while using replication slots?

Replication slots and hot_standby_feedback are two different unrelated
concepts, slots being aimed at retaining WAL. I guess that's the
origin of your confusion:
http://www.postgresql.org/message-id/20150616192141.gd2...@alap3.anarazel.de
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-30 Thread Vladimir Borodin

> 30 окт. 2015 г., в 14:30, Robert Haas  написал(а):
> 
> On Thu, Oct 29, 2015 at 3:29 PM, Oleksii Kliukin  wrote:
>> Could it be a consequence of how REPEATABLE READ transactions handle
>> snapshots? With REPEATABLE READ the snapshot is acquired only once at the
>> beginning of a transaction; a READ COMMITTED transaction re-evaluates its
>> snapshot with each new command.
> 
> I bet that's exactly it.

I still don’t fully understand why is it so (the problem occurs while running 
only one SELECT-statement in READ COMMITED so only one snapshot is taken), but 
if is expected behavior shouldn’t the documentation mention that using READ 
COMMITED (which is the default) you may still get conflicts with recovery while 
using replication slots?

> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


--
May the force be with you…
https://simply.name



Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-30 Thread Robert Haas
On Thu, Oct 29, 2015 at 3:29 PM, Oleksii Kliukin  wrote:
> Could it be a consequence of how REPEATABLE READ transactions handle
> snapshots? With REPEATABLE READ the snapshot is acquired only once at the
> beginning of a transaction; a READ COMMITTED transaction re-evaluates its
> snapshot with each new command.

I bet that's exactly it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-29 Thread Oleksii Kliukin

> On 29 Oct 2015, at 14:39, Vladimir Borodin  wrote:
> 
> f I understand right, with hot_standby_feedback = on standby tells the master 
> xmin of the earliest transaction on standby. And autovacuum worker on master 
> takes it into account when doing vacuum cleanup (because it can see it from 
> pg_replications_slots view), AFAIK with GetOldestXmin function. But I don’t 
> understand why with READ COMMITED transactions xmin in pg_replication_slots 
> view on master continues to increase while with REPEATABLE READ xmin freezes 
> until this transaction finishes.

Could it be a consequence of how REPEATABLE READ transactions handle snapshots? 
With REPEATABLE READ the snapshot is acquired only once at the beginning of a 
transaction; a READ COMMITTED transaction re-evaluates its snapshot with each 
new command.

http://www.postgresql.org/docs/current/static/transaction-iso.html

Kind regards,
--
Oleksii



Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-29 Thread Vladimir Borodin

> 29 окт. 2015 г., в 15:29, Michael Paquier  
> написал(а):
> 
> On Thu, Oct 29, 2015 at 12:35 PM, Vladimir Borodin wrote:
>> 29 окт. 2015 г., в 14:03, Michael Paquier написал(а):
>>> Standby will receive the record but not replay it until the
>>> transaction doing REPEATABLE READ transactions that needs those rows
>>> commits on the standby. The WAL flush position on the standby
>>> continues to move on.
>> 
>> By replication lag on standby I mean exactly replay_location, not
>> flush_location.
>> Well, the initial problem is that in read commited mode heavy
>> SELECT-statement hits max_standby_streaming_delay but in repeatable read
>> mode doesn’t. My question is if it is expected behavior? If yes, why is it
>> so?
> 
> Er, well. If I enforce on master the deletion then VACUUM-cleanup of a
> page with a REPEATABLE READ transaction on standby still expecting to
> have this page items visible until its commit the startup process puts
> itself in waiting state when trying to replay the cleanup record, and
> the replay_location does not move on, still the wal receiver gets WAL
> in parallel, so it continues to flush things and flush_position
> progresses. With a READ COMMITTED transaction running on the standby,
> this transaction considers as visible stuff that has been committed,
> so WAL replay can move on, and indeed there is a risk to face a
> recovery conflict. So this behavior as-is is correct, based on how
> isolation levels should behave when a node performs recovery.

Everything you describe is exactly true for setups without replication slots. 
And the ability to run heavy SELECT statements on hot standby without 
replication lag and recovery conflicts was the reason why I tried to use them. 
And the documentation [0] directly says that «Replication slots provide an 
automated way to ensure ... that the master does not remove rows which could 
cause a recovery conflict even when the standby is disconnected». My question 
is why is it true for REPEATABLE READ transactions but it doesn’t work for READ 
COMMITED queries? Seems, that «even when the standby is disconnected» is much 
stronger limitation and READ COMMITED should work fine, but it doesn’t.

If I understand right, with hot_standby_feedback = on standby tells the master 
xmin of the earliest transaction on standby. And autovacuum worker on master 
takes it into account when doing vacuum cleanup (because it can see it from 
pg_replications_slots view), AFAIK with GetOldestXmin function. But I don’t 
understand why with READ COMMITED transactions xmin in pg_replication_slots 
view on master continues to increase while with REPEATABLE READ xmin freezes 
until this transaction finishes.

[0] 
http://www.postgresql.org/docs/9.4/static/warm-standby.html#STREAMING-REPLICATION-SLOTS
 

> -- 
> Michael
> 
> 
> -- 
> Sent via pgsql-admin mailing list (pgsql-ad...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin


--
Да пребудет с вами сила…
https://simply.name/ru



Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-29 Thread Michael Paquier
On Thu, Oct 29, 2015 at 12:35 PM, Vladimir Borodin wrote:
> 29 окт. 2015 г., в 14:03, Michael Paquier написал(а):
>> Standby will receive the record but not replay it until the
>> transaction doing REPEATABLE READ transactions that needs those rows
>> commits on the standby. The WAL flush position on the standby
>> continues to move on.
>
> By replication lag on standby I mean exactly replay_location, not
> flush_location.
> Well, the initial problem is that in read commited mode heavy
> SELECT-statement hits max_standby_streaming_delay but in repeatable read
> mode doesn’t. My question is if it is expected behavior? If yes, why is it
> so?

Er, well. If I enforce on master the deletion then VACUUM-cleanup of a
page with a REPEATABLE READ transaction on standby still expecting to
have this page items visible until its commit the startup process puts
itself in waiting state when trying to replay the cleanup record, and
the replay_location does not move on, still the wal receiver gets WAL
in parallel, so it continues to flush things and flush_position
progresses. With a READ COMMITTED transaction running on the standby,
this transaction considers as visible stuff that has been committed,
so WAL replay can move on, and indeed there is a risk to face a
recovery conflict. So this behavior as-is is correct, based on how
isolation levels should behave when a node performs recovery.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-29 Thread Vladimir Borodin

> 29 окт. 2015 г., в 14:03, Michael Paquier  
> написал(а):
> 
> On Thu, Oct 29, 2015 at 11:33 AM, Vladimir Borodin wrote:
>> 29 окт. 2015 г., в 13:12, Michael Paquier написал(а):
>>> In the case of repeatable read the standby will wait before applying
>>> the VACUUM WAL record cleaning up a relation page. Hence you won't get
>>> conflicts in this case.
>> 
>> Standby will receive but will not apply? Or master will not vacuum needed by
>> standby pages? It seems that the second one is happening because replication
>> lag on standby does not increase while issuing such repeatable read
>> transaction.
> 
> Standby will receive the record but not replay it until the
> transaction doing REPEATABLE READ transactions that needs those rows
> commits on the standby. The WAL flush position on the standby
> continues to move on.

By replication lag on standby I mean exactly replay_location, not 
flush_location.

> This depends of course on
> max_standby_streaming_delay which may decide or not to force the
> transaction to cancel if it takes too long. Someone feel free to
> correct me if I am missing something here.

Well, the initial problem is that in read commited mode heavy SELECT-statement 
hits max_standby_streaming_delay but in repeatable read mode doesn’t. My 
question is if it is expected behavior? If yes, why is it so?

Thanks for your response!

> -- 
> Michael


--
Да пребудет с вами сила…
https://simply.name/ru



Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-29 Thread Michael Paquier
On Thu, Oct 29, 2015 at 11:33 AM, Vladimir Borodin wrote:
> 29 окт. 2015 г., в 13:12, Michael Paquier написал(а):
>> In the case of repeatable read the standby will wait before applying
>> the VACUUM WAL record cleaning up a relation page. Hence you won't get
>> conflicts in this case.
>
> Standby will receive but will not apply? Or master will not vacuum needed by
> standby pages? It seems that the second one is happening because replication
> lag on standby does not increase while issuing such repeatable read
> transaction.

Standby will receive the record but not replay it until the
transaction doing REPEATABLE READ transactions that needs those rows
commits on the standby. The WAL flush position on the standby
continues to move on. This depends of course on
max_standby_streaming_delay which may decide or not to force the
transaction to cancel if it takes too long. Someone feel free to
correct me if I am missing something here.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-29 Thread Vladimir Borodin

> 29 окт. 2015 г., в 13:12, Michael Paquier  
> написал(а):
> 
> On Thu, Oct 29, 2015 at 9:42 AM, Vladimir Borodin wrote:
>> I’m wondering why do I get conflicts with recovery on hot standby using
>> replication slots and read commited isolation level? And if I start
>> repeatable read transaction I don’t get any errors. Below is some
>> diagnostics.
> 
> In the case of repeatable read the standby will wait before applying
> the VACUUM WAL record cleaning up a relation page. Hence you won't get
> conflicts in this case.

Standby will receive but will not apply? Or master will not vacuum needed by 
standby pages? It seems that the second one is happening because replication 
lag on standby does not increase while issuing such repeatable read transaction.

> -- 
> Michael
> 
> 
> -- 
> Sent via pgsql-admin mailing list (pgsql-ad...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin


--
Да пребудет с вами сила…
https://simply.name/ru



Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-29 Thread Michael Paquier
On Thu, Oct 29, 2015 at 9:42 AM, Vladimir Borodin wrote:
> I’m wondering why do I get conflicts with recovery on hot standby using
> replication slots and read commited isolation level? And if I start
> repeatable read transaction I don’t get any errors. Below is some
> diagnostics.

In the case of repeatable read the standby will wait before applying
the VACUUM WAL record cleaning up a relation page. Hence you won't get
conflicts in this case.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Replication slots and isolation levels

2015-10-29 Thread Vladimir Borodin

> 27 окт. 2015 г., в 19:45, Vladimir Borodin  написал(а):
> 
> Hi all.
> 
> I’m wondering why do I get conflicts with recovery on hot standby using 
> replication slots and read commited isolation level? And if I start 
> repeatable read transaction I don’t get any errors. Below is some diagnostics.

+hackers@

Could anybody explain, why this is happening?

> 
> I’m using 9.4.4 (upgrade is planned) from yum.postgresql.org 
>  packages on both master and standby. Configs are 
> the same on both master and standby:
> 
> rpopdb01d/postgres M # SELECT name, setting FROM pg_settings
> WHERE category LIKE 'Replication%' or category LIKE 'Write-Ahead Log';
>  name | setting
> --+-
>  hot_standby  | on
>  hot_standby_feedback | on
>  max_replication_slots| 1
>  max_standby_archive_delay| 3
>  max_standby_streaming_delay  | 3
>  max_wal_senders  | 10
>  synchronous_standby_names|
>  vacuum_defer_cleanup_age | 20
>  wal_keep_segments| 64
>  wal_receiver_status_interval | 1
>  wal_receiver_timeout | 6
>  wal_sender_timeout   | 3000
> (12 rows)
> 
> Time: 1.583 ms
> rpopdb01d/postgres M #
> 
> On the master I’ve created a physical replication slot and attached standby 
> to it, I do see changing xmin and restart_lsn fields in pg_replication_slots 
> view.
> 
> rpopdb01d/postgres M # select * from pg_replication_slots ;
>  slot_name| plugin | slot_type | datoid | database | active |
> xmin| catalog_xmin |  restart_lsn
> --++---++--+++--+---
>  rpopdb01e_domain_com | [null] | physical  | [null] | [null]   | t  | 
> 2127399287 |   [null] | 960B/415C79C8
> (1 row)
> 
> Time: 0.463 ms
> rpopdb01d/postgres M #
> 
> When I start a read commited transaction on standby (or use autocommit mode, 
> doesn’t matter) I still see that xmin in pg_replication_slots view on master 
> increases. If I do run a heavy SELECT statement, at some point of time 
> (presumably after vacuum_defer_cleanup_age expires) standby starts to lag 
> replication apply and when it hits max_standby_streaming_delay I get 40001 
> sql code, either ERROR or FATAL:
> 
> rpopdb01e/rpopdb R # SHOW transaction_isolation ;
>  transaction_isolation
> ---
>  read committed
> (1 row)
> 
> Time: 0.324 ms
> rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
> ERROR:  40001: canceling statement due to conflict with recovery
> DETAIL:  User query might have needed to see row versions that must be 
> removed.
> LOCATION:  ProcessInterrupts, postgres.c:2990
> Time: 199791.339 ms
> rpopdb01e/rpopdb R #
> 
> 
> 
> rpopdb01e/rpopdb R # SHOW transaction_isolation ;
>  transaction_isolation
> ---
>  read committed
> (1 row)
> 
> Time: 0.258 ms
> rpopdb01e/rpopdb R # BEGIN;
> BEGIN
> Time: 0.067 ms
> rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
> FATAL:  40001: terminating connection due to conflict with recovery
> DETAIL:  User was holding a relation lock for too long.
> LOCATION:  ProcessInterrupts, postgres.c:2857
> server closed the connection unexpectedly
>   This probably means the server terminated abnormally
>   before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
> Time: 307864.830 ms
> rpopdb01e/rpopdb R #
> 
> The behavior is the same as expected to be without using replication slots.
> 
> But when I start repeatable read transaction xmin field in 
> pg_replication_slots view on master freezes (while restart_lsn is still 
> increasing) and I don’t get any replication lag and conflicts with recovery. 
> When I end this transaction, xmin starts increasing again.
> 
> rpopdb01e/rpopdb R # begin transaction isolation level repeatable read;
> BEGIN
> Time: 0.118 ms
> rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
>count
> 
>  3106222429
> (1 row)
> 
> Time: 411944.889 ms
> rpopdb01e/rpopdb R # ROLLBACK;
> ROLLBACK
> Time: 0.269 ms
> rpopdb01e/rpopdb R #
> 
>  And that is what I expect. Am I missing something or is it expected behavior 
> in read commited mode?
> 
> Thanks in advance.
> 
> --
> May the force be with you…
> https://simply.name 


--
May the force be with you…
https://simply.name