Improve the performance of creating the END_OF_RECOVERY checkpoint

2020-12-21 Thread Thunder
Dear All, In the startup process we only launch the bgwriter when ArchiveRecoveryRequested is true, which means we will not launch the bgwriter on the master node. The bgwriter can write dirty buffers to disk, which helps the startup process do less IO when we complete xlog replay and request to do END_OF_REC
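
For context, a minimal sketch of the check being discussed, loosely following the recovery startup path in StartupXLOG (a backend-style fragment, not standalone code; names such as PMSIGNAL_RECOVERY_STARTED and bgwriterLaunched follow the backend naming, but the exact code differs across versions):

    /*
     * Sketch only: the startup process asks the postmaster to launch the
     * bgwriter/checkpointer before redo begins, but only when archive
     * recovery was requested.  Plain crash recovery on a primary therefore
     * replays WAL without a bgwriter, and the END_OF_RECOVERY checkpoint has
     * to flush all the dirty buffers itself.
     */
    if (ArchiveRecoveryRequested && IsUnderPostmaster)
    {
        SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
        bgwriterLaunched = true;
    }

The change described above essentially relaxes that ArchiveRecoveryRequested test, so that crash recovery on the primary also runs with a bgwriter flushing dirty buffers while the startup process replays WAL.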

Re:Re: Re:Standby got a FATAL error after crash recovery

2020-03-16 Thread Thunder
Sorry. We are using pg11, cloned from tag REL_11_BETA2. At 2020-03-17 11:43:51, "Michael Paquier" wrote: >On Tue, Mar 17, 2020 at 11:02:24AM +0800, Thunder wrote: >> The SLRU error detail is the following. >> DETAIL: Could not read from fil

Re:Re:Standby got a FATAL error after crash recovery

2020-03-16 Thread Thunder
The SLRU error detail is the following: DETAIL: Could not read from file "/data/pg_xact/003C" at offset 221184: Success. I think the read of /data/pg_xact/003C at offset 221184 returned 0. At 2020-03-17 10:36:03, "Thunder" wrote: Appreciate any
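
The trailing ": Success" is the usual sign of a short read: read() reports end-of-file by returning 0 rather than -1, so errno is never set and strerror(errno) prints "Success". A small standalone illustration (plain C, not PostgreSQL code; the scratch file stands in for a too-short pg_xact segment):

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int
    main(void)
    {
        char    buf[8192];
        FILE   *f = tmpfile();      /* empty scratch file, standing in for a
                                     * pg_xact segment that is too short */
        ssize_t nread;

        if (f == NULL)
            return 1;

        /* Seek well past end-of-file, like the offset 221184 in the report. */
        if (lseek(fileno(f), 221184, SEEK_SET) < 0)
            return 1;

        errno = 0;
        nread = read(fileno(f), buf, sizeof(buf));

        /*
         * read() signals end-of-file with a return value of 0, not -1, and
         * leaves errno untouched, so an error message formatted with
         * strerror(errno) ends in ": Success".
         */
        printf("read returned %zd: %s\n", nread, strerror(errno));
        return 0;
    }

In other words, the segment was shorter than expected and the read simply hit end-of-file, which matches the observation that the return value was 0.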

Re:Standby got a FATAL error after crash recovery

2020-03-16 Thread Thunder
I would appreciate any suggestion on this issue. Is there something I misunderstand? At 2020-03-17 00:33:36, "Thunder" wrote: Hello hackers: Our standby node got a FATAL error after crash recovery. The fatal error was raised in the SLRU module; I changed the log level from ERROR to

Standby got a FATAL error after crash recovery

2020-03-16 Thread Thunder
Hello hackers: Our standby node got a FATAL error after crash recovery. The fatal error was raised in the SLRU module; I changed the log level from ERROR to PANIC and got the following stack. (gdb) bt #0 0x7f0cc47a1277 in raise () from /lib64/libc.so.6 #1 0x7f0cc47a2968 in abort () from /lib64

Re:Re: Optimize crash recovery

2020-03-13 Thread Thunder
init the page if the page LSN is larger than the LSN of the xlog record? At 2020-03-13 23:41:03, "Alvaro Herrera" wrote: >On 2020-Mar-13, Thunder wrote: > >> Hello hackers: >> >> >> During crash recovery, for most xlog records we compare the record LSN with

Optimize crash recovery

2020-03-13 Thread Thunder
Hello hackers: During crash recovery, for most xlog records we compare the record LSN with the page LSN to determine whether the record has already been replayed. The exceptions are full-page and init-page xlog records: the page is restored if the xlog record includes a full-page image of the page. And it initializes
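
A simplified, self-contained sketch of that decision (modeled on the backend's redo pattern; the names below are illustrative stand-ins, not the actual backend API):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t XLogRecPtr;    /* WAL position, as in the backend */

    /*
     * Decide whether a WAL record still has to be applied to a page.  For
     * ordinary records the page LSN already tells us: if the page LSN is at
     * or past the record's LSN, the change reached disk before the crash and
     * the record can be skipped.  Full-page-image and init-page records are
     * the exceptions discussed above: the first overwrites the page with the
     * stored image and the second reinitializes it, so neither consults the
     * existing page LSN.
     */
    static bool
    record_needs_redo(XLogRecPtr record_lsn, XLogRecPtr page_lsn,
                      bool has_full_page_image, bool is_init_page)
    {
        if (has_full_page_image || is_init_page)
            return true;

        return page_lsn < record_lsn;
    }

    int
    main(void)
    {
        /* A page stamped with LSN 0x30000A0 already contains the change
         * logged at LSN 0x3000050, so redo is skipped. */
        printf("needs redo: %s\n",
               record_needs_redo(0x3000050, 0x30000A0, false, false)
               ? "yes" : "no");
        return 0;
    }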

Re:Re:Re: [BUG] standby node cannot provide service even though it has replayed all log files

2019-10-28 Thread Thunder
to review the attached patch? Thanks. At 2019-10-22 20:42:21, "Thunder" wrote: Update the patch. 1. The STANDBY_SNAPSHOT_PENDING state is set when we replay the first XLOG_RUNNING_XACTS and the subtransaction ids have overflowed. 2. When we log XLOG_RUNNING_XACTS in master no
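
For reference, the state in question is set while replaying a running-xacts record; a hedged sketch of the shape of that logic in ProcArrayApplyRecoveryInfo (backend-style fragment, not standalone code, and simplified relative to the real function):

    /*
     * If the recorded snapshot of running transactions overflowed the
     * per-backend subxid cache, the standby cannot build a complete snapshot
     * yet: it stays in STANDBY_SNAPSHOT_PENDING until the overflowed
     * subtransactions are known to have finished.  Otherwise it can go
     * straight to STANDBY_SNAPSHOT_READY.
     */
    if (running->subxid_overflow)
    {
        standbyState = STANDBY_SNAPSHOT_PENDING;
        standbySnapshotPendingXmin = running->nextXid;
    }
    else
    {
        standbyState = STANDBY_SNAPSHOT_READY;
        standbySnapshotPendingXmin = InvalidTransactionId;
    }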

Re:Re: [BUG] standby node cannot provide service even though it has replayed all log files

2019-10-24 Thread Thunder
Thanks for the reply. I feel confused about the snapshot. At 2019-10-23 11:51:19, "Kyotaro Horiguchi" wrote: >Hello. > >At Tue, 22 Oct 2019 20:42:21 +0800 (CST), Thunder wrote in >> Update the patch. >> >> 1. The STANDBY_SNAPSHOT_PENDING state is set when we re

Re:Re: [BUG] standby node cannot provide service even though it has replayed all log files

2019-10-22 Thread Thunder
min to be procArray->oldest_running_xid? I would appreciate any suggestion on this issue. At 2019-10-22 01:27:58, "Robert Haas" wrote: >On Mon, Oct 21, 2019 at 4:13 AM Thunder wrote: >> Can we fix this issue like the following patch? >> >> $git diff src/backend/access

Re:[BUG] standby node cannot provide service even though it has replayed all log files

2019-10-21 Thread Thunder
OT_READY || standbyState == STANDBY_SNAPSHOT_PENDING) && !LocalHotStandbyActive && reachedConsistency && IsUnderPostmaster) At 2019-10-21 15:40:24, "Thunder" wrote: Hi hackers, I found this issue when I restarted the standby node a
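
The fragment above is the hot-standby enablement check in CheckRecoveryConsistency; a hedged sketch of the proposed shape of the change, as a backend-style fragment rather than standalone code (the real check also updates shared state under a spinlock):

    /*
     * Proposed shape: besides STANDBY_SNAPSHOT_READY, also accept the
     * PENDING state (the overflowed-subxid case) once consistency has been
     * reached, so that a restarted standby does not keep refusing
     * connections with "the database system is starting up".
     */
    if ((standbyState == STANDBY_SNAPSHOT_READY ||
         standbyState == STANDBY_SNAPSHOT_PENDING) &&
        !LocalHotStandbyActive &&
        reachedConsistency &&
        IsUnderPostmaster)
    {
        LocalHotStandbyActive = true;
        SendPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY);
    }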

[BUG] standby node cannot provide service even though it has replayed all log files

2019-10-21 Thread Thunder
Hi hackers, I found this issue when I restarted the standby node and then tried to connect to it. It returned "psql: FATAL: the database system is starting up". The steps to reproduce this issue: 1. Create a session to run uncommit_trans.sql 2. Create another session to do a checkpoint 3. Restart the standby no

Re:PATCH: standby crashed when replaying a block that was truncated in the standby but whose truncation failed in the master node

2019-09-23 Thread Thunder
Is this an issue? Can we fix it like this? Thanks! At 2019-09-22 00:38:03, "Thunder" wrote: The steps to reproduce this issue: 1. Create a table create table gist_point_tbl(id int4, p point); create index gist_pointidx on gist_point_tbl using gist(p); 2. Insert data i

PATCH: standby crashed when replaying a block that was truncated in the standby but whose truncation failed in the master node

2019-09-21 Thread Thunder
The steps to reproduce this issue: 1. Create a table create table gist_point_tbl(id int4, p point); create index gist_pointidx on gist_point_tbl using gist(p); 2. Insert data insert into gist_point_tbl (id, p) select g,point(g*10, g*10) from generate_series(1, 100) g; 3. Del

Got "FATAL: could not access status of transaction" in PG 11.2

2019-09-03 Thread Thunder
I shut down the postgres server immediately with the pg_ctl stop -m immediate command. When I restarted the server I got the following fatal message: --- 126690 2019-09-03 14:06:52 UTC XX000 FATAL: could not a

Re:Re: PANIC: Call AbortTransaction when transaction id is not normal

2019-05-13 Thread Thunder
On our servers, when a process crashes and a core dump file is generated, we receive complaining phone calls. That's why I am trying to fix it. At 2019-05-14 07:53:36, "Michael Paquier" wrote: >On Mon, May 13, 2019 at 09:37:32AM -0400, Tom Lane wrote: >> But ... that code's been like that for decades a

PANIC: Call AbortTransaction when transaction id is not normal

2019-05-12 Thread Thunder
Hello, The process crashed when running in bootstrap mode and received a signal to shut down. From the call stack we can see that the transaction id is 1, which is BootstrapTransactionId. In the TransactionLogFetch function, which fetches the commit status of the specified transaction id, it will return
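
For reference, the special case being described: permanent transaction ids (including BootstrapTransactionId) are answered without consulting pg_xact. A simplified, self-contained restatement of that behavior (the types and constants below are stand-ins rather than the backend's headers; the real function is TransactionLogFetch in src/backend/access/transam/transam.c):

    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t TransactionId;

    #define BootstrapTransactionId   ((TransactionId) 1)
    #define FrozenTransactionId      ((TransactionId) 2)
    #define FirstNormalTransactionId ((TransactionId) 3)

    typedef enum XidStatus
    {
        XID_COMMITTED,
        XID_ABORTED,
        XID_UNKNOWN                 /* would require a pg_xact lookup */
    } XidStatus;

    /*
     * Permanent XIDs (anything below FirstNormalTransactionId) never have an
     * entry in pg_xact, so they are answered directly: the bootstrap and
     * frozen XIDs count as committed, the invalid XID as aborted.  Only
     * normal XIDs go on to read the commit log.
     */
    static XidStatus
    transaction_log_fetch(TransactionId xid)
    {
        if (xid < FirstNormalTransactionId)
        {
            if (xid == BootstrapTransactionId || xid == FrozenTransactionId)
                return XID_COMMITTED;
            return XID_ABORTED;
        }
        return XID_UNKNOWN;
    }

    int
    main(void)
    {
        printf("xid 1 (BootstrapTransactionId) is %s\n",
               transaction_log_fetch(1) == XID_COMMITTED
               ? "reported as committed" : "something else");
        return 0;
    }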