Thanks for your advice! This helps me a lot.

> On Sep 17, 2020, at 9:18 PM, Jakub Wartak <jakub.war...@tomtom.com> wrote:
> 
> Li Japin wrote:
> 
>> If we can improve the efficiency of replay, then we can shorten the database 
>> recovery time (streaming replication or database crash recovery). 
> (..)
>> For streaming replication, we may need to improve the transmission of WAL 
>> logs to improve the entire recovery process.
>> I’m not sure if this is correct.
> 
> Hi, 
> 
> If you are interested in increased efficiency of WAL replay internals/startup 
> performance then you might be interested in the following threads:
> 
> Cache relation sizes in recovery - 
> https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BNPZeEdLXAcNr%2Bw0YOZVb0Un0_MwTBpgmmVDh7No2jbg%40mail.gmail.com#feace7ccbb8e3df8b086d0a2217df91f
> Faster compactify_tuples() - 
> https://www.postgresql.org/message-id/flat/ca+hukgkmqfvpjr106grhwk6r-nxv0qoctrezuqzxgphesal...@mail.gmail.com
> Handing off SLRU fsyncs to the checkpointer - 
> https://www.postgresql.org/message-id/flat/CA%2BhUKGLJ%3D84YT%2BNvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ%40mail.gmail.com
> Optimizing compactify_tuples() - 
> https://www.postgresql.org/message-id/flat/CA%2BhUKGKMQFVpjr106gRhwk6R-nXv0qOcTreZuQzxgpHESAL6dw%40mail.gmail.com
> Background bgwriter during crash recovery - 
> https://www.postgresql.org/message-id/flat/ca+hukgj8nrsqgkzensnrc2mfrobv-jcnacbyvtpptk2a9yy...@mail.gmail.com
> WIP: WAL prefetch (another approach) - 
> https://www.postgresql.org/message-id/flat/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
> Division in dynahash.c due to HASH_FFACTOR - 
> https://www.postgresql.org/message-id/flat/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
> [PATCH] guc-ify the formerly hard-coded MAX_SEND_SIZE to max_wal_send - 
> https://www.postgresql.org/message-id/flat/CACJqAM2uAUnEAy0j2RRJOSM1UHPdGxCr%3DU-HbqEf0aAcdhUoEQ%40mail.gmail.com
> Unnecessary delay in streaming replication due to replay lag - 
> https://www.postgresql.org/message-id/flat/CANXE4Tc3FNvZ_xAimempJWv_RH9pCvsZH7Yq93o1VuNLjUT-mQ%40mail.gmail.com
> WAL prefetching in future combined with AIO (IO_URING) - longer term future, 
> https://anarazel.de/talks/2020-05-28-pgcon-aio/2020-05-28-pgcon-aio.pdf
> 
> A good way to start is to profile the system to see what takes time during
> your failover situation or your normal hot-standby operation, and then
> identify and characterize the main bottleneck. There can be many, depending
> on the situation: inefficient single-process PostgreSQL code, CPU-bound
> startup/recovery, IOPS/VFS/syscall/API limitations, single TCP stream
> limitations, single TCP stream latency impact over a WAN, lock contention
> in the hot-standby case, and so on.
> 
> Some of the above are already committed for 14/master; others are not and
> require further discussion and testing.
> Without identifying the actual bottleneck and knowing the WAL stream
> statistics you are facing, it's hard to say how parallel WAL recovery would
> improve the situation.
> 
> -J.
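
As a rough illustration of the "identify the bottleneck first" advice above, here is a minimal Python sketch that splits standby lag into a transfer part and a replay part. It assumes you have already collected three LSN values yourself, for example the primary's pg_current_wal_lsn() and the standby's pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn(). The function names lsn_to_int, standby_lag and likely_bottleneck are hypothetical helpers, and the comparison at the end is only a crude heuristic, not a substitute for real profiling.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a pg_lsn string such as '16/B374D848' to a byte position.

    A pg_lsn is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def standby_lag(primary_lsn: str, receive_lsn: str, replay_lsn: str) -> dict:
    """Split total standby lag (in bytes) into transfer and replay parts."""
    primary, received, replayed = (
        lsn_to_int(x) for x in (primary_lsn, receive_lsn, replay_lsn)
    )
    return {
        # WAL generated on the primary but not yet received by the standby
        "transfer_bytes": primary - received,
        # WAL received by the standby but not yet replayed
        "replay_bytes": received - replayed,
    }


def likely_bottleneck(lag: dict) -> str:
    """Crude heuristic (an assumption): the larger backlog is the suspect."""
    if lag["replay_bytes"] > lag["transfer_bytes"]:
        return "replay"
    if lag["transfer_bytes"] > lag["replay_bytes"]:
        return "transfer"
    return "balanced"


if __name__ == "__main__":
    lag = standby_lag("1/0", "0/FFFFFF00", "0/FFFFF000")
    print(lag, likely_bottleneck(lag))
```

If the replay backlog dominates over repeated samples, faster or parallel WAL replay is the more promising direction; if the transfer backlog dominates, WAL transmission is the place to look first.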
