Currently, the Startup process is responsible for running restore_command. So when the Startup process is busy or waiting, no new WAL files arrive.

That has these effects:

* Recovery must wait while the Startup process requests the next WAL file. This reduces the performance of archive recovery.
* If replication is file-based then no new files can be downloaded while we are waiting. If the Startup process waits, it is then much slower to catch up than it would be if it had already downloaded the files from the archive.
* We cannot run an archive_cleanup_command, so the archive keeps growing.
* Cascading from a standby that uses file-based replication is not easily possible.

My solution is to create a new process called the DeArchiver. This will run restore_command in a tight loop until the number of files would exceed wal_keep_files, then sleep. Each time the DeArchiver executes restore_command it will record the return code and, if rc=0, the new XLogRecPtr reached. If standby_mode = on it will continue to retry indefinitely.

The Startup process will just read files from pg_xlog rather than from the archive, just as it does for streaming, so this will remove the special case code in xlog.c. (WALReceiver and this process will still need to coordinate so they are not both active at the same time, as now.)

This proposal gives a performance gain because the DeArchiver can be restoring the next file while the Startup process is processing the current file, so they work together using pipeline parallelism.

The DeArchiver would start when we are not in crash recovery and exit at the end of recovery. This would then allow restore_command to be set via reload rather than restart.

Previously, we have given greater weight to files from the archive than to files already in pg_xlog. To ensure that behaviour continues, if restore_command is set, the Startup process will read the files in the pg_xlog directory and remember which ones were there at startup. That way it will be able to tell the difference between files newly downloaded and those already in the directory.
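
The DeArchiver loop described above could be sketched roughly as follows. This is only an illustration, written in Python for brevity rather than the C a real patch would use. The function names, the nap_seconds and max_naps parameters, and the idea of passing the segment names in as a list are all assumptions made for the sketch; a real implementation would derive the next file from the XLogRecPtr and share state with the Startup process. The name of the last restored segment stands in here for "the new XLogRecPtr reached".

```python
import os
import subprocess
import time


def count_wal_files(xlog_dir):
    """Count regular files in the directory standing in for pg_xlog."""
    return sum(1 for name in os.listdir(xlog_dir)
               if os.path.isfile(os.path.join(xlog_dir, name)))


def dearchiver_loop(restore_command, segment_names, xlog_dir, wal_keep_files,
                    standby_mode=False, nap_seconds=1.0, max_naps=None):
    """Hypothetical DeArchiver: restore segments ahead of the Startup process.

    restore_command uses the %f (file name) and %p (target path) placeholders.
    max_naps is not part of the proposal; it only lets the sketch terminate.
    Returns (last return code, name of the last segment restored).
    """
    last_rc, last_restored = None, None
    naps = 0
    for seg in segment_names:
        while True:
            # Only restore if one more file would not exceed wal_keep_files.
            if count_wal_files(xlog_dir) < wal_keep_files:
                target = os.path.join(xlog_dir, seg)
                cmd = restore_command.replace("%f", seg).replace("%p", target)
                last_rc = subprocess.call(cmd, shell=True)
                if last_rc == 0:
                    last_restored = seg
                    break  # tight loop: go straight on to the next segment
                if not standby_mode:
                    # Outside standby mode a failed restore ends the loop.
                    return last_rc, last_restored
            # Either throttled by wal_keep_files, or retrying in standby mode.
            if max_naps is not None and naps >= max_naps:
                return last_rc, last_restored
            naps += 1
            time.sleep(nap_seconds)
    return last_rc, last_restored
```

With standby_mode=True and max_naps=None the loop retries a missing file indefinitely, which matches the behaviour described for standby_mode = on. The Startup process would be the consumer that removes or recycles files so that the count drops below wal_keep_files again.
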
If a file is absent from the archive we will use the file from pg_xlog.

This makes file-based and stream-based replication work in a similar way, which is neater. It also means all required files are available in case of a crash, which means we can more easily get rid of shutdown checkpoints in case of failover (discussed on a separate thread). Since more files are available, it allows cascading replication to have a sender which receives WAL data in files.

Which do we prefer: "DeArchiver", "Restore process", or "WALFileReceiver"?

Thoughts?

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services