Hello.

At Sat, 29 Jun 2019 22:05:22 +0200, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote in <61b8d18d-c922-ac99-b990-a31ba63cd...@2ndquadrant.com>
> Setting up a standby instance is still quite complicated.  You need to
> run pg_basebackup with all the right options.  You need to make sure
> pg_basebackup has the right permissions for the target directories.  The
> created instance has to be integrated into the operating system's start
> scripts.  There is this slightly awkward business of the --recovery-conf
> option and how it interacts with other features.  And you should
> probably run pg_basebackup under screen.  And then how do you get
> notified when it's done.  And when it's done you have to log back in and
> finish up.  Too many steps.
>
> My idea is that the postmaster can launch a base backup worker, wait
> till it's done, then proceed with the rest of the startup.  initdb gets
> a special option to create a "minimal" data directory with only a few
> files, directories, and the usual configuration files.  Then you create
> a $PGDATA/basebackup.signal, start the postmaster as normal.  It sees
> the signal file, launches an auxiliary process that runs the base
> backup, then proceeds with normal startup in standby mode.
>
> This makes a whole bunch of things much nicer: The connection
> information for where to get the base backup from comes from
> postgresql.conf, so you only need to specify it in one place.
> pg_basebackup is completely out of the picture; no need to deal with
> command-line options, --recovery-conf, screen, monitoring for
> completion, etc.  If something fails, the base backup process can
> automatically be restarted (maybe).  Operating system integration is
> much easier: You only call initdb and then pg_ctl or postgres, as you
> are already doing.  Automated deployment systems don't need to wait for
> pg_basebackup to finish: You only call initdb, then start the server,
> and then you're done -- waiting for the base backup to finish can be
> done by the regular monitoring system.
>
> Attached is a very hackish patch to implement this.  It works like this:
>
>     # (assuming you have a primary already running somewhere)
>     initdb -D data2 --minimal
>     $EDITOR data2/postgresql.conf  # set primary_conninfo
>     pg_ctl -D data2 start
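
As an aside, the "$EDITOR data2/postgresql.conf" step quoted above is the one that is hard to automate. Here is a minimal sketch, in Python, of how a script could set a single parameter instead. The function name set_conf_param and the sample host name are hypothetical and not part of the patch or of any existing tool. It assumes only the documented postgresql.conf conventions: one "name = value" setting per line, "#" for comments, and single-quoted strings with embedded quotes doubled.

```python
import re


def set_conf_param(text: str, name: str, value: str) -> str:
    """Return `text` with parameter `name` set to the string `value`.

    Hypothetical helper for illustration; not an existing PostgreSQL tool.
    """
    # String values are single-quoted; an embedded quote is written as ''.
    quoted = "'" + value.replace("'", "''") + "'"
    new_line = f"{name} = {quoted}"
    # An active setting is the name followed by whitespace or '='.
    # Commented-out lines (starting with '#') deliberately do not match.
    active = re.compile(rf"^\s*{re.escape(name)}(\s|=)", re.IGNORECASE)

    out = []
    replaced = False
    for line in text.splitlines():
        if active.match(line):
            if not replaced:
                out.append(new_line)
                replaced = True
            continue  # drop any later duplicate of the same parameter
        out.append(line)
    if not replaced:
        out.append(new_line)  # parameter was absent: append it at the end
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Sample input only; a real script would read data2/postgresql.conf.
    sample = "#primary_conninfo = ''\nport = 5432\n"
    print(set_conf_param(sample, "primary_conninfo", "host=primary.example.com"))
```

The sketch replaces the whole matching line, so a trailing comment on that line is lost, and it does not follow include directives. A real tool would have to decide how to handle both.
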
Nice idea!

> (Curious side note: If you don't set primary_conninfo in these steps,
> then libpq defaults apply, so the default behavior might end up being
> that a given instance attempts to replicate from itself.)

We might be able to have different settings for primary and replica for other parameters as well if the configuration file supported sections; defining, say, a [replica] section would give us more flexibility. Though it is a bit off topic, a dedicated command-line configuration editor that can find and replace a specified parameter would eliminate the subtle editing step. It is annoying to have to find a specific separator in the conf file, trim the old value, and then add the new content.

> It works for basic cases.  It's missing tablespace support, proper
> fsyncing, progress reporting, probably more.  Those would be pretty

While a replica is catching up with the master, connections to it are accepted at first and then fail with a FATAL error. I receive inquiries about that now and then. With the new feature, we would get FATAL during the base backup phase as well. That could alarm users more often.

> straightforward I think.  The interesting bit is the delicate ordering
> of the postmaster startup: Normally, the pg_control file is read quite
> early, but if starting from a minimal data directory, we need to wait
> until the base backup is done.  There is also the question what you do
> if the base backup fails halfway through.  Currently you probably need
> to delete the whole data directory and start again with initdb.  Better
> might be a way to start again and overwrite any existing files, but that
> can clearly also be dangerous.  All this needs some careful analysis,
> but I think it's doable.
>
> Any thoughts?

Just overwriting won't work, since files that were removed just before the retry are left behind on the replica. I think it should work similarly to initdb, that is, remove everything and then retry.

It's easy if we don't care about reducing startup time: just run initdb and then start the existing postmaster internally.
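
To illustrate why removing everything before a retry matters, here is a minimal Python sketch of that approach. It is not how the patch works. The names clear_directory, backup_with_retry and the fetch callback are hypothetical stand-ins for the base backup step. It also ignores that the configuration files in the minimal data directory would have to be preserved or recreated.

```python
import shutil
from pathlib import Path
from typing import Callable


def clear_directory(datadir: Path) -> None:
    """Remove everything inside `datadir` but keep the directory itself."""
    for entry in datadir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def backup_with_retry(
    datadir: Path,
    fetch: Callable[[Path], None],  # hypothetical: copies the base backup
    attempts: int = 3,
) -> None:
    """Run `fetch`, wiping `datadir` before every attempt.

    Wiping first guarantees that files written by a failed attempt, but no
    longer present in the source, cannot survive into the final copy.
    """
    for attempt in range(1, attempts + 1):
        clear_directory(datadir)
        try:
            fetch(datadir)
            return
        except Exception:
            if attempt == attempts:
                raise  # give up after the last attempt
```

With overwrite-only retries, a file copied by the failed first attempt and since removed from the source would stay in the target. Here it is deleted before the second attempt starts.
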
But melding initdb and the postmaster startup together leaves room for reducing the startup time. We could even redirect read-only queries to the master while the server is being set up.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center