We ran into a similar issue at work: when the cluster starts it spends some time doing bookkeeping like replaying the logs if the server crashed before. In our case we were starting up completely empty new PostgreSQL containers and they were still taking 15+ seconds to start on occasion because the cloud hosted disks were so slow. Our solution was a bash script that just called psql in a loop until it succeeded. You could shoehorn that into a new systemd unit and have the prbot depend on it.
On Wed, Apr 29, 2020 at 4:56 AM Rainer Müller <[email protected]> wrote: > > On 29/04/2020 04.07, Chris Jones wrote: > > Looks like the macportsbot has stopped running again , as The latest MRs > > are no longer being labelled. Could someone kick it back into life ;) > > Thank you for the notification! I started it now. prbot had failed to start > after a reboot of the host system: > > prbot-current[1453]: 2020/04/28 09:02:54 pq: the database system is starting > up > > This happened before, but I don't know how we can make this more robust. We > already use After=postgresql.service in the corresponding systemd unit, but > apparently that is not enough. Either there is some other indicator when > postgres has finished its startup, or prbot should retry if it encounters this > error message, or as a last resort we have to add some arbitrary delay. > > Rainer -- David Gilman :DG<
