On 14/10/2020 21:58, Tom Lane wrote:
I noticed that chipmunk failed [1] with a rather interesting log:

2020-10-14 08:57:01.661 EEST [27048:6] pg_regress/prepared_xacts LOG: statement: UPDATE pxtest1 SET foobar = 'bbb' WHERE foobar = 'aaa';
2020-10-14 08:57:01.721 EEST [27048:7] pg_regress/prepared_xacts LOG: statement: SELECT * FROM pxtest1;
2020-10-14 08:57:01.823 EEST [27048:8] pg_regress/prepared_xacts FATAL: postmaster exited during a parallel transaction
TRAP: FailedAssertion("entry->trans == NULL", File: "pgstat.c", Line: 909, PID: 27048)
2020-10-14 08:57:01.861 EEST [27051:1] ERROR: could not attach to dynamic shared area
2020-10-14 08:57:01.861 EEST [27051:2] STATEMENT: SELECT * FROM pxtest1;

I do not know what happened to the postmaster, but seeing that chipmunk is a very small machine running a pretty old Linux kernel, it's plausible to guess that the OOM killer decided to pick on the postmaster. (I wonder whether Heikki has taken any steps to prevent that on that machine.)

For the record, it was not the OOM killer. It was the buildfarm cron job that did it:
Oct 14 08:57:01 raspberrypi /USR/SBIN/CRON[27050]: (pgbfarm) CMD (killall -q -9 postgres; cd /home/pgbfarm/build-farm-client/ && ./run_branches.pl --run-all)
Apparently building and testing all the branches is now taking slightly more than 24 h on that system, so the next day's cron job kills the previous tests. I'm going to change the cron schedule so that it runs only every other day.
- Heikki
