On Sun, Jul 21, 2024 at 2:36 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > > Andrew Dunstan <and...@dunslane.net> writes: > > On 2024-07-21 Su 1:34 PM, Tom Lane wrote: > >> Locally, I've not managed to reproduce the failure yet; so perhaps > >> there is some platform dependency. What are you testing on? > > > Linux ub22arm 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:08:40 UTC > > 2024 aarch64 aarch64 aarch64 GNU/Linux > > It's a VM running on UTM/Apple Silicon > > Hmm, doesn't sound like that ought to be slow. > > I did manage to reproduce dodo's failures by running xid_wraparound > manually on mamba's very slow host: > > $ time make -s installcheck PROVE_FLAGS=--timer > # +++ tap install-check in src/test/modules/xid_wraparound +++ > [13:37:49] t/001_emergency_vacuum.pl .. 1/? # poll_query_until timed out > executing this query: > # > # SELECT NOT EXISTS ( > # SELECT * > # FROM pg_database > # WHERE age(datfrozenxid) > > current_setting('autovacuum_freeze_max_age')::int) > # > # expecting this output: > # t > # last actual query output: > # f > # with stderr: > # Tests were run but no plan was declared and done_testing() was not seen. > # Looks like your test exited with 4 just after 1. > [13:37:49] t/001_emergency_vacuum.pl .. Dubious, test returned 4 (wstat 1024, > 0x400) > All 1 subtests passed > [14:06:51] t/002_limits.pl ............ 2/? # Tests were run but no plan was > declared and done_testing() was not seen. > # Looks like your test exited with 29 just after 2. > [14:06:51] t/002_limits.pl ............ Dubious, test returned 29 (wstat > 7424, 0x1d00) > All 2 subtests passed > [14:31:16] t/003_wraparounds.pl ....... ok 7564763 ms ( 0.00 usr 0.01 sys + > 13.82 cusr 9.26 csys = 23.09 CPU) > [16:37:21] > > Test Summary Report > ------------------- > t/001_emergency_vacuum.pl (Wstat: 1024 (exited 4) Tests: 1 Failed: 0) > Non-zero exit status: 4 > Parse errors: No plan found in TAP output > t/002_limits.pl (Wstat: 7424 (exited 29) Tests: 2 Failed: 0) > Non-zero exit status: 29 > Parse errors: No plan found in TAP output > Files=3, Tests=4, 10772 wallclock secs ( 0.15 usr 0.06 sys + 58.50 cusr > 59.88 csys = 118.59 CPU) > Result: FAIL > make: *** [../../../../src/makefiles/pgxs.mk:442: installcheck] Error 1 > 10772.99 real 59.34 user 60.14 sys > > Each of those two failures looks just like something that dodo has > shown at one time or another. So it's at least plausible that > "slow machine" is the whole explanation. I'm still wondering > though if there's some effect that causes the test's runtime to > be unstable in itself, sometimes leading to timeouts. >
Since the server writes a lot of logs during the xid_wraparound test, "slow disk" could also be a reason. Looking at dodo's failures, it seems that while it passes module-xid_wraparound-check, all failures happened only during testmodules-install-check-C. Can we check the server logs written during xid_wraparound test in testmodules-install-check-C? I thought the following link is the server logs but since it seems there were no autovacuum logs I suspected there is another log file: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=dodo&dt=2024-07-20%2020%3A35%3A39&stg=testmodules-install-check-C&raw=1 Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com