25.07.2024 15:00, Alexander Lakhin wrote:
The other question is: why is 005_opclass_damage taking so much time there? ... $ make -s check -C src/bin/pg_amcheck/ PROVE_TESTS="t/005*" PROVE_FLAGS="--timer" [11:11:53] t/005_opclass_damage.pl .. ok 1370 ms ( 0.00 usr 0.00 sys + 0.10 cusr 0.07 csys = 0.17 CPU) $ echo "debug_parallel_query = regress" >/tmp/extra.config $ TEMP_CONFIG=/tmp/extra.config make -s check -C src/bin/pg_amcheck/ PROVE_TESTS="t/005*" PROVE_FLAGS="--timer" [11:12:46] t/005_opclass_damage.pl .. ok 40854 ms ( 0.00 usr 0.00 sys + 0.10 cusr 0.10 csys = 0.20 CPU) ... So maybe at least this test should be improved for testing with debug_parallel_query enabled, if such active use of parallel workers by pg_amcheck can't be an issue to end users?
When running this test with "log_min_messages = DEBUG2" in my extra.config, I see thousands of the following messages in the test log: 2024-07-26 09:32:54.544 UTC [2572189:46] DEBUG: postmaster received pmsignal signal 2024-07-26 09:32:54.544 UTC [2572189:47] DEBUG: registering background worker "parallel worker for PID 2572197" 2024-07-26 09:32:54.544 UTC [2572189:48] DEBUG: starting background worker process "parallel worker for PID 2572197" 2024-07-26 09:32:54.547 UTC [2572189:49] DEBUG: unregistering background worker "parallel worker for PID 2572197" 2024-07-26 09:32:54.547 UTC [2572189:50] DEBUG: background worker "parallel worker" (PID 2572205) exited with exit code 0 2024-07-26 09:32:54.547 UTC [2572189:51] DEBUG: postmaster received pmsignal signal 2024-07-26 09:32:54.547 UTC [2572189:52] DEBUG: registering background worker "parallel worker for PID 2572197" 2024-07-26 09:32:54.547 UTC [2572189:53] DEBUG: starting background worker process "parallel worker for PID 2572197" 2024-07-26 09:32:54.549 UTC [2572189:54] DEBUG: unregistering background worker "parallel worker for PID 2572197" 2024-07-26 09:32:54.549 UTC [2572189:55] DEBUG: background worker "parallel worker" (PID 2572206) exited with exit code 0 ... grep ' registering background worker' \ src/bin/pg_amcheck/tmp_check/log/005_opclass_damage_test.log | wc -l 15669 So this test launches more than 15000 processes when debug_parallel_query is enabled. As far as I can see, this is happening because of the "PARALLEL UNSAFE" marking is ignored when the function is called by CREATE INDEX/amcheck. Namely, with a function defined as: CREATE FUNCTION int4_asc_cmp (a int4, b int4) RETURNS int LANGUAGE sql AS $$ SELECT CASE WHEN $1 = $2 THEN 0 WHEN $1 > $2 THEN 1 ELSE -1 END; $$; SELECT int4_asc_cmp(1, 2); executed without parallel workers. Whilst when it's used by an index: CREATE OPERATOR CLASS int4_fickle_ops FOR TYPE int4 USING btree AS ... OPERATOR 5 > (int4, int4), FUNCTION 1 int4_asc_cmp(int4, int4); INSERT INTO int4tbl (SELECT * FROM generate_series(1,1000) gs); CREATE INDEX fickleidx ON int4tbl USING btree (i int4_fickle_ops); launches 1000 parallel workers. (This is reminiscent of bug #18314.) One way to workaround this is to disable debug_parallel_query in the test and another I find possible is to set max_parallel_workers = 0. Best regards, Alexander