This is an automated email from the ASF dual-hosted git repository.

yjhjstz pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/cloudberry.git

commit d8f2bcb603201e663fd3571eb9ea1a278dea4467
Author: Huansong Fu <[email protected]>
AuthorDate: Wed Sep 20 12:12:10 2023 -0700

    Fix a flakiness with test gp_check_files

    This should have been done with #16428, but we need to disable
    autovacuum when running the gp_check_files regress test. Otherwise
    we might see errors like:

    ```
    @@ -53,12 +53,8 @@
     -- check orphaned files, note that this forces a checkpoint internally.
     set client_min_messages = ERROR;
     select gp_segment_id, filename from run_orphaned_files_view();
    - gp_segment_id | filename
    ----------------+----------
    -             1 | 987654
    -             1 | 987654.3
    -(2 rows)
    -
    +ERROR: failed to retrieve orphaned files after 10 minutes of retries.
    +CONTEXT: PL/pgSQL function run_orphaned_files_view() line 19 at RAISE
     reset client_min_messages;
    ```

    In the log we have:

    ```
    2023-09-20 15:33:00.766420 UTC,"gpadmin","regression",p148081,th-589358976,"[local]",,2023-09-20 15:31:39 UTC,0,con19,cmd65,seg-1,,dx38585,,sx1,"LOG","00000","attempt failed 17 with error: There is a client session running on one or more segment. Aborting...",,,,,"PL/pgSQL function run_orphaned_files_view() line 11 at RAISE","select gp_segment_id, filename from run_orphaned_files_view();",0,,"pl_exec.c",3857,
    ```

    It is possible that some background jobs created backends that
    must not be running when we query the gp_check_orphaned_files view.
    Since we decided to make the view conservative (it disallows any
    backend that could cause false positives in its results), the fix
    belongs in the test.

    The test already has a safeguard: the function
    run_orphaned_files_view() queries the view repeatedly for up to
    10 minutes. That did not solve the issue, because the function saw
    only one snapshot of pg_stat_activity during its entire execution.
    Now explicitly call pg_stat_clear_snapshot() between retries so
    that each attempt sees the latest pg_stat_activity.

    Co-authored-by: Ashwin Agrawal [email protected]
---
 src/test/regress/input/gp_check_files.source  | 2 ++
 src/test/regress/output/gp_check_files.source | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/test/regress/input/gp_check_files.source b/src/test/regress/input/gp_check_files.source
index 9ee8509a0a..5e1490d953 100644
--- a/src/test/regress/input/gp_check_files.source
+++ b/src/test/regress/input/gp_check_files.source
@@ -31,6 +31,8 @@ BEGIN
             RAISE LOG 'attempt failed % with error: %', retry_counter + 1, SQLERRM;
             -- When an exception occurs, wait for 5 seconds and then retry
             PERFORM pg_sleep(5);
+            -- Refresh to get the latest pg_stat_activity
+            PERFORM pg_stat_clear_snapshot();
             retry_counter := retry_counter + 1;
     END;
   END LOOP;
diff --git a/src/test/regress/output/gp_check_files.source b/src/test/regress/output/gp_check_files.source
index 2d6a733db1..70bf5cf9ae 100644
--- a/src/test/regress/output/gp_check_files.source
+++ b/src/test/regress/output/gp_check_files.source
@@ -29,6 +29,8 @@ BEGIN
             RAISE LOG 'attempt failed % with error: %', retry_counter + 1, SQLERRM;
             -- When an exception occurs, wait for 5 seconds and then retry
             PERFORM pg_sleep(5);
+            -- Refresh to get the latest pg_stat_activity
+            PERFORM pg_stat_clear_snapshot();
             retry_counter := retry_counter + 1;
     END;
   END LOOP;
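
Note on the fix: the commit message says the retry loop saw a single snapshot of pg_stat_activity for the whole function call. The toy Python model below is only a sketch of that behaviour, not PostgreSQL code. The names `StatsView`, `read`, `clear_snapshot`, and `run_with_retries` are hypothetical and exist only for this illustration; `clear_snapshot` plays the role that pg_stat_clear_snapshot() plays in the patched test.

```python
class StatsView:
    """Toy stand-in for pg_stat_activity as seen inside one transaction."""

    def __init__(self, live_sessions):
        self.live_sessions = live_sessions  # list mutated by the "server"
        self._snapshot = None               # cached copy, taken on first read

    def read(self):
        # The first read freezes a copy; later reads reuse that copy.
        if self._snapshot is None:
            self._snapshot = list(self.live_sessions)
        return self._snapshot

    def clear_snapshot(self):
        # Hypothetical analogue of pg_stat_clear_snapshot(): drop the cache.
        self._snapshot = None


def run_with_retries(view, live_sessions, max_attempts, refresh):
    """Return the attempt number that succeeded, or None if every attempt failed."""
    for attempt in range(1, max_attempts + 1):
        if not view.read():
            # No other session is visible, so the check may proceed.
            return attempt
        # Simulate the background backend exiting while the loop sleeps.
        live_sessions.clear()
        if refresh:
            view.clear_snapshot()
    return None


sessions = ["background worker"]
stale = run_with_retries(StatsView(sessions), sessions, max_attempts=5, refresh=False)

sessions = ["background worker"]
fresh = run_with_retries(StatsView(sessions), sessions, max_attempts=5, refresh=True)

print(stale, fresh)
```

In this model, the loop without a refresh keeps re-reading the copy taken on its first attempt and never notices that the other session is gone, while the loop that clears the cached copy between attempts sees the updated state on its next read.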
