Hi,

On 2023-08-07 19:15:41 -0700, Andres Freund wrote:
> As some of you might have seen when running CI, cirrus-ci is restricting how
> much CI cycles everyone can use for free (announcement at [1]). This takes
> effect September 1st.
>
> This obviously has consequences both for individual users of CI as well as
> cfbot.
>
> [...]

> Potential paths forward for individual CI:
>
> - migrate wholesale to another CI provider
>
> - split CI tasks across different CI providers, rely on github et al
>   displaying the CI status for different platforms
>
> - give up
>
>
> Potential paths forward for cfbot, in addition to the above:
>
> - Pay for compute / ask the various cloud providers to grant us compute
>   credits. At least some of the cloud providers can be used via cirrus-ci.
>
> - Host (some) CI runners ourselves. Particularly with macos and windows, that
>   could provide significant savings.

To make that possible, we need to make the compute resources for CI
configurable on a per-repository basis. After experimenting with a bunch of
ways to do that, I got stuck on that for a while. But as of today we have
sufficient macos runners available for cfbot, so... I think the approach I
finally settled on is decent, although not great. It's described in the "main"
commit message:

    ci: Prepare to make compute resources for CI configurable

    cirrus-ci will soon restrict the amount of free resources every user gets (as
    have many other CI providers). For most users of CI that should not be an
    issue. But e.g. for cfbot it will be an issue.

    To allow configuring different resources on a per-repository basis, introduce
    infrastructure for overriding the task execution environment. Unfortunately
    this is not entirely trivial, as yaml anchors have to be defined before their
    use, and cirrus-ci only allows injecting additional contents at the end of
    .cirrus.yml.

    To deal with that, move the definition of the CI tasks to
    .cirrus.tasks.yml. The main .cirrus.yml is loaded first, then, if defined, the
    file referenced by the REPO_CI_CONFIG_GIT_URL variable will be added,
    followed by the contents of .cirrus.tasks.yml. That allows
    REPO_CI_CONFIG_GIT_URL to override the yaml anchors defined in .cirrus.yml.

    Unfortunately git's default merge / rebase strategy does not handle copied
    files, just renamed ones. To avoid painful rebasing over this change, this
    commit just renames .cirrus.yml to .cirrus.tasks.yml, without adding a new
    .cirrus.yml. That's done in the followup commit, which moves the relevant
    portion of .cirrus.tasks.yml to .cirrus.yml. Until that is done,
    REPO_CI_CONFIG_GIT_URL does not fully work.

    The subsequent commit adds documentation for how to configure custom compute
    resources to src/tools/ci/README

    Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qm...@awork3.anarazel.de
    Backpatch: 15-, where CI support was added

I don't love moving most of the contents of .cirrus.yml into a new file, but I
don't see another way. I did implement it without that as well (see [1]), but
that ends up considerably harder to understand, and hardcodes what cfbot
needs. Splitting the commit, as explained above, at least makes git rebase
fairly painless. FWIW, I did merge the changes into 15, with only reasonable
conflicts (due to new tasks, autoconf->meson).

A prerequisite commit converts "SanityCheck" and "CompilerWarnings" to use a
full VM instead of a container - that way providing custom compute resources
doesn't have to deal with containers in addition to VMs. It also looks like
the increased startup overhead is outweighed by the reduction in runtime
overhead.

I'm hoping to push this fairly soon, as I'll be on vacation the last week of
August. I'll be online intermittently though; if there are issues, I can react
(very limited connectivity from midday Aug 29th to midday Aug 31st though). I'd
appreciate a quick review or two.

Greetings,

Andres Freund

[1] https://github.com/anarazel/postgres/commit/b95fd302161b951f1dc14d586162ed3d85564bfc
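The load order described in the commit message above can be sketched in plain Python. This is only an illustration under assumptions, not the Starlark code that cirrus-ci executes: `compose_config` and the sample YAML strings are hypothetical, and `config_from` merely mimics the marker comments of the helper of the same name in .cirrus.star.

```python
from typing import Optional


def config_from(name: str, contents: str) -> str:
    """Wrap `contents` in start/end marker comments naming its source."""
    return (
        f"\n###\n# contents of config file `{name}` start here\n###\n"
        f"{contents}\n"
        f"###\n# contents of config file `{name}` end here\n###\n"
    )


def compose_config(main_yml: str, repo_override: Optional[str], tasks_yml: str) -> str:
    """Hypothetical helper: concatenate the three parts in load order."""
    # 1) .cirrus.yml comes first and defines the default anchors
    output = main_yml
    # 2) the optional per-repository file may redefine those anchors
    if repo_override is not None:
        output += config_from("REPO_CI_CONFIG_GIT_URL", repo_override)
    # 3) the task definitions come last, after every anchor definition
    output += config_from(".cirrus.tasks.yml", tasks_yml)
    return output


# Made-up sample contents, just enough to show the ordering.
main_yml = "default_linux_task_template: &linux_task_template\n  env: {PLATFORM: linux}\n"
override = "custom_linux_task_template: &linux_task_template\n  env: {PLATFORM: linux}\n"
tasks = "task:\n  <<: *linux_task_template\n"

final = compose_config(main_yml, override, tasks)
```

Because the task definitions are appended last, every anchor they reference has already been defined, either by .cirrus.yml or by the per-repository file.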
From d1fa394bf9318f08c1c529160c3e6cedc5bb5289 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 7 Aug 2023 17:31:15 -0700
Subject: [PATCH v3 01/10] ci: Don't specify amount of memory

The number of CPUs is the cost-determining factor. Most instance types that
run tests have more memory/core than what we specified, there's no real
benefit in wasting that.

Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qm...@awork3.anarazel.de
Backpatch: 15-, where CI support was added
---
 .cirrus.yml | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 727c434de40..9e84eb95be5 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -149,7 +149,6 @@ task:
     image: family/pg-ci-freebsd-13
     platform: freebsd
     cpu: $CPUS
-    memory: 4G
     disk: 50
 
   sysinfo_script: |
@@ -291,7 +290,6 @@ task:
     image: family/pg-ci-bullseye
     platform: linux
     cpu: $CPUS
-    memory: 4G
 
   ccache_cache:
     folder: ${CCACHE_DIR}
@@ -558,7 +556,6 @@ task:
     image: family/pg-ci-windows-ci-vs-2019
     platform: windows
     cpu: $CPUS
-    memory: 4G
 
   setup_additional_packages_script: |
     REM choco install -y --no-progress ...
@@ -606,7 +603,6 @@ task:
     image: family/pg-ci-windows-ci-mingw64
     platform: windows
     cpu: $CPUS
-    memory: 4G
 
   env:
     TEST_JOBS: 4 # higher concurrency causes occasional failures
-- 
2.38.0

From dee50464cd75c1eadc46e7cc673d23d6bf01e6b6 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 22 Aug 2023 21:25:28 -0700
Subject: [PATCH v3 02/10] ci: Move execution method of tasks into yaml templates

This is done in preparation for making the compute resources for CI
configurable. It also looks cleaner.

Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qm...@awork3.anarazel.de
Backpatch: 15-, where CI support was added
---
 .cirrus.yml | 85 +++++++++++++++++++++++++++++++++++------------------
 1 file changed, 57 insertions(+), 28 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 9e84eb95be5..75747b9b651 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -9,6 +9,7 @@ env:
   GCP_PROJECT: pg-ci-images
   IMAGE_PROJECT: $GCP_PROJECT
   CONTAINER_REPO: us-docker.pkg.dev/${GCP_PROJECT}/ci
+  DISK_SIZE: 25
 
   # The lower depth accelerates git clone. Use a bit of depth so that
   # concurrent tasks and retrying older jobs have a chance of working.
@@ -28,6 +29,45 @@ env:
   PG_TEST_EXTRA: kerberos ldap ssl load_balance
 
 
+# Define how to run various types of tasks.
+
+# VMs provided by cirrus-ci. Each user has a limited number of "free" credits
+# for testing.
+cirrus_community_vm_template: &cirrus_community_vm_template
+  compute_engine_instance:
+    image_project: $IMAGE_PROJECT
+    image: family/$IMAGE_FAMILY
+    platform: $PLATFORM
+    cpu: $CPUS
+    disk: $DISK_SIZE
+
+
+default_linux_task_template: &linux_task_template
+  env:
+    PLATFORM: linux
+  <<: *cirrus_community_vm_template
+
+
+default_freebsd_task_template: &freebsd_task_template
+  env:
+    PLATFORM: freebsd
+  <<: *cirrus_community_vm_template
+
+
+default_windows_task_template: &windows_task_template
+  env:
+    PLATFORM: windows
+  <<: *cirrus_community_vm_template
+
+
+# macos workers provided by cirrus-ci
+default_macos_task_template: &macos_task_template
+  env:
+    PLATFORM: macos
+  macos_instance:
+    image: $IMAGE
+
+
 # What files to preserve in case tests fail
 on_failure_ac: &on_failure_ac
   log_artifacts:
@@ -136,21 +176,18 @@ task:
     CPUS: 2
     BUILD_JOBS: 3
     TEST_JOBS: 3
+    IMAGE_FAMILY: pg-ci-freebsd-13
+    DISK_SIZE: 50
 
     CCACHE_DIR: /tmp/ccache_dir
     CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DCOPY_PARSE_PLAN_TREES -DWRITE_READ_PARSE_PLAN_TREES -DRAW_EXPRESSION_COVERAGE_TEST -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
     CFLAGS: -Og -ggdb
 
+  <<: *freebsd_task_template
+
   depends_on: SanityCheck
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*freebsd.*'
 
-  compute_engine_instance:
-    image_project: $IMAGE_PROJECT
-    image: family/pg-ci-freebsd-13
-    platform: freebsd
-    cpu: $CPUS
-    disk: 50
-
   sysinfo_script: |
     id
     uname -a
@@ -250,6 +287,7 @@ task:
     CPUS: 4
     BUILD_JOBS: 4
     TEST_JOBS: 8 # experimentally derived to be a decent choice
+    IMAGE_FAMILY: pg-ci-bullseye
 
     CCACHE_DIR: /tmp/ccache_dir
     DEBUGINFOD_URLS: "https://debuginfod.debian.net"
@@ -282,15 +320,11 @@ task:
     LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES
     LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES
 
+  <<: *linux_task_template
+
   depends_on: SanityCheck
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*linux.*'
 
-  compute_engine_instance:
-    image_project: $IMAGE_PROJECT
-    image: family/pg-ci-bullseye
-    platform: linux
-    cpu: $CPUS
-
   ccache_cache:
     folder: ${CCACHE_DIR}
 
@@ -430,6 +464,7 @@ task:
     # work OK. See
     # https://postgr.es/m/20220927040208.l3shfcidovpzqxfh%40awork3.anarazel.de
     TEST_JOBS: 8
+    IMAGE: ghcr.io/cirruslabs/macos-ventura-base:latest
 
     CIRRUS_WORKING_DIR: ${HOME}/pgsql/
     CCACHE_DIR: ${HOME}/ccache
@@ -440,12 +475,11 @@ task:
     CFLAGS: -Og -ggdb
     CXXFLAGS: -Og -ggdb
 
+  <<: *macos_task_template
+
   depends_on: SanityCheck
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*(macos|darwin|osx).*'
 
-  macos_instance:
-    image: ghcr.io/cirruslabs/macos-ventura-base:latest
-
   sysinfo_script: |
     id
     uname -a
@@ -524,6 +558,7 @@ WINDOWS_ENVIRONMENT_BASE: &WINDOWS_ENVIRONMENT_BASE
     # Avoids port conflicts between concurrent tap test runs
     PG_TEST_USE_UNIX_SOCKETS: 1
     PG_REGRESS_SOCK_DIR: "c:/cirrus/"
+    DISK_SIZE: 50
 
   sysinfo_script: |
     chcp
@@ -547,16 +582,13 @@ task:
     # given that it explicitly prevents crash dumps from working...
     # 0x8001 is SEM_FAILCRITICALERRORS | SEM_NOOPENFILEERRORBOX
     CIRRUS_WINDOWS_ERROR_MODE: 0x8001
+    IMAGE_FAMILY: pg-ci-windows-ci-vs-2019
+
+  <<: *windows_task_template
 
   depends_on: SanityCheck
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*'
 
-  compute_engine_instance:
-    image_project: $IMAGE_PROJECT
-    image: family/pg-ci-windows-ci-vs-2019
-    platform: windows
-    cpu: $CPUS
-
   setup_additional_packages_script: |
     REM choco install -y --no-progress ...
 
@@ -598,12 +630,6 @@ task:
   # otherwise it'll be sorted before other tasks
   depends_on: SanityCheck
 
-  compute_engine_instance:
-    image_project: $IMAGE_PROJECT
-    image: family/pg-ci-windows-ci-mingw64
-    platform: windows
-    cpu: $CPUS
-
   env:
     TEST_JOBS: 4 # higher concurrency causes occasional failures
     CCACHE_DIR: C:/msys64/ccache
@@ -617,6 +643,9 @@ task:
     # Start bash in current working directory
     CHERE_INVOKING: 1
     BASH: C:\msys64\usr\bin\bash.exe -l
+    IMAGE_FAMILY: pg-ci-windows-ci-mingw64
+
+  <<: *windows_task_template
 
   ccache_cache:
     folder: ${CCACHE_DIR}
-- 
2.38.0

From a60733b7bb86e93a3d23e45b98ce179d0f4e26fd Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 22 Aug 2023 21:28:37 -0700
Subject: [PATCH v3 03/10] ci: Use VMs for SanityCheck and CompilerWarnings

The main reason for this change is to reduce different ways of executing
tasks, making it easier to use custom compute resources for cfbot. A secondary
benefit is that the tasks seem slightly faster this way, apparently the
increased startup overhead is outweighed by reduced runtime overhead.

Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qm...@awork3.anarazel.de
Backpatch: 15-, where CI support was added
---
 .cirrus.yml | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 75747b9b651..f4276ad8692 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -110,15 +110,14 @@ task:
     CPUS: 4
     BUILD_JOBS: 8
     TEST_JOBS: 8
+    IMAGE_FAMILY: pg-ci-bullseye
     CCACHE_DIR: ${CIRRUS_WORKING_DIR}/ccache_dir
     # no options enabled, should be small
     CCACHE_MAXSIZE: "150M"
 
-  # Container starts up quickly, but is slower at runtime, particularly for
-  # tests. Good for the briefly running sanity check.
-  container:
-    image: $CONTAINER_REPO/linux_debian_bullseye_ci:latest
-    cpu: $CPUS
+  # While containers would start up a bit quicker, building is a bit
+  # slower. This way we don't have to maintain a container image.
+  <<: *linux_task_template
 
   ccache_cache:
     folder: $CCACHE_DIR
@@ -691,6 +690,7 @@ task:
   env:
     CPUS: 4
     BUILD_JOBS: 4
+    IMAGE_FAMILY: pg-ci-bullseye
 
     # Use larger ccache cache, as this task compiles with multiple compilers /
     # flag combinations
@@ -700,9 +700,7 @@ task:
     LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES
     LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES
 
-  container:
-    image: $CONTAINER_REPO/linux_debian_bullseye_ci:latest
-    cpu: $CPUS
+  <<: *linux_task_template
 
   sysinfo_script: |
     id
-- 
2.38.0

From a4e238c4c4426f371535fb88c4046fb9e127c923 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 22 Aug 2023 23:48:32 -0700
Subject: [PATCH v3 04/10] ci: Prepare to make compute resources for CI configurable

cirrus-ci will soon restrict the amount of free resources every user gets (as
have many other CI providers). For most users of CI that should not be an
issue. But e.g. for cfbot it will be an issue.

To allow configuring different resources on a per-repository basis, introduce
infrastructure for overriding the task execution environment. Unfortunately
this is not entirely trivial, as yaml anchors have to be defined before their
use, and cirrus-ci only allows injecting additional contents at the end of
.cirrus.yml.

To deal with that, move the definition of the CI tasks to
.cirrus.tasks.yml. The main .cirrus.yml is loaded first, then, if defined, the
file referenced by the REPO_CI_CONFIG_GIT_URL variable will be added,
followed by the contents of .cirrus.tasks.yml. That allows
REPO_CI_CONFIG_GIT_URL to override the yaml anchors defined in .cirrus.yml.

Unfortunately git's default merge / rebase strategy does not handle copied
files, just renamed ones. To avoid painful rebasing over this change, this
commit just renames .cirrus.yml to .cirrus.tasks.yml, without adding a new
.cirrus.yml. That's done in the followup commit, which moves the relevant
portion of .cirrus.tasks.yml to .cirrus.yml. Until that is done,
REPO_CI_CONFIG_GIT_URL does not fully work.

The subsequent commit adds documentation for how to configure custom compute
resources to src/tools/ci/README

Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qm...@awork3.anarazel.de
Backpatch: 15-, where CI support was added
---
 .cirrus.star                     | 63 ++++++++++++++++++++++++++++++++
 .cirrus.yml => .cirrus.tasks.yml |  0
 2 files changed, 63 insertions(+)
 create mode 100644 .cirrus.star
 rename .cirrus.yml => .cirrus.tasks.yml (100%)

diff --git a/.cirrus.star b/.cirrus.star
new file mode 100644
index 00000000000..f3ea9b943e4
--- /dev/null
+++ b/.cirrus.star
@@ -0,0 +1,63 @@
+"""Additional CI configuration, using the starlark language. See
+https://cirrus-ci.org/guide/programming-tasks/#introduction-into-starlark
+
+See also the starlark specification at
+https://github.com/bazelbuild/starlark/blob/master/spec.md
+
+See also .cirrus.yml and src/tools/ci/README
+"""
+
+load("cirrus", "env", "fs")
+
+
+def main():
+    """The main function is executed by cirrus-ci after loading .cirrus.yml and can
+    extend the CI definition further.
+
+    As documented in .cirrus.yml, the final CI configuration is composed of
+
+    1) the contents of .cirrus.yml
+
+    2) if defined, the contents of the file referenced by the
+       repository-level REPO_CI_CONFIG_GIT_URL variable (see
+       https://cirrus-ci.org/guide/programming-tasks/#fs for the accepted
+       format)
+
+    3) .cirrus.tasks.yml
+    """
+
+    output = ""
+
+    # 1) is included implicitly
+
+    # Add 2)
+    repo_config_url = env.get("REPO_CI_CONFIG_GIT_URL")
+    if repo_config_url != None:
+        print("loading additional configuration from \"{}\"".format(repo_config_url))
+        output += config_from(repo_config_url)
+    else:
+        output += "\n# REPO_CI_CONFIG_GIT_URL was not set\n"
+
+    # Add 3)
+    output += config_from(".cirrus.tasks.yml")
+
+    return output
+
+
+def config_from(config_src):
+    """return contents of config file `config_src`, surrounded by markers
+    indicating start / end of the included file
+    """
+
+    config_contents = fs.read(config_src)
+    config_fmt = """
+
+###
+# contents of config file `{0}` start here
+###
+{1}
+###
+# contents of config file `{0}` end here
+###
+"""
+    return config_fmt.format(config_src, config_contents)
diff --git a/.cirrus.yml b/.cirrus.tasks.yml
similarity index 100%
rename from .cirrus.yml
rename to .cirrus.tasks.yml
-- 
2.38.0

From 53a4c9642dda8ffea3c7b73e97f4b23ea0a4f2a3 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 22 Aug 2023 22:26:54 -0700
Subject: [PATCH v3 05/10] ci: Make compute resources for CI configurable

See prior commit for an explanation for the goal of the change and why it had
to be split into two commits.

Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qm...@awork3.anarazel.de
Backpatch: 15-, where CI support was added
---
 .cirrus.yml         | 73 +++++++++++++++++++++++++++++++++++++++++++++
 .cirrus.tasks.yml   | 45 ----------------------------
 src/tools/ci/README | 17 +++++++++++
 3 files changed, 90 insertions(+), 45 deletions(-)
 create mode 100644 .cirrus.yml

diff --git a/.cirrus.yml b/.cirrus.yml
new file mode 100644
index 00000000000..a83129ae46d
--- /dev/null
+++ b/.cirrus.yml
@@ -0,0 +1,73 @@
+# CI configuration file for CI utilizing cirrus-ci.org
+#
+# For instructions on how to enable the CI integration in a repository and
+# further details, see src/tools/ci/README
+#
+#
+# The actual CI tasks are defined in .cirrus.tasks.yml. To make the compute
+# resources for CI configurable on a repository level, the "final" CI
+# configuration is the combination of:
+#
+# 1) the contents of this file
+#
+# 2) if defined, the contents of the file referenced by the
+#    repository-level REPO_CI_CONFIG_GIT_URL variable (see
+#    https://cirrus-ci.org/guide/programming-tasks/#fs for the accepted
+#    format)
+#
+# 3) .cirrus.tasks.yml
+#
+# This composition is done by .cirrus.star
+
+
+env:
+  # Source of images / containers
+  GCP_PROJECT: pg-ci-images
+  IMAGE_PROJECT: $GCP_PROJECT
+  CONTAINER_REPO: us-docker.pkg.dev/${GCP_PROJECT}/ci
+  DISK_SIZE: 25
+
+
+# Define how to run various types of tasks.
+
+# VMs provided by cirrus-ci. Each user has a limited number of "free" credits
+# for testing.
+cirrus_community_vm_template: &cirrus_community_vm_template
+  compute_engine_instance:
+    image_project: $IMAGE_PROJECT
+    image: family/$IMAGE_FAMILY
+    platform: $PLATFORM
+    cpu: $CPUS
+    disk: $DISK_SIZE
+
+
+default_linux_task_template: &linux_task_template
+  env:
+    PLATFORM: linux
+  <<: *cirrus_community_vm_template
+
+
+default_freebsd_task_template: &freebsd_task_template
+  env:
+    PLATFORM: freebsd
+  <<: *cirrus_community_vm_template
+
+
+default_windows_task_template: &windows_task_template
+  env:
+    PLATFORM: windows
+  <<: *cirrus_community_vm_template
+
+
+# macos workers provided by cirrus-ci
+default_macos_task_template: &macos_task_template
+  env:
+    PLATFORM: macos
+  macos_instance:
+    image: $IMAGE
+
+
+# Contents of REPO_CI_CONFIG_GIT_URL, if defined, will be inserted here,
+# followed by the contents of .cirrus.tasks.yml. This allows
+# REPO_CI_CONFIG_GIT_URL to override how the task types above will be
+# executed, e.g. using a custom compute account or permanent workers.
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index f4276ad8692..0cf7ba77996 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -5,12 +5,6 @@
 
 
 env:
-  # Source of images / containers
-  GCP_PROJECT: pg-ci-images
-  IMAGE_PROJECT: $GCP_PROJECT
-  CONTAINER_REPO: us-docker.pkg.dev/${GCP_PROJECT}/ci
-  DISK_SIZE: 25
-
   # The lower depth accelerates git clone. Use a bit of depth so that
   # concurrent tasks and retrying older jobs have a chance of working.
   CIRRUS_CLONE_DEPTH: 500
@@ -29,45 +23,6 @@ env:
   PG_TEST_EXTRA: kerberos ldap ssl load_balance
 
 
-# Define how to run various types of tasks.
-
-# VMs provided by cirrus-ci. Each user has a limited number of "free" credits
-# for testing.
-cirrus_community_vm_template: &cirrus_community_vm_template
-  compute_engine_instance:
-    image_project: $IMAGE_PROJECT
-    image: family/$IMAGE_FAMILY
-    platform: $PLATFORM
-    cpu: $CPUS
-    disk: $DISK_SIZE
-
-
-default_linux_task_template: &linux_task_template
-  env:
-    PLATFORM: linux
-  <<: *cirrus_community_vm_template
-
-
-default_freebsd_task_template: &freebsd_task_template
-  env:
-    PLATFORM: freebsd
-  <<: *cirrus_community_vm_template
-
-
-default_windows_task_template: &windows_task_template
-  env:
-    PLATFORM: windows
-  <<: *cirrus_community_vm_template
-
-
-# macos workers provided by cirrus-ci
-default_macos_task_template: &macos_task_template
-  env:
-    PLATFORM: macos
-  macos_instance:
-    image: $IMAGE
-
-
 # What files to preserve in case tests fail
 on_failure_ac: &on_failure_ac
   log_artifacts:
diff --git a/src/tools/ci/README b/src/tools/ci/README
index 80d01939e84..30ddd200c96 100644
--- a/src/tools/ci/README
+++ b/src/tools/ci/README
@@ -65,3 +65,20 @@ messages. Currently the following controls are available:
 
   Only runs CI on operating systems specified. This can be useful when
   addressing portability issues affecting only a subset of platforms.
+
+
+Using custom compute resources for CI
+=====================================
+
+When running a lot of tests in a repository, cirrus-ci's free credits do not
+suffice. In those cases a repository can be configured to use other
+infrastructure for running tests. To do so, the REPO_CI_CONFIG_GIT_URL
+variable can be configured for the repository in the cirrus-ci web interface,
+at https://cirrus-ci.com/github/<user or organization>. The file referenced
+(see https://cirrus-ci.org/guide/programming-tasks/#fs) by the variable can
+overwrite the default execution method for different operating systems,
+defined in .cirrus.yml, by redefining the relevant yaml anchors.
+
+Custom compute resources can be provided using
+- https://cirrus-ci.org/guide/supported-computing-services/
+- https://cirrus-ci.org/guide/persistent-workers/
-- 
2.38.0

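Redefining an anchor works because, per the YAML specification, an alias refers to the most recent preceding node carrying that anchor name, so anchors need not be unique within a document. The rule can be illustrated with a deliberately naive Python sketch. It is a regex scan and not a YAML parser (it would be confused by `&` or `*` inside quoted strings); `resolve_aliases` and the sample configuration are hypothetical.

```python
import re

ANCHOR_RE = re.compile(r"&(\w+)")
ALIAS_RE = re.compile(r"\*(\w+)")


def resolve_aliases(yaml_text: str) -> dict:
    """Map the line number of each alias use to the line of the anchor it binds to."""
    latest = {}    # anchor name -> line number of its most recent definition
    resolved = {}  # line number of alias use -> line number of bound definition
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        line = line.split("#", 1)[0]  # crude comment stripping
        for name in ANCHOR_RE.findall(line):
            latest[name] = lineno     # a later definition replaces the earlier one
        for name in ALIAS_RE.findall(line):
            resolved[lineno] = latest[name]  # KeyError if used before definition
    return resolved


# Line 1 plays the role of .cirrus.yml, line 3 the per-repository override,
# line 6 a task from .cirrus.tasks.yml.
config = """\
default_linux_task_template: &linux_task_template
  compute_engine_instance: {}
gce_linux_task_template: &linux_task_template
  gce_instance: {}
task:
  <<: *linux_task_template
"""

print(resolve_aliases(config))  # → {6: 3}
```

The alias on line 6 binds to the definition on line 3 rather than line 1, which is why the override file has to be spliced in between .cirrus.yml and .cirrus.tasks.yml.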
From cae9e8fd2c257eaad0270f79b595eb70a743ab13 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 22 Aug 2023 21:45:31 -0700
Subject: [PATCH v3 06/10] ci: dontmerge: Example custom CI configuration

Uses a custom google compute account (freebsd, linux and windows) and
persistent workers (macos). This only works in repositories that are
configured for both.

In addition, cirrus-ci needs to be configured to set the
REPO_CI_CONFIG_GIT_URL pointing to the file added here.
---
 .cirrus.custom.yml | 49 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)
 create mode 100644 .cirrus.custom.yml

diff --git a/.cirrus.custom.yml b/.cirrus.custom.yml
new file mode 100644
index 00000000000..42cbeecc715
--- /dev/null
+++ b/.cirrus.custom.yml
@@ -0,0 +1,49 @@
+gcp_credentials:
+  workload_identity_provider: projects/1072892761768/locations/global/workloadIdentityPools/cirrus-ci-pool/providers/cirrus-oidc
+  service_account: cirrus...@pg-ci-runs.iam.gserviceaccount.com
+
+
+# Defaults
+gce_instance:
+  type: t2d-standard-4
+  spot: true
+  zone: us-west1-a
+  use_ssd: true
+
+
+gce_instance_template: &gce_instance_template
+  gce_instance:
+    image_project: $IMAGE_PROJECT
+    image_family: $IMAGE_FAMILY
+    platform: $PLATFORM
+    disk: $DISK_SIZE
+    platform: $PLATFORM
+
+
+gce_linux_task_template: &linux_task_template
+  env:
+    PLATFORM: linux
+  <<: *gce_instance_template
+
+
+gce_freebsd_task_template: &freebsd_task_template
+  env:
+    PLATFORM: freebsd
+  <<: *gce_instance_template
+
+
+gce_windows_task_template: &windows_task_template
+  env:
+    PLATFORM: windows
+  <<: *gce_instance_template
+
+
+persistent_worker_macos_task_template: &macos_task_template
+  env:
+    PLATFORM: macos
+  persistent_worker:
+    isolation:
+      tart:
+        image: $IMAGE
+        user: admin
+        password: admin
-- 
2.38.0

From 12a0287a38ed77529be2cf8a8a37d772459fa090 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Thu, 2 Feb 2023 21:51:53 -0800
Subject: [PATCH v3 07/10] Use "template" initdb in tests

Discussion: https://postgr.es/m/20220120021859.3zpsfqn4z7ob7...@alap3.anarazel.de
---
 meson.build                              | 30 ++++++++++
 src/test/perl/PostgreSQL/Test/Cluster.pm | 46 ++++++++++++++-
 src/test/regress/pg_regress.c            | 74 ++++++++++++++++++------
 .cirrus.tasks.yml                        |  3 +-
 src/Makefile.global.in                   | 52 +++++++++--------
 5 files changed, 161 insertions(+), 44 deletions(-)

diff --git a/meson.build b/meson.build
index f5ec442f9a9..8b2b521a013 100644
--- a/meson.build
+++ b/meson.build
@@ -3070,8 +3070,10 @@ testport = 40000
 test_env = environment()
 
 temp_install_bindir = test_install_location / get_option('bindir')
+test_initdb_template = meson.build_root() / 'tmp_install' / 'initdb-template'
 test_env.set('PG_REGRESS', pg_regress.full_path())
 test_env.set('REGRESS_SHLIB', regress_module.full_path())
+test_env.set('INITDB_TEMPLATE', test_initdb_template)
 
 # Test suites that are not safe by default but can be run if selected
 # by the user via the whitespace-separated list in variable PG_TEST_EXTRA.
@@ -3086,6 +3088,34 @@ if library_path_var != ''
 endif
 
 
+# Create (and remove old) initdb template directory. Tests use that, where
+# possible, to make it cheaper to run tests.
+#
+# Use python to remove the old cached initdb, as we cannot rely on a working
+# 'rm' binary on windows.
+test('initdb_cache',
+     python,
+     args: [
+       '-c', '''
+import shutil
+import sys
+import subprocess
+
+shutil.rmtree(sys.argv[1], ignore_errors=True)
+sp = subprocess.run(sys.argv[2:] + [sys.argv[1]])
+sys.exit(sp.returncode)
+''',
+       test_initdb_template,
+       temp_install_bindir / 'initdb',
+       '-A', 'trust', '-N', '--no-instructions'
+     ],
+     priority: setup_tests_priority - 1,
+     timeout: 300,
+     is_parallel: false,
+     env: test_env,
+     suite: ['setup'])
+
+
 
 ###############################################################
 # Test Generation
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 5e161dbee60..4d449c35de9 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -522,8 +522,50 @@ sub init
 	mkdir $self->backup_dir;
 	mkdir $self->archive_dir;
 
-	PostgreSQL::Test::Utils::system_or_bail('initdb', '-D', $pgdata, '-A',
-		'trust', '-N', @{ $params{extra} });
+	# If available and if there aren't any parameters, use a previously
+	# initdb'd cluster as a template by copying it. For a lot of tests, that's
+	# substantially cheaper. Do so only if there aren't parameters, it doesn't
+	# seem worth figuring out whether they affect compatibility.
+	#
+	# There's very similar code in pg_regress.c, but we can't easily
+	# deduplicate it until we require perl at build time.
+	if (defined $params{extra} or !defined $ENV{INITDB_TEMPLATE})
+	{
+		note("initializing database system by running initdb");
+		PostgreSQL::Test::Utils::system_or_bail('initdb', '-D', $pgdata, '-A',
+			'trust', '-N', @{ $params{extra} });
+	}
+	else
+	{
+		my @copycmd;
+		my $expected_exitcode;
+
+		note("initializing database system by copying initdb template");
+
+		if ($PostgreSQL::Test::Utils::windows_os)
+		{
+			@copycmd = qw(robocopy /E /NJS /NJH /NFL /NDL /NP);
+			$expected_exitcode = 1;    # 1 denotes files were copied
+		}
+		else
+		{
+			@copycmd = qw(cp -a);
+			$expected_exitcode = 0;
+		}
+
+		@copycmd = (@copycmd, $ENV{INITDB_TEMPLATE}, $pgdata);
+
+		my $ret = PostgreSQL::Test::Utils::system_log(@copycmd);
+
+		# See http://perldoc.perl.org/perlvar.html#%24CHILD_ERROR
+		if ($ret & 127 or $ret >> 8 != $expected_exitcode)
+		{
+			BAIL_OUT(
+				sprintf("failed to execute command \"%s\": $ret",
+					join(" ", @copycmd)));
+		}
+	}
+
 	PostgreSQL::Test::Utils::system_or_bail($ENV{PG_REGRESS}, '--config-auth',
 		$pgdata, @{ $params{auth_extra} });
 
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index b68632320a7..407e3915cec 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -2295,6 +2295,7 @@ regression_main(int argc, char *argv[],
 		FILE	   *pg_conf;
 		const char *env_wait;
 		int			wait_seconds;
+		const char *initdb_template_dir;
 
 		/*
 		 * Prepare the temp instance
@@ -2316,25 +2317,64 @@ regression_main(int argc, char *argv[],
 		if (!directory_exists(buf))
 			make_directory(buf);
 
-		/* initdb */
 		initStringInfo(&cmd);
-		appendStringInfo(&cmd,
-						 "\"%s%sinitdb\" -D \"%s/data\" --no-clean --no-sync",
-						 bindir ? bindir : "",
-						 bindir ? "/" : "",
-						 temp_instance);
-		if (debug)
-			appendStringInfo(&cmd, " --debug");
-		if (nolocale)
-			appendStringInfo(&cmd, " --no-locale");
-		appendStringInfo(&cmd, " > \"%s/log/initdb.log\" 2>&1", outputdir);
-		fflush(NULL);
-		if (system(cmd.data))
+
+		/*
+		 * Create data directory.
+		 *
+		 * If available, use a previously initdb'd cluster as a template by
+		 * copying it. For a lot of tests, that's substantially cheaper.
+		 *
+		 * There's very similar code in Cluster.pm, but we can't easily
+		 * de-duplicate it until we require perl at build time.
+		 */
+		initdb_template_dir = getenv("INITDB_TEMPLATE");
+		if (initdb_template_dir == NULL || nolocale || debug)
 		{
-			bail("initdb failed\n"
-				 "# Examine \"%s/log/initdb.log\" for the reason.\n"
-				 "# Command was: %s",
-				 outputdir, cmd.data);
+			note("initializing database system by running initdb");
+
+			appendStringInfo(&cmd,
+							 "\"%s%sinitdb\" -D \"%s/data\" --no-clean --no-sync",
+							 bindir ? bindir : "",
+							 bindir ? "/" : "",
+							 temp_instance);
+			if (debug)
+				appendStringInfo(&cmd, " --debug");
+			if (nolocale)
+				appendStringInfo(&cmd, " --no-locale");
+			appendStringInfo(&cmd, " > \"%s/log/initdb.log\" 2>&1", outputdir);
+			fflush(NULL);
+			if (system(cmd.data))
+			{
+				bail("initdb failed\n"
+					 "# Examine \"%s/log/initdb.log\" for the reason.\n"
+					 "# Command was: %s",
+					 outputdir, cmd.data);
+			}
+		}
+		else
+		{
+#ifndef WIN32
+			const char *copycmd = "cp -a \"%s\" \"%s/data\"";
+			int			expected_exitcode = 0;
+#else
+			const char *copycmd = "robocopy /E /NJS /NJH /NFL /NDL /NP \"%s\" \"%s/data\"";
+			int			expected_exitcode = 1;	/* 1 denotes files were copied */
+#endif
+
+			note("initializing database system by copying initdb template");
+
+			appendStringInfo(&cmd,
+							 copycmd,
+							 initdb_template_dir,
+							 temp_instance);
+			if (system(cmd.data) != expected_exitcode)
+			{
+				bail("copying of initdb template failed\n"
+					 "# Examine \"%s/log/initdb.log\" for the reason.\n"
+					 "# Command was: %s",
+					 outputdir, cmd.data);
+			}
 		}
 
 		pfree(cmd.data);
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 0cf7ba77996..e137769850d 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -109,8 +109,9 @@ task:
   test_minimal_script: |
     su postgres <<-EOF
       ulimit -c unlimited
+      meson test $MTEST_ARGS --suite setup
       meson test $MTEST_ARGS --num-processes ${TEST_JOBS} \
-        tmp_install cube/regress pg_ctl/001_start_stop
+        cube/regress pg_ctl/001_start_stop
     EOF
 
   on_failure:
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index df9f721a41a..0b4ca0eb6ae 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -397,30 +397,6 @@ check: temp-install
 
 .PHONY: temp-install
 
-temp-install: | submake-generated-headers
-ifndef NO_TEMP_INSTALL
-ifneq ($(abs_top_builddir),)
-ifeq ($(MAKELEVEL),0)
-	rm -rf '$(abs_top_builddir)'/tmp_install
-	$(MKDIR_P) '$(abs_top_builddir)'/tmp_install/log
-	$(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install install >'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
-	$(MAKE) -j1 $(if $(CHECKPREP_TOP),-C $(CHECKPREP_TOP),) checkprep >>'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
-endif
-endif
-endif
-
-# Tasks to run serially at the end of temp-install.  Some EXTRA_INSTALL
-# entries appear more than once in the tree, and parallel installs of the same
-# file can fail with EEXIST.
-checkprep:
-	$(if $(EXTRA_INSTALL),for extra in $(EXTRA_INSTALL); do $(MAKE) -C '$(top_builddir)'/$$extra DESTDIR='$(abs_top_builddir)'/tmp_install install || exit; done)
-
-PROVE = @PROVE@
-# There are common routines in src/test/perl, and some test suites have
-# extra perl modules in their own directory.
-PG_PROVE_FLAGS = -I $(top_srcdir)/src/test/perl/ -I $(srcdir)
-# User-supplied prove flags such as --verbose can be provided in PROVE_FLAGS.
-PROVE_FLAGS =
 
 # prepend to path if already set, else just set it
 define add_to_path
@@ -437,8 +413,36 @@ ld_library_path_var = LD_LIBRARY_PATH
 with_temp_install = \
 	PATH="$(abs_top_builddir)/tmp_install$(bindir):$(CURDIR):$$PATH" \
 	$(call add_to_path,$(strip $(ld_library_path_var)),$(abs_top_builddir)/tmp_install$(libdir)) \
+	INITDB_TEMPLATE='$(abs_top_builddir)'/tmp_install/initdb-template \
 	$(with_temp_install_extra)
 
+temp-install: | submake-generated-headers
+ifndef NO_TEMP_INSTALL
+ifneq ($(abs_top_builddir),)
+ifeq ($(MAKELEVEL),0)
+	rm -rf '$(abs_top_builddir)'/tmp_install
+	$(MKDIR_P) '$(abs_top_builddir)'/tmp_install/log
+	$(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install install >'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
+	$(MAKE) -j1 $(if $(CHECKPREP_TOP),-C $(CHECKPREP_TOP),) checkprep >>'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
+
+	$(with_temp_install) initdb -A trust -N --no-instructions '$(abs_top_builddir)'/tmp_install/initdb-template >>'$(abs_top_builddir)'/tmp_install/log/initdb-template.log 2>&1
+endif
+endif
+endif
+
+# Tasks to run serially at the end of temp-install.  Some EXTRA_INSTALL
+# entries appear more than once in the tree, and parallel installs of the same
+# file can fail with EEXIST.
+checkprep:
+	$(if $(EXTRA_INSTALL),for extra in $(EXTRA_INSTALL); do $(MAKE) -C '$(top_builddir)'/$$extra DESTDIR='$(abs_top_builddir)'/tmp_install install || exit; done)
+
+PROVE = @PROVE@
+# There are common routines in src/test/perl, and some test suites have
+# extra perl modules in their own directory.
+PG_PROVE_FLAGS = -I $(top_srcdir)/src/test/perl/ -I $(srcdir)
+# User-supplied prove flags such as --verbose can be provided in PROVE_FLAGS.
+PROVE_FLAGS =
+
 ifeq ($(enable_tap_tests),yes)
 
 ifndef PGXS
-- 
2.38.0

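The decision made in Cluster.pm and pg_regress.c above (copy the template unless options were requested or no template is configured) can be condensed into a small Python sketch. It is an illustration under assumptions, not a port of either implementation: `init_data_dir` and `run_initdb` are hypothetical names, and `shutil.copytree` is swapped in for `cp -a` / `robocopy`, so the platform-specific exit-code handling (robocopy returns 1 when files were copied) does not appear here.

```python
import os
import shutil
import subprocess
from typing import Callable, Optional, Sequence


def run_initdb(pgdata: str, extra: Sequence[str]) -> None:
    """Slow path: run a real initdb (assumes the binary is on PATH)."""
    subprocess.run(["initdb", "-D", pgdata, "-A", "trust", "-N", *extra], check=True)


def init_data_dir(
    pgdata: str,
    extra: Optional[Sequence[str]] = None,
    initdb: Callable[[str, Sequence[str]], None] = run_initdb,
) -> str:
    """Create `pgdata`; return which path was taken ("initdb" or "template")."""
    template = os.environ.get("INITDB_TEMPLATE")
    # As in Cluster.pm: any explicitly passed options force a real initdb,
    # since they might make the template incompatible.
    if extra is not None or template is None:
        initdb(pgdata, extra or [])
        return "initdb"
    # Fast path: copy the pre-built cluster. The destination must not exist yet.
    shutil.copytree(template, pgdata, symlinks=True)
    return "template"
```

A test harness would call `init_data_dir(path)` for the common case and `init_data_dir(path, extra=[...])` when it needs non-default initdb options. Passing a stub as `initdb` keeps the sketch exercisable without a PostgreSQL installation.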
From a30d92a68b4e7a925cc9fccb3c83674f2c6407af Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 7 Aug 2023 17:27:51 -0700
Subject: [PATCH v3 08/10] ci: switch tasks to debugoptimized build

In aggregate the CI tasks burn a lot of cpu hours. Compared to that,
easy-to-read backtraces aren't as important. Still use -ggdb where
appropriate, as that does make backtraces more reliable, particularly in the
face of optimization.

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
Backpatch:
---
 .cirrus.tasks.yml | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index e137769850d..d1730ce08a8 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -136,7 +136,7 @@ task:
     CCACHE_DIR: /tmp/ccache_dir
 
     CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DCOPY_PARSE_PLAN_TREES -DWRITE_READ_PARSE_PLAN_TREES -DRAW_EXPRESSION_COVERAGE_TEST -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
-    CFLAGS: -Og -ggdb
+    CFLAGS: -ggdb
 
   <<: *freebsd_task_template
 
@@ -171,7 +171,7 @@ task:
   configure_script: |
     su postgres <<-EOF
       meson setup \
-        --buildtype=debug \
+        --buildtype=debugoptimized \
         -Dcassert=true -Duuid=bsd -Dtcl_version=tcl86 -Ddtrace=auto \
         -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
         -Dextra_lib_dirs=/usr/local/lib -Dextra_include_dirs=/usr/local/include/ \
@@ -266,7 +266,7 @@ task:
     ASAN_OPTIONS: print_stacktrace=1:disable_coredump=0:abort_on_error=1:detect_leaks=0
 
     # SANITIZER_FLAGS is set in the tasks below
-    CFLAGS: -Og -ggdb -fno-sanitize-recover=all $SANITIZER_FLAGS
+    CFLAGS: -ggdb -fno-sanitize-recover=all $SANITIZER_FLAGS
     CXXFLAGS: $CFLAGS
     LDFLAGS: $SANITIZER_FLAGS
     CC: ccache gcc
@@ -356,7 +356,7 @@ task:
       configure_script: |
         su postgres <<-EOF
           meson setup \
-            --buildtype=debug \
+            --buildtype=debugoptimized \
             -Dcassert=true \
             ${LINUX_MESON_FEATURES} \
             -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
@@ -369,7 +369,7 @@ task:
         su postgres <<-EOF
           export CC='ccache gcc -m32'
           meson setup \
-            --buildtype=debug \
+            --buildtype=debugoptimized \
             -Dcassert=true \
             ${LINUX_MESON_FEATURES} \
             -Dllvm=disabled \
@@ -427,8 +427,8 @@ task:
     CC: ccache cc
     CXX: ccache c++
 
-    CFLAGS: -Og -ggdb
-    CXXFLAGS: -Og -ggdb
+    CFLAGS: -ggdb
+    CXXFLAGS: -ggdb
 
   <<: *macos_task_template
 
@@ -479,7 +479,7 @@ task:
   configure_script: |
     export PKG_CONFIG_PATH="/opt/local/lib/pkgconfig/"
     meson setup \
-      --buildtype=debug \
+      --buildtype=debugoptimized \
       -Dextra_include_dirs=/opt/local/include \
       -Dextra_lib_dirs=/opt/local/lib \
       -Dcassert=true \
--
2.38.0

From df832576932cd3f8dc8e966670eb171353242680 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 7 Aug 2023 16:56:29 -0700
Subject: [PATCH v3 09/10] ci: windows: Disabling write cache flushing during test

This has been measured to reduce windows test times by about 30s.
---
 .cirrus.tasks.yml                    |  6 ++++++
 src/tools/ci/windows_write_cache.ps1 | 20 ++++++++++++++++++++
 2 files changed, 26 insertions(+)
 create mode 100644 src/tools/ci/windows_write_cache.ps1

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index d1730ce08a8..360b1c775fd 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -547,6 +547,12 @@ task:
   setup_additional_packages_script: |
     REM choco install -y --no-progress ...
 
+  # Define the write cache to be power protected. This reduces the rate of
+  # cache flushes, which seems to help metadata heavy workloads on NTFS. We're
+  # just testing here anyway, so ...
+  change_write_caching_script:
+    - powershell src/tools/ci/windows_write_cache.ps1 2>&1
+
   setup_hosts_file_script: |
     echo 127.0.0.1 pg-loadbalancetest >> c:\Windows\System32\Drivers\etc\hosts
     echo 127.0.0.2 pg-loadbalancetest >> c:\Windows\System32\Drivers\etc\hosts
diff --git a/src/tools/ci/windows_write_cache.ps1 b/src/tools/ci/windows_write_cache.ps1
new file mode 100644
index 00000000000..5c67b3ce54b
--- /dev/null
+++ b/src/tools/ci/windows_write_cache.ps1
@@ -0,0 +1,20 @@
+# Define the write cache to be power protected. This reduces the rate of cache
+# flushes, which seems to help metadata heavy workloads on NTFS. We're just
+# testing here anyway, so ...
+#
+# Let's do so for all disks, this could be useful beyond cirrus-ci.
+
+Set-Location "HKLM:/SYSTEM/CurrentControlSet/Enum/SCSI";
+
+Get-ChildItem -Path "*/*" | foreach-object {
+    Push-Location;
+    cd /$_;
+    pwd;
+    cd 'Device Parameters';
+    if (!(Test-Path -Path "Disk")) {
+        New-Item -Path "Disk";
+    }
+
+    Set-ItemProperty -Path Disk -Type DWord -name CacheIsPowerProtected -Value 1;
+    Pop-Location;
+}
--
2.38.0

From c7b2c9d9664cc03fc997230c6162c6920181ca85 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 7 Aug 2023 16:51:16 -0700
Subject: [PATCH v3 10/10] regress: Check for postgres startup completion more often

Previously pg_regress.c only checked whether the server started up once a
second - in most cases startup is much faster though. Use the same interval as
pg_ctl does.
---
 src/test/regress/pg_regress.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 407e3915cec..46af1fddbdb 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -75,6 +75,9 @@ const char *pretty_diff_opts = "-w -U3";
  */
 #define TESTNAME_WIDTH 36
 
+/* how often to recheck if postgres startup completed */
+#define WAITS_PER_SEC 10
+
 typedef enum TAPtype
 {
 	DIAG = 0,
@@ -2499,7 +2502,7 @@ regression_main(int argc, char *argv[],
 		else
 			wait_seconds = 60;
 
-		for (i = 0; i < wait_seconds; i++)
+		for (i = 0; i < wait_seconds * WAITS_PER_SEC; i++)
 		{
 			/* Done if psql succeeds */
 			fflush(NULL);
@@ -2519,7 +2522,7 @@ regression_main(int argc, char *argv[],
 						 outputdir);
 			}
 
-			pg_usleep(1000000L);
+			pg_usleep(1000000L / WAITS_PER_SEC);
 		}
 		if (i >= wait_seconds)
 		{
--
2.38.0
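
The polling change in pg_regress.c can be modelled in a few lines. This is a Python sketch, not the C code from the patch: is_ready and sleep are hypothetical stand-ins for the psql probe and pg_usleep(). One detail the sketch makes explicit: once the loop counter counts polls instead of seconds, the timeout test after the loop has to be scaled the same way. The context line `if (i >= wait_seconds)` in the last hunk is unchanged, so it presumably wants `wait_seconds * WAITS_PER_SEC` as well.

```python
import time

WAITS_PER_SEC = 10  # polls per second, as in the patch


def wait_for_startup(is_ready, wait_seconds=60, sleep=time.sleep):
    """Poll is_ready() up to wait_seconds * WAITS_PER_SEC times.

    Returns True once is_ready() succeeds, False on timeout. is_ready and
    sleep are hypothetical stand-ins for the psql probe and pg_usleep().
    """
    max_polls = wait_seconds * WAITS_PER_SEC
    for _ in range(max_polls):
        if is_ready():
            return True
        sleep(1.0 / WAITS_PER_SEC)
    # Reached only after max_polls failed probes, i.e. the counter hit
    # wait_seconds * WAITS_PER_SEC rather than wait_seconds.
    return False
```

With wait_seconds=3, a server that becomes ready on the 25th probe is reported as started; comparing the poll counter against the unscaled wait_seconds would misreport that case as a timeout.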