Re: Use windows VMs instead of windows containers on the CI
On Fri, Feb 3, 2023 at 3:27 PM Thomas Munro wrote: > > On Fri, Feb 3, 2023 at 6:57 PM Andres Freund wrote: > > And pushed! I think an improvement in CI times of this degree is pretty > > awesome. > > +1 > > A lot of CI compute time is saved. The Cirrus account[1] was > previously hitting the 4 job limit all day long, and now it's often > running 1 or 2 jobs when I look, and it has space capacity to start a > new job immediately if someone posts a new patch. I'll monitor it > over the next few days but it looks great. > > [1] https://cirrus-ci.com/github/postgresql-cfbot/postgresql Oh, wow! This commit drastically improved testing time on Windows. It was Windows tests that were always behind in my github repo's CI, now I can see it got much faster. Thanks for working on this. Windows tests were taking around 27min (https://cirrus-ci.com/build/6060448332644352) before the patch, but it came down to 13min (https://cirrus-ci.com/build/6528980879147008) after the patch - Yay! 2X improvement :). -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Use windows VMs instead of windows containers on the CI
On Fri, Feb 3, 2023 at 6:57 PM Andres Freund wrote: > And pushed! I think an improvement in CI times of this degree is pretty > awesome. +1 A lot of CI compute time is saved. The Cirrus account[1] was previously hitting the 4 job limit all day long, and now it's often running 1 or 2 jobs when I look, and it has space capacity to start a new job immediately if someone posts a new patch. I'll monitor it over the next few days but it looks great. [1] https://cirrus-ci.com/github/postgresql-cfbot/postgresql
Re: Use windows VMs instead of windows containers on the CI
Hi, On 2023-02-02 17:47:37 +0300, Nazir Bilal Yavuz wrote: > On 1/12/2023 3:30 AM, Andres Freund wrote: > > Yea, there's a problem where packer on windows doesn't seem to abort after a > > powershell script error out. The reason isn't yet quiete clear. I think > > Bilal > > is working on a workaround. > > > That should be fixed now. Also, adding a patch for PG15. There were > conflicts while applying current patch to the REL_15_STABLE branch. And pushed! I think an improvement in CI times of this degree is pretty awesome. Unfortunately I also noticed that the tap tests on mingw don't run anymore, due to IPC::Run not being available. But it's independent of this change. I don't know when that broke. Could you check it out? Greetings, Andres Freund
Re: Use windows VMs instead of windows containers on the CI
Hi, On 1/12/2023 3:30 AM, Andres Freund wrote: Yea, there's a problem where packer on windows doesn't seem to abort after a powershell script error out. The reason isn't yet quiete clear. I think Bilal is working on a workaround. That should be fixed now. Also, adding a patch for PG15. There were conflicts while applying current patch to the REL_15_STABLE branch. Regards, Nazir Bilal Yavuz Microsoft From ba88fe8effc1400f08bfa4fcdc44862faf2977d4 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Tue, 10 Jan 2023 13:58:39 +0300 Subject: [PATCH v2] [master] Use windows VMs instead of windows containers --- .cirrus.yml | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/.cirrus.yml b/.cirrus.yml index 69837bcd5a..5700b8cd66 100644 --- a/.cirrus.yml +++ b/.cirrus.yml @@ -549,8 +549,10 @@ task: depends_on: SanityCheck only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*' - windows_container: -image: $CONTAINER_REPO/windows_ci_vs_2019:latest + compute_engine_instance: +image_project: $IMAGE_PROJECT +image: family/pg-ci-windows-ci-vs-2019 +platform: windows cpu: $CPUS memory: 4G @@ -589,8 +591,10 @@ task: # otherwise it'll be sorted before other tasks depends_on: SanityCheck - windows_container: -image: $CONTAINER_REPO/windows_ci_mingw64:latest + compute_engine_instance: +image_project: $IMAGE_PROJECT +image: family/pg-ci-windows-ci-mingw64 +platform: windows cpu: $CPUS memory: 4G -- 2.25.1 From bcde952b53c1e14a7eddf39542368c3dd7a97c13 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Thu, 2 Feb 2023 16:37:17 +0300 Subject: [PATCH v2] [REL_15] Use windows VMs instead of windows containers --- .cirrus.yml | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/.cirrus.yml b/.cirrus.yml index ab7043c54e..cf0fe29a14 100644 --- a/.cirrus.yml +++ b/.cirrus.yml @@ -400,8 +400,10 @@ task: only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*' - windows_container: -image: $CONTAINER_REPO/windows_ci_vs_2019:latest + compute_engine_instance: +image_project: $IMAGE_PROJECT +image: family/pg-ci-windows-ci-vs-2019 +platform: windows cpu: $CPUS memory: 4G -- 2.25.1
Re: Use windows VMs instead of windows containers on the CI
Hi, On 2023-01-11 17:21:21 -0600, Justin Pryzby wrote: > On Tue, Jan 10, 2023 at 03:20:18PM +0300, Nazir Bilal Yavuz wrote: > > Hi, > > > > I propose using windows VMs instead of containers, the patch is attached. > > Currently, windows containers are used on the CI, but these container images > > are needs to get pulled on every CI run, also they are slow to run. > > > @@ -589,8 +591,10 @@ task: > ># otherwise it'll be sorted before other tasks > >depends_on: SanityCheck > > > > - windows_container: > > -image: $CONTAINER_REPO/windows_ci_mingw64:latest > > + compute_engine_instance: > > +image_project: $IMAGE_PROJECT > > +image: family/pg-ci-windows-ci-mingw64 > > +platform: windows > > cpu: $CPUS > > memory: 4G > > It looks like MinGW currently doesn't have the necessary perl modules: > > [19:58:46.356] Message: Can't locate IPC/Run.pm in @INC (you may need to > install the IPC::Run module) (@INC contains: > C:/msys64/ucrt64/lib/perl5/site_perl/5.32.1 > C:/msys64/ucrt64/lib/perl5/site_perl/5.32.1 > C:/msys64/ucrt64/lib/perl5/site_perl C:/msys64/ucrt64/lib/perl5/vendor_perl > C:/msys64/ucrt64/lib/perl5/core_perl) at config/check_modules.pl line 11. > [19:58:46.356] BEGIN failed--compilation aborted at config/check_modules.pl > line 11. > [19:58:46.356] meson.build:1337: WARNING: Additional Perl modules are > required to run TAP tests. > > That could be caused by a transient failure combined with bad error > handling - if there's an error while building the image, it shouldn't be > uploaded. Yea, there's a problem where packer on windows doesn't seem to abort after a powershell script error out. The reason isn't yet quiete clear. I think Bilal is working on a workaround. Greetings, Andres Freund
Re: Use windows VMs instead of windows containers on the CI
On Tue, Jan 10, 2023 at 03:20:18PM +0300, Nazir Bilal Yavuz wrote: > Hi, > > I propose using windows VMs instead of containers, the patch is attached. > Currently, windows containers are used on the CI, but these container images > are needs to get pulled on every CI run, also they are slow to run. > @@ -589,8 +591,10 @@ task: ># otherwise it'll be sorted before other tasks >depends_on: SanityCheck > > - windows_container: > -image: $CONTAINER_REPO/windows_ci_mingw64:latest > + compute_engine_instance: > +image_project: $IMAGE_PROJECT > +image: family/pg-ci-windows-ci-mingw64 > +platform: windows > cpu: $CPUS > memory: 4G It looks like MinGW currently doesn't have the necessary perl modules: [19:58:46.356] Message: Can't locate IPC/Run.pm in @INC (you may need to install the IPC::Run module) (@INC contains: C:/msys64/ucrt64/lib/perl5/site_perl/5.32.1 C:/msys64/ucrt64/lib/perl5/site_perl/5.32.1 C:/msys64/ucrt64/lib/perl5/site_perl C:/msys64/ucrt64/lib/perl5/vendor_perl C:/msys64/ucrt64/lib/perl5/core_perl) at config/check_modules.pl line 11. [19:58:46.356] BEGIN failed--compilation aborted at config/check_modules.pl line 11. [19:58:46.356] meson.build:1337: WARNING: Additional Perl modules are required to run TAP tests. That could be caused by a transient failure combined with bad error handling - if there's an error while building the image, it shouldn't be uploaded. -- Justin
Re: Use windows VMs instead of windows containers on the CI
On Wed, Jan 11, 2023 at 8:20 AM Andres Freund wrote: > On 2023-01-10 09:22:12 -0600, Justin Pryzby wrote: > > > There is more than 2x speed gain when VMs are used. > > > > One consideration is that if windows runs twice as fast, we'll suddenly > > start using twice as many resources at cirrus/google/amazon - the > > windows task has been throttling everything else. Not sure if we should > > to do anything beyond the limits that cfbot already uses. > > I'm not sure we would. cfbot has a time based limit for how often it tries to > rebuild entries, and I think we were just about keeping up with that. In which > case we shouldn't, on average, schedule more jobs than we currently > do. Although peak "job throughput" would be higher. > > Thomas? It currently tries to re-test each patch every 24 hours, but doesn't achieve that. It looks like it's currently re-testing every ~30 hours. Justin's right, we'll consume more non-Windows resources if Windows speeds up, but not 2x, more like 1.25x when cfbot's own throttling kicks in. Or I could change the cycle target to 36 or 48 hours, to spread the work out more. Back-of-a-napkin maths: * there are currently 240 entries in a testable status * it takes ~0.5 hours to test (because that's the slow Windows time) * therefore it takes ~120 hours to test them all * but we can do 4 at a time, so that's ~30 hours to get through them all and start again * that matches what we see: cfbot=> select created - lag(created) over (order by created) from branch where submission_id = 4068; ?column? --- 1 day 06:30:00.265047 1 day 05:43:59.978949 1 day 04:13:59.754048 1 day 05:28:59.811916 1 day 07:00:00.651655 (6 rows) If, with this change, we can test in only ~0.25 hours, then we'll only need 60 hours of Cirrus time to test them all. With a target of re-testing every 24 hours, it should now only have to run ~2.5 jobs at all times. Having free slots would be kind to Cirrus, and also lower the latency when a new patch is posted (which currently has to wait for a free slot before it can begin). Great news.
Re: Use windows VMs instead of windows containers on the CI
Hi, On 2023-01-10 09:22:12 -0600, Justin Pryzby wrote: > > There is more than 2x speed gain when VMs are used. > > One consideration is that if windows runs twice as fast, we'll suddenly > start using twice as many resources at cirrus/google/amazon - the > windows task has been throttling everything else. Not sure if we should > to do anything beyond the limits that cfbot already uses. I'm not sure we would. cfbot has a time based limit for how often it tries to rebuild entries, and I think we were just about keeping up with that. In which case we shouldn't, on average, schedule more jobs than we currently do. Although peak "job throughput" would be higher. Thomas? Greetings, Andres Freund
Re: Use windows VMs instead of windows containers on the CI
On Tue, Jan 10, 2023 at 03:20:18PM +0300, Nazir Bilal Yavuz wrote: > Hi, > > I propose using windows VMs instead of containers, the patch is attached. > Currently, windows containers are used on the CI, but these container images > are needs to get pulled on every CI run, also they are slow to run. +1 > There is more than 2x speed gain when VMs are used. One consideration is that if windows runs twice as fast, we'll suddenly start using twice as many resources at cirrus/google/amazon - the windows task has been throttling everything else. Not sure if we should to do anything beyond the limits that cfbot already uses. -- Justin
Re: Use windows VMs instead of windows containers on the CI
Hi, It didn't work again. Sending numbers until I figure out how to solve this. Scheduling Step: VM + VS 2019: 00.17m Container + VS 2019: 03.51m VM + MinGW64: 00.16m Container + MinGW64: 04.28m Execution step: VM + VS 2019: 12.16m Container + VS 2019: 26.02m VM + MinGW64: 07.55m Container + MinGW64: 16.34m Sorry for the multiple mails. Regards, Nazir Bilal Yavuz Microsoft
Re: Use windows VMs instead of windows containers on the CI
Hi, Tables didn't seem nice on web interface. Re-sending with correct formatting. Scheduling step: | VS 2019 | MinGW64 -- VM | 00:17m | 00:16m -- Container | 03:51m | 04:28m Execution step: | VS 2019 | MinGW64 -- VM | 12:16m| 07:55m -- Container | 26:02m | 16:34m Regards, Nazir Bilal Yavuz Microsoft
Use windows VMs instead of windows containers on the CI
Hi, I propose using windows VMs instead of containers, the patch is attached. Currently, windows containers are used on the CI, but these container images are needs to get pulled on every CI run, also they are slow to run. These VM images are created in the same way how container images are created [1]. The comparison between VMs and containers are (based on d952373a98 and with same numbers of CPU and memory): Scheduling step: VS 2019 MinGW64 VM [2] 00:17m 00:16m Container [3] 03:51m 04:28m Execution step: VS 2019 MinGW64 VM [2] 12:16m 07.55m Container [3] 26:02m 16:34m There is more than 2x speed gain when VMs are used. [1] https://github.com/anarazel/pg-vm-images/blob/main/packer/windows.pkr.hcl [2] https://cirrus-ci.com/build/4720774045499392 [3] https://cirrus-ci.com/build/5468256027279360 Regards, Nazir Bilal Yavuz Microsoft From 6981319d054f0736e474bc315ab094094d159979 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Tue, 10 Jan 2023 13:58:39 +0300 Subject: [PATCH] Use windows VMs instead of windows containers --- .cirrus.yml | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/.cirrus.yml b/.cirrus.yml index 69837bcd5a..5700b8cd66 100644 --- a/.cirrus.yml +++ b/.cirrus.yml @@ -549,8 +549,10 @@ task: depends_on: SanityCheck only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*' - windows_container: -image: $CONTAINER_REPO/windows_ci_vs_2019:latest + compute_engine_instance: +image_project: $IMAGE_PROJECT +image: family/pg-ci-windows-ci-vs-2019 +platform: windows cpu: $CPUS memory: 4G @@ -589,8 +591,10 @@ task: # otherwise it'll be sorted before other tasks depends_on: SanityCheck - windows_container: -image: $CONTAINER_REPO/windows_ci_mingw64:latest + compute_engine_instance: +image_project: $IMAGE_PROJECT +image: family/pg-ci-windows-ci-mingw64 +platform: windows cpu: $CPUS memory: 4G -- 2.25.1