Re: Use windows VMs instead of windows containers on the CI

2023-02-04 Thread Bharath Rupireddy
On Fri, Feb 3, 2023 at 3:27 PM Thomas Munro  wrote:
>
> On Fri, Feb 3, 2023 at 6:57 PM Andres Freund  wrote:
> > And pushed!  I think an improvement in CI times of this degree is pretty
> > awesome.
>
> +1
>
> A lot of CI compute time is saved.  The Cirrus account[1] was
> previously hitting the 4 job limit all day long, and now it's often
> running 1 or 2 jobs when I look, and it has space capacity to start a
> new job immediately if someone posts a new patch.  I'll monitor it
> over the next few days but it looks great.
>
> [1] https://cirrus-ci.com/github/postgresql-cfbot/postgresql

Oh, wow! This commit drastically improved testing time on Windows. It
was Windows tests that were always behind in my github repo's CI, now
I can see it got much faster. Thanks for working on this.

Windows tests were taking around 27min
(https://cirrus-ci.com/build/6060448332644352) before the patch, but
it came down to 13min (https://cirrus-ci.com/build/6528980879147008)
after the patch - Yay! 2X improvement :).

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Use windows VMs instead of windows containers on the CI

2023-02-03 Thread Thomas Munro
On Fri, Feb 3, 2023 at 6:57 PM Andres Freund  wrote:
> And pushed!  I think an improvement in CI times of this degree is pretty
> awesome.

+1

A lot of CI compute time is saved.  The Cirrus account[1] was
previously hitting the 4 job limit all day long, and now it's often
running 1 or 2 jobs when I look, and it has space capacity to start a
new job immediately if someone posts a new patch.  I'll monitor it
over the next few days but it looks great.

[1] https://cirrus-ci.com/github/postgresql-cfbot/postgresql




Re: Use windows VMs instead of windows containers on the CI

2023-02-02 Thread Andres Freund
Hi,

On 2023-02-02 17:47:37 +0300, Nazir Bilal Yavuz wrote:
> On 1/12/2023 3:30 AM, Andres Freund wrote:
> > Yea, there's a problem where packer on windows doesn't seem to abort after a
> > powershell script error out. The reason isn't yet quiete clear. I think 
> > Bilal
> > is working on a workaround.
> 
> 
> That should be fixed now. Also, adding a patch for PG15. There were
> conflicts while applying current patch to the REL_15_STABLE branch.

And pushed!  I think an improvement in CI times of this degree is pretty
awesome.


Unfortunately I also noticed that the tap tests on mingw don't run
anymore, due to IPC::Run not being available. But it's independent of
this change. I don't know when that broke. Could you check it out?

Greetings,

Andres Freund




Re: Use windows VMs instead of windows containers on the CI

2023-02-02 Thread Nazir Bilal Yavuz

Hi,


On 1/12/2023 3:30 AM, Andres Freund wrote:

Yea, there's a problem where packer on windows doesn't seem to abort after a
powershell script error out. The reason isn't yet quiete clear. I think Bilal
is working on a workaround.



That should be fixed now. Also, adding a patch for PG15. There were 
conflicts while applying current patch to the REL_15_STABLE branch.



Regards,
Nazir Bilal Yavuz
Microsoft
From ba88fe8effc1400f08bfa4fcdc44862faf2977d4 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Tue, 10 Jan 2023 13:58:39 +0300
Subject: [PATCH v2] [master] Use windows VMs instead of windows containers

---
 .cirrus.yml | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 69837bcd5a..5700b8cd66 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -549,8 +549,10 @@ task:
   depends_on: SanityCheck
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || 
$CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*'
 
-  windows_container:
-image: $CONTAINER_REPO/windows_ci_vs_2019:latest
+  compute_engine_instance:
+image_project: $IMAGE_PROJECT
+image: family/pg-ci-windows-ci-vs-2019
+platform: windows
 cpu: $CPUS
 memory: 4G
 
@@ -589,8 +591,10 @@ task:
   # otherwise it'll be sorted before other tasks
   depends_on: SanityCheck
 
-  windows_container:
-image: $CONTAINER_REPO/windows_ci_mingw64:latest
+  compute_engine_instance:
+image_project: $IMAGE_PROJECT
+image: family/pg-ci-windows-ci-mingw64
+platform: windows
 cpu: $CPUS
 memory: 4G
 
-- 
2.25.1

From bcde952b53c1e14a7eddf39542368c3dd7a97c13 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Thu, 2 Feb 2023 16:37:17 +0300
Subject: [PATCH v2] [REL_15] Use windows VMs instead of windows containers

---
 .cirrus.yml | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index ab7043c54e..cf0fe29a14 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -400,8 +400,10 @@ task:
 
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || 
$CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*'
 
-  windows_container:
-image: $CONTAINER_REPO/windows_ci_vs_2019:latest
+  compute_engine_instance:
+image_project: $IMAGE_PROJECT
+image: family/pg-ci-windows-ci-vs-2019
+platform: windows
 cpu: $CPUS
 memory: 4G
 
-- 
2.25.1



Re: Use windows VMs instead of windows containers on the CI

2023-01-11 Thread Andres Freund
Hi,

On 2023-01-11 17:21:21 -0600, Justin Pryzby wrote:
> On Tue, Jan 10, 2023 at 03:20:18PM +0300, Nazir Bilal Yavuz wrote:
> > Hi,
> > 
> > I propose using windows VMs instead of containers, the patch is attached.
> > Currently, windows containers are used on the CI, but these container images
> > are needs to get pulled on every CI run, also they are slow to run.
> 
> > @@ -589,8 +591,10 @@ task:
> ># otherwise it'll be sorted before other tasks
> >depends_on: SanityCheck
> >  
> > -  windows_container:
> > -image: $CONTAINER_REPO/windows_ci_mingw64:latest
> > +  compute_engine_instance:
> > +image_project: $IMAGE_PROJECT
> > +image: family/pg-ci-windows-ci-mingw64
> > +platform: windows
> >  cpu: $CPUS
> >  memory: 4G
> 
> It looks like MinGW currently doesn't have the necessary perl modules:
> 
> [19:58:46.356] Message: Can't locate IPC/Run.pm in @INC (you may need to 
> install the IPC::Run module) (@INC contains: 
> C:/msys64/ucrt64/lib/perl5/site_perl/5.32.1 
> C:/msys64/ucrt64/lib/perl5/site_perl/5.32.1 
> C:/msys64/ucrt64/lib/perl5/site_perl C:/msys64/ucrt64/lib/perl5/vendor_perl 
> C:/msys64/ucrt64/lib/perl5/core_perl) at config/check_modules.pl line 11.
> [19:58:46.356] BEGIN failed--compilation aborted at config/check_modules.pl 
> line 11.
> [19:58:46.356] meson.build:1337: WARNING: Additional Perl modules are 
> required to run TAP tests.
> 
> That could be caused by a transient failure combined with bad error
> handling - if there's an error while building the image, it shouldn't be
> uploaded.

Yea, there's a problem where packer on windows doesn't seem to abort after a
powershell script error out. The reason isn't yet quiete clear. I think Bilal
is working on a workaround.

Greetings,

Andres Freund




Re: Use windows VMs instead of windows containers on the CI

2023-01-11 Thread Justin Pryzby
On Tue, Jan 10, 2023 at 03:20:18PM +0300, Nazir Bilal Yavuz wrote:
> Hi,
> 
> I propose using windows VMs instead of containers, the patch is attached.
> Currently, windows containers are used on the CI, but these container images
> are needs to get pulled on every CI run, also they are slow to run.

> @@ -589,8 +591,10 @@ task:
># otherwise it'll be sorted before other tasks
>depends_on: SanityCheck
>  
> -  windows_container:
> -image: $CONTAINER_REPO/windows_ci_mingw64:latest
> +  compute_engine_instance:
> +image_project: $IMAGE_PROJECT
> +image: family/pg-ci-windows-ci-mingw64
> +platform: windows
>  cpu: $CPUS
>  memory: 4G

It looks like MinGW currently doesn't have the necessary perl modules:

[19:58:46.356] Message: Can't locate IPC/Run.pm in @INC (you may need to 
install the IPC::Run module) (@INC contains: 
C:/msys64/ucrt64/lib/perl5/site_perl/5.32.1 
C:/msys64/ucrt64/lib/perl5/site_perl/5.32.1 
C:/msys64/ucrt64/lib/perl5/site_perl C:/msys64/ucrt64/lib/perl5/vendor_perl 
C:/msys64/ucrt64/lib/perl5/core_perl) at config/check_modules.pl line 11.
[19:58:46.356] BEGIN failed--compilation aborted at config/check_modules.pl 
line 11.
[19:58:46.356] meson.build:1337: WARNING: Additional Perl modules are required 
to run TAP tests.

That could be caused by a transient failure combined with bad error
handling - if there's an error while building the image, it shouldn't be
uploaded.

-- 
Justin




Re: Use windows VMs instead of windows containers on the CI

2023-01-10 Thread Thomas Munro
On Wed, Jan 11, 2023 at 8:20 AM Andres Freund  wrote:
> On 2023-01-10 09:22:12 -0600, Justin Pryzby wrote:
> > > There is more than 2x speed gain when VMs are used.
> >
> > One consideration is that if windows runs twice as fast, we'll suddenly
> > start using twice as many resources at cirrus/google/amazon - the
> > windows task has been throttling everything else.  Not sure if we should
> > to do anything beyond the limits that cfbot already uses.
>
> I'm not sure we would. cfbot has a time based limit for how often it tries to
> rebuild entries, and I think we were just about keeping up with that. In which
> case we shouldn't, on average, schedule more jobs than we currently
> do. Although peak "job throughput" would be higher.
>
> Thomas?

It currently tries to re-test each patch every 24 hours, but doesn't
achieve that.  It looks like it's currently re-testing every ~30
hours.  Justin's right, we'll consume more non-Windows resources if
Windows speeds up, but not 2x, more like 1.25x when cfbot's own
throttling kicks in.  Or I could change the cycle target to 36 or 48
hours, to spread the work out more.

Back-of-a-napkin maths:

 * there are currently 240 entries in a testable status
 * it takes ~0.5 hours to test (because that's the slow Windows time)
 * therefore it takes ~120 hours to test them all
 * but we can do 4 at a time, so that's ~30 hours to get through them
all and start again
 * that matches what we see:

cfbot=> select created - lag(created) over (order by created) from
branch where submission_id = 4068;
   ?column?
---

 1 day 06:30:00.265047
 1 day 05:43:59.978949
 1 day 04:13:59.754048
 1 day 05:28:59.811916
 1 day 07:00:00.651655
(6 rows)

If, with this change, we can test in only ~0.25 hours, then we'll only
need 60 hours of Cirrus time to test them all.  With a target of
re-testing every 24 hours, it should now only have to run ~2.5 jobs at
all times.  Having free slots would be kind to Cirrus, and also lower
the latency when a new patch is posted (which currently has to wait
for a free slot before it can begin).  Great news.




Re: Use windows VMs instead of windows containers on the CI

2023-01-10 Thread Andres Freund
Hi,

On 2023-01-10 09:22:12 -0600, Justin Pryzby wrote:
> > There is more than 2x speed gain when VMs are used.
> 
> One consideration is that if windows runs twice as fast, we'll suddenly
> start using twice as many resources at cirrus/google/amazon - the
> windows task has been throttling everything else.  Not sure if we should
> to do anything beyond the limits that cfbot already uses.

I'm not sure we would. cfbot has a time based limit for how often it tries to
rebuild entries, and I think we were just about keeping up with that. In which
case we shouldn't, on average, schedule more jobs than we currently
do. Although peak "job throughput" would be higher.

Thomas?

Greetings,

Andres Freund




Re: Use windows VMs instead of windows containers on the CI

2023-01-10 Thread Justin Pryzby
On Tue, Jan 10, 2023 at 03:20:18PM +0300, Nazir Bilal Yavuz wrote:
> Hi,
> 
> I propose using windows VMs instead of containers, the patch is attached.
> Currently, windows containers are used on the CI, but these container images
> are needs to get pulled on every CI run, also they are slow to run.

+1

> There is more than 2x speed gain when VMs are used.

One consideration is that if windows runs twice as fast, we'll suddenly
start using twice as many resources at cirrus/google/amazon - the
windows task has been throttling everything else.  Not sure if we should
to do anything beyond the limits that cfbot already uses.

-- 
Justin




Re: Use windows VMs instead of windows containers on the CI

2023-01-10 Thread Nazir Bilal Yavuz

Hi,

It didn't work again. Sending numbers until I figure out how to solve this.

Scheduling Step:

VM + VS 2019: 00.17m
Container + VS 2019: 03.51m

VM + MinGW64: 00.16m
Container + MinGW64: 04.28m


Execution step:

VM + VS 2019: 12.16m
Container + VS 2019: 26.02m

VM + MinGW64: 07.55m
Container + MinGW64: 16.34m


Sorry for the multiple mails.


Regards,
Nazir Bilal Yavuz
Microsoft


Re: Use windows VMs instead of windows containers on the CI

2023-01-10 Thread Nazir Bilal Yavuz

Hi,

Tables didn't seem nice on web interface. Re-sending with correct 
formatting.


Scheduling step:

| VS 2019 | MinGW64
--
VM | 00:17m | 00:16m
--
Container | 03:51m | 04:28m
Execution step:

| VS 2019 | MinGW64
--
VM | 12:16m| 07:55m
--
Container | 26:02m | 16:34m


Regards,
Nazir Bilal Yavuz
Microsoft


Use windows VMs instead of windows containers on the CI

2023-01-10 Thread Nazir Bilal Yavuz

Hi,

I propose using windows VMs instead of containers, the patch is 
attached. Currently, windows containers are used on the CI, but these 
container images are needs to get pulled on every CI run, also they are 
slow to run.


These VM images are created in the same way how container images are 
created [1].


The comparison between VMs and containers are (based on d952373a98 and 
with same numbers of CPU and memory):


Scheduling step:


VS 2019
MinGW64
VM [2]
00:17m
00:16m
Container [3]
03:51m  04:28m

Execution step:


VS 2019
MinGW64
VM [2]
12:16m
07.55m
Container [3]
26:02m  16:34m

There is more than 2x speed gain when VMs are used.

[1] 
https://github.com/anarazel/pg-vm-images/blob/main/packer/windows.pkr.hcl

[2] https://cirrus-ci.com/build/4720774045499392
[3] https://cirrus-ci.com/build/5468256027279360

Regards,
Nazir Bilal Yavuz
Microsoft
From 6981319d054f0736e474bc315ab094094d159979 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Tue, 10 Jan 2023 13:58:39 +0300
Subject: [PATCH] Use windows VMs instead of windows containers

---
 .cirrus.yml | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 69837bcd5a..5700b8cd66 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -549,8 +549,10 @@ task:
   depends_on: SanityCheck
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || 
$CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*'
 
-  windows_container:
-image: $CONTAINER_REPO/windows_ci_vs_2019:latest
+  compute_engine_instance:
+image_project: $IMAGE_PROJECT
+image: family/pg-ci-windows-ci-vs-2019
+platform: windows
 cpu: $CPUS
 memory: 4G
 
@@ -589,8 +591,10 @@ task:
   # otherwise it'll be sorted before other tasks
   depends_on: SanityCheck
 
-  windows_container:
-image: $CONTAINER_REPO/windows_ci_mingw64:latest
+  compute_engine_instance:
+image_project: $IMAGE_PROJECT
+image: family/pg-ci-windows-ci-mingw64
+platform: windows
 cpu: $CPUS
 memory: 4G
 
-- 
2.25.1