Re: Updated CI using GitHub actions

2020-11-11 Thread Tim Düsterhus
Ilya,

Am 11.11.20 um 20:52 schrieb Илья Шипицин:
> as for running some jobs on schedule, does current "generated" config
> support schedule in some way ?
> or do you mean classic yml definition
> 

Yes, it does. GitHub exposes the workflow trigger as
'github.event_name'. We can pass this as a parameter to the Python
script and then use a simple if condition. That's the magic of a
programming language.

I've made an example:

https://github.com/TimWolla/haproxy/commit/770ffd31f3c99b13bc9e36b82dc9723c34afdb4b

The workflow that ran on push:

https://github.com/TimWolla/haproxy/runs/1387351894?check_suite_focus=true#step:3:4

I would have expected the scheduled job to run on 21:25 UTC, but
apparently it did not start. Maybe they are delayed a bit depending on
load. Maybe it pops up here in a minutes. It should include the
dedicated compression jobs then:
https://github.com/TimWolla/haproxy/actions?query=workflow%3AVTest

Best regards
Tim Düsterhus



Re: Updated CI using GitHub actions

2020-11-11 Thread Tim Düsterhus
Ilya,

Am 11.11.20 um 20:48 schrieb Илья Шипицин:
>>> (few things like 51 degree, prometheus, PCRE2 to be discussed later)
>>>
>>
>> - Put 51d, Prometheus into the "all features" tests, that's what they
>> are for.
>>
> 
> 51d has 2 different implementations: "pattern" and "trie".
> 

I guess for 51d it is sufficient to test one of both. I expect any
necessary changes to affect both of them. We can't really test 51d
anyway, because it's just a compile test. Using one of them is "good
enough".

>> - PCRE 2 should get a dedicated pair of tests testing only regex similar
>> to slz + zlib that only test compression.
>>
> 
> that's questionable.
> I think we might encounter more issues if we run slz or pcre2 together with
> other features.

I disagree here. The differences between slz / gzip and pcre / pcre2 are
pretty localized. They are not magically going to break SSL.

Thus I prefer to optimize the builds for easier understanding. If you
see that slz fails and gzip does not then it's immediately obvious. Any
more features enabled make it less obvious.

> but I do not think it is good to waste someone electricity
> 

That's why I suggested to run the less important builds on a schedule.
Even then the GitHub action VMs are much quicker. The Ubuntu, clang,
gz=slz=1 build finishes in 1:58 minutes total with 42 seconds spent
installing build dependencies. With the amount of pushes HAProxy does
daily a $5/month VPS would probably be sufficient to run all the builds
we do and with GitHub Actions we can use our own VPS if necessary.

Best regards
Tim Düsterhus



Re: Updated CI using GitHub actions

2020-11-11 Thread Илья Шипицин
as for running some jobs on schedule, does current "generated" config
support schedule in some way ?
or do you mean classic yml definition

ср, 11 нояб. 2020 г. в 23:59, Tim Düsterhus :

> Ilya,
>
> Am 11.11.20 um 19:38 schrieb Илья Шипицин:
> > let us discuss next steps :)
> >
> > Ubuntu, gcc, all features
> > Ubuntu, gcc, ssl=stock
> > both of jobs use stock ssl. do we really need second one (ssl enabled, no
> > other features enabled) ? I would second one (same for clang)
>
> Yes, we need both. I'd like to have both to be able to directly compare
> any differences between the various SSL libraries. Compare the following
> example:
>
> - Ubuntu, gcc, all features: Fails.
> - Ubuntu, gcc, ssl=stock: Works.
> - Ubuntu, gcc, ssl=libressl=3.0.2: Fails.
>
> -> There are probably two separate issues. One with LibreSSL and one
> that is not related to SSL. This is more useful than:
>
> - Ubuntu, gcc, all features: Fails.
> - Ubuntu, gcc, ssl=libressl=3.0.2: Fails.
>
> -> This might be an issue with SSL in general or two separate issues. We
> don't know because we have nothing to compare.
>
> - Ubuntu, gcc, all features: Works.
> - Ubuntu, gcc, ssl=stock: Fails.
>
> -> There is probably some issue with SSL combined with something else.
>
> - Ubuntu, gcc, all features: Fails.
> - Ubuntu, gcc, ssl=stock: Works.
>
> -> There is probably some issue that is not related to SSL.
>
> - Ubuntu, gcc, all features: Fails.
> - Ubuntu, gcc, ssl=stock: Works.
> - Ubuntu, gcc, ssl=libressl=3.0.2: Fails.
>
> -> There is probably some issue in LibreSSL only.
>
> > Ubuntu, gcc, gz=slz=1
> > this one is "slz only, no other features". should we run slz + all
> features
> > enabled instead ? (same for clang)
>
> No. As with SSL this is intentional, testing only the specific
> difference of using slz instead of zlib in a "controlled" environment.
> There is no need to run, e.g. all the SSL tests with slz, because they
> don't enable compression anyway.
>
> >
> > Ubuntu, gcc, gz=zlib=1
> > should we drop this one in favour of "gcc all features enabled" ?
>
> No. Same reason. Specific controlled environment as a direct comparison
> to easily detect differences that are caused by switching out the
> compression library. If zlib fails and slz does not we can't easily see
> if it is caused by a bug in zlib integration or by some issue in the
> combination of the various features.
>
> ---
>
> I would be fine with running e.g. the SLZ test / LibreSSL tests only in
> cron to keep resource usage low. But we must keep them separate to keep
> things simple and easy to understand instead of implicitly testing stuff
> by including them somewhere else. That was one of my biggest concerns
> with the current Travis situation.
>
> >
> > Ubuntu, gcc, ssl=libressl=3.0.2
> > I think we can drop this one as well
>
> Okay.
>
> >
> > (few things like 51 degree, prometheus, PCRE2 to be discussed later)
> >
>
> - Put 51d, Prometheus into the "all features" tests, that's what they
> are for.
> - PCRE 2 should get a dedicated pair of tests testing only regex similar
> to slz + zlib that only test compression.
>
> Best regards
> Tim Düsterhus
>


Re: Updated CI using GitHub actions

2020-11-11 Thread Илья Шипицин
ср, 11 нояб. 2020 г. в 23:59, Tim Düsterhus :

> Ilya,
>
> Am 11.11.20 um 19:38 schrieb Илья Шипицин:
> > let us discuss next steps :)
> >
> > Ubuntu, gcc, all features
> > Ubuntu, gcc, ssl=stock
> > both of jobs use stock ssl. do we really need second one (ssl enabled, no
> > other features enabled) ? I would second one (same for clang)
>
> Yes, we need both. I'd like to have both to be able to directly compare
> any differences between the various SSL libraries. Compare the following
> example:
>
> - Ubuntu, gcc, all features: Fails.
> - Ubuntu, gcc, ssl=stock: Works.
> - Ubuntu, gcc, ssl=libressl=3.0.2: Fails.
>
> -> There are probably two separate issues. One with LibreSSL and one
> that is not related to SSL. This is more useful than:
>
> - Ubuntu, gcc, all features: Fails.
> - Ubuntu, gcc, ssl=libressl=3.0.2: Fails.
>
> -> This might be an issue with SSL in general or two separate issues. We
> don't know because we have nothing to compare.
>
> - Ubuntu, gcc, all features: Works.
> - Ubuntu, gcc, ssl=stock: Fails.
>
> -> There is probably some issue with SSL combined with something else.
>
> - Ubuntu, gcc, all features: Fails.
> - Ubuntu, gcc, ssl=stock: Works.
>
> -> There is probably some issue that is not related to SSL.
>
> - Ubuntu, gcc, all features: Fails.
> - Ubuntu, gcc, ssl=stock: Works.
> - Ubuntu, gcc, ssl=libressl=3.0.2: Fails.
>
> -> There is probably some issue in LibreSSL only.
>
> > Ubuntu, gcc, gz=slz=1
> > this one is "slz only, no other features". should we run slz + all
> features
> > enabled instead ? (same for clang)
>
> No. As with SSL this is intentional, testing only the specific
> difference of using slz instead of zlib in a "controlled" environment.
> There is no need to run, e.g. all the SSL tests with slz, because they
> don't enable compression anyway.
>
> >
> > Ubuntu, gcc, gz=zlib=1
> > should we drop this one in favour of "gcc all features enabled" ?
>
> No. Same reason. Specific controlled environment as a direct comparison
> to easily detect differences that are caused by switching out the
> compression library. If zlib fails and slz does not we can't easily see
> if it is caused by a bug in zlib integration or by some issue in the
> combination of the various features.
>
> ---
>
> I would be fine with running e.g. the SLZ test / LibreSSL tests only in
> cron to keep resource usage low. But we must keep them separate to keep
> things simple and easy to understand instead of implicitly testing stuff
> by including them somewhere else. That was one of my biggest concerns
> with the current Travis situation.
>
> >
> > Ubuntu, gcc, ssl=libressl=3.0.2
> > I think we can drop this one as well
>
> Okay.
>
>
> > (few things like 51 degree, prometheus, PCRE2 to be discussed later)
> >
>
> - Put 51d, Prometheus into the "all features" tests, that's what they
> are for.
>

51d has 2 different implementations: "pattern" and "trie".


> - PCRE 2 should get a dedicated pair of tests testing only regex similar
> to slz + zlib that only test compression.
>

that's questionable.
I think we might encounter more issues if we run slz or pcre2 together with
other features.
but I do not think it is good to waste someone electricity


>
> Best regards
> Tim Düsterhus
>


Re: Updated CI using GitHub actions

2020-11-11 Thread Tim Düsterhus
Ilya,

Am 11.11.20 um 19:38 schrieb Илья Шипицин:
> let us discuss next steps :)
> 
> Ubuntu, gcc, all features
> Ubuntu, gcc, ssl=stock
> both of jobs use stock ssl. do we really need second one (ssl enabled, no
> other features enabled) ? I would second one (same for clang)

Yes, we need both. I'd like to have both to be able to directly compare
any differences between the various SSL libraries. Compare the following
example:

- Ubuntu, gcc, all features: Fails.
- Ubuntu, gcc, ssl=stock: Works.
- Ubuntu, gcc, ssl=libressl=3.0.2: Fails.

-> There are probably two separate issues. One with LibreSSL and one
that is not related to SSL. This is more useful than:

- Ubuntu, gcc, all features: Fails.
- Ubuntu, gcc, ssl=libressl=3.0.2: Fails.

-> This might be an issue with SSL in general or two separate issues. We
don't know because we have nothing to compare.

- Ubuntu, gcc, all features: Works.
- Ubuntu, gcc, ssl=stock: Fails.

-> There is probably some issue with SSL combined with something else.

- Ubuntu, gcc, all features: Fails.
- Ubuntu, gcc, ssl=stock: Works.

-> There is probably some issue that is not related to SSL.

- Ubuntu, gcc, all features: Fails.
- Ubuntu, gcc, ssl=stock: Works.
- Ubuntu, gcc, ssl=libressl=3.0.2: Fails.

-> There is probably some issue in LibreSSL only.

> Ubuntu, gcc, gz=slz=1
> this one is "slz only, no other features". should we run slz + all features
> enabled instead ? (same for clang)

No. As with SSL this is intentional, testing only the specific
difference of using slz instead of zlib in a "controlled" environment.
There is no need to run, e.g. all the SSL tests with slz, because they
don't enable compression anyway.

> 
> Ubuntu, gcc, gz=zlib=1
> should we drop this one in favour of "gcc all features enabled" ?

No. Same reason. Specific controlled environment as a direct comparison
to easily detect differences that are caused by switching out the
compression library. If zlib fails and slz does not we can't easily see
if it is caused by a bug in zlib integration or by some issue in the
combination of the various features.

---

I would be fine with running e.g. the SLZ test / LibreSSL tests only in
cron to keep resource usage low. But we must keep them separate to keep
things simple and easy to understand instead of implicitly testing stuff
by including them somewhere else. That was one of my biggest concerns
with the current Travis situation.

> 
> Ubuntu, gcc, ssl=libressl=3.0.2
> I think we can drop this one as well

Okay.

> 
> (few things like 51 degree, prometheus, PCRE2 to be discussed later)
> 

- Put 51d, Prometheus into the "all features" tests, that's what they
are for.
- PCRE 2 should get a dedicated pair of tests testing only regex similar
to slz + zlib that only test compression.

Best regards
Tim Düsterhus



Re: Updated CI using GitHub actions

2020-11-11 Thread Илья Шипицин
let us discuss next steps :)

Ubuntu, gcc, all features
Ubuntu, gcc, ssl=stock
both of jobs use stock ssl. do we really need second one (ssl enabled, no
other features enabled) ? I would second one (same for clang)


Ubuntu, gcc, gz=slz=1
this one is "slz only, no other features". should we run slz + all features
enabled instead ? (same for clang)


Ubuntu, gcc, gz=zlib=1
should we drop this one in favour of "gcc all features enabled" ?


Ubuntu, gcc, ssl=libressl=3.0.2
I think we can drop this one as well


(few things like 51 degree, prometheus, PCRE2 to be discussed later)



ср, 11 нояб. 2020 г. в 09:34, Willy Tarreau :

> On Tue, Nov 10, 2020 at 10:30:52PM +0100, Tim Düsterhus wrote:
> (...)
> > Let me (or Ilya) know if you have any questions or if you notice any
> > issues with it. Personally I'm super happy with how it turned out :-)
>
> Many thanks to you and Ilya for handling this. I know for having followed
> your exchanges that it was not trivial, and I really appreciate that you
> entirely offloaded me from this painful task. We'll see how it goes over
> the long term, especially if many projects move from Travis to Github
> Actions and start to slow things down as we've observed on Travis over
> time (let's hope not!).
>
> I guess we'll soon remove the redundant builds from Travis and only keep
> the unusual ones (ppc64, s390x, arm64).
>
> Cheers,
> Willy
>


Re: Updated CI using GitHub actions

2020-11-10 Thread Willy Tarreau
On Tue, Nov 10, 2020 at 10:30:52PM +0100, Tim Düsterhus wrote:
(...)
> Let me (or Ilya) know if you have any questions or if you notice any
> issues with it. Personally I'm super happy with how it turned out :-)

Many thanks to you and Ilya for handling this. I know for having followed
your exchanges that it was not trivial, and I really appreciate that you
entirely offloaded me from this painful task. We'll see how it goes over
the long term, especially if many projects move from Travis to Github
Actions and start to slow things down as we've observed on Travis over
time (let's hope not!).

I guess we'll soon remove the redundant builds from Travis and only keep
the unusual ones (ppc64, s390x, arm64).

Cheers,
Willy



Re: Updated CI using GitHub actions

2020-11-10 Thread William Lallemand
On Tue, Nov 10, 2020 at 10:30:52PM +0100, Tim Düsterhus wrote:
> Let me (or Ilya) know if you have any questions or if you notice any
> issues with it. Personally I'm super happy with how it turned out :-)
> 

Thanks to both of you, the whole thing is cleaner and quicker from my
point of view :-)

-- 
William Lallemand



Updated CI using GitHub actions

2020-11-10 Thread Tim Düsterhus
Hi List,

this is a kind-of follow-up for my previous email from July:
https://www.mail-archive.com/haproxy@formilux.org/msg38032.html.

You might or might not have noticed that Travis became a bit slow in the
last months and a few days ago they announced a new pricing model,
limiting minutes even for Open Source projects:
https://blog.travis-ci.com/2020-11-02-travis-ci-new-billing

Travis was great to get started with CI for HAProxy and definitely
helped to catch quite a few bugs, but its future appears to be a bit
unclear by now.

Due to this I worked with Ilya in the last days to clean up my patch and
fix some of the limitations. Willy pushed it this morning after Ilya was
happy with it. I learned quite a bit about the details of ASAN and SSL,
but Ilya is the expert there :-)

Besides better performance GitHub Actions improves the user experience
with regard to maintenance and integration within GitHub.

I'd like to give a short run down of the changes to get you up to speed
with the new workflow.

1. Test results are directly visible on GitHub.

Most notably each entry within the GitHub Actions build matrix creates
an entry within the status dropdown at the commit. There is no need to
leave GitHub to see which build configuration exactly failed (e.g. only
a specific SSL library). With Travis there was a binary Yes / No for all
Travis tests.

(see build_status.png)

2. Automated extraction of information from the build log.

GitHub Actions support so called "problem matchers". They are usually
used to annotate specific source code lines within a pull request. I
(ab)used them to parse out the specific reg-tests that failed. The
definition of the parser can be found here:
https://github.com/haproxy/haproxy/blob/master/.github/vtest.json

You can find the failing tests in the overview page for a specific
action run in the "Annotations" action:
https://github.com/haproxy/haproxy/actions/runs/31522

(see annotations.png)

3. Output grouping in the build log.

GitHub Actions allows to easily group the output of the whole test and
build run, while Travis only had a long log file.

The output is now grouped by the step within the process (e.g. compile
VTest or compile HAProxy) and also by a logical subgroup (e.g. a
specific reg-test).

As an example you can easily view the output of a single reg-test
without needing to search the start and end of the relevant output.

To do so you go into the logs for a specific job, click "Show test
results" and then click the the Test case you are interested in:

https://github.com/haproxy/haproxy/runs/1378860942?check_suite_focus=true#step:14:125

(see output_group.png)

4. Cleaner build configuration.

I took care to make the GitHub Actions configuration as maintainable as
possible, by splitting everything into logical steps and generating the
build matrix automatically instead of hardcoding all combinations. I've
put comments for non-obvious stuff so that you don't have to search the
GitHub documentation to understand it.

Find all the details in the commit:
https://github.com/haproxy/haproxy/commit/288c0772e6565cf42cb07b354b9c3fd7edaf5459

-

There might be a few more adjustments in the next days to fix any
upcoming issues (it's going to be interesting once the actual failures
are fixed and everything is green). After this email you should be able
to find your way around the new stuff, though.

Travis is probably slowly going away for the important tests. It will
still how the non-amd64 architectures for the foreseeable future,
because GitHub Actions does not support those to the best of my
knowledge. Travis also drives the Coverity scans, but those should be
possible to migrate easyish.

Let me (or Ilya) know if you have any questions or if you notice any
issues with it. Personally I'm super happy with how it turned out :-)

Best regards
Tim Düsterhus