Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-21 Thread Alex Bennée


Wainer dos Santos Moschetta  writes:

> Hi all,

>> Conclusion
>> ==
>>
>> I think generally the state of QEMU's CI has improved over the last few
>> years but we still have a number of challenges caused by its distributed
>> nature and test stability. We are still reactive to failures rather
>> than having something fast and reliable enough to gate changes going
>> into the code base. This results in fairly long periods when one
>> or more parts of the testing mosaic are stuck on red waiting for fixes
>> to finally get merged back into master.
>>
>> So what do people think? Have I missed anything out? What else can we do
>> to improve the situation?
>>
>> Let the discussion begin ;-)
>
> I want to help improve QEMU CI, and in fact I can commit some time
> to do so. But since I'm new to the community and have made just a few
> contributions, I'm in the position of only trying to understand what
> we have in place now.
>
> So allow me to put this in a different perspective. I took some notes
> on the CI workflows we have. They follow below along with some
> comments and questions:
>
> 
> Besides being distributed across CI providers, there are different CI
> workflows being executed at each stage of the development process.
>
> - Developer tests before sending the patch to the mailing-list
>   Each developer has their own recipe.
>   It can be as simple as `make check[-TEST-SUITE]` locally, or
> Docker-based `make docker-*` tests.

The make docker-* tests mostly cover building on other distros where
there might be subtle differences. The tests themselves are the same
make check-FOO as before.
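
For illustration, the invocations look something like this (the exact
set of images and test targets available depends on the checkout):

  # run the quick check suite inside a Fedora container
  make docker-test-quick@fedora
  # cross-compile build test using the mingw toolchain in the same image
  make docker-test-mingw@fedora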

>   It seems not widely used but some may also push to GitHub/GitLab
> with triggers to the cloud provider.
>
>   What kind of improvements can we make here?
>   Perhaps (somehow) automate the GitHub/GitLab + cloud provider
> trigger workflow?

We have a mechanism that can already do that with patchew. But I'm not
sure how much automation can be done for developers given they need to
have accounts on the relevant services. Once that is done however it
really is just a few git pushes.
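
As a rough sketch (the remote names here are just an assumption; use
whichever services you have accounts on):

  # one-off setup of the extra remotes
  git remote add github git@github.com:<your-user>/qemu.git
  git remote add gitlab git@gitlab.com:<your-user>/qemu.git
  # pushing a branch is what triggers the CI runs
  git push github my-feature-branch
  git push gitlab my-feature-branch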

>   Being able to locally reproduce a failure that happened on a cloud
> provider, particularly for failures that occurred in later stages of
> development (see below), would be highly appreciated.

In theory yes, in practice it seems our CI providers are quite good at
producing failures under load. I've run tests that fail on Travis tens
of thousands of times locally without incident. The reproductions I've
done recently have all been on VMs where I've constrained memory and
vCPUs and then very heavily loaded them. It seems like most developers
are blessed with beefy boxes that rarely show up these problems.

What would be more useful is being able to debug the failure that
occurred on the CI system. Either by:

  a) having some sort of access to the failed system

  The original Travis setup didn't really support that but I think there
  may be options now. I haven't really looked into the other CI setups
  yet. They may be better off. Certainly if we can augment CI with our
  own runners they are easier to give developers access to.

  b) upload the failure artefacts *somewhere*

  Quite a lot of these failures should be dumping core. Maybe if we can
  upload the core, associated binary, config.log and commit id
  somewhere, we can then do a bit more post-mortem on what went wrong.

  c) dump more information in the CI logs

  An alternative to uploading would be some sort of clean-up script
  which could at least dump backtraces of cores in the logs; a rough
  sketch of such a script follows below.
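
Something along these lines, as a minimal sketch (the binary path and
core file pattern are assumptions that would need adjusting to the real
build tree):

  # dump a backtrace for every core file left behind by the test run
  QEMU_BINARY=./x86_64-softmmu/qemu-system-x86_64
  for core in $(find . -name 'core*' -type f); do
      echo "=== backtrace for $core ==="
      gdb --batch -ex 'thread apply all bt' "$QEMU_BINARY" "$core"
  done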

>
> - Developer sends a patch to the mailing-list
>   Patchew pushes the patch to GitHub, runs tests (checkpatch, asan,
> docker-clang@ubuntu, docker-mingw@fedora)
>   It reports to the ML on failure. Shouldn't it send an email on
> success as well, so that it creates awareness of CI?

Patchew has been a little inconsistent of late with its notifications.
Maybe a simple email with a "Just so you know patchew has run all its
tests on this and it's fine" wouldn't be considered too noisy?

> - Maintainer tests their branch before the pull-request
>   Like developers, it seems each one has their own recipe that may
> (or may not) trigger a CI provider.

Usually the same set of normal checks plus any particular hand-crafted
tests that might be appropriate for the patches included. For example
for all of Emilio's scaling patches I ran a lot of stress tests by hand.
They are only semi-automated because it's not something I'd do for most
branches.

> - Maintainer sends a pull-request to the mailing-list
>   Again patchew kicks in. It seems it runs the same tests. Am I right?
>   It also sends email to the mailing-list only on failure.

Yes - although generally a PR is a collection of patches so it's
technically a new tree state to test.

> - Peter runs tests for each PR
>   IIUC not integrated with any CI provider yet.
>   Likely here we have the most complete scenario in terms of coverage
> (several hosts, 

Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-21 Thread Alex Bennée


Cleber Rosa  writes:

> On Thu, Mar 14, 2019 at 03:57:06PM +, Alex Bennée wrote:
>>
>> Hi,
>>
>> As we approach stabilisation for 4.0 I thought it would be worth doing a
>> review of the current state of CI and stimulate some discussion of where
>> it is working for us and what could be improved.
>>
>> Testing in Build System
>> ===
>>
>> Things seem to be progressing well in this respect. More and more tests
>> have been added into the main source tree and they are only a make
>> invocation away. These include:
>>
>>   check  (includes unit, qapi-schema, qtest and decodetree)
>>   check-tcg  (now with system mode tests!)
>>   check-softfloat
>>   check-block
>>   check-acceptance
>>
>> Personally check-acceptance is the area I've looked at the least but
>> this seems to be the best place for "full life cycle" tests like booting
>> kernels and running stress and performance tests. I'm still a little
>> unsure how we deal with prebuilt kernels and images here though. Are
>> they basically provided by 3rd parties from their websites? Do we mirror
>> any of the artefacts we use for these tests?
>
> While it's possible to add any sort of files alongside the tests, and
> "get it"[1] from the test[2], this is certainly not desirable for
> kernels and other similarly large blobs.  The current approach is to
> use well known URLs[3] and download[4][5] those at test run time.
>
> Those are cached locally, automatically on the first run and reused on
> subsequent executions.  The caching is helpful for development
> environments, but is usually irrelevant to CI environments, where
> you'd more often than not get a new machine (or a clean environment).
>
> For now I would, also for the sake of simplicity, keep relying on 3rd
> party websites until they prove to be unreliable.  This adds
> transparency and reproducibility well beyond what can be achieved if we
> attempt to mirror them to a QEMU sponsored/official location IMO.

I think this is fine for "well-known" artefacts. Any distro kernel is
reproducible if you go through the appropriate steps. But we don't want
to repeat the mistakes of:

  https://wiki.qemu.org/Testing/System_Images

which is a fairly random collection of stuff. At least the Advent
Calendar images have a bit more documentation with them.

>
>>
>> One area of concern is how well this all sits with KVM (and other HW
>> accelerators) and how that gets tested. With my ARM hat on I don't
>> really see any integration between testing kernel and QEMU changes
>> together to catch any problems as the core OS support for KVM gets
>> updated.
>>
>
> In short, I don't think there should be, at the QEMU CI level, any
> integration testing that changes both KVM and QEMU at once.
>
> But, that's me assuming that the vast majority of changes in QEMU and
> KVM can be developed and tested separately from each other.  That's in
> sharp contrast with the days in which KVM Autotest would build
> both the kernel and userspace as part of all test jobs, because of
> very frequent dependencies among them.
>
> I'd love to get feedback on this from KVM (and other HW accelerator)
> folks.
>
>> Another area I would like to improve is how we expand testing with
>> existing test suites. I'm thinking things like LTP and kvm-unit-tests
>> which can exercise a bunch of QEMU code but are maybe a bit too big to be
>> included in the source tree. Although given we included TestFloat (via a
>> git submodule) maybe we shouldn't dismiss that approach? Or is this
>> something that could be done via Avocado?
>>
>
> Well, there's this:
>
>   https://github.com/avocado-framework-tests/avocado-misc-tests
>
> Which contains close to 300 tests, most of them wrappers for other
> test suites, including LTP:
>
>   
> https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/generic/ltp.py
>
> I'm not claiming it's the perfect fit for your idea, but it sounds like
> a good starting point.

Cool - I shall have a look at that the other side of Connect. I'd like
to make running LTP easier for non-core linux-user developers.
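
For reference, kicking that wrapper off is something like the following
(paths are an assumption; the wrapper is expected to fetch and build LTP
itself):

  # grab the wrapper collection and run the LTP test through avocado
  git clone https://github.com/avocado-framework-tests/avocado-misc-tests
  avocado run avocado-misc-tests/generic/ltp.py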

>
>> Generally though I think we are doing pretty well at increasing our test
>> coverage while making the tests more directly available to developers
>> without having to rely on someone's personal collection of random
>> binaries.
>>
>
> +1.
>
>> I wanted to know if we should encode this somewhere in our developer
>> documentation:
>>
>>   There is a strong preference for new QEMU tests to be integrated with
>>   the build system. Developers should be able to (build and) run the new
>>   tests locally directly from make.
>>
>> ?
>>
>
> There should definitely be, if reasonable, a similar experience for
> running different types of tests.  Right now, the build system (make
> targets) is clearly the common place, so +1.
>
> - Cleber.
>
> [1] - 
> https://avocado-framework.readthedocs.io/en/69.0/api/core/avocado.core.html#avocado.core.test.TestData.get_data
> [2] - 
> 

Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-18 Thread Cleber Rosa
On Thu, Mar 14, 2019 at 03:57:06PM +, Alex Bennée wrote:
> 
> Hi,
> 
> As we approach stabilisation for 4.0 I thought it would be worth doing a
> review of the current state of CI and stimulate some discussion of where
> it is working for us and what could be improved.
> 
> Testing in Build System
> ===
> 
> Things seem to be progressing well in this respect. More and more tests
> have been added into the main source tree and they are only a make
> invocation away. These include:
> 
>   check  (includes unit, qapi-schema, qtest and decodetree)
>   check-tcg  (now with system mode tests!)
>   check-softfloat
>   check-block
>   check-acceptance
> 
> Personally check-acceptance is the area I've looked at the least but
> this seems to be the best place for "full life cycle" tests like booting
> kernels and running stress and performance tests. I'm still a little
> unsure how we deal with prebuilt kernels and images here though. Are
> they basically provided by 3rd parties from their websites? Do we mirror
> any of the artefacts we use for these tests?

While it's possible to add any sort of files alongside the tests, and
"get it"[1] from the test[2], this is certainly not desirable for
kernels and other similarly large blobs.  The current approach is to
use well known URLs[3] and download[4][5] those at test run time.

Those are cached locally, automatically on the first run and reused on
subsequent executions.  The caching is helpful for development
environments, but is usually irrelevant to CI environments, where
you'd more often than not get a new machine (or a clean environment).

For now I would, also for the sake of simplicity, keep relying on 3rd
party websites until they prove to be unreliable.  This adds
transparency and reproducibility well beyond what can be achieved if we
attempt to mirror them to a QEMU sponsored/official location IMO.
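
As a concrete illustration of that flow (the cache lives in avocado's
default data directory; the exact location depends on the setup):

  # the first run downloads the kernels/images referenced by URL,
  # subsequent runs reuse the locally cached copies
  make check-acceptance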

> 
> One area of concern is how well this all sits with KVM (and other HW
> accelerators) and how that gets tested. With my ARM hat on I don't
> really see any integration between testing kernel and QEMU changes
> together to catch any problems as the core OS support for KVM gets
> updated.
>

In short, I don't think there should be, at the QEMU CI level, any
integration testing that changes both KVM and QEMU at once.

But, that's me assuming that the vast majority of changes in QEMU and
KVM can be developed and tested separately from each other.  That's in
sharp contrast with the days in which KVM Autotest would build
both the kernel and userspace as part of all test jobs, because of
very frequent dependencies among them.

I'd love to get feedback on this from KVM (and other HW accelerator)
folks.

> Another area I would like to improve is how we expand testing with
> existing test suites. I'm thinking things like LTP and kvm-unit-tests
> which can exercise a bunch of QEMU code but are maybe a bit too big to be
> included in the source tree. Although given we included TestFloat (via a
> git submodule) maybe we shouldn't dismiss that approach? Or is this
> something that could be done via Avocado?
>

Well, there's this:

  https://github.com/avocado-framework-tests/avocado-misc-tests

Which contains close to 300 tests, most of them wrappers for other
test suites, including LTP:

  
https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/generic/ltp.py

I'm not claiming it's the perfect fit for your idea, but it sounds like a
good starting point.

> Generally though I think we are doing pretty well at increasing our test
> coverage while making the tests more directly available to developers
> without having to rely on someone's personal collection of random
> binaries.
>

+1.

> I wanted to know if we should encode this somewhere in our developer
> documentation:
> 
>   There is a strong preference for new QEMU tests to be integrated with
>   the build system. Developers should be able to (build and) run the new
>   tests locally directly from make.
> 
> ?
>

There should definitely be, if reasonable, a similar experience for
running different types of tests.  Right now, the build system (make
targets) is clearly the common place, so +1.

- Cleber.

[1] - 
https://avocado-framework.readthedocs.io/en/69.0/api/core/avocado.core.html#avocado.core.test.TestData.get_data
[2] - 
https://avocado-framework.readthedocs.io/en/69.0/WritingTests.html#accessing-test-data-files
[3] - 
https://github.com/clebergnu/qemu/blob/sent/target_arch_v5/tests/acceptance/boot_linux_console.py#L68
[4] - 
https://github.com/clebergnu/qemu/blob/sent/target_arch_v5/tests/acceptance/boot_linux_console.py#L92
[5] - 
https://avocado-framework.readthedocs.io/en/69.0/WritingTests.html#fetching-asset-files



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-18 Thread Wainer dos Santos Moschetta

Hi all,


On 03/14/2019 12:57 PM, Alex Bennée wrote:

Hi,

As we approach stabilisation for 4.0 I thought it would be worth doing a
review of the current state of CI and stimulate some discussion of where
it is working for us and what could be improved.

Testing in Build System
===

Things seem to be progressing well in this respect. More and more tests
have been added into the main source tree and they are only a make
invocation away. These include:

   check  (includes unit, qapi-schema, qtest and decodetree)
   check-tcg  (now with system mode tests!)
   check-softfloat
   check-block
   check-acceptance

Personally check-acceptance is the area I've looked at the least but
this seems to be the best place for "full life cycle" tests like booting
kernels and running stress and performance tests. I'm still a little
unsure how we deal with prebuilt kernels and images here though. Are
they basically provided by 3rd parties from their websites? Do we mirror
any of the artefacts we use for these tests?

One area of concern is how well this all sits with KVM (and other HW
accelerators) and how that gets tested. With my ARM hat on I don't
really see any integration between testing kernel and QEMU changes
together to catch any problems as the core OS support for KVM gets
updated.

Another area I would like to improve is how we expand testing with
existing test suites. I'm thinking things like LTP and kvm-unit-tests
which can exercise a bunch of QEMU code but are maybe a bit too big to be
included in the source tree. Although given we included TestFloat (via a
git submodule) maybe we shouldn't dismiss that approach? Or is this
something that could be done via Avocado?

Generally though I think we are doing pretty well at increasing our test
coverage while making the tests more directly available to developers
without having to rely on someone's personal collection of random
binaries.

I wanted to know if we should encode this somewhere in our developer
documentation:

   There is a strong preference for new QEMU tests to be integrated with
   the build system. Developers should be able to (build and) run the new
   tests locally directly from make.

?

Testing in the Cloud
====================

After BuildBot went out-of-service we have been relying heavily on Travis
as our primary CI platform. This has been creaking somewhat under the
strain and while we have a large test matrix its coverage is fairly
Ubuntu/x86 centric. However in recent months we've expanded and we now
have:

   - Shippable, cross compilers - catches a lot of 32/64 bit isms
   - Cirrus, FreeBSD and MacOS builds
   - GitLab, Alternative x86/Debian - iotests

Currently they don't add a whole lot to the diversity of our testing
although Shippable is pretty quick and does catch cross-compile missteps
quite nicely. I think there is a good argument for removing some of the
testing from Travis, trying to get its long run time down to something
a bit more useful, and balancing that with more tests on the other
services.

I'm currently looking at how easy it is to expand the build farm with
GitLab. It holds a promise of making it easy to add external runners to
the build farm with a fairly simple installation of a runner client on
the machine. We'd just need to beg and borrow for more non-x86 machines.
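
Registering such a machine is roughly the following (a sketch; the URL
and token come from the project's CI/CD settings and the tag is just an
example):

  # install gitlab-runner on the box, then attach it to the project
  sudo gitlab-runner register \
      --non-interactive \
      --url https://gitlab.com/ \
      --registration-token <project-token> \
      --executor shell \
      --tag-list aarch64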

Cloud CI Feedback
--

Currently only Travis reports the current status to our IRC channel but
with the stability of the testing being up and down it sometimes feels
like unnecessary noise. I've put together a wiki template that allows
tracking of the current CI status using badges:

   https://wiki.qemu.org/Template:CIStatus

Which includes patchew and Coverity status. This can be included on any
personal pages you wish but it is also fairly prominent on the testing
page:

   https://wiki.qemu.org/Testing

For those wishing to have a central point of reference for other
branches there is:

   https://wiki.qemu.org/Template:CustomCIStatus

which is parameterised so you can enable for different branches, for
example my fpu/next branch is linked from:

   https://wiki.qemu.org/Features/Softfloat

Of course for these to be useful people need to a) look at them and b)
be confident enough that non-green is worth looking at.

I'm wary of adding a bunch more notifications onto the IRC channel. What
I really would like is a cibot which could sit on our channel and
aggregate the status from the various services and also be queried for
the state of various branches. However this would be a chunk of non-QEMU
related work to get up and running.

Test Stability
--

We have a slowly growing number of tests which seem to fail on a fairly
regular basis in our CI tests. Sometimes it has been possible to
replicate the failure but often it seems to be a feature of running in
the CI system that is hard to replicate on developer machines. I've had
some success in replicating some and getting the appropriate developers

Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-17 Thread Fam Zheng



> On Mar 15, 2019, at 23:12, Stefan Hajnoczi  wrote:
> 
> On Thu, Mar 14, 2019 at 03:57:06PM +, Alex Bennée wrote:
>> As we approach stabilisation for 4.0 I thought it would be worth doing a
>> review of the current state of CI and stimulate some discussion of where
>> it is working for us and what could be improved.
> 
> Thanks for this summary and for all the work that is being put into CI.
> 
> How should all sub-maintainers be checking their pull requests?
> 
> We should have information and a strict policy on minimum testing of
> pull requests.  Right now I imagine it varies a lot between
> sub-maintainers.
> 
> For my block pull requests I run qemu-iotests locally and also push to
> GitHub to trigger Travis CI.

Well, long story short, by pushing to gitlab.

If the patchew importer is changed to push to a gitlab repo that is watched by 
the same set of gitlab runners (a setup supported by gitlab CI), all 
posted patches can be tested the same way.

It’s a natural next step after we figure out how to automate things just for 
Peter's manual pre-merge testing, as long as the machine resources allow 
testing more subjects.

Testing private branches is much more costly depending on test set size and 
developer numbers. Maybe it’ll have to be limited to maintainer branches first.

Fam

> 
> Stefan





Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Alex Bennée


Ed Vielmetti  writes:

> There are a couple of options hosted at Packet - Shippable, Codefresh, and
> Drone. I perhaps know more about Drone than the others. Each of them has a
> supported/sponsored version which can be used to produce arm64 binaries
> natively.
>
> I'll admit to dropping into this conversation in mid-stream though - what
> is the overall goal of this effort? Knowing that, it might be easier to
> suggest a specific path.

Apologies I did drop you into the CC half-way. The thread can be viewed
in our archives:

  https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg04909.html

but essentially as we finish another dev cycle we are reviewing our
current CI setup and looking to see how we can add additional
architectures to our current setup which is currently very x86 focused.

Individual developers have access to a range of machines and the gitlab
runner approach looks quite promising. However it's proving to be harder
to set up in practice!

>
> On Fri, Mar 15, 2019 at 1:54 PM Alex Bennée  wrote:
>
>>
>> Ed Vielmetti  writes:
>>
>> > We have been trying to merge the Gitlab runner patches for arm64
>> > for over a year now; see
>> >
>> > https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/725
>>
>> Yes I found that one. I'm trying to work out exactly how their build
>> system works. It seems to build all architectures on the same host using
>> QEMU to do so. I suspect this has never actually been run on a non-x86
>> host so I'm seeing if there is anything I can fix.
>>
>> I've already hit a bug with Debian's QEMU packaging which assumes that
>> an AArch64 box always supports AArch32 which isn't true on the TX
>> machines:
>>
>>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=924667
>>
>> > I have not yet sorted out who at Gitlab has the ability to get
>> > this change implemented - their management structure is
>> > not something that I have sorted out yet, and I can't tell whether
>> > this lack of forward progress is something best to tackle by
>> > technical merit or by appealing to management.
>>
>> What about Shippable? I saw the press release you guys did but it is not
>> entirely clear if I need a paid licensed Bring Your Own Node or if
>> there is a free option for FLOSS projects?
>>
>> >
>> > On Fri, Mar 15, 2019 at 6:24 AM Fam Zheng  wrote:
>> >
>> >>
>> >>
>> >> > On Mar 15, 2019, at 17:58, Alex Bennée 
>> wrote:
>> >> >
>> >> >
>> >> > Fam Zheng  writes:
>> >> >
>> >> >>> On Mar 15, 2019, at 16:57, Alex Bennée 
>> wrote:
>> >> >>>
>> >> >>> I had installed the gitlab-runner from the Debian repo but it was
>> out
>> >> >>> of date and didn't seem to work correctly.
>> >> >>
>> >> >> If there can be a sidecar x86 box next to the test bot, it can be the
>> >> >> controller node which runs gitlab-runner, the test script (in
>> >> .gitlab-ci.yml) can then ssh into the actual env to run test
>> >> >> commands.
>> >> >
>> >> > Sure although that just adds complexity compared to spinning up a
>> >> > box in the cloud ;-)
>> >>
>> >> In the middle is one controller node and a number of heterogeneous
>> >> boxes it knows how to control with ssh.
>> >>
>> >> (BTW patchew tester only relies on vanilla python3 to work, though
>> >> clearly it suffers from insufficient manpower assuming the SLA we'll
>> >> need on the merge test. It’s unfortunate that gitlab-runner is a
>> >> binary.)
>> >>
>> >> Fam
>> >>
>>
>>
>> --
>> Alex Bennée
>>


--
Alex Bennée



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Ed Vielmetti
There are a couple of options hosted at Packet - Shippable, Codefresh, and
Drone. I perhaps know more about Drone than the others. Each of them has a
supported/sponsored version which can be used to produce arm64 binaries
natively.

I'll admit to dropping into this conversation in mid-stream though - what
is the overall goal of this effort? Knowing that, it might be easier to
suggest a specific path.

On Fri, Mar 15, 2019 at 1:54 PM Alex Bennée  wrote:

>
> Ed Vielmetti  writes:
>
> > We have been trying to merge the Gitlab runner patches for arm64
> > for over a year now; see
> >
> > https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/725
>
> Yes I found that one. I'm trying to work out exactly how their build
> system works. It seems to build all architectures on the same host using
> QEMU to do so. I suspect this has never actually been run on a non-x86
> host so I'm seeing if there is anything I can fix.
>
> I've already hit a bug with Debian's QEMU packaging which assumes that
> an AArch64 box always supports AArch32 which isn't true on the TX
> machines:
>
>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=924667
>
> > I have not yet sorted out who at Gitlab has the ability to get
> > this change implemented - their management structure is
> > not something that I have sorted out yet, and I can't tell whether
> > this lack of forward progress is something best to tackle by
> > technical merit or by appealing to management.
>
> What about Shippable? I saw the press release you guys did but it is not
> entirely clear if I need a paid licensed Bring Your Own Node or if
> there is a free option for FLOSS projects?
>
> >
> > On Fri, Mar 15, 2019 at 6:24 AM Fam Zheng  wrote:
> >
> >>
> >>
> >> > On Mar 15, 2019, at 17:58, Alex Bennée 
> wrote:
> >> >
> >> >
> >> > Fam Zheng  writes:
> >> >
> >> >>> On Mar 15, 2019, at 16:57, Alex Bennée 
> wrote:
> >> >>>
> >> >>> I had installed the gitlab-runner from the Debian repo but it was
> out
> >> >>> of date and didn't seem to work correctly.
> >> >>
> >> >> If there can be a sidecar x86 box next to the test bot, it can be the
> >> >> controller node which runs gitlab-runner, the test script (in
> >> >> .gitlab-ci.yml) can then ssh into the actual env to run test
> >> >> commands.
> >> >
> >> > Sure although that just adds complexity compared to spinning up a
> >> > box in the cloud ;-)
> >>
> >> In the middle is one controller node and a number of heterogeneous
> >> boxes it knows how to control with ssh.
> >>
> >> (BTW patchew tester only relies on vanilla python3 to work, though
> >> clearly it suffers from insufficient manpower assuming the SLA we'll
> >> need on the merge test. It’s unfortunate that gitlab-runner is a
> >> binary.)
> >>
> >> Fam
> >>
>
>
> --
> Alex Bennée
>


Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Alex Bennée


Ed Vielmetti  writes:

> We have been trying to merge the Gitlab runner patches for arm64
> for over a year now; see
>
> https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/725

Yes I found that one. I'm trying to work out exactly how their build
system works. It seems to build all architectures on the same host using
QEMU to do so. I suspect this has never actually been run on a non-x86
host so I'm seeing if there is anything I can fix.

I've already hit a bug with Debian's QEMU packaging which assumes that
an AArch64 box always supports AArch32 which isn't true on the TX
machines:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=924667

> I have not yet sorted out who at Gitlab has the ability to get
> this change implemented - their management structure is
> not something that I have sorted out yet, and I can't tell whether
> this lack of forward progress is something best to tackle by
> technical merit or by appealing to management.

What about Shippable? I saw the press release you guys did but it is not
entirely clear if I need a paid licensed Bring Your Own Node or if there is
a free option for FLOSS projects?

>
> On Fri, Mar 15, 2019 at 6:24 AM Fam Zheng  wrote:
>
>>
>>
>> > On Mar 15, 2019, at 17:58, Alex Bennée  wrote:
>> >
>> >
>> > Fam Zheng  writes:
>> >
>> >>> On Mar 15, 2019, at 16:57, Alex Bennée  wrote:
>> >>>
>> >>> I had installed the gitlab-runner from the Debian repo but it was out
>> >>> of date and didn't seem to work correctly.
>> >>
>> >> If there can be a sidecar x86 box next to the test bot, it can be the
>> >> controller node which runs gitlab-runner, the test script (in
>> >> .gitlab-ci.yml) can then ssh into the actual env to run test
>> >> commands.
>> >
>> > Sure although that just adds complexity compared to spinning up a box in
>> > the cloud ;-)
>>
>> In the middle is one controller node and a number of heterogeneous boxes it
>> knows how to control with ssh.
>>
>> (BTW patchew tester only relies on vanilla python3 to work, though clearly
>> it suffers from insufficient manpower assuming the SLA we'll need on the
>> merge test. It’s unfortunate that gitlab-runner is a binary.)
>>
>> Fam
>>


--
Alex Bennée



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Paolo Bonzini
On 15/03/19 17:28, Peter Maydell wrote:
> On Fri, 15 Mar 2019 at 15:12, Stefan Hajnoczi  wrote:
>> How should all sub-maintainers be checking their pull requests?
>>
>> We should have information and a strict policy on minimum testing of
>> pull requests.  Right now I imagine it varies a lot between
>> sub-maintainers.
>>
>> For my block pull requests I run qemu-iotests locally and also push to
>> GitHub to trigger Travis CI.
> 
> For my arm pullreqs I do light smoke testing on x86-64 typically
> (and let the tests on merge catch any portability issues), unless
> I'm particularly suspicious that something should be tested
> more widely :-)

For my pull requests I do "make docker-test-{full,mingw,clang}@fedora
docker-test-full@{centos7,ubuntu} vm-build-freebsd".  I also have
recently bought a Macincloud account that I use occasionally for build
testing.

Paolo



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Peter Maydell
On Fri, 15 Mar 2019 at 15:12, Stefan Hajnoczi  wrote:
> How should all sub-maintainers be checking their pull requests?
>
> We should have information and a strict policy on minimum testing of
> pull requests.  Right now I imagine it varies a lot between
> sub-maintainers.
>
> For my block pull requests I run qemu-iotests locally and also push to
> GitHub to trigger Travis CI.

For my arm pullreqs I do light smoke testing on x86-64 typically
(and let the tests on merge catch any portability issues), unless
I'm particularly suspicious that something should be tested
more widely :-)

thanks
-- PMM



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Alex Bennée


Stefan Hajnoczi  writes:

> On Thu, Mar 14, 2019 at 03:57:06PM +, Alex Bennée wrote:
>> As we approach stabilisation for 4.0 I thought it would be worth doing a
>> review of the current state of CI and stimulate some discussion of where
>> it is working for us and what could be improved.
>
> Thanks for this summary and for all the work that is being put into CI.
>
> How should all sub-maintainers be checking their pull requests?
>
> We should have information and a strict policy on minimum testing of
> pull requests.  Right now I imagine it varies a lot between
> sub-maintainers.

I'll try and fill out the various Testing/CI/ subpages with details but
in short:

 Travis: already documented in Testing/CI/Travis
 Shippable/Cirrus: sign-up with your github id, enable the repos you
 want testing
 Gitlab: sign-up (you can use github as an SSO)

>
> For my block pull requests I run qemu-iotests locally and also push to
> GitHub to trigger Travis CI.

I'm currently pushing to both github and gitlab and then checking I get
green across all of the services (assuming no current breakage).

>
> Stefan


--
Alex Bennée



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Ed Vielmetti
We have been trying to merge the Gitlab runner patches for arm64
for over a year now; see

https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/725

I have not yet sorted out who at Gitlab has the ability to get
this change implemented - their management structure is
not something that I have sorted out yet, and I can't tell whether
this lack of forward progress is something best to tackle by
technical merit or by appealing to management.

On Fri, Mar 15, 2019 at 6:24 AM Fam Zheng  wrote:

>
>
> > On Mar 15, 2019, at 17:58, Alex Bennée  wrote:
> >
> >
> > Fam Zheng  writes:
> >
> >>> On Mar 15, 2019, at 16:57, Alex Bennée  wrote:
> >>>
> >>> I had installed the gitlab-runner from the Debian repo but it was out
> >>> of date and didn't seem to work correctly.
> >>
> >> If there can be a sidecar x86 box next to the test bot, it can be the
> >> controller node which runs gitlab-runner, the test script (in
> >> .gitlab-ci.yml) can then ssh into the actual env to run test
> >> commands.
> >
> > Sure although that just adds complexity compared to spinning up a box in
> > the cloud ;-)
>
> In the middle is one controller node and a number of heterogeneous boxes it
> knows how to control with ssh.
>
> (BTW patchew tester only relies on vanilla python3 to work, though clearly
> it suffers from insufficient manpower assuming the SLA we'll need on the
> merge test. It’s unfortunate that gitlab-runner is a binary.)
>
> Fam
>


Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Stefan Hajnoczi
On Thu, Mar 14, 2019 at 03:57:06PM +, Alex Bennée wrote:
> As we approach stabilisation for 4.0 I thought it would be worth doing a
> review of the current state of CI and stimulate some discussion of where
> it is working for us and what could be improved.

Thanks for this summary and for all the work that is being put into CI.

How should all sub-maintainers be checking their pull requests?

We should have information and a strict policy on minimum testing of
pull requests.  Right now I imagine it varies a lot between
sub-maintainers.

For my block pull requests I run qemu-iotests locally and also push to
GitHub to trigger Travis CI.

Stefan




Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Paolo Bonzini
On 15/03/19 03:53, Fam Zheng wrote:
>> [+] I currently test:
>> - windows crossbuilds
>> - S390, AArch32, AArch64, PPC64 Linux
>>   (SPARC currently disabled because of the migration-test flakiness)
>> - OSX
>> - FreeBSD, OpenBSD, NetBSD via the tests/vm setup
>> - various x86-64 configs: from-clean; debug; clang; TCI; no-tcg;
>>   linux-static (including 'make check-tcg’)
> I think the gitlab CI architecture is quite capable of doing what you
> want here. Some effort will be needed to set up the gitlab-runners in
> each of the above environments and I expect tweaking will be needed to
> get the automation smooth, but it is fairly straightforward and manageable:
> 
> https://docs.gitlab.com/runner/

Once we have hosts we can also do the same thing with Patchew testers by
the way---in fact, Patchew and gitlab-runner have very similar setups
where the runner polls the server for things to patch.

Paolo



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Fam Zheng



> On Mar 15, 2019, at 17:58, Alex Bennée  wrote:
> 
> 
> Fam Zheng  writes:
> 
>>> On Mar 15, 2019, at 16:57, Alex Bennée  wrote:
>>> 
>>> I had installed the gitlab-runner from the Debian repo but it was out
>>> of date and didn't seem to work correctly.
>> 
>> If there can be a sidecar x86 box next to the test bot, it can be the
>> controller node which runs gitlab-runner, the test script (in
>> .gitlab-ci.yml) can then ssh into the actual env to run test
>> commands.
> 
> Sure although that just adds complexity compared to spinning up a box in
> the cloud ;-)

In the middle is one controller node and a number of heterogeneous boxes it 
knows how to control with ssh.

(BTW patchew tester only relies on vanilla python3 to work, though clearly it 
suffers from insufficient manpower assuming the SLA we'll need on the merge 
test. It’s unfortunate that gitlab-runner is a binary.)

Fam



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Alex Bennée


Peter Maydell  writes:

> On Fri, 15 Mar 2019 at 09:05, Alex Bennée  wrote:
>>
>>
>> Peter Maydell  writes:
>> > [+] I currently test:
>> >  - windows crossbuilds
>>
>> We did have this with shippable but had to disable it when the upstream
>> repo went down. We could re-enable if we can rebuild it and cache our
>> docker images with Daniel's work.
>>
>> >  - S390, AArch32, AArch64, PPC64 Linux
>> >(SPARC currently disabled because of the migration-test flakiness)
>>
>> We would need to get machines from somewhere. Setting up a headless
>> SynQuacer should be easy enough and we have qemu-test which is a
>> ThunderX beast. I guess the IBM guys would have to chime in if they
>> could find PPC/s390 boxen because I'm guessing spamming the GCC build
>> farm with our test runners would be a little unfair.
>
> For S390x we have a just-for-QEMU machine already, courtesy of IBM.
> We're already doing builds on the GCC build farm machines, so
> as long as we don't increase the number of things we're building
> that way (ie we don't allow them to be used by random other
> submaintainers doing test runs) I don't think it should increase
> the load on them. We should definitely check with the cfarm admins
> on how allowable buildbot-equivalents are, though. And as you say
> with our Linaro hats on we can provide the Arm hosts, so it's just
> PPC and SPARC. I should also mention MIPS here which is not in
> my set of host builds because I've never found a MIPS box fast
> enough.
>
>> >  - FreeBSD, OpenBSD, NetBSD via the tests/vm setup
>>
>> We build on FreeBSD on Cirrus - but any x86 box can run the test/vm
>> setup assuming you're just kicking it off with a make vm-test type thing?
>
> Yep.
>
>> >  - various x86-64 configs: from-clean; debug; clang; TCI; no-tcg;
>> >linux-static (including 'make check-tcg')
>>
>> This is already covered in our rather large Travis matrix. The trick
>> will be making it all fast enough.
>
> Yes; Travis build time is at least twice the elapsed-time we are
> looking for here.
>
> The other nice part about my current setup is that if something
> fails on a random odd host architecture I can just ssh in and
> run the test by hand to debug it. I'm guessing that any of this
> sort of CI setup is going to prohibit that.

Not necessarily. The build runner is just a daemon on the build machine
so there is nothing that stops us ssh'ing into our own infrastructure
and re-running the test from the command line. Of course in the ideal
case test failures occurring in CI should be able to be replicated in a
normal source checkout.

--
Alex Bennée



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Peter Maydell
On Fri, 15 Mar 2019 at 09:05, Alex Bennée  wrote:
>
>
> Peter Maydell  writes:
> > [+] I currently test:
> >  - windows crossbuilds
>
> We did have this with shippable but had to disable it when the upstream
> repo went down. We could re-enable if we can rebuild it and cache our
> docker images with Daniel's work.
>
> >  - S390, AArch32, AArch64, PPC64 Linux
> >(SPARC currently disabled because of the migration-test flakiness)
>
> We would need to get machines from somewhere. Setting up a headless
> SynQuacer should be easy enough and we have qemu-test which is a
> ThunderX beast. I guess the IBM guys would have to chime in if they
> could find PPC/s390 boxen because I'm guessing spamming the GCC build
> farm with our test runners would be a little unfair.

For S390x we have a just-for-QEMU machine already, courtesy of IBM.
We're already doing builds on the GCC build farm machines, so
as long as we don't increase the number of things we're building
that way (ie we don't allow them to be used by random other
submaintainers doing test runs) I don't think it should increase
the load on them. We should definitely check with the cfarm admins
on how allowable buildbot-equivalents are, though. And as you say
with our Linaro hats on we can provide the Arm hosts, so it's just
PPC and SPARC. I should also mention MIPS here which is not in
my set of host builds because I've never found a MIPS box fast
enough.

> >  - FreeBSD, OpenBSD, NetBSD via the tests/vm setup
>
> We build on FreeBSD on Cirrus - but any x86 box can run the test/vm
> setup assuming you're just kicking it off with a make vm-test type thing?

Yep.

> >  - various x86-64 configs: from-clean; debug; clang; TCI; no-tcg;
> >linux-static (including 'make check-tcg')
>
> This is already covered in our rather large Travis matrix. The trick
> will be making it all fast enough.

Yes; Travis build time is at least twice the elapsed-time we are
looking for here.

The other nice part about my current setup is that if something
fails on a random odd host architecture I can just ssh in and
run the test by hand to debug it. I'm guessing that any of this
sort of CI setup is going to prohibit that.

thanks
-- PMM



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Alex Bennée


Daniel P. Berrangé  writes:

> On Fri, Mar 15, 2019 at 09:34:27AM +, Alex Bennée wrote:
>>
>> Daniel P. Berrangé  writes:
>>
>> > On Thu, Mar 14, 2019 at 03:57:06PM +, Alex Bennée wrote:
>> >> Docker Images
>> >> =
>> >>
>> >> The addition of docker has unlocked the ability to build a lot more
>> >> tests as well as compile testing on a much wider range of distros. I
>> >> think there are two outstanding areas that need improvement
>> >>
>> >> Daniel has been looking at building and hosting the images somewhere.
>> >> This would be useful as it would stop us slamming the distros
>> >> repositories constantly rebuilding the same images and also help reduce
>> >> the time to test.
>> >
>> > My intent was/still is to make use of quay.io for hosting prebuilt
>> > images.
>> >
>> > As well as avoiding repeated builds for developers it means that
>> > developers can be guaranteed to actually be testing with the same
>> > content that the automated CI did. Currently everyone using the
>> > docker images potentially has a slightly different environment as
>> > it depends on what packages were in the repos when they built
>> > the image locally. This is very bad for reproducibility.
>> >
>> > Libvirt uses quay.io for hosting images already and I've been
>> > looking at creating a script to automate usage of it via their
>> > REST API. Once done the same script should be usable by QEMU
>> > too.
>> >
>> > The idea would be that we still have docker files in the
>> > tests/docker/dockerfiles directory, but they would only be used
>> > for an automated job which triggers builds on quay.io, or for the
>> > few people who need to make changes to the dockerfiles. The current
>> > make rules used by developers / CI systems for executing test builds
>> > would be changed to simply pull the pre-built image off quay.io
>> > instead of running a docker build again.
>>
>> Could we just have a script that pulls the quay.io image and tags it as
>> the appropriate target and then we could do:
>>
>>   make docker-image-debian-arm64-cross [REBUILD=1]
>>
>> which would normally pull the quay.io image but if REBUILD=1 would force
>> a local rebuild?
>
> Perhaps, I hadn't really got as far as thinking about the make
> integration side.
>
>>
>> >> The other area that needs some work is better supporting non-x86 hosts.
>> >> While Docker's multi-arch story is much better (docker run debian:stable
>> >> will DTRT on any main architecture) we get stumped by things like
>> >> Debian's uneven support of cross compilers. For 4.1 I'd like to
>> >> reorganise the dockerfiles subdirectory into multiarch and arch specific
>> >> directories so we approach this in a less ad-hoc way. It would also be
>> >> nice to have the ability to gracefully fallback to linux-user powered
>> >> images where the host architecture doesn't have what we need.
>>
>> I suspect we'd never store linux-user powered images on quay.io as there
>> are niggly differences depending on the user's binfmt_misc setup.
>
> Wouldn't we just upload the cross-build images and rely on the user's
> host to have registered binfmts needed to execute foreign binaries
> in the container ?

That's true - a linux-user powered image is really just a native arch
image with the addition of qemu. We already have the tooling to upload a
local QEMU into the image as-per the current binfmt_misc settings
(docker.py update).
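
For reference, that looks something like the following (the image tag
and binary path are assumptions, and the exact docker.py arguments may
differ between QEMU versions):

  # copy the locally built linux-user binary into the cross image so
  # the container can run foreign binaries via binfmt_misc
  ./tests/docker/docker.py update \
      qemu:debian-arm64-cross ./aarch64-linux-user/qemu-aarch64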

--
Alex Bennée



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Daniel P . Berrangé
On Fri, Mar 15, 2019 at 09:34:27AM +, Alex Bennée wrote:
> 
> Daniel P. Berrangé  writes:
> 
> > On Thu, Mar 14, 2019 at 03:57:06PM +, Alex Bennée wrote:
> >> Docker Images
> >> =
> >>
> >> The addition of docker has unlocked the ability to build a lot more
> >> tests as well as compile testing on a much wider range of distros. I
> >> think there are two outstanding areas that need improvement
> >>
> >> Daniel has been looking at building and hosting the images somewhere.
> >> This would be useful as it would stop us slamming the distros
> >> repositories constantly rebuilding the same images and also help reduce
> >> the time to test.
> >
> > My intent was/still is to make use of quay.io for hosting prebuilt
> > images.
> >
> > As well as avoiding repeated builds for developers it means that
> > developers can be guaranteed to actually be testing with the same
> > content that the automated CI did. Currently everyone using the
> > docker images potentially has a slightly different environment as
> > it depends on what packages were in the repos when they built
> > the image locally. This is very bad for reproducibility.
> >
> > Libvirt uses quay.io for hosting images already and I've been
> > looking at creating a script to automate usage of it via their
> > REST API. Once done the same script should be usable by QEMU
> > too.
> >
> > The idea would be that we still have docker files in the
> > tests/docker/dockerfiles directory, but they would only be used
> > for an automated job which triggers builds on quay.io, or for the
> > few people who need to make changes to the dockerfiles. The current
> > make rules used by developers / CI systems for executing test builds
> > would be changed to simply pull the pre-built image off quay.io
> > instead of running a docker build again.
> 
> Could we just have a script that pulls the quay.io image and tags it as
> the appropriate target and then we could do:
> 
>   make docker-image-debian-arm64-cross [REBUILD=1]
> 
> which would normally pull the quay.io image but if REBUILD=1 would force
> a local rebuild?

Perhaps, I hadn't really got as far as thinking about the make
integration side.

> 
> >> The other area that needs some work is better supporting non-x86 hosts.
> >> While Docker's multi-arch story is much better (docker run debian:stable
> >> will DTRT on any main architecture) we get stumped by things like
> >> Debian's uneven support of cross compilers. For 4.1 I'd like to
> >> reorganise the dockerfiles subdirectory into multiarch and arch specific
> >> directories so we approach this in a less ad-hoc way. It would also be
> >> nice to have the ability to gracefully fallback to linux-user powered
> >> images where the host architecture doesn't have what we need.
> 
> I suspect we'd never store linux-user powered images on quay.io as there
> are niggly differences depending on the user's binfmt_misc setup.

Wouldn't we just upload the cross-build images and rely on the user's
host to have registered binfmts needed to execute foreign binaries
in the container ?

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Alex Bennée


Fam Zheng  writes:

>> On Mar 15, 2019, at 16:57, Alex Bennée  wrote:
>>
>> I had installed the gitlab-runner from the Debian repo but it was out
>> of date and didn't seem to work correctly.
>
> If there can be a sidecar x86 box next to the test bot, it can be the
> controller node which runs gitlab-runner, the test script (in
> .gitlab-ci.yml) can then ssh into the actual env to run test
> commands.

Sure although that just adds complexity compared to spinning up a box in
the cloud ;-)

--
Alex Bennée



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Daniel P . Berrangé
On Thu, Mar 14, 2019 at 06:22:44PM +, Peter Maydell wrote:
> On Thu, 14 Mar 2019 at 15:57, Alex Bennée  wrote:
> > Testing in the Cloud
> > 
> >
> > After BuildBot went out-of-service we have been relying heavily on Travis
> > as our primary CI platform. This has been creaking somewhat under the
> > strain and while we have a large test matrix its coverage is fairly
> > Ubuntu/x86 centric. However in recent months we've expanded and we now
> > have:
> >
> >   - Shippable, cross compilers - catches a lot of 32/64 bit isms
> >   - Cirrus, FreeBSD and MacOS builds
> >   - GitLab, Alternative x86/Debian - iotests
> 
> Are any of these capable of replacing my ad-hoc collection
> of build test systems for testing merges ? I would quite like
> to be able to do that, because it would make it easier for
> other people to take over the process of handling pull requests
> when I'm away.
> 
> I think the main requirements for that would be:
>  * covers full range of hosts[*]
>  * can be asked to do a test build of a merge before
>    I push it to master
>  * reliably completes all builds within say 90 minutes
>    of being asked to start

Out of all of the systems above, I would personally have a strong preference
for GitLab simply because it is actually based on open source software.
Putting a critical part of QEMU's workflow onto a closed source service
whose infrastructure or terms of service could change at any time feels
risky to me. With GitLab in the worst case we can spin up an instance on
our own hardware to run our test stack, especially if we are already
providing our own test runners.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Fam Zheng



> On Mar 15, 2019, at 16:57, Alex Bennée  wrote:
> 
> I had installed the gitlab-runner from the Debian repo but it was out
> of date and didn't seem to work correctly.

If there can be a sidecar x86 box next to the test bot, it can be the 
controller node which runs gitlab-runner, the test script (in .gitlab-ci.yml) 
can then ssh into the actual env to run test commands.
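
A minimal sketch of what that script step might boil down to (the host
name and build directory are assumptions, with key-based ssh access
already set up):

  # runs on the sidecar x86 controller; the real work happens on the
  # board reached over ssh
  ssh arm64-box 'cd ~/qemu/build && make check -j$(nproc)'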

Fam



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Alex Bennée


Daniel P. Berrangé  writes:

> On Thu, Mar 14, 2019 at 03:57:06PM +, Alex Bennée wrote:
>> Docker Images
>> =
>>
>> The addition of docker has unlocked the ability to build a lot more
>> tests as well as compile testing on a much wider range of distros. I
>> think there are two outstanding areas that need improvement
>>
>> Daniel has been looking at building and hosting the images somewhere.
>> This would be useful as it would stop us slamming the distros
>> repositories constantly rebuilding the same images and also help reduce
>> the time to test.
>
> My intent was/still is to make use of quay.io for hosting prebuilt
> images.
>
> As well as avoiding repeated builds for developers it means that
> developers can be guaranteed to actually be testing with the same
> content that the automated CI did. Currently everyone using the
> docker images potentially has a slightly different environment as
> it depends on what packages were in the repos when they built
> the image locally. This is very bad for reproducibility.
>
> Libvirt uses quay.io for hosting images already and I've been
> looking at creating a script to automate usage of it via their
> REST API. Once done the same script should be usable by QEMU
> too.
>
> The idea would be that we still have docker files in the
> tests/docker/dockerfiles directory, but they would only be used
> for an automated job which triggers builds on quay.io, or for the
> few people who need to make changes to the dockerfiles. The current
> make rules used by developers / CI systems for executing test builds
> would be changed to simply pull the pre-built image off quay.io
> instead of running a docker build again.

Could we just have a script that pulls the quay.io image and tags it as
the appropriate target and then we could do:

  make docker-image-debian-arm64-cross [REBUILD=1]

which would normally pull the quay.io image but if REBUILD=1 would force
a local rebuild?
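
Under the hood that script might be little more than this (the image
names here are assumptions, and the local tag would have to match
whatever the docker make rules expect):

  # fetch the prebuilt image and give it the name the build system uses
  docker pull quay.io/qemu-project/debian-arm64-cross:latest
  docker tag quay.io/qemu-project/debian-arm64-cross:latest \
      qemu:debian-arm64-cross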

>> The other area that needs some work is better supporting non-x86 hosts.
>> While Docker's multi-arch story is much better (docker run debian:stable
>> will DTRT on any main architecture) we get stumped by things like
>> Debian's uneven support of cross compilers. For 4.1 I'd like to
>> reorganise the dockerfiles subdirectory into multiarch and arch specific
>> directories so we approach this in a less ad-hoc way. It would also be
>> nice to have the ability to gracefully fallback to linux-user powered
>> images where the host architecture doesn't have what we need.

I suspect we'd never store linux-user powered images on quay.io as there
are niggly differences depending on the user's binfmt_misc setup.

--
Alex Bennée



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Alex Bennée


Peter Maydell  writes:

> On Thu, 14 Mar 2019 at 15:57, Alex Bennée  wrote:
>> Testing in the Cloud
>> 
>>
>> After BuildBot went out-of-service we have been relying heavily on Travis
>> as our primary CI platform. This has been creaking somewhat under the
>> strain and while we have a large test matrix its coverage is fairly
>> Ubuntu/x86 centric. However in recent months we've expanded and we now
>> have:
>>
>>   - Shippable, cross compilers - catches a lot of 32/64 bit isms
>>   - Cirrus, FreeBSD and MacOS builds
>>   - GitLab, Alternative x86/Debian - iotests
>
> Are any of these capable of replacing my ad-hoc collection
> of build test systems for testing merges ? I would quite like
> to be able to do that, because it would make it easier for
> other people to take over the process of handling pull requests
> when I'm away.
>
> I think the main requirements for that would be:
>  * covers full range of hosts[*]
>  * can be asked to do a test build of a merge before
>    I push it to master
>  * reliably completes all builds within say 90 minutes
>    of being asked to start
>
> [+] I currently test:
>  - windows crossbuilds

We did have this with shippable but had to disable it when the upstream
repo went down. We could re-enable if we can rebuild it and cache our
docker images with Daniel's work.

>  - S390, AArch32, AArch64, PPC64 Linux
>(SPARC currently disabled because of the migration-test flakiness)

We would need to get machines from somewhere. Setting up a headless
SynQuacer should be easy enough and we have qemu-test which is a
ThunderX beast. I guess the IBM guys would have to chime in if they
could find PPC/s390 boxen because I'm guessing spamming the GCC build
farm with our test runners would be a little unfair.

>  - OSX

Currently run on Travis and recently Cirrus

>  - FreeBSD, OpenBSD, NetBSD via the tests/vm setup

We build on FreeBSD on Cirrus - but any x86 box can run the test/vm
setup assuming you're just kicking it off with a make vm-test type thing?

>  - various x86-64 configs: from-clean; debug; clang; TCI; no-tcg;
>linux-static (including 'make check-tcg')

This is already covered in our rather large Travis matrix. The trick
will be making it all fast enough.

--
Alex Bennée



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-15 Thread Alex Bennée


Fam Zheng  writes:

>> On Mar 15, 2019, at 02:22, Peter Maydell  wrote:
>>
>> On Thu, 14 Mar 2019 at 15:57, Alex Bennée  wrote:
>>> Testing in the Cloud
>>> 
>>>
>>> After BuildBot went out-of-service we have been relying heavily on Travis
>>> as our primary CI platform. This has been creaking somewhat under the
>>> strain and while we have a large test matrix its coverage is fairly
>>> Ubuntu/x86 centric. However in recent months we've expanded and we now
>>> have:
>>>
>>>  - Shippable, cross compilers - catches a lot of 32/64 bit isms
>>>  - Cirrus, FreeBSD and MacOS builds
>>>  - GitLab, Alternative x86/Debian - iotests
>>
>> Are any of these capable of replacing my ad-hoc collection
>> of build test systems for testing merges ? I would quite like
>> to be able to do that, because it would make it easier for
>> other people to take over the process of handling pull requests
>> when I'm away.
>>
>> I think the main requirements for that would be:
>> * covers full range of hosts[*]
>> * can be asked to do a test build of a merge before
>>   I push it to master
>> * reliably completes all builds within say 90 minutes
>>   of being asked to start
>>
>> [*] I currently test:
>> - windows crossbuilds
>> - S390, AArch32, AArch64, PPC64 Linux
>>   (SPARC currently disabled because of the migration-test flakiness)
>> - OSX
>> - FreeBSD, OpenBSD, NetBSD via the tests/vm setup
>> - various x86-64 configs: from-clean; debug; clang; TCI; no-tcg;
>>   linux-static (including 'make check-tcg')
>
> I think the GitLab CI architecture is quite capable of doing what you
> want here. Some effort will be needed to set up the gitlab-runners in
> each of the above environments, and I expect some tweaking will be
> needed to get the automation smooth, but it is fairly straightforward
> and manageable:
>
> https://docs.gitlab.com/runner/

If only it were that simple. It seems they don't currently have arm64
packaged, only armhf:

  https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/725

I had installed the gitlab-runner from the Debian repo but it was out
of date and didn't seem to work correctly.

Their build system seems... interesting... in that it requires qemu-arm
to build on any host. I think this is because they bundle all the
various architecture runners in one package, which makes it a bit hard
to follow. Also it's in Go and has *all* the Go dependencies, which
seems to be a bit of a horror show in the distro packaging department.

However, conceptually, if we can get over that hurdle it does look as
though it could be quite promising. It's just that getting to that
point might require a diversion to get GitLab's multiarch support up to
speed.
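
For reference, hooking one of our own arm64 boxes up as a runner should
be no more than something like this (the URL, token, description and
tags here are just placeholders):

  # Register a docker-executor runner against gitlab.com; the registration
  # token comes from the project or group CI settings. Jobs can then be
  # pinned to this box via the "aarch64" tag.
  sudo gitlab-runner register \
      --non-interactive \
      --url https://gitlab.com/ \
      --registration-token "$REGISTRATION_TOKEN" \
      --executor docker \
      --docker-image debian:stable \
      --description "qemu-arm64-runner" \
      --tag-list aarch64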

By the way, GitLab offer their additional tiers for free to FLOSS
projects:

  https://about.gitlab.com/2018/06/05/gitlab-ultimate-and-gold-free-for-education-and-open-source/

so perhaps I should make the application for qemu-project. The main
benefit would be upping the total number of CI minutes we get on the
shared runners.

Shippable also offer a BYON (Bring Your Own Node) solution which we
should be able to plug one of the Packet.net arm64 servers into, but it
seems to require a node license to use so I haven't been able to play
with it yet.

(CCed: Ed from packet.net who runs the Works on ARM program)

--
Alex Bennée



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-14 Thread Fam Zheng



> On Mar 15, 2019, at 02:22, Peter Maydell  wrote:
> 
> On Thu, 14 Mar 2019 at 15:57, Alex Bennée  wrote:
>> Testing in the Cloud
>> 
>> 
>> After BuildBot went out-of-service we have been relying heavily on Travis
>> as our primary CI platform. This has been creaking somewhat under the
>> strain and while we have a large test matrix its coverage is fairly
>> Ubuntu/x86 centric. However in recent months we've expanded and we now
>> have:
>> 
>>  - Shippable, cross compilers - catches a lot of 32/64 bit isms
>>  - Cirrus, FreeBSD and MacOS builds
>>  - GitLab, Alternative x86/Debian - iotests
> 
> Are any of these capable of replacing my ad-hoc collection
> of build test systems for testing merges ? I would quite like
> to be able to do that, because it would make it easier for
> other people to take over the process of handling pull requests
> when I'm away.
> 
> I think the main requirements for that would be:
> * covers full range of hosts[*]
> * can be asked to do a test build of a merge before
>   I push it to master
> * reliably completes all builds within say 90 minutes
>   of being asked to start
> 
> [*] I currently test:
> - windows crossbuilds
> - S390, AArch32, AArch64, PPC64 Linux
>   (SPARC currently disabled because of the migration-test flakiness)
> - OSX
> - FreeBSD, OpenBSD, NetBSD via the tests/vm setup
> - various x86-64 configs: from-clean; debug; clang; TCI; no-tcg;
>   linux-static (including 'make check-tcg')

I think the GitLab CI architecture is quite capable of doing what you want
here. Some effort will be needed to set up the gitlab-runners in each of the
above environments, and I expect some tweaking will be needed to get the
automation smooth, but it is fairly straightforward and manageable:

https://docs.gitlab.com/runner/

Fam

> 
> thanks
> -- PMM
> 





Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-14 Thread Peter Maydell
On Thu, 14 Mar 2019 at 15:57, Alex Bennée  wrote:
> Testing in the Cloud
> 
>
> After BuildBot went out-of-service we have been relying heavily on Travis
> as our primary CI platform. This has been creaking somewhat under the
> strain and while we have a large test matrix its coverage is fairly
> Ubuntu/x86 centric. However in recent months we've expanded and we now
> have:
>
>   - Shippable, cross compilers - catches a lot of 32/64 bit isms
>   - Cirrus, FreeBSD and MacOS builds
>   - GitLab, Alternative x86/Debian - iotests

Are any of these capable of replacing my ad-hoc collection
of build test systems for testing merges ? I would quite like
to be able to do that, because it would make it easier for
other people to take over the process of handling pull requests
when I'm away.

I think the main requirements for that would be:
 * covers full range of hosts[*]
 * can be asked to do a test build of a merge before
   I push it to master
 * reliably completes all builds within say 90 minutes
   of being asked to start

[*] I currently test:
 - windows crossbuilds
 - S390, AArch32, AArch64, PPC64 Linux
   (SPARC currently disabled because of the migration-test flakiness)
 - OSX
 - FreeBSD, OpenBSD, NetBSD via the tests/vm setup
 - various x86-64 configs: from-clean; debug; clang; TCI; no-tcg;
   linux-static (including 'make check-tcg')

thanks
-- PMM



Re: [Qemu-devel] State of QEMU CI as we enter 4.0

2019-03-14 Thread Daniel P . Berrangé
On Thu, Mar 14, 2019 at 03:57:06PM +, Alex Bennée wrote:
> Docker Images
> =
> 
> The addition of docker has unlocked the ability to build a lot more
> tests as well as compile testing on a much wider range of distros. I
> think there are two outstanding areas that need improvement:
> 
> Daniel has been looking at building and hosting the images somewhere.
> This would be useful as it would stop us slamming the distros
> repositories constantly rebuilding the same images and also help reduce
> the time to test.

My intent was/still is to make use of quay.io for hosting prebuilt
images.

As well as avoiding repeated builds for developers, it means that
developers can be guaranteed to actually be testing with the same
content that the automated CI did. Currently everyone using the
docker images potentially has a slightly different environment, as
it depends on what packages were in the repos when they built
the image locally. This is very bad for reproducibility.
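
A nice side effect is that once the images live in a registry we can
pin them by immutable digest rather than by tag, so everyone really
does get bit-for-bit identical content (the repository name and digest
below are just placeholders):

  # Pulling by digest guarantees the exact image content, unlike a mutable
  # tag such as :latest which can change under you.
  docker pull \
      quay.io/qemu-project/debian-arm64-cross@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef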

Libvirt uses quay.io for hosting images already and I've been
looking at creating a script to automate usage of it via their
REST API. Once done the same script should be usable by QEMU
too.
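
The gist of it is just a handful of authenticated REST calls, roughly
along these lines (written from memory of the quay.io v1 API, so the
endpoint and field names should be double-checked, and "qemu-project"
is only a stand-in for whatever namespace we end up using):

  # Create (or ensure) a public repository under the organisation so the
  # automated build job has somewhere to push to. QUAY_TOKEN is an OAuth
  # token belonging to an application in that organisation.
  curl -s -X POST https://quay.io/api/v1/repository \
      -H "Authorization: Bearer $QUAY_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"namespace": "qemu-project",
           "repository": "debian-arm64-cross",
           "visibility": "public",
           "description": "prebuilt QEMU build/test image"}'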

The idea would be that we still have docker files in the
tests/docker/dockerfiles directory, but they would only be used
for an automated job which triggers builds on quay.io, or for the
few people who need to make changes to the dockerfiles. The current
make rules used by developers / CI systems for executing test builds
would be changed to simply pull the pre-built image off quay.io
instead of running a docker build again.

> The other area that needs some work is better supporting non-x86 hosts.
> While Docker's multi-arch story is much better (docker run debian:stable
> will DTRT on any main architecture) we get stumped by things like
> Debian's uneven support of cross compilers. For 4.1 I'd like to
> reorganise the dockerfiles subdirectory into multiarch and arch specific
> directories so we approach this in a less ad-hoc way. It would also be
> nice to have the ability to gracefully fallback to linux-user powered
> images where the host architecture doesn't have what we need.


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|