On Fri, Sep 04, 2020 at 09:18:16AM +0100, Daniel P. Berrangé wrote:
> On Thu, Sep 03, 2020 at 08:11:39PM -0400, Cleber Rosa wrote:
> > On Thu, Jul 09, 2020 at 11:30:29AM +0100, Daniel P. Berrangé wrote:
> > > On Wed, Jul 08, 2020 at 10:46:57PM -0400, Cleber Rosa wrote:
> > > > This is a mapping of Peter's "remake-merge-builds" and
> > > > "pull-buildtest" scripts, gone through some updates, adding some build
> > > > options and removing others.
> > > >
> > > > The jobs currently cover the machines that the QEMU project owns, and
> > > > that are set up and ready to run jobs:
> > > >
> > > >  - Ubuntu 18.04 on S390x
> > > >  - Ubuntu 20.04 on aarch64
> > > >
> > > > During the development of this set of jobs, the GitLab CI was tested
> > > > with many other architectures, including ppc64, s390x and aarch64,
> > > > along with the other OSs (not included here):
> > > >
> > > >  - Fedora 30
> > > >  - FreeBSD 12.1
> > > >
> > > > More information can be found in the documentation itself.
> > > >
> > > > Signed-off-by: Cleber Rosa <cr...@redhat.com>
> > > > ---
> > > >  .gitlab-ci.d/gating.yml | 146 +++++++++++++++++
> > >
> > > AFAIK, the jobs in this file just augment what is already defined
> > > in the main .gitlab-ci.yml. Also since we're providing setup info
> > > for other people to configure custom runners, these jobs are usable
> > > for non-gating CI scenarios too.
> > >
> >
> > If you mean that they introduced new jobs, you're right.
> >
> > > IOW, the jobs in this file happen to be usable for gating, but they
> > > are not the only gating jobs, and can be used for non-gating reasons.
> > >
> >
> > Right, I do not doubt these jobs may be useful to other people and in
> > scenarios other than "before merging a patch series".
> >
> > > This is a complicated way of saying that gating.yml is not a desirable
> > > filename, so I'd suggest splitting it in two and having these files
> > > named based on what their contents are, rather than their use case:
> > >
> > >    .gitlab-ci.d/runners-s390x.yml
> > >    .gitlab-ci.d/runners-aarch64.yml
> > >
> > > The existing jobs in .gitlab-ci.yml could possibly be moved into
> > > a .gitlab-ci.d/runners-shared.yml file for consistency.
> > >
> >
> > Do you imply that every gitlab CI job should be a gating job?  And
> > that the same jobs should be used by other people with their own
> > forks?  I find this problematic because:
> >
> >  * It would trigger pipelines with jobs that, unless every user has the
> >    same runners configured, would have unfulfilled jobs that don't have
> >    matching hardware.
>
> Jobs that require a custom runner should not be set to run by default,
> but individual contributors must absolutely be able to opt in to running
> those jobs simply by registering a runner on their account.
>
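
One way the opt-in behaviour described above could be expressed is
sketched below. It assumes the architecture-named file suggested earlier
in the thread, a runner registered with an "s390x" tag, and a CI/CD
variable named S390X_RUNNER_AVAILABLE that a contributor sets on their
own fork. The job name, tag, variable name and build commands are all
hypothetical and are not taken from the patch under discussion.

```yaml
# .gitlab-ci.d/runners-s390x.yml (file name as suggested in this thread;
# the job name, tag and variable below are illustrative assumptions)
ubuntu-18.04-s390x-all:
  stage: build
  # Only runners registered with this tag will pick up the job.
  tags:
    - s390x
  # The job is added to a pipeline only when the variable is set to "1",
  # so a fork that leaves it unset gets no pending job for hardware it
  # does not have.
  rules:
    - if: '$S390X_RUNNER_AVAILABLE == "1"'
  script:
    - mkdir build
    - cd build
    - ../configure
    - make -j"$(nproc)"
    - make check
```

With a job shaped like this, registering a runner and setting the
variable in the fork's CI/CD settings would be the whole opt-in step;
the job definition itself would be the same for the project and for a
fork.
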
Agreed, and that's why they have been put into this different "gating"
class here.

> >  * It dilutes the idea that those jobs are inherently different with
> >    regard to the management of their infrastructure.
>
> I don't really know what you mean here, but "Inherently different"
> does not sound like a desirable property.
>

Organizations and individuals will have responsibility over the
infrastructure they choose to add, which is "inherently different"
from the gitlab shared machines.  Not sure there's a way around it.

> >  * It destroys the notion of layered testing, for whoever finds
> >    that worthwhile, where a faster turnaround could/would be possible
> >    with fewer jobs for every push, and many more jobs before a merge.
>
> The key goal of CI is to reduce the burden on maintainers. The biggest
> cost is if we merge code and failure is noticed after merge. It is
> still a large cost, however, if Peter only finds a CI failure when he
> attempts the pre-merge test. He has to throw out the pull request,
> putting more work on the subsystem maintainer. The subsystem maintainer
> may have to throw it back to the original author.
>
> The ideal scenario that we need to strive towards is that the original
> author has tested their code with 100% coverage of all the CI jobs QEMU
> has defined.
>

I agree... but it's also unrealistic at this point, right?  For
instance, do we have s390x boxes to run all of those?  Avocado has
been using Travis CI for s390x/ppc64/aarch64, and those are quite
unreliable even with a load many orders of magnitude smaller than the
QEMU project's.  So, resources are needed to have this flat, 100%
coverage, "ideal scenario" you describe.

> Any time there is a job that is not run by authors, but only by the
> maintainers, we are putting increased burden on the maintainers, so
> we must minimize that.
>

I agree.  But if resources are limited, then should the testing scope
be decreased so that it's equalized?

> IOW, layered testing is not desirable as a goal. Rather, layered testing
> is just a default setup, but we'd encourage contributors to run the
> full set of CI jobs, especially if they are frequent contributors.
> The more they run themselves, the less burden on subsystem maintainers
> and Peter, and thus the better we all scale.
>

We agree on goals; we don't agree on the strategy, though.

> > Finally, I find the split by runner architecture you suggested
> > problematic because different organizations may have jobs for the same
> > architecture.  I believe that files for different organizations may be
> > a better organization instead.  Entries in the MAINTAINERS file are one
> > example where the grouping by architecture may not be optimal.
>
> I don't think we should be structuring jobs around organizations. We
> should be defining a set of desired jobs we wish to be able to run.
> Any organization can bring a runner that is capable of running the
> jobs and donate it to the QEMU project for our formal CI runner.
> The organization is not defining the job though - QEMU is defining
> the jobs we expect to have used for testing.
>

This was discussed previously[1].

> This is key because any contributor needs to be able to spin up an
> identical environment to replicate any build failures. We don't want
> runners for merge testing that are built as a blackbox by someone.
> That is the single biggest painpoint with Peter's current merge
> jobs - we can't easily replicate Peter's merge env even if we had
> the matching hardware available.
>

With the right automation, such as the playbooks introduced here, any
person with the same hardware should have an environment to replicate
a job and debug an issue.

[1] - https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg00231.html

Best regards,
- Cleber.

> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|