On Tue, Jun 24, 2025 at 11:41:28AM -0600, Warner Losh wrote:

> FreeBSD does maintain all our archival releases forever. They never change.
> But, we don't have permanent links to them today. We start with one URL and
> then migrate to a second one when they transition from supported to
> unsupported. We do this, in part, to make sure people upgrade. So in effect,
> this breakage means that our notion is "working" in the sense that the
> FreeBSD project's goals of making people "keep up to date."
> 
> This does, I realize, clash with the views that QEMU wants to have some stable
> way to test images over time, even if the upstream's notion of supported or
> not changes.
> 
> One easy idea might be to 'prestage' the 'legacy' releases when they are
> supported on the 'legacy' server so that tests can be written with the
> legacy path so that these tests always work, now and in the future.
> 
> So, this is terrible from a FreeBSD point of view. We'd like it if qemu
> always tested all of our releases, as well as snapshots of the tip of the
> spear.

FWIW, there are two distinct POVs on testing which are clashing here.

What you're describing, IMHO, is a desire for QEMU to perform what
I would consider integration testing against all FreeBSD releases,
and forthcoming releases of FreeBSD.

What QEMU's functional test suite is aiming to do is provide
sufficient coverage of QEMU's functionality that we avoid
shipping regressions.

Where it gets fuzzy is that a functional test suite has overlap
with, and can be a decent proxy for, an integration test suite.

The key difference I see is around expectations for the results
of the test harness.

For QEMU's functional test suite, an overriding concern is that
a failure of the test suite *MUST* reflect a fault in QEMU.

We want to minimize (ideally eliminate) any failures caused by
factors outside QEMU. A failure should be something that can
be immediately referred back to the author of the PULL that
triggered it, without needing triage to determine if it is a
failure caused by something outside QEMU. A functional test
failure should generally gate the merging of a PULL request,
given that it should reflect a clear QEMU fault.

With this in mind, we don't ever want to be testing unreleased
snapshots, and even for released images, we always want to
fixate on a specific image hash. Similarly the execution env
of the test suite is a docker container that has fairly well
constrained software, though currently we do not fixate
our container images on particular package versions, which
has caused us painful spurious failures at times.
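
To illustrate what fixating on a specific image hash means in practice,
here is a minimal sketch in Python. It is not QEMU's actual test code:
the names AssetMismatch, sha256_of and verify_pinned are hypothetical,
and it assumes the image has already been downloaded to a local path.
The point is only that the test refuses to run against any image whose
content differs from the one the test was written against.

```python
import hashlib
from pathlib import Path


class AssetMismatch(Exception):
    """Raised when a file's content does not match the pinned hash."""


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(path: Path, expected_sha256: str) -> Path:
    """Return path if its content matches the pinned hash, else raise.

    Hypothetical helper: a real harness would also handle download
    and caching, which are left out of this sketch.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise AssetMismatch(
            f"{path}: expected {expected_sha256}, got {actual}")
    return path
```

With a check like this, an image that is silently replaced upstream
shows up as an asset mismatch rather than as a confusing guest boot
failure that would otherwise be blamed on the PULL under test.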


An integration test suite, by contrast, should be open to the
idea that failures can be caused by any moving part in the stack,
whether the host OS, QEMU, or the guest OS. Accepting that,
however, means taking on a significantly higher burden in the
triage of failures - that can easily become a full time job for
one or more people, to diagnose problems and then herd cats to
get them fixed in whichever piece was at fault.

This makes integration testing mostly unsuitable for use as a
gating test for merging PULL requests. It would run asynchronously
and problems could potentially take a long time to resolve, though
ideally they would be resolved by the time of release.

> There's got to be some way to have some shared responsibility that we can
> automate. FreeBSD could test the most recent release of qemu against a
> bunch of images in our CI cluster. But we don't actually have a CI cluster
> we could put that into (our focus is just a little different) today.

The issue of CI resources also impacts QEMU :-( We have to be wary that
our upstream testing is using our own limited CI resources, and any
contributors using GitLab CI also have limited quota.

The human constraint is probably the overriding concern I would
have from the QEMU side though. I'd love it if QEMU did full
integration testing across all guest OSes we can get our hands on,
both current & forthcoming FreeBSD/Linux releases, and the
countless historical releases of many OSes. Realistically we just
don't have the human resources to manage such a testing effort,
even if we found the hardware to support it.

Putting my Fedora hat on, we rebase QEMU in Fedora rawhide when
rc0 comes out, and Fedora's QA team rely on the rawhide QEMU in
doing release testing. While this isn't always timely enough to
prevent QEMU bugs getting into Fedora which then impact Fedora
releases, it is the best we can do given constraints of both
projects.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

