On Tue, Jun 24, 2025 at 11:41:28AM -0600, Warner Losh wrote:
> FreeBSD does maintain all our archival releases forever. They never change.
> But, we don't have permanent links to them today. We start with one URL and
> then migrate to a second one when they transition from supported to
> unsupported.
> We do this, in part, to make sure people upgrade. So in effect, this breakage
> means that our notion is "working" in the sense that the FreeBSD project's
> goals of making people "keep up to date."
>
> This does, I realize, clash with the views that QEMU wants to have some stable
> way to test images over time, even if the upstream's notion of supported or
> not changes.
>
> One easy idea might be to 'prestage' the 'legacy' releases when they
> are supported on the 'legacy' server so that tests can be written with
> the legacy path so that these tests always work, now and in the future.
>
> So, this is terrible from a FreeBSD point of view. We'd like it if
> qemu always tested all of our releases, as well as snapshots of the
> tip of the spear.

FWIW, there are two distinct POVs on testing which are clashing here.

What you're describing, IMHO, is a desire for QEMU to perform what
I would consider integration testing against all FreeBSD releases,
and forthcoming releases of FreeBSD.

What QEMU's functional test suite is aiming to do is provide sufficient
coverage of QEMU's functionality that we avoid shipping regressions.

Where it gets fuzzy is that a functional test suite has overlap with,
and can be a decent proxy for, an integration test suite.

The key difference I see is around expectations for the results of
the test harness.

For QEMU's functional test suite, an overriding concern is that a
failure of the test suite *MUST* reflect a fault in QEMU. We want
to minimize (ideally eliminate) any failures caused by factors
outside QEMU. A failure should be something that can be immediately
referred back to the author of the PULL that triggered it, without
needing triage to determine if it is a failure caused by something
outside QEMU. A functional test failure should generally gate the
merging of a PULL request, given that it should reflect a clear
QEMU fault.

With this in mind, we don't ever want to be testing unreleased
snapshots, and even for released images, we always want to fixate
on a specific image hash. Similarly the execution env of the test
suite is a docker container that has fairly well constrained
software, though currently we do not fixate our container images
on particular package versions, which has caused us painful
spurious failures at times.

An integration test suite, by contrast, should be open to the idea
that failures can be caused by any moving part in the stack, whether
the host OS, QEMU, or the guest OS. Accepting that, however, means
taking on a significantly higher burden in the triage of failures -
that can easily become a full time job for one or more people, to
diagnose problems and then herd cats to get them fixed in whichever
piece was at fault.
This makes integration testing mostly unsuitable for use as a gating
test for merging PULL requests. It would run asynchronously and
problems could potentially take a long time to resolve, though
ideally be resolved by the time of release.

> There's got to be some way to have some shared responsibility that
> we can automate. FreeBSD could test the most recent release of qemu
> against a bunch of images in our CI cluster. But we don't actually
> have a CI cluster we could put that into (our focus is just a little
> different) today.

The issue of CI resources also impacts QEMU :-( We have to be wary
that our upstream testing is using our own limited CI resources, and
any contributors using GitLab CI also have limited quota.

The human constraint is probably the overriding concern I would have
from the QEMU side though.

I'd love it if QEMU did full integration testing across all guest OSes
we can get our hands on, both current & forthcoming FreeBSD/Linux
releases, and the countless historical releases of many OSes.
Realistically we just don't have the human resources to manage such
a testing effort, even if we found the hardware to support it.

Putting my Fedora hat on, we rebase QEMU in Fedora rawhide when rc0
comes out, and Fedora's QA team rely on the rawhide QEMU in doing
release testing. While this isn't always timely enough to prevent
QEMU bugs getting into Fedora which then impact Fedora releases, it
is the best we can do given constraints of both projects.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|