Hi Steve,

On Wed, May 25, 2016 at 12:22:21AM -0700, Steve Beattie wrote:
> =======
> precise
> =======
>
> http://people.canonical.com/~ubuntu-archive/proposed-migration/precise/update_excuses.html#eglibc
>
> Regression in autopkgtest for dahdi-linux 1:2.5.0.1+dfsg-1ubuntu2.2 (i386):
> Regression in autopkgtest for dahdi-linux 1:2.5.0.1+dfsg-1ubuntu2.2 (amd64):
>
> tests routinely fail:
>   http://autopkgtest.ubuntu.com/packages/d/dahdi-linux/precise/i386/
>   http://autopkgtest.ubuntu.com/packages/d/dahdi-linux/precise/amd64/
>
> I can't find what's causing these tests to be run or how they are
> run, dahdi-linux source package has no Testsuite: header nor a
> debian/tests/ directory.

Probably the special hook to run dkms builds as autopkgtests.

> Regression in autopkgtest for gdnsd 2.1.2-1 (i386):
> Regression in autopkgtest for gdnsd 2.1.2-1 (amd64):
>
> gdnsd fails on service startup in postinst:
>   http://autopkgtest.ubuntu.com/packages/g/gdnsd/wily/i386/
>   http://autopkgtest.ubuntu.com/packages/g/gdnsd/wily/amd64/
>
> I am able to reproduce this in a vm, as the gdnsd daemon attempts to
> bind to port 53, which conflicts with dnsmasq running on loopback
> bound port 53. I'm not sure why this isn't a problem for the
> successful test runs.

This is marked 'badtest' for devel and should be marked so for the
stable releases. I analyzed this before, and don't recall whether the
successes on other architectures were due to dnsmasq not running there,
or simply a failure to detect that the systemd unit hadn't started.
(A small illustration of the bind conflict is included at the end of
this mail.)

On Wed, May 25, 2016 at 01:17:14AM -0700, Steve Beattie wrote:
> [While I support and appreciate all the efforts that that go into
> the adt infrastructure, things seem awfully... brittle. Is there
> something the community can do to make them a more reliable indicator
> of regressions?]

As Martin points out, you've found the worst-case scenario, where nearly
all the autopkgtests in the archive are being run in response to a
single upload. However, even our worst-case scenario is something we
should strive to make *good*. There are two sources of false positives
for autopkgtest failures in the SRU process that we ought to address.

 - Tests that we know were buggy as of release. We have a way to mark
   these tests as bad: a "force-badtest" proposed-migration hint. We use
   these hints routinely for the devel series, but we haven't been
   carrying them over to the stable releases despite having the
   infrastructure for it (e.g. lp:~ubuntu-sru/britney/hints-ubuntu-trusty/
   has a hints file, and all the others are empty). So first, we should
   copy over badtest hints when we make a new stable release, so that we
   have reasonable baseline data. Second, when SRU team members see that
   an autopkgtest result makes no sense, *please* record a badtest hint
   for posterity instead of just ignoring the result, so that future
   SRUs don't have to retread the same ground. (An example of what such
   a hint looks like is included at the end of this mail.)

 - Tests that regressed due to infrastructure changes. We try to keep
   the testbed unchanged throughout the lifecycle of a release, but it's
   a complex system and regressions can happen, including regressions
   introduced by SRUs of "unrelated" packages that thus evaded
   detection. I think we should address these by periodically resetting
   the baseline for tests in the stable release - rerunning all
   autopkgtests for release+updates, and capturing this as the new
   baseline against which test regressions are measured. If we could do
   this, say, once every 3 months, it should cut down on the busywork of
   chasing false positives. (A rough sketch of the regression check
   against such a baseline is also at the end of this mail.)

Martin, is this something that would be possible to implement?
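Since I referenced them above, a few illustrative snippets follow; none
of this is existing infrastructure code.

First, the gdnsd/dnsmasq bind conflict in miniature. This is only a
demonstration of the failure mode: it uses an unprivileged stand-in
port (5300) instead of the real port 53 so it can run without root, and
a UDP socket as a proxy for whatever gdnsd actually binds first.

    #!/usr/bin/env python3
    # Illustrative only: shows why a wildcard bind fails once a
    # loopback-only listener (dnsmasq in the real case) holds the port.
    import socket

    PORT = 5300  # unprivileged stand-in for port 53

    # "dnsmasq": bound to loopback only.
    loopback = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    loopback.bind(("127.0.0.1", PORT))

    # "gdnsd": tries to bind the wildcard address on the same port.
    wildcard = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        wildcard.bind(("0.0.0.0", PORT))
        print("wildcard bind succeeded - no conflict on this host")
    except OSError as err:
        # On Linux this raises EADDRINUSE, which is what makes the
        # service start in gdnsd's postinst fall over on the testbed.
        print("wildcard bind failed: %s" % err)
    finally:
        wildcard.close()
        loopback.close()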
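Second, what recording a badtest hint amounts to. The exact line below
is a sketch (check the existing devel-series hints files for the
authoritative syntax), but a force-badtest entry is roughly:

    # hypothetical entry in a per-series hints branch, e.g. a hints file
    # under lp:~ubuntu-sru/britney/hints-ubuntu-precise/, marking this
    # gdnsd test failure as known-bad rather than a regression:
    force-badtest gdnsd/2.1.2-1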
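Finally, a minimal sketch of the regression check I have in mind for
the periodic baseline reset. The data structures and names here are
hypothetical and don't correspond to existing britney or autopkgtest
code; the point is just that a failure should only count as a
regression if the same test passed in the most recently captured
release+updates baseline.

    # Minimal sketch, assuming results are available as simple
    # {(source, arch): "PASS"/"FAIL"} mappings.
    def regressions(baseline, current):
        """Return tests that fail now but passed in the stored baseline.

        Failures already present in the baseline (known-bad as of the
        last rerun against release+updates) are not reported, which is
        the point of periodically resetting the baseline.
        """
        return {key: result
                for key, result in current.items()
                if result == "FAIL" and baseline.get(key) == "PASS"}

    if __name__ == "__main__":
        baseline = {("gdnsd", "amd64"): "FAIL",    # failing at capture time
                    ("some-src", "i386"): "PASS"}
        current = {("gdnsd", "amd64"): "FAIL",     # not a regression
                   ("some-src", "i386"): "FAIL"}   # a real regression
        print(regressions(baseline, current))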
Thanks,
--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
[email protected]                                     [email protected]
