This is harder to work around or catch: in the new case the test does
*not* time out; it just kills sshd (or something in the kernel that
breaks ssh/networking). In general these are cases that we do want to
treat as "tmpfail" and auto-restart, and I don't want to treat an
auxverb failure as a failure in general.

Perhaps we need to introduce some kind of retry counter, but this would
need to span at least half a day -- three tmpfails on the same worker in
a row are usually a sign of a broken cloud or a broken testbed image,
not a test failure. So perhaps we need some logic to check whether
other tests also tmpfail on the same worker/cloud and, if they do not,
call that test a failure.

This would all require state keeping, which we don't currently do (the
only state is the AMQP queue contents).
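
As a rough illustration of the heuristic above, here is a minimal
sketch, not existing autopkgtest code. The class name TmpfailTracker,
the threshold of three tmpfails, the 12-hour window, and the in-memory
storage are all assumptions for illustration; a real implementation
would need persistent state shared between workers.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 12 * 60 * 60  # assumed: "at least half a day"
MAX_TMPFAILS = 3               # assumed threshold before judging a test


class TmpfailTracker:
    """Hypothetical in-memory record of tmpfails per worker."""

    def __init__(self, window=WINDOW_SECONDS, max_tmpfails=MAX_TMPFAILS):
        self.window = window
        self.max_tmpfails = max_tmpfails
        # worker name -> deque of (timestamp, test name) tmpfail events
        self.events = defaultdict(deque)

    def record_tmpfail(self, worker, test, now=None):
        now = time.time() if now is None else now
        queue = self.events[worker]
        queue.append((now, test))
        # Forget events that fell out of the observation window.
        while queue and now - queue[0][0] > self.window:
            queue.popleft()

    def verdict(self, worker, test, now=None):
        """Return "retry" (blame the testbed) or "fail" (blame the test)."""
        now = time.time() if now is None else now
        recent = [name for ts, name in self.events[worker]
                  if now - ts <= self.window]
        own = sum(1 for name in recent if name == test)
        others = len(recent) - own
        if own < self.max_tmpfails:
            return "retry"
        # Other tests tmpfail on this worker too: likely a broken
        # cloud or testbed image, so keep retrying.
        if others > 0:
            return "retry"
        # Only this test keeps killing the testbed: call it a failure.
        return "fail"
```

With this sketch, a test that tmpfails three times on a worker where
nothing else tmpfails within the window gets the verdict "fail", while
any tmpfail from another test on that worker keeps it at "retry".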

** Changed in: autopkgtest (Ubuntu)
       Status: In Progress => Triaged

** Changed in: autopkgtest (Ubuntu)
   Importance: High => Medium

** Package changed: autopkgtest (Ubuntu) => auto-package-testing

** Changed in: auto-package-testing
    Milestone: ubuntu-16.10 => None

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1630578

Title:
  broken kernel causes eternal test retry loop

To manage notifications about this bug go to:
https://bugs.launchpad.net/auto-package-testing/+bug/1630578/+subscriptions
