On Wed, 16 Feb 2022 at 21:15:16 +0100, Paul Gevers wrote:
> I looked at the results of the autopkgtest of you package on ppc64el because
> it was showing up as a regression for the upload of glibc. I noticed that
> the test regularly fails since the beginning of February this year. The
> failure is always the same (so far), and it happens on multiple of our
> hosts.

Is there anything unusual about the ppc64el CI runners compared with other
architectures? (For example: lots of CPUs, few CPUs, lots of RAM, little
RAM, lots of I/O bandwidth, running on tmpfs, using qemu, using lxc,
running many tests in parallel, ...)

From https://ci.debian.net/packages/d/dbus/testing/ppc64el/ it looks like
this is failing about 25% of the time. Does that match your experience?

From the timing you mention, I think this was probably triggered by
systemd having added a Recommends on dbus-user-session in libpam-systemd.
Previously, this test would have been skipped if dbus was tested in a
relatively minimal container.

> Bail out! /run/user/1000/dbus-1/services is not a directory

My best guess at the root cause is a race. When
gnome-desktop-testing-runner schedules lots of unit tests in a
newly-opened user session, and the integration test for transient
services happens to be one of the first to run, the session dbus-daemon
will not necessarily have been started by systemd socket activation yet.
The more CPU cores the test runner has, the more likely it is that the
test wins the race with the dbus-daemon and fails.

I have a possible patch which I'll upload soon. Would you be able to
schedule several consecutive runs on the affected hardware to make sure
it's really fixed? 10 runs should be enough for a reasonable level of
confidence.

    smcv
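
To illustrate the race described above, here is a minimal sketch in Python
of one way a test could wait for the directory to appear instead of
failing on the first check. It is not the patch mentioned in this message:
the function name wait_for_directory, the timeout and the polling interval
are all hypothetical choices made for the example. Only the path comes
from the failure message.

```python
import os
import time


def wait_for_directory(path, timeout=5.0, interval=0.05):
    """Poll until `path` is a directory; return False if `timeout` expires."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.isdir(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Path taken from the failure message; uid 1000 is specific to that run.
    services_dir = "/run/user/1000/dbus-1/services"
    if wait_for_directory(services_dir):
        print("ok: %s is a directory" % services_dir)
    else:
        print("Bail out! %s is not a directory" % services_dir)
```

Polling like this only narrows the window. A fix that first makes sure the
session dbus-daemon has actually been started, and only then checks the
directory, would probably be more robust.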