Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Hi Paul,

On Sat, Mar 04, 2023 at 05:22:32PM +0100, Paul Gevers wrote:
> Hi,
>
> On 04-03-2023 17:19, Salvatore Bonaccorso wrote:
> > So would agree, it makes sense to update all remaining hosts and then
> > look into it again in case the problem arises again.
>
> All but s390x are up-to-date and liburing testing works. I can't upgrade
> s390x because of bug #1031753 (which is worse for ci.d.n than this bug as
> far as I see).

OK, understood! The fix for that is included in the next upload.

Regards,
Salvatore
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Hi,

On 04-03-2023 17:19, Salvatore Bonaccorso wrote:
> So would agree, it makes sense to update all remaining hosts and then
> look into it again in case the problem arises again.

All but s390x are up-to-date and liburing testing works. I can't upgrade
s390x because of bug #1031753 (which is worse for ci.d.n than this bug as
far as I see).

Paul
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Hi,

On Thu, Mar 02, 2023 at 12:47:53PM +0100, Guillem Jover wrote:
> Hi!
>
> On Thu, 2023-03-02 at 12:12:06 +0100, Paul Gevers wrote:
> > Control: fixed -1 5.10.162-1 6.1.8-1
>
> > On Sun, 21 Aug 2022 21:35:58 +0200 Bastian Blank wrote:
> > > On Sun, Aug 21, 2022 at 07:42:10PM +0200, Guillem Jover wrote:
> > > > It seems like there was a regression with the latest stable update
> > > > that affects the autopkgtest for liburing. Reassigning.
> > >
> > > Please provide enough information to make isolating the problem
> > > possible.
> > >
> > > https://ci.debian.net/packages/libu/liburing/ is completely silent as
> > > there are no results for any of the failed runs.
>
> > I decided to try again to see if I could collect more information. The test
> > now passes on amd64, arm64, i386 and ppc64el, all running 5.10.162-1 and on
> > riscv64 running unstable. However, on armhf, armel (amd64 kernel) and s390x
> > (all running 5.10.158-2), it seems that the observation of Brian is still
> > true, some test in test-unit test segfaults, the test exits and hangs.
> > @Guillem, do you see something more in the output below (armhf log) that may
> > be of interest? And maybe spot something to run in isolation?
>
> > Reading the changelog of 5.10.162-1 I see io_uring mentioned a couple of
> > times. Therefore I assume this bug is fixed in that version. Is it worth
> > pursuing the real issue here?
>
> Thanks for looking into this! As this appears fixed in latest Linux
> releases, then I'd honestly not bother further, and just try to get the
> remaining hosts upgraded to the fixed versions. If this was to reappear
> then it might make sense to look into it again.

FWIW, it is correct, the 5.10.162 release upstream was basically the one
to bring the io_uring implementation in line with the 5.15.y series (with
the io_uring codebase from 5.15.85).

So would agree, it makes sense to update all remaining hosts and then
look into it again in case the problem arises again.

Regards,
Salvatore
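The version split reported in this thread (hosts on 5.10.158-2 still hang, hosts on 5.10.162-1 pass) could be checked per host with something like the sketch below. It takes the two versions from the "Control: fixed" line and compares them numerically within the same major.minor series. The helper names `version_key` and `has_fix` are made up for illustration, and the comparison is a deliberate simplification: it is not dpkg's version-ordering algorithm and ignores suffixes such as `~bpo`.

```python
import re

# Versions taken from the "Control: fixed -1 5.10.162-1 6.1.8-1" line.
FIXED_VERSIONS = ["5.10.162-1", "6.1.8-1"]


def version_key(version):
    """Turn e.g. '5.10.158-2' into (5, 10, 158, 2).

    Simplification: only the numeric fields are kept, so this is not
    a full Debian version comparison.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def has_fix(running, fixed_versions=FIXED_VERSIONS):
    """Return True/False if the running kernel is in a series with a known
    fixed version, or None if the series is not listed."""
    running_key = version_key(running)
    for fixed in fixed_versions:
        fixed_key = version_key(fixed)
        # Only compare against the fixed version of the same major.minor series.
        if running_key[:2] == fixed_key[:2]:
            return running_key >= fixed_key
    return None
```

Called with the versions mentioned above, `has_fix("5.10.158-2")` is false and `has_fix("5.10.162-1")` is true; a kernel from a series not listed in the control line yields `None` rather than a guess.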
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Hi!

On Thu, 2023-03-02 at 12:12:06 +0100, Paul Gevers wrote:
> Control: fixed -1 5.10.162-1 6.1.8-1

> On Sun, 21 Aug 2022 21:35:58 +0200 Bastian Blank wrote:
> > On Sun, Aug 21, 2022 at 07:42:10PM +0200, Guillem Jover wrote:
> > > It seems like there was a regression with the latest stable update
> > > that affects the autopkgtest for liburing. Reassigning.
> >
> > Please provide enough information to make isolating the problem
> > possible.
> >
> > https://ci.debian.net/packages/libu/liburing/ is completely silent as
> > there are no results for any of the failed runs.

> I decided to try again to see if I could collect more information. The test
> now passes on amd64, arm64, i386 and ppc64el, all running 5.10.162-1 and on
> riscv64 running unstable. However, on armhf, armel (amd64 kernel) and s390x
> (all running 5.10.158-2), it seems that the observation of Brian is still
> true, some test in test-unit test segfaults, the test exits and hangs.
> @Guillem, do you see something more in the output below (armhf log) that may
> be of interest? And maybe spot something to run in isolation?

> Reading the changelog of 5.10.162-1 I see io_uring mentioned a couple of
> times. Therefore I assume this bug is fixed in that version. Is it worth
> pursuing the real issue here?

Thanks for looking into this! As this appears fixed in latest Linux
releases, then I'd honestly not bother further, and just try to get the
remaining hosts upgraded to the fixed versions. If this was to reappear
then it might make sense to look into it again.

Regards,
Guillem
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Control: fixed -1 5.10.162-1 6.1.8-1

Hi,

On Sun, 21 Aug 2022 21:35:58 +0200 Bastian Blank wrote:
> On Sun, Aug 21, 2022 at 07:42:10PM +0200, Guillem Jover wrote:
> > It seems like there was a regression with the latest stable update
> > that affects the autopkgtest for liburing. Reassigning.
>
> Please provide enough information to make isolating the problem
> possible.
>
> https://ci.debian.net/packages/libu/liburing/ is completely silent as
> there are no results for any of the failed runs.

I decided to try again to see if I could collect more information. The
test now passes on amd64, arm64, i386 and ppc64el, all running 5.10.162-1
and on riscv64 running unstable. However, on armhf, armel (amd64 kernel)
and s390x (all running 5.10.158-2), it seems that the observation of
Brian is still true, some test in test-unit test segfaults, the test
exits and hangs. @Guillem, do you see something more in the output below
(armhf log) that may be of interest? And maybe spot something to run in
isolation?

When I try to destroy the lxc, that fails and in ps output I see this:

root     3053528  0.0  0.0   5388  3072 ?  Ss  03:34  0:00 [lxc monitor] /var/lib/lxc ci-061-8c60e21c
root     3061512  0.0  0.0      0     0 ?  Ss  03:35  0:00  \_ [systemd]
debian   3110684  0.0  0.0   2140   192 ?  DL  03:37  0:00      \_ ./iopoll-leak.t

Note the "D" state.

Reading the changelog of 5.10.162-1 I see io_uring mentioned a couple of
times. Therefore I assume this bug is fixed in that version. Is it worth
pursuing the real issue here?
Paul

root@ci-061-705317d0:/tmp/autopkgtest-lxc.v8gx_5j5/downtmp# cat test-unit-stdout
+ [ -n ]
+ CC=gcc
+ ./configure --cc=gcc
prefix                        /usr
includedir                    /usr/include
libdir                        /usr/lib
libdevdir                     /usr/lib
relativelibdir
mandir                        /usr/man
datadir                       /usr/share
stringop_overflow             yes
array_bounds                  yes
__kernel_rwf_t                yes
__kernel_timespec             yes
open_how                      yes
statx                         yes
glibc_statx                   yes
C++                           yes
has_ucontext                  yes
NVMe uring command support    yes
liburing_nolibc               no
CC                            gcc
CXX                           g++
+ make runtests
make[1]: Entering directory '/tmp/autopkgtest-lxc.v8gx_5j5/downtmp/build.ksh/src/src'
     CC setup.ol
     CC queue.ol
     CC register.ol
     CC syscall.ol
     AR liburing.a
ar: creating liburing.a
 RANLIB liburing.a
     CC setup.os
     CC queue.os
     CC register.os
     CC syscall.os
     CC liburing.so.2.3
make[1]: Leaving directory '/tmp/autopkgtest-lxc.v8gx_5j5/downtmp/build.ksh/src/src'
make[1]: Entering directory '/tmp/autopkgtest-lxc.v8gx_5j5/downtmp/build.ksh/src/test'
     CC helpers.o
     CC 232c93d07b74.t
     CC 35fa71a030ca.t
     CC 500f9fbadef8.t
     CC 7ad0e4b2f83c.t
     CC 8a9973408177.t
     CC 917257daa0fe.t
     CC a0908ae19763.t
     CC a4c0b3decb33.t
     CC accept.t
     CC accept-link.t
     CC accept-reuse.t
     CC accept-test.t
     CC across-fork.t
     CC b19062a56726.t
     CC b5837bd5311d.t
     CC buf-ring.t
     CC ce593a6c480a.t
     CC close-opath.t
     CC connect.t
     CC cq-full.t
     CC cq-overflow.t
     CC cq-peek-batch.t
     CC cq-ready.t
     CC cq-size.t
     CC d4ae271dfaae.t
     CC d77a67ed5f27.t
     CC defer.t
     CC defer-taskrun.t
     CC double-poll-crash.t
     CC drop-submit.t
     CC eeed8b54e0df.t
     CC empty-eownerdead.t
     CC eventfd.t
     CC eventfd-disable.t
     CC eventfd-reg.t
     CC eventfd-ring.t
     CC exec-target.t
     CC exit-no-cleanup.t
     CC fadvise.t
     CC fallocate.t
     CC fc2a85cb02ef.t
     CC fd-pass.t
     CC file-register.t
     CC files-exit-hang-poll.t
     CC files-exit-hang-timeout.t
     CC file-update.t
     CC file-verify.t
     CC fixed-buf-iter.t
     CC fixed-link.t
     CC fixed-reuse.t
     CC fpos.t
     CC fsync.t
     CC hardlink.t
     CC io-cancel.t
     CC iopoll.t
     CC iopoll-leak.t
     CC io_uring_enter.t
     CC io_uring_passthrough.t
     CC io_uring_register.t
     CC io_uring_setup.t
     CC lfs-openat.t
     CC lfs-openat-write.t
     CC link.t
     CC link_drain.t
     CC link-timeout.t
     CC madvise.t
     CC mkdir.t
     CC msg-ring.t
     CC multicqes_drain.t
     CC nolibc.t
     CC nop-all-sizes.t
     CC nop.t
     CC openat2.t
     CC open-close.t
     CC open-direct-link.t
     CC open-direct-pick.t
     CC personality.t
     CC pipe-eof.t
     CC pipe-reuse.t
     CC poll.t
     CC poll-cancel.t
     CC poll-cancel-all.t
     CC poll-cancel-ton.t
     CC poll-link.t
     CC poll-many.t
     CC poll-mshot-update.t
     CC poll-mshot-overflow.t
     CC poll-ring.t
     CC poll-v-poll.t
     CC pollfree.t
     CC probe.t
     CC read-before-exit.t
     CC read-write.t
     CC recv-msgall.t
     CC
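The "D" state noted in the ps output of this message is uninterruptible sleep: a task blocked inside the kernel that does not act on signals until the kernel call returns, which would be consistent with the container refusing to be destroyed. As a rough sketch (not something used in this bug report), such tasks can be listed by reading `/proc/<pid>/stat`, whose third field is the state letter. The function names below are hypothetical, and `/proc` shows the bare state `D` without the extra flags ps appends (the `L` in `DL`).

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the content of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited during the scan, or entry unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

On a host in the state described above, `uninterruptible_processes()` would be expected to include the stuck `iopoll-leak.t` task; a task that stays in that list across repeated calls is the one worth looking at.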
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Control: tags -1 moreinfo
Control: severity -1 normal

On Sun, Aug 21, 2022 at 07:42:10PM +0200, Guillem Jover wrote:
> It seems like there was a regression with the latest stable update
> that affects the autopkgtest for liburing. Reassigning.

Please provide enough information to make isolating the problem
possible.

https://ci.debian.net/packages/libu/liburing/ is completely silent as
there are no results for any of the failed runs.

Bastian

--
There is an order of things in this universe.
    -- Apollo, "Who Mourns for Adonais?" stardate 3468.1
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Control: reassign -1 linux
Control: affects -1 liburing debci

Hi!

It seems like there was a regression with the latest stable update
that affects the autopkgtest for liburing. Reassigning.

On Mon, 2022-07-18 at 21:07:28 +0200, Paul Gevers wrote:
> Source: liburing
> Version: 2.1-2
> Severity: important
> X-Debbugs-Cc: br...@ubuntu.com

> Some days ago (approximately 7) the autopkgtest of liburing started to
> behave badly on the Debian and Ubuntu infrastructure. It's not totally
> clear to me what happens, but I have lxc containers left behind after
> the test that I can't clean up.
>
> A similar thing seems to happen on the Ubuntu side because they have
> blocked the test from running recently and I'll do the same on our
> side.
>
> https://git.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-package-configs/commit/?id=fedf747ea808837217d373773d105f242819702d
>
> Because your package didn't change in that time, I suspect one of your
> dependencies caused liburing to behave differently. It would be great
> if we figured out what that is.

Thanks,
Guillem
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Hi,

On 20-08-2022 13:44, Guillem Jover wrote:
> The liburing test suite works on the buildds, I just uploaded a new
> release and it built fine locally with Linux 5.19.0-trunk-amd64, and on
> the buildds. I think this needs to be reassigned, I'd assume to the
> kernel (but I'm not entirely sure).

I'd say, go ahead.

> Also could a newer kernel be tried on the infra?

If it's in a reasonable archive and the version sorting is such that
it's below the next stable or security update version, yes. But I don't
want to deviate from the stable (security) archive longer than needed.

Paul
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Hi!

On Tue, 2022-07-19 at 08:28:09 +0200, Paul Gevers wrote:
> On 18-07-2022 23:44, Guillem Jover wrote:
> > While only having skimmed over the logs, the first suspect for me would
> > be the Linux kernel, where liburing might be triggering some kernel bug.
> > Were the systems upgraded around that time with a newer kernel version
> > (is that a 5.18.x from bpo)?
>
> In Debian we did have a point release on 2022-07-09, I upgraded that evening
> (UTC).

So this seems to confirm that the problem lies elsewhere. :)

> > The only other options if that would not
> > have been the case that seem plausible would be glibc (which was uploaded
> > on the 10th, and could be a suspect), or lxc itself or similar, but that
> > has not been uploaded for a while in sid.
>
> lxc comes from stable. Everything in it is either unstable or testing.
>
> I'm not totally sure about the Ubuntu setup. Maybe Brian can comment on
> that.

The liburing test suite works on the buildds, I just uploaded a new
release and it built fine locally with Linux 5.19.0-trunk-amd64, and on
the buildds. I think this needs to be reassigned, I'd assume to the
kernel (but I'm not entirely sure). Also could a newer kernel be tried
on the infra?

Thanks,
Guillem
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Hi,

On 18-07-2022 23:44, Guillem Jover wrote:
> While only having skimmed over the logs, the first suspect for me would
> be the Linux kernel, where liburing might be triggering some kernel bug.
> Were the systems upgraded around that time with a newer kernel version
> (is that a 5.18.x from bpo)?

In Debian we did have a point release on 2022-07-09, I upgraded that
evening (UTC).

> The only other options if that would not
> have been the case that seem plausible would be glibc (which was uploaded
> on the 10th, and could be a suspect), or lxc itself or similar, but that
> has not been uploaded for a while in sid.

lxc comes from stable. Everything in it is either unstable or testing.

I'm not totally sure about the Ubuntu setup. Maybe Brian can comment on
that.

Paul
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Hi!

On Mon, 2022-07-18 at 21:07:28 +0200, Paul Gevers wrote:
> Source: liburing
> Version: 2.1-2
> Severity: important
> X-Debbugs-Cc: br...@ubuntu.com

> Some days ago (approximately 7) the autopkgtest of liburing started to
> behave badly on the Debian and Ubuntu infrastructure. It's not totally
> clear to me what happens, but I have lxc containers left behind after
> the test that I can't clean up.
>
> A similar thing seems to happen on the Ubuntu side because they have
> blocked the test from running recently and I'll do the same on our
> side.
>
> https://git.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-package-configs/commit/?id=fedf747ea808837217d373773d105f242819702d
>
> Because your package didn't change in that time, I suspect one of your
> dependencies caused liburing to behave differently. It would be great
> if we figured out what that is.

While only having skimmed over the logs, the first suspect for me would
be the Linux kernel, where liburing might be triggering some kernel bug.
Were the systems upgraded around that time with a newer kernel version
(is that a 5.18.x from bpo)? The only other options if that would not
have been the case that seem plausible would be glibc (which was uploaded
on the 10th, and could be a suspect), or lxc itself or similar, but that
has not been uploaded for a while in sid.

Thanks,
Guillem
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
On Mon, Jul 18, 2022 at 09:07:28PM +0200, Paul Gevers wrote:
> Source: liburing
> Version: 2.1-2
> Severity: important
> X-Debbugs-Cc: br...@ubuntu.com
>
> Dear Guillem,
>
> Some days ago (approximately 7) the autopkgtest of liburing started to
> behave badly on the Debian and Ubuntu infrastructure. It's not totally
> clear to me what happens, but I have lxc containers left behind after
> the test that I can't clean up.
>
> A similar thing seems to happen on the Ubuntu side because they have
> blocked the test from running recently and I'll do the same on our
> side.
>
> https://git.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-package-configs/commit/?id=fedf747ea808837217d373773d105f242819702d
>
> Because your package didn't change in that time, I suspect one of your
> dependencies caused liburing to behave differently. It would be great
> if we figured out what that is.

I managed to capture some information from an instance in an Ubuntu bug
report - http://launchpad.net/bugs/1981636.

--
Brian Murray @ubuntu.com
Bug#1015272: liburing autopkgtest started to hang containers in Debian and Ubuntu since ~2022-07-11
Source: liburing
Version: 2.1-2
Severity: important
X-Debbugs-Cc: br...@ubuntu.com

Dear Guillem,

Some days ago (approximately 7) the autopkgtest of liburing started to
behave badly on the Debian and Ubuntu infrastructure. It's not totally
clear to me what happens, but I have lxc containers left behind after
the test that I can't clean up.

A similar thing seems to happen on the Ubuntu side because they have
blocked the test from running recently and I'll do the same on our
side.

https://git.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-package-configs/commit/?id=fedf747ea808837217d373773d105f242819702d

Because your package didn't change in that time, I suspect one of your
dependencies caused liburing to behave differently. It would be great
if we figured out what that is.

Paul