On Sat Jan 17 15:19:42 2026, [email protected] wrote:
> Hello Adam,
>
> Thank you for a very helpful message.  I'm quoting more than I'm
> replying to in order to copy your message to the BTS.

Unfortunately it may not actually have been that helpful. See my
follow-up responses for corrections. :-(

> Adam D. Barratt [17/Jan 2:46pm GMT] wrote:
> > On Sat Jan 17 14:02:53 2026, [email protected] wrote:
> >> Hello,
> >>
> >> Aurelien Jarno [17/Jan 12:36pm GMT] wrote:
> >> > I haven't looked at all the details, but here are a few things
> >> > from the logs.
> >> > The reboot of tag2upload-builder-01 was scheduled at 14:12:29. It
> >> > indeed caused a podman container to be stopped:
> > [...]
> >> > Could you please confirm from your logs that the reboot lock was
> >> > indeed taken by your tag2upload job?
> >>
> >> It doesn't print anything if it successfully takes the lock, but it
> >> prints something and exits if it fails to take the lock (verified by
> >> our test suite), and the logs indicate it did not exit. So, yes, I
> >> can confirm that the job did indeed take the lock.
> >
> > Looking through the log of #1125239, I think some of the timings have
> > been confused, so it would be worth checking the process flow.
>
> Hrm, yes. The Podman error comes much earlier than 14:12.
> So possibly that Podman error is a completely unrelated bug in Podman.
> It may have been introduced by the upgrade to trixie.
> Ian, what are you thoughts on this?

For clarity, the builder was rebooted at 14:12, but the oracle rebooted
at 13:54 and the manager at 13:58.

> > | Jan 10 13:53:44 tag2upload-oracle-01 tag2upload-oracled[2556368]:
> > | [t2u-oracled tag2upload-builder-01.debian.org,2556368][2026-01-10T13:53:44]
> > | group_leader: received SIGTERM; shutting down workers
> > | Jan 10 13:53:44 tag2upload-oracle-01 systemd[2556306]: Stopping
> > | tag2upload-oracled.service - tag2upload Oracle daemon...
> > | Jan 10 13:53:44 tag2upload-oracle-01 systemd[2556306]: Stopped
> > | tag2upload-oracled.service - tag2upload Oracle daemon.
> > | -- Boot cbbd32cac2974b5e901921187e477fa7 --
> > |
> > | This is the host rebooting.
> >
> > In fact, it's not - as Aurelien noted, the reboot was at 14:12.

(It is, just not the host I thought at first.)

[...]

> Okay, so based on this information it looks like we have an
> incompatibility between our locking arrangements, regardless of
> whether tag2upload job 2390 failed because of a reboot.
>
> In particular, when implementing the locking I had been assuming that
> /var/run/reboot-lock would remain locked while a reboot was pending.
> But in fact it isn't.

It should be, I misread how the molly-guard script was running. So
please ignore that section, and apologies for the confusion. :-|

Regards,

Adam

