bug#63678: Can't restart/halt system with shepherd 0.9.3 after upgrading
Christopher Baines writes:

> I believe I sorted access for Ludo, but nothing was found when looking
> at the logs.

I’m closing it. Let’s reopen if we stumble upon a similar issue.

Ludo’.
bug#63678: Can't restart/halt system with shepherd 0.9.3 after upgrading
Ludovic Courtès writes:

> Hi,
>
> Christopher Baines writes:
>
>> Ludovic Courtès writes:
>>
>>> Hi,
>>>
>>> Christopher Baines writes:
>>>
>>>> May 24 11:17:02 localhost shepherd[1]: Evaluating user expression (and
>>>> (defined? (quote transient?)) (map (# ?) ?)).
>>>> May 24 11:17:02 localhost shepherd[1]: Evaluating user expression
>>>> (register-services (primitive-load "/gnu/st?") ?).
>>>> May 24 11:17:03 localhost shepherd[1]: Service host-name has been started.
>>>> May 24 11:17:03 localhost shepherd[1]: Service user-homes has been started.
>>>> May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_hardlinks = 1
>>>> May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_symlinks = 1
>>>> May 24 11:18:41 localhost shepherd[1]: Exiting shepherd...
>>>> May 24 11:18:46 localhost shepherd[1]: Grace period of 5 seconds is over;
>>>> sending -337 SIGKILL.
>>>> May 24 11:23:55 localhost shepherd[1]: Service root is not running.
>>>
>>> The grace period expiration thing is probably due to the fact that
>>> shepherd is no longer processing signals, as I described here:
>>>
>>>   https://issues.guix.gnu.org/63736
>>>
>>> Could you share all of /var/log/messages (possibly privately, and
>>> limiting to “shepherd” lines) starting from when the machine booted?
>>> I’d like to see if there are hints of something that went wrong.
>>
>> The machine is hamal (one of the HoneyCombs) and I've added a user for
>> you now and added the SSH key from maintenance.git.
>>
>> So you should be able to: ssh l...@hamal.cbaines.net
>
> Doesn’t work right now; anything in the logs?

I believe I sorted access for Ludo, but nothing was found when looking
at the logs.
bug#63678: Can't restart/halt system with shepherd 0.9.3 after upgrading
Hi,

Christopher Baines writes:

> Ludovic Courtès writes:
>
>> Hi,
>>
>> Christopher Baines writes:
>>
>>> May 24 11:17:02 localhost shepherd[1]: Evaluating user expression (and
>>> (defined? (quote transient?)) (map (# ?) ?)).
>>> May 24 11:17:02 localhost shepherd[1]: Evaluating user expression
>>> (register-services (primitive-load "/gnu/st?") ?).
>>> May 24 11:17:03 localhost shepherd[1]: Service host-name has been started.
>>> May 24 11:17:03 localhost shepherd[1]: Service user-homes has been started.
>>> May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_hardlinks = 1
>>> May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_symlinks = 1
>>> May 24 11:18:41 localhost shepherd[1]: Exiting shepherd...
>>> May 24 11:18:46 localhost shepherd[1]: Grace period of 5 seconds is over;
>>> sending -337 SIGKILL.
>>> May 24 11:23:55 localhost shepherd[1]: Service root is not running.
>>
>> The grace period expiration thing is probably due to the fact that
>> shepherd is no longer processing signals, as I described here:
>>
>>   https://issues.guix.gnu.org/63736
>>
>> Could you share all of /var/log/messages (possibly privately, and
>> limiting to “shepherd” lines) starting from when the machine booted?
>> I’d like to see if there are hints of something that went wrong.
>
> The machine is hamal (one of the HoneyCombs) and I've added a user for
> you now and added the SSH key from maintenance.git.
>
> So you should be able to: ssh l...@hamal.cbaines.net

Doesn’t work right now; anything in the logs?

Ludo’.
bug#63678: Can't restart/halt system with shepherd 0.9.3 after upgrading
On 2023-05-24 12:27, Christopher Baines wrote:

> Hey!
>
> On a system running shepherd 0.9.3 [1], I've reconfigured, but now can't
> reboot or halt.
>
>   root@hamal ~# halt
>   Service root is not running.
>
> 1: /gnu/store/y6w0xix15cq08qasmq75f04yzgbl98jx-shepherd-0.9.3

FWIW, this has happened to me a bunch of times; I just never reported it.
Sometimes I was able to just log in as root and run herd start root to
fix it.

My impression, from the "bunch of times" I've experienced this, is that
the root service stops working not because of the system reconfigure,
but for some other reason.

Best regards,
David
bug#63678: Can't restart/halt system with shepherd 0.9.3 after upgrading
Ludovic Courtès writes:

> Hi,
>
> Christopher Baines writes:
>
>> May 24 11:17:02 localhost shepherd[1]: Evaluating user expression (and
>> (defined? (quote transient?)) (map (# ?) ?)).
>> May 24 11:17:02 localhost shepherd[1]: Evaluating user expression
>> (register-services (primitive-load "/gnu/st?") ?).
>> May 24 11:17:03 localhost shepherd[1]: Service host-name has been started.
>> May 24 11:17:03 localhost shepherd[1]: Service user-homes has been started.
>> May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_hardlinks = 1
>> May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_symlinks = 1
>> May 24 11:18:41 localhost shepherd[1]: Exiting shepherd...
>> May 24 11:18:46 localhost shepherd[1]: Grace period of 5 seconds is over;
>> sending -337 SIGKILL.
>> May 24 11:23:55 localhost shepherd[1]: Service root is not running.
>
> The grace period expiration thing is probably due to the fact that
> shepherd is no longer processing signals, as I described here:
>
>   https://issues.guix.gnu.org/63736
>
> Could you share all of /var/log/messages (possibly privately, and
> limiting to “shepherd” lines) starting from when the machine booted?
> I’d like to see if there are hints of something that went wrong.

The machine is hamal (one of the HoneyCombs) and I've added a user for
you now and added the SSH key from maintenance.git.

So you should be able to: ssh l...@hamal.cbaines.net

Your user's password is also in your home directory.
bug#63678: Can't restart/halt system with shepherd 0.9.3 after upgrading
Hi,

Christopher Baines writes:

> May 24 11:17:02 localhost shepherd[1]: Evaluating user expression (and
> (defined? (quote transient?)) (map (# ?) ?)).
> May 24 11:17:02 localhost shepherd[1]: Evaluating user expression
> (register-services (primitive-load "/gnu/st?") ?).
> May 24 11:17:03 localhost shepherd[1]: Service host-name has been started.
> May 24 11:17:03 localhost shepherd[1]: Service user-homes has been started.
> May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_hardlinks = 1
> May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_symlinks = 1
> May 24 11:18:41 localhost shepherd[1]: Exiting shepherd...
> May 24 11:18:46 localhost shepherd[1]: Grace period of 5 seconds is over;
> sending -337 SIGKILL.
> May 24 11:23:55 localhost shepherd[1]: Service root is not running.

The grace period expiration thing is probably due to the fact that
shepherd is no longer processing signals, as I described here:

  https://issues.guix.gnu.org/63736

Could you share all of /var/log/messages (possibly privately, and
limiting to “shepherd” lines) starting from when the machine booted?
I’d like to see if there are hints of something that went wrong.

Ludo’.
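One way to produce what is asked for above (only the “shepherd” lines of /var/log/messages, starting from the most recent boot) is sketched below. It assumes that a "syslogd ...: restart" line is logged at each boot, as in the log excerpt posted elsewhere in this thread; that may not hold on every system. The function name shepherd_lines_since_boot is hypothetical, not part of any tool.

```shell
# shepherd_lines_since_boot FILE
# Print the shepherd[PID] lines of FILE that follow the last
# "syslogd ...: restart" marker (assumed here to mark a boot).
# If no marker is present, all shepherd lines are printed.
shepherd_lines_since_boot() {
    awk '
        # A restart marker starts a new boot: forget lines kept so far.
        /syslogd .*: restart/ { n = 0 }
        # Keep lines logged by shepherd, e.g. "localhost shepherd[1]: ...".
        / shepherd\[[0-9]+\]: / { buf[++n] = $0 }
        END { for (i = 1; i <= n; i++) print buf[i] }
    ' "$1"
}

# Possible use, run as a user allowed to read the log:
#   shepherd_lines_since_boot /var/log/messages > shepherd-since-boot.log
```

Note that, under this assumption, lines logged just before a reboot (such as the failed halt itself) land before the marker, so dropping the marker rule and keeping every shepherd line may be the better choice when the failure preceded the last boot.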
bug#63678: Can't restart/halt system with shepherd 0.9.3 after upgrading
Ludovic Courtès writes:

> Hi,
>
> Christopher Baines writes:
>
>> On a system running shepherd 0.9.3 [1], I've reconfigured, but now can't
>> reboot or halt.
>>
>>   root@hamal ~# halt
>>   Service root is not running.
>
> Hey, why halt it if it’s not running?
>
> Seriously though, any insight from /var/log/messages? I upgraded a
> bunch of machines and didn’t hit this particular problem. Bruno
> reported a similar problem with 0.9.3, but this had nothing to do with
> the upgrade:
>
>   https://issues.guix.gnu.org/62619
>
> Could it be the same problem? Do you see:
>
>   Assertion (eq? (canonical-name new) (canonical-name old)) failed.
>
> in /var/log/messages?

I don't see that, but I think these are the relevant log messages:

May 24 11:17:02 localhost shepherd[1]: Evaluating user expression (and (defined? (quote transient?)) (map (# ?) ?)).
May 24 11:17:02 localhost shepherd[1]: Evaluating user expression (register-services (primitive-load "/gnu/st?") ?).
May 24 11:17:03 localhost shepherd[1]: Service host-name has been started.
May 24 11:17:03 localhost shepherd[1]: Service user-homes has been started.
May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_hardlinks = 1
May 24 11:17:03 localhost shepherd[1]: [sysctl] fs.protected_symlinks = 1
May 24 11:18:41 localhost shepherd[1]: Exiting shepherd...
May 24 11:18:46 localhost shepherd[1]: Grace period of 5 seconds is over; sending -337 SIGKILL.
May 24 11:23:55 localhost shepherd[1]: Service root is not running.
May 24 11:24:16 localhost last message repeated 2 times
May 24 11:30:49 localhost syslogd (GNU inetutils 2.3): restart
May 24 11:30:49 localhost vmunix: [0.00] Booting Linux on physical CPU 0x00 [0x410fd083]
May 24 11:30:49 localhost vmunix: [0.00] Linux version 6.3.3-arm64-generic (guix@guix) (gcc (GCC) 11.3.0, GNU ld (GNU Binutils) 2.38) #1 SMP PREEMPT 1
bug#63678: Can't restart/halt system with shepherd 0.9.3 after upgrading
Hi,

Christopher Baines writes:

> On a system running shepherd 0.9.3 [1], I've reconfigured, but now can't
> reboot or halt.
>
>   root@hamal ~# halt
>   Service root is not running.

Hey, why halt it if it’s not running?

Seriously though, any insight from /var/log/messages? I upgraded a
bunch of machines and didn’t hit this particular problem. Bruno
reported a similar problem with 0.9.3, but this had nothing to do with
the upgrade:

  https://issues.guix.gnu.org/62619

Could it be the same problem? Do you see:

  Assertion (eq? (canonical-name new) (canonical-name old)) failed.

in /var/log/messages?

Ludo’.
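The check asked for above can be done with a fixed-string search, so that the parentheses and question marks in the message are not read as regular-expression syntax. This is only a sketch; the function name has_canonical_name_assertion is hypothetical.

```shell
# has_canonical_name_assertion FILE
# Succeed (exit status 0) if FILE contains the assertion failure
# reported in bug#62619, and fail otherwise.
has_canonical_name_assertion() {
    # -F: treat the pattern as a literal string; -q: print nothing.
    grep -qF 'Assertion (eq? (canonical-name new) (canonical-name old)) failed.' "$1"
}

# Possible use:
#   if has_canonical_name_assertion /var/log/messages; then
#       echo "assertion failure found"
#   else
#       echo "no assertion failure logged"
#   fi
```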
bug#63678: Can't restart/halt system with shepherd 0.9.3 after upgrading
Hey!

On a system running shepherd 0.9.3 [1], I've reconfigured, but now can't
reboot or halt.

  root@hamal ~# halt
  Service root is not running.

1: /gnu/store/y6w0xix15cq08qasmq75f04yzgbl98jx-shepherd-0.9.3