Hi all,

yes, I agree with Rutherther. I think it'd be awesome™ if the Shepherd supported some kind of health checkers. They could be a list of executables, each of which must return 0, or the service is considered unhealthy.
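To make the idea concrete, here is a minimal sketch of such a checker in Python. It is not an existing Shepherd feature; the function name `is_healthy`, the `timeout` parameter, and the choice to treat a missing or hanging executable as a failure are all assumptions made for illustration.

```python
import subprocess
from typing import Sequence


def is_healthy(checks: Sequence[Sequence[str]], timeout: float = 10.0) -> bool:
    """Return True only if every check command exits with status 0.

    `checks` is a list of argument vectors, one per executable.
    A check that cannot be started or exceeds `timeout` seconds
    counts as a failure (an assumption of this sketch).
    """
    for argv in checks:
        try:
            result = subprocess.run(
                argv,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except (OSError, subprocess.TimeoutExpired):
            # Executable missing, not runnable, or hung.
            return False
        if result.returncode != 0:
            return False
    return True
```

A service manager could call something like this periodically and mark the service unhealthy as soon as it returns False.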
I think docker compose supports something similar.

cheers,
giacomo

On 1 March 2026 15:58:25 CET, Rutherther <[email protected]> wrote:
>
>Hi Daniel,
>
>sounds good.
>
>Daniel Littlewood <[email protected]> writes:
>
>> Hi help-guix,
>>
>> I am wanting to try out the `guix deploy` command, but I am a little
>> scared to use it. If I make a mistake in my OS config (such as removing
>> a public key by mistake, or breaking the network configuration) then I
>> might lock myself out. Of course, since this is a guix system, it is
>> trivial to roll back to a previous generation, so long as I can still
>> issue the command to roll back.
>>
>> I am looking for a mechanism whereby the system can be made to
>> automatically issue a rollback command if the deployment fails somehow.
>> Obviously, if your operating system config is badly broken (e.g. invalid
>> syntax) then it will not build and the deployment will not proceed. I am
>> thinking of the case where the OS is validly configured, but does not
>> satisfy certain desired properties (like permitting SSH from a certain key).
>>
>> Of course, since deploying a full guix system image can alter the OS in
>> more or less arbitrary ways, not everything can be guaranteed. The
>> simplest thing I can think of that might work is to include a shepherd
>> service in your OS config which will automatically issue a guix system
>> roll-back (and perhaps also reboot) unless a certain post-release
>> "deploy succeeded" signal is received. For instance, you could configure
>> your deployment script to halt this service via ssh, and if you don't do
>> this within 30 seconds, the rollback occurs. As long as this service
>> remains in your OS config, you could screw up everything else and it
>> should remain accessible (after waiting for the timeout, at least).
>
>I would rather consider not making this dependent on the target or
>current operating system configuration at all. You could make a script
>that you start before the activation, or it can even do the activation.
>And after the activation, if you do not connect to stop the script, it
>will roll back automatically.
>
>Since this script ideally has to outlive the ssh session, which might
>close in case of a wrong configuration, you might want to use Shepherd
>for this, as it already runs on the system. The transient service could
>be used for this
>https://doc.guix.gnu.org/shepherd/latest/en/shepherd.html#Transient-Service-Maker,
>to support any Guix System, no matter what deployment has been used for
>the current system generation. Or also see
>https://codeberg.org/efraim/shepherd-run, which as far as I can tell has
>been largely replaced by the transient service, though.
>
>This way you depend only on Shepherd itself and on the fact that the
>reconfigured system won't touch your service. I think this is a
>reasonable assumption. Activation does not touch services that do not
>depend on the system at all.
>
>I've had success with deploy-rs [1] for NixOS, check it out, especially
>the magic rollback and its implementation.
>
>Rutherther
>
>[1] https://github.com/serokell/deploy-rs
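
For reference, the "roll back unless the deployment is confirmed in time" idea from the quoted messages could be sketched roughly as below. This is only an illustration under stated assumptions: the confirmation signal is modelled as a marker file (`confirm_path`, a hypothetical name) that the deployer creates over ssh, rather than stopping a Shepherd service; the names `watchdog`, `roll_back`, and `on_timeout` are invented for the sketch; and it assumes `guix system roll-back` alone is an adequate recovery action, whereas a real setup may also need a reboot or service restarts.

```python
import os
import subprocess
import time
from typing import Callable


def roll_back() -> None:
    """Assumed recovery action: switch to the previous system generation."""
    subprocess.run(["guix", "system", "roll-back"], check=True)


def watchdog(
    confirm_path: str,
    timeout: float = 30.0,
    poll: float = 1.0,
    on_timeout: Callable[[], None] = roll_back,
) -> bool:
    """Wait up to `timeout` seconds for `confirm_path` to appear.

    Returns True if the marker file showed up in time. Otherwise it
    calls `on_timeout()` once and returns False.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(confirm_path):
            return True  # deployment confirmed, nothing to undo
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        time.sleep(min(poll, remaining))
    on_timeout()
    return False
```

The script would be started before activation in a way that survives the ssh session, for example as a Shepherd transient service as suggested above; after a successful deployment, the deployer creates the marker file over ssh within the timeout.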
