bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-18 Thread pelzflorian (Florian Pelz)
"pelzflorian (Florian Pelz)" writes: > * If I resume a crashed installer, I need to resume twice because the > first resume always fails immediately. Hooray, you fixed it. Ludo, your debugging speed is miraculous. I did not know SQLite uses multiple files per database. > * With bad luck,

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-17 Thread Ludovic Courtès
After spending a few more hours on this, I got convinced that upon restarting guix-daemon, even though we had restored /var/guix/db/db.sqlite, the presence of stale db.sqlite-{wal,shm} files could lead sqlite to do as if transactions in the WAL file had been committed. Commit
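A minimal sketch of the failure mode described above, written in Python rather than the installer's Scheme. It assumes the fix amounts to making sure no leftover `-wal` or `-shm` side file survives next to a restored database; the function name and paths are hypothetical, and this is not the code from the actual commit.

```python
import os
import shutil


def restore_database(backup_path, db_path):
    """Restore db_path from backup_path and drop stale SQLite side files.

    Hypothetical helper for illustration only; it assumes no process has
    the database open while it runs.
    """
    # Remove the write-ahead log and shared-memory index first, so a stale
    # WAL can never be paired with the restored main database file.
    for suffix in ("-wal", "-shm"):
        try:
            os.remove(db_path + suffix)
        except FileNotFoundError:
            pass  # nothing stale to clean up
    shutil.copyfile(backup_path, db_path)
```

Deleting the side files before copying means that an interruption between the two steps cannot leave a restored database sitting next to an old WAL.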

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-17 Thread Ludovic Courtès
"pelzflorian (Florian Pelz)" skribis: > I saw a comment >> void LocalStore::registerValidPaths(const ValidPathInfos & infos) >> { >> /* SQLite will fsync by default, but the new valid paths may not be >> fsync-ed. >> * So some may want to fsync them before registering the validity, at

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-17 Thread pelzflorian (Florian Pelz)
Ahoi. :) Ludovic Courtès writes: >> Now that you found the dynamic-wind’s out-guard does not even run: > It does not run on C-c, but it does run in other cases, typically if you > just press Enter after reading the message that says “command failed, > press Enter”. Ahh. Then would it be good

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-17 Thread pelzflorian (Florian Pelz)
Ludovic Courtès writes: > The error message that’s haunting us: > > opening file `/gnu/store/….drv': No such file or directory > > comes from guix-daemon. It happens while the client is doing an > ‘add-text-to-store’ RPC to add that .drv to the store. > ‘LocalStore::addTextToStore’ supposedly

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-17 Thread Ludovic Courtès
Ludovic Courtès writes: > I did reproduce the issue in a VM by running “ifconfig ens3 down” in a > tty, or by killing the ‘guix substitute’ process, to cause failure of > ‘guix system init’. In that case the database is indeed restored, but I > occasionally get errors like “/gnu/store/….drv:

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-17 Thread Ludovic Courtès
Moin! "pelzflorian (Florian Pelz)" writes: > Ludovic Courtès writes: >> One finding: when hitting C-c, the dynamic-wind exit handler (the one >> that restores the database and umounts the cow store) is *not* executed. > > Impressive findings. > > Now that you found the dynamic-wind’s

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-16 Thread pelzflorian (Florian Pelz)
Ludovic Courtès writes: > One finding: when hitting C-c, the dynamic-wind exit handler (the one > that restores the database and umounts the cow store) is *not* executed. Impressive findings. Now that you found the dynamic-wind’s out-guard does not even run: Uhh I had misdiagnosed when I

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-16 Thread pelzflorian (Florian Pelz)
Maxime Devos writes: > So, I'm nominally 'on hiatus', but I noticed this mail, and noticed > you copied a file (and fsync'ed it), but forgot to fsync the directory > it was copied to -- from what I've read (but I don't recall the > source), fsyncing the contents of the file isn't enough, you also
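The point about fsyncing the directory can be sketched as follows. This is an illustrative Python helper, not the patch discussed in the thread; `copy_durably` is a hypothetical name, and the code assumes a POSIX system where a directory can be opened read-only and passed to `fsync`.

```python
import os
import shutil


def copy_durably(source, destination):
    """Copy source to destination, then fsync the file and its directory.

    Hypothetical helper for illustration; assumes POSIX semantics.
    """
    shutil.copyfile(source, destination)
    # Flush the copied file's contents to stable storage.
    fd = os.open(destination, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    # Flush the containing directory too, so the new directory entry
    # itself is on disk and not only the file's data.
    directory = os.path.dirname(os.path.abspath(destination))
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```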

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-16 Thread Ludovic Courtès
Hi, "pelzflorian (Florian Pelz)" writes: > Desperately I tried also adding fsync, to no avail, both issues remain. > Non-working patch attached. > > Maybe dynamic-wind is an inappropriate pattern here? > > If I interrupt installation using Ctrl-C (which I normally don’t, > instead I unplug

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-16 Thread Maxime Devos
On 14-12-2022 22:47, pelzflorian (Florian Pelz) wrote: fsyncing the database had no effect. (In addition to Ludo’s 'stop-service', I had done fsync.patch diff --git a/gnu/installer/final.scm b/gnu/installer/final.scm index ef487805f0..13deffef85 100644 --- a/gnu/installer/final.scm +++

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-15 Thread pelzflorian (Florian Pelz)
Desperately I tried also adding fsync, to no avail, both issues remain. Non-working patch attached. Maybe dynamic-wind is an inappropriate pattern here? If I interrupt installation using Ctrl-C (which I normally don’t, instead I unplug Ethernet), then I have to press Ctrl-C twice. Maybe that

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-15 Thread pelzflorian (Florian Pelz)
Hi Ludo… Ludovic Courtès writes: > This time, I believe we only ever copy the database when we’re sure no > guix-daemon process is accessing it. Failure. In addition to your partially helpful patch from before (with which a second resume now works most of the time), I now tried further the new

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-14 Thread Ludovic Courtès
Grrr, I’m really silly: we have the same problem (copying the database before the daemon has been stopped) just a few lines above. How about this: diff --git a/gnu/installer/final.scm b/gnu/installer/final.scm index 044f79372b..360b34d8cb 100644 --- a/gnu/installer/final.scm +++
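The ordering problem named above (copying the database before the daemon has been stopped) can be illustrated with a small Python sketch. It is not the installer's code: `stop_daemon` and `start_daemon` are hypothetical callables standing in for whatever stops and starts guix-daemon, and the paths are placeholders.

```python
import shutil
from contextlib import contextmanager


@contextmanager
def daemon_stopped(stop_daemon, start_daemon):
    """Run the enclosed block only while the daemon is stopped."""
    stop_daemon()
    try:
        yield
    finally:
        # Restart the daemon even if the enclosed block raised.
        start_daemon()


def backup_database(db_path, backup_path, stop_daemon, start_daemon):
    """Copy the database strictly between stopping and restarting the daemon."""
    with daemon_stopped(stop_daemon, start_daemon):
        shutil.copyfile(db_path, backup_path)
```

Wrapping the copy in the context manager makes it hard to reorder the stop and the copy by accident, which is the mistake the message describes.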

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-14 Thread pelzflorian (Florian Pelz)
"pelzflorian (Florian Pelz)" writes: > I shall try with fsync now. fsyncing the database had no effect. (In addition to Ludo’s 'stop-service', I had done diff --git a/gnu/installer/final.scm b/gnu/installer/final.scm index ef487805f0..13deffef85 100644 --- a/gnu/installer/final.scm +++

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-14 Thread pelzflorian (Florian Pelz)
Eventual success, partially. First of all: Ludovic Courtès writes: > "pelzflorian (Florian Pelz)" writes: >> Additionally, I had to do “GUIX_ALLOW_ME_TO_USE_PRIVATE_COMMIT=y >> make update-guix-package”. Or else the installer was using a Guix that >> did not have the lines swapped. > Hmm

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-13 Thread Ludovic Courtès
"pelzflorian (Florian Pelz)" skribis: > Ludovic Courtès writes: >> So my guess is that things will be much better if we swap these two >> lines. > > This was helpful, but not enough. Sorry, I think I wasn’t thinking at full speed. There needs to be zero daemons running while we copy the

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-13 Thread pelzflorian (Florian Pelz)
Hi again. Ludovic Courtès writes: > So my guess is that things will be much better if we swap these two > lines. This was helpful, but not enough. Swapping them may have improved the likelihood of being able to retry, but the issue is still there. I uploaded as installer-dump-5f9f8dbe, but it

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-13 Thread Ludovic Courtès
Hi again, Ludovic Courtès writes: > It looks like the store is in a broken state, with its database not > matching its actual contents. The ‘install-system’ procedure is > supposed to protect against that by making a backup of the database > before starting the installation and restoring it

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-13 Thread Ludovic Courtès
Hi, "pelzflorian (Florian Pelz)" writes: > I now uploaded an installer-dump-bade9971 of me reproducing the issue. Here’s the relevant syslog excerpt (this was with 1.4.0rc1) where we can see the point where you unplugged the Ethernet connection: --8<---cut

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-12 Thread pelzflorian (Florian Pelz)
"pelzflorian (Florian Pelz)" writes: > shepherd: Service guix-daemon has been stopped. > shepherd: Service guix-daemon has been started. > guix system: Fehler: opening file > `/gnu/store/4z81a7njyvnwa4kn46ad6vhvi0lcnrhh-shadow-4.9.drv': No such > file or directory > Befehl ("guix" "system" "init"

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-10 Thread pelzflorian (Florian Pelz)
Ludovic Courtès writes: > I tried to reproduce it: > > 0. I chose a basic installation to a fully-encrypted disk with a > single partition. > > 1. I hit Ctrl-C while ‘guix system init’ was downloading substitutes. > > 2. That led me to a confusing error screen says “Command cryptsetup

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-09 Thread Ludovic Courtès
Ludovic Courtès writes: > 2. That led me to a confusing error screen says “Command cryptsetup > failed” with Ignore/Abort/Retry buttons. Actually it’s “External command ("cryptsetup" "close" "cryptroot") exited with code 5” and “cryptroot device is busy”. Ludo’.

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-09 Thread Ludovic Courtès
Hi, "pelzflorian (Florian Pelz)" writes: > I aborted graphical system installation (Ctrl-C), retried the > installation and got this: > > shepherd: Service guix-daemon has been stopped. > shepherd: Service guix-daemon has been started. > guix system: Fehler: opening file >

bug#59784: [version 1.4.0rc1] Retrying a failed install fails

2022-12-02 Thread pelzflorian (Florian Pelz)
I aborted graphical system installation (Ctrl-C), retried the installation and got this: shepherd: Service guix-daemon has been stopped. shepherd: Service guix-daemon has been started. guix system: Fehler: opening file `/gnu/store/4z81a7njyvnwa4kn46ad6vhvi0lcnrhh-shadow-4.9.drv': No such file or