Bug#891434: Bug#923839: shim-signed: setup of shim-signed failed with 'Could not delete variable: No space left on device'
On Sun, 10 Mar 2019 21:57:02 + Colin Watson wrote:
> Control: reassign 891434 src:grub2
> Control: forcemerge 891434 923839
>
> On Tue, Mar 05, 2019 at 03:43:31PM -0800, Steve Langasek wrote:
> > But I'm reassigning this bug to grub2, because I think the right answer for
> > nearly all efibootmgr write failures on update of the bootloader packages is
> > that grub should not be writing to nvram at all. Rather, in the common case
> > of a bootloader upgrade, the contents being written to nvram will match what
> > is already there. By detecting that there are no changes, we save ourselves
> > a write, which in the exceptional cases sidesteps a write failure, and in
> > the common case, reduces wear on the nvram which may have limited write
> > cycles.
>
> This is the same as #891434, which I've been working on recently, and at
> a high level you and I have reached the same conclusions about what's
> needed. (I've also been discussing it with Steve McIntyre, again
> reaching similar conclusions.)
>
> [...]
>
> I got this almost working at the Cambridge BSP today before I ran out of
> time (very nearly bricking my own laptop in the process ...). I need to
> add suitable --debug messages, finish getting it working, ensure that
> it's only rewriting variables where needed, and generally tidy up the
> fairly large pile of code involved, so there's still probably at least
> four hours of work left to do on it, not to mention upstream review.
> However, I'm reasonably hopeful that I'll have this done for buster.
>
> --
> Colin Watson [cjwat...@debian.org]
>
>

Hi Colin,

Thanks for working on this. :)

I am glad to hear that we might have something almost ready thanks to
your hard work. :)

Thanks,
~Niels
Bug#891434: Bug#923839: shim-signed: setup of shim-signed failed with 'Could not delete variable: No space left on device'
Control: reassign 891434 src:grub2
Control: forcemerge 891434 923839

On Tue, Mar 05, 2019 at 03:43:31PM -0800, Steve Langasek wrote:
> But I'm reassigning this bug to grub2, because I think the right answer for
> nearly all efibootmgr write failures on update of the bootloader packages is
> that grub should not be writing to nvram at all. Rather, in the common case
> of a bootloader upgrade, the contents being written to nvram will match what
> is already there. By detecting that there are no changes, we save ourselves
> a write, which in the exceptional cases sidesteps a write failure, and in
> the common case, reduces wear on the nvram which may have limited write
> cycles.

This is the same as #891434, which I've been working on recently, and at
a high level you and I have reached the same conclusions about what's
needed. (I've also been discussing it with Steve McIntyre, again
reaching similar conclusions.)

The problem that's been delaying this is that efibootmgr doesn't expose
the interfaces we need. There's no way to ask it to write a variable
only if it's changed, or even (any more) to write a new variable to a
test file so that it can be compared with the existing contents in order
to determine whether to set the variable. I initially thought that we
might be able to ask efibootmgr to delete all but one entry from the
same distributor and then to modify that one, but even that doesn't seem
to be possible at the moment. And even if we did change efibootmgr to
provide something along these lines (it might separately be a good idea
to get it to minimise writes, at the very least), (a) it would be
difficult to guarantee that we have a new enough version in an
upstreamable way, and (b) having to fork efibootmgr several times in a
single grub-install operation is annoying anyway.

So, I've been working on converting grub-install to use libefivar and
libefiboot directly, which are libraries used by modern-ish versions of
efibootmgr. In many ways this is much nicer: we can say what we mean
about exactly how variables are to be manipulated rather than operating
at arm's length via a command-line interface that wasn't designed to
offer this sort of fine control. In some ways it's uglier: we have to
duplicate more of efibootmgr's logic than I'd like in order to build
Boot* variables, such as EDD version detection, and it's possible that
it will increase the maintenance burden for the future a bit. But
regardless, this is the only approach I can think of that stands any
chance of fixing this bug in the medium term, so it's what we've got.

I got this almost working at the Cambridge BSP today before I ran out of
time (very nearly bricking my own laptop in the process ...). I need to
add suitable --debug messages, finish getting it working, ensure that
it's only rewriting variables where needed, and generally tidy up the
fairly large pile of code involved, so there's still probably at least
four hours of work left to do on it, not to mention upstream review.
However, I'm reasonably hopeful that I'll have this done for buster.

--
Colin Watson [cjwat...@debian.org]
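
The "write only if it's changed" idea discussed above can be illustrated
with a minimal sketch. This is not the grub-install change itself (that
work is in C against libefivar and libefiboot); it is a Python
illustration that assumes the efivarfs file layout, where a variable's
file content is a 4-byte little-endian attribute word followed by the
data. The names `encode_var`, `write_if_changed` and `DEFAULT_ATTRS` are
hypothetical, and the code is meant to be tried on a scratch directory,
not on a live /sys/firmware/efi/efivars.

```python
import struct

# Assumed attribute set for a boot variable:
# NON_VOLATILE (0x1) | BOOTSERVICE_ACCESS (0x2) | RUNTIME_ACCESS (0x4).
DEFAULT_ATTRS = 0x1 | 0x2 | 0x4


def encode_var(payload: bytes, attrs: int = DEFAULT_ATTRS) -> bytes:
    """Build efivarfs-style content: 4-byte attribute word, then the data."""
    return struct.pack("<I", attrs) + payload


def write_if_changed(path: str, payload: bytes, attrs: int = DEFAULT_ATTRS) -> bool:
    """Write the variable only if its content differs. Return True if written."""
    wanted = encode_var(payload, attrs)
    try:
        with open(path, "rb") as f:
            if f.read() == wanted:
                return False  # identical content: skip the write entirely
    except FileNotFoundError:
        pass  # no such variable yet, so it has to be created
    with open(path, "wb") as f:
        f.write(wanted)
    return True
```

On an upgrade where nothing changed, a second call with the same payload
returns False and touches nothing, which is the case that avoids both
the write failure and the NVRAM wear mentioned in the quoted text.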
Bug#891434: grub-efi: System fails to boot after "No space left on device" on EFI variable storage
On Fri, 14 Dec 2018 10:22:49 +0100 Ralf Jung wrote:
> Hi,
>
> > Fixing this does seem like it would be a good idea for general
> > robustness against dodgy firmware (this is not the first iteration of
> > problems along these lines). It would take some development work, but
> > hopefully not too much.
> >
> > Things that GRUB can't do, as far as I can tell:
> >
> >  * I don't think there's a way for GRUB to check whether it will be
> >    possible to recreate a boot entry later; as I understand it, that
> >    depends on various low-level details, including firmware-specific
> >    quirks.
> >
> >  * Even detecting that nothing changed would require cooperation from
> >    efibootmgr, since the encoding of the EFI variable is an
> >    implementation detail there (so we can't just read it out and
> >    compare), and efibootmgr doesn't expose a way for GRUB to say "set
> >    this configuration, but only if it's different from what's already
> >    there".
> >
> > However, I think GRUB can at least manage to delete all but one entry
> > from the same distributor rather than all of them, and if it finds one
> > remaining entry then it can modify that rather than writing a brand new
> > variable. As I understand it, that would probably be enough to fix this
> > bug?
>
> Assuming that modification works even when the variable storage is (close to)
> full, then yes, that would at least keep the device bootable which would be a
> big improvement.
>
> Kind regards,
> Ralf
>
>

Hi Colin,

Thanks for proposing this solution. :)

I also think it would be a good solution for now that will hopefully
avoid most of these errors. :)

Thanks,
~Niels
Bug#891434: grub-efi: System fails to boot after "No space left on device" on EFI variable storage
Hi,

> Fixing this does seem like it would be a good idea for general
> robustness against dodgy firmware (this is not the first iteration of
> problems along these lines). It would take some development work, but
> hopefully not too much.
>
> Things that GRUB can't do, as far as I can tell:
>
>  * I don't think there's a way for GRUB to check whether it will be
>    possible to recreate a boot entry later; as I understand it, that
>    depends on various low-level details, including firmware-specific
>    quirks.
>
>  * Even detecting that nothing changed would require cooperation from
>    efibootmgr, since the encoding of the EFI variable is an
>    implementation detail there (so we can't just read it out and
>    compare), and efibootmgr doesn't expose a way for GRUB to say "set
>    this configuration, but only if it's different from what's already
>    there".
>
> However, I think GRUB can at least manage to delete all but one entry
> from the same distributor rather than all of them, and if it finds one
> remaining entry then it can modify that rather than writing a brand new
> variable. As I understand it, that would probably be enough to fix this
> bug?

Assuming that modification works even when the variable storage is (close to)
full, then yes, that would at least keep the device bootable which would be a
big improvement.

Kind regards,
Ralf
Bug#891434: grub-efi: System fails to boot after "No space left on device" on EFI variable storage
On Sun, Feb 25, 2018 at 04:13:13PM +0100, Ralf Jung wrote:
> earlier today I did a system update, which completed successfully (as in, dpkg
> didn't stop due to an error). I then rebooted my machine. This left Linux
> unable to boot; only the Windows entry was left in the boot menu. After some
> hours of debugging, the problem turned out to be that writing an EFI variable
> fails with "No space left on the device". I did a firmware update (from
> Windows), to no avail. In the end I booted into a live system, deleted some of
> the "dump-type0-*" variables, rebooted, and then ran "grub-install" from a
> chroot to fix the situation.
>
> I'm not exactly sure what went wrong here, but clearly the system shouldn't be
> put into an unbootable state ever. I see two bugs here:
>
> * First, it looks like something is filling up the EFI variable space. I've
>   added an `ls -lah` of the efivars folder below. This is after I deleted
>   roughly 20-30 "dump-type0-*" variables. Is this the kernel dumping
>   information (about crashes or so)? If yes, it seems to do so without ever
>   cleaning up or taking free space into account, which I'd consider a serious
>   bug. Should I report this against the kernel? I don't even know what creates
>   those EFI variables.

Those are created by the efi_pstore_write function in the kernel.
Beyond that I'm not really familiar with what's going on - you should
ask Debian's kernel folks if you need to pursue this.

> * Second, does grub-install really have to delete and create EFI variables even
>   when nothing changed? It seems to me that writing an EFI variable is only
>   necessary when initially installing GRUB. Even if writing is necessary, a
>   check could be done *before* deleting the boot entry whether it will be
>   possible to write it again later. Right now, it seems that grub will happily
>   delete the debian boot entry and then fail to create it again -- and this
>   doesn't even make the system update fail.

Fixing this does seem like it would be a good idea for general
robustness against dodgy firmware (this is not the first iteration of
problems along these lines). It would take some development work, but
hopefully not too much.

Things that GRUB can't do, as far as I can tell:

 * I don't think there's a way for GRUB to check whether it will be
   possible to recreate a boot entry later; as I understand it, that
   depends on various low-level details, including firmware-specific
   quirks.

 * Even detecting that nothing changed would require cooperation from
   efibootmgr, since the encoding of the EFI variable is an
   implementation detail there (so we can't just read it out and
   compare), and efibootmgr doesn't expose a way for GRUB to say "set
   this configuration, but only if it's different from what's already
   there".

However, I think GRUB can at least manage to delete all but one entry
from the same distributor rather than all of them, and if it finds one
remaining entry then it can modify that rather than writing a brand new
variable. As I understand it, that would probably be enough to fix this
bug?

--
Colin Watson [cjwat...@debian.org]
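
The "delete all but one entry from the same distributor, then modify the
remaining one" proposal can be sketched as a small planning step. This
is only an illustration of the selection logic, not GRUB code: the
`BootEntry` type and `plan_entries` function are hypothetical, and
matching entries by a case-insensitive label comparison is an assumption
made for the example.

```python
from typing import List, NamedTuple, Optional, Tuple


class BootEntry(NamedTuple):
    num: int    # e.g. 0x0001 for Boot0001
    label: str  # description shown in the firmware boot menu


def plan_entries(
    entries: List[BootEntry], distributor: str
) -> Tuple[Optional[BootEntry], List[BootEntry]]:
    """Pick one entry to modify in place and list the duplicates to delete."""
    ours = sorted(
        (e for e in entries if e.label.lower() == distributor.lower()),
        key=lambda e: e.num,
    )
    if not ours:
        return None, []        # nothing to reuse: a new entry must be created
    return ours[0], ours[1:]   # keep the lowest-numbered entry, drop the rest
```

The point of keeping one entry is that the system never passes through
a state in which the distributor has no boot entry at all, which is the
state that left the reporter's machine unbootable.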
Bug#891434: grub-efi: System fails to boot after "No space left on device" on EFI variable storage
Just had this again. This time, even after I repaired Debian, Windows
disappeared completely from the start menu. I do not yet know how to get
it back.

What does it take to get attention to a bug that completely breaks the
system?

Kind regards,
Ralf
Bug#891434: grub-efi: System fails to boot after "No space left on device" on EFI variable storage
Package: grub-efi-amd64
Version: 2.02+dfsg1-4
Followup-For: Bug #891434

I just ran into this same issue and it is not specific to grub:
refind-install also has similar issues, so this is specific to the state
of the computer.

I found this answer helpful: https://unix.stackexchange.com/a/379824/79267

In particular deleting dump files helped:

    rm /sys/firmware/efi/efivars/dump-*

and then grub-install worked fine.

As a fix, perhaps grub could issue a message to look into the
/sys/firmware/efi/efivars directory, because it was not trivial to find
it (all the mounted file systems have plenty of space as reported by
`df -h`, so the message "No space left on device" is not helpful).

-- Package-specific info:

*** BEGIN /proc/mounts
/dev/sda8 / ext4 rw,noatime,nodiratime,discard,errors=remount-ro,data=ordered 0 0
/dev/sda9 /home ext4 rw,noatime,nodiratime,discard,data=ordered 0 0
/dev/sda2 /boot/efi vfat rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro 0 0
*** END /proc/mounts

*** BEGIN /boot/grub/grub.cfg
#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#

### BEGIN /etc/grub.d/00_header ###
if [ -s $prefix/grubenv ]; then
  set have_grubenv=true
  load_env
fi
if [ "${next_entry}" ] ; then
  set default="${next_entry}"
  set next_entry=
  save_env next_entry
  set boot_once=true
else
  set default="0"
fi

if [ x"${feature_menuentry_id}" = xy ]; then
  menuentry_id_option="--id"
else
  menuentry_id_option=""
fi

export menuentry_id_option

if [ "${prev_saved_entry}" ]; then
  set saved_entry="${prev_saved_entry}"
  save_env saved_entry
  set prev_saved_entry=
  save_env prev_saved_entry
  set boot_once=true
fi

function savedefault {
  if [ -z "${boot_once}" ]; then
    saved_entry="${chosen}"
    save_env saved_entry
  fi
}
function load_video {
  if [ x$feature_all_video_module = xy ]; then
    insmod all_video
  else
    insmod efi_gop
    insmod efi_uga
    insmod ieee1275_fb
    insmod vbe
    insmod vga
    insmod video_bochs
    insmod video_cirrus
  fi
}

if [ x$feature_default_font_path = xy ] ; then
  font=unicode
else
  insmod part_gpt
  insmod ext2
  set root='hd0,gpt8'
  if [ x$feature_platform_search_hint = xy ]; then
    search --no-floppy --fs-uuid --set=root --hint-bios=hd0,gpt8 --hint-efi=hd0,gpt8 --hint-baremetal=ahci0,gpt8 b6485691-c9ee-4aad-85ba-4d9a8032e2c7
  else
    search --no-floppy --fs-uuid --set=root b6485691-c9ee-4aad-85ba-4d9a8032e2c7
  fi
  font="/usr/share/grub/unicode.pf2"
fi

if loadfont $font ; then
  set gfxmode=800x600
  load_video
  insmod gfxterm
  set locale_dir=$prefix/locale
  set lang=en_US
  insmod gettext
fi
terminal_output gfxterm
if [ "${recordfail}" = 1 ] ; then
  set timeout=30
else
  if [ x$feature_timeout_style = xy ] ; then
    set timeout_style=menu
    set timeout=5
  # Fallback normal timeout code in case the timeout_style feature is
  # unavailable.
  else
    set timeout=5
  fi
fi
### END /etc/grub.d/00_header ###

### BEGIN /etc/grub.d/05_debian_theme ###
insmod part_gpt
insmod ext2
set root='hd0,gpt8'
if [ x$feature_platform_search_hint = xy ]; then
  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,gpt8 --hint-efi=hd0,gpt8 --hint-baremetal=ahci0,gpt8 b6485691-c9ee-4aad-85ba-4d9a8032e2c7
else
  search --no-floppy --fs-uuid --set=root b6485691-c9ee-4aad-85ba-4d9a8032e2c7
fi
insmod png
if background_image /usr/share/desktop-base/softwaves-theme/grub/grub-16x9.png; then
  set color_normal=white/black
  set color_highlight=black/white
else
  set menu_color_normal=cyan/blue
  set menu_color_highlight=white/blue
fi
### END /etc/grub.d/05_debian_theme ###

### BEGIN /etc/grub.d/10_linux ###
function gfxmode {
  set gfxpayload="${1}"
}
set linux_gfx_mode=
export linux_gfx_mode
menuentry 'Debian GNU/Linux' --class debian --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-b6485691-c9ee-4aad-85ba-4d9a8032e2c7' {
  load_video
  insmod gzio
  if [ x$grub_platform = xxen ]; then insmod xzio; insmod lzopio; fi
  insmod part_gpt
  insmod ext2
  set root='hd0,gpt8'
  if [ x$feature_platform_search_hint = xy ]; then
    search --no-floppy --fs-uuid --set=root --hint-bios=hd0,gpt8 --hint-efi=hd0,gpt8 --hint-baremetal=ahci0,gpt8 b6485691-c9ee-4aad-85ba-4d9a8032e2c7
  else
    search --no-floppy --fs-uuid --set=root b6485691-c9ee-4aad-85ba-4d9a8032e2c7
  fi
  echo 'Loading Linux 4.15.0-2-amd64 ...'
  linux /boot/vmlinuz-4.15.0-2-amd64 root=UUID=b6485691-c9ee-4aad-85ba-4d9a8032e2c7 ro quiet
  echo 'Loading initial ramdisk ...'
  initrd /boot/initrd.img-4.15.0-2-amd64
}
submen
Bug#891434: the same on 2 Acer Aspire V13 PC's
On Wed, 07 Mar 2018 11:13:02 -0500 Rann Bar-On wrote:
> Is this still the case with 2.02+dfsg1-3 or has it been fixed?

Yes, it is.

I experienced the same issue on 2 Acer Aspire V13 PCs, updating GRUB to
2.02+dfsg1-2 and to 2.02+dfsg1-3. Both PCs were left unbootable.

I rescued them with an install image and by re-installing GRUB with the
option --removable.

I saved the logs if you are interested.

Regards,
Jean-Marc
Bug#891434:
Is this still the case with 2.02+dfsg1-3 or has it been fixed?
Bug#891434: grub-efi: System fails to boot after "No space left on device" on EFI variable storage
Hi,

I experienced the same today after a grub update to '2.02+dfsg1-1' on
testing.

Looking back at logs, grub-install reported an error but the upgrade
process as a whole didn't fail, so I missed it at first:

```
Could not prepare Boot variable: No space left on device
grub-install: error: efibootmgr failed to register the boot entry: Input/output error.
Failed: grub-install --target=x86_64-efi
WARNING: Bootloader is not properly installed, system may not be bootable
```

However, after manually recovering via efibootmgr, the ESP doesn't seem
to be full nor close to:

```
# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda4       3.7G  238M  3.2G   7% /boot
/dev/sda1       256M   24M  233M  10% /boot/efi
```

My pstore has ~150 entries, all quite small (~1Kb), and none of them are
recent. So I'm not sure why this specific upgrade got stuck on ENOSPC.

Ciao, Luca

--
"If you build a wall, think of what you leave outside it" - Italo Calvino
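
Since `df -h` on the ESP says nothing about EFI variable storage, a
quick way to see how much the pstore dumps account for is to total the
dump-* files under the efivars directory. The following is a minimal
sketch, assuming efivarfs is mounted at /sys/firmware/efi/efivars; the
function name `summarize_dump_vars` is hypothetical, and the byte count
is only what efivarfs reports per file (including the 4-byte attribute
word), not how a given firmware accounts for its own storage.

```python
import glob
import os
from typing import Tuple


def summarize_dump_vars(
    efivars_dir: str = "/sys/firmware/efi/efivars",
) -> Tuple[int, int]:
    """Return (count, total_bytes) for the dump-* variables in efivars_dir."""
    paths = glob.glob(os.path.join(efivars_dir, "dump-*"))
    total = sum(os.path.getsize(p) for p in paths)
    return len(paths), total
```

Calling `summarize_dump_vars()` with no argument on an affected machine
gives a rough count and size to include in a report. It cannot say how
much space the firmware still considers free.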
Bug#891434: grub-efi: System fails to boot after "No space left on device" on EFI variable storage
Package: grub-efi
Version: 2.02+dfsg1-1
Severity: critical
Justification: breaks the whole system

Dear Maintainer,

earlier today I did a system update, which completed successfully (as in, dpkg
didn't stop due to an error). I then rebooted my machine. This left Linux
unable to boot; only the Windows entry was left in the boot menu. After some
hours of debugging, the problem turned out to be that writing an EFI variable
fails with "No space left on the device". I did a firmware update (from
Windows), to no avail. In the end I booted into a live system, deleted some of
the "dump-type0-*" variables, rebooted, and then ran "grub-install" from a
chroot to fix the situation.

I'm not exactly sure what went wrong here, but clearly the system shouldn't be
put into an unbootable state ever. I see two bugs here:

* First, it looks like something is filling up the EFI variable space. I've
  added an `ls -lah` of the efivars folder below. This is after I deleted
  roughly 20-30 "dump-type0-*" variables. Is this the kernel dumping
  information (about crashes or so)? If yes, it seems to do so without ever
  cleaning up or taking free space into account, which I'd consider a serious
  bug. Should I report this against the kernel? I don't even know what creates
  those EFI variables.

* Second, does grub-install really have to delete and create EFI variables even
  when nothing changed? It seems to me that writing an EFI variable is only
  necessary when initially installing GRUB. Even if writing is necessary, a
  check could be done *before* deleting the boot entry whether it will be
  possible to write it again later. Right now, it seems that grub will happily
  delete the debian boot entry and then fail to create it again -- and this
  doesn't even make the system update fail.

This is all on a Lenovo P50. Initially I used the firmware version from last
fall, and then updated it to the latest one (from last December).

Kind regards,
Ralf

-- Package-specific info:

*** BEGIN /sys/firmware/efi/efivars
$ ls -lah
total 0
drwxr-xr-x 2 root root   0 Feb 25 14:25 .
drwxr-xr-x 6 root root   0 Feb 25 14:25 ..
-rw-r--r-- 1 root root  26 Feb 25 14:25 AppName-1fd8b79f-0be2-4d57-b241-81c5e24e01a1
-rw-r--r-- 1 root root  36 Feb 25 14:25 AppPlatform-1fd8b79f-0be2-4d57-b241-81c5e24e01a1
-rw-r--r-- 1 root root   5 Feb 25 14:25 AuthVarKeyDatabase-aaf32c78-947b-439a-a180-2e144ec37792
-rw-r--r-- 1 root root 304 Feb 25 14:25 Boot-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root 122 Feb 25 14:25 Boot0001-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  46 Feb 25 14:25 Boot0010-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  54 Feb 25 14:25 Boot0011-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  84 Feb 25 14:25 Boot0012-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  72 Feb 25 14:25 Boot0013-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  80 Feb 25 14:25 Boot0014-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  74 Feb 25 14:25 Boot0015-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  60 Feb 25 14:25 Boot0016-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  64 Feb 25 14:25 Boot0017-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  66 Feb 25 14:25 Boot0018-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  63 Feb 25 14:25 Boot0019-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  63 Feb 25 14:25 Boot001A-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  69 Feb 25 14:25 Boot001B-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  69 Feb 25 14:25 Boot001C-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  69 Feb 25 14:25 Boot001D-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  69 Feb 25 14:25 Boot001E-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  66 Feb 25 14:25 Boot001F-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  66 Feb 25 14:25 Boot0020-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  70 Feb 25 14:25 Boot0021-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  72 Feb 25 14:25 Boot0022-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  66 Feb 25 14:25 Boot0023-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  68 Feb 25 14:25 Boot0024-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root   6 Feb 25 14:25 BootCurrent-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root   8 Feb 25 14:25 BootOptionSupport-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  28 Feb 25 14:25 BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c
-rw-r--r-- 1 root root  24 Feb 25 14:25 BootOrderDefault-0b7646a4-6b44-4332-8588-c8998117f2ef
-rw-r--r-- 1 root root   5 Feb 25 14:25 BootState-60b5e939-0fcf-4227-ba83-6bbed45bc0e3
-rw-r--r-- 1 root root  28 Feb 25 14:25