Bug#1038419: installation-reports: [arm64] Installation hangs on "Configuring grub-efi-arm64 (arm64)"

2023-06-25 Thread Joel May
This bug was fixed in kernel version 6.1.33 by commit 
e8631d84c01ece34670af0d300a6f88d86d12f70.  I compiled from source and confirm 
that the problem is indeed fixed in 6.1.33.
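
To check whether a given Debian kernel already carries the fix, a minimal Python sketch follows. It assumes the upstream version can be read from a version banner containing a string such as `Debian 6.1.27-1` (as seen in the dmesg output in this report), not from the ABI name such as 6.1.0-9-arm64. It only answers for the 6.1.y series named above, and the function names are hypothetical.

```python
import re

# Upstream 6.1.y release that this report names as containing the fix.
FIXED_IN = (6, 1, 33)


def upstream_version(banner):
    """Extract (major, minor, patch) from a banner like 'Debian 6.1.27-1'.

    Returns None when no such version string is present.
    """
    match = re.search(r"Debian (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_fix(banner):
    """Return True/False for 6.1.y kernels, None when it cannot be decided.

    Other series are reported as None because this report only states
    where the fix landed in the 6.1.y series.
    """
    version = upstream_version(banner)
    if version is None or version[:2] != FIXED_IN[:2]:
        return None
    return version >= FIXED_IN
```

For example, `has_fix("6.1.0-9-arm64 #1 Debian 6.1.27-1")` evaluates to `False`, which matches the affected kernel in this report.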

The root of the problem is that the "family" in the SMBIOS is "Lenovo 
ThinkSystem HR330A/HR350A", not "eMAG", so the workaround for the broken 
SetVirtualAddressMap() is not applied.
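
The mismatch can be illustrated with a short Python sketch. It assumes the quirk is keyed on an exact match of the SMBIOS family string against "eMAG"; the exact comparison the kernel performs is not quoted in this report, so treat `quirk_applies` as a hypothetical stand-in. The sysfs path `/sys/class/dmi/id/product_family` is where Linux normally exposes this field.

```python
from pathlib import Path

# Family string the workaround is assumed to look for (from this report).
EXPECTED_FAMILY = "eMAG"


def read_family(path="/sys/class/dmi/id/product_family"):
    """Return the SMBIOS family string, or None if it is not readable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def quirk_applies(family, expected=EXPECTED_FAMILY):
    """Assumed exact-match check: True only if the family equals `expected`."""
    return family is not None and family.strip() == expected
```

With the family string reported for this machine, `quirk_applies("Lenovo ThinkSystem HR330A/HR350A")` is `False`, so a check of this shape would skip the workaround.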

I now see that this bug affects more than the installer; that just happened to 
be where I was first blocked by it.  I'm not sure if I can/should reclassify 
this bug report to the kernel package.



2023-06-18 Thread Joel May
Upon further investigation, this problem appears to be related to the kernel 
version (6.1.0-9-arm64).

After booting into my new Debian 12 installation, efibootmgr hung with no 
output when I attempted to delete a boot entry (`sudo efibootmgr -Bb 0006`).  I 
downloaded the Bullseye kernel deb package and downgraded to 5.10.0-23-arm64.  
Using the older kernel with Debian 12 allowed me to delete the boot entry with 
efibootmgr.

After a few seconds of efibootmgr hanging with kernel 6.1.0-9-arm64, dmesg 
starts logging:
[  220.565406] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  220.571320] rcu: 31-...0: (1 GPs behind) idle=de3c/1/0x4000 softirq=453/453 fqs=2582
[  220.580182]  (detected by 28, t=5255 jiffies, g=2537, q=452 ncpus=32)
[  220.580186] Task dump for CPU 31:
[  220.580188] task:kworker/u64:5   state:R  running task stack:0 pid:378   ppid:2  flags:0x000a
[  220.580194] Workqueue: efi_rts_wq efi_call_rts
[  220.580203] Call trace:
[  220.580204]  __switch_to+0xf0/0x170
[  220.580211]  0x0
[  283.592504] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  283.598416] rcu: 31-...0: (1 GPs behind) idle=de3c/1/0x4000 softirq=453/453 fqs=10186
[  283.607364]  (detected by 18, t=21013 jiffies, g=2537, q=644 ncpus=32)
[  283.607368] Task dump for CPU 31:
[  283.607369] task:kworker/u64:5   state:R  running task stack:0 pid:378   ppid:2  flags:0x000a
[  283.607376] Workqueue: efi_rts_wq efi_call_rts
[  283.607382] Call trace:
[  283.607383]  __switch_to+0xf0/0x170
[  283.607388]  0x0
[  346.620541] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  346.626453] rcu: 31-...0: (1 GPs behind) idle=de3c/1/0x4000 softirq=453/453 fqs=16943
[  346.635401]  (detected by 22, t=36771 jiffies, g=2537, q=1673 ncpus=32)
[  346.635404] Task dump for CPU 31:
[  346.635405] task:kworker/u64:5   state:R  running task stack:0 pid:378   ppid:2  flags:0x000a
[  346.635411] Workqueue: efi_rts_wq efi_call_rts
[  346.635417] Call trace:
[  346.635419]  __switch_to+0xf0/0x170
[  346.635424]  0x0
[  363.463544] INFO: task khugepaged:188 blocked for more than 120 seconds.
[  363.470242]   Not tainted 6.1.0-9-arm64 #1 Debian 6.1.27-1
[  363.476069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  363.483894] task:khugepaged  state:D stack:0 pid:188   ppid:2  flags:0x0008
[  363.483901] Call trace:
[  363.483902]  __switch_to+0xf0/0x170
[  363.483909]  __schedule+0x340/0x940
[  363.483913]  schedule+0x58/0xf0
[  363.483916]  schedule_timeout+0x14c/0x180
[  363.483919]  __wait_for_common+0xd4/0x254
[  363.483922]  wait_for_completion+0x28/0x3c
[  363.483926]  __flush_work.isra.0+0x180/0x2dc
[  363.483931]  flush_work+0x18/0x2c
[  363.483934]  __lru_add_drain_all+0x1a0/0x260
[  363.483938]  lru_add_drain_all+0x1c/0x30
[  363.483941]  khugepaged+0xa4/0x9d0
[  363.483945]  kthread+0xe0/0xe4
[  363.483948]  ret_from_fork+0x10/0x20
[  363.483986] INFO: task efibootmgr:956 blocked for more than 120 seconds.
[  363.490681]   Not tainted 6.1.0-9-arm64 #1 Debian 6.1.27-1
[  363.496510] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  363.504335] task:efibootmgr  state:D stack:0 pid:956   ppid:955 flags:0x0004
[  363.504341] Call trace:
[  363.504342]  __switch_to+0xf0/0x170
[  363.504347]  __schedule+0x340/0x940
[  363.504351]  schedule+0x58/0xf0
[  363.504354]  schedule_timeout+0x14c/0x180
[  363.504357]  __wait_for_common+0xd4/0x254
[  363.504360]  wait_for_completion+0x28/0x3c
[  363.504363]  virt_efi_set_variable+0x134/0x1b0
[  363.504367]  efivar_set_variable_locked+0x7c/0xfc
[  363.504370]  efivar_entry_delete+0x5c/0xec [efivarfs]
[  363.504380]  efivarfs_unlink+0x28/0x5c [efivarfs]
[  363.504385]  vfs_unlink+0x124/0x300
[  363.504390]  do_unlinkat+0x19c/0x2c0
[  363.504393]  __arm64_sys_unlinkat+0x44/0x90
[  363.504396]  invoke_syscall+0x78/0x100
[  363.504401]  el0_svc_common.constprop.0+0x4c/0xf4
[  363.504405]  do_el0_svc+0x34/0xd0
[  363.504408]  el0_svc+0x34/0xd4
[  363.504411]  el0t_64_sync_handler+0xf4/0x120
[  363.504414]  el0t_64_sync+0x18c/0x190
[  409.648706] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  409.654617] rcu: 31-...0: (1 GPs behind) idle=de3c/1/0x4000 softirq=453/453 fqs=22740
[  409.663565]  (detected by 20, t=52529 jiffies, g=2537, q=1895 ncpus=32)
[  409.663568] Task dump for CPU 31:
[  409.663570] task:kworker/u64:5   state:R  running task stack:0 pid:378   ppid:2  flags:0x000a
[  409.663576] Workqueue: efi_rts_wq efi_call_rts
[  409.663582] Call trace:
[  409.663583]  __switch_to+0xf0/0x170
[  409.663589]  0x0
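
The pattern in this log (an RCU stall report followed by a task dump naming the `efi_rts_wq` workqueue) can be picked out mechanically. The following Python sketch is one way to do that under simple assumptions: it expects dmesg text with each message on a single line, and it pairs each stall header with the next `efi_rts_wq` line that follows it. The function name `efi_stall_times` is hypothetical.

```python
import re

# Header line of an RCU stall report, capturing the dmesg timestamp.
STALL = re.compile(r"^\[\s*(\d+\.\d+)\] rcu: INFO: rcu_sched detected stalls")
# Marker showing that the stalled task is the EFI runtime services worker.
EFI_WQ = "Workqueue: efi_rts_wq efi_call_rts"


def efi_stall_times(dmesg_text):
    """Return timestamps of RCU stall reports whose dump names efi_rts_wq."""
    times = []
    pending = None  # timestamp of the most recent unmatched stall header
    for line in dmesg_text.splitlines():
        match = STALL.match(line)
        if match:
            pending = float(match.group(1))
        elif EFI_WQ in line and pending is not None:
            times.append(pending)
            pending = None
    return times
```

Usage could look like `efi_stall_times(open("dmesg.txt").read())`, where `dmesg.txt` is a saved copy of the kernel log; a non-empty result suggests the same hang signature as above.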



2023-06-17 Thread Joel May
Package: installation-reports
Severity: important

Boot method: USB
Image version: https://cdimage.debian.org/debian-cd/current/arm64/bt-cd/debian-12.0.0-arm64-netinst.iso.torrent
Date: 2023-06-17

Machine: Lenovo HR350A server
Partitions:
Filesystem     Type      1K-blocks    Used Available Use% Mounted on
rootfs         rootfs    263117412  216724 262900688   1% /
tmpfs          tmpfs      26311744     144  26311600   1% /run
devtmpfs       devtmpfs  131020564       0 131020564   0% /dev
/dev/sdc1      iso9660      612352  612352         0 100% /cdrom
/dev/sdb2      ext4      237753112 1161200 224441896   1% /target
/dev/sdb1      vfat         524000    6224    517776   2% /target/boot/efi


Base System Installation Checklist:
[O] = OK, [E] = Error (please elaborate below), [ ] = didn't try it

Initial boot:           [O]
Detect network card:    [O]
Configure network:      [O]
Detect media:           [O]
Load installer modules: [O]
Clock/timezone setup:   [O]
User/password setup:    [O]
Detect hard drives:     [O]
Partition hard drives:  [O]
Install base system:    [O]
Install tasks:          [O]
Install boot loader:    [E]
Overall install:        [ ]

Comments/Problems:

Installation hangs on "Configuring grub-efi-arm64 (arm64)".

When the installer is stuck in this state, other commands from the shell also 
hang, such as
`efibootmgr`
`cat /sys/firmware/efi/efivars/Boot-...`
`modprobe` (called by `reportbug installation-report`).

Using the expert install to disable updating NVRAM variables works around the 
problem.
For reference, Debian Bullseye, Ubuntu 22.04.2, and RHEL 9.2 successfully 
install the UEFI bootloader on this system.

-- Package-specific info:
- dmesg
[  396.440480] Adding 1000444k swap on /dev/sda3.  Priority:-2 extents:1 across:1000444k SSFS
[  398.952999] EXT4-fs (sda2): mounted filesystem with ordered data mode. Quota mode: none.
[  398.963417] FAT-fs (sda1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[  850.581829] fuse: init (API version 7.37)
[  887.150168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  887.150177] rcu: 1-...0: (1 GPs behind) idle=dae4/1/0x4000 softirq=11625/11625 fqs=2163
[  887.150185]  (detected by 7, t=5252 jiffies, g=68329, q=166 ncpus=32)
[  887.150188] Task dump for CPU 1:
[  887.150190] task:kworker/u64:0   state:R  running task stack:0 pid:9 ppid:2  flags:0x000a
[  887.150197] Workqueue: efi_rts_wq efi_call_rts
[  887.150206] Call trace:
[  887.150208]  __switch_to+0xf0/0x170
[  887.150214]  0x0
[  950.170167] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  950.170173] rcu: 1-...0: (1 GPs behind) idle=dae4/1/0x4000 softirq=11625/11625 fqs=7867
[  950.170180]  (detected by 16, t=21007 jiffies, g=68329, q=296 ncpus=32)
[  950.170183] Task dump for CPU 1:
[  950.170184] task:kworker/u64:0   state:R  running task stack:0 pid:9 ppid:2  flags:0x000a
[  950.170190] Workqueue: efi_rts_wq efi_call_rts
[  950.170196] Call trace:
[  950.170198]  __switch_to+0xf0/0x170
[  950.170203]  0x0
[ 1013.190168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1013.190173] rcu: 1-...0: (1 GPs behind) idle=dae4/1/0x4000 softirq=11625/11625 fqs=13685
[ 1013.190180]  (detected by 0, t=36762 jiffies, g=68329, q=343 ncpus=32)
[ 1013.190183] Task dump for CPU 1:
[ 1013.190184] task:kworker/u64:0   state:R  running task stack:0 pid:9 ppid:2  flags:0x000a
[ 1013.190191] Workqueue: efi_rts_wq efi_call_rts
[ 1013.190197] Call trace:
[ 1013.190198]  __switch_to+0xf0/0x170
[ 1013.190204]  0x0
[ 1076.210168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1076.210173] rcu: 1-...0: (1 GPs behind) idle=dae4/1/0x4000 softirq=11625/11625 fqs=19448
[ 1076.210179]  (detected by 11, t=52517 jiffies, g=68329, q=373 ncpus=32)
[ 1076.210182] Task dump for CPU 1:
[ 1076.210183] task:kworker/u64:0   state:R  running task stack:0 pid:9 ppid:2  flags:0x000a
[ 1076.210189] Workqueue: efi_rts_wq efi_call_rts
[ 1076.210195] Call trace:
[ 1076.210196]  __switch_to+0xf0/0x170
[ 1076.210201]  0x0
[ 1088.482198] INFO: task khugepaged:188 blocked for more than 120 seconds.
[ 1088.482205]   Not tainted 6.1.0-9-arm64 #1 Debian 6.1.27-1
[ 1088.482207] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1088.482209] task:khugepaged  state:D stack:0 pid:188   ppid:2  flags:0x0008
[ 1088.482214] Call trace:
[ 1088.482215]  __switch_to+0xf0/0x170
[ 1088.482221]  __schedule+0x340/0x940
[ 1088.482225]  schedule+0x58/0xf0
[ 1088.482228]  schedule_timeout+0x14c/0x180
[ 1088.482231]  __wait_for_common+0xd4/0x254
[ 1088.482234]  wait_for_completion+0x28/0x3c
[ 1088.482238]  __flush_work.isra.0+0x180/0x2dc
[ 1088.482244]  flush_work+0x18/0x2c
[ 1088.482246]  __lru_add_drain_all+0x1a0/0x260
[ 1088.482250]  lru_add_drain_all+0x1c/0x30
[ 1088.482253]