[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x
And on an arm64 platform we have something similar:

15:55:45 DEBUG| [stdout] Number of CPUs: 4
15:55:45 DEBUG| [stdout] Number of CPUs Online: 4
15:55:45 DEBUG| [stdout]
15:55:45 DEBUG| [stdout] access STARTING
15:55:49 DEBUG| [stdout] [ 7016.776865] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:55:50 DEBUG| [stdout] access RETURNED 0
15:55:50 DEBUG| [stdout] access PASSED
15:55:50 DEBUG| [stdout] af-alg STARTING
15:55:50 DEBUG| [stdout] [ 7017.948549] cryptd: max_cpu_qlen set to 1000
15:55:55 DEBUG| [stdout] af-alg RETURNED 0
15:55:55 DEBUG| [stdout] af-alg PASSED
15:55:55 DEBUG| [stdout] affinity STARTING
15:55:59 DEBUG| [stdout] [ 7026.984742] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:56:00 DEBUG| [stdout] affinity RETURNED 0
15:56:00 DEBUG| [stdout] affinity PASSED
15:56:00 DEBUG| [stdout] aio STARTING
15:56:05 DEBUG| [stdout] aio RETURNED 0
15:56:05 DEBUG| [stdout] aio PASSED
15:56:05 DEBUG| [stdout] aiol STARTING
15:56:09 DEBUG| [stdout] [ 7037.068696] unregister_netdevice: waiting for eth0 to become free. Usage count = 1

...and a stack dump too. Protocol family 5 is AF_APPLETALK and stress-ng does not exercise this, so this is pretty weird and unexpected.

15:57:08 DEBUG| [stdout] [ 7096.221119] NET: Registered protocol family 5
15:57:10 DEBUG| [stdout] [ 7098.023954] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:57:20 DEBUG| [stdout] [ 7108.103839] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:57:30 DEBUG| [stdout] [ 7118.183729] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:57:41 DEBUG| [stdout] [ 7128.267622] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:57:51 DEBUG| [stdout] [ 7138.343507] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:01 DEBUG| [stdout] [ 7148.427381] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:11 DEBUG| [stdout] [ 7158.503282] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:21 DEBUG| [stdout] [ 7168.587157] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:31 DEBUG| [stdout] [ 7178.663042] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:41 DEBUG| [stdout] [ 7188.742924] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:51 DEBUG| [stdout] [ 7198.822826] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:01 DEBUG| [stdout] [ 7208.902688] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:11 DEBUG| [stdout] [ 7218.982579] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:21 DEBUG| [stdout] [ 7229.062465] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:31 DEBUG| [stdout] [ 7239.142348] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:42 DEBUG| [stdout] [ 7249.74] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:44 DEBUG| [stdout] [ 7251.334302] INFO: task modprobe:1184184 blocked for more than 120 seconds.
15:59:44 DEBUG| [stdout] [ 7251.335644] Tainted: G OE 5.4.0-7-generic #8-Ubuntu
15:59:44 DEBUG| [stdout] [ 7251.336889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
15:59:44 DEBUG| [stdout] [ 7251.338455] modprobe D 0 1184184 1142782 0x0028
15:59:44 DEBUG| [stdout] [ 7251.338461] Call trace:
15:59:44 DEBUG| [stdout] [ 7251.338472] __switch_to+0xe4/0x148
15:59:44 DEBUG| [stdout] [ 7251.338478] __schedule+0x2fc/0x7c0
15:59:44 DEBUG| [stdout] [ 7251.338489] schedule+0x3c/0xb8
15:59:44 DEBUG| [stdout] [ 7251.338501] rwsem_down_write_slowpath+0x2e8/0x5b0
15:59:44 DEBUG| [stdout] [ 7251.338512] down_write+0x70/0x80
15:59:44 DEBUG| [stdout] [ 7251.338525] register_netdevice_notifier+0x4c/0x208
15:59:44 DEBUG| [stdout] [ 7251.338548] atalk_init+0xa0/0x118 [appletalk]
15:59:44 DEBUG| [stdout] [ 7251.338570] do_one_initcall+0x50/0x220
15:59:44 DEBUG| [stdout] [ 7251.338575] do_init_module+0x5c/0x248
15:59:44 DEBUG| [stdout] [ 7251.338582] load_module+0xecc/0x1170
15:59:44 DEBUG| [stdout] [ 7251.338585] __do_sys_finit_module+0xac/0x110
15:59:44 DEBUG| [stdout] [ 7251.338587] __arm64_sys_finit_module+0x28/0x38
15:59:44 DEBUG| [stdout] [ 7251.338591] el0_svc_common.constprop.0+0xdc/0x1d8
15:59:44 DEBUG| [stdout] [ 7251.338593] el0_svc_handler+0x34/0xa0
15:59:44 DEBUG| [stdout] [ 7251.338595] el0_svc+0x10/0x14

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854968

Title:
  stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  stress-ng sctp stressor breaks 5.4.0.7-8 on s390x during ADT regression
[Kernel-packages] [Bug 1854968] [NEW] stress-ng sctp stressor breaks 5.4.0.7-8 on s390x
Public bug reported:

stress-ng sctp stressor breaks 5.4.0.7-8 on s390x during ADT regression testing:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-focal-canonical-kernel-team-unstable/focal/s390x/l/linux/20191203_153629_d7a41@/log.gz

14:44:30 DEBUG| [stdout] sctp STARTING
14:44:30 DEBUG| [stdout] [ 3491.098762] sctp: Hash tables configured (bind 256/256)
14:44:33 DEBUG| [stdout] [ 3494.694285] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:44:43 DEBUG| [stdout] [ 3504.714324] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:44:54 DEBUG| [stdout] [ 3514.974288] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:04 DEBUG| [stdout] [ 3525.234306] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:14 DEBUG| [stdout] [ 3535.494291] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:25 DEBUG| [stdout] [ 3545.754323] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:35 DEBUG| [stdout] [ 3556.014294] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:45 DEBUG| [stdout] [ 3566.034317] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:55 DEBUG| [stdout] [ 3576.054296] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:05 DEBUG| [stdout] [ 3586.324332] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:15 DEBUG| [stdout] [ 3596.334306] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:25 DEBUG| [stdout] [ 3606.594337] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:36 DEBUG| [stdout] [ 3616.854305] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:46 DEBUG| [stdout] [ 3627.124323] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:56 DEBUG| [stdout] [ 3637.154313] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:06 DEBUG| [stdout] [ 3647.414304] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:16 DEBUG| [stdout] [ 3657.674353] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:27 DEBUG| [stdout] [ 3667.734297] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:37 DEBUG| [stdout] [ 3677.994396] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:44 DEBUG| [stdout] [ 3684.814335] INFO: task modprobe:2063628 blocked for more than 122 seconds.
14:47:44 DEBUG| [stdout] [ 3684.814345] Tainted: P OE 5.4.0-7-generic #8-Ubuntu
14:47:44 DEBUG| [stdout] [ 3684.814346] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
14:47:44 DEBUG| [stdout] [ 3684.814348] modprobe D 0 2063628 2063618 0x0800
14:47:44 DEBUG| [stdout] [ 3684.814351] Call Trace:
14:47:44 DEBUG| [stdout] [ 3684.814360] ([<be310914>] __schedule+0x304/0x7b0)
14:47:44 DEBUG| [stdout] [ 3684.814362] [<be310e0a>] schedule+0x4a/0xe0
14:47:44 DEBUG| [stdout] [ 3684.814366] [<bdb071cc>] rwsem_down_write_slowpath+0x22c/0x530
14:47:44 DEBUG| [stdout] [ 3684.814370] [<be14d66c>] register_pernet_subsys+0x2c/0x60
14:47:44 DEBUG| [stdout] [ 3684.814411] [<03ff80766638>] sctp_init+0x2f0/0x520 [sctp]
14:47:44 DEBUG| [stdout] [ 3684.814414] [<bda288c0>] do_one_initcall+0x40/0x200
14:47:44 DEBUG| [stdout] [ 3684.814416] [<bdb594a0>] do_init_module+0x70/0x270
14:47:44 DEBUG| [stdout] [ 3684.814418] [<bdb5b892>] load_module+0x1142/0x1440
14:47:44 DEBUG| [stdout] [ 3684.814419] [<bdb5bdc4>] __do_sys_finit_module+0xa4/0xf0
14:47:44 DEBUG| [stdout] [ 3684.814421] [<be315fc6>] system_call+0x2aa/0x2c8
14:47:47 DEBUG| [stdout] [ 3688.014291] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:57 DEBUG| [stdout] [ 3698.064370] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:07 DEBUG| [stdout] [ 3708.084328] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:17 DEBUG| [stdout] [ 3718.134297] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:27 DEBUG| [stdout] [ 3728.214335] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:37 DEBUG| [stdout] [ 3738.474354] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:48 DEBUG| [stdout] [ 3748.734396] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:58 DEBUG| [stdout] [ 3758.744352] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:49:08 DEBUG| [stdout] [ 3768.754349] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:49:18 DEBUG| [stdout] [ 3779.014352] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:49:28
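
In the hung-task report above, the only frame that belongs to a loadable module is sctp_init, so modprobe is blocked inside the sctp module's init path while waiting for a write lock. A minimal Python sketch for pulling such module-tagged frames out of a log like this one follows. The regular expression and the helper name module_frames are illustrative assumptions, not part of any existing tool.

```python
import re

# One call-trace frame as it appears in these logs, for example
#   "[<03ff80766638>] sctp_init+0x2f0/0x520 [sctp]"
# The trailing "[module]" tag is only present for frames in loadable modules.
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)


def module_frames(lines):
    """Return (symbol, module) pairs for frames that carry a module tag."""
    found = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match and match.group("module"):
            found.append((match.group("symbol"), match.group("module")))
    return found


# Sample lines copied from the trace in this report.
TRACE = [
    "14:47:44 DEBUG| [stdout] [ 3684.814370] [ ] register_pernet_subsys+0x2c/0x60",
    "14:47:44 DEBUG| [stdout] [ 3684.814411] [<03ff80766638>] sctp_init+0x2f0/0x520 [sctp]",
    "14:47:44 DEBUG| [stdout] [ 3684.814414] [ ] do_one_initcall+0x40/0x200",
]

print(module_frames(TRACE))  # [('sctp_init', 'sctp')]
```

Run against the arm64 trace quoted elsewhere in this thread, the same helper would pick out atalk_init in the appletalk module.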
[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x
I can't easily reproduce this on a s390 VM instance.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854968

Title:
  stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

Status in linux package in Ubuntu:
  New
[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x
net/core/dev.c netdev_wait_allrefs() states:

/**
 * netdev_wait_allrefs - wait until all references are gone.
 * @dev: target net_device
 *
 * This is called when unregistering network devices.
 *
 * Any protocol or device that holds a reference should register
 * for netdevice notification, and cleanup and put back the
 * reference if they receive an UNREGISTER event.
 * We can get stuck here if buggy protocols don't correctly
 * call dev_put.
 */
...
		if (refcnt && time_after(jiffies, warning_time + 10 * HZ)) {
			pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
				 dev->name, refcnt);
			warning_time = jiffies;
		}
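
The check quoted above explains why the log shows the message roughly every ten seconds for as long as one reference is outstanding. Below is a small userspace simulation of just that rate-limiting condition, written in Python. It is a sketch under stated assumptions: HZ = 100, a tick-by-tick loop, and the names warning_ticks and release_at are made up for illustration. It does not model the kernel's sleeping or notifier rebroadcasts.

```python
HZ = 100  # assumed tick rate, for this simulation only


def warning_ticks(release_at, horizon, interval=10 * HZ):
    """Return the ticks at which the warning would be printed.

    release_at: tick at which a hypothetical holder drops its reference,
                or None if it never does (the situation in this bug).
    horizon:    last tick to simulate.
    """
    warnings = []
    warning_time = 0
    for jiffies in range(1, horizon + 1):
        # One reference is held until release_at, then none.
        refcnt = 0 if (release_at is not None and jiffies >= release_at) else 1
        if refcnt == 0:
            break  # all references gone, the wait ends
        # Same shape as: refcnt && time_after(jiffies, warning_time + 10 * HZ)
        if jiffies > warning_time + interval:
            warnings.append(jiffies)
            warning_time = jiffies
    return warnings


print(warning_ticks(release_at=3500, horizon=10000))  # [1001, 2002, 3003]
print(len(warning_ticks(release_at=None, horizon=5000)))  # 4
```

With release_at=None the warnings never stop, which matches the endless "Usage count = 1" lines in the report.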
[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x
This makes sense as the af-alg stressor now exercises a far wider set of crypto engines.
[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x
The unregister_netdevice message appears after the af-alg stressor starts, so it may be a crypto algorithm that is the root cause:

14:34:33 DEBUG| [stdout] af-alg STARTING
14:34:35 DEBUG| [stdout] [ 2895.954700] unregister_netdevice: waiting for lo to become free. Usage count = 1
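
The observation above amounts to asking which stressor had most recently started when the first unregister_netdevice warning showed up. A short Python sketch of that lookup follows. The helper name stressor_at_first_warning and the patterns are assumptions based on the log format seen in this thread. The result is only a hint, since the warning says a reference is still held and not who took it.

```python
import re

# "<name> STARTING" lines as printed by the test harness in these logs.
STARTING_RE = re.compile(r"\[stdout\] (\S+) STARTING")
WARNING_TEXT = "unregister_netdevice: waiting for"


def stressor_at_first_warning(lines):
    """Return the most recently started stressor when the first warning appears."""
    current = None
    for line in lines:
        match = STARTING_RE.search(line)
        if match:
            current = match.group(1)
        elif WARNING_TEXT in line:
            return current
    return None  # no warning found in the given lines


# The two lines quoted in this comment.
LOG = [
    "14:34:33 DEBUG| [stdout] af-alg STARTING",
    "14:34:35 DEBUG| [stdout] [ 2895.954700] unregister_netdevice: waiting for lo to become free. Usage count = 1",
]

print(stressor_at_first_warning(LOG))  # af-alg
```

On the arm64 log quoted elsewhere in this thread the first warning appears while the access stressor is running, so this kind of attribution should be treated with caution.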
[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x
This stress test has not changed much lately, so I'm assuming this is a racy kernel regression. Last stress-sctp changes in stress-ng were:

commit 27b045a498b360ccbc761c3b62e3dd38dd744f09
Author: Colin Ian King
Date: Sat Aug 10 13:25:34 2019 +0100

    stress-sctp: voidify unused return

    Signed-off-by: Colin Ian King

commit 29043afe6d3c2fa95d6ce22c88aa4545d070e722
Author: Colin Ian King
Date: Wed Jun 26 12:42:00 2019 +0100

    stress-sctp: use setsockopt for more socket option exercising

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854968

Title: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

Status in linux package in Ubuntu: New

Bug description:
stress-ng sctp stressor breaks 5.4.0.7-8 on s390x during ADT regression testing:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-focal-canonical-kernel-team-unstable/focal/s390x/l/linux/20191203_153629_d7a41@/log.gz

14:44:30 DEBUG| [stdout] sctp STARTING
14:44:30 DEBUG| [stdout] [ 3491.098762] sctp: Hash tables configured (bind 256/256)
14:44:33 DEBUG| [stdout] [ 3494.694285] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:44:43 DEBUG| [stdout] [ 3504.714324] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:44:54 DEBUG| [stdout] [ 3514.974288] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:04 DEBUG| [stdout] [ 3525.234306] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:14 DEBUG| [stdout] [ 3535.494291] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:25 DEBUG| [stdout] [ 3545.754323] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:35 DEBUG| [stdout] [ 3556.014294] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:45 DEBUG| [stdout] [ 3566.034317] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:55 DEBUG| [stdout] [ 3576.054296] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:05 DEBUG| [stdout] [ 3586.324332] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:15 DEBUG| [stdout] [ 3596.334306] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:25 DEBUG| [stdout] [ 3606.594337] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:36 DEBUG| [stdout] [ 3616.854305] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:46 DEBUG| [stdout] [ 3627.124323] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:56 DEBUG| [stdout] [ 3637.154313] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:06 DEBUG| [stdout] [ 3647.414304] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:16 DEBUG| [stdout] [ 3657.674353] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:27 DEBUG| [stdout] [ 3667.734297] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:37 DEBUG| [stdout] [ 3677.994396] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:44 DEBUG| [stdout] [ 3684.814335] INFO: task modprobe:2063628 blocked for more than 122 seconds.
14:47:44 DEBUG| [stdout] [ 3684.814345] Tainted: P OE 5.4.0-7-generic #8-Ubuntu
14:47:44 DEBUG| [stdout] [ 3684.814346] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
14:47:44 DEBUG| [stdout] [ 3684.814348] modprobe D 0 2063628 2063618 0x0800
14:47:44 DEBUG| [stdout] [ 3684.814351] Call Trace:
14:47:44 DEBUG| [stdout] [ 3684.814360] ([<be310914>] __schedule+0x304/0x7b0)
14:47:44 DEBUG| [stdout] [ 3684.814362] [<be310e0a>] schedule+0x4a/0xe0
14:47:44 DEBUG| [stdout] [ 3684.814366] [<bdb071cc>] rwsem_down_write_slowpath+0x22c/0x530
14:47:44 DEBUG| [stdout] [ 3684.814370] [<be14d66c>] register_pernet_subsys+0x2c/0x60
14:47:44 DEBUG| [stdout] [ 3684.814411] [<03ff80766638>] sctp_init+0x2f0/0x520 [sctp]
14:47:44 DEBUG| [stdout] [ 3684.814414] [<bda288c0>] do_one_initcall+0x40/0x200
14:47:44 DEBUG| [stdout] [ 3684.814416] [<bdb594a0>] do_init_module+0x70/0x270
14:47:44 DEBUG| [stdout] [ 3684.814418] [<bdb5b892>] load_module+0x1142/0x1440
14:47:44 DEBUG| [stdout] [ 3684.814419] [<bdb5bdc4>] __do_sys_finit_module+0xa4/0xf0
14:47:44 DEBUG| [stdout] [ 3684.814421] [<be315fc6>] system_call+0x2aa/0x2c8
14:47:47 DEBUG| [stdout] [ 3688.014291] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:57 DEBUG| [stdout] [ 3698.064370] unregister_netdevice:
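
As a triage aid, the repeated refcount messages in a log like the one above can be tallied per interface. This is only a minimal sketch and not part of the original report: the function name `count_netdev_waits` is hypothetical, and it assumes a `grep` that supports `-o` (GNU and BSD grep both do).

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): count how often the kernel
# logged "unregister_netdevice: waiting for <iface> to become free",
# grouped by interface name.
count_netdev_waits() {
    # $1: path to a saved test log
    grep -o 'unregister_netdevice: waiting for [^ ]* to become free' "$1" |
        awk '{ print $4 }' |           # field 4 is the interface name
        sort | uniq -c |               # count occurrences per interface
        awk '{ print $2, $1 }'         # print "<iface> <count>"
}
```

Calling `count_netdev_waits log.txt` on a saved copy of the log prints one line per interface with its message count, which makes it easier to see whether `lo`, `eth0`, or both are affected.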
[Kernel-packages] [Bug 1854959] Re: stress-ng sysinfo stressor trips kernel oops on ppc64el with 5.4.0.7-8
Same on 5.4.0.4-5 too but not on 5.4.0.3.4 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1854959 Title: stress-ng sysinfo stressor trips kernel oops on ppc64el with 5.4.0.7-8 Status in linux package in Ubuntu: In Progress Bug description: stress-ng on ppc64el with 5.4.0.7-8, sysinfo stressor seems to tickle a bug: 06:26:02 DEBUG| [stdout] sysinfo FAILED (kernel oopsed) 06:26:02 DEBUG| [stdout] [ 7262.965483] kernel tried to execute exec-protected page (c00017407ce0) - exploit attempt? (uid: 0) 06:26:02 DEBUG| [stdout] [ 7262.968030] BUG: Unable to handle kernel instruction fetch 06:26:02 DEBUG| [stdout] [ 7262.968121] Faulting instruction address: 0xc00017407ce0 06:26:02 DEBUG| [stdout] [ 7262.968224] Oops: Kernel access of bad area, sig: 11 [#1] 06:26:02 DEBUG| [stdout] [ 7262.968292] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries 06:26:02 DEBUG| [stdout] [ 7262.968403] Modules linked in: unix_diag sctp zfs(PO) zunicode(PO) zavl(PO) icp(PO) zlua(PO) zcommon(PO) znvpair(PO) spl(O) snd_seq snd_seq_device snd_timer snd soundcore vhost_vsock vmw_vsock_virtio_transport_common vsock kvm_pr kvm hci_vhci bluetooth ecdh_generic ecc userio uhid hid vhost_net vhost tap cuse dccp_ipv4 dccp psnap llc algif_rng aegis128 algif_aead anubis fcrypt khazad seed sm4_generic tea crc32_generic md4 michael_mic nhpoly1305 poly1305_generic rmd128 rmd160 rmd256 rmd320 sha3_generic sm3_generic streebog_generic tgr192 wp512 xxhash_generic algif_hash blowfish_generic blowfish_common cast5_generic des_generic libdes salsa20_generic chacha_generic camellia_generic cast6_generic cast_common serpent_generic twofish_generic twofish_common algif_skcipher af_alg aufs binfmt_misc af_packet_diag tcp_diag udp_diag raw_diag inet_diag iptable_mangle xt_TCPMSS xt_tcpudp bpfilter dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua vmx_crypto crct10dif_vpmsum sch_fq_codel ip_tables 06:26:02 
DEBUG| [stdout] [ 7262.969078] x_tables autofs4 btrfs xor zstd_compress raid6_pq libcrc32c crc32c_vpmsum virtio_net virtio_blk net_failover failover [last unloaded: trace_printk] 06:26:02 DEBUG| [stdout] [ 7262.970416] CPU: 1 PID: 2613531 Comm: fuse_mnt Tainted: P OE 5.4.0-7-generic #8-Ubuntu 06:26:02 DEBUG| [stdout] [ 7262.970532] NIP: c00017407ce0 LR: c063e968 CTR: c00017407ce0 06:26:02 DEBUG| [stdout] [ 7262.970623] REGS: c001d8393810 TRAP: 0400 Tainted: P OE (5.4.0-7-generic) 06:26:02 DEBUG| [stdout] [ 7262.970737] MSR: 800010009033 CR: 88002440 XER: 2000 06:26:02 DEBUG| [stdout] [ 7262.970850] CFAR: c063e964 IRQMASK: 0 06:26:02 DEBUG| [stdout]GPR00: c063e944 c001d8393aa0 c1a5bf00 c0003d95ec00 06:26:02 DEBUG| [stdout]GPR04: c00017407c18 06:26:02 DEBUG| [stdout]GPR08: 06:26:02 DEBUG| [stdout]GPR12: c00017407ce0 c0003fffee00 7c8ab4814410 06:26:02 DEBUG| [stdout]GPR16: 7c8ab4b9 7c8ab4810320 7c8ab2f6f240 7c8ab4814420 06:26:02 DEBUG| [stdout]GPR20: 7c8aa8000b60 7c8ab4aad3a0 06:26:02 DEBUG| [stdout]GPR24: c001f38f7da0 c001fbb81e4c c00017407ce0 c001f38f7d80 06:26:02 DEBUG| [stdout]GPR28: c001f38f7da0 c0003d95ec00 c001f38f7d70 06:26:02 DEBUG| [stdout] [ 7262.971713] NIP [c00017407ce0] 0xc00017407ce0 06:26:02 DEBUG| [stdout] [ 7262.971804] LR [c063e968] fuse_request_end+0x128/0x2f0 06:26:02 DEBUG| [stdout] [ 7262.971893] Call Trace: 06:26:02 DEBUG| [stdout] [ 7262.971930] [c001d8393aa0] [c063e944] fuse_request_end+0x104/0x2f0 (unreliable) 06:26:02 DEBUG| [stdout] [ 7262.972035] [c001d8393af0] [c06427cc] fuse_dev_do_write+0x2cc/0x5c0 06:26:02 DEBUG| [stdout] [ 7262.972138] [c001d8393b70] [c0642f64] fuse_dev_write+0x74/0xd0 06:26:02 DEBUG| [stdout] [ 7262.972221] [c001d8393c00] [c04702b0] do_iter_readv_writev+0x240/0x290 06:26:02 DEBUG| [stdout] [ 7262.972334] [c001d8393c70] [c0472bc8] do_iter_write+0xc8/0x280 06:26:02 DEBUG| [stdout] [ 7262.972424] [c001d8393cc0] [c0472e90] vfs_writev+0xe0/0x180 06:26:02 DEBUG| [stdout] [ 7262.972508] [c001d8393dc0] [c0472fcc] 
do_writev+0x9c/0x1a0 06:26:02 DEBUG| [stdout] [ 7262.972588] [c001d8393e20] [c000b278] system_call+0x5c/0x68 06:26:02 DEBUG| [stdout] [ 7262.972661] Instruction dump: 06:26:02 DEBUG| [stdout] [ 7262.972716]
[Kernel-packages] [Bug 1854959] [NEW] stress-ng sysinfo stressor trips kernel oops on ppc64el with 5.4.0.7-8
Public bug reported: stress-ng on ppc64el with 5.4.0.7-8, sysinfo stressor seems to tickle a bug: 06:26:02 DEBUG| [stdout] sysinfo FAILED (kernel oopsed) 06:26:02 DEBUG| [stdout] [ 7262.965483] kernel tried to execute exec-protected page (c00017407ce0) - exploit attempt? (uid: 0) 06:26:02 DEBUG| [stdout] [ 7262.968030] BUG: Unable to handle kernel instruction fetch 06:26:02 DEBUG| [stdout] [ 7262.968121] Faulting instruction address: 0xc00017407ce0 06:26:02 DEBUG| [stdout] [ 7262.968224] Oops: Kernel access of bad area, sig: 11 [#1] 06:26:02 DEBUG| [stdout] [ 7262.968292] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries 06:26:02 DEBUG| [stdout] [ 7262.968403] Modules linked in: unix_diag sctp zfs(PO) zunicode(PO) zavl(PO) icp(PO) zlua(PO) zcommon(PO) znvpair(PO) spl(O) snd_seq snd_seq_device snd_timer snd soundcore vhost_vsock vmw_vsock_virtio_transport_common vsock kvm_pr kvm hci_vhci bluetooth ecdh_generic ecc userio uhid hid vhost_net vhost tap cuse dccp_ipv4 dccp psnap llc algif_rng aegis128 algif_aead anubis fcrypt khazad seed sm4_generic tea crc32_generic md4 michael_mic nhpoly1305 poly1305_generic rmd128 rmd160 rmd256 rmd320 sha3_generic sm3_generic streebog_generic tgr192 wp512 xxhash_generic algif_hash blowfish_generic blowfish_common cast5_generic des_generic libdes salsa20_generic chacha_generic camellia_generic cast6_generic cast_common serpent_generic twofish_generic twofish_common algif_skcipher af_alg aufs binfmt_misc af_packet_diag tcp_diag udp_diag raw_diag inet_diag iptable_mangle xt_TCPMSS xt_tcpudp bpfilter dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua vmx_crypto crct10dif_vpmsum sch_fq_codel ip_tables 06:26:02 DEBUG| [stdout] [ 7262.969078] x_tables autofs4 btrfs xor zstd_compress raid6_pq libcrc32c crc32c_vpmsum virtio_net virtio_blk net_failover failover [last unloaded: trace_printk] 06:26:02 DEBUG| [stdout] [ 7262.970416] CPU: 1 PID: 2613531 Comm: fuse_mnt Tainted: P OE 5.4.0-7-generic #8-Ubuntu 06:26:02 DEBUG| [stdout] [ 
7262.970532] NIP: c00017407ce0 LR: c063e968 CTR: c00017407ce0 06:26:02 DEBUG| [stdout] [ 7262.970623] REGS: c001d8393810 TRAP: 0400 Tainted: P OE (5.4.0-7-generic) 06:26:02 DEBUG| [stdout] [ 7262.970737] MSR: 800010009033 CR: 88002440 XER: 2000 06:26:02 DEBUG| [stdout] [ 7262.970850] CFAR: c063e964 IRQMASK: 0 06:26:02 DEBUG| [stdout]GPR00: c063e944 c001d8393aa0 c1a5bf00 c0003d95ec00 06:26:02 DEBUG| [stdout]GPR04: c00017407c18 06:26:02 DEBUG| [stdout]GPR08: 06:26:02 DEBUG| [stdout]GPR12: c00017407ce0 c0003fffee00 7c8ab4814410 06:26:02 DEBUG| [stdout]GPR16: 7c8ab4b9 7c8ab4810320 7c8ab2f6f240 7c8ab4814420 06:26:02 DEBUG| [stdout]GPR20: 7c8aa8000b60 7c8ab4aad3a0 06:26:02 DEBUG| [stdout]GPR24: c001f38f7da0 c001fbb81e4c c00017407ce0 c001f38f7d80 06:26:02 DEBUG| [stdout]GPR28: c001f38f7da0 c0003d95ec00 c001f38f7d70 06:26:02 DEBUG| [stdout] [ 7262.971713] NIP [c00017407ce0] 0xc00017407ce0 06:26:02 DEBUG| [stdout] [ 7262.971804] LR [c063e968] fuse_request_end+0x128/0x2f0 06:26:02 DEBUG| [stdout] [ 7262.971893] Call Trace: 06:26:02 DEBUG| [stdout] [ 7262.971930] [c001d8393aa0] [c063e944] fuse_request_end+0x104/0x2f0 (unreliable) 06:26:02 DEBUG| [stdout] [ 7262.972035] [c001d8393af0] [c06427cc] fuse_dev_do_write+0x2cc/0x5c0 06:26:02 DEBUG| [stdout] [ 7262.972138] [c001d8393b70] [c0642f64] fuse_dev_write+0x74/0xd0 06:26:02 DEBUG| [stdout] [ 7262.972221] [c001d8393c00] [c04702b0] do_iter_readv_writev+0x240/0x290 06:26:02 DEBUG| [stdout] [ 7262.972334] [c001d8393c70] [c0472bc8] do_iter_write+0xc8/0x280 06:26:02 DEBUG| [stdout] [ 7262.972424] [c001d8393cc0] [c0472e90] vfs_writev+0xe0/0x180 06:26:02 DEBUG| [stdout] [ 7262.972508] [c001d8393dc0] [c0472fcc] do_writev+0x9c/0x1a0 06:26:02 DEBUG| [stdout] [ 7262.972588] [c001d8393e20] [c000b278] system_call+0x5c/0x68 06:26:02 DEBUG| [stdout] [ 7262.972661] Instruction dump: 06:26:02 DEBUG| [stdout] [ 7262.972716] 06:26:02 DEBUG| [stdout] [ 7262.972815] 06:26:02 DEBUG| [stdout] [ 7262.972919] ---[ end trace 5852d488fba4a06e ]--- 
06:26:02 DEBUG| [stdout] 06:26:02 DEBUG| [stdout] ** Affects: linux (Ubuntu) Importance: High Assignee: Colin Ian King (colin-king) Status: In Progress ** Changed in: linux (Ubuntu
[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle
Hi Dean,

I've prepared another debug test kernel with 70+ of the drm patches removed that were introduced between the 5.3.0-19 and 5.3.0-23 kernels. If this stops the fan spinning, it implies the regression was introduced by a drm graphics patch.

Updated revision r2 Debian packages can be found here for testing:

https://kernel.ubuntu.com/~cking/lp-1853044/

Please test and let me know the outcome.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853044

Title: 5.3.0-23-generic causes fans to spin when idle

Status in linux package in Ubuntu: Confirmed

Bug description:
After upgrading to 5.3.0-23-generic the fans in my machine don't stop running. They always sound like something is utilizing CPU - even with no applications running after boot. If I boot back to 5.3.0-19-generic it's fine.

My microcode version is reported as 0xd4 and iucode-tool reports:
iucode-tool: system has processor(s) with signature 0x000506e3

Let me know if you need anything else.

ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: linux-image-5.3.0-23-generic 5.3.0-23.25
ProcVersionSignature: Ubuntu 5.3.0-23.25-generic 5.3.7
Uname: Linux 5.3.0-23-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu8.2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC2: dean 2898 F pulseaudio
 /dev/snd/pcmC2D0p: dean 2898 F...m pulseaudio
 /dev/snd/controlC0: dean 2898 F pulseaudio
 /dev/snd/controlC1: dean 2898 F pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Mon Nov 18 13:03:34 2019
HibernationDevice: RESUME=UUID=55a42c82-50bf-4e75-a133-dbd3aa93611b
InstallationDate: Installed on 2018-07-24 (482 days ago)
InstallationMedia: Ubuntu 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180724)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.3.0-23-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-5.3.0-23-generic N/A
 linux-backports-modules-5.3.0-23-generic N/A
 linux-firmware 1.183.2
SourcePackage: linux
UpgradeStatus: Upgraded to eoan on 2019-07-19 (121 days ago)
dmi.bios.date: 05/16/2018
dmi.bios.vendor: Intel Corp.
dmi.bios.version: KYSKLi70.86A.0055.2018.0516.1629
dmi.board.name: NUC6i7KYB
dmi.board.vendor: Intel Corporation
dmi.board.version: H90766-406
dmi.chassis.type: 3
dmi.chassis.vendor: Intel Corporation
dmi.chassis.version: 1.0
dmi.modalias: dmi:bvnIntelCorp.:bvrKYSKLi70.86A.0055.2018.0516.1629:bd05/16/2018:svn:pn:pvr:rvnIntelCorporation:rnNUC6i7KYB:rvrH90766-406:cvnIntelCorporation:ct3:cvr1.0:

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853044/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle
CPU averages:
5.3.0-19: 2.03W, 99.6% idle, 0.1% in kernel, 93.5% in C10 state, 5.3% in C8 state, 1.66GHz
5.3.0-23: 13.71W, 99.3% idle, 0.1% in kernel, 92.3% in C10 state, 6.1% in C8 state, 2.05GHz

GPU averages:
5.3.0-19: 0.10W
5.3.0-23: 7.19W

ACPI thermal zone:
5.3.0-19: 38.92 C
5.3.0-23: 68.65 C

So, there is not much difference in CPU loading or in C10/C8 residency, but the CPU is clocked faster on the -23 kernel and consumes 11.7W more power. The GPU is also consuming far more power on the -23 kernel. The ACPI thermal zone is ~30 degrees hotter, hence the fan activity.

Given that the kernel changes I provided made no difference, this looks like an i915 regression somehow. I'll see what has changed there.
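
The deltas quoted in the CPU, GPU and thermal comparison above can be re-derived from the reported averages. The following is a minimal sketch using awk; the function name `delta` is a hypothetical helper introduced only for illustration, and the input figures are the ones reported in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the difference between two measurements
# (new minus old) to two decimal places.
delta() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new - old }'
}

delta 2.03 13.71     # CPU power, W:          prints 11.68
delta 0.10 7.19      # GPU power, W:          prints 7.09
delta 38.92 68.65    # ACPI thermal zone, C:  prints 29.73
```

The results are consistent with the rounded figures in the comment: roughly 11.7W more CPU power and a thermal zone about 30 degrees hotter on the -23 kernel.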
[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle
@Dean, just one sanity check, do you have non-integer icon scaling on your desktop?
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
** Also affects: linux (Ubuntu Focal)
   Importance: Critical
   Assignee: Colin Ian King (colin-king)
   Status: Confirmed

** Also affects: linux-hwe (Ubuntu Focal)
   Importance: Undecided
   Status: Invalid

** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: linux-hwe (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Also affects: linux-hwe (Ubuntu Disco)
   Importance: Undecided
   Status: New

** No longer affects: linux-hwe (Ubuntu Focal)

** No longer affects: linux-hwe (Ubuntu Eoan)

** No longer affects: linux-hwe (Ubuntu Disco)

** Changed in: linux (Ubuntu Focal)
   Status: Confirmed => In Progress

** Changed in: linux-hwe (Ubuntu Bionic)
   Status: Confirmed => In Progress

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

Status in linux package in Ubuntu: In Progress
Status in linux-hwe package in Ubuntu: Invalid
Status in linux-hwe source package in Bionic: In Progress
Status in linux source package in Disco: New
Status in linux source package in Eoan: New
Status in linux source package in Focal: In Progress

Bug description:
== SRU Justification Disco, Eoan, Focal ==

Multiple squashfs filesystems with overlayfs cause file corruption issues when modifying zero sized files

== Fix ==

The current fix is pending in
https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a

== Test case ==

With an Ubuntu ISO on the cdrom drive, use:

  #!/bin/bash -x
  mkdir -p /cdrom
  mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom
  sleep 1
  mkdir -p /cow
  mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow
  sleep 1
  mkdir -p /cow/upper
  mkdir -p /cow/work
  modprobe -q -b overlay
  sleep 1
  modprobe -q -b loop
  sleep 1
  dev=$(losetup -f)
  mkdir -p /filesystem.squashfs
  losetup $dev /cdrom/casper/filesystem.squashfs
  mount -t squashfs -o ro,noatime $dev /filesystem.squashfs
  sleep 1

  dev=$(losetup -f)
  mkdir -p /installer.squashfs
  losetup $dev /cdrom/casper/installer.squashfs
  mount -t squashfs -o ro,noatime $dev /installer.squashfs
  sleep 1

  mkdir -p /root-tmp
  mount -t overlay -o 'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' /cow /root-tmp

  FILE=/root-tmp/etc/.pwd.lock

  echo foo > $FILE
  cat $FILE
  sync
  #
  # dropping caches or remounting causes the bug
  #
  echo 3 > /proc/sys/vm/drop_caches
  cat $FILE

Without the fix the cat of the file will produce an error. With the fix the cat will work correctly.

== Regression Potential ==

There is an unhandled corner case:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B

However, since this is already an issue without the fix, and will be addressed by subsequent fixes once they are accepted upstream, I think the risk is minimal, considering nobody is complaining about these corner cases with the current broken overlayfs squashfs layering.

---

1) Download focal subiquity pending image, or eoan release image

2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)

3) After --- insert the following options
   break=top debug init=/bin/bash

4) Continue boot (Enter in BIOS, ctrl+x in UEFI)

5) in the initramfs execute:
   rm /scripts/casper-bottom/25adduser
   exit

6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one

7) /run/initramfs/ will contain a debug log, showing how everything was mounted. I.e. cdrom mounted, squashfs losetup from there, then multilower overlay setup from them, moved to /root, and then pivot-root to /root done to finally end up as /. Underlying layers are moved into /cow for your convenience.

8) At this point modifying zero-byte length files that exist in the lowest layer, but not the middle one, in certain ways will result in them being corrupted after / is remounted.

9) Corruption examples

(On both focal & eoan)
   cat /etc/.pwd.lock
   systemd-sysusers
   cat /etc/.pwd.lock
   mount -o remount /
   cat /etc/.pwd.lock
   overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
   cat: /etc/.pwd.lock: Input/output error

(Only on eoan)
   cat /etc/machine-id
   systemd-machine-id-setup
   cat /etc/machine-id
   mount -o remount /
   cat /etc/machine-id
   overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
   cat: /etc/machine-id: Input/output error

Lots of things break once machine-id and .pwd.lock are corrupted. I.e. unable to dhcp, connect to dbus, a
[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD
Thanks Witold! Much appreciated.

** Tags added: verification-done verification-done-eoan

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1852406

Title: Double-escape in initramfs DECRYPT_CMD

Status in zfs-linux package in Ubuntu: Fix Released
Status in zfs-linux source package in Eoan: Fix Committed
Status in zfs-linux source package in Focal: Fix Released

Bug description:
== SRU Justification, Eoan ==

initramfs/scripts/zfs.in incorrectly quotes ${ENCRYPTIONROOT} on line 414:

DECRYPT_CMD="${ZFS} load-key '${ENCRYPTIONROOT}'"

This is OK when the line is executed by a shell, such as in line 430 or 436, but when plymouth is used it results in plymouth executing "zfs load-key 'rpool'" - and zfs is unable to find a pool called "'rpool'".

If I understand https://docs.oracle.com/cd/E23824_01/html/821-1448/gbcpt.html correctly, a zfs pool name is always 'shell-friendly', so removing the quotation marks would be a proper fix for that.

== Fix ==

One line fix as attached in https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1852406/comments/1

== Test ==

Boot with an encrypted dataset with plymouth. Without the fix zfs is unable to find the root encrypted pool. With the fix this works.

== Regression Potential ==

This just affects the encrypted dataset that holds the key for the root dataset; currently this is causing issues because of the bug, so the benefit of the fix outweighs its risk, given that this is currently broken.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1852406/+subscriptions
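
The quoting problem described in the SRU justification above can be reproduced outside the initramfs. The following is a minimal sketch, assuming a consumer that splits the command string on whitespace without interpreting quotes (the behaviour the report attributes to the plymouth path). `show_args` is a hypothetical stand-in for `zfs`, `rpool` is just the pool name used in the report, and no real zfs or plymouth command is run.

```shell
#!/bin/sh
# Hypothetical stand-in for zfs: print every argument it receives in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]' "$arg"
    done
    printf '\n'
}

ENCRYPTIONROOT=rpool
# Same shape as the DECRYPT_CMD line quoted in the report.
DECRYPT_CMD="show_args load-key '${ENCRYPTIONROOT}'"

# Case 1: the string is re-parsed by a shell, so the single quotes are
# treated as syntax and removed.
eval "${DECRYPT_CMD}"        # prints [load-key][rpool]

# Case 2: the string is only split on whitespace (unquoted expansion, no
# quote processing), so the quotes become part of the argument.
${DECRYPT_CMD}               # prints [load-key]['rpool']
```

In the second case the command receives the literal argument `'rpool'`, quotes included, which matches the failure described in the report: a pool with that name does not exist.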
[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD
I was hoping you could test the version in -proposed. Unless it is verified as fixed, the fix won't be released for Eoan.
[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD
@Witold, is it possible for you to sanity check this? If it's not verified, it won't be fixed.

Thanks,
Colin
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
** Description changed: + == SRU Justification Disco, Eoan, Focal == + + Multiple squashfs filesystems with overlayfs cause file corruption issues + when modifying zero sized files + + == Fix == + + The current fix is pending in + https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a + + == Test case == + + With an Ubuntu ISO on the cdrom drive, use: + + #!/bin/bash -x + mkdir -p /cdrom + mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom + sleep 1 + mkdir -p /cow + mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow + sleep 1 + mkdir -p /cow/upper + mkdir -p /cow/work + modprobe -q -b overlay + sleep 1 + modprobe -q -b loop + sleep 1 + dev=$(losetup -f) + mkdir -p /filesystem.squashfs + losetup $dev /cdrom/casper/filesystem.squashfs + mount -t squashfs -o ro,noatime $dev /filesystem.squashfs + sleep 1 + + dev=$(losetup -f) + mkdir -p /installer.squashfs + losetup $dev /cdrom/casper/installer.squashfs + mount -t squashfs -o ro,noatime $dev /installer.squashfs + sleep 1 + + mkdir -p /root-tmp + mount -t overlay -o 'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' /cow /root-tmp + + FILE=/root-tmp/etc/.pwd.lock + + echo foo > $FILE + cat $FILE + sync + # + # dropping caches or remounting causes the bug + # + echo 3 > /proc/sys/vm/drop_caches + cat $FILE + + Without the fix the cat of the file will produce an error. With the the + cat will work correctly. + + == Regression Potential == + + There is an unhandled corner case: + - two filesystems, A and B, both have null uuid + - upper layer is on A + - lower layer 1 is also on A + - lower layer 2 is on B + + However, since this is an issue without the fix and will be addressed + later with subsequent fixes once they are OK with upstream I think the + risk is minimal considering nobody is complaining about these corner + cases with the current broken overlayfs squashfs layering. 
+ + --- + 1) Download focal subiquity pending image, or eoan release image 2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI) 3) After --- insert the following options break=top debug init=/bin/bash 4) Continue boot (Enter in BIOS, ctrl+x in UEFI) 5) in the initramfs execute: rm /scripts/casper-bottom/25adduser exit 6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one 7) /run/initramfs/ will contain a debug log, showing how everything was mounted. Ie. cdrom mounted, squashfs losetup from there, then multilower overlay setup from them, moved to /root, and then pivot-root to /root done to finally end up as /. Underlying layers are moved into /cow for your convenience. 8) At this point modifying zero-byte length files, that exist in the lowest layer, but not the middle one, in certain ways, will results in them to be corrupted, after / is remounted. 9) Corruption examples (On both focal & eoan) cat /etc/.pwd.lock systemd-sysusers cat /etc/.pwd.lock mount -o remount / cat /etc/.pwd.lock overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000) cat: /etc/.pwd.lock: Input/output error (Only on eoan) cat /etc/machine-id systemd-machine-id-setup cat /etc/machine-id mount -o remount / cat /etc/machine-id overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000) cat: /etc/machine-id: Input/output error Lots of things break once machine-id and .pwd.lock are corrupted. I.e. unable to dhcp, connect to dbus, add/remove/change users or groups, etc. We were unable to recreate the issue outside of booting things with casper. Ie. statically on a regular host machine without pivot-root. But hopefully booting to a quite state with nothing running is sufficient to reproduce this. 
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service` this will complete the boot, with /etc/machine-id & .pwd.lock modified, meaning that remount of / will cause IO errors on those files. Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. ** Description changed: == SRU Justification Disco, Eoan, Focal == Multiple squashfs filesystems with overlayfs cause file corruption issues - when modifying zero sized files + when modifying zero sized files == Fix == The current fix is pending in https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a == Test case == With an Ubuntu ISO on the cdrom drive, use: #!/bin/bash -x mkdir -p /cdrom mount -t iso9660 -o ro,noatime /dev/sr0
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
I'm doing some testing right now on the current upstream fix, hopefully will SRU this by EOD. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1824407 Title: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files Status in linux package in Ubuntu: Confirmed Status in linux-hwe package in Ubuntu: Invalid Status in linux-hwe source package in Bionic: Confirmed To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824407/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
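A note for readers decoding the `overlayfs: invalid origin` message quoted in this bug: the two `ftype` values look like the file-type bits of an inode mode printed in hexadecimal. That reading is an assumption based on the standard `S_IFMT` constants, not something stated in the report. Under it, `8000` is a regular file and `4000` is a directory, so the modified regular file appears to have an origin that resolves to a directory. A minimal sketch of the mapping follows; `decode_ftype` is a hypothetical helper, not part of any existing tool.

```shell
#!/bin/sh
# decode_ftype: hypothetical helper that maps the hexadecimal S_IFMT bits
# of an inode mode (as they seem to appear in the overlayfs message) to a
# human-readable file type name.
decode_ftype() {
    case "$1" in
        1000) echo "fifo" ;;
        2000) echo "character device" ;;
        4000) echo "directory" ;;
        6000) echo "block device" ;;
        8000) echo "regular file" ;;
        a000|A000) echo "symbolic link" ;;
        c000|C000) echo "socket" ;;
        *) echo "unknown" ;;
    esac
}

# The two values from the reported message.
echo "ftype=8000 -> $(decode_ftype 8000)"
echo "origin ftype=4000 -> $(decode_ftype 4000)"
```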
[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle
I've found 3 possible commits that may have contributed to this regression. Can you install the kernel headers, image and module debs in https://kernel.ubuntu.com/~cking/lp-1853044/ and see if this helps fix the issue. ** Changed in: linux (Ubuntu) Status: In Progress => Incomplete -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1853044 Title: 5.3.0-23-generic causes fans to spin when idle Status in linux package in Ubuntu: Incomplete Bug description: After upgrading to 5.3.0-23-generic the fans in my machine don't stop running. They always sound like something is utilizing CPU - even with no applications running after boot. If I boot back to 5.3.0-19-generic it's fine. My microcode version is reported as 0xd4 and iucode-tool reports: iucode-tool: system has processor(s) with signature 0x000506e3 Let me know if you need anything else. ProblemType: Bug DistroRelease: Ubuntu 19.10 Package: linux-image-5.3.0-23-generic 5.3.0-23.25 ProcVersionSignature: Ubuntu 5.3.0-23.25-generic 5.3.7 Uname: Linux 5.3.0-23-generic x86_64 NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair ApportVersion: 2.20.11-0ubuntu8.2 Architecture: amd64 AudioDevicesInUse: USER PID ACCESS COMMAND /dev/snd/controlC2: dean 2898 F pulseaudio /dev/snd/pcmC2D0p: dean 2898 F...m pulseaudio /dev/snd/controlC0: dean 2898 F pulseaudio /dev/snd/controlC1: dean 2898 F pulseaudio CurrentDesktop: ubuntu:GNOME Date: Mon Nov 18 13:03:34 2019 HibernationDevice: RESUME=UUID=55a42c82-50bf-4e75-a133-dbd3aa93611b InstallationDate: Installed on 2018-07-24 (482 days ago) InstallationMedia: Ubuntu 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180724) ProcEnviron: TERM=xterm PATH=(custom, no user) XDG_RUNTIME_DIR= LANG=en_US.UTF-8 SHELL=/bin/bash ProcFB: 0 i915drmfb ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.3.0-23-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7 RelatedPackageVersions: 
linux-restricted-modules-5.3.0-23-generic N/A linux-backports-modules-5.3.0-23-generic N/A linux-firmware 1.183.2 SourcePackage: linux UpgradeStatus: Upgraded to eoan on 2019-07-19 (121 days ago) dmi.bios.date: 05/16/2018 dmi.bios.vendor: Intel Corp. dmi.bios.version: KYSKLi70.86A.0055.2018.0516.1629 dmi.board.name: NUC6i7KYB dmi.board.vendor: Intel Corporation dmi.board.version: H90766-406 dmi.chassis.type: 3 dmi.chassis.vendor: Intel Corporation dmi.chassis.version: 1.0 dmi.modalias: dmi:bvnIntelCorp.:bvrKYSKLi70.86A.0055.2018.0516.1629:bd05/16/2018:svn:pn:pvr:rvnIntelCorporation:rnNUC6i7KYB:rvrH90766-406:cvnIntelCorporation:ct3:cvr1.0: To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853044/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle
Also, when the fan is running at high speed can you do the following: sudo apt-get install acpi acpi -V and add the output to the bug report -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1853044 Title: 5.3.0-23-generic causes fans to spin when idle Status in linux package in Ubuntu: In Progress
[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle
Hi Dean, As a first triaging step, with the 5.3.0-23-generic and also the 5.3.0-19-generic kernel do you mind installing and running the following command: powerstat -Ra | tee powerstat-$(uname -r).log and attaching the log files to the bug report. The command takes about 60 seconds to run. thanks. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1853044 Title: 5.3.0-23-generic causes fans to spin when idle Status in linux package in Ubuntu: In Progress
[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle
** Changed in: linux (Ubuntu) Importance: Undecided => High ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Colin Ian King (colin-king) ** Changed in: linux (Ubuntu) Status: Confirmed => In Progress -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1853044 Title: 5.3.0-23-generic causes fans to spin when idle Status in linux package in Ubuntu: In Progress
[Kernel-packages] [Bug 1853130] Re: zfs-dkms-0.7.5-1ubuntu15 fail build kernel-hwe kernel 5.0.0-36-generic
No problem at all. I'll close this bug if that's OK. ** Changed in: zfs-linux (Ubuntu) Status: Triaged => Won't Fix -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu. https://bugs.launchpad.net/bugs/1853130 Title: zfs-dkms-0.7.5-1ubuntu15 fail build kernel-hwe kernel 5.0.0-36-generic Status in zfs-linux package in Ubuntu: Won't Fix Bug description: zfs-dkms- Fail build with kernel 5.0.0-36-generic cripts/Makefile.build:284: recipe for target '/var/lib/dkms/spl/0.7.5/build/module/spl/spl-condvar.o' failed make[5]: *** [/var/lib/dkms/spl/0.7.5/build/module/spl/spl-condvar.o] Error 1 make[5]: *** Waiting for unfinished jobs In file included from /var/lib/dkms/spl/0.7.5/build/include/sys/kstat.h:31:0, from /var/lib/dkms/spl/0.7.5/build/module/spl/spl-kstat.c:28: /var/lib/dkms/spl/0.7.5/build/include/sys/time.h: In function ‘gethrestime’: /var/lib/dkms/spl/0.7.5/build/include/sys/time.h:63:9: error: implicit declaration of function ‘current_kernel_time’; did you mean ‘current_time’? 
[-Werror=implicit-function-declaration] *now = current_kernel_time(); ^~~ current_time /var/lib/dkms/spl/0.7.5/build/include/sys/time.h:63:7: error: incompatible types when assigning to type ‘timestruc_t {aka struct timespec}’ from type ‘int’ *now = current_kernel_time(); ^ /var/lib/dkms/spl/0.7.5/build/include/sys/time.h: In function ‘gethrestime_sec’: /var/lib/dkms/spl/0.7.5/build/include/sys/time.h:70:5: error: incompatible types when assigning to type ‘struct timespec’ from type ‘int’ ts = current_kernel_time(); ^ CC [M] /var/lib/dkms/spl/0.7.5/build/module/splat/splat-linux.o cc1: some warnings being treated as errors scripts/Makefile.build:284: recipe for target '/var/lib/dkms/spl/0.7.5/build/module/spl/spl-kstat.o' failed make[5]: *** [/var/lib/dkms/spl/0.7.5/build/module/spl/spl-kstat.o] Error 1 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1853130/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1853130] Re: zfs-dkms-0.7.5-1ubuntu15 fail build kernel-hwe kernel 5.0.0-36-generic
The hwe (and bionic) kernels come with zfs and spl modules already provided, so there is no need for zfs-dkms and spl-dkms, e.g. cking@bionic-amd64:~$ uname -a Linux bionic-amd64 5.0.0-36-generic #39~18.04.1-Ubuntu SMP Tue Nov 12 11:09:50 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux cking@bionic-amd64:~$ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 18.04.3 LTS Release: 18.04 Codename: bionic cking@bionic-amd64:~$ dpkg -l | grep zfs-dkms cking@bionic-amd64:~$ dmesg | grep ZFS [ 16.209504] ZFS: Loaded module v0.7.12-1ubuntu5, ZFS pool version 5000, ZFS filesystem version 5 cking@bionic-amd64:~$ lsmod | grep zfs zfs 3035136 8 zunicode 331776 1 zfs zavl 16384 1 zfs icp 258048 1 zfs zcommon 65536 1 zfs znvpair 77824 2 zfs,zcommon spl 102400 4 zfs,icp,znvpair,zcommon so just remove zfs-dkms and you can still use zfs on the HWE 5.0.x kernel on Bionic. ** Changed in: zfs-linux (Ubuntu) Importance: Undecided => Wishlist ** Changed in: zfs-linux (Ubuntu) Assignee: (unassigned) => Colin Ian King (colin-king) ** Changed in: zfs-linux (Ubuntu) Status: New => Triaged -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu. 
https://bugs.launchpad.net/bugs/1853130 Title: zfs-dkms-0.7.5-1ubuntu15 fail build kernel-hwe kernel 5.0.0-36-generic Status in zfs-linux package in Ubuntu: Triaged
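The checks in the reply above (`dpkg -l | grep zfs-dkms`, `lsmod | grep zfs`) can be wrapped into a small script. This is only a sketch under assumptions: `zfs_module_loaded` is a hypothetical helper name, and the script reports state without removing anything. It matches the module name exactly, so dependent modules such as `zunicode` that merely list `zfs` in their "used by" column do not count.

```shell
#!/bin/sh
# zfs_module_loaded: hypothetical helper. Reads lsmod-style text on stdin
# and succeeds only if a module named exactly "zfs" is listed in column 1.
zfs_module_loaded() {
    awk '$1 == "zfs" { found = 1 } END { exit !found }'
}

# Live check of the running system, skipped when lsmod is unavailable.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | zfs_module_loaded; then
        echo "zfs module is loaded"
    else
        echo "zfs module is not loaded"
    fi
fi

# Report whether the zfs-dkms package is installed (Debian/Ubuntu only).
if command -v dpkg-query >/dev/null 2>&1; then
    if dpkg-query -W -f='${Status}' zfs-dkms 2>/dev/null |
        grep -q '^install ok installed'; then
        echo "zfs-dkms is installed"
    else
        echo "zfs-dkms is not installed"
    fi
fi
```

If the module is loaded while `zfs-dkms` is not installed, the system is in the state shown in the reply: ZFS is provided by the kernel package itself.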
[Kernel-packages] [Bug 1771091] Re: zpool freezes importing older ZFS pool, blocks shotdown and system does not boot
This bug has not been updated with further information requested in question 4 for over a year. Marking as Won't Fix. ** Changed in: zfs-linux (Ubuntu) Status: Incomplete => Won't Fix -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu. https://bugs.launchpad.net/bugs/1771091 Title: zpool freezes importing older ZFS pool, blocks shotdown and system does not boot Status in zfs-linux package in Ubuntu: Won't Fix Bug description: After a fresh install of xubuntu 18.04 LTS 64-bit, and the installation of zfs-dkms, I tried to do 'zpool import' on an older ZFS pool, consisting of one partition on a separate PATA HDD. After issuing 'sudo zpool import ', the command freezes (as do other zfs commands). The system then fails to shut down properly, seems locked and needs a hard reboot (actually it waits up to half an hour to shut down). After restarting, the system displays the Xubuntu splash screen and does not boot anymore (it actually resets itself if given another half an hour or so). After getting to the rescue options by pressing the SHIFT key, going to a shell and remounting / read-write, I could remove the ZFS Ubuntu packages, and after that the system could boot. A useful message I got when trying to continue booting in the shell was: "[ 40.811792] VERIFY3(0 == remove_reference(hdr, ((void *)0), tag)) failed (0 = 0) [ 40.811856] PANIC at arc.c:3084:arc_buf_destroy()" So it points to some ZFS bug with ARC. Previously, I was able to (unlike with 17.10) upgrade from 17.10 to 18.04 and to import and use a newer ZFS pool. But this bug is about a fresh 18.04 install and an older ZFS pool. 
(zpool import says pool can be upgraded) To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1771091/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1846486] Re: revert the revert of ext4: make __ext4_get_inode_loc plug
** Changed in: linux (Ubuntu) Status: In Progress => Fix Committed ** Changed in: linux (Ubuntu) Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1846486 Title: revert the revert of ext4: make __ext4_get_inode_loc plug Status in linux package in Ubuntu: Fix Released Status in linux source package in Eoan: Fix Released Bug description: == SRU Justification Eoan == Now that 5.4 contains a fix for the bootup regression caused by the lack of entropy at boot, we should apply this fix and also revert the revert of commit "Revert "ext4: make __ext4_get_inode_loc plug"". == Fix == So, to clarify, apply the two upstream 5.4-rc commits: commit 50ee7529ec4500c88f8664560770a7a1b65db72b Author: Linus Torvalds Date: Sat Sep 28 16:53:52 2019 -0700 random: try to actively add entropy rather than passively wait for it commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13 Author: Linus Torvalds Date: Sun Sep 29 17:59:23 2019 -0700 Revert "Revert "ext4: make __ext4_get_inode_loc plug"" I've benchmarked the Eoan kernel with these two patches and found the following speed improvements on an i7-3770 CPU @ 3.40GHz with 8GB and a WDC WD10EZEX-21WN4A HDD (7200RPM, 64MB cache). git grep of the kernel: 0.14% building fwts: 0.40% building stress-ng: 0.45% tar up kernel source: 7.6% boot time of eoan cloud image: 10.5% So I think the speed improvements justify the SRU. == Regression potential == Minor change to ext4, which has been regression tested, so the risk here is small. The entropy change will alter the random number generation, but I believe this does not change the cryptographic security of the random numbers being generated, so I think this change is not a security risk. Originally the ext4 change caused boot-time user space regressions because of the entropy change of this fix, but the random fix addresses this, so I believe this risk is now zero. 
[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD
Yes, minimal impact and reducing regression risk is key in SRUs. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu. https://bugs.launchpad.net/bugs/1852406 Title: Double-escape in initramfs DECRYPT_CMD Status in zfs-linux package in Ubuntu: Fix Released Status in zfs-linux source package in Eoan: In Progress Status in zfs-linux source package in Focal: Fix Released Bug description: == SRU Justification, Eoan == initramfs/scripts/zfs.in incorrectly quotes ${ENCRYPTIONROOT} on line 414: DECRYPT_CMD="${ZFS} load-key '${ENCRYPTIONROOT}'" This is OK when the line is executed by shell, such as in line 430 or 436, but when plymouth is used it results in plymouth executing "zfs load-key 'rpool'" - and zfs is unable to find pool called "'rpool'". If I understand https://docs.oracle.com/cd/E23824_01/html/821-1448/gbcpt.html correctly zfs pool name is always 'shell-friendly', so removing the quotation marks would be a proper fix for that. == Fix == One line fix as attached in https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1852406/comments/1 == Test == Boot with encrypted data set with plymouth. Without the fix zfs is unable to find the root encrypted pool. With the fix this works. == Regression Potential == This just affects the encrypted dataset that holds the key for the root dataset; this is currently broken because of the bug, so the benefit of the fix outweighs the risk of regression. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1852406/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
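The quoting problem described in this bug can be reproduced without ZFS or plymouth. The sketch below assumes, as the report describes, that plymouth splits the command string into words without shell quote removal; it imitates that with an unquoted variable expansion, which also splits on whitespace but does not re-parse quotes. `show_args` is a hypothetical stand-in for the real `zfs` binary that simply prints each argument it receives in brackets.

```shell
#!/bin/sh
# show_args: hypothetical stand-in for `zfs`; prints one argument per line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

ENCRYPTIONROOT=rpool

# Same shape as the DECRYPT_CMD line quoted in the bug report.
DECRYPT_CMD="show_args load-key '${ENCRYPTIONROOT}'"

# Case 1: the string is re-parsed by the shell, so the single quotes are
# treated as syntax and removed before the command runs.
via_eval=$(eval "${DECRYPT_CMD}")

# Case 2: the string is only split on whitespace (no re-parsing), so the
# single quotes stay part of the argument, as in the plymouth case.
via_split=$(${DECRYPT_CMD})

printf 're-parsed by shell:\n%s\n' "$via_eval"
printf 'split on whitespace only:\n%s\n' "$via_split"
```

In the second case the pool name arrives as `'rpool'`, quotes included, which matches the failure in the report and is why dropping the inner quotes fixes the plymouth path.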
[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD
** Description changed: + == SRU Justification, Eoan == + initramfs/scripts/zfs.in incorrectly quotes ${ENCRYPTIONROOT} on line 414: DECRYPT_CMD="${ZFS} load-key '${ENCRYPTIONROOT}'" This is OK when the line is executed by shell, such as in line 430 or 436, but when plymouth is used it results in plymouth executing "zfs load-key 'rpool'" - and zfs is unable to find pool called "'rpool'". If I understand https://docs.oracle.com/cd/E23824_01/html/821-1448/gbcpt.html correctly zfs pool name is always 'shell-friendly', so removing the quotation marks would be a proper fix for that. + + == Fix == + + One line fix as attached in https://bugs.launchpad.net/ubuntu/+source + /zfs-linux/+bug/1852406/comments/1 + + == Test == + + Boot with encrypted data set with plymouth. Without the fix zfs is + unable to find the root encrypted pool. With the fix this works. + + == Regression Potential == + + This just affects the encrypted dataset that holds key for root dataset; + currently this is causing issues because of the bug, so the risk of the + fix outweighs the current situation where this is currently broken. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu. https://bugs.launchpad.net/bugs/1852406 Title: Double-escape in initramfs DECRYPT_CMD Status in zfs-linux package in Ubuntu: Fix Released Status in zfs-linux source package in Eoan: In Progress Status in zfs-linux source package in Focal: Fix Released
[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD
Fix required only in zfs-linux-0.8.1 in Eoan. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu. https://bugs.launchpad.net/bugs/1852406 Title: Double-escape in initramfs DECRYPT_CMD Status in zfs-linux package in Ubuntu: Fix Released Status in zfs-linux source package in Eoan: In Progress Status in zfs-linux source package in Focal: Fix Released
[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD
Fixed in zfs-0.8.2 in focal.

** Also affects: zfs-linux (Ubuntu Focal)
   Importance: Medium
   Assignee: Colin Ian King (colin-king)
   Status: Triaged

** Changed in: zfs-linux (Ubuntu Focal)
   Status: Triaged => Fix Released

** Also affects: zfs-linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Changed in: zfs-linux (Ubuntu Eoan)
   Status: New => In Progress

** Changed in: zfs-linux (Ubuntu Eoan)
   Importance: Undecided => Medium

** Changed in: zfs-linux (Ubuntu Eoan)
   Assignee: (unassigned) => Colin Ian King (colin-king)
[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD
Thanks for the patch. Any specific version of zfs-linux this relates to?

** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: zfs-linux (Ubuntu)
   Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: zfs-linux (Ubuntu)
   Status: New => Triaged
[Kernel-packages] [Bug 1851749] Re: Frequently getting thermal warnings and cpu throttling messages in syslog
** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu)
   Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: linux (Ubuntu)
   Status: Confirmed => In Progress

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1851749

Title:
  Frequently getting thermal warnings and cpu throttling messages in syslog

Status in linux package in Ubuntu:
  In Progress

Bug description:
  Nov 6 11:34:26 fog kernel: [1129655.443564] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 50300)
  Nov 6 11:34:26 fog kernel: [1129655.443565] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 50300)
  Nov 6 11:34:26 fog kernel: [1129655.443567] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 58637)
  Nov 6 11:34:26 fog kernel: [1129655.443568] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 58637)
  Nov 6 11:34:26 fog kernel: [1129655.443569] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 58637)
  Nov 6 11:34:26 fog kernel: [1129655.443570] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 58637)
  Nov 6 11:34:26 fog kernel: [1129655.446528] mce: CPU2: Core temperature/speed normal
  Nov 6 11:34:26 fog kernel: [1129655.446529] mce: CPU0: Core temperature/speed normal
  Nov 6 11:34:26 fog kernel: [1129655.446530] mce: CPU1: Package temperature/speed normal
  Nov 6 11:34:26 fog kernel: [1129655.446531] mce: CPU3: Package temperature/speed normal
  Nov 6 11:34:26 fog kernel: [1129655.446531] mce: CPU0: Package temperature/speed normal
  Nov 6 11:34:26 fog kernel: [1129655.446532] mce: CPU2: Package temperature/speed normal
  Nov 6 11:40:35 fog kernel: [1130024.427390] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 50316)
  Nov 6 11:40:35 fog kernel: [1130024.427391] mce: CPU2: Core temperature above threshold, cpu clock throttled (total event
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
pr_warn can be removed with a sauce patch, so no worries with that.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed

Bug description:
  1) Download the focal subiquity pending image, or the eoan release image
  2) Boot, press ESC and edit the boot command line (F6 in BIOS, e in UEFI)
  3) After --- insert the following options:
       break=top debug init=/bin/bash
  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) In the initramfs execute:
       rm /scripts/casper-bottom/25adduser
       exit
  6) You will be dropped into the pivoted root filesystem, before systemd is execed as pid one
  7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. cdrom mounted, squashfs losetup from there, then multilower overlay set up from them, moved to /root, and then pivot-root to /root done to finally end up as /. Underlying layers are moved into /cow for your convenience.
  8) At this point, modifying zero-byte length files that exist in the lowest layer but not the middle one, in certain ways, will result in them being corrupted after / is remounted.
  9) Corruption examples

  (On both focal & eoan)

    cat /etc/.pwd.lock
    systemd-sysusers
    cat /etc/.pwd.lock
    mount -o remount /
    cat /etc/.pwd.lock
    overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
    cat: /etc/.pwd.lock: Input/output error

  (Only on eoan)

    cat /etc/machine-id
    systemd-machine-id-setup
    cat /etc/machine-id
    mount -o remount /
    cat /etc/machine-id
    overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
    cat: /etc/machine-id: Input/output error

  Lots of things break once machine-id and .pwd.lock are corrupted, i.e. unable to dhcp, connect to dbus, add/remove/change users or groups, etc.

  We were unable to recreate the issue outside of booting things with casper, i.e. statically on a regular host machine without pivot-root. But hopefully booting to a quiet state with nothing running is sufficient to reproduce this.

  Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`. This will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.

  Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files and create them again on the upper rw layer. They then survive remount without I/O errors. However, we'd rather not ship those hacks, and would prefer to have kernel overlayfs fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824407/+subscriptions
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
been iterating on a fix with upstream: https://lkml.org/lkml/2019/11/7/317
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
When I'm more awake tomorrow I'll send a patch upstream as a suggested fix and see if we can get a good solution on the UUIDs worked out.
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
Just love the way launchpad mangles pasted code.
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
I was thinking of a more generalized overlayfs solution that detects if file systems don't initialize the superblock uuid and overlayfs improvises by generating the internal overlayfs uuid, something like:

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 698d112bdb17..da3faaf68d69 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -248,6 +248,7 @@ struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper)
 	void *buf;
 	int buflen = MAX_HANDLE_SZ;
 	uuid_t *uuid = &real->d_sb->s_uuid;
+	static const uuid_t z_uuid;
 
 	buf = kmalloc(buflen, GFP_KERNEL);
 	if (!buf)
@@ -289,7 +290,22 @@ struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper)
 	if (is_upper)
 		fh->flags |= OVL_FH_FLAG_PATH_UPPER;
 	fh->len = fh_len;
-	fh->uuid = *uuid;
+
+	if (uuid_equal(uuid, &z_uuid)) {
+		struct super_block *sb = real->d_sb;
+		u16 hash;
+
+		pr_warn("ovl_encode_real_fh: ZERO UUID, generating one from superblock\n");
+
+		memcpy(&fh->uuid.b[0], &sb->s_magic, 8);
+		memcpy(&fh->uuid.b[8], &sb->s_dev, 6);
+		hash = ((long)sb ^ (long)sb->s_fs_info) >> 12;
+		memcpy(&fh->uuid.b[14], &hash, 2);
+	} else {
+		fh->uuid = *uuid;
+	}
+
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
The concern I have is for other file systems that also don't populate the UUID - this seems to be a general problem for overlayfs. Perhaps a UUID can be autogenerated based on the superblock rather than file system specific UUID magic if the UUID is zero.
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
Adding a uuid into the superblock on squashfs seems to resolve the issue. Since squashfs does not have UUID support, my hack below generates one based on some squashfs superblock metadata that provides a good enough UUID for our purposes.

diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c
index effa638d6d85..cfb34a75feb6 100644
--- a/fs/squashfs/super.c
+++ b/fs/squashfs/super.c
@@ -186,6 +186,12 @@ static int squashfs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_flags |= SB_RDONLY;
 	sb->s_op = &squashfs_super_ops;
 
+	memcpy(&sb->s_uuid.b[0], &sblk->inodes, 4);
+	memcpy(&sb->s_uuid.b[4], &sblk->mkfs_time, 4);
+	memcpy(&sb->s_uuid.b[8], &sblk->fragments, 4);
+	memcpy(&sb->s_uuid.b[12], &sblk->compression, 2);
+	memcpy(&sb->s_uuid.b[14], &sblk->block_log, 2);
+
 	err = -ENOMEM;
 
 	msblk->block_cache = squashfs_cache_init("metadata",
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
Comparing the previous debug with 2 squashfs overlayfs lowers with the *same* data on ext4 as the 2 overlayfs lowers we have: [ 56.257691] repro-nosquashf (1038): drop_caches: 3 [ 56.265075] ovl_get_fh: 112 dentry: etc/.pwd.lock name: trusted.overlay.origin [ 56.265077] ovl_get_fh: 115: res = 29 [ 56.265079] ovl_get_fh: 152: return fh = b56cf4e7 [ 56.265079] ovl_check_origin: 413 fh = b56cf4e7 [ 56.265080] ovl_check_origin_fh: 354 upperdentry = etc/.pwd.lock [ 56.265081] ovl_decode_real_fh: 174 [ 56.265081] ovl_decode_real_fh: 181 uuid not equal, return NULL [ 56.265082] ovl_check_origin_fh: 363, i=0, origin = NULL [ 56.265082] ovl_decode_real_fh: 174 [ 56.266162] ovl_decode_real_fh: 211 return dentry (OK) [ 56.266163] ovl_check_origin_fh: 360, i=1, origin = / [ 56.266164] ovl_check_origin_fh: level=1 upper: etc/.pwd.lock 100600, lower: / 100600 [ 56.266166] ovl_check_origin_fh: 395 return 0 [ 56.266166] ovl_check_origin: 422 err = 0 [ 56.266167] ovl_check_origin: 439, return 0 ( So this works fine, note that the squashfs lower / is 40755 (S_IFDIR | S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH) where as the ext3 lower / is 100600 (S_IFREG | S_IRUSR | S_IWUSR) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. 
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

Status in linux package in Ubuntu: Confirmed
Status in linux-hwe package in Ubuntu: Invalid
Status in linux-hwe source package in Bionic: Confirmed

Bug description:
  1) Download focal subiquity pending image, or eoan release image
  2) Boot, press ESC and edit the boot command line (F6 in BIOS, e in UEFI)
  3) After --- insert the following options:
       break=top debug init=/bin/bash
  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) In the initramfs execute:
       rm /scripts/casper-bottom/25adduser
       exit
  6) You will be dropped into the pivoted root filesystem, before systemd is execed as pid one
  7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. cdrom mounted, squashfs losetup from there, then multilower overlay set up from them, moved to /root, and then pivot-root to /root done to finally end up as /. Underlying layers are moved into /cow for your convenience.
  8) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one, in certain ways, will result in them being corrupted after / is remounted.
  9) Corruption examples

  (On both focal & eoan)
    cat /etc/.pwd.lock
    systemd-sysusers
    cat /etc/.pwd.lock
    mount -o remount /
    cat /etc/.pwd.lock
    overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
    cat: /etc/.pwd.lock: Input/output error

  (Only on eoan)
    cat /etc/machine-id
    systemd-machine-id-setup
    cat /etc/machine-id
    mount -o remount /
    cat /etc/machine-id
    overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
    cat: /etc/machine-id: Input/output error

  Lots of things break once machine-id and .pwd.lock are corrupted, e.g. it becomes impossible to DHCP, connect to dbus, or add/remove/change users or groups.

  We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. But hopefully booting to a quiet state with nothing running is sufficient to reproduce this.

  Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.

  Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files and create them again on the upper rw layer. They then survive remount without I/O errors. However, we'd rather not ship those hacks, and would prefer to have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824407/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
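The `ftype=8000, origin ftype=4000` values in the error message appear to be the file-type bits of the two inodes printed in hex. This is an inference from the modes quoted in this thread (100600 for the upper file, 40755 for the squashfs lower /), not something the report states. Masking those modes with S_IFMT (0170000) gives 0x8000 (S_IFREG) and 0x4000 (S_IFDIR). A small sketch of that conversion, where the helper name `ftype_hex` is hypothetical:

```shell
#!/bin/sh
# Print the file-type bits (mode & S_IFMT) of an octal mode, in hex.
ftype_hex() {
    # The leading 0 makes the shell read the argument as an octal number.
    printf '%x\n' $(( 0$1 & 0170000 ))
}

ftype_hex 100600   # mode of the upper etc/.pwd.lock (regular file)
ftype_hex 40755    # mode of the squashfs lower / (directory)
```

Read this way, the message says that a regular file has been paired with a directory as its origin.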
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
4.15: ovl_get_origin_fh detects zero-sized files on lower paths and treats these as a special zero-sized "copied up but origin unknown" magic.

[ 25.442916] ovl_check_origin: etc/.pwd.lock 2
[ 25.442918] ovl_get_origin_fh: 104 etc/.pwd.lock
[ 25.442919] ovl_get_origin_fh: 107 res=0
[ 25.442920] ovl_get_origin_fh: 117 res == 0, return NULL
[ 25.442921] ovl_get_origin: 179 fh = (null) 1
[ 25.442922] ovl_get_origin_fh: 104 etc/.pwd.lock
[ 25.442922] ovl_get_origin_fh: 107 res=0
[ 25.442923] ovl_get_origin_fh: 117 res == 0, return NULL
[ 25.442923] ovl_get_origin: 179 fh = (null) 1

5.3: the lower is / and hence is a directory, hence the S_IFDIR origin returned.

[ 33.320630] ovl_get_fh: 112 dentry: etc/.pwd.lock name: trusted.overlay.origin
[ 33.320632] ovl_get_fh: 115: res = 29
[ 33.320634] ovl_get_fh: 152: return fh = 6e71855c
[ 33.320634] ovl_check_origin: 413 fh = 6e71855c
[ 33.320635] ovl_check_origin_fh: 354 upperdentry = etc/.pwd.lock
[ 33.320635] ovl_decode_real_fh: 174
[ 33.320769] ovl_decode_real_fh: 211 return dentry (OK)
[ 33.320769] ovl_check_origin_fh: 360, i=0, origin = /
[ 33.320770] ovl_check_origin_fh: level=0 upper: etc/.pwd.lock 100600, lower: / 40755
[ 33.320770] ovl_check_origin_fh: 380 goto invalid
[ 33.320771] overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000).
[ 33.320773] ovl_check_origin: 422 err = -5
[ 33.320774] ovl_check_origin: 429, return -5
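The 5.3 trace ends with -5 (-EIO) because the decoded origin is a directory while the upper file is a regular file. The comparison can be modelled in userspace as below. This is an illustration of the outcome seen in the trace, not the kernel code, and `check_origin_type` is a hypothetical name.

```shell
#!/bin/sh
# Userspace model of the file-type comparison seen in the trace.
# Arguments are octal modes; prints 0 when the file types match and
# -5 (-EIO) when they differ.
check_origin_type() {
    upper=$(( 0$1 & 0170000 ))
    origin=$(( 0$2 & 0170000 ))
    if [ "$upper" -eq "$origin" ]; then
        printf '%s\n' 0
    else
        printf '%s\n' -5
    fi
}

check_origin_type 100600 40755    # upper file vs. directory origin: mismatch
check_origin_type 100600 100600   # both regular files: match
```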
[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object
Installed the new zfsutils + zfs dkms to sanity check the kernel driver part of the fix:

dmesg | grep ZFS
[ 22.420188] ZFS: Loaded module v0.8.1-1ubuntu14.1, ZFS pool version 5000, ZFS filesystem version 5

And now the test:

root@eoan-amd64-uefi:~# mkdir /zfs-test
root@eoan-amd64-uefi:~# cd /zfs-test
root@eoan-amd64-uefi:/zfs-test# truncate -s 10G file.img
root@eoan-amd64-uefi:/zfs-test# zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
root@eoan-amd64-uefi:/zfs-test# zfs create tank/d1 -o encryption=on -o keyformat=passphrase
Enter passphrase:
Re-enter passphrase:
root@eoan-amd64-uefi:/zfs-test# dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0.319657 s, 131 MB/s
root@eoan-amd64-uefi:/zfs-test# zfs snapshot tank/d1@s1
root@eoan-amd64-uefi:/zfs-test# dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0.312195 s, 134 MB/s
root@eoan-amd64-uefi:/zfs-test# zfs diff tank/d1@s1 tank/d1
M /tank/d1/
+ /tank/d1/somedata2.bin

The zfsutils + dkms package has the fix. Once this lands we can then sync this into the next kernel release for the complete fix.

** Tags added: verification-done-eoan

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
  zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu: Fix Released
Status in zfs-linux source package in Eoan: Fix Committed

Bug description:
  == SRU Justification, Eoan ==

  Using zfs diff on an encrypted dataset with large objects, one can hit an error such as the following:

  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  + /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

  == Fix ==

  Upstream commit d359e99c38f667 ("diff_cb() does not handle large dnodes") as addressed in ZFS bug fix: https://github.com/zfsonlinux/zfs/pull/9343

  == Testcase ==

  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s

  Without the fix, one hits an error such as:

  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists

  With the fix, we get:

  + /tank/d1/somedata2.bin
  M /tank/d1/

  == Regression Potential ==

  This is a minor change in module/zfs/dmu_diff.c and it only affects the zfs diff component, so this should not affect ZFS in terms of file system corruption/data loss. This has also been upstream regression tested and passes the Ubuntu ZFS regression tests too. So the risk is limited.

  - Eoan 19.10 zfsutils-linux 0.8.1-1ubuntu14 kernel 5.3.0-19-generic #20-Ubuntu

  When using zfs diff on an encrypted dataset, I frequently encounter this error:

  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  + /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

  I believe this to be upstream bug https://github.com/zfsonlinux/zfs/issues/7678, fixed with https://github.com/zfsonlinux/zfs/pull/9343

  Here is one way to reproduce it:

  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s
  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists

  There may be a simpler way to test this, but this should be enough to start with.

To manage notifications about this
[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object
So this is a 2-phase fix. The dkms package is updated, then we test this, then this gets sync'd into the kernel. I'm testing it right now, let me sanity check the zfs-dkms part first and get that updated as step #1.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1849665/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
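When re-running the testcase from the bug description, the pass/fail decision comes down to whether `zfs diff` prints the "Unable to determine path or stats" error. A minimal sketch of that check follows; the helper name `zfs_diff_ok` is hypothetical, and redirecting stderr in the usage line is a precaution rather than something the report specifies.

```shell
#!/bin/sh
# Read `zfs diff` output on stdin and succeed only if the error message
# from the bug report is absent.
zfs_diff_ok() {
    ! grep -q 'Unable to determine path or stats for object'
}

# Example usage (requires the pool and snapshot from the testcase):
# zfs diff tank/d1@s1 tank/d1 2>&1 | zfs_diff_ok && echo PASS
```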
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
Replaced one of the two squashfs with read-only ext4 partitions and can't reproduce the error. Seems that we need 2 stacked squashfs file systems.
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
Replaced read-only squashfs with read-only ext4 partitions and can't reproduce the error.
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
And if we use a different file, /root-tmp/var/log/ubuntu-advantage.log, we get the following error too:

[ 24.531406] SQUASHFS error: squashfs_read_data failed to read block 0x89c066e0b540
[ 24.531444] SQUASHFS error: Unable to read metadata cache entry [89c066e0b540]
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
OK, I have now managed to get a reproducer script to trigger this bug even outside the early install context. It seems we can force this bug by either remounting OR sync'ing and dropping caches. Attached is the reproducer script. Run as root, we hit the error:

cat: /root-tmp/etc/.pwd.lock: Input/output error

dmesg:
[ 42.415432] overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000).

** Attachment added: "repro.sh"
   https://bugs.launchpad.net/ubuntu/bionic/+source/linux-hwe/+bug/1824407/+attachment/5302762/+files/repro.sh
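The alternative trigger described in this message (sync and drop caches instead of remounting) can be sketched as follows. This is not the attached repro.sh, whose contents are not reproduced in this thread; the function name `drop_caches_and_read` and its optional second argument are hypothetical. Writing 3 to /proc/sys/vm/drop_caches requires root.

```shell
#!/bin/sh
# Flush dirty data, drop the page cache plus dentries and inodes, then
# try to read the file again. Returns non-zero if the write to the
# drop_caches file or the subsequent read fails.
drop_caches_and_read() {
    file=$1
    # The second argument exists only so the control file can be overridden.
    drop=${2:-/proc/sys/vm/drop_caches}
    sync
    echo 3 > "$drop" || return 1
    cat -- "$file" > /dev/null
}

# Example usage (as root, path from the report):
# drop_caches_and_read /root-tmp/etc/.pwd.lock || dmesg | tail
```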
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
BTW, I can generate the mount move failure with the cut-down script below, which follows the same mount patterns as the casper script:

#!/bin/bash -x
mkdir -p /cdrom
mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom
sleep 1
mkdir -p /cow
mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow
sleep 1
mkdir -p /cow/upper
mkdir -p /cow/work
modprobe -q -b overlay
sleep 1
modprobe -q -b loop
sleep 1
dev=$(losetup -f)
mkdir -p /filesystem.squashfs
losetup $dev /cdrom/casper/filesystem.squashfs
mount -t squashfs -o ro,noatime $dev /filesystem.squashfs
sleep 1
dev=$(losetup -f)
mkdir -p /installer.squashfs
losetup $dev /cdrom/casper/installer.squashfs
mount -t squashfs -o ro,noatime $dev /installer.squashfs
sleep 1
mount -t overlay -o 'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' /cow /root
mkdir -p /root/rofs
mount -o move /filesystem.squashfs /root/rofs

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

Status in linux package in Ubuntu: Confirmed
Status in linux-hwe package in Ubuntu: Invalid
Status in linux-hwe source package in Bionic: Confirmed

Bug description:
1) Download focal subiquity pending image, or eoan release image

2) boot, and press ESC and edit boot command line (F6 in BIOS, e in UEFI)

3) After --- insert the following options
break=top debug init=/bin/bash

4) Continue boot (Enter in BIOS, ctrl+x in UEFI)

5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit

6) you will be dropped into the pivoted root filesystem, before systemd is execed as pid one

7) /run/initramfs/ will contain a debug log, showing how everything was mounted. I.e. cdrom mounted, squashfs losetup from there, then multilower overlay set up from them, moved to /root, and then pivot-root to /root done to finally end up as /. Underlying layers are moved into /cow for your convenience.

8) At this point, modifying zero-byte length files that exist in the lowest layer, but not the middle one, in certain ways will result in them being corrupted after / is remounted.

9) Corruption examples

(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
mount -o remount /
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error

(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
mount -o remount /
cat /etc/machine-id
overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
cat: /etc/machine-id: Input/output error

Lots of things break once machine-id and .pwd.lock are corrupted, e.g. it becomes impossible to dhcp, connect to dbus, add/remove/change users or groups, etc.

We were unable to recreate the issue outside of booting things with casper, i.e. statically on a regular host machine without pivot-root. But hopefully booting to a quiet state with nothing running is sufficient to reproduce this.

Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`. This will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.

Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files and create them again on the upper rw layer. They then survive remount without I/O errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /.
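For readers of the archive, the two ftype values in the "invalid origin" warning can be decoded with a small helper. This is only a sketch: it assumes the values are the hexadecimal file-type bits of the inode mode (S_IFREG is 0x8000, S_IFDIR is 0x4000), and the function names decode_ftype and explain_invalid_origin are made up for illustration.

```shell
#!/bin/sh
# Sketch only: decode the hex ftype values from an overlayfs
# "invalid origin" warning. Assumes they are the S_IFMT bits of the
# inode mode, printed in hexadecimal. Function names are hypothetical.

decode_ftype() {
    case "$1" in
        8000) echo "regular file" ;;        # S_IFREG, octal 0100000
        4000) echo "directory" ;;           # S_IFDIR, octal 0040000
        a000|A000) echo "symbolic link" ;;  # S_IFLNK, octal 0120000
        *) echo "other (0x$1)" ;;
    esac
}

explain_invalid_origin() {
    line=$1
    # Extract XXXX and YYYY from "ftype=XXXX, origin ftype=YYYY".
    upper=$(printf '%s\n' "$line" |
        sed -n 's/.*, ftype=\([0-9a-fA-F]*\), origin ftype=.*/\1/p')
    origin=$(printf '%s\n' "$line" |
        sed -n 's/.*origin ftype=\([0-9a-fA-F]*\).*/\1/p')
    echo "upper is a $(decode_ftype "$upper"), origin is a $(decode_ftype "$origin")"
}

# prints: upper is a regular file, origin is a directory
explain_invalid_origin 'overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)'
```

Read this way, the warning says the modified file in the upper layer is a regular file while the origin it was resolved to after the remount is a directory, which is consistent with the lookup landing on the wrong lower object.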
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
Hi Dimitri,

while debugging this I found the following in setup_unionfs() in scripts/casper:

    # move the first mount; no head in busybox-initramfs
    for d in $(mount -t squashfs | cut -d' ' -f 3); do
        mkdir -p "${rootmnt}/rofs"
        if [ "${UNIONFS}" = unionfs-fuse ]; then
            mount -o bind "${d}" "${rootmnt}/rofs"
        else
            mount -o move "${d}" "${rootmnt}/rofs"
        fi
        break
    done

and looking at the /run/initramfs/initramfs.debug debug log for the above stanza I see:

    + cut '-d ' -f 3
    + mount -t squashfs
    + mkdir -p /root/rofs
    + '[' overlay '=' unionfs-fuse ]
    + mount -o move /filesystem.squashfs /root/rofs
    + break

However, I cannot reproduce this mount -o move operation by hand, as I get the mount error:

    mount: /root/rofs: /filesystem.squashfs is not a block device.

It appears to me that scripts/casper silently ignores this failure. Should the mount be a bind mount instead?

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

Status in linux package in Ubuntu: Confirmed
Status in linux-hwe package in Ubuntu: Invalid
Status in linux-hwe source package in Bionic: Confirmed
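To make the "silently ignored" point concrete, here is a hedged sketch of how the loop body could report a failed move or bind instead of carrying on. It is not the casper code: the function name mount_rofs and the MOUNT variable (there only so the command can be stubbed out when trying this without root) are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only, not the casper implementation. MOUNT and mount_rofs are
# hypothetical names; MOUNT exists so the mount command can be replaced
# by a stub when experimenting without root.
MOUNT=${MOUNT:-mount}

mount_rofs() {
    src=$1
    dst=$2
    mode=move
    # Same selection as the quoted stanza: bind for unionfs-fuse,
    # move for everything else.
    if [ "${UNIONFS:-}" = unionfs-fuse ]; then
        mode=bind
    fi
    mkdir -p "$dst" || return 1
    if ! "$MOUNT" -o "$mode" "$src" "$dst"; then
        # Surface the failure rather than ignoring it.
        echo "mount -o $mode $src $dst failed" >&2
        return 1
    fi
    echo "$mode"
}
```

In the initramfs this would be called as `mount_rofs "${d}" "${rootmnt}/rofs" || exit 1` (or with whatever error handling the script prefers), so that a refused move shows up in the debug log with a clear message.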
[Kernel-packages] [Bug 1846486] Re: revert the revert of ext4: make __ext4_get_inode_loc plug
Verified, seeing the same ballpark performance improvements.

** Tags removed: verification-needed-eoan
** Tags added: verification-done-eoan

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1846486

Title:
revert the revert of ext4: make __ext4_get_inode_loc plug

Status in linux package in Ubuntu: In Progress
Status in linux source package in Eoan: In Progress

Bug description:
== SRU Justification Eoan ==

Now that 5.4 contains a fix for the bootup regression caused by the lack of entropy at boot, we should apply this fix and also revert the revert of commit "Revert "ext4: make __ext4_get_inode_loc plug"".

== Fix ==

So, to clarify, apply the two upstream 5.4-rc commits:

commit 50ee7529ec4500c88f8664560770a7a1b65db72b
Author: Linus Torvalds
Date: Sat Sep 28 16:53:52 2019 -0700

random: try to actively add entropy rather than passively wait for it

commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
Author: Linus Torvalds
Date: Sun Sep 29 17:59:23 2019 -0700

Revert "Revert "ext4: make __ext4_get_inode_loc plug""

I've benchmarked the Eoan kernel with these two patches and found the following speed improvements on an i7-3770 CPU @ 3.40GHz with 8GB and a WDC WD10EZEX-21WN4A HDD (7200RPM, 64MB cache):

git grep of the kernel: 0.14%
building fwts: 0.40%
building stress-ng: 0.45%
tar up kernel source: 7.6%
boot time of eoan cloud image: 10.5%

So I think the speed improvements justify the SRU.

== Regression potential ==

This is a minor change to ext4 which has been regression tested, so the risk here is small. The entropy change will alter the random number generation, but I believe this does not change the cryptographic security of the random numbers being generated, so I think this change is not a security risk. Originally the ext4 change caused boot-time user space regressions because of the entropy change of this fix, but the random fix addresses this, so I believe this risk is now zero.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1846486/+subscriptions
[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object
** Also affects: zfs-linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu: Fix Released
Status in zfs-linux source package in Eoan: New

Bug description:
== SRU Justification, Eoan ==

Using zfs diff on an encrypted dataset with large objects one can hit an error such as follows:

# zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
+       /nsnx/trusty-2a/bin
Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

== Fix ==

Upstream commit d359e99c38f667 ("diff_cb() does not handle large dnodes") as addressed in ZFS bug fix: https://github.com/zfsonlinux/zfs/pull/9343

== Testcase ==

# mkdir /zfs-test
# cd /zfs-test
# truncate -s 10G file.img
# zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
# zfs create tank/d1 -o encryption=on -o keyformat=passphrase
Enter passphrase:
Re-enter passphrase:
# dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
# zfs snapshot tank/d1@s1
# dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s

Without the fix, one hits an error such as:

# zfs diff tank/d1@s1 tank/d1
Unable to determine path or stats for object 3 in tank/d1@s1: File exists

With the fix, we get:

+       /tank/d1/somedata2.bin
M       /tank/d1/

== Regression Potential ==

This is a minor change in module/zfs/dmu_diff.c and it only affects the zfs diff component, so this should not affect ZFS in terms of file system corruption/data loss. This has also been upstream regression tested and passes the Ubuntu ZFS regression tests too. So the risk is limited.

-

Eoan 19.10
zfsutils-linux 0.8.1-1ubuntu14
kernel 5.3.0-19-generic #20-Ubuntu

When using zfs diff on an encrypted dataset, I frequently encounter this error:

# zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
+       /nsnx/trusty-2a/bin
Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

I believe this to be upstream bug https://github.com/zfsonlinux/zfs/issues/7678, fixed with https://github.com/zfsonlinux/zfs/pull/9343

Here is one way to reproduce it:

# mkdir /zfs-test
# cd /zfs-test
# truncate -s 10G file.img
# zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
# zfs create tank/d1 -o encryption=on -o keyformat=passphrase
Enter passphrase:
Re-enter passphrase:
# dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
# zfs snapshot tank/d1@s1
# dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s
# zfs diff tank/d1@s1 tank/d1
Unable to determine path or stats for object 3 in tank/d1@s1: File exists

There may be a simpler way to test this, but this should be enough to start with.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1849665/+subscriptions
[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files
** Changed in: linux (Ubuntu)
   Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: linux-hwe (Ubuntu Bionic)
   Assignee: (unassigned) => Colin Ian King (colin-king)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

Status in linux package in Ubuntu: Confirmed
Status in linux-hwe package in Ubuntu: Invalid
Status in linux-hwe source package in Bionic: Confirmed
[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object
uploaded zfs-linux (0.8.1-1ubuntu14.1) eoan (will land in -proposed sometime soon)
uploaded zfs-linux (0.0.1.1ubuntu16) focal

Once the packages are uploaded the dkms driver component will be sync'd into the next kernel and then once this is in -proposed it can be fully tested.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu: In Progress
[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object
I'll get this uploaded into -proposed once the current SRU backlog is out.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu: In Progress
[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object
** Description changed:

+ == SRU Justification, Eoan ==
+
+ Using zfs diff on an encrypted dataset with large objects one can hit an
+ error such as follows:
+
+ # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
+ +       /nsnx/trusty-2a/bin
+ Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists
+
+ == Fix ==
+
+ Upstream commit d359e99c38f667 ("diff_cb() does not handle large
+ dnodes") as addressed in ZFS bug fix:
+ https://github.com/zfsonlinux/zfs/pull/9343
+
+ == Testcase ==
+
+ # mkdir /zfs-test
+ # cd /zfs-test
+ # truncate -s 10G file.img
+ # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
+ # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
+ Enter passphrase:
+ Re-enter passphrase:
+ # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
+ 10240+0 records in
+ 10240+0 records out
+ 41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
+ # zfs snapshot tank/d1@s1
+ # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
+ 10240+0 records in
+ 10240+0 records out
+ 41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s
+
+ Without the fix, one hits an error such as:
+
+ # zfs diff tank/d1@s1 tank/d1
+ Unable to determine path or stats for object 3 in tank/d1@s1: File exists
+
+ With the fix, we get:
+ +       /tank/d1/somedata2.bin
+ M       /tank/d1/
+
+ == Regression Potential ==
+
+ This is a minor change in module/zfs/dmu_diff.c and it only affects the
+ zfs diff component, so this should not affect ZFS in terms of file
+ system corruption/data loss. This has also been upstream regression
+ tested and passes the Ubuntu ZFS regressions tests too. So the risk is
+ limited.
+
+
+ -
+
  Eoan 19.10
  zfsutils-linux 0.8.1-1ubuntu14
  kernel 5.3.0-19-generic #20-Ubuntu

  When using zfs diff on an encrypted dataset, I frequently encounter this error:

  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  +       /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

  I believe this to be upstream bug https://github.com/zfsonlinux/zfs/issues/7678, fixed with https://github.com/zfsonlinux/zfs/pull/9343

  Here is one way to reproduce it:

  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s
  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists

  There may be a simpler way to test this, but this should be enough to start with.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu: In Progress
[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object
With the fix:

root@eoan-amd64-efi:/home/cking# mkdir /zfs-test
root@eoan-amd64-efi:/home/cking# cd /zfs-test
root@eoan-amd64-efi:/zfs-test# truncate -s 10G file.img
root@eoan-amd64-efi:/zfs-test# zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
root@eoan-amd64-efi:/zfs-test# zfs create tank/d1 -o encryption=on -o keyformat=passphrase
Enter passphrase:
Re-enter passphrase:
root@eoan-amd64-efi:/zfs-test# dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0.238499 s, 176 MB/s
root@eoan-amd64-efi:/zfs-test# zfs snapshot tank/d1@s1
root@eoan-amd64-efi:/zfs-test# dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0.228746 s, 183 MB/s
root@eoan-amd64-efi:/zfs-test# zfs diff tank/d1@s1 tank/d1
+       /tank/d1/somedata2.bin
M       /tank/d1/

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu: In Progress
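For anyone scripting the verification, the pass/fail decision in the testcase boils down to whether the zfs diff output contains the error string from this bug. A minimal sketch of that check follows; the function name check_zfs_diff is made up here, and it only inspects text on stdin, so it assumes the caller pipes both stdout and stderr of zfs diff into it.

```shell
#!/bin/sh
# Sketch only: classify `zfs diff` output read from stdin.
# check_zfs_diff is a hypothetical helper name, not part of ZFS.
check_zfs_diff() {
    # The error text is the one quoted in this bug report.
    if grep -q 'Unable to determine path or stats for object'; then
        echo "FAIL"
        return 1
    fi
    echo "PASS"
}
```

After running the testcase steps, it could be used as `zfs diff tank/d1@s1 tank/d1 2>&1 | check_zfs_diff`.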
[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object
Confirming that upstream commit
https://github.com/zfsonlinux/zfs/commit/d359e99c38f66732d42278c32d52cfcf1839aa4f
fixes this issue.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
  zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu:
  In Progress
[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object
** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => High

** Changed in: zfs-linux (Ubuntu)
     Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: zfs-linux (Ubuntu)
       Status: New => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
  zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu:
  In Progress
[Kernel-packages] [Bug 1847628] Re: When using swap in ZFS, system stops when you start using swap
** Changed in: ubiquity (Ubuntu)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1847628

Title:
  When using swap in ZFS, system stops when you start using swap

Status in ubiquity package in Ubuntu:
  Confirmed
Status in zfs-linux package in Ubuntu:
  Confirmed

Bug description:
  # Problem

  When using swap in ZFS, the system stops when you start using swap.

  > stress --vm 100

  If you run swapoff first, only an OOM occurs and the system does not
  stop.

  # Environment

  jehos@MacBuntu:~$ lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu Eoan Ermine (development branch)
  Release:        19.10
  Codename:       eoan

  jehos@MacBuntu:~$ dpkg -l | grep zfs
  ii  libzfs2linux    0.8.1-1ubuntu13  amd64  OpenZFS filesystem library for Linux
  ii  zfs-initramfs   0.8.1-1ubuntu13  amd64  OpenZFS root filesystem capabilities for Linux - initramfs
  ii  zfs-zed         0.8.1-1ubuntu13  amd64  OpenZFS Event Daemon
  ii  zfsutils-linux  0.8.1-1ubuntu13  amd64  command-line tools to manage OpenZFS filesystems

  jehos@MacBuntu:~$ uname -a
  Linux MacBuntu 5.3.0-13-generic #14-Ubuntu SMP Tue Sep 24 02:46:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  jehos@MacBuntu:~$ zpool list
  NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
  bpool  1.88G  66.1M  1.81G        -         -      -     3%  1.00x    ONLINE  -
  rpool   230G   124G   106G        -         -     9%    53%  1.00x    ONLINE  -

  jehos@MacBuntu:~$ zfs get all rpool/swap
  NAME        PROPERTY              VALUE                  SOURCE
  rpool/swap  type                  volume                 -
  rpool/swap  creation              Thu Oct 10 15:56 2019  -
  rpool/swap  used                  2.13G                  -
  rpool/swap  available             98.9G                  -
  rpool/swap  referenced            72K                    -
  rpool/swap  compressratio         1.11x                  -
  rpool/swap  reservation           none                   default
  rpool/swap  volsize               2G                     local
  rpool/swap  volblocksize          4K                     -
  rpool/swap  checksum              on                     default
  rpool/swap  compression           zle                    local
  rpool/swap  readonly              off                    default
  rpool/swap  createtxg             34                     -
  rpool/swap  copies                1                      default
  rpool/swap  refreservation        2.13G                  local
  rpool/swap  guid                  18209330213704683244   -
  rpool/swap  primarycache          metadata               local
  rpool/swap  secondarycache        none                   local
  rpool/swap  usedbysnapshots       0B                     -
  rpool/swap  usedbydataset         72K                    -
  rpool/swap  usedbychildren        0B                     -
  rpool/swap  usedbyrefreservation  2.13G                  -
  rpool/swap  logbias               throughput             local
  rpool/swap  objsetid              393                    -
  rpool/swap  dedup                 off                    default
  rpool/swap  mlslabel              none                   default
  rpool/swap  sync                  always                 local
  rpool/swap  refcompressratio      1.11x                  -
  rpool/swap  written               72K                    -
  rpool/swap  logicalused           40K                    -
  rpool/swap  logicalreferenced     40K                    -
  rpool/swap  volmode               default                default
  rpool/swap  snapshot_limit        none                   default
  rpool/swap  snapshot_count        none                   default
  rpool/swap  snapdev               hidden                 default
  rpool/swap  context               none                   default
  rpool/swap  fscontext             none                   default
  rpool/swap  defcontext            none                   default
  rpool/swap  rootcontext           none                   default
  rpool/swap  redundant_metadata    all                    default
  rpool/swap  encryption            off                    default
  rpool/swap  keylocation           none                   default
  rpool/swap  keyformat             none                   default
  rpool/swap  pbkdf2iters           0                      default

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1847628/+subscriptions
[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'
A fix has landed in lxd; I refer you to the following comment:
https://github.com/lxc/lxd/issues/4656#issuecomment-541266681

Please check if this addresses the issues.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  New
Status in lxc source package in Disco:
  New
Status in linux source package in Eoan:
  Triaged
Status in lxc source package in Eoan:
  Confirmed

Bug description:
  I'm not sure exactly what got me into this state, but I have several
  lxc containers that cannot be deleted.

  $ lxc info
  api_status: stable
  api_version: "1.0"
  auth: trusted
  public: false
  auth_methods:
  - tls
  environment:
    addresses: []
    architectures:
    - x86_64
    - i686
    certificate: |
      -----BEGIN CERTIFICATE-----
      -----END CERTIFICATE-----
    certificate_fingerprint: 3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
    driver: lxc
    driver_version: 3.0.1
    kernel: Linux
    kernel_architecture: x86_64
    kernel_version: 4.15.0-23-generic
    server: lxd
    server_pid: 15123
    server_version: "3.2"
    storage: zfs
    storage_version: 0.7.5-1ubuntu15
    server_clustered: false
    server_name: milhouse

  $ lxc delete --force b1
  Error: Failed to destroy ZFS filesystem: cannot destroy 'default/containers/b1': dataset is busy

  Talking in #lxc-dev, stgraber and sforeshee provided diagnosis:

   | short version is that something unshared a mount namespace causing
   | them to get a copy of the mount table at the time that dataset was
   | mounted, which then prevents zfs from being able to destroy it)

  The workaround provided was:

   | you can unstick this particular issue by doing:
   |  grep default/containers/b1 /proc/*/mountinfo
   | then for any of the hits, do:
   |  nsenter -t PID -m -- umount /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
   | then try the delete again

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.15.0-23-generic 4.15.0-23.25
  ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
  Uname: Linux 4.15.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.10-0ubuntu3
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  smoser    31412 F pulseaudio
   /dev/snd/controlC2:  smoser    31412 F pulseaudio
   /dev/snd/controlC0:  smoser    31412 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 28 10:42:45 2018
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2015-07-23 (1071 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
  MachineType: b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-23-generic N/A
   linux-backports-modules-4.15.0-23-generic  N/A
   linux-firmware                             1.174
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 03/09/2015
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
  dmi.board.asset.tag: �
  dmi.board.name: NUC5i5RYB
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H40999-503
  dmi.chassis.asset.tag: �
  dmi.chassis.type: 3
  dmi.chassis.vendor: �
  dmi.chassis.version: �
  dmi.modalias: dmi:bvnIntelCorporation:bvrRYBDWi35.86A.0246.2015.0309.1355:bd03/09/2015:svn:pn:pvr:rvnIntelCorporation:rnNUC5i5RYB:rvrH40999-503:cvn:ct3:cvr:
  dmi.product.family: �
  dmi.product.name: �
  dmi.product.version: �
  dmi.sys.vendor: �

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779156/+subscriptions
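The workaround quoted in the bug description above (grep the mountinfo files, then nsenter into each hit) could be sketched roughly as the shell function below. This is only an illustration under stated assumptions: find_mount_holders and its optional second argument are hypothetical names, not part of lxd or zfs, and the sketch only reports numeric PIDs rather than unmounting anything.

```shell
#!/bin/sh
# Sketch only: print the PIDs whose mount table still mentions a dataset.
# find_mount_holders is a hypothetical helper name; the second argument is an
# assumed override of /proc so the function can be tried on a fake tree.
find_mount_holders() {
    dataset=$1
    proc_root=${2:-/proc}
    for f in "$proc_root"/[0-9]*/mountinfo; do
        # Skip the unexpanded glob and processes that exited meanwhile.
        [ -r "$f" ] || continue
        # -F: match the dataset name as a fixed string, -q: no output.
        if grep -qF -- "$dataset" "$f" 2>/dev/null; then
            pid=${f%/mountinfo}     # strip the trailing file name
            pid=${pid##*/}          # keep only the PID directory name
            echo "$pid"
        fi
    done
}

# Example, run as root. It prints the umount commands instead of running them:
# for pid in $(find_mount_holders default/containers/b1); do
#     echo nsenter -t "$pid" -m -- umount \
#         /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
# done
```

Printing the nsenter commands first, rather than executing them, leaves room to check which processes are involved before touching their mount namespaces.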
[Kernel-packages] [Bug 1847628] Re: When using swap in ZFS, system stops when you start using swap
https://github.com/zfsonlinux/pkg-zfs/wiki/HOWTO-use-a-zvol-as-a-swap-device

...there are known issues with swap on ZFS not working well on heavily
memory-loaded systems.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1847628

Title:
  When using swap in ZFS, system stops when you start using swap

Status in zfs-linux package in Ubuntu:
  New
[Kernel-packages] [Bug 1847628] Re: When using swap in ZFS, system stops when you start using swap
A swapfile on ZFS is a bad idea. Swapped-out pages get pushed through
the VFS into ZFS, and each page of swap is magnified in the number of
free pages required to get that page out to disk.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1847628

Title:
  When using swap in ZFS, system stops when you start using swap

Status in zfs-linux package in Ubuntu:
  New
[Kernel-packages] [Bug 1846424] Re: 19.10 ZFS Update failed on 2019-10-02
The error:

"zfs[9317]: cannot mount '/': directory is not empty"

seems to suggest that this is a root-mounted ZFS. Is that so?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1846424

Title:
  19.10 ZFS Update failed on 2019-10-02

Status in zfs-linux package in Ubuntu:
  In Progress

Bug description:
  On all my systems the update from
  zfs-initramfs_0.8.1-1ubuntu12_amd64.deb failed; the same is true for
  zfs-zed and zfsutils-linux. The system still runs on
  0.8.1-1ubuntu11_amd64.

  The first error message was about a failing mount, and at the end it
  announced that all 3 modules were not updated.

  I have the error on Xubuntu 19.10, Ubuntu Mate 19.10 on my laptop
  (i5-2520M), and in a VBox VM on a Ryzen 3 2200G with Ubuntu 19.10.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1846424/+subscriptions
[Kernel-packages] [Bug 1846486] Re: revert the revert of ext4: make __ext4_get_inode_loc plug
** Description changed:

  == SRU Justification Eoan ==

  Now that 5.4 contains a fix to the bootup regression due to the lack of
  entropy at boot, we should apply this fix and also revert the revert
  of commit "Revert "ext4: make __ext4_get_inode_loc plug"".

  == Fix ==

  So, to clarify, apply the two upstream 5.4-rc commits:

  commit 50ee7529ec4500c88f8664560770a7a1b65db72b
  Author: Linus Torvalds
  Date: Sat Sep 28 16:53:52 2019 -0700

      random: try to actively add entropy rather than passively wait for
  it

  commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
  Author: Linus Torvalds
  Date: Sun Sep 29 17:59:23 2019 -0700

      Revert "Revert "ext4: make __ext4_get_inode_loc plug""

- I've benchmarked the Eoan kernel with these two patches and found the
- following speed improvements:
+ I've benchmarked the Eoan kernel with these two patches and found the
+ following speed improvements on an i7-3770 CPU @ 3.40GHz with 8GB and a
+ WDC WD10EZEX-21WN4A HDD (7200RPM, 64MB cache).

  git grep of the kernel: 0.14%
  building fwts: 0.40%
  build stress-ng: 0.45%
  tar up kernel source: 7.6%
  boot time of eoan cloud image: 10.5%

  So I think the speed improvements justify the SRU.

  == Regression potential ==

  Minor change to ext4, which has been regression tested, so the risk
  here is small. The entropy change will alter the random number
  generation, but I believe this does not change the cryptographic
  security of the random numbers being generated, so I think this change
  is not a security risk.

  Originally the ext4 change caused boot-time user space regressions
  because of the entropy change of this fix, but the random fix addresses
  this, so I believe this risk is now zero.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1846486

Title:
  revert the revert of ext4: make __ext4_get_inode_loc plug

Status in linux package in Ubuntu:
  In Progress
[Kernel-packages] [Bug 1846486] [NEW] revert the revert of ext4: make __ext4_get_inode_loc plug
Public bug reported:

== SRU Justification Eoan ==

Now that 5.4 contains a fix to the bootup regression due to the lack of
entropy at boot, we should apply this fix and also revert the revert of
commit "Revert "ext4: make __ext4_get_inode_loc plug"".

== Fix ==

So, to clarify, apply the two upstream 5.4-rc commits:

commit 50ee7529ec4500c88f8664560770a7a1b65db72b
Author: Linus Torvalds
Date: Sat Sep 28 16:53:52 2019 -0700

    random: try to actively add entropy rather than passively wait for
it

commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
Author: Linus Torvalds
Date: Sun Sep 29 17:59:23 2019 -0700

    Revert "Revert "ext4: make __ext4_get_inode_loc plug""

I've benchmarked the Eoan kernel with these two patches and found the
following speed improvements:

git grep of the kernel: 0.14%
building fwts: 0.40%
build stress-ng: 0.45%
tar up kernel source: 7.6%
boot time of eoan cloud image: 10.5%

So I think the speed improvements justify the SRU.

== Regression potential ==

Minor change to ext4, which has been regression tested, so the risk here
is small. The entropy change will alter the random number generation,
but I believe this does not change the cryptographic security of the
random numbers being generated, so I think this change is not a security
risk.

Originally the ext4 change caused boot-time user space regressions
because of the entropy change of this fix, but the random fix addresses
this, so I believe this risk is now zero.

** Affects: linux (Ubuntu)
     Importance: High
     Assignee: Colin Ian King (colin-king)
         Status: In Progress

** Changed in: linux (Ubuntu)
       Status: New => In Progress

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Colin Ian King (colin-king)

** Description changed:

- == SRU Justifucation Eoan ==
+ == SRU Justification Eoan ==

  Now that 5.4 contains a fix to the bootup regression due to the lack of
  entropy at boot, we should apply this fix and also revert the revert
  of commit "Revert "ext4: make __ext4_get_inode_loc plug"".

  == Fix ==

  So, to clarify, apply the two upstream 5.4-rc commits:

  commit 50ee7529ec4500c88f8664560770a7a1b65db72b
  Author: Linus Torvalds
  Date: Sat Sep 28 16:53:52 2019 -0700

- random: try to actively add entropy rather than passively wait for
+     random: try to actively add entropy rather than passively wait for
  it

  commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
  Author: Linus Torvalds
  Date: Sun Sep 29 17:59:23 2019 -0700

- Revert "Revert "ext4: make __ext4_get_inode_loc plug""
+     Revert "Revert "ext4: make __ext4_get_inode_loc plug""

-
- I've benchmarked the Eoan kernel with these two patches and found the following speed improvements:
+ I've benchmarked the Eoan kernel with these two patches and found the
+ following speed improvements:

  git grep of the kernel: 0.14%
  building fwts: 0.40%
  build stress-ng: 0.45%
  tar up kernel source: 7.6%
  boot time of eoan cloud image: 10.5%

  So I think this justifies the speed improvements.

  == Regression potential ==

  Minor change to ext4, which has been regression tested, so the risk
  here is small. The entropy change will alter the random number
  generation, but I believe this does not change the cryptographic
  security of the random numbers being generated, so I think this change
  is not a security risk.

  Originally the ext4 change caused boot-time user space regressions
  because of the entropy change of this fix, but the random fix addresses
  this, so I believe this risk is now zero.

** Description changed:

  == SRU Justification Eoan ==

  Now that 5.4 contains a fix to the bootup regression due to the lack of
  entropy at boot, we should apply this fix and also revert the revert
  of commit "Revert "ext4: make __ext4_get_inode_loc plug"".

  == Fix ==

  So, to clarify, apply the two upstream 5.4-rc commits:

  commit 50ee7529ec4500c88f8664560770a7a1b65db72b
  Author: Linus Torvalds
  Date: Sat Sep 28 16:53:52 2019 -0700

      random: try to actively add entropy rather than passively wait for
  it

  commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
  Author: Linus Torvalds
  Date: Sun Sep 29 17:59:23 2019 -0700

      Revert "Revert "ext4: make __ext4_get_inode_loc plug""

  I've benchmarked the Eoan kernel with these two patches and found the
  following speed improvements:

  git grep of the kernel: 0.14%
  building fwts: 0.40%
  build stress-ng: 0.45%
  tar up kernel source: 7.6%
  boot time of eoan cloud image: 10.5%

- So I think this justifies the speed improvements.
+ So I think the speed improvements justify the SRU.

  == Regression potential ==

  minor change to ext4, which has been regression tested, so risk here is
[Kernel-packages] [Bug 1846424] Re: 19.10 ZFS Update failed on 2019-10-02
When you see error messages about modules not being updated, this makes
me believe that perhaps you have zfs-dkms installed. This package is not
required if you are using the 19.10 5.2 or 5.3 kernel, as the kernel
already provides the zfs modules. If you have the official 19.10 5.2 or
5.3 kernels, then you can remove zfs-dkms.

Can you provide more information about the error?

** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => High

** Changed in: zfs-linux (Ubuntu)
     Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: zfs-linux (Ubuntu)
       Status: New => In Progress

** Changed in: zfs-linux (Ubuntu)
       Status: In Progress => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1846424

Title:
  19.10 ZFS Update failed on 2019-10-02

Status in zfs-linux package in Ubuntu:
  Incomplete
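To check whether zfs-dkms is present, as the reply above suspects, something like the following could be used. It is a minimal sketch: pkg_installed is a hypothetical helper name, and it treats any dpkg status ending in "ok installed" as installed, which is a simplification.

```shell
#!/bin/sh
# Sketch only: pkg_installed is a hypothetical helper, not an Ubuntu tool.
# It returns 0 when dpkg-query reports the package in the "installed" state.
pkg_installed() {
    # ${Status} is a dpkg-query format field, e.g. "install ok installed".
    status=$(dpkg-query -W -f='${Status}' "$1" 2>/dev/null) || return 1
    case $status in
        *" ok installed") return 0 ;;
    esac
    return 1
}

if pkg_installed zfs-dkms; then
    echo "zfs-dkms is installed; the stock 19.10 kernels already ship the zfs modules"
else
    echo "zfs-dkms is not installed"
fi
```

The function returns non-zero both when the package is absent and when dpkg-query itself is unavailable, so it only says "not known to be installed", nothing stronger.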
[Kernel-packages] [Bug 1845948] Re: clone test from ubuntu_stress_smoke_test failed on B-hwe 5.0 i386
Fixes committed to make the clone test more OOM-able and the autotest less OOM-able:

https://kernel.ubuntu.com/git/cking/stress-ng.git/commit/?id=f37bba3874b1e613f307cb40c040e06f21b1e521
https://kernel.ubuntu.com/git/cking/stress-ng.git/commit/?id=cdd32c1c25b9c7f11be4778cd99b7c45f6f9d51d

and

https://kernel.ubuntu.com/git/ubuntu/autotest-client-tests.git/commit/?id=7447c7b658e3a6cc40496a75033c007e4a91f166

** Changed in: stress-ng Importance: Undecided => High
** Changed in: stress-ng Assignee: (unassigned) => Colin Ian King (colin-king)
** Changed in: ubuntu-kernel-tests Importance: Undecided => High
** Changed in: ubuntu-kernel-tests Assignee: (unassigned) => Colin Ian King (colin-king)
** Changed in: stress-ng Status: New => Fix Committed
** Changed in: ubuntu-kernel-tests Status: New => Fix Committed
** No longer affects: linux (Ubuntu)

https://bugs.launchpad.net/bugs/1845948

Title: clone test from ubuntu_stress_smoke_test failed on B-hwe 5.0 i386

Status in Stress-ng: Fix Committed
Status in ubuntu-kernel-tests: Fix Committed

Bug description:
Reproduce rate: 2/2

Issue found on an i386 node pepe with B-hwe 5.0. It looks like the clone test has not passed successfully.

Setting up swapspace version 1, size = 1024 MiB (1073737728 bytes)
no label, UUID=25a57478-f726-411c-b79e-05d18c1cc5b2

Machine Configuration
Physical Pages: 2046087
Pages available: 1518620
Page Size: 4096
Zswap enabled: Y

Free memory:
        total     used     free  shared  buff/cache  available
Mem:  8184348   185048  6074228     932     1925072    7348440
Swap: 5242872        0  5242872

Number of CPUs: 8
Number of CPUs Online: 8

access STARTING
access RETURNED 0
access PASSED
af-alg STARTING
af-alg RETURNED 0
af-alg PASSED
affinity STARTING
affinity RETURNED 0
affinity PASSED
aio STARTING
aio RETURNED 0
aio PASSED
aiol STARTING
aiol RETURNED 0
aiol PASSED
bad-altstack STARTING
bad-altstack RETURNED 0
bad-altstack PASSED
bigheap STARTING
bigheap RETURNED 0
bigheap PASSED
branch STARTING
branch RETURNED 0
branch PASSED
brk STARTING
brk RETURNED 0
brk PASSED
cache STARTING
cache RETURNED 0
cache PASSED
cap STARTING
cap RETURNED 0
cap PASSED
chdir STARTING
chdir RETURNED 0
chdir PASSED
chmod STARTING
chmod RETURNED 0
chmod PASSED
chown STARTING
chown RETURNED 0
chown PASSED
chroot STARTING
chroot RETURNED 0
chroot PASSED
clock STARTING
clock RETURNED 0
clock PASSED
clone STARTING
clone RETURNED 0

stderr:
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 5.43609 s, 198 MB/s
12:03:16 INFO | END ERROR ubuntu_stress_smoke_test.stress-smoke-test ubuntu_stress_smoke_test.stress-smoke-test timestamp=1569844996 localtime=Sep 30 12:03:16
12:03:16 DEBUG| Persistent state client._record_indent now set to 1
12:03:16 DEBUG| Persistent state client.unexpected_reboot deleted
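As background on how a process can be made more or less attractive to the OOM killer: Linux exposes a per-process `oom_score_adj` file under /proc, with values from -1000 (never kill) to 1000 (kill first). The sketch below shows that mechanism only. It is not taken from the commits linked above, and the names `clamp_oom_adj`, `set_oom_score_adj` and the `proc_root` parameter are hypothetical.

```python
import os

# Valid range for /proc/<pid>/oom_score_adj on Linux.
OOM_ADJ_MIN, OOM_ADJ_MAX = -1000, 1000


def clamp_oom_adj(value: int) -> int:
    """Clamp a requested adjustment into the range the kernel accepts."""
    return max(OOM_ADJ_MIN, min(OOM_ADJ_MAX, value))


def set_oom_score_adj(value: int, pid: str = "self", proc_root: str = "/proc") -> bool:
    """Write the (clamped) adjustment for a process; return False on failure.

    Lowering the value normally needs extra privileges, so a failure is
    reported to the caller rather than raised.
    """
    path = os.path.join(proc_root, pid, "oom_score_adj")
    try:
        with open(path, "w") as f:
            f.write(str(clamp_oom_adj(value)))
    except OSError:
        return False
    return True
```

A stressor child could call `set_oom_score_adj(1000)` so that it is picked first, while a test harness running with sufficient privileges could use a negative value to protect itself.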
[Kernel-packages] [Bug 1845948] Re: clone test from ubuntu_stress_smoke_test failed on B-hwe 5.0 i386
I've managed to reproduce this on pepe, seems like the autotest is being OOM'd in preference to the actual cloning processes. I'll see if I can figure out how to stop this.

https://bugs.launchpad.net/bugs/1845948

Status in Stress-ng: New
Status in ubuntu-kernel-tests: New
Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1845638] Re: ubuntu_lttng_smoke_test failed with D PowerPC
Fix committed:

https://kernel.ubuntu.com/git/ubuntu/autotest-client-tests.git/commit/?id=8e618fb7b00ecc206fe3ea73084492ebb5835747

** Changed in: ubuntu-kernel-tests Assignee: (unassigned) => Colin Ian King (colin-king)
** Changed in: ubuntu-kernel-tests Status: Incomplete => Fix Committed

https://bugs.launchpad.net/bugs/1845638

Title: ubuntu_lttng_smoke_test failed with D PowerPC

Status in ubuntu-kernel-tests: Fix Committed
Status in linux package in Ubuntu: Incomplete
Status in linux source package in Disco: Incomplete

Bug description:
Found on node modoc:

09/23 03:53:18 DEBUG| utils:0153| [stdout] == lttng smoke test trace context switches ==
09/23 03:53:18 DEBUG| utils:0153| [stdout] Session test-kernel-session created.
09/23 03:53:18 DEBUG| utils:0153| [stdout] Traces will be written in /tmp/lttng-kernel-trace-4683-session
09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng create)
09/23 03:53:18 DEBUG| utils:0153| [stdout] Kernel event sched_switch created in channel channel0
09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng enable-event)
09/23 03:53:18 DEBUG| utils:0153| [stdout] Tracing started for session test-kernel-session
09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng start)
09/23 03:53:24 DEBUG| utils:0153| [stdout] Waiting for data availability
09/23 03:53:24 DEBUG| utils:0153| [stdout] Tracing stopped for session test-kernel-session
09/23 03:53:24 DEBUG| utils:0153| [stdout] PASSED (lttng stop)
09/23 03:53:24 DEBUG| utils:0153| [stdout] Session test-kernel-session destroyed
09/23 03:53:24 DEBUG| utils:0153| [stdout] PASSED (lttng destroy)
09/23 03:53:24 DEBUG| utils:0153| [stdout] Found 10 dd and 19927 context switches
09/23 03:53:24 DEBUG| utils:0153| [stdout] FAILED (did not trace any dd context switches)
09/23 03:53:24 DEBUG| utils:0153| [stdout]
09/23 03:53:24 DEBUG| utils:0153| [stdout] Summary: 7 passed, 1 failed
[Kernel-packages] [Bug 1845638] Re: ubuntu_lttng_smoke_test failed with D PowerPC
** Changed in: ubuntu-kernel-tests Status: In Progress => Incomplete
** Changed in: linux (Ubuntu) Status: In Progress => Incomplete
** Changed in: ubuntu-kernel-tests Importance: High => Medium
** Changed in: linux (Ubuntu) Importance: High => Medium
** Changed in: ubuntu-kernel-tests Assignee: Colin Ian King (colin-king) => (unassigned)
** Changed in: linux (Ubuntu) Assignee: Colin Ian King (colin-king) => (unassigned)

https://bugs.launchpad.net/bugs/1845638

Status in ubuntu-kernel-tests: Incomplete
Status in linux package in Ubuntu: Incomplete
Status in linux source package in Disco: Incomplete
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
And for Standard_D2s_v3:

IP addr         Mac Addr           Kernel            Reboots
104.42.252.54   00:0d:3a:32:df:92  5.0.0-1020-azure  500
104.42.150.26   00:0d:3a:31:1b:50  5.0.0-1020-azure  500
104.42.147.144  00:0d:3a:32:d8:2f  5.0.0-1020-azure  500
40.112.129.232  00:0d:3a:32:d5:7c  5.0.0-1020-azure  500
40.112.134.251  00:0d:3a:32:d9:2d  5.0.0-1020-azure  500
13.64.195.21    00:0d:3a:5a:7b:51  5.0.0-1020-azure  500
40.83.214.204   00:0d:3a:36:47:98  5.0.0-1020-azure  500
13.64.195.27    00:0d:3a:5a:7f:05  5.0.0-1020-azure  500
13.64.195.31    00:0d:3a:5a:78:55  5.0.0-1020-azure  500
13.64.195.69    00:0d:3a:5a:7c:72  5.0.0-1020-azure  500
104.42.51.23    00:0d:3a:37:47:0a  5.0.0-1020-azure  500
13.64.233.120   00:0d:3a:37:46:ab  5.0.0-1020-azure  500
13.64.233.216   00:0d:3a:37:49:fb  5.0.0-1020-azure  500
13.64.239.157   00:0d:3a:37:43:cf  5.0.0-1020-azure  16 [hang]
52.160.87.177   00:0d:3a:35:fc:b1  5.0.0-1020-azure  500

https://bugs.launchpad.net/bugs/1822118

Title: Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu: Incomplete
Status in systemd package in Ubuntu: New

Bug description:
Description: In the event a particular Azure cloud instance is rebooted, it's possible that it may never recover and the instance will break indefinitely. In my case, it was a kernel panic. See specifics below.

Series: Disco
Instance Size: Basic_A3
Region: (Default) US-WEST-2
Kernel Version: 4.18.0-1013-azure #13-Ubuntu SMP Thu Feb 28 22:54:16 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

I had a simple script to reboot an instance (X) amount of times. I chose 50, so the machine would power cycle by issuing a "reboot" from the terminal prompt just as a user would. Once the machine came up, it captured dmesg and other bits, then rebooted again until it reached 50.

After the 4th attempt, my script timed out. I took a look at the instance console log and the following was displayed on the console:

[ OK ] Reached target Reboot.
/shutdown: error while loading shared libra[ 89.498980] Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
[ 89.498980]
[ 89.500042] CPU: 0 PID: 1 Comm: shutdown Not tainted 4.18.0-1013-azure #13-Ubuntu
[ 89.508026] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
[ 89.508026] Call Trace:
[ 89.508026] dump_stack+0x63/0x8a
[ 89.508026] panic+0xe7/0x247
[ 89.508026] do_exit.cold.23+0x26/0x75
[ 89.508026] do_group_exit+0x43/0xb0
[ 89.508026] __x64_sys_exit_group+0x18/0x20
[ 89.508026] do_syscall_64+0x5a/0x110
[ 89.508026] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 89.508026] RIP: 0033:0x7f7bf0154d86
[ 89.508026] Code: Bad RIP value.
[ 89.508026] RSP: 002b:7ffd6be693b8 EFLAGS: 0206 ORIG_RAX: 00e7
[ 89.508026] RAX: ffda RBX: 7f7bf015e420 RCX: 7f7bf0154d86
[ 89.508026] RDX: 007f RSI: 003c RDI: 007f
[ 89.508026] RBP: 7f7bef9449c0 R08: 00e7 R09:
[ 89.508026] R10: 7ffd6be6974c R11: 0206 R12: 0018
[ 89.508026] R13: 7f7bef944ac8 R14: 7f7bef944a00 R15:
[ 89.508026] Kernel Offset: 0x1600 from 0x8100 (relocation range: 0x8000-0xbfff)
[ 89.508026] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
[ 89.508026] ]---

This only occurred once in my testing.
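The reporter's reboot script is not shown in the bug, so the following is only a guess at the general shape of such a loop: a counter kept in a file that survives reboots, checked once per boot. The function name `next_action` and the state-file approach are assumptions for illustration, not the script that was actually used.

```python
import os


def next_action(state_path: str, limit: int) -> str:
    """Bump the persisted reboot counter and say what to do next.

    Returns "reboot" while fewer than `limit` reboots have been recorded,
    and "done" once the limit has been reached.
    """
    count = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            count = int(f.read().strip() or 0)
    if count >= limit:
        return "done"
    # Record the reboot before it happens so the count survives the restart.
    with open(state_path, "w") as f:
        f.write(str(count + 1))
    return "reboot"
```

A boot-time job could save the `dmesg` output for the current boot, call `next_action(path, 50)`, and issue a reboot only when it returns "reboot". The reboot itself is deliberately left out of the sketch.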
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
@Joseph, so I can reproduce this hang/crash issue across a variety of instances. I can't get any info back on a console, so debugging this is not easy.

https://bugs.launchpad.net/bugs/1822118

Status in linux-azure package in Ubuntu: Incomplete
Status in systemd package in Ubuntu: New
[Kernel-packages] [Bug 1845638] Re: ubuntu_lttng_smoke_test failed with D PowerPC
** Changed in: linux (Ubuntu) Status: Incomplete => In Progress
** Changed in: linux (Ubuntu) Assignee: (unassigned) => Colin Ian King (colin-king)
** Changed in: linux (Ubuntu) Importance: Undecided => High
** Changed in: ubuntu-kernel-tests Importance: Undecided => High
** Changed in: ubuntu-kernel-tests Assignee: (unassigned) => Colin Ian King (colin-king)
** Changed in: ubuntu-kernel-tests Status: New => Fix Committed
** Changed in: ubuntu-kernel-tests Status: Fix Committed => In Progress

https://bugs.launchpad.net/bugs/1845638

Status in ubuntu-kernel-tests: In Progress
Status in linux package in Ubuntu: In Progress
Status in linux source package in Disco: Incomplete
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
Get more failures with Standard_B1ms:

IP addr         Mac Addr           Kernel            Reboots
52.160.101.11   00:0d:3a:5b:a0:7c  5.0.0-1020-azure  10
137.135.51.101  00:0d:3a:31:20:fc  5.0.0-1020-azure  500
137.135.50.133  00:0d:3a:31:27:0f  5.0.0-1020-azure  396 [hang]
137.135.51.198  00:0d:3a:31:28:d7  5.0.0-1020-azure  500
137.135.49.89   00:0d:3a:31:22:c1  5.0.0-1020-azure  500
137.135.48.14   00:0d:3a:33:05:7d  5.0.0-1020-azure  500
104.40.5.23     00:0d:3a:32:e7:27  5.0.0-1020-azure  228 [hang]
13.93.223.213   00:0d:3a:32:e8:59  5.0.0-1020-azure  500
104.40.0.151    00:0d:3a:31:32:09  5.0.0-1020-azure  500
40.118.128.130  00:0d:3a:32:f5:71  5.0.0-1020-azure  500
23.101.200.119  00:0d:3a:36:c5:94  5.0.0-1020-azure  500
104.40.8.52     00:0d:3a:33:07:6e  5.0.0-1020-azure  500
104.40.19.222   00:0d:3a:33:01:0d  5.0.0-1020-azure  500
104.42.135.72   00:0d:3a:3b:e9:15  5.0.0-1020-azure  500
104.40.22.205   00:0d:3a:33:0d:d8  5.0.0-1020-azure  500
104.40.7.22     00:0d:3a:37:85:ff  5.0.0-1020-azure  500
13.88.17.94     00:0d:3a:5a:54:c6  5.0.0-1020-azure  500
104.40.8.196    00:0d:3a:59:56:f3  5.0.0-1020-azure  500
13.88.21.125    00:0d:3a:5a:50:00  5.0.0-1020-azure  500
13.88.23.139    00:0d:3a:5a:55:c3  5.0.0-1020-azure  500
23.99.81.188    00:0d:3a:5a:52:0f  5.0.0-1020-azure  500
13.88.20.132    00:0d:3a:5a:55:f1  5.0.0-1020-azure  500
13.88.20.126    00:0d:3a:5a:58:65  5.0.0-1020-azure  500
13.88.20.237    00:0d:3a:5a:55:42  5.0.0-1020-azure  500
13.88.17.35     00:0d:3a:5a:57:5f  5.0.0-1020-azure  500
13.91.54.225    00:0d:3a:5a:57:5a  5.0.0-1020-azure  500
13.88.21.57     00:0d:3a:5a:52:ce  5.0.0-1020-azure  500
13.88.21.67     00:0d:3a:5a:5b:b7  5.0.0-1020-azure  500
13.88.18.46     00:0d:3a:5a:5d:02  5.0.0-1020-azure  229 [hang]
13.88.16.222    00:0d:3a:37:80:d1  5.0.0-1020-azure  500
https://bugs.launchpad.net/bugs/1822118

Status in linux-azure package in Ubuntu: Incomplete
Status in systemd package in Ubuntu: New
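To compare runs across instance types it can help to total up the reboot tables mechanically. The sketch below assumes each row is whitespace-separated as "IP, MAC, kernel, reboot count", optionally followed by a hang marker such as `[hang]` or `[ HANG ]`. The function names `parse_rows` and `summarize` are illustrative only.

```python
from typing import Iterable, List, Tuple


def parse_rows(lines: Iterable[str]) -> List[Tuple[str, int, bool]]:
    """Turn table rows into (ip, reboots, hung) tuples, skipping non-data lines."""
    rows = []
    for line in lines:
        fields = line.split()
        # A data row has a numeric reboot count in the fourth column.
        if len(fields) < 4 or not fields[3].isdigit():
            continue
        hung = "hang" in " ".join(fields[4:]).lower()
        rows.append((fields[0], int(fields[3]), hung))
    return rows


def summarize(rows: List[Tuple[str, int, bool]]) -> Tuple[int, int, int]:
    """Return (instances, total reboots completed, instances that hung)."""
    total = sum(reboots for _, reboots, _ in rows)
    hangs = sum(1 for _, _, hung in rows if hung)
    return len(rows), total, hangs


sample = [
    "IP addr Mac Addr Kernel Reboots",
    "52.160.101.11 00:0d:3a:5b:a0:7c 5.0.0-1020-azure 10",
    "137.135.50.133 00:0d:3a:31:27:0f 5.0.0-1020-azure 396 [hang]",
    "137.135.51.101 00:0d:3a:31:20:fc 5.0.0-1020-azure 500",
]
print(summarize(parse_rows(sample)))  # (3, 906, 1)
```

The three sample rows are copied from the Standard_B1ms results; feeding in a whole table gives the hang count per instance type.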
[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'
See: https://github.com/lxc/lxd/issues/4656#issuecomment-535531229

In https://github.com/lxc/lxd/blob/master/lxd/storage_zfs_utils.go#L255 the umount is done by

  err := unix.Unmount(mountpoint, unix.MNT_DETACH)

The umount2(2) manpage writes about MNT_DETACH:

  Perform a lazy unmount: make the mount point unavailable for new accesses, immediately disconnect the filesystem and all filesystems mounted below it from each other and from the mount table, and actually perform the unmount when the mount point ceases to be busy.

Could this be it? The MNT_DETACH umount looks partially asynchronous. All the subsequent destroy commands may fail because they keep the mount point busy. Finally the retry loop ends, the umount happens for real and the following destroy succeeds.

https://bugs.launchpad.net/bugs/1779156

Title: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu: Triaged
Status in lxc package in Ubuntu: Confirmed
Status in linux source package in Cosmic: Triaged
Status in lxc source package in Cosmic: Confirmed
Status in linux source package in Disco: New
Status in lxc source package in Disco: New
Status in linux source package in Eoan: Triaged
Status in lxc source package in Eoan: Confirmed

Bug description:
I'm not sure exactly what got me into this state, but I have several lxc containers that cannot be deleted.

$ lxc info
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
  certificate_fingerprint: 3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
  driver: lxc
  driver_version: 3.0.1
  kernel: Linux
  kernel_architecture: x86_64
  kernel_version: 4.15.0-23-generic
  server: lxd
  server_pid: 15123
  server_version: "3.2"
  storage: zfs
  storage_version: 0.7.5-1ubuntu15
  server_clustered: false
  server_name: milhouse

$ lxc delete --force b1
Error: Failed to destroy ZFS filesystem: cannot destroy 'default/containers/b1': dataset is busy

Talking in #lxc-dev, stgraber and sforeshee provided diagnosis:

| short version is that something unshared a mount namespace causing
| them to get a copy of the mount table at the time that dataset was
| mounted, which then prevents zfs from being able to destroy it)

The work around provided was

| you can unstick this particular issue by doing:
| grep default/containers/b1 /proc/*/mountinfo
| then for any of the hits, do:
| nsenter -t PID -m -- umount /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
| then try the delete again

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: linux-image-4.15.0-23-generic 4.15.0-23.25
ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
Uname: Linux 4.15.0-23-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.10-0ubuntu3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: smoser 31412 F pulseaudio
 /dev/snd/controlC2: smoser 31412 F pulseaudio
 /dev/snd/controlC0: smoser 31412 F pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Thu Jun 28 10:42:45 2018
EcryptfsInUse: Yes
InstallationDate: Installed on 2015-07-23 (1071 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
MachineType: b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-23-generic N/A
 linux-backports-modules-4.15.0-23-generic N/A
 linux-firmware 1.174
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 03/09/2015
dmi.bios.vendor: Intel Corporation
dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
dmi.board.asset.tag:
dmi.board.name: NUC5i5RYB
dmi.board.vendor: Intel Corporation
dmi.board.version: H40999-503
dmi.chassis.asset.tag:
dmi.chassis.type: 3
dmi.chassis.vendor:
dmi.chassis.version:
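The `grep ... /proc/*/mountinfo` step from the workaround can also be scripted. The sketch below walks every process's mountinfo and reports which PIDs still see the dataset mounted. It assumes the standard mountinfo layout (mount point in the fifth field, filesystem type and source after the " - " separator), and the names `parse_mountinfo_line` and `holders` are made up for this example.

```python
import glob
import os
from typing import List, Optional, Tuple


def parse_mountinfo_line(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (mount_point, fs_type, source) for one mountinfo line, or None."""
    left, sep, right = line.strip().partition(" - ")
    left_fields, right_fields = left.split(), right.split()
    if not sep or len(left_fields) < 5 or len(right_fields) < 2:
        return None
    return left_fields[4], right_fields[0], right_fields[1]


def holders(dataset: str, pattern: str = "/proc/[0-9]*/mountinfo") -> List[Tuple[str, str]]:
    """List (pid, mount_point) pairs whose mount namespace still has the dataset."""
    found = []
    for path in glob.glob(pattern):
        pid = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                lines = f.readlines()
        except OSError:
            continue  # process exited, or we may not read its mountinfo
        for line in lines:
            parsed = parse_mountinfo_line(line)
            if parsed and parsed[1] == "zfs" and parsed[2] == dataset:
                found.append((pid, parsed[0]))
    return found


def suggest_commands(dataset: str) -> List[str]:
    """Build the nsenter commands from the workaround; nothing is executed."""
    return [f"nsenter -t {pid} -m -- umount {mnt}" for pid, mnt in holders(dataset)]
```

`suggest_commands("default/containers/b1")` only builds the command strings. Many PIDs usually share one mount namespace, so a single umount per namespace should be enough before retrying the delete.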
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
I kicked off another ~20K reboot tests with Standard_B2S instances and hit hangs again:

IP addr         Mac Addr           Kernel            Reboots
104.42.3.161    00:0d:3a:37:82:ee  5.0.0-1020-azure  100
13.91.5.23      00:0d:3a:5a:74:23  5.0.0-1020-azure  57 [ HANG ]
13.91.5.222     00:0d:3a:5a:75:1a  5.0.0-1020-azure  100
13.64.117.146   00:0d:3a:5a:74:da  5.0.0-1020-azure  100
13.64.117.17    00:0d:3a:37:67:0e  5.0.0-1020-azure  100
13.91.6.207     00:0d:3a:3a:cc:2c  5.0.0-1020-azure  100
40.78.30.129    00:0d:3a:36:6e:eb  5.0.0-1020-azure  100
104.210.36.238  00:0d:3a:5a:73:da  5.0.0-1020-azure  100
13.91.6.143     00:0d:3a:3a:c8:ec  5.0.0-1020-azure  100
40.83.249.58    00:0d:3a:3a:c0:7a  5.0.0-1020-azure  100
104.45.216.53   00:0d:3a:3b:8a:55  5.0.0-1020-azure  100
104.210.42.18   00:0d:3a:5a:73:5c  5.0.0-1020-azure  100
40.78.27.21     00:0d:3a:3a:c9:19  5.0.0-1020-azure  100
40.83.252.110   00:0d:3a:5a:79:93  5.0.0-1020-azure  100
13.64.119.204   00:0d:3a:5a:7e:bc  5.0.0-1020-azure  100
104.210.34.4    00:0d:3a:31:18:ee  5.0.0-1020-azure  250
138.91.197.202  00:0d:3a:31:1d:c1  5.0.0-1020-azure  94 [ HANG ]
138.91.196.241  00:0d:3a:31:15:2b  5.0.0-1020-azure  250
104.210.33.44   00:0d:3a:31:16:f3  5.0.0-1020-azure  250
40.83.248.76    00:0d:3a:32:af:a7  5.0.0-1020-azure  250
40.83.253.204   00:0d:3a:32:ba:09  5.0.0-1020-azure  250
168.62.202.8    00:0d:3a:32:a0:11  5.0.0-1020-azure  250
40.83.249.8     00:0d:3a:32:bd:ce  5.0.0-1020-azure  250
40.83.249.93    00:0d:3a:32:b7:32  5.0.0-1020-azure  250
40.83.253.187   00:0d:3a:32:b9:cd  5.0.0-1020-azure  250
23.99.9.88      00:0d:3a:37:96:c9  5.0.0-1020-azure  250
104.40.29.184   00:0d:3a:36:9f:e0  5.0.0-1020-azure  250
137.135.40.122  00:0d:3a:36:9f:eb  5.0.0-1020-azure  250
137.135.49.43   00:0d:3a:36:92:aa  5.0.0-1020-azure  250
138.91.251.8    00:0d:3a:37:9e:ef  5.0.0-1020-azure  250
13.64.146.175   00:0d:3a:31:de:ee  5.0.0-1020-azure  500
104.42.23.145   00:0d:3a:31:da:d7  5.0.0-1020-azure  500
104.42.29.99    00:0d:3a:31:d4:4f  5.0.0-1020-azure  500
40.78.106.12    00:0d:3a:31:d9:8a  5.0.0-1020-azure  500
138.91.233.210  00:0d:3a:31:df:84  5.0.0-1020-azure  500
104.42.25.30    00:0d:3a:31:c9:a4  5.0.0-1020-azure  500
13.64.150.69    00:0d:3a:31:dd:47  5.0.0-1020-azure  321 [ HANG ]
104.42.25.23    00:0d:3a:31:d3:c9  5.0.0-1020-azure  500
104.42.24.176   00:0d:3a:31:d8:36  5.0.0-1020-azure  500
13.64.79.133    00:0d:3a:31:d5:b4  5.0.0-1020-azure  500
104.42.29.146   00:0d:3a:31:de:73  5.0.0-1020-azure  500
104.42.19.191   00:0d:3a:31:d4:78  5.0.0-1020-azure  500
40.118.249.118  00:0d:3a:31:db:20  5.0.0-1020-azure  500
40.112.219.112  00:0d:3a:31:dc:da  5.0.0-1020-azure  500
104.42.17.115   00:0d:3a:31:d3:21  5.0.0-1020-azure  500
40.83.212.164   00:0d:3a:5a:ab:48  5.0.0-1020-azure  500
52.160.123.4    00:0d:3a:36:0d:6a  5.0.0-1020-azure  500
52.160.83.37    00:0d:3a:5a:ab:79  5.0.0-1020-azure  500
52.160.122.92   00:0d:3a:36:00:4c  5.0.0-1020-azure  500
52.160.122.71   00:0d:3a:36:0f:bd  5.0.0-1020-azure  500
52.160.123.12   00:0d:3a:36:04:39  5.0.0-1020-azure  500
104.210.60.218  00:0d:3a:36:b6:25  5.0.0-1020-azure  500
52.160.123.221  00:0d:3a:5a:a9:a3  5.0.0-1020-azure  500
52.160.123.234  00:0d:3a:5a:a7:1c  5.0.0-1020-azure  500
104.210.61.139  00:0d:3a:37:b7:84  5.0.0-1020-azure  500
104.210.61.43   00:0d:3a:36:b5:96  5.0.0-1020-azure  500
40.83.212.185   00:0d:3a:5a:af:9c  5.0.0-1020-azure  500
52.160.82.111   00:0d:3a:5a:a9:a9  5.0.0-1020-azure  500
52.160.82.167   00:0d:3a:5a:a7:17  5.0.0-1020-azure  500
104.210.61.135  00:0d:3a:36:b3:97  5.0.0-1020-azure  500

https://bugs.launchpad.net/bugs/1822118

Status in linux-azure package in Ubuntu: Incomplete
Status in systemd package in Ubuntu: New
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
So the best way to reproduce this issue is to run ~500 reboots across multiple instances rather than 5000-1 reboots on one instance.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New

Bug description:
  Description: In the event a particular Azure cloud instance is rebooted, it's possible that it may never recover and the instance will break indefinitely.

  In my case, it was a kernel panic. See specifics below.

  Series: Disco
  Instance Size: Basic_A3
  Region: (Default) US-WEST-2
  Kernel Version: 4.18.0-1013-azure #13-Ubuntu SMP Thu Feb 28 22:54:16 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  I had a simple script to reboot an instance (X) amount of times. I chose 50, so the machine would power cycle by issuing a "reboot" from the terminal prompt just as a user would. Once the machine came up, it captured dmesg and other bits, then rebooted again until it reached 50.

  After the 4th attempt, my script timed out. I took a look at the instance console log and the following was displayed on the console:

  [ OK ] Reached target Reboot.
  /shutdown: error while loading shared libra[ 89.498980] Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
  [ 89.498980]
  [ 89.500042] CPU: 0 PID: 1 Comm: shutdown Not tainted 4.18.0-1013-azure #13-Ubuntu
  [ 89.508026] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
  [ 89.508026] Call Trace:
  [ 89.508026] dump_stack+0x63/0x8a
  [ 89.508026] panic+0xe7/0x247
  [ 89.508026] do_exit.cold.23+0x26/0x75
  [ 89.508026] do_group_exit+0x43/0xb0
  [ 89.508026] __x64_sys_exit_group+0x18/0x20
  [ 89.508026] do_syscall_64+0x5a/0x110
  [ 89.508026] entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [ 89.508026] RIP: 0033:0x7f7bf0154d86
  [ 89.508026] Code: Bad RIP value.
  [ 89.508026] RSP: 002b:7ffd6be693b8 EFLAGS: 0206 ORIG_RAX: 00e7
  [ 89.508026] RAX: ffda RBX: 7f7bf015e420 RCX: 7f7bf0154d86
  [ 89.508026] RDX: 007f RSI: 003c RDI: 007f
  [ 89.508026] RBP: 7f7bef9449c0 R08: 00e7 R09:
  [ 89.508026] R10: 7ffd6be6974c R11: 0206 R12: 0018
  [ 89.508026] R13: 7f7bef944ac8 R14: 7f7bef944a00 R15:
  [ 89.508026] Kernel Offset: 0x1600 from 0x8100 (relocation range: 0x8000-0xbfff)
  [ 89.508026] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
  [ 89.508026] ]---

  This only occurred once in my testing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1822118/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
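A note on reading the panic line: the exitcode=0x7f00 value appears to use the usual wait-status encoding, with the exit status in the high byte and the terminating signal in the low bits. The sketch below decodes it under that assumption; the helper name decode_exitcode is made up for illustration and is not part of any tool mentioned in this bug.

```python
from typing import Tuple


def decode_exitcode(code: int) -> Tuple[int, int]:
    """Split a wait-status style value into (exit_status, signal_number).

    Assumes the conventional Linux encoding: bits 8-15 hold the exit
    status, bits 0-6 hold the terminating signal (0 if none).
    """
    exit_status = (code >> 8) & 0xFF
    signal_number = code & 0x7F
    return exit_status, signal_number


if __name__ == "__main__":
    status, sig = decode_exitcode(0x7F00)
    print(f"exit status {status}, signal {sig}")
```

If that reading is right, PID 1 (shutdown) exited with status 127 and no signal, which would fit the truncated "error while loading shared libra..." message printed just before the panic.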
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
See above, I ran several thousand reboot tests on a lot of Basic_A3 instances, ranging from 50, 250 to 500 reboots. Only one failed. So this is *really* hard to reproduce.
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
IP addr          Mac Addr            Kernel             Reboots
13.64.67.186     00:0d:3a:3a:dd:04   5.0.0-1016-azure   50
104.42.152.115   00:0d:3a:35:b1:e6   5.0.0-1016-azure   50
65.52.121.205    00:0d:3a:3b:0f:52   5.0.0-1016-azure   50
13.88.28.42      00:0d:3a:3b:c7:da   5.0.0-1016-azure   50
40.118.165.237   00:0d:3a:3b:c2:4e   5.0.0-1016-azure   50
40.118.190.105   00:0d:3a:36:c6:d7   5.0.0-1016-azure   50
40.78.90.95      00:0d:3a:37:c0:d9   5.0.0-1016-azure   50
13.83.84.150     00:0d:3a:37:c0:15   5.0.0-1016-azure   50
104.42.74.129    00:0d:3a:36:c2:3e   5.0.0-1016-azure   50
40.85.154.162    00:0d:3a:37:cc:dd   5.0.0-1016-azure   50
40.78.43.4       00:0d:3a:37:c5:07   5.0.0-1016-azure   50
13.93.142.147    00:0d:3a:37:c8:5f   5.0.0-1016-azure   50
40.78.44.229     00:0d:3a:3b:e4:80   5.0.0-1016-azure   50
40.118.189.62    00:0d:3a:3b:e8:8e   5.0.0-1016-azure   50
40.78.85.10      00:0d:3a:3b:e6:37   5.0.0-1016-azure   50
40.78.13.203     00:0d:3a:3a:c2:b0   5.0.0-1016-azure   50
104.42.112.81    00:0d:3a:30:71:fb   5.0.0-1016-azure   50
40.80.156.132    00:0d:3a:30:2f:7c   5.0.0-1016-azure   50
13.64.173.138    00:0d:3a:30:73:b2   5.0.0-1016-azure   50
13.64.189.105    00:0d:3a:30:a4:6f   5.0.0-1016-azure   50
13.64.189.127    00:0d:3a:30:a4:1f   5.0.0-1016-azure   50
104.45.237.232   00:0d:3a:32:1e:3b   5.0.0-1016-azure   50
104.42.233.11    00:0d:3a:32:34:68   5.0.0-1016-azure   50
104.42.233.20    00:0d:3a:34:ed:42   5.0.0-1016-azure   50
23.101.202.206   00:0d:3a:32:32:b0   5.0.0-1016-azure   50
104.42.233.18    00:0d:3a:34:ee:ba   5.0.0-1016-azure   50
104.42.233.151   00:0d:3a:34:e9:0d   5.0.0-1016-azure   50
104.40.51.248    00:0d:3a:32:27:c6   5.0.0-1016-azure   50
104.40.69.158    00:0d:3a:34:f1:5d   5.0.0-1016-azure   50
52.160.41.95     00:0d:3a:35:9f:c8   5.0.0-1016-azure   50
104.42.158.74    00:0d:3a:34:c7:91   5.0.0-1016-azure   50

IP addr          Mac Addr            Kernel             Reboots
40.83.145.235    00:0d:3a:5a:01:f9   5.0.0-1016-azure   250
104.210.50.91    00:0d:3a:35:b2:48   5.0.0-1016-azure   250
13.88.186.166    00:0d:3a:5a:0b:86   5.0.0-1016-azure   250
40.118.185.194   00:0d:3a:35:b9:59   5.0.0-1016-azure   250
104.42.37.175    00:0d:3a:5a:06:ff   5.0.0-1016-azure   250
13.88.186.188    00:0d:3a:5a:05:da   5.0.0-1016-azure   250
104.210.48.49    00:0d:3a:35:b8:a7   5.0.0-1016-azure   250
104.210.50.215   00:0d:3a:35:ba:13   5.0.0-1016-azure   250
40.78.52.50      00:0d:3a:35:b6:50   5.0.0-1016-azure   250
40.118.186.25    00:0d:3a:35:b5:93   5.0.0-1016-azure   250
13.93.233.26     00:0d:3a:37:06:7e   5.0.0-1016-azure   156 crashed
13.93.136.144    00:0d:3a:37:0e:2f   5.0.0-1016-azure   250
40.118.241.192   00:0d:3a:32:c4:5d   5.0.0-1016-azure   250
40.83.160.52     00:0d:3a:37:47:48   5.0.0-1016-azure   250
104.42.9.61      00:0d:3a:36:d3:a0   5.0.0-1016-azure   250

IP addr          Mac Addr            Kernel             Reboots
104.40.1.50      00:0d:3a:30:8d:ae   5.0.0-1016-azure   500
104.40.3.205     00:0d:3a:30:81:d5   5.0.0-1016-azure   500
104.40.9.37      00:0d:3a:30:86:bb   5.0.0-1016-azure   500
104.40.0.242     00:0d:3a:30:88:6b   5.0.0-1016-azure   500
104.40.12.184    00:0d:3a:30:8e:e0   5.0.0-1016-azure   500
137.135.46.72    00:0d:3a:30:ac:df   5.0.0-1016-azure   500
137.135.47.169   00:0d:3a:30:9a:3f   5.0.0-1016-azure   500
104.40.10.226    00:0d:3a:30:2d:fb   5.0.0-1016-azure   500
104.40.10.244    00:0d:3a:30:8e:12   5.0.0-1016-azure   500
104.40.15.160    00:0d:3a:30:87:4d   5.0.0-1016-azure   500
40.112.132.2     00:0d:3a:59:b5:80   5.0.0-1016-azure   500
13.64.97.59      00:0d:3a:59:b1:e7   5.0.0-1016-azure   500
40.78.19.224     00:0d:3a:5a:bd:99   5.0.0-1016-azure   500
13.64.88.51      00:0d:3a:37:30:07   5.0.0-1016-azure   500
40.118.225.110   00:0d:3a:59:b1:8f   5.0.0-1016-azure   500
[Kernel-packages] [Bug 1815178] Re: 18.04: Raid performances on kernel 4.15 and newer are suboptimal when used on NVMe devices
This bug has been dormant for a while with no update. I'm marking it as Won't Fix. If this is still an issue, please re-open this bug.

** Changed in: linux (Ubuntu)
       Status: Incomplete => Won't Fix

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1815178

Title:
  18.04: Raid performances on kernel 4.15 and newer are suboptimal when used on NVMe devices

Status in linux package in Ubuntu:
  Won't Fix

Bug description:
  Hello,

  We have been running multiple tests using the md driver to build various types of RAID devices. Performance on RAID5 is particularly disappointing, so we would like to know if there are any known issues with the md driver on Bionic kernels.

  Here are some of our results:

  Test (Threads)   QCT JBOD   QCT RAID5 (8 threads)   QCT RAID10
  read(1)          838        751 (-10%)              572 (-32%)
  read(2)          1226       1172 (-4%)              1057 (-14%)
  read(4)          1129       2006 (+78%)             1760 (+56%)
  write(1)         804        382 (-52%)              398 (-50%)
  write(2)         1047       175 (-83%)              667 (-36%)
  write(4)         883        415 (-53%)              871 (-1%)
  randread(1)      863        749 (-13%)              619 (-28%)
  randread(2)      1346       1067 (-21%)             1020 (-24%)
  randread(4)      1648       1785 (+8%)              1650 (=)
  randwrite(1)     766        287 (-63%)              391 (-50%)
  randwrite(2)     1044       313 (-70%)              664 (-36%)
  randwrite(4)     916        282 (-69%)              885 (-3%)

  We are preparing tests with one of the latest mainline kernels.

  TIA,

  ...Louis

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1815178/+subscriptions
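The percentages in the results above look like the relative change against the JBOD figure, rounded to a whole percent. A small sketch of that arithmetic follows, assuming this is how they were derived; the function name pct_vs_baseline is hypothetical, and the report does not state the units of the raw numbers.

```python
def pct_vs_baseline(baseline: float, value: float) -> int:
    """Relative change of value against baseline, rounded to a whole percent."""
    return round((value - baseline) / baseline * 100)


# A few (test, JBOD, RAID5) figures taken from the report.
rows = [("read(1)", 838, 751), ("read(4)", 1129, 2006), ("write(2)", 1047, 175)]

for name, jbod, raid5 in rows:
    print(f"{name}: {pct_vs_baseline(jbod, raid5):+d}%")
```

Most rows in the report agree with this calculation to within a percentage point, so small differences are probably down to rounding in the original measurements.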
[Kernel-packages] [Bug 1811730] Re: Thermald does not set max CPU after reseting the voltage using RAPL
** Changed in: thermald (Ubuntu)
       Status: Confirmed => Fix Released

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to thermald in Ubuntu.
https://bugs.launchpad.net/bugs/1811730

Title:
  Thermald does not set max CPU after reseting the voltage using RAPL

Status in thermald package in Ubuntu:
  Fix Released
Status in thermald source package in Bionic:
  Fix Released

Bug description:
  Hi,

  I was using the Ubuntu 18.10 thermald package, but I noticed that, after a few seconds at max CPU usage (max temp), thermald sends the signal to RAPL to reduce the voltage of the CPU. It sets the freq to minimum (800MHz in my case). But when the CPU is idle and the temp has dropped (35-40ºC), it does not send the signal to resume normal operation of the CPU.

  I compiled the latest version of thermald from git (https://github.com/intel/thermal_daemon) and now everything works fine.

  I started to think that the problem was a hardware or BIOS problem, as I disabled CPU scaling in the BIOS, but intel_pstate continued doing freq scaling (I think for Turbo mode). But the real problem was thermald. The latest version from git works really well: it automatically disables and enables the Intel Turbo state and balances the freqs and fan control fine.

  My hardware is a Lenovo Thinkpad P52 with i7-8850H. I tested it using the Ubuntu kernel and Kernel 4.20 optimized for the i7 processor but with the Ubuntu default config (except processor family="Core2/newer Xeon", Preemption Model="Preemptible Kernel (Low-Latency Desktop)" and Timer frequency="1000HZ"). (Only changed those settings from make oldconfig and make deb-pkg.)

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: thermald (not installed)
  ProcVersionSignature: Ubuntu 4.18.0-13.14-generic 4.18.17
  Uname: Linux 4.18.0-13-generic x86_64
  NonfreeKernelModules: nvidia_modeset nvidia
  ApportVersion: 2.20.10-0ubuntu13.1
  Architecture: amd64
  CurrentDesktop: ubuntu:GNOME
  Date: Mon Jan 14 23:45:01 2019
  InstallationDate: Installed on 2018-12-11 (34 days ago)
  InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.3)
  SourcePackage: thermald
  UpgradeStatus: No upgrade log present (probably fresh install)

  == SRU Justification ==

  [Impact]

   * As described by the original bug reporter, CPU usage of the Lenovo P52 is sub-optimal under heavy load.

   * My observation is that the machine exhibits a sharp drop in power usage and CPU frequency and takes time to slowly ramp up again (refer to the chart at https://bit.ly/2OJphB8).

   * Fixed by bisecting and backporting fixes from the thermald project.

  [Test Case]

   * One can stress the CPU load of the machine and collect the CPU frequency and power usage over time to check for any anomaly. The script at https://people.canonical.com/~ypwong/p52_test_cpu.sh can help with this.

   * With the fix, the behaviour should be like this: https://bit.ly/2KA9EXB, where a consistent power usage can be maintained.

  [Regression Potential]

   * Medium. The fix consists of 7 commits cherry-picked from upstream. These changes will affect any machine using the RAPL cooling device. The impact may not be obvious in normal daily usage but will manifest during heavy load; suggest a longer verification period so that more people can discover any adverse effect.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/thermald/+bug/1811730/+subscriptions
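The test case above amounts to sampling CPU frequency and package power while the machine is under load. The p52_test_cpu.sh script it links to is not reproduced here; what follows is only a rough sketch of the same idea. It assumes the standard Linux sysfs files scaling_cur_freq (cpufreq, in kHz) and energy_uj (powercap intel-rapl, in microjoules) exist and are readable on the machine, and all function names are invented for illustration.

```python
import time
from pathlib import Path

# Assumed sysfs locations; they vary by machine and may need root to read.
FREQ_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")
ENERGY_PATH = Path("/sys/class/powercap/intel-rapl:0/energy_uj")


def read_int(path: Path) -> int:
    """Read a single integer value from a sysfs-style file."""
    return int(path.read_text().strip())


def average_watts(energy_start_uj: int, energy_end_uj: int, seconds: float,
                  max_range_uj: int = 0) -> float:
    """Average power between two energy counter readings.

    If the counter wrapped (end < start) and max_range_uj is known,
    the wrap is compensated for approximately.
    """
    delta = energy_end_uj - energy_start_uj
    if delta < 0 and max_range_uj > 0:
        delta += max_range_uj
    return delta / 1_000_000 / seconds


def sample(count: int = 10, interval: float = 1.0) -> None:
    """Print CPU0 frequency (MHz) and package power (W) once per interval."""
    previous = read_int(ENERGY_PATH)
    for _ in range(count):
        time.sleep(interval)
        current = read_int(ENERGY_PATH)
        mhz = read_int(FREQ_PATH) / 1000
        watts = average_watts(previous, current, interval)
        print(f"{mhz:8.0f} MHz  {watts:6.2f} W")
        previous = current
```

Calling sample() while a CPU load generator is running would print one line per interval; a sudden fall to the minimum frequency that does not recover once the load stops is the kind of anomaly the reporter describes.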
[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'
Been digging into this a bit further with lxc 3.17 on Eoan.

  lxc launch ubuntu:bionic zfs-bug-test
  Creating zfs-bug-test
  Starting zfs-bug-test
  lxc delete zfs-bug-test --force
  Error: Failed to destroy ZFS filesystem: Failed to run: zfs destroy -r default/containers/z1: cannot destroy 'default/containers/z1': dataset is busy

However, re-running the delete works fine:

  lxd.lxc delete z1 --force

Looking at system calls, it appears that the first failing delete --force command attempts to destroy the zfs file system multiple times and then gives up. In doing so, it umounts the zfs file system. Hence the second time the delete is issued it works fine, because zfs is now umounted.

So it appears that the ordering in the delete is not as expected. It seems to do:

  zfs destroy x 10 (or so, and then gives up because of errno 16, -EBUSY)
  zfs umount

It should be doing:

  zfs umount
  zfs destroy

This matches the observed reference counting. The ref count is only dropped once the umount is complete. Attempts to destroy the dataset before that will cause an -EBUSY.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  New
Status in lxc source package in Disco:
  New
Status in linux source package in Eoan:
  Triaged
Status in lxc source package in Eoan:
  Confirmed

Bug description:
  I'm not sure exactly what got me into this state, but I have several lxc containers that cannot be deleted.

  $ lxc info
  api_status: stable
  api_version: "1.0"
  auth: trusted
  public: false
  auth_methods:
  - tls
  environment:
    addresses: []
    architectures:
    - x86_64
    - i686
    certificate: |
      -BEGIN CERTIFICATE-
      -END CERTIFICATE-
    certificate_fingerprint: 3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
    driver: lxc
    driver_version: 3.0.1
    kernel: Linux
    kernel_architecture: x86_64
    kernel_version: 4.15.0-23-generic
    server: lxd
    server_pid: 15123
    server_version: "3.2"
    storage: zfs
    storage_version: 0.7.5-1ubuntu15
    server_clustered: false
    server_name: milhouse

  $ lxc delete --force b1
  Error: Failed to destroy ZFS filesystem: cannot destroy 'default/containers/b1': dataset is busy

  Talking in #lxc-dev, stgraber and sforeshee provided a diagnosis:

   | short version is that something unshared a mount namespace causing
   | them to get a copy of the mount table at the time that dataset was
   | mounted, which then prevents zfs from being able to destroy it)

  The workaround provided was:

   | you can unstick this particular issue by doing:
   |  grep default/containers/b1 /proc/*/mountinfo
   | then for any of the hits, do:
   |  nsenter -t PID -m -- umount /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
   | then try the delete again

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.15.0-23-generic 4.15.0-23.25
  ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
  Uname: Linux 4.15.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.10-0ubuntu3
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  smoser  31412 F pulseaudio
   /dev/snd/controlC2:  smoser  31412 F pulseaudio
   /dev/snd/controlC0:  smoser  31412 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 28 10:42:45 2018
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2015-07-23 (1071 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
  MachineType: b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-23-generic N/A
   linux-backports-modules-4.15.0-23-generic N/A
   linux-firmware 1.174
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 03/09/2015
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
  dmi.board.asset.tag:
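The workaround quoted in the bug description starts by grepping /proc/*/mountinfo for the stuck dataset. Below is a hedged sketch of that first step only, listing which PIDs still have the dataset in their mount namespace. The function name pids_with_mount and its proc_root parameter are made up for this illustration; the nsenter and umount step is deliberately left to be run by hand.

```python
import os
from typing import List


def pids_with_mount(dataset: str, proc_root: str = "/proc") -> List[int]:
    """Return PIDs whose mountinfo mentions the given dataset string."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        mountinfo = os.path.join(proc_root, entry, "mountinfo")
        try:
            with open(mountinfo, "r", errors="replace") as handle:
                if any(dataset in line for line in handle):
                    pids.append(int(entry))
        except OSError:
            continue  # process exited or is not readable by this user
    return sorted(pids)
```

For the container in the report this would be called as pids_with_mount("default/containers/b1"); each PID returned is a candidate for the nsenter -t PID -m -- umount step from the workaround.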
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
@Robert, was there a specific class of virtual machine you were using when this issue occurred?
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
** Changed in: linux-azure (Ubuntu)
       Status: In Progress => Incomplete
[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance
@Joseph, any ideas how we can progress on this?
[Kernel-packages] [Bug 1798574] Re: bionic/linux: hung task triggered by ubuntu_vfat_stress stress-ng testcase
I can reproduce this on 4.15.0-38 but not on a more recent kernel, e.g. 4.15.0-64, so I think this has been fixed. I'll close this for now as Fix Released, but feel free to re-open it if we see the same issue again.

** Changed in: linux (Ubuntu)
       Status: Incomplete => Fix Released

** Changed in: linux (Ubuntu Bionic)
       Status: Confirmed => Fix Released

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1798574

Title:
  bionic/linux: hung task triggered by ubuntu_vfat_stress stress-ng testcase

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  When running the ubuntu_vfat_stress on a powerpc64el system with kernel 4.15.0-36-generic it triggers the following hung task:

  10:39:09 DEBUG| [stdout] Mounted tmpfs /mnt/vfat-test-56562
  10:39:09 DEBUG| [stdout] Created loop image /mnt/vfat-test-56562/vfat-loop-data
  10:39:09 DEBUG| [stdout] mkfs.fat 4.1 (2017-01-24)
  10:39:09 DEBUG| [stdout]
  10:39:09 DEBUG| [stdout]
  10:39:09 DEBUG| [stdout] VFAT options: dmask=777
  10:39:09 DEBUG| [stdout] Stress test: /home/ubuntu/autotest/client/tmp/ubuntu_vfat_stress/src/stress-ng/stress-ng --verify --times --metrics-brief --syslog --keep-name -t 10s --hdd 2 --hdd-opts sync,wr-rnd,rd-rnd,fadv-willneed,fadv-rnd --lockf 2 --seek 2 --aio 2 --aio-requests 32 --dentry 2 --dir 2 --dentry-order stride --fallocate 2 --fstat 2 --dentries 100 --lease 2 --open 2 --rename 2 --hdd-bytes 4M --fallocate-bytes 4M --chdir 2 --rename 2 --hdd-write-size 512
  10:39:09 DEBUG| [stdout] VFAT_IMAGE path: /mnt/vfat-test-56562
  10:39:09 DEBUG| [stdout] Mount point: /mnt/vfat-test-56562
  10:39:09 DEBUG| [stdout] Date: Thu Oct 18 10:39:09 UTC 2018
  10:39:09 DEBUG| [stdout] Host: baltar
  10:39:09 DEBUG| [stdout] Kernel: 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:18:48 UTC 2018
  10:39:09 DEBUG| [stdout] Machine: baltar ppc64le ppc64le
  10:39:09 DEBUG| [stdout] CPUs online: 160
  10:39:09 DEBUG| [stdout] CPUs total: 160
  10:39:09 DEBUG| [stdout] Page size: 65536
  10:39:09 DEBUG| [stdout] Pages avail: 1907765
  10:39:09 DEBUG| [stdout] Pages total: 2089666
  10:39:09 DEBUG| [stdout]
  10:39:09 DEBUG| [stdout]
  10:39:09 DEBUG| [stdout]
  10:39:09 DEBUG| [stdout] stress-ng: info: [146983] dispatching hogs: 2 hdd, 2 lockf, 2 seek, 2 aio, 2 dentry, 2 dir, 2 fallocate, 2 fstat, 2 lease, 2 open, 2 rename, 2 chdir, 2 rename
  10:39:09 DEBUG| [stdout] stress-ng: info: [146983] cache allocate: using built-in defaults as unable to determine cache details
  10:39:11 DEBUG| [stdout] stress-ng: fail: [147027] stress-ng-chdir: mkdir failed, errno=28 (No space left on device)
  10:39:11 DEBUG| [stdout] stress-ng: fail: [147007] stress-ng-hdd: read failed, errno=28 (No space left on device)
  10:39:11 DEBUG| [stdout] stress-ng: fail: [146993] stress-ng-hdd: read failed, errno=28 (No space left on device)
  10:39:11 DEBUG| [stdout] stress-ng: fail: [147005] stress-ng-chdir: mkdir failed, errno=28 (No space left on device)
  10:39:12 DEBUG| [stdout] stress-ng: error: [146983] process 146993 (stress-ng-hdd) terminated with an error, exit status=1 (stress-ng core failure)
  10:39:12 DEBUG| [stdout] stress-ng: error: [146983] process 147007 (stress-ng-hdd) terminated with an error, exit status=1 (stress-ng core failure)
  10:42:05 DEBUG| [stdout] Found kernel warning and/or call trace:
  10:42:05 DEBUG| [stdout]
  10:42:05 DEBUG| [stdout] [ 5869.838856] TESTING: --verify --times --metrics-brief --syslog --keep-name -t 10s --hdd 2 --hdd-opts sync,wr-rnd,rd-rnd,fadv-willneed,fadv-rnd --lockf 2 --seek 2 --aio 2 --aio-requests 32 --dentry 2 --dir 2 --dentry-order stride --fallocate 2 --fstat 2 --dentries 100 --lease 2 --open 2 --rename 2 --hdd-bytes 4M --fallocate-bytes 4M --chdir 2 --rename 2 --hdd-write-size 512
  10:42:05 DEBUG| [stdout] [ 5870.027499] ubuntu_vfat_str (56562): drop_caches: 1
  10:42:05 DEBUG| [stdout] [ 5870.037987] ubuntu_vfat_str (56562): drop_caches: 2
  10:42:05 DEBUG| [stdout] [ 5870.041591] ubuntu_vfat_str (56562): drop_caches: 3
  10:42:05 DEBUG| [stdout] [ 5870.441211] VFAT options: dmask=777
  10:42:05 DEBUG| [stdout] [ 5870.441271] Stress test: /home/ubuntu/autotest/client/tmp/ubuntu_vfat_stress/src/stress-ng/stress-ng --verify --times --metrics-brief --syslog --keep-name -t 10s --hdd 2 --hdd-opts sync,wr-rnd,rd-rnd,fadv-willneed,fadv-rnd --lockf 2 --seek 2 --aio 2 --aio-requests 32 --dentry 2 --dir 2 --dentry-order stride --fallocate 2 --fstat 2
[Kernel-packages] [Bug 1798574] Re: bionic/linux: hung task triggered by ubuntu_vfat_stress stress-ng testcase
** Changed in: linux (Ubuntu) Assignee: (unassigned) => Colin Ian King (colin-king) ** Changed in: linux (Ubuntu Bionic) Assignee: (unassigned) => Colin Ian King (colin-king) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798574 Title: bionic/linux: hung task triggered by ubuntu_vfat_stress stress-ng testcase Status in ubuntu-kernel-tests: New Status in linux package in Ubuntu: Incomplete Status in linux source package in Bionic: Confirmed Bug description: When running the ubuntu_vfat_stress on a powerpc64el system with kernel 4.15.0-36-generic it triggers the following hung task: 10:39:09 DEBUG| [stdout] Mounted tmpfs /mnt/vfat-test-56562 10:39:09 DEBUG| [stdout] Created loop image /mnt/vfat-test-56562/vfat-loop-data 10:39:09 DEBUG| [stdout] mkfs.fat 4.1 (2017-01-24) 10:39:09 DEBUG| [stdout] 10:39:09 DEBUG| [stdout] 10:39:09 DEBUG| [stdout] VFAT options: dmask=777 10:39:09 DEBUG| [stdout] Stress test: /home/ubuntu/autotest/client/tmp/ubuntu_vfat_stress/src/stress-ng/stress-ng --verify --times --metrics-brief --syslog --keep-name -t 10s --hdd 2 --hdd-opts sync,wr-rnd,rd-rnd,fadv-willneed,fadv-rnd --lockf 2 --seek 2 --aio 2 --aio-requests 32 --dentry 2 --dir 2 --dentry-order stride --fallocate 2 --fstat 2 --dentries 100 --lease 2 --open 2 --rename 2 --hdd-bytes 4M --fallocate-bytes 4M --chdir 2 --rename 2 --hdd-write-size 512 10:39:09 DEBUG| [stdout] VFAT_IMAGE path: /mnt/vfat-test-56562 10:39:09 DEBUG| [stdout] Mount point: /mnt/vfat-test-56562 10:39:09 DEBUG| [stdout] Date: Thu Oct 18 10:39:09 UTC 2018 10:39:09 DEBUG| [stdout] Host: baltar 10:39:09 DEBUG| [stdout] Kernel:4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:18:48 UTC 2018 10:39:09 DEBUG| [stdout] Machine: baltar ppc64le ppc64le 10:39:09 DEBUG| [stdout] CPUs online: 160 10:39:09 DEBUG| [stdout] CPUs total:160 10:39:09 DEBUG| [stdout] Page size: 65536 10:39:09 DEBUG| [stdout] Pages avail: 1907765 
10:39:09 DEBUG| [stdout] Pages total: 2089666 10:39:09 DEBUG| [stdout] 10:39:09 DEBUG| [stdout] 10:39:09 DEBUG| [stdout] 10:39:09 DEBUG| [stdout] stress-ng: info: [146983] dispatching hogs: 2 hdd, 2 lockf, 2 seek, 2 aio, 2 dentry, 2 dir, 2 fallocate, 2 fstat, 2 lease, 2 open, 2 rename, 2 chdir, 2 rename 10:39:09 DEBUG| [stdout] stress-ng: info: [146983] cache allocate: using built-in defaults as unable to determine cache details 10:39:11 DEBUG| [stdout] stress-ng: fail: [147027] stress-ng-chdir: mkdir failed, errno=28 (No space left on device) 10:39:11 DEBUG| [stdout] stress-ng: fail: [147007] stress-ng-hdd: read failed, errno=28 (No space left on device) 10:39:11 DEBUG| [stdout] stress-ng: fail: [146993] stress-ng-hdd: read failed, errno=28 (No space left on device) 10:39:11 DEBUG| [stdout] stress-ng: fail: [147005] stress-ng-chdir: mkdir failed, errno=28 (No space left on device) 10:39:12 DEBUG| [stdout] stress-ng: error: [146983] process 146993 (stress-ng-hdd) terminated with an error, exit status=1 (stress-ng core failure) 10:39:12 DEBUG| [stdout] stress-ng: error: [146983] process 147007 (stress-ng-hdd) terminated with an error, exit status=1 (stress-ng core failure) 10:42:05 DEBUG| [stdout] Found kernel warning and/or call trace: 10:42:05 DEBUG| [stdout] 10:42:05 DEBUG| [stdout] [ 5869.838856] TESTING: --verify --times --metrics-brief --syslog --keep-name -t 10s --hdd 2 --hdd-opts sync,wr-rnd,rd-rnd,fadv-willneed,fadv-rnd --lockf 2 --seek 2 --aio 2 --aio-requests 32 --dentry 2 --dir 2 --dentry-order stride --fallocate 2 --fstat 2 --dentries 100 --lease 2 --open 2 --rename 2 --hdd-bytes 4M --fallocate-bytes 4M --chdir 2 --rename 2 --hdd-write-size 512 10:42:05 DEBUG| [stdout] [ 5870.027499] ubuntu_vfat_str (56562): drop_caches: 1 10:42:05 DEBUG| [stdout] [ 5870.037987] ubuntu_vfat_str (56562): drop_caches: 2 10:42:05 DEBUG| [stdout] [ 5870.041591] ubuntu_vfat_str (56562): drop_caches: 3 10:42:05 DEBUG| [stdout] [ 5870.441211] VFAT options: dmask=777 10:42:05 
DEBUG| [stdout] [ 5870.441271] Stress test: /home/ubuntu/autotest/client/tmp/ubuntu_vfat_stress/src/stress-ng/stress-ng --verify --times --metrics-brief --syslog --keep-name -t 10s --hdd 2 --hdd-opts sync,wr-rnd,rd-rnd,fadv-willneed,fadv-rnd --lockf 2 --seek 2 --aio 2 --aio-requests 32 --dentry 2 --dir 2 --dentry-order stride --fallocate 2 --fstat 2 --dentries 100 --lease 2 --open 2 --rename 2 --hdd-bytes 4M --fallocate-bytes 4M --chdir 2 --rename 2 --hdd-write-size 512 10:42:05 DEBUG| [stdout] [ 5870.441296] Mount point: /mnt/vfa
[Kernel-packages] [Bug 1840934] Re: Change kernel compression method to improve boot speed
From "Comment bridged from LTC Bugzilla" in a bug discussion:

"FWIW, I verified this on z14, and there clearly lz4 is (as expected) the fastest decompression algorithm. With vanilla 5.3-rc6 and defconfig I get the following kernel uncompression times:

lzo: 27us
lz4: 24us

An initrd (uncompressed size ~55MB) gets these uncompression times:

lzo: 62us
lz4: 49us

So I'd clearly vote to switch to lz4 on s390 as well."

Also:

"I also instrumented the kernel code to only measure the time to decompress the kernel. Whether it's stckf or stcke doesn't matter in this case. Note that shifting a TOD clock value 12 bits to the right gives you microseconds. (All numbers I posted were actually milliseconds, not microseconds, by the way.) I measured both runs (z13 + z14) when running within z/VM and IPL'ed from the punch card reader. Times used for decompressing the initrd were just extracted from dmesg; no kernel instrumentation required here, since there are two messages provided before and after initrd decompression. Find below an extract of the patch to measure decompression time.

diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c
index 7b0d054..cee3d97 100644
--- a/arch/s390/boot/startup.c
+++ b/arch/s390/boot/startup.c
@@ -146,7 +146,10 @@ void startup_kernel(void)
 	}
 	if (!IS_ENABLED(CONFIG_KERNEL_UNCOMPRESSED)) {
+		start = get_tod_clock();
 		img = decompress_kernel();
+		end = get_tod_clock();
+		time = (end - start) >> 12;
 		memmove((void *)vmlinux.default_lma, img, vmlinux.image_size);
 	} else if (__kaslr_offset)
 		memcpy((void *)vmlinux.default_lma, img, vmlinux.image_size);
..."

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1840934

Title:
  Change kernel compression method to improve boot speed

Status in linux package in Ubuntu:
  Fix Released

Bug description:
  Colin King has done some analysis of kernel boot speed using different kernel compression methods.

  Results for x86 are at:
  https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/kernel-compression-method.txt
  https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/boot-speed-compression-5.3-rc4.ods

  Testing of s390 gave the following:

  GZIP 31528972
  LZ4  192348049
  LZO  85990145

  From Colin: "I used the monotonic TOD timer using the stckf opcode to fetch a 64 bit time value. Not sure how this maps to 'real time' in seconds."

  Conclusion: We should switch x86 to LZ4 and s390 to LZO. PPC and ARM do not support LZO or LZ4, so we will stick with gzip there.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840934/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
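The quoted comment states that shifting a TOD clock value 12 bits to the right gives microseconds, which is also what the patch extract does with `(end - start) >> 12`. The following is a minimal Python sketch of that conversion only. The function names `tod_delta_to_us` and `tod_delta_to_ms` are hypothetical helpers introduced here for illustration; they are not kernel APIs, and the sketch assumes the inputs are raw 64-bit values as returned by stckf.

```python
# Sketch of the TOD-clock arithmetic described in the quoted comment.
# Assumption: inputs are raw 64-bit TOD values (as fetched with stckf);
# the helper names below are illustrative, not part of any kernel API.

TOD_MASK = 0xFFFFFFFFFFFFFFFF  # the TOD value is treated as an unsigned 64-bit quantity
TOD_SHIFT_TO_US = 12           # per the comment: value >> 12 yields microseconds


def tod_delta_to_us(start: int, end: int) -> int:
    """Return the elapsed time between two TOD readings in microseconds."""
    # Mask the difference so a wrap of the 64-bit counter still gives a
    # non-negative delta, mirroring unsigned subtraction in C.
    delta = (end - start) & TOD_MASK
    return delta >> TOD_SHIFT_TO_US


def tod_delta_to_ms(start: int, end: int) -> float:
    """Return the elapsed time between two TOD readings in milliseconds."""
    return tod_delta_to_us(start, end) / 1000.0
```

With this, 4096 raw TOD units correspond to one microsecond, so a raw delta has to be shifted before it can be compared with the microsecond or millisecond figures quoted in the discussion.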
[Kernel-packages] [Bug 1840934] Re: Change kernel compression method to improve boot speed
Also enable LZ4 for s390x as IBM has provided us with some positive feedback about using this.
[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'
The ZFS destroy checks the reference count on the dataset with zfs_refcount_count(&ds->ds_longholds) != expected_holds and returns EBUSY in dsl_destroy_head_check_impl.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  New
Status in lxc source package in Disco:
  New
Status in linux source package in Eoan:
  Triaged
Status in lxc source package in Eoan:
  Confirmed

Bug description:
  I'm not sure exactly what got me into this state, but I have several lxc containers that cannot be deleted.

  $ lxc info
  api_status: stable
  api_version: "1.0"
  auth: trusted
  public: false
  auth_methods:
  - tls
  environment:
    addresses: []
    architectures:
    - x86_64
    - i686
    certificate: |
      -----BEGIN CERTIFICATE-----
      -----END CERTIFICATE-----
    certificate_fingerprint: 3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
    driver: lxc
    driver_version: 3.0.1
    kernel: Linux
    kernel_architecture: x86_64
    kernel_version: 4.15.0-23-generic
    server: lxd
    server_pid: 15123
    server_version: "3.2"
    storage: zfs
    storage_version: 0.7.5-1ubuntu15
    server_clustered: false
    server_name: milhouse

  $ lxc delete --force b1
  Error: Failed to destroy ZFS filesystem: cannot destroy 'default/containers/b1': dataset is busy

  Talking in #lxc-dev, stgraber and sforeshee provided diagnosis:

   | short version is that something unshared a mount namespace causing
   | them to get a copy of the mount table at the time that dataset was
   | mounted, which then prevents zfs from being able to destroy it)

  The work around provided was

   | you can unstick this particular issue by doing:
   | grep default/containers/b1 /proc/*/mountinfo
   | then for any of the hits, do:
   | nsenter -t PID -m -- umount /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
   | then try the delete again

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.15.0-23-generic 4.15.0-23.25
  ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
  Uname: Linux 4.15.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.10-0ubuntu3
  Architecture: amd64
  AudioDevicesInUse:
   USER PID ACCESS COMMAND
   /dev/snd/controlC1: smoser 31412 F pulseaudio
   /dev/snd/controlC2: smoser 31412 F pulseaudio
   /dev/snd/controlC0: smoser 31412 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 28 10:42:45 2018
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2015-07-23 (1071 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
  MachineType: b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-23-generic N/A
   linux-backports-modules-4.15.0-23-generic N/A
   linux-firmware 1.174
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 03/09/2015
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
  dmi.board.asset.tag: �
  dmi.board.name: NUC5i5RYB
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H40999-503
  dmi.chassis.asset.tag: �
  dmi.chassis.type: 3
  dmi.chassis.vendor: �
  dmi.chassis.version: �
  dmi.modalias: dmi:bvnIntelCorporation:bvrRYBDWi35.86A.0246.2015.0309.1355:bd03/09/2015:svn:pn:pvr:rvnIntelCorporation:rnNUC5i5RYB:rvrH40999-503:cvn:ct3:cvr:
  dmi.product.family: �
  dmi.product.name: �
  dmi.product.version: �
  dmi.sys.vendor: �

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779156/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
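The workaround quoted in the bug description (grep the dataset in /proc/*/mountinfo, then run nsenter with umount for each hit) could be sketched roughly as below. This is only an illustration, not the procedure LXD itself uses: the function names are hypothetical, the parsing assumes the mountinfo layout documented in proc(5) (mount point in the fifth field, mount source in the second field after the " - " separator), and the sketch only prints candidate commands instead of running them.

```python
# Sketch of the "grep mountinfo, then nsenter + umount" workaround.
# Assumptions: Linux /proc/<pid>/mountinfo layout as described in proc(5);
# all function names here are illustrative helpers, not an existing tool.
import glob
import os


def mounts_of_dataset(mountinfo_text: str, dataset: str) -> list:
    """Return the mount points in one mountinfo file whose source is `dataset`."""
    mount_points = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue  # malformed or empty line
        sep = fields.index("-")
        # After the separator: filesystem type, mount source, super options.
        if len(fields) > sep + 2 and fields[sep + 2] == dataset:
            mount_points.append(fields[4])  # fifth field is the mount point
    return mount_points


def workaround_commands(mountinfo_by_pid: dict, dataset: str) -> list:
    """Build one 'nsenter ... umount' command per (pid, mount point) hit."""
    commands = []
    for pid, text in sorted(mountinfo_by_pid.items()):
        for mount_point in mounts_of_dataset(text, dataset):
            commands.append(f"nsenter -t {pid} -m -- umount {mount_point}")
    return commands


def read_all_mountinfo() -> dict:
    """Read /proc/<pid>/mountinfo for every visible process (Linux only)."""
    result = {}
    for path in glob.glob("/proc/[0-9]*/mountinfo"):
        pid = int(path.split(os.sep)[2])
        try:
            with open(path) as handle:
                result[pid] = handle.read()
        except OSError:
            continue  # process exited or access was denied
    return result


if __name__ == "__main__":
    for command in workaround_commands(read_all_mountinfo(), "default/containers/b1"):
        print(command)
```

Like the original grep, this reports every process that still sees the mount, so processes sharing one mount namespace produce repeated hits; unmounting once per namespace should be enough.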
[Kernel-packages] [Bug 1832384] Re: Unable to unmount apparently unused filesystem
latter 1-2 weeks of this cycle

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1832384

Title:
  Unable to unmount apparently unused filesystem

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  We periodically see an issue where unmounting a ZFS filesystem fails with EBUSY, even though there appears to be no one using it.

  # cat /proc/self/mounts | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive zfs rw,nosuid,nodev,noexec,relatime,xattr,noacl 0 0

  'lsof' and 'fuser' show no processes using any of the files in the problematic filesystem:

  # ls -l /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/
  total 221
  -rw-r----- 1 500 500 52736 May 22 11:01 1_19_1008904362.dbf
  -rw-r----- 1 500 500 541696 May 22 11:03 1_20_1008904362.dbf
  # fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/1_20_1008904362.dbf
  # fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/1_19_1008904362.dbf
  # fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/
  # lsof | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  #

  The filesystem was shared over NFS, but has since been unshared:

  # showmount -e | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  #

  Since no one appears to be using the filesystem, our expectation is that it should be possible to unmount the filesystem. However, attempts to unmount the filesystem fail with EBUSY:

  # zfs destroy domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  umount: /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive: target is busy.
  cannot unmount '/domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive': umount failed
  # umount /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  umount: /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive: target is busy.

  Using bpftrace, we can see that the unmount is failing in 'propagate_mount_busy()' in the kernel. Using a live kernel debugger, we can look at the 'mount' struct for this particular mount and see that the 'mnt_count' refcount summed across all CPUs is 2. For filesystems that are eligible for unmounting, the refcount is 1.

  The only way to work around this issue that we have found is to reboot, at which point the filesystem can be unmounted and destroyed.

  So far, we have only been able to reproduce this using a workload driven by our application. The application manages ZFS filesystems in groups, and the lifecycle of each group looks something like

  - Create and mount a group of filesystems, 1 parent and 4 children:
    /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370
    /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/datafile
    /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/external
    /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
    /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/temp
  - Share all 5 filesystems over NFS
  - A client mounts all 5 shares using NFSv3
  - For a few hours, the client does NFS operations on the filesystems and the server occasionally takes ZFS snapshots of them
  - Unshare filesystems
  - Unmount filesystems
  - Delete filesystems

  These groups of filesystems are constantly being created and destroyed. At any given time, we have ~30k filesystems on the system, about 5k of which are shared. On average, one out of ~200-300k unmounts fails with this EBUSY error. To create and destroy this many filesystems takes us about a week or so.

  Note that we are using ZFS built from https://github.com/delphix/zfs, which is essentially master ZFS on Linux.

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-50-generic 4.15.0-50.54
  ProcVersionSignature: Ubuntu 4.15.0-50.54-generic 4.15.18
  Uname: Linux 4.15.0-50-generic x86_64
  NonfreeKernelModules: zfs zunicode zcommon znvpair zavl icp
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116, 1 May 20 19:10 seq
   crw-rw---- 1 root audio 116, 33 May 20 19:10 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code
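The bug description mentions that the 'mnt_count' refcount has to be summed across all CPUs before it means anything. A small Python model of that idea follows, as a sketch only: the kernel keeps the count as per-CPU values, so a single CPU's slot can be negative while the total is still correct. The names `mnt_count_total` and `looks_busy` are hypothetical, and the default threshold of 1 simply mirrors the report's observation that unmountable filesystems showed a summed count of 1; it is not a claim about the exact comparison made inside 'propagate_mount_busy()'.

```python
# Toy model of a per-CPU reference count, as described in the report.
# Assumptions: helper names and the threshold are illustrative only and
# model the reporter's observation, not the kernel's exact check.
from typing import Sequence


def mnt_count_total(per_cpu_counts: Sequence[int]) -> int:
    """Sum the per-CPU contributions to obtain the effective refcount."""
    # A reference taken on one CPU may be dropped on another, so single
    # slots can be negative; only the sum is meaningful.
    return sum(per_cpu_counts)


def looks_busy(per_cpu_counts: Sequence[int], expected_refs: int = 1) -> bool:
    """Return True if more references exist than the report's baseline."""
    return mnt_count_total(per_cpu_counts) > expected_refs


if __name__ == "__main__":
    healthy = [3, -1, 0, -1]  # sums to 1: matches the "eligible" case
    stuck = [2, 0, 1, -1]     # sums to 2: matches the failing mount
    print(mnt_count_total(healthy), looks_busy(healthy))
    print(mnt_count_total(stuck), looks_busy(stuck))
```

In this model the failing mount corresponds to one extra reference that was taken somewhere and never dropped, which is consistent with 'lsof' and 'fuser' finding no userspace holder.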
[Kernel-packages] [Bug 1832384] Re: Unable to unmount apparently unused filesystem
See also: https://wiki.ubuntu.com/Kernel/StableReleaseCadence and https://kernel.ubuntu.com/
[Kernel-packages] [Bug 1832384] Re: Unable to unmount apparently unused filesystem
We generally have a 3 week release cycle on kernels, so if it's in -proposed it is probably in the latter 1-2 weeks of this cycle.
[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'
Reproducer is as follows:

lxc launch ubuntu:bionic zfs-bug-test
Creating zfs-bug-test
Starting zfs-bug-test
lxc delete zfs-bug-test --force
Error: Failed to destroy ZFS filesystem:

Can reproduce this on Eoan with latest 5.2, 5.3 kernel.

** Also affects: linux (Ubuntu Eoan)
   Importance: Medium
   Assignee: Colin Ian King (colin-king)
   Status: Triaged

** Also affects: lxc (Ubuntu Eoan)
   Importance: Undecided
   Status: Confirmed

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Also affects: lxc (Ubuntu Disco)
   Importance: Undecided
   Status: New
[Kernel-packages] [Bug 1832384] Re: Unable to unmount apparently unused filesystem
That's really helpful to know, John; thanks for the feedback.

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1832384

Title:
  Unable to unmount apparently unused filesystem

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  We periodically see an issue where unmounting a ZFS filesystem fails
  with EBUSY, even though there appears to be no one using it.

  # cat /proc/self/mounts | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive zfs rw,nosuid,nodev,noexec,relatime,xattr,noacl 0 0

  'lsof' and 'fuser' show no processes using any of the files in the
  problematic filesystem:

  # ls -l /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/
  total 221
  -rw-r- 1 500 500 52736 May 22 11:01 1_19_1008904362.dbf
  -rw-r- 1 500 500 541696 May 22 11:03 1_20_1008904362.dbf
  # fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/1_20_1008904362.dbf
  # fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/1_19_1008904362.dbf
  # fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/
  # lsof | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  #

  The filesystem was shared over NFS, but has since been unshared:

  # showmount -e | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  #

  Since no one appears to be using the filesystem, we expect to be able
  to unmount it. However, attempts to unmount the filesystem fail with
  EBUSY:

  # zfs destroy domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  umount: /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive: target is busy.
  cannot unmount '/domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive': umount failed
  # umount /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  umount: /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive: target is busy.

  Using bpftrace, we can see that the unmount is failing in
  'propagate_mount_busy()' in the kernel. Using a live kernel debugger,
  we can look at the 'mount' struct for this particular mount and see
  that the 'mnt_count' refcount summed across all CPUs is 2. For
  filesystems that are eligible for unmounting, the refcount is 1.

  The only way we have found to work around this issue is to reboot, at
  which point the filesystem can be unmounted and destroyed.

  So far, we have only been able to reproduce this using a workload
  driven by our application. The application manages ZFS filesystems in
  groups, and the lifecycle of each group looks something like this:

  - Create and mount a group of filesystems, 1 parent and 4 children:
      /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370
      /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/datafile
      /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/external
      /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
      /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/temp
  - Share all 5 filesystems over NFS
  - A client mounts all 5 shares using NFSv3
  - For a few hours, the client does NFS operations on the filesystems
    and the server occasionally takes ZFS snapshots of them
  - Unshare filesystems
  - Unmount filesystems
  - Delete filesystems

  These groups of filesystems are constantly being created and
  destroyed. At any given time, we have ~30k filesystems on the system,
  about 5k of which are shared. On average, one out of ~200-300k
  unmounts fails with this EBUSY error. Creating and destroying this
  many filesystems takes us about a week.

  Note that we are using ZFS built from https://github.com/delphix/zfs,
  which is essentially master ZFS on Linux.

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-50-generic 4.15.0-50.54
  ProcVersionSignature: Ubuntu 4.15.0-50.54-generic 4.15.18
  Uname: Linux 4.15.0-50-generic x86_64
  NonfreeKernelModules: zfs zunicode zcommon znvpair zavl icp
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116, 1 May 20 19:10 seq
   crw-rw 1 root audio 116, 33 May 20 19:10 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq',
[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'
Do we have any hunches on how to reproduce this issue?

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed

Bug description:
  I'm not sure exactly what got me into this state, but I have several
  lxc containers that cannot be deleted.

  $ lxc info
  api_status: stable
  api_version: "1.0"
  auth: trusted
  public: false
  auth_methods:
  - tls
  environment:
    addresses: []
    architectures:
    - x86_64
    - i686
    certificate: |
      -BEGIN CERTIFICATE-
      -END CERTIFICATE-
    certificate_fingerprint: 3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
    driver: lxc
    driver_version: 3.0.1
    kernel: Linux
    kernel_architecture: x86_64
    kernel_version: 4.15.0-23-generic
    server: lxd
    server_pid: 15123
    server_version: "3.2"
    storage: zfs
    storage_version: 0.7.5-1ubuntu15
    server_clustered: false
    server_name: milhouse

  $ lxc delete --force b1
  Error: Failed to destroy ZFS filesystem: cannot destroy 'default/containers/b1': dataset is busy

  Talking in #lxc-dev, stgraber and sforeshee provided diagnosis:

   | short version is that something unshared a mount namespace causing
   | them to get a copy of the mount table at the time that dataset was
   | mounted, which then prevents zfs from being able to destroy it)

  The work around provided was

   | you can unstick this particular issue by doing:
   |  grep default/containers/b1 /proc/*/mountinfo
   | then for any of the hits, do:
   |  nsenter -t PID -m -- umount /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
   | then try the delete again

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.15.0-23-generic 4.15.0-23.25
  ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
  Uname: Linux 4.15.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.10-0ubuntu3
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  smoser 31412 F pulseaudio
   /dev/snd/controlC2:  smoser 31412 F pulseaudio
   /dev/snd/controlC0:  smoser 31412 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 28 10:42:45 2018
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2015-07-23 (1071 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
  MachineType:
   b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
   b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-23-generic N/A
   linux-backports-modules-4.15.0-23-generic  N/A
   linux-firmware                             1.174
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 03/09/2015
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
  dmi.board.asset.tag: �
  dmi.board.name: NUC5i5RYB
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H40999-503
  dmi.chassis.asset.tag: �
  dmi.chassis.type: 3
  dmi.chassis.vendor: �
  dmi.chassis.version: �
  dmi.modalias: dmi:bvnIntelCorporation:bvrRYBDWi35.86A.0246.2015.0309.1355:bd03/09/2015:svn:pn:pvr:rvnIntelCorporation:rnNUC5i5RYB:rvrH40999-503:cvn:ct3:cvr:
  dmi.product.family: �
  dmi.product.name: �
  dmi.product.version: �
  dmi.sys.vendor: �

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779156/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
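
The workaround quoted in the bug description above (grep for the dataset in /proc/*/mountinfo, then nsenter into each hit and umount) could be scripted roughly as follows. This is only an illustrative sketch: the function names `pids_with_mount` and `umount_commands` and the `proc_root` parameter are hypothetical and not part of lxd or zfs, and the sketch only builds the nsenter command strings rather than running them.

```python
import glob
import os
import shlex


def pids_with_mount(needle, proc_root="/proc"):
    """Return the sorted PIDs whose mountinfo mentions `needle`.

    Equivalent in spirit to: grep NEEDLE /proc/*/mountinfo
    `proc_root` is a hypothetical parameter so the scan can be pointed
    at a test directory instead of the real /proc.
    """
    pids = set()
    for path in glob.glob(os.path.join(proc_root, "[0-9]*", "mountinfo")):
        pid_dir = os.path.basename(os.path.dirname(path))
        if not pid_dir.isdigit():
            continue
        try:
            with open(path, "r", errors="replace") as fh:
                if any(needle in line for line in fh):
                    pids.add(int(pid_dir))
        except OSError:
            # The process exited, or we lack permission to read it.
            continue
    return sorted(pids)


def umount_commands(pids, mountpoint):
    """Build (but do not run) one nsenter/umount command per PID."""
    return [
        f"nsenter -t {pid} -m -- umount {shlex.quote(mountpoint)}"
        for pid in pids
    ]
```

Run as root, something like `umount_commands(pids_with_mount("default/containers/b1"), "/var/snap/lxd/common/lxd/storage-pools/default/containers/b1")` would list the commands to review before retrying the delete. Several PIDs usually share one mount namespace, so the same umount may be reported more than once.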
[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'
Cosmic is now end-of-life. Does this still occur on Disco?

** Changed in: linux (Ubuntu)
   Importance: High => Medium

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Colin Ian King (colin-king)

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'
[Kernel-packages] [Bug 1841747] Re: dev test in ubuntu_stress_smoke_test crashes AWS c5.large with Disco
The following, run as root, causes the instance to hang and then
reboot (by the watchdog?):

  #include <fcntl.h>   /* open(), O_RDONLY, O_NONBLOCK */
  #include <unistd.h>  /* close() */

  int main(void)
  {
          for (;;) {
                  int fd;

                  /* Repeatedly open and close the HPET device node. */
                  fd = open("/dev/hpet", O_RDONLY | O_NONBLOCK);
                  close(fd);
          }
  }

** Information type changed from Public to Public Security

** Information type changed from Public Security to Private Security

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1841747

Title:
  dev test in ubuntu_stress_smoke_test crashes AWS c5.large with Disco

Status in Stress-ng:
  New
Status in ubuntu-kernel-tests:
  New
Status in linux-aws package in Ubuntu:
  New

Bug description:
  When running the dev test in ubuntu_stress_smoke_test, the instance
  crashes and gets rebooted.

  Spotted on:
    * c3.xlarge
    * c4.large
    * m3.large
    * m4.large
    * r3.large
    * t2.small
    * x1e.xlarge

  Test output:
    09:10:12 DEBUG| [stdout] dentry RETURNED 0
    09:10:12 DEBUG| [stdout] dentry PASSED
    09:10:12 DEBUG| [stdout] dev STARTING
    packet_write_wait: Connection to 18.237.187.217 port 22: Broken pipe

  tailing syslog:
    Aug 28 09:10:07 ip-172-31-3-117 stress-ng: info: [19659] dispatching hogs: 4 dentry
    Aug 28 09:10:12 ip-172-31-3-117 stress-ng: info: [19659] successful run completed in 5.13s
    Aug 28 09:10:12 ip-172-31-3-117 stress-ng: invoked with './stress-n' by user 0
    Aug 28 09:10:12 ip-172-31-3-117 stress-ng: system: 'ip-172-31-3-117' Linux 5.0.0-1012-aws #13-Ubuntu SMP Fri Aug 2 12:25:32 UTC 2019 x86_64
    Aug 28 09:10:12 ip-172-31-3-117 stress-ng: memory (MB): total 3754.94, free 3386.23, shared 0.06, buffer 145.52, swap 1024.00, free swap 951.48
    Aug 28 09:10:12 ip-172-31-3-117 stress-ng: info: [19674] dispatching hogs: 4 dev
    Aug 28 09:10:12 ip-172-31-3-117 kernel: [ 481.406432] PM: Marking nosave pages: [mem 0x-0x0fff]
    Aug 28 09:10:12 ip-172-31-3-117 kernel: [ 481.406434] PM: Marking nosave pages: [mem 0x0009e000-0x000f]
    Aug 28 09:10:12 ip-172-31-3-117 kernel: [ 481.406437] PM: Basic memory bitmaps created
    Aug 28 09:10:12 ip-172-31-3-117 kernel: [ 481.406496] PM: Basic memory bitmaps freed
    packet_write_wait: Connection to 18.237.187.217 port 22: Broken pipe

  ProblemType: Bug
  DistroRelease: Ubuntu 19.04
  Package: linux-image-5.0.0-1012-aws 5.0.0-1012.13
  ProcVersionSignature: User Name 5.0.0-1012.13-aws 5.0.15
  Uname: Linux 5.0.0-1012-aws x86_64
  ApportVersion: 2.20.10-0ubuntu27.1
  Architecture: amd64
  Date: Wed Aug 28 09:21:28 2019
  Ec2AMI: ami-0b731fb4a9a36df8c
  Ec2AMIManifest: (unknown)
  Ec2AvailabilityZone: us-west-2c
  Ec2InstanceType: c4.large
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  SourcePackage: linux-aws
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/stress-ng/+bug/1841747/+subscriptions
[Kernel-packages] [Bug 1841747] Re: dev test in ubuntu_stress_smoke_test crashes AWS c5.large with Disco
Seems to occur when exercising /dev/hpet

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1841747

Title:
  dev test in ubuntu_stress_smoke_test crashes AWS c5.large with Disco
[Kernel-packages] [Bug 1840934] Re: Change kernel compression method to improve boot speed
@Dimitri,

For initramfs, lz4 seems to make a lot of sense; see:
https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/boot-speed-initramfs-decompression-eoan.ods

Load times for LZ4 are slower than for the previous default. However,
the faster decompression makes up for this unless one is booting off
really slow media, such as a sub-5400 RPM HDD or slow flash. So the
LZ4 default looks sane to me; let's see how it works out for Eoan.

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1840934

Title:
  Change kernel compression method to improve boot speed

Status in linux package in Ubuntu:
  Fix Committed

Bug description:
  Colin King has done some analysis of kernel boot speed using different
  kernel compression methods. Results for x86 are at:

  https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/kernel-compression-method.txt
  https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/boot-speed-compression-5.3-rc4.ods

  Testing of s390 gave the following:

    GZIP31528972
    LZ4 192348049
    LZO 85990145

  From Colin: "I used the monotonic TOD timer using the stckf opcode to
  fetch a 64 bit time value. Not sure how this maps to 'real time' in
  seconds."

  Conclusion: We should switch x86 to LZ4 and s390 to LZO. PPC and ARM
  do not support LZO or LZ4, so we will stick with gzip there.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840934/+subscriptions
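
The trade-off described in the comment above (a larger image takes longer to load but decompresses faster) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the model is simply load time plus decompression time, and the figures in the usage note are invented, not taken from the linked spreadsheets.

```python
def boot_cost(compressed_bytes, media_bytes_per_s, decompress_s):
    """Seconds spent reading one compressed image and decompressing it.

    Simplified model (an assumption): total = size / throughput + decompression.
    """
    return compressed_bytes / media_bytes_per_s + decompress_s


def break_even_throughput(big_bytes, big_decompress_s,
                          small_bytes, small_decompress_s):
    """Media throughput (bytes/s) at which both images cost the same.

    `big_*` describes the larger image that decompresses faster (LZ4 in
    the discussion above); `small_*` describes the smaller image that
    decompresses more slowly. Above the returned throughput the larger
    image wins; below it the smaller one wins.
    """
    if big_bytes <= small_bytes or small_decompress_s <= big_decompress_s:
        raise ValueError("expected a larger-but-faster vs smaller-but-slower pair")
    return (big_bytes - small_bytes) / (small_decompress_s - big_decompress_s)
```

With made-up inputs of a 60 MB image that decompresses in 0.25 s versus a 45 MB image that decompresses in 0.75 s, `break_even_throughput(60e6, 0.25, 45e6, 0.75)` gives 30e6 bytes/s: on media faster than that the larger, faster-to-decompress image boots sooner, and on slower media the smaller image does.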