[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

2019-12-03 Thread Colin Ian King
And on an arm64 platform we have something similar:

15:55:45 DEBUG| [stdout] Number of CPUs: 4
15:55:45 DEBUG| [stdout] Number of CPUs Online: 4
15:55:45 DEBUG| [stdout]  
15:55:45 DEBUG| [stdout] access STARTING
15:55:49 DEBUG| [stdout] [ 7016.776865] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:55:50 DEBUG| [stdout] access RETURNED 0
15:55:50 DEBUG| [stdout] access PASSED
15:55:50 DEBUG| [stdout] af-alg STARTING
15:55:50 DEBUG| [stdout] [ 7017.948549] cryptd: max_cpu_qlen set to 1000
15:55:55 DEBUG| [stdout] af-alg RETURNED 0
15:55:55 DEBUG| [stdout] af-alg PASSED
15:55:55 DEBUG| [stdout] affinity STARTING
15:55:59 DEBUG| [stdout] [ 7026.984742] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:56:00 DEBUG| [stdout] affinity RETURNED 0
15:56:00 DEBUG| [stdout] affinity PASSED
15:56:00 DEBUG| [stdout] aio STARTING
15:56:05 DEBUG| [stdout] aio RETURNED 0
15:56:05 DEBUG| [stdout] aio PASSED
15:56:05 DEBUG| [stdout] aiol STARTING
15:56:09 DEBUG| [stdout] [ 7037.068696] unregister_netdevice: waiting for eth0 to become free. Usage count = 1

...and a stack dump too. Protocol family 5 is AF_APPLETALK, and stress-ng
does not exercise it, so this is pretty weird and unexpected.

15:57:08 DEBUG| [stdout] [ 7096.221119] NET: Registered protocol family 5
15:57:10 DEBUG| [stdout] [ 7098.023954] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:57:20 DEBUG| [stdout] [ 7108.103839] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:57:30 DEBUG| [stdout] [ 7118.183729] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:57:41 DEBUG| [stdout] [ 7128.267622] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:57:51 DEBUG| [stdout] [ 7138.343507] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:01 DEBUG| [stdout] [ 7148.427381] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:11 DEBUG| [stdout] [ 7158.503282] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:21 DEBUG| [stdout] [ 7168.587157] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:31 DEBUG| [stdout] [ 7178.663042] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:41 DEBUG| [stdout] [ 7188.742924] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:58:51 DEBUG| [stdout] [ 7198.822826] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:01 DEBUG| [stdout] [ 7208.902688] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:11 DEBUG| [stdout] [ 7218.982579] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:21 DEBUG| [stdout] [ 7229.062465] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:31 DEBUG| [stdout] [ 7239.142348] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:42 DEBUG| [stdout] [ 7249.74] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
15:59:44 DEBUG| [stdout] [ 7251.334302] INFO: task modprobe:1184184 blocked for more than 120 seconds.
15:59:44 DEBUG| [stdout] [ 7251.335644]   Tainted: G   OE 5.4.0-7-generic #8-Ubuntu
15:59:44 DEBUG| [stdout] [ 7251.336889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
15:59:44 DEBUG| [stdout] [ 7251.338455] modprobe        D    0 1184184 1142782 0x0028
15:59:44 DEBUG| [stdout] [ 7251.338461] Call trace:
15:59:44 DEBUG| [stdout] [ 7251.338472]  __switch_to+0xe4/0x148
15:59:44 DEBUG| [stdout] [ 7251.338478]  __schedule+0x2fc/0x7c0
15:59:44 DEBUG| [stdout] [ 7251.338489]  schedule+0x3c/0xb8
15:59:44 DEBUG| [stdout] [ 7251.338501]  rwsem_down_write_slowpath+0x2e8/0x5b0
15:59:44 DEBUG| [stdout] [ 7251.338512]  down_write+0x70/0x80
15:59:44 DEBUG| [stdout] [ 7251.338525]  register_netdevice_notifier+0x4c/0x208
15:59:44 DEBUG| [stdout] [ 7251.338548]  atalk_init+0xa0/0x118 [appletalk]
15:59:44 DEBUG| [stdout] [ 7251.338570]  do_one_initcall+0x50/0x220
15:59:44 DEBUG| [stdout] [ 7251.338575]  do_init_module+0x5c/0x248
15:59:44 DEBUG| [stdout] [ 7251.338582]  load_module+0xecc/0x1170
15:59:44 DEBUG| [stdout] [ 7251.338585]  __do_sys_finit_module+0xac/0x110
15:59:44 DEBUG| [stdout] [ 7251.338587]  __arm64_sys_finit_module+0x28/0x38
15:59:44 DEBUG| [stdout] [ 7251.338591]  el0_svc_common.constprop.0+0xdc/0x1d8
15:59:44 DEBUG| [stdout] [ 7251.338593]  el0_svc_handler+0x34/0xa0
15:59:44 DEBUG| [stdout] [ 7251.338595]  el0_svc+0x10/0x14
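
To make the "protocol family 5" observation easy to check against other logs, here is a minimal sketch that maps the number in a "NET: Registered protocol family N" message to its AF_* name. The table is a hand-copied subset of the address-family constants in the kernel's include/linux/socket.h, and the function name is made up for illustration; it is not part of stress-ng or the ADT tooling.

```python
import re

# Subset of address-family numbers from include/linux/socket.h
# (copied by hand for illustration; extend as needed).
AF_NAMES = {
    1: "AF_UNIX",
    2: "AF_INET",
    4: "AF_IPX",
    5: "AF_APPLETALK",
    10: "AF_INET6",
    16: "AF_NETLINK",
    17: "AF_PACKET",
    38: "AF_ALG",
}

FAMILY_RE = re.compile(r"NET: Registered protocol family (\d+)")


def registered_families(log_lines):
    """Return (number, name) for each 'Registered protocol family' message."""
    found = []
    for line in log_lines:
        match = FAMILY_RE.search(line)
        if match:
            number = int(match.group(1))
            found.append((number, AF_NAMES.get(number, "unknown")))
    return found


sample = [
    "15:57:08 DEBUG| [stdout] [ 7096.221119] NET: Registered protocol family 5",
]
print(registered_families(sample))
```

The message is printed when a protocol module registers its socket family, which fits the atalk_init frame in the call trace above.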

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854968

Title:
  stress-ng sctp stressor breaks 5.4.0.7-8  on s390x

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  stress-ng sctp stressor breaks 5.4.0.7-8 on s390x during ADT
  regression 

[Kernel-packages] [Bug 1854968] [NEW] stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

2019-12-03 Thread Colin Ian King
Public bug reported:

stress-ng sctp stressor breaks 5.4.0.7-8 on s390x during ADT regression
testing:

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac
/autopkgtest-focal-canonical-kernel-team-
unstable/focal/s390x/l/linux/20191203_153629_d7a41@/log.gz

14:44:30 DEBUG| [stdout] sctp STARTING
14:44:30 DEBUG| [stdout] [ 3491.098762] sctp: Hash tables configured (bind 256/256)
14:44:33 DEBUG| [stdout] [ 3494.694285] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:44:43 DEBUG| [stdout] [ 3504.714324] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:44:54 DEBUG| [stdout] [ 3514.974288] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:04 DEBUG| [stdout] [ 3525.234306] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:14 DEBUG| [stdout] [ 3535.494291] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:25 DEBUG| [stdout] [ 3545.754323] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:35 DEBUG| [stdout] [ 3556.014294] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:45 DEBUG| [stdout] [ 3566.034317] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:45:55 DEBUG| [stdout] [ 3576.054296] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:05 DEBUG| [stdout] [ 3586.324332] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:15 DEBUG| [stdout] [ 3596.334306] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:25 DEBUG| [stdout] [ 3606.594337] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:36 DEBUG| [stdout] [ 3616.854305] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:46 DEBUG| [stdout] [ 3627.124323] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:46:56 DEBUG| [stdout] [ 3637.154313] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:06 DEBUG| [stdout] [ 3647.414304] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:16 DEBUG| [stdout] [ 3657.674353] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:27 DEBUG| [stdout] [ 3667.734297] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:37 DEBUG| [stdout] [ 3677.994396] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:44 DEBUG| [stdout] [ 3684.814335] INFO: task modprobe:2063628 blocked for more than 122 seconds.
14:47:44 DEBUG| [stdout] [ 3684.814345]   Tainted: P   OE 5.4.0-7-generic #8-Ubuntu
14:47:44 DEBUG| [stdout] [ 3684.814346] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
14:47:44 DEBUG| [stdout] [ 3684.814348] modprobe        D    0 2063628 2063618 0x0800
14:47:44 DEBUG| [stdout] [ 3684.814351] Call Trace:
14:47:44 DEBUG| [stdout] [ 3684.814360] ([<be310914>] __schedule+0x304/0x7b0)
14:47:44 DEBUG| [stdout] [ 3684.814362]  [<be310e0a>] schedule+0x4a/0xe0
14:47:44 DEBUG| [stdout] [ 3684.814366]  [<bdb071cc>] rwsem_down_write_slowpath+0x22c/0x530
14:47:44 DEBUG| [stdout] [ 3684.814370]  [<be14d66c>] register_pernet_subsys+0x2c/0x60
14:47:44 DEBUG| [stdout] [ 3684.814411]  [<03ff80766638>] sctp_init+0x2f0/0x520 [sctp]
14:47:44 DEBUG| [stdout] [ 3684.814414]  [<bda288c0>] do_one_initcall+0x40/0x200
14:47:44 DEBUG| [stdout] [ 3684.814416]  [<bdb594a0>] do_init_module+0x70/0x270
14:47:44 DEBUG| [stdout] [ 3684.814418]  [<bdb5b892>] load_module+0x1142/0x1440
14:47:44 DEBUG| [stdout] [ 3684.814419]  [] __do_sys_finit_module+0xa4/0xf0
14:47:44 DEBUG| [stdout] [ 3684.814421]  [] system_call+0x2aa/0x2c8
14:47:47 DEBUG| [stdout] [ 3688.014291] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:47:57 DEBUG| [stdout] [ 3698.064370] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:07 DEBUG| [stdout] [ 3708.084328] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:17 DEBUG| [stdout] [ 3718.134297] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:27 DEBUG| [stdout] [ 3728.214335] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:37 DEBUG| [stdout] [ 3738.474354] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:48 DEBUG| [stdout] [ 3748.734396] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:48:58 DEBUG| [stdout] [ 3758.744352] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:49:08 DEBUG| [stdout] [ 3768.754349] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:49:18 DEBUG| [stdout] [ 3779.014352] unregister_netdevice: waiting for lo to become free. Usage count = 1
14:49:28 

[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

2019-12-03 Thread Colin Ian King
I can't easily reproduce this on an s390 VM instance.

-- 
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

2019-12-03 Thread Colin Ian King
net/core/dev.c netdev_wait_allrefs() states:

/**
 * netdev_wait_allrefs - wait until all references are gone.
 * @dev: target net_device
 *
 * This is called when unregistering network devices.
 *
 * Any protocol or device that holds a reference should register
 * for netdevice notification, and cleanup and put back the
 * reference if they receive an UNREGISTER event.
 * We can get stuck here if buggy protocols don't correctly
 * call dev_put.
 */

...

	if (refcnt && time_after(jiffies, warning_time + 10 * HZ)) {
		pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
			 dev->name, refcnt);
		warning_time = jiffies;
	}
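
To illustrate the contract that comment describes, here is a toy user-space model in Python. It is only a sketch: the class and function names are invented, and the real kernel loop sleeps between polls and rate-limits the warning to once every 10 seconds rather than once per poll. It does show why a single holder that never calls dev_put leaves the count stuck at 1, as in the logs.

```python
class NetDevice:
    """Toy stand-in for struct net_device: a name plus a reference count."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 0

    def hold(self):
        # Analogous to dev_hold(): take a reference.
        self.refcnt += 1

    def put(self):
        # Analogous to dev_put(): drop a reference.
        self.refcnt -= 1


class Protocol:
    """Hypothetical protocol that takes one reference on the device."""

    def __init__(self, dev, releases_on_unregister):
        self.dev = dev
        self.releases_on_unregister = releases_on_unregister
        self.holding = True
        dev.hold()

    def notify_unregister(self):
        # A well-behaved protocol drops its reference on UNREGISTER;
        # a buggy one ignores the event and keeps holding it.
        if self.releases_on_unregister and self.holding:
            self.dev.put()
            self.holding = False


def wait_allrefs(dev, protocols, max_polls=3):
    """Notify holders, then collect a warning for each poll that still sees references."""
    warnings = []
    for _ in range(max_polls):
        for proto in protocols:
            proto.notify_unregister()
        if dev.refcnt == 0:
            break
        warnings.append(
            "unregister_netdevice: waiting for %s to become free. "
            "Usage count = %d" % (dev.name, dev.refcnt)
        )
    return warnings


lo = NetDevice("lo")
holders = [Protocol(lo, True), Protocol(lo, False)]  # second one is buggy
for message in wait_allrefs(lo, holders):
    print(message)
```

With one well-behaved and one buggy holder, every poll reports a usage count of 1 and the loop never completes on its own, which matches the shape of the repeated messages in the test logs.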

-- 
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

2019-12-03 Thread Colin Ian King
This makes sense as the af-alg stressor now exercises a far wider set of
crypto engines.

-- 
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

2019-12-03 Thread Colin Ian King
The unregister_netdevice message appears after the af-alg stressor
starts, so it may be a crypto algorithm that is the root cause:

14:34:33 DEBUG| [stdout] af-alg STARTING
14:34:35 DEBUG| [stdout] [ 2895.954700] unregister_netdevice: waiting for lo to become free. Usage count = 1
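
The same check can be run mechanically over a full ADT log. The sketch below assumes the log format seen in this thread ("<name> STARTING" lines from the test harness, interleaved with kernel messages) and reports which stressor had most recently started when the first unregister_netdevice warning appeared. The function name is made up for illustration.

```python
import re

START_RE = re.compile(r"\[stdout\]\s+(\S+) STARTING")
WARN_RE = re.compile(r"unregister_netdevice: waiting for (\S+)")


def stressor_at_first_warning(log_lines):
    """Return (stressor, device) for the first warning, or None if there is none."""
    current = None
    for line in log_lines:
        started = START_RE.search(line)
        if started:
            current = started.group(1)
            continue
        warned = WARN_RE.search(line)
        if warned:
            return current, warned.group(1)
    return None


sample = [
    "14:34:33 DEBUG| [stdout] af-alg STARTING",
    "14:34:35 DEBUG| [stdout] [ 2895.954700] unregister_netdevice: waiting for lo to become free. Usage count = 1",
]
print(stressor_at_first_warning(sample))
```

Because the kernel only prints this warning after the device has already been waiting for a while, the stressor named here is a hint about where to look rather than proof of the culprit.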

-- 
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1854968] Re: stress-ng sctp stressor breaks 5.4.0.7-8 on s390x

2019-12-03 Thread Colin Ian King
This stress test has not changed much lately, so I'm assuming this is a
racy kernel regression.

Last stress-sctp changes in stress-ng were:

commit 27b045a498b360ccbc761c3b62e3dd38dd744f09
Author: Colin Ian King
Date:   Sat Aug 10 13:25:34 2019 +0100

    stress-sctp: voidify unused return

    Signed-off-by: Colin Ian King

commit 29043afe6d3c2fa95d6ce22c88aa4545d070e722
Author: Colin Ian King
Date:   Wed Jun 26 12:42:00 2019 +0100

    stress-sctp: use setsockopt for more socket option exercising

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854968

Title:
  stress-ng sctp stressor breaks 5.4.0.7-8  on s390x

Status in linux package in Ubuntu:
  New

Bug description:
  stress-ng sctp stressor breaks 5.4.0.7-8 on s390x during ADT
  regression testing:

  
https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac
  /autopkgtest-focal-canonical-kernel-team-
  unstable/focal/s390x/l/linux/20191203_153629_d7a41@/log.gz

  14:44:30 DEBUG| [stdout] sctp STARTING
  14:44:30 DEBUG| [stdout] [ 3491.098762] sctp: Hash tables configured (bind 256/256)
  14:44:33 DEBUG| [stdout] [ 3494.694285] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:44:43 DEBUG| [stdout] [ 3504.714324] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:44:54 DEBUG| [stdout] [ 3514.974288] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:45:04 DEBUG| [stdout] [ 3525.234306] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:45:14 DEBUG| [stdout] [ 3535.494291] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:45:25 DEBUG| [stdout] [ 3545.754323] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:45:35 DEBUG| [stdout] [ 3556.014294] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:45:45 DEBUG| [stdout] [ 3566.034317] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:45:55 DEBUG| [stdout] [ 3576.054296] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:46:05 DEBUG| [stdout] [ 3586.324332] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:46:15 DEBUG| [stdout] [ 3596.334306] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:46:25 DEBUG| [stdout] [ 3606.594337] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:46:36 DEBUG| [stdout] [ 3616.854305] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:46:46 DEBUG| [stdout] [ 3627.124323] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:46:56 DEBUG| [stdout] [ 3637.154313] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:47:06 DEBUG| [stdout] [ 3647.414304] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:47:16 DEBUG| [stdout] [ 3657.674353] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:47:27 DEBUG| [stdout] [ 3667.734297] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:47:37 DEBUG| [stdout] [ 3677.994396] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:47:44 DEBUG| [stdout] [ 3684.814335] INFO: task modprobe:2063628 blocked for more than 122 seconds.
  14:47:44 DEBUG| [stdout] [ 3684.814345]   Tainted: P   OE 5.4.0-7-generic #8-Ubuntu
  14:47:44 DEBUG| [stdout] [ 3684.814346] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  14:47:44 DEBUG| [stdout] [ 3684.814348] modprobe D 0 2063628 2063618 0x0800
  14:47:44 DEBUG| [stdout] [ 3684.814351] Call Trace:
  14:47:44 DEBUG| [stdout] [ 3684.814360] ([<be310914>] __schedule+0x304/0x7b0)
  14:47:44 DEBUG| [stdout] [ 3684.814362]  [<be310e0a>] schedule+0x4a/0xe0
  14:47:44 DEBUG| [stdout] [ 3684.814366]  [<bdb071cc>] rwsem_down_write_slowpath+0x22c/0x530
  14:47:44 DEBUG| [stdout] [ 3684.814370]  [<be14d66c>] register_pernet_subsys+0x2c/0x60
  14:47:44 DEBUG| [stdout] [ 3684.814411]  [<03ff80766638>] sctp_init+0x2f0/0x520 [sctp]
  14:47:44 DEBUG| [stdout] [ 3684.814414]  [<bda288c0>] do_one_initcall+0x40/0x200
  14:47:44 DEBUG| [stdout] [ 3684.814416]  [<bdb594a0>] do_init_module+0x70/0x270
  14:47:44 DEBUG| [stdout] [ 3684.814418]  [<bdb5b892>] load_module+0x1142/0x1440
  14:47:44 DEBUG| [stdout] [ 3684.814419]  [<bdb5bdc4>] __do_sys_finit_module+0xa4/0xf0
  14:47:44 DEBUG| [stdout] [ 3684.814421]  [<be315fc6>] system_call+0x2aa/0x2c8
  14:47:47 DEBUG| [stdout] [ 3688.014291] unregister_netdevice: waiting for lo to become free. Usage count = 1
  14:47:57 DEBUG| [stdout] [ 3698.064370] unregister_netdevice: 

[Kernel-packages] [Bug 1854959] Re: stress-ng sysinfo stressor trips kernel oops on ppc64el with 5.4.0.7-8

2019-12-03 Thread Colin Ian King
The same occurs on 5.4.0.4-5 too, but not on 5.4.0.3-4.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854959


[Kernel-packages] [Bug 1854959] [NEW] stress-ng sysinfo stressor trips kernel oops on ppc64el with 5.4.0.7-8

2019-12-03 Thread Colin Ian King
Public bug reported:

stress-ng on ppc64el with 5.4.0.7-8, sysinfo stressor seems to tickle a
bug:

06:26:02 DEBUG| [stdout] sysinfo FAILED (kernel oopsed)
06:26:02 DEBUG| [stdout] [ 7262.965483] kernel tried to execute exec-protected page (c00017407ce0) - exploit attempt? (uid: 0)
06:26:02 DEBUG| [stdout] [ 7262.968030] BUG: Unable to handle kernel instruction fetch
06:26:02 DEBUG| [stdout] [ 7262.968121] Faulting instruction address: 0xc00017407ce0
06:26:02 DEBUG| [stdout] [ 7262.968224] Oops: Kernel access of bad area, sig: 11 [#1]
06:26:02 DEBUG| [stdout] [ 7262.968292] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
06:26:02 DEBUG| [stdout] [ 7262.968403] Modules linked in: unix_diag sctp zfs(PO) zunicode(PO) zavl(PO) icp(PO) zlua(PO) zcommon(PO) znvpair(PO) spl(O) snd_seq snd_seq_device snd_timer snd soundcore vhost_vsock vmw_vsock_virtio_transport_common vsock kvm_pr kvm hci_vhci bluetooth ecdh_generic ecc userio uhid hid vhost_net vhost tap cuse dccp_ipv4 dccp psnap llc algif_rng aegis128 algif_aead anubis fcrypt khazad seed sm4_generic tea crc32_generic md4 michael_mic nhpoly1305 poly1305_generic rmd128 rmd160 rmd256 rmd320 sha3_generic sm3_generic streebog_generic tgr192 wp512 xxhash_generic algif_hash blowfish_generic blowfish_common cast5_generic des_generic libdes salsa20_generic chacha_generic camellia_generic cast6_generic cast_common serpent_generic twofish_generic twofish_common algif_skcipher af_alg aufs binfmt_misc af_packet_diag tcp_diag udp_diag raw_diag inet_diag iptable_mangle xt_TCPMSS xt_tcpudp bpfilter dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua vmx_crypto crct10dif_vpmsum sch_fq_codel ip_tables
06:26:02 DEBUG| [stdout] [ 7262.969078]  x_tables autofs4 btrfs xor zstd_compress raid6_pq libcrc32c crc32c_vpmsum virtio_net virtio_blk net_failover failover [last unloaded: trace_printk]
06:26:02 DEBUG| [stdout] [ 7262.970416] CPU: 1 PID: 2613531 Comm: fuse_mnt Tainted: P   OE 5.4.0-7-generic #8-Ubuntu
06:26:02 DEBUG| [stdout] [ 7262.970532] NIP:  c00017407ce0 LR: c063e968 CTR: c00017407ce0
06:26:02 DEBUG| [stdout] [ 7262.970623] REGS: c001d8393810 TRAP: 0400   Tainted: P   OE  (5.4.0-7-generic)
06:26:02 DEBUG| [stdout] [ 7262.970737] MSR:  800010009033   CR: 88002440  XER: 2000
06:26:02 DEBUG| [stdout] [ 7262.970850] CFAR: c063e964 IRQMASK: 0
06:26:02 DEBUG| [stdout]GPR00: c063e944 
c001d8393aa0 c1a5bf00 c0003d95ec00 
06:26:02 DEBUG| [stdout]GPR04: c00017407c18 
   
06:26:02 DEBUG| [stdout]GPR08:  
   
06:26:02 DEBUG| [stdout]GPR12: c00017407ce0 
c0003fffee00 7c8ab4814410  
06:26:02 DEBUG| [stdout]GPR16: 7c8ab4b9 
7c8ab4810320 7c8ab2f6f240 7c8ab4814420 
06:26:02 DEBUG| [stdout]GPR20:  
 7c8aa8000b60 7c8ab4aad3a0 
06:26:02 DEBUG| [stdout]GPR24: c001f38f7da0 
c001fbb81e4c c00017407ce0 c001f38f7d80 
06:26:02 DEBUG| [stdout]GPR28: c001f38f7da0 
 c0003d95ec00 c001f38f7d70 
06:26:02 DEBUG| [stdout] [ 7262.971713] NIP [c00017407ce0] 0xc00017407ce0
06:26:02 DEBUG| [stdout] [ 7262.971804] LR [c063e968] fuse_request_end+0x128/0x2f0
06:26:02 DEBUG| [stdout] [ 7262.971893] Call Trace:
06:26:02 DEBUG| [stdout] [ 7262.971930] [c001d8393aa0] [c063e944] fuse_request_end+0x104/0x2f0 (unreliable)
06:26:02 DEBUG| [stdout] [ 7262.972035] [c001d8393af0] [c06427cc] fuse_dev_do_write+0x2cc/0x5c0
06:26:02 DEBUG| [stdout] [ 7262.972138] [c001d8393b70] [c0642f64] fuse_dev_write+0x74/0xd0
06:26:02 DEBUG| [stdout] [ 7262.972221] [c001d8393c00] [c04702b0] do_iter_readv_writev+0x240/0x290
06:26:02 DEBUG| [stdout] [ 7262.972334] [c001d8393c70] [c0472bc8] do_iter_write+0xc8/0x280
06:26:02 DEBUG| [stdout] [ 7262.972424] [c001d8393cc0] [c0472e90] vfs_writev+0xe0/0x180
06:26:02 DEBUG| [stdout] [ 7262.972508] [c001d8393dc0] [c0472fcc] do_writev+0x9c/0x1a0
06:26:02 DEBUG| [stdout] [ 7262.972588] [c001d8393e20] [c000b278] system_call+0x5c/0x68
06:26:02 DEBUG| [stdout] [ 7262.972661] Instruction dump:
06:26:02 DEBUG| [stdout] [ 7262.972716]     
    
06:26:02 DEBUG| [stdout] [ 7262.972815]     
    
06:26:02 DEBUG| [stdout] [ 7262.972919] ---[ end trace 5852d488fba4a06e ]---
06:26:02 DEBUG| [stdout] 
06:26:02 DEBUG| [stdout]

** Affects: linux (Ubuntu)
 Importance: High
 Assignee: Colin Ian King (colin-king)
 Status: In Progress

** Changed in: linux (Ubuntu

[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle

2019-11-29 Thread Colin Ian King
Hi Dean,

I've prepared another debug test kernel with 70+ of the drm patches that
were introduced between the 5.3.0-19 and 5.3.0-23 kernels removed. If
this stops the fan spinning, it implies that the regression was
introduced by a drm graphics patch.

Updated revision r2 Debian packages can be found here for testing:

https://kernel.ubuntu.com/~cking/lp-1853044/

Please test and let me know the outcome.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853044

Title:
  5.3.0-23-generic causes fans to spin when idle

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  After upgrading to 5.3.0-23-generic the fans in my machine don't stop
  running. They always sound like something is utilizing CPU - even with
  no applications running after boot.

  If I boot back to 5.3.0-19-generic it's fine.

  My microcode version is reported as 0xd4 and iucode-tool reports:

  iucode-tool: system has processor(s) with signature 0x000506e3

  Let me know if you need anything else.

  ProblemType: Bug
  DistroRelease: Ubuntu 19.10
  Package: linux-image-5.3.0-23-generic 5.3.0-23.25
  ProcVersionSignature: Ubuntu 5.3.0-23.25-generic 5.3.7
  Uname: Linux 5.3.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu8.2
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC2:  dean   2898 F pulseaudio
   /dev/snd/pcmC2D0p:   dean   2898 F...m pulseaudio
   /dev/snd/controlC0:  dean   2898 F pulseaudio
   /dev/snd/controlC1:  dean   2898 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Mon Nov 18 13:03:34 2019
  HibernationDevice: RESUME=UUID=55a42c82-50bf-4e75-a133-dbd3aa93611b
  InstallationDate: Installed on 2018-07-24 (482 days ago)
  InstallationMedia: Ubuntu 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180724)
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 i915drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.3.0-23-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
  RelatedPackageVersions:
   linux-restricted-modules-5.3.0-23-generic N/A
   linux-backports-modules-5.3.0-23-generic  N/A
   linux-firmware                            1.183.2
  SourcePackage: linux
  UpgradeStatus: Upgraded to eoan on 2019-07-19 (121 days ago)
  dmi.bios.date: 05/16/2018
  dmi.bios.vendor: Intel Corp.
  dmi.bios.version: KYSKLi70.86A.0055.2018.0516.1629
  dmi.board.name: NUC6i7KYB
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H90766-406
  dmi.chassis.type: 3
  dmi.chassis.vendor: Intel Corporation
  dmi.chassis.version: 1.0
  dmi.modalias: dmi:bvnIntelCorp.:bvrKYSKLi70.86A.0055.2018.0516.1629:bd05/16/2018:svn:pn:pvr:rvnIntelCorporation:rnNUC6i7KYB:rvrH90766-406:cvnIntelCorporation:ct3:cvr1.0:

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853044/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle

2019-11-26 Thread Colin Ian King
CPU averages:
   5.3.0-19:  2.03W, 99.6% idle, 0.1% in kernel, 93.5% in C10 state, 5.3% in C8 state, 1.66GHz
   5.3.0-23: 13.71W, 99.3% idle, 0.1% in kernel, 92.3% in C10 state, 6.1% in C8 state, 2.05GHz

GPU averages:
   5.3.0-19:  0.10W
   5.3.0-23:  7.19W

ACPI thermal zone:
   5.3.0-19:  38.92 C
   5.3.0-23:  68.65 C

So there is not much difference in CPU loading or in C10/C8 state
residency, but the CPU is clocked faster on the -23 kernel and consumes
11.7W more power. The GPU is also consuming far more power on the -23
kernel. The ACPI thermal zone is ~30 degrees hotter, hence the fan
activity.

Given that the kernel changes I provided made no difference, this looks
like an i915 regression somehow. I'll see what has changed there.
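
As a quick check of the deltas quoted above, here is a minimal sketch
that recomputes them from the averages in this comment. The variable
names are only illustrative; the figures are copied from the tables
above and nothing new is measured.

```shell
# Averages copied from the comment above (5.3.0-19 vs 5.3.0-23).
cpu_delta=$(awk 'BEGIN { printf "%.2f", 13.71 - 2.03 }')     # CPU power, W
gpu_delta=$(awk 'BEGIN { printf "%.2f", 7.19 - 0.10 }')      # GPU power, W
temp_delta=$(awk 'BEGIN { printf "%.2f", 68.65 - 38.92 }')   # ACPI thermal zone, C

echo "CPU: +${cpu_delta}W  GPU: +${gpu_delta}W  thermal zone: +${temp_delta}C"
```

The CPU and thermal zone differences round to the 11.7W and ~30 degrees
mentioned above.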

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853044


[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle

2019-11-26 Thread Colin Ian King
@Dean, just one sanity check, do you have non-integer icon scaling on
your desktop?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853044


[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-25 Thread Colin Ian King
** Also affects: linux (Ubuntu Focal)
   Importance: Critical
 Assignee: Colin Ian King (colin-king)
   Status: Confirmed

** Also affects: linux-hwe (Ubuntu Focal)
   Importance: Undecided
   Status: Invalid

** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: linux-hwe (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Also affects: linux-hwe (Ubuntu Disco)
   Importance: Undecided
   Status: New

** No longer affects: linux-hwe (Ubuntu Focal)

** No longer affects: linux-hwe (Ubuntu Eoan)

** No longer affects: linux-hwe (Ubuntu Disco)

** Changed in: linux (Ubuntu Focal)
   Status: Confirmed => In Progress

** Changed in: linux-hwe (Ubuntu Bionic)
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  In Progress
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  In Progress
Status in linux source package in Disco:
  New
Status in linux source package in Eoan:
  New
Status in linux source package in Focal:
  In Progress

Bug description:
  == SRU Justification Disco, Eoan, Focal ==

  Multiple squashfs filesystems with overlayfs cause file corruption issues
  when modifying zero-sized files.

  == Fix ==

  The current fix is pending in
  https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a

  == Test case ==

  With an Ubuntu ISO on the cdrom drive, use:

  #!/bin/bash -x
  mkdir -p /cdrom
  mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom
  sleep 1
  mkdir -p /cow
  mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow
  sleep 1
  mkdir -p /cow/upper
  mkdir -p /cow/work
  modprobe -q -b overlay
  sleep 1
  modprobe -q -b loop
  sleep 1
  dev=$(losetup -f)
  mkdir -p /filesystem.squashfs
  losetup $dev /cdrom/casper/filesystem.squashfs
  mount -t squashfs -o ro,noatime $dev /filesystem.squashfs
  sleep 1

  dev=$(losetup -f)
  mkdir -p /installer.squashfs
  losetup $dev /cdrom/casper/installer.squashfs
  mount -t squashfs -o ro,noatime $dev /installer.squashfs
  sleep 1

  mkdir -p /root-tmp
  mount -t overlay -o 'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' /cow /root-tmp

  FILE=/root-tmp/etc/.pwd.lock

  echo foo > $FILE
  cat $FILE
  sync
  #
  # dropping caches or remounting causes the bug
  #
  echo 3 > /proc/sys/vm/drop_caches
  cat $FILE

  Without the fix the final cat of the file will produce an error. With
  the fix the cat will work correctly.
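
A minimal sketch of how the final cat could be turned into an explicit
pass/fail check. check_readback is a hypothetical helper and is not part
of the original test case; it assumes the file was written with
"echo foo" as in the script above.

```shell
# Hypothetical helper: read a file back and compare it with what was written.
# $1: file to read back, $2: content written earlier
check_readback() {
    # On an affected kernel the read itself fails with an I/O error.
    if ! cr_actual=$(cat "$1" 2>/dev/null); then
        echo "FAIL: cannot read $1"
        return 1
    fi
    if [ "$cr_actual" != "$2" ]; then
        echo "FAIL: $1 does not contain the expected data"
        return 1
    fi
    echo "PASS: $1"
}

# Intended use, in place of the last "cat $FILE" of the test case:
#   check_readback "$FILE" foo
```

The helper only reports the result; dropping caches or remounting still
has to happen before it is called, as in the test case.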

  == Regression Potential ==

  There is an unhandled corner case:
  - two filesystems, A and B, both have null uuid
  - upper layer is on A
  - lower layer 1 is also on A
  - lower layer 2 is on B

  However, this case is already broken without the fix and will be
  addressed by subsequent fixes once they are accepted upstream, so I
  think the risk is minimal: nobody is complaining about these corner
  cases with the current broken overlayfs squashfs layering.

  ---

  1) Download focal subiquity pending image, or eoan release image
  2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
  3) After --- insert the following options

     break=top debug init=/bin/bash

  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) in the initramfs execute:

  rm /scripts/casper-bottom/25adduser
  exit

  6) you will be dropped into the pivoted root filesystem, before systemd is
  exec'd as pid one
  7) /run/initramfs/ will contain a debug log showing how everything was
  mounted, i.e. cdrom mounted, squashfs losetup from there, then multilower
  overlay set up from them, moved to /root, and then pivot-root to /root done
  to finally end up as /. Underlying layers are moved into /cow for your
  convenience.

  8) At this point, modifying zero-byte length files that exist in the
  lowest layer but not the middle one, in certain ways, will result in
  them being corrupted after / is remounted.

  9) Corruption examples

  (On both focal & eoan)

  cat /etc/.pwd.lock
  systemd-sysusers
  cat /etc/.pwd.lock
  mount -o remount /
  cat /etc/.pwd.lock
  overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
  cat: /etc/.pwd.lock: Input/output error

  (Only on eoan)

  cat /etc/machine-id
  systemd-machine-id-setup
  cat /etc/machine-id
  mount -o remount /
  cat /etc/machine-id
  overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
  cat: /etc/machine-id: Input/output error

  Lots of things break once machine-id and .pwd.lock are corrupted. I.e.
  unable to dhcp, connect to dbus, a

[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD

2019-11-25 Thread Colin Ian King
Thanks Witold! Much appreciated.

** Tags added: verification-done verification-done-eoan

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1852406

Title:
  Double-escape in initramfs DECRYPT_CMD

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Eoan:
  Fix Committed
Status in zfs-linux source package in Focal:
  Fix Released

Bug description:
  == SRU Justification, Eoan ==

  initramfs/scripts/zfs.in incorrectly quotes ${ENCRYPTIONROOT} on line
  414:

  DECRYPT_CMD="${ZFS} load-key '${ENCRYPTIONROOT}'"

  This is OK when the line is executed by the shell, such as in line 430
  or 436, but when plymouth is used it results in plymouth executing "zfs
  load-key 'rpool'" - and zfs is unable to find a pool called "'rpool'".

  If I understand
  https://docs.oracle.com/cd/E23824_01/html/821-1448/gbcpt.html
  correctly, a zfs pool name is always 'shell-friendly', so removing the
  quotation marks would be a proper fix for that.
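
The difference between the two execution paths can be sketched as
follows. This is only an illustration: show_args is a hypothetical
stand-in for zfs, 'rpool' is an example pool name, and the second case
assumes the command string is split on whitespace without a shell
interpreting the quotes, which is what the plymouth behaviour described
above suggests.

```shell
# Hypothetical stand-in for zfs: print each argument it receives in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

ENCRYPTIONROOT=rpool
DECRYPT_CMD="show_args load-key '${ENCRYPTIONROOT}'"

# 1. Re-parsed by a shell: the single quotes are syntax and are removed.
via_eval=$(eval "${DECRYPT_CMD}")

# 2. Split on whitespace only (plain word splitting of the unquoted
#    variable): the single quotes stay in the argument as literal data.
via_split=$(${DECRYPT_CMD})

printf 'shell parsed:\n%s\nword split:\n%s\n' "$via_eval" "$via_split"
```

In the first case the command sees rpool; in the second it sees 'rpool'
including the quote characters, which matches the pool name zfs fails to
find.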

  == Fix ==

  One line fix as attached in https://bugs.launchpad.net/ubuntu/+source
  /zfs-linux/+bug/1852406/comments/1

  == Test ==

  Boot with an encrypted dataset with plymouth. Without the fix zfs is
  unable to find the root encrypted pool. With the fix this works.

  == Regression Potential ==

  This just affects the encrypted dataset that holds the key for the
  root dataset. This is currently broken because of the bug, so the
  benefit of the fix outweighs the risk of changing it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1852406/+subscriptions


[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD

2019-11-25 Thread Colin Ian King
I was hoping you could test the version in -proposed. Unless it is
verified as fixed, the fix won't be released for Eoan.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1852406


[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD

2019-11-25 Thread Colin Ian King
@Witold, is it possible for you to sanity check this? If it's not
verified, it won't be fixed.

thanks

Colin

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1852406

Title:
  Double-escape in initramfs DECRYPT_CMD

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Eoan:
  Fix Committed
Status in zfs-linux source package in Focal:
  Fix Released

Bug description:
  == SRU Justification, Eoan ==

  initramfs/scripts/zfs.in incorrectly quotes ${ENCRYPTIONROOT} on line
  414:

  DECRYPT_CMD="${ZFS} load-key '${ENCRYPTIONROOT}'"

  This is OK when the line is executed by shell, such as in line 430 or
  436, but when plymouth is used it results in plymouth executing "zfs
  load-key 'rpool'" - and zfs  is unable to find pool called "'rpool'".

  If I understand
  https://docs.oracle.com/cd/E23824_01/html/821-1448/gbcpt.html
  correctly zfs pool name is always 'shell-friendly', so removing the
  quotation marks would be a proper fix for that.

  == Fix ==

  One line fix as attached in https://bugs.launchpad.net/ubuntu/+source
  /zfs-linux/+bug/1852406/comments/1

  == Test ==

  Boot with encrypted data set with plymouth. Without the fix zfs is
  unable to find the root encrypted pool. With the fix this works.

  == Regression Potential ==

  This just affects the encrypted dataset that holds the key for the root
  dataset. This is currently broken because of the bug, so the benefit
  of the fix outweighs its regression risk.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1852406/+subscriptions



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-22 Thread Colin Ian King
** Description changed:

+ == SRU Justification Disco, Eoan, Focal ==
+ 
+ Multiple squashfs filesystems with overlayfs cause file corruption issues
+ when modifying zero sized files 
+ 
+ == Fix ==
+ 
+ The current fix is pending in
+ https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a
+ 
+ == Test case ==
+ 
+ With an Ubuntu ISO on the cdrom drive, use:
+ 
+ #!/bin/bash -x
+ mkdir -p /cdrom
+ mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom
+ sleep 1
+ mkdir -p /cow
+ mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow
+ sleep 1
+ mkdir -p /cow/upper
+ mkdir -p /cow/work
+ modprobe -q -b overlay
+ sleep 1
+ modprobe -q -b loop
+ sleep 1
+ dev=$(losetup -f)
+ mkdir -p /filesystem.squashfs
+ losetup $dev /cdrom/casper/filesystem.squashfs
+ mount -t squashfs -o ro,noatime $dev /filesystem.squashfs
+ sleep 1
+ 
+ dev=$(losetup -f)
+ mkdir -p /installer.squashfs
+ losetup $dev /cdrom/casper/installer.squashfs
+ mount -t squashfs -o ro,noatime $dev /installer.squashfs
+ sleep 1
+ 
+ mkdir -p /root-tmp
+ mount -t overlay -o 'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' /cow /root-tmp
+ 
+ FILE=/root-tmp/etc/.pwd.lock
+ 
+ echo foo > $FILE
+ cat $FILE
+ sync
+ #
+ # dropping caches or remounting causes the bug
+ #
+ echo 3 > /proc/sys/vm/drop_caches
+ cat $FILE
+ 
+ Without the fix the cat of the file will produce an error. With the fix
+ the cat will work correctly.
+ 
+ == Regression Potential ==
+ 
+ There is an unhandled corner case:
+ - two filesystems, A and B, both have null uuid
+ - upper layer is on A
+ - lower layer 1 is also on A
+ - lower layer 2 is on B
+ 
+ However, this is already an issue without the fix, and it will be
+ addressed later with subsequent fixes once they are accepted upstream.
+ I think the risk is minimal, considering nobody is complaining about
+ these corner cases with the current broken overlayfs squashfs layering.
+ 
+ ---
+ 
  1) Download focal subiquity pending image, or eoan release image
  2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
  3) After --- insert the following options
  
     break=top debug init=/bin/bash
  
  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) in the initramfs execute:
  
  rm /scripts/casper-bottom/25adduser
  exit
  
  6) you will be dropped into the pivoted root filesystem, before systemd is
  exec'ed as pid one
  7) /run/initramfs/ will contain a debug log, showing how everything was
  mounted, i.e. cdrom mounted, squashfs losetup from there, then multilower
  overlay set up from them, moved to /root, and then pivot-root to /root done
  to finally end up as /. Underlying layers are moved into /cow for your
  convenience.
  
  8) At this point modifying zero-byte length files that exist in the
  lowest layer, but not the middle one, in certain ways will result in
  them being corrupted after / is remounted.
  
  9) Corruption examples
  
  (On both focal & eoan)
  
  cat /etc/.pwd.lock
  systemd-sysusers
  cat /etc/.pwd.lock
  mount -o remount /
  cat /etc/.pwd.lock
  overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
  cat: /etc/.pwd.lock: Input/output error
  
  (Only on eoan)
  
  cat /etc/machine-id
  systemd-machine-id-setup
  cat /etc/machine-id
  mount -o remount /
  cat /etc/machine-id
  overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
  cat: /etc/machine-id: Input/output error
  
  Lots of things break once machine-id and .pwd.lock are corrupted. I.e.
  unable to dhcp, connect to dbus, add/remove/change users or groups, etc.
  
  We were unable to recreate the issue outside of booting things with
  casper, i.e. statically on a regular host machine without pivot-root. But
  hopefully booting to a quiet state with nothing running is sufficient to
  reproduce this.
  
  Instead of booting with `bebroken init=/bin/bash` you can boot with
  `bebroken systemd.mask=systemd-remount-fs.service`; this will complete
  the boot, with /etc/machine-id & .pwd.lock modified, meaning that
  remount of / will cause IO errors on those files.
  
  Currently, we are shipping two hacks in casper's 25adduser script to
  "rm" the offending files, and create them again on the upper rw layer.
  They then survive remount without i/o errors. However, we'd rather not
  ship those hacks, and have kernel overlay fixed to work correctly with
  multi-lower-dir and not corrupt files upon remounting /.

** Description changed:

  == SRU Justification Disco, Eoan, Focal ==
  
  Multiple squashfs filesystems with overlayfs cause file corruption issues
- when modifying zero sized files 
+ when modifying zero sized files
  
  == Fix ==
  
  The current fix is pending in
  
https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a
  
  == Test case ==
  
  With an Ubuntu ISO on the cdrom drive, use:
  
  #!/bin/bash -x
  mkdir -p /cdrom
  mount -t iso9660 -o ro,noatime /dev/sr0 

[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-22 Thread Colin Ian King
I'm doing some testing right now on the current upstream fix, hopefully
will SRU this by EOD.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed

Bug description:
  1) Download focal subiquity pending image, or eoan release image
  2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
  3) After --- insert the following options

     break=top debug init=/bin/bash

  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) in the initramfs execute:

  rm /scripts/casper-bottom/25adduser
  exit

  6) you will be dropped into the pivoted root filesystem, before systemd is
  exec'ed as pid one
  7) /run/initramfs/ will contain a debug log, showing how everything was
  mounted, i.e. cdrom mounted, squashfs losetup from there, then multilower
  overlay set up from them, moved to /root, and then pivot-root to /root done
  to finally end up as /. Underlying layers are moved into /cow for your
  convenience.

  8) At this point modifying zero-byte length files that exist in the
  lowest layer, but not the middle one, in certain ways will result in
  them being corrupted after / is remounted.

  9) Corruption examples

  (On both focal & eoan)

  cat /etc/.pwd.lock
  systemd-sysusers
  cat /etc/.pwd.lock
  mount -o remount /
  cat /etc/.pwd.lock
  overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
  cat: /etc/.pwd.lock: Input/output error

  (Only on eoan)

  cat /etc/machine-id
  systemd-machine-id-setup
  cat /etc/machine-id
  mount -o remount /
  cat /etc/machine-id
  overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
  cat: /etc/machine-id: Input/output error

  Lots of things break once machine-id and .pwd.lock are corrupted. I.e.
  unable to dhcp, connect to dbus, add/remove/change users or groups,
  etc.

  We were unable to recreate the issue outside of booting things with
  casper, i.e. statically on a regular host machine without pivot-root.
  But hopefully booting to a quiet state with nothing running is
  sufficient to reproduce this.

  Instead of booting with `bebroken init=/bin/bash` you can boot with
  `bebroken systemd.mask=systemd-remount-fs.service`; this will complete
  the boot, with /etc/machine-id & .pwd.lock modified, meaning that
  remount of / will cause IO errors on those files.

  Currently, we are shipping two hacks in casper's 25adduser script to
  "rm" the offending files, and create them again on the upper rw layer.
  They then survive remount without i/o errors. However, we'd rather not
  ship those hacks, and have kernel overlay fixed to work correctly with
  multi-lower-dir and not corrupt files upon remounting /.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824407/+subscriptions



[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle

2019-11-21 Thread Colin Ian King
I've found 3 commits that may have contributed to this regression. Can
you install the kernel headers, image and module debs from
https://kernel.ubuntu.com/~cking/lp-1853044/ and see if this helps fix
the issue?


** Changed in: linux (Ubuntu)
   Status: In Progress => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853044

Title:
  5.3.0-23-generic causes fans to spin when idle

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  After upgrading to 5.3.0-23-generic the fans in my machine don't stop
  running. They always sound like something is utilizing CPU - even with
  no applications running after boot.

  If I boot back to 5.3.0-19-generic it's fine.

  My microcode version is reported as 0xd4 and iucode-tool reports:

  iucode-tool: system has processor(s) with signature 0x000506e3

  Let me know if you need anything else.

  ProblemType: Bug
  DistroRelease: Ubuntu 19.10
  Package: linux-image-5.3.0-23-generic 5.3.0-23.25
  ProcVersionSignature: Ubuntu 5.3.0-23.25-generic 5.3.7
  Uname: Linux 5.3.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu8.2
  Architecture: amd64
  AudioDevicesInUse:
                        USER    PID ACCESS COMMAND
   /dev/snd/controlC2:  dean   2898 F pulseaudio
   /dev/snd/pcmC2D0p:   dean   2898 F...m pulseaudio
   /dev/snd/controlC0:  dean   2898 F pulseaudio
   /dev/snd/controlC1:  dean   2898 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Mon Nov 18 13:03:34 2019
  HibernationDevice: RESUME=UUID=55a42c82-50bf-4e75-a133-dbd3aa93611b
  InstallationDate: Installed on 2018-07-24 (482 days ago)
  InstallationMedia: Ubuntu 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180724)
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 i915drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.3.0-23-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
  RelatedPackageVersions:
   linux-restricted-modules-5.3.0-23-generic N/A
   linux-backports-modules-5.3.0-23-generic  N/A
   linux-firmware                            1.183.2
  SourcePackage: linux
  UpgradeStatus: Upgraded to eoan on 2019-07-19 (121 days ago)
  dmi.bios.date: 05/16/2018
  dmi.bios.vendor: Intel Corp.
  dmi.bios.version: KYSKLi70.86A.0055.2018.0516.1629
  dmi.board.name: NUC6i7KYB
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H90766-406
  dmi.chassis.type: 3
  dmi.chassis.vendor: Intel Corporation
  dmi.chassis.version: 1.0
  dmi.modalias: dmi:bvnIntelCorp.:bvrKYSKLi70.86A.0055.2018.0516.1629:bd05/16/2018:svn:pn:pvr:rvnIntelCorporation:rnNUC6i7KYB:rvrH90766-406:cvnIntelCorporation:ct3:cvr1.0:

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853044/+subscriptions



[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle

2019-11-21 Thread Colin Ian King
Also, when the fan is running at high speed, can you run the following:

sudo apt-get install acpi
acpi -V

and add the output to the bug report?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853044

Title:
  5.3.0-23-generic causes fans to spin when idle

Status in linux package in Ubuntu:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853044/+subscriptions



[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle

2019-11-20 Thread Colin Ian King
Hi Dean,

As a first triaging step, do you mind installing powerstat and running
the following command on both the 5.3.0-23-generic and the
5.3.0-19-generic kernels:

powerstat -Ra | tee powerstat-$(uname -r).log

and attaching the log files to the bug report. The command takes about
60 seconds to run.

thanks.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853044

Title:
  5.3.0-23-generic causes fans to spin when idle

Status in linux package in Ubuntu:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853044/+subscriptions



[Kernel-packages] [Bug 1853044] Re: 5.3.0-23-generic causes fans to spin when idle

2019-11-20 Thread Colin Ian King
** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: linux (Ubuntu)
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853044

Title:
  5.3.0-23-generic causes fans to spin when idle

Status in linux package in Ubuntu:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853044/+subscriptions



[Kernel-packages] [Bug 1853130] Re: zfs-dkms-0.7.5-1ubuntu15 fail build kernel-hwe kernel 5.0.0-36-generic

2019-11-19 Thread Colin Ian King
No problem at all. I'll close this bug if that's OK.

** Changed in: zfs-linux (Ubuntu)
   Status: Triaged => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853130

Title:
  zfs-dkms-0.7.5-1ubuntu15 fail build kernel-hwe kernel 5.0.0-36-generic

Status in zfs-linux package in Ubuntu:
  Won't Fix

Bug description:
  zfs-dkms fails to build with kernel 5.0.0-36-generic:
   scripts/Makefile.build:284: recipe for target '/var/lib/dkms/spl/0.7.5/build/module/spl/spl-condvar.o' failed
   make[5]: *** [/var/lib/dkms/spl/0.7.5/build/module/spl/spl-condvar.o] Error 1
   make[5]: *** Waiting for unfinished jobs
   In file included from /var/lib/dkms/spl/0.7.5/build/include/sys/kstat.h:31:0,
                    from /var/lib/dkms/spl/0.7.5/build/module/spl/spl-kstat.c:28:
   /var/lib/dkms/spl/0.7.5/build/include/sys/time.h: In function ‘gethrestime’:
   /var/lib/dkms/spl/0.7.5/build/include/sys/time.h:63:9: error: implicit declaration of function ‘current_kernel_time’; did you mean ‘current_time’? [-Werror=implicit-function-declaration]
     *now = current_kernel_time();
            ^~~
            current_time
   /var/lib/dkms/spl/0.7.5/build/include/sys/time.h:63:7: error: incompatible types when assigning to type ‘timestruc_t {aka struct timespec}’ from type ‘int’
     *now = current_kernel_time();
          ^
   /var/lib/dkms/spl/0.7.5/build/include/sys/time.h: In function ‘gethrestime_sec’:
   /var/lib/dkms/spl/0.7.5/build/include/sys/time.h:70:5: error: incompatible types when assigning to type ‘struct timespec’ from type ‘int’
     ts = current_kernel_time();
        ^
     CC [M]  /var/lib/dkms/spl/0.7.5/build/module/splat/splat-linux.o
   cc1: some warnings being treated as errors
   scripts/Makefile.build:284: recipe for target '/var/lib/dkms/spl/0.7.5/build/module/spl/spl-kstat.o' failed
   make[5]: *** [/var/lib/dkms/spl/0.7.5/build/module/spl/spl-kstat.o] Error 1

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1853130/+subscriptions



[Kernel-packages] [Bug 1853130] Re: zfs-dkms-0.7.5-1ubuntu15 fail build kernel-hwe kernel 5.0.0-36-generic

2019-11-19 Thread Colin Ian King
The hwe (and bionic) kernels come with zfs and spl modules already
provided, so there is no need for zfs-dkms and spl-dkms, e.g.

cking@bionic-amd64:~$ uname -a
Linux bionic-amd64 5.0.0-36-generic #39~18.04.1-Ubuntu SMP Tue Nov 12 11:09:50 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
cking@bionic-amd64:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.3 LTS
Release:        18.04
Codename:       bionic
cking@bionic-amd64:~$ dpkg -l | grep zfs-dkms
cking@bionic-amd64:~$ dmesg | grep ZFS
[   16.209504] ZFS: Loaded module v0.7.12-1ubuntu5, ZFS pool version 5000, ZFS filesystem version 5
cking@bionic-amd64:~$ lsmod | grep zfs
zfs                  3035136  8
zunicode              331776  1 zfs
zavl                   16384  1 zfs
icp                   258048  1 zfs
zcommon                65536  1 zfs
znvpair                77824  2 zfs,zcommon
spl                   102400  4 zfs,icp,znvpair,zcommon

so just remove zfs-dkms and you can still use zfs on the HWE 5.0.x
kernel on Bionic.
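
The same check can be scripted. The following is a minimal sketch, not
an official tool: has_bundled_zfs is a hypothetical helper, and it
assumes that modules shipped with the kernel live under
/lib/modules/<version>/kernel while DKMS-built ones are installed
elsewhere (for example under updates/dkms).

```shell
#!/bin/sh
# Sketch: report whether a zfs module ships in the kernel's own module tree.
# Assumption: bundled modules live under <moddir>/kernel, while DKMS-built
# ones are installed elsewhere (for example <moddir>/updates/dkms).
has_bundled_zfs() {
    moddir=${1:-/lib/modules/$(uname -r)}
    # Succeeds only if find prints at least one matching module file.
    [ -n "$(find "$moddir/kernel" -name 'zfs.ko*' 2>/dev/null)" ]
}

if has_bundled_zfs; then
    echo "zfs module is provided with the kernel; zfs-dkms is not needed"
else
    echo "no bundled zfs module found for $(uname -r)"
fi
```

An optional argument lets the helper be pointed at another kernel's
module directory instead of the running one.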

** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => Wishlist

** Changed in: zfs-linux (Ubuntu)
 Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: zfs-linux (Ubuntu)
   Status: New => Triaged

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853130

Title:
  zfs-dkms-0.7.5-1ubuntu15 fail build kernel-hwe kernel 5.0.0-36-generic

Status in zfs-linux package in Ubuntu:
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1853130/+subscriptions



[Kernel-packages] [Bug 1771091] Re: zpool freezes importing older ZFS pool, blocks shotdown and system does not boot

2019-11-18 Thread Colin Ian King
This bug has not been updated with further information requested in
question 4 for over a year. Marking as Won't Fix.

** Changed in: zfs-linux (Ubuntu)
   Status: Incomplete => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1771091

Title:
  zpool freezes importing older ZFS pool, blocks shotdown and system
  does not boot

Status in zfs-linux package in Ubuntu:
  Won't Fix

Bug description:
  After a fresh install of xubuntu 18.04 LTS 64-bit and the installation
  of zfs-dkms, I tried to do 'zpool import' on an older ZFS pool,
  consisting of one partition on a separate PATA HDD.

  After issuing 'sudo zpool import ', the command freezes (as do zfs
  commands).
  The system then fails to shut down properly, seems locked and needs a
  hard reboot (actually it waits up to half an hour to shut down).
  After restarting, the system displays the Xubuntu splash screen and does
  not boot anymore (it actually resets itself if given again half an hour
  or so).

  When getting to the rescue options, by pressing the SHIFT key on the
  keyboard, going to a shell and remounting / read-write, I could remove
  the ZFS Ubuntu packages, and after that the system could boot.

  A useful message I got when trying to continue booting in the shell was:
  "[ 40.811792] VERIFY3(0 == remove_reference(hdr,  ((void *)0), tag)) failed (0 = 0)
  [ 40.811856] PANIC at arc.c:3084:arc_buf_destroy()"

  So it points to some ZFS bug with ARC.

  Previously, I was able to (unlike with 17.10) upgrade from 17.10 to 18.04
  and to import and use a newer ZFS pool.
  But this bug is about a fresh 18.04 install and an older ZFS pool (zpool
  import says the pool can be upgraded).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1771091/+subscriptions



[Kernel-packages] [Bug 1846486] Re: revert the revert of ext4: make __ext4_get_inode_loc plug

2019-11-13 Thread Colin Ian King
** Changed in: linux (Ubuntu)
   Status: In Progress => Fix Committed

** Changed in: linux (Ubuntu)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1846486

Title:
  revert the revert of ext4: make __ext4_get_inode_loc plug

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Eoan:
  Fix Released

Bug description:
  == SRU Justification Eoan ==

  Now that 5.4 contains a fix for the bootup regression caused by the lack
  of entropy at boot, we should apply this fix and also apply the upstream
  commit that reverts "Revert "ext4: make __ext4_get_inode_loc plug"".

  == Fix ==

  So, to clarify, apply the two upstream 5.4-rc commits:

  commit 50ee7529ec4500c88f8664560770a7a1b65db72b
  Author: Linus Torvalds 
  Date:   Sat Sep 28 16:53:52 2019 -0700

  random: try to actively add entropy rather than passively wait for
  it

  commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
  Author: Linus Torvalds 
  Date:   Sun Sep 29 17:59:23 2019 -0700

  Revert "Revert "ext4: make __ext4_get_inode_loc plug""

  I've benchmarked the Eoan kernel with these two patches and found the
  following speed improvements on an i7-3770 CPU @ 3.40GHz with 8GB of
  memory and a WDC WD10EZEX-21WN4A HDD (7200RPM, 64MB cache):

  git grep of the kernel: 0.14%
  building fwts: 0.40%
  building stress-ng: 0.45%
  tar up kernel source: 7.6%
  boot time of eoan cloud image: 10.5%

  So I think the speed improvements justify the SRU.
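
  For anyone repeating such measurements, a percentage like the ones
  above can be derived from a pair of timings. This is a minimal sketch:
  pct_improvement is a hypothetical helper, the formula is an assumption
  (the report does not state how its figures were computed), and the
  timings in the example call are made up.

```shell
#!/bin/sh
# Percentage improvement of an "after" time relative to a "before" time.
# The formula is an assumption; the report does not state how its figures
# were derived.
pct_improvement() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (before - after) / before * 100 }'
}

# Made-up timings in seconds, for illustration only.
pct_improvement 200 179
```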

  == Regression potential ==

  This is a minor change to ext4, which has been regression tested, so
  the risk here is small. The entropy change will alter the random number
  generation, but I believe this does not change the cryptographic
  security of the random numbers being generated, so I think this change
  is not a security risk.

  Originally the ext4 change caused boot time user space regressions
  because of the entropy change of this fix, but the random fix
  addresses this, so I believe this risk is now zero.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1846486/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD

2019-11-13 Thread Colin Ian King
Yes, minimal impact and reducing regression risk is key in SRUs.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1852406

Title:
  Double-escape in initramfs DECRYPT_CMD

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Eoan:
  In Progress
Status in zfs-linux source package in Focal:
  Fix Released

Bug description:
  == SRU Justification, Eoan ==

  initramfs/scripts/zfs.in incorrectly quotes ${ENCRYPTIONROOT} on line
  414:

  DECRYPT_CMD="${ZFS} load-key '${ENCRYPTIONROOT}'"

  This is OK when the line is executed by a shell, such as in line 430 or
  436, but when plymouth is used it results in plymouth executing "zfs
  load-key 'rpool'" - and zfs is unable to find a pool called "'rpool'".

  If I understand
  https://docs.oracle.com/cd/E23824_01/html/821-1448/gbcpt.html
  correctly, a zfs pool name is always 'shell-friendly', so removing the
  quotation marks would be a proper fix for that.
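
The difference between the two execution paths can be reproduced outside
the initramfs. The following is a minimal sketch, not the actual zfs.in
code: show_args is a hypothetical stand-in for the zfs binary, eval stands
in for the lines that run DECRYPT_CMD through the shell, and plain word
splitting approximates (as an assumption) how a non-shell caller such as
plymouth would break the string into arguments.

```shell
# Hypothetical stand-in for zfs: print every argument it receives in brackets.
show_args() {
    for a in "$@"; do
        printf '[%s]' "$a"
    done
    printf '\n'
}

ENCRYPTIONROOT=rpool
DECRYPT_CMD="show_args load-key '${ENCRYPTIONROOT}'"

# Evaluated by a shell: the single quotes are syntax and are removed.
via_shell() {
    eval "${DECRYPT_CMD}"
}

# Split on whitespace only, with no quote processing (assumed to resemble a
# non-shell caller): the quote characters stay part of the argument.
via_plain_split() {
    ${DECRYPT_CMD}
}

via_shell        # prints [load-key][rpool]
via_plain_split  # prints [load-key]['rpool']
```

In the second call the pool name arrives with the quote characters
attached, which matches the "'rpool'" lookup failure described above.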

  == Fix ==

  One line fix as attached in https://bugs.launchpad.net/ubuntu/+source
  /zfs-linux/+bug/1852406/comments/1

  == Test ==

  Boot with encrypted data set with plymouth. Without the fix zfs is
  unable to find the root encrypted pool. With the fix this works.

  == Regression Potential ==

  This just affects the encrypted dataset that holds the key for the root
  dataset; this is currently broken because of the bug, so the benefit of
  the fix outweighs its risk.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1852406/+subscriptions



[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD

2019-11-13 Thread Colin Ian King
** Description changed:

+ == SRU Justification, Eoan ==
+ 
  initramfs/scripts/zfs.in incorrectly quotes ${ENCRYPTIONROOT} on line
  414:
  
  DECRYPT_CMD="${ZFS} load-key '${ENCRYPTIONROOT}'"
  
  This is OK when the line is executed by shell, such as in line 430 or
  436, but when plymouth is used it results in plymouth executing "zfs
  load-key 'rpool'" - and zfs  is unable to find pool called "'rpool'".
  
  If I understand
  https://docs.oracle.com/cd/E23824_01/html/821-1448/gbcpt.html correctly
  zfs pool name is always 'shell-friendly', so removing the quotation
  marks would be a proper fix for that.
+ 
+ == Fix ==
+ 
+ One line fix as attached in https://bugs.launchpad.net/ubuntu/+source
+ /zfs-linux/+bug/1852406/comments/1
+ 
+ == Test ==
+ 
+ Boot with encrypted data set with plymouth. Without the fix zfs is
+ unable to find the root encrypted pool. With the fix this works.
+ 
+ == Regression Potential ==
+ 
+ This just affects the encrypted dataset that holds the key for the root
+ dataset; this is currently broken because of the bug, so the benefit of
+ the fix outweighs its risk.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1852406

Title:
  Double-escape in initramfs DECRYPT_CMD

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Eoan:
  In Progress
Status in zfs-linux source package in Focal:
  Fix Released



[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD

2019-11-13 Thread Colin Ian King
Fix required only in zfs-linux-0.8.1 in Eoan.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1852406

Title:
  Double-escape in initramfs DECRYPT_CMD

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Eoan:
  In Progress
Status in zfs-linux source package in Focal:
  Fix Released



[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD

2019-11-13 Thread Colin Ian King
Fixed in zfs-0.8.2 in focal.

** Also affects: zfs-linux (Ubuntu Focal)
   Importance: Medium
 Assignee: Colin Ian King (colin-king)
   Status: Triaged

** Changed in: zfs-linux (Ubuntu Focal)
   Status: Triaged => Fix Released

** Also affects: zfs-linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Changed in: zfs-linux (Ubuntu Eoan)
   Status: New => In Progress

** Changed in: zfs-linux (Ubuntu Eoan)
   Importance: Undecided => Medium

** Changed in: zfs-linux (Ubuntu Eoan)
 Assignee: (unassigned) => Colin Ian King (colin-king)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1852406

Title:
  Double-escape in initramfs DECRYPT_CMD

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Eoan:
  In Progress
Status in zfs-linux source package in Focal:
  Fix Released



[Kernel-packages] [Bug 1852406] Re: Double-escape in initramfs DECRYPT_CMD

2019-11-13 Thread Colin Ian King
Thanks for the patch. Any specific version of zfs-linux this relates to?

** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: zfs-linux (Ubuntu)
 Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: zfs-linux (Ubuntu)
   Status: New => Triaged

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1852406

Title:
  Double-escape in initramfs DECRYPT_CMD

Status in zfs-linux package in Ubuntu:
  Triaged



[Kernel-packages] [Bug 1851749] Re: Frequently getting thermal warnings and cpu throttling messages in syslog

2019-11-08 Thread Colin Ian King
** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: linux (Ubuntu)
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1851749

Title:
  Frequently getting thermal warnings and cpu throttling messages in
  syslog

Status in linux package in Ubuntu:
  In Progress

Bug description:
  Nov  6 11:34:26 fog kernel: [1129655.443564] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 50300)
  Nov  6 11:34:26 fog kernel: [1129655.443565] mce: CPU2: Core temperature above threshold, cpu clock throttled (total events = 50300)
  Nov  6 11:34:26 fog kernel: [1129655.443567] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 58637)
  Nov  6 11:34:26 fog kernel: [1129655.443568] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 58637)
  Nov  6 11:34:26 fog kernel: [1129655.443569] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 58637)
  Nov  6 11:34:26 fog kernel: [1129655.443570] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 58637)
  Nov  6 11:34:26 fog kernel: [1129655.446528] mce: CPU2: Core temperature/speed normal
  Nov  6 11:34:26 fog kernel: [1129655.446529] mce: CPU0: Core temperature/speed normal
  Nov  6 11:34:26 fog kernel: [1129655.446530] mce: CPU1: Package temperature/speed normal
  Nov  6 11:34:26 fog kernel: [1129655.446531] mce: CPU3: Package temperature/speed normal
  Nov  6 11:34:26 fog kernel: [1129655.446531] mce: CPU0: Package temperature/speed normal
  Nov  6 11:34:26 fog kernel: [1129655.446532] mce: CPU2: Package temperature/speed normal
  Nov  6 11:40:35 fog kernel: [1130024.427390] mce: CPU0: Core temperature above threshold, cpu clock throttled (total events = 50316)
  Nov  6 11:40:35 fog kernel: [1130024.427391] mce: CPU2: Core temperature above threshold, cpu clock throttled (total event

[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-07 Thread Colin Ian King
pr_warn can be removed with a sauce patch, so no worries with that.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed

Bug description:
  1) Download focal subiquity pending image, or eoan release image
  2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
  3) After --- insert the following options

     break=top debug init=/bin/bash

  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) in the initramfs execute:

  rm /scripts/casper-bottom/25adduser
  exit

  6) you will be dropped into the pivoted root filesystem, before systemd is
  exec'd as PID 1
  7) /run/initramfs/ will contain a debug log, showing how everything was
  mounted, i.e. cdrom mounted, squashfs losetup from there, then multilower
  overlay set up from them, moved to /root, and then pivot-root to /root done
  to finally end up as /. Underlying layers are moved into /cow for your
  convenience.

  8) At this point, modifying zero-byte files that exist in the lowest
  layer but not the middle one, in certain ways, will result in them
  being corrupted after / is remounted.

  9) Corruption examples

  (On both focal & eoan)

  cat /etc/.pwd.lock
  systemd-sysusers
  cat /etc/.pwd.lock
  mount -o remount /
  cat /etc/.pwd.lock
  overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
  cat: /etc/.pwd.lock: Input/output error

  (Only on eoan)

  cat /etc/machine-id
  systemd-machine-id-setup
  cat /etc/machine-id
  mount -o remount /
  cat /etc/machine-id
  overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
  cat: /etc/machine-id: Input/output error
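
The ftype values in the overlayfs message can be decoded by hand. This is
a minimal sketch, assuming the kernel prints the file-type bits of the
inode mode (i_mode & S_IFMT) in hexadecimal; ftype_name is a hypothetical
helper, not part of any tool mentioned in this report.

```shell
# Map a hexadecimal file-type value (as assumed to appear in the log) to a name.
ftype_name() {
    # Mask with S_IFMT (0xf000) to keep only the file-type bits.
    case $(( 0x$1 & 0xf000 )) in
        32768) echo "regular file" ;;    # 0x8000, S_IFREG
        16384) echo "directory" ;;       # 0x4000, S_IFDIR
        40960) echo "symbolic link" ;;   # 0xa000, S_IFLNK
        *)     echo "other" ;;
    esac
}

ftype_name 8000   # prints: regular file
ftype_name 4000   # prints: directory
```

Under that assumption, the message reports a regular file on the upper
layer whose recorded origin resolves to a directory on a lower layer, which
is consistent with overlayfs treating the origin as invalid.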

  Lots of things break once machine-id and .pwd.lock are corrupted, e.g.
  unable to dhcp, connect to dbus, add/remove/change users or groups,
  etc.

  We were unable to recreate the issue outside of booting things with
  casper, i.e. statically on a regular host machine without pivot-root.
  But hopefully booting to a quiet state with nothing running is
  sufficient to reproduce this.

  Instead of booting with `bebroken init=/bin/bash` you can boot with
  `bebroken systemd.mask=systemd-remount-fs.service`; this will complete
  the boot, with /etc/machine-id & .pwd.lock modified, meaning that
  remount of / will cause IO errors on those files.

  Currently, we are shipping two hacks in casper's 25adduser script to
  "rm" the offending files, and create them again on the upper rw layer.
  They then survive remount without i/o errors. However, we'd rather not
  ship those hacks, and have kernel overlay fixed to work correctly with
  multi-lower-dir and not corrupt files upon remounting /.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824407/+subscriptions



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-07 Thread Colin Ian King
I've been iterating on a fix with upstream:
https://lkml.org/lkml/2019/11/7/317

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-06 Thread Colin Ian King
When I'm more awake tomorrow I'll send a patch upstream as a suggested
fix and see if we can get a good solution on the UUIDs worked out.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-06 Thread Colin Ian King
Just love the way launchpad mangles pasted code.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-06 Thread Colin Ian King
I was thinking of a more generalized overlayfs solution: detect when a
file system doesn't initialize the superblock UUID and have overlayfs
improvise by generating its own internal UUID, something like:

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 698d112bdb17..da3faaf68d69 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -248,6 +248,7 @@ struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper)
 	void *buf;
 	int buflen = MAX_HANDLE_SZ;
 	uuid_t *uuid = &real->d_sb->s_uuid;
+	static const uuid_t z_uuid;
 
 	buf = kmalloc(buflen, GFP_KERNEL);
 	if (!buf)
@@ -289,7 +290,22 @@ struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper)
 	if (is_upper)
 		fh->flags |= OVL_FH_FLAG_PATH_UPPER;
 	fh->len = fh_len;
-	fh->uuid = *uuid;
+
+	if (uuid_equal(uuid, &z_uuid)) {
+		struct super_block *sb = real->d_sb;
+		u16 hash;
+
+		pr_warn("ovl_encode_real_fh: ZERO UUID, generating one from superblock\n");
+
+		memcpy(&fh->uuid.b[0], &sb->s_magic, 8);
+		memcpy(&fh->uuid.b[8], &sb->s_dev, 6);
+		hash = ((long)sb ^ (long)sb->s_fs_info) >> 12;
+		memcpy(&fh->uuid.b[14], &hash, 2);
+	} else {
+		fh->uuid = *uuid;
+	}
+

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-06 Thread Colin Ian King
The concern I have is for other file systems that also don't populate
the UUID - this seems to be a general problem for overlayfs. Perhaps a
UUID can be autogenerated based on the superblock rather than file
system specific UUID magic if the UUID is zero.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed

Bug description:
  1) Download focal subiquity pending image, or eoan release image
  2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
  3) After --- insert the following options

     break=top debug init=/bin/bash

  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) in the initramfs execute:

  rm /scripts/casper-bottom/25adduser
  exit

  6) you will be dropped into the pivoted root filesystem, before systemd is
  execed as pid one
  7) /run/initramfs/ will contain a debug log, showing how everything was
  mounted, i.e. cdrom mounted, squashfs losetup from there, then multilower
  overlay set up from them, moved to /root, and then pivot-root to /root done
  to finally end up as /. Underlying layers are moved into /cow for your
  convenience.

  8) At this point modifying zero-byte length files that exist in the
  lowest layer, but not the middle one, in certain ways will result in
  them being corrupted after / is remounted.

  9) Corruption examples

  (On both focal & eoan)

  cat /etc/.pwd.lock
  systemd-sysusers
  cat /etc/.pwd.lock
  mount -o remount /
  cat /etc/.pwd.lock
  overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
  cat: /etc/.pwd.lock: Input/output error

  (Only on eoan)

  cat /etc/machine-id
  systemd-machine-id-setup
  cat /etc/machine-id
  mount -o remount /
  cat /etc/machine-id
  overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
  cat: /etc/machine-id: Input/output error

  Lots of things break once machine-id and .pwd.lock are corrupted, e.g.
  unable to dhcp, connect to dbus, add/remove/change users or groups,
  etc.

  We were unable to recreate the issue outside of booting things with
  casper, i.e. statically on a regular host machine without pivot-root.
  But hopefully booting to a quiet state with nothing running is
  sufficient to reproduce this.

  Instead of booting with `bebroken init=/bin/bash` you can boot with
  `bebroken systemd.mask=systemd-remount-fs.service`. This will complete
  the boot with /etc/machine-id & .pwd.lock modified, meaning that a
  remount of / will cause I/O errors on those files.

  Currently, we are shipping two hacks in casper's 25adduser script to
  "rm" the offending files and create them again on the upper rw layer.
  They then survive remount without I/O errors. However, we'd rather not
  ship those hacks, and would prefer to have the kernel overlayfs fixed
  to work correctly with multi-lower-dir and not corrupt files upon
  remounting /.



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-06 Thread Colin Ian King
Adding a uuid into the superblock on squashfs seems to resolve the
issue.  Since squashfs does not have UUID support, my hack below
generates one based on some squashfs superblock metadata that provides a
good enough UUID for our purposes.

diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c
index effa638d6d85..cfb34a75feb6 100644
--- a/fs/squashfs/super.c
+++ b/fs/squashfs/super.c
@@ -186,6 +186,12 @@ static int squashfs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_flags |= SB_RDONLY;
 	sb->s_op = &squashfs_super_ops;
 
+	memcpy(&sb->s_uuid.b[0], &sblk->inodes, 4);
+	memcpy(&sb->s_uuid.b[4], &sblk->mkfs_time, 4);
+	memcpy(&sb->s_uuid.b[8], &sblk->fragments, 4);
+	memcpy(&sb->s_uuid.b[12], &sblk->compression, 2);
+	memcpy(&sb->s_uuid.b[14], &sblk->block_log, 2);
+
 	err = -ENOMEM;
 
 	msblk->block_cache = squashfs_cache_init("metadata",
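
To see which 16 bytes this hack would end up using for a given image
without booting a patched kernel, the same fields can be read straight
from the image file. This is only a rough userspace sketch: it assumes
the squashfs 4.0 on-disk superblock layout (inodes at byte offset 4,
mkfs_time at 8, fragments at 16, compression at 20, block_log at 22).
Those offsets and the helper name sqfs_pseudo_uuid are assumptions, not
part of the patch, so check them against fs/squashfs/squashfs_fs.h.

```sh
#!/bin/sh
# Print, as 32 hex digits, the 16 bytes that the memcpy() calls in the
# patch would copy into s_uuid, by reading them from the image file.
# ASSUMPTION: offsets follow the squashfs 4.0 superblock layout.
sqfs_pseudo_uuid() {
    img=$1
    # "offset length" pairs: inodes, mkfs_time, fragments,
    # compression, block_log (same order as the patch)
    for field in "4 4" "8 4" "16 4" "20 2" "22 2"; do
        set -- $field
        od -An -v -tx1 -j "$1" -N "$2" "$img"
    done | tr -d ' \n'
    echo
}
```

Usage would be something like `sqfs_pseudo_uuid filesystem.squashfs`
(placeholder file name). Two images that print the same value would
still look identical to overlayfs, which is the "good enough" caveat
of this approach.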



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-06 Thread Colin Ian King
Comparing the previous debug output (2 squashfs overlayfs lowers) with a
run using the *same* data on ext4 as the 2 overlayfs lowers, we have:

[   56.257691] repro-nosquashf (1038): drop_caches: 3
[   56.265075] ovl_get_fh: 112 dentry: etc/.pwd.lock name: trusted.overlay.origin
[   56.265077] ovl_get_fh: 115: res = 29
[   56.265079] ovl_get_fh: 152: return fh = b56cf4e7
[   56.265079] ovl_check_origin: 413 fh = b56cf4e7
[   56.265080] ovl_check_origin_fh: 354 upperdentry = etc/.pwd.lock
[   56.265081] ovl_decode_real_fh: 174
[   56.265081] ovl_decode_real_fh: 181 uuid not equal, return NULL
[   56.265082] ovl_check_origin_fh: 363, i=0, origin = NULL
[   56.265082] ovl_decode_real_fh: 174
[   56.266162] ovl_decode_real_fh: 211 return dentry (OK)
[   56.266163] ovl_check_origin_fh: 360, i=1, origin = /
[   56.266164] ovl_check_origin_fh: level=1 upper: etc/.pwd.lock 100600, lower: / 100600
[   56.266166] ovl_check_origin_fh: 395 return 0
[   56.266166] ovl_check_origin: 422 err = 0
[   56.266167] ovl_check_origin: 439, return 0

So this works fine. Note that the squashfs lower / is 40755 (S_IFDIR |
S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH) whereas the ext4 lower
/ is 100600 (S_IFREG | S_IRUSR | S_IWUSR).
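
The ftype values in the "invalid origin" message and the modes in the
debug output carry the same information in different bases: ftype is
the mode masked with S_IFMT and printed in hex, while the modes are
octal. A small sketch for converting between them (the function names
are made up for illustration):

```sh
#!/bin/sh
S_IFMT=$((0170000))    # file type bit mask (octal), as in <sys/stat.h>
S_IFREG=$((0100000))   # regular file
S_IFDIR=$((0040000))   # directory

# Print the file type bits of an octal mode in hex, the way the
# overlayfs message prints ftype.
mode_ftype_hex() {
    printf '%x\n' $(( 0$1 & $S_IFMT ))
}

# Name the file type of an octal mode such as 100600 or 40755.
mode_type() {
    case $(( 0$1 & $S_IFMT )) in
        "$S_IFREG") echo "S_IFREG" ;;
        "$S_IFDIR") echo "S_IFDIR" ;;
        *)          echo "other" ;;
    esac
}

mode_ftype_hex 100600   # upper etc/.pwd.lock
mode_type 100600
mode_ftype_hex 40755    # squashfs lower /
mode_type 40755
```

This prints 8000, S_IFREG, 4000 and S_IFDIR, matching the ftype=8000
and origin ftype=4000 pair reported by overlayfs.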



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-05 Thread Colin Ian King
4.15: ovl_get_origin_fh detects zero-sized files on lower paths and
treats these as a special zero-sized "copied up but origin unknown" magic.

[   25.442916] ovl_check_origin: etc/.pwd.lock 2
[   25.442918] ovl_get_origin_fh: 104 etc/.pwd.lock
[   25.442919] ovl_get_origin_fh: 107 res=0
[   25.442920] ovl_get_origin_fh: 117 res == 0, return NULL
[   25.442921] ovl_get_origin: 179 fh =   (null) 1
[   25.442922] ovl_get_origin_fh: 104 etc/.pwd.lock
[   25.442922] ovl_get_origin_fh: 107 res=0
[   25.442923] ovl_get_origin_fh: 117 res == 0, return NULL
[   25.442923] ovl_get_origin: 179 fh =   (null) 1

5.3: the lower is / and is therefore a directory, hence the S_IFDIR
origin being returned.

[   33.320630] ovl_get_fh: 112 dentry: etc/.pwd.lock name: trusted.overlay.origin
[   33.320632] ovl_get_fh: 115: res = 29
[   33.320634] ovl_get_fh: 152: return fh = 6e71855c
[   33.320634] ovl_check_origin: 413 fh = 6e71855c
[   33.320635] ovl_check_origin_fh: 354 upperdentry = etc/.pwd.lock
[   33.320635] ovl_decode_real_fh: 174
[   33.320769] ovl_decode_real_fh: 211 return dentry (OK)
[   33.320769] ovl_check_origin_fh: 360, i=0, origin = /
[   33.320770] ovl_check_origin_fh: level=0 upper: etc/.pwd.lock 100600, lower: / 40755
[   33.320770] ovl_check_origin_fh: 380 goto invalid
[   33.320771] overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000).
[   33.320773] ovl_check_origin: 422 err = -5
[   33.320774] ovl_check_origin: 429, return -5
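
The 29-byte trusted.overlay.origin value read above (res = 29) is an
encoded file handle. As a rough aid for inspecting one, the sketch below
splits a hex dump of the xattr into fields. It assumes the packed struct
ovl_fh layout believed to be used by 5.3 in fs/overlayfs/overlayfs.h
(version, magic, len, flags, type, a 16-byte uuid, then the fid). That
layout, the function name and the sample value are all assumptions and
are not taken from this bug's logs.

```sh
#!/bin/sh
# Split a hex-encoded trusted.overlay.origin value into its fields.
# ASSUMPTION: 5 one-byte header fields, a 16-byte uuid, then the fid.
# Input is the hex string as printed by e.g.
#   getfattr -n trusted.overlay.origin -e hex <file on the upper layer>
parse_ovl_fh() {
    hex=${1#0x}
    version=$(echo "$hex" | cut -c1-2)
    magic=$(echo "$hex" | cut -c3-4)
    hexlen=$(echo "$hex" | cut -c5-6)
    flags=$(echo "$hex" | cut -c7-8)
    fhtype=$(echo "$hex" | cut -c9-10)
    uuid=$(echo "$hex" | cut -c11-42)
    fid=$(echo "$hex" | cut -c43-)
    echo "version=$version magic=$magic len=$((0x$hexlen)) flags=$flags type=$fhtype"
    echo "uuid=$uuid"
    echo "fid=$fid"
    case $uuid in
        *[!0]*) echo "uuid is set" ;;
        *)      echo "uuid is all zero" ;;
    esac
}

# Hypothetical sample: 5 header bytes, an all-zero uuid, an 8-byte fid.
parse_ovl_fh "0x00fb1d0001$(printf '%032d' 0)0102030405060708"
```

If the uuid field comes back all zero, the handle cannot identify which
lower file system it came from, so it may be decoded against the wrong
lower.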



[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object

2019-11-05 Thread Colin Ian King
Installed the new zfsutils + zfs dkms to sanity check the kernel driver
part of the fix:

dmesg | grep ZFS
[   22.420188] ZFS: Loaded module v0.8.1-1ubuntu14.1, ZFS pool version 5000, ZFS filesystem version 5

And now the test:

root@eoan-amd64-uefi:~# mkdir /zfs-test
root@eoan-amd64-uefi:~# cd /zfs-test
root@eoan-amd64-uefi:/zfs-test# truncate -s 10G file.img
root@eoan-amd64-uefi:/zfs-test# zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
root@eoan-amd64-uefi:/zfs-test# zfs create tank/d1 -o encryption=on -o keyformat=passphrase
Enter passphrase: 
Re-enter passphrase: 
root@eoan-amd64-uefi:/zfs-test# dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0.319657 s, 131 MB/s
root@eoan-amd64-uefi:/zfs-test# zfs snapshot tank/d1@s1
root@eoan-amd64-uefi:/zfs-test# dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0.312195 s, 134 MB/s
root@eoan-amd64-uefi:/zfs-test# zfs diff tank/d1@s1 tank/d1
M   /tank/d1/
+   /tank/d1/somedata2.bin

The zfsutils + dkms package has the fix. Once this lands we can then
sync this into the next kernel release for the complete fix.
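
For repeat verification runs, the pass/fail decision on the `zfs diff`
output can be scripted. The helper below is only a sketch with a made-up
name. It treats the error string quoted in the bug description as the
failure marker and does not check anything else about the output.

```sh
#!/bin/sh
# Read `zfs diff` output on stdin. Return non-zero if the error from
# the bug report appears, zero otherwise.
zfs_diff_ok() {
    if grep -q 'Unable to determine path or stats for object'; then
        return 1
    fi
    return 0
}

# Example use (needs the test pool from the steps above):
#   zfs diff tank/d1@s1 tank/d1 2>&1 | zfs_diff_ok && echo PASSED
```

Redirecting stderr into the pipe matters here, since the error may not
be written to stdout.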

** Tags added: verification-done-eoan

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
  zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Eoan:
  Fix Committed

Bug description:
  == SRU Justification, Eoan ==

  Using zfs diff on an encrypted dataset with large objects one can hit
  an error such as follows:

  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  + /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

  == Fix ==

  Upstream commit d359e99c38f667 ("diff_cb() does not handle large
  dnodes") as addressed in ZFS bug fix:
  https://github.com/zfsonlinux/zfs/pull/9343

  == Testcase ==

  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s

  Without the fix, one hits an error such as:

  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists

  With the fix, we get:
  + /tank/d1/somedata2.bin
  M /tank/d1/

  == Regression Potential ==

  This is a minor change in module/zfs/dmu_diff.c and it only affects
  the zfs diff component, so this should not affect ZFS in terms of file
  system corruption/data loss.  This has also been regression tested
  upstream and passes the Ubuntu ZFS regression tests too.  So the risk
  is limited.

  
  -

  Eoan 19.10
  zfsutils-linux 0.8.1-1ubuntu14
  kernel 5.3.0-19-generic #20-Ubuntu

  When using zfs diff on an encrypted dataset, I frequently encounter
  this error:

  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  + /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

  I believe this to be upstream bug
  https://github.com/zfsonlinux/zfs/issues/7678, fixed with
  https://github.com/zfsonlinux/zfs/pull/9343

  Here is one way to reproduce it:

  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s
  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists

  There may be a simpler way to test this, but this should be enough to
  start with.

To manage notifications about this 

[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object

2019-11-05 Thread Colin Ian King
So this is a 2-phase fix.  The dkms package is updated, then we test
this, then this gets sync'd into the kernel.  I'm testing it right now,
let me sanity check the zfs-dkms part first and get that updated as step
#1.



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-05 Thread Colin Ian King
Replaced one of the two squashfs with read-only ext4 partitions and
can't reproduce the error. Seems that we need 2 stacked squashfs file
systems.
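
For reference, a stack with two squashfs lowers is described to
overlayfs through the lowerdir= option, where the leftmost entry is the
topmost layer. The helper below only assembles the option string. The
mount commands in the comments need root and use placeholder paths, and
none of this is the actual reproducer used for the debug in this bug.

```sh
#!/bin/sh
# Build an overlayfs option string from an upper dir, a work dir and
# one or more lower dirs (topmost lower first).
overlay_opts() {
    upper=$1
    work=$2
    shift 2
    lowers=$(IFS=:; echo "$*")   # join the remaining args with ':'
    echo "lowerdir=$lowers,upperdir=$upper,workdir=$work"
}

# Hypothetical use, as root, with two squashfs images loop-mounted
# read-only at /mnt/lower1 and /mnt/lower2:
#   mount -o loop,ro lower1.squashfs /mnt/lower1
#   mount -o loop,ro lower2.squashfs /mnt/lower2
#   mount -t overlay overlay \
#       -o "$(overlay_opts /mnt/upper /mnt/work /mnt/lower1 /mnt/lower2)" \
#       /mnt/merged
overlay_opts /mnt/upper /mnt/work /mnt/lower1 /mnt/lower2
```

Swapping one of the loop-mounted images for a read-only ext4 partition
gives the mixed configuration described above.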



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-05 Thread Colin Ian King
Replaced read-only squashfs with read-only ext4 partitions and can't
reproduce the error.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed

Bug description:
  1) Download focal subiquity pending image, or eoan release image
  2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
  3) After --- insert the following options

     break=top debug init=/bin/bash

  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) in the initramfs execute:

  rm /scripts/casper-bottom/25adduser
  exit

  6) you will be dropped into pivoted root filesystem, before systemd is execed 
as pid one
  7) /run/initramfs/ will contain a debug log, showing how everything was 
mounted. Ie. cdrom mounted, squashfs losetup from there, then multilower 
overlay setup from them, moved to /root, and then pivot-root to /root done to 
finally end up as /. Underlying layers are moved into /cow for your convenience.

  8) At this point modifying zero-byte length files, that exist in the
  lowest layer, but not the middle one, in certain ways, will results in
  them to be corrupted, after / is remounted.

  9) Corruption examples

  (On both focal & eoan)

  cat /etc/.pwd.lock
  systemd-sysusers
  cat /etc/.pwd.lock
  mount -o remount /
  cat /etc/.pwd.lock
  overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
  cat: /etc/.pwd.lock: Input/output error

  (Only on eoan)

  cat /etc/machine-id
  systemd-machine-id-setup
  cat /etc/machine-id
  mount -o remount /
  cat /etc/machine-id
  overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
  cat: /etc/machine-id: Input/output error

  Lots of things break once machine-id and .pwd.lock are corrupted. I.e.
  unable to dhcp, connect to dbus, add/remove/change users or groups,
  etc.

  We were unable to recreate the issue outside of booting things with
  casper, i.e. statically on a regular host machine without pivot-root.
  But hopefully booting to a quiet state with nothing running is
  sufficient to reproduce this.

  Instead of booting with `bebroken init=/bin/bash` you can boot with
  `bebroken systemd.mask=systemd-remount-fs.service`; this will complete
  the boot with /etc/machine-id & .pwd.lock modified, meaning that a
  remount of / will cause I/O errors on those files.

  Currently, we are shipping two hacks in casper's 25adduser script to
  "rm" the offending files and create them again on the upper rw layer.
  They then survive remount without I/O errors. However, we'd rather not
  ship those hacks, and would prefer to have the kernel overlay fixed to
  work correctly with multi-lower-dir and not corrupt files upon
  remounting /.
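
  As an illustration of step 8, the files at risk are empty files that are
  present in the lowest layer and have no counterpart in the middle layer.
  A minimal sketch for listing such candidates follows. The function name
  zero_byte_candidates and the directory arguments are assumptions made for
  illustration; they are not part of casper or of the original report.

```shell
# Hypothetical helper: list zero-length regular files that exist in the
# lowest layer but have no counterpart in the middle layer.
# Usage: zero_byte_candidates LOWEST_DIR MIDDLE_DIR
zero_byte_candidates() {
    local lowest=$1 middle=$2
    # -size 0 matches only empty files; paths are relative to the layer root.
    (cd "$lowest" && find . -type f -size 0 -print) | sort |
    while IFS= read -r rel; do
        # Keep only paths that are absent from the middle layer.
        [ -e "$middle/$rel" ] || [ -L "$middle/$rel" ] || printf '%s\n' "${rel#./}"
    done
}
```

  It would be called with the mount points of the two lower layers, for
  example `zero_byte_candidates /path/to/lowest /path/to/middle` (both paths
  are placeholders).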

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824407/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-04 Thread Colin Ian King
And if we use a different file, /root-tmp/var/log/ubuntu-advantage.log,
we get the following error too:

[   24.531406] SQUASHFS error: squashfs_read_data failed to read block 0x89c066e0b540
[   24.531444] SQUASHFS error: Unable to read metadata cache entry [89c066e0b540]



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-04 Thread Colin Ian King
OK, now managed to get a reproducer script to kick this bug even outside
the early install context.  Seems like we can force this bug by either
remounting OR sync'ing and dropping caches.

Attached is the reproducer script.

Run as root, we hit the error:

cat: /root-tmp/etc/.pwd.lock: Input/output error

dmesg:
[   42.415432] overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000).
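
The two ftype values in that dmesg line look like the S_IFMT file-type
bits of the upper file and of its origin, printed in hexadecimal. That is
an assumption based on the values shown: 0x8000 is S_IFREG and 0x4000 is
S_IFDIR. Read that way, the message says a regular file in the upper layer
refers to an origin that is a directory. A small sketch for decoding such a
line follows; the function names are hypothetical.

```shell
# Map a hex S_IFMT value (as assumed to appear after ftype=) to a type name.
ftype_name() {
    case "$1" in
        1000) echo "fifo" ;;
        2000) echo "character device" ;;
        4000) echo "directory" ;;
        6000) echo "block device" ;;
        8000) echo "regular file" ;;
        a000|A000) echo "symbolic link" ;;
        c000|C000) echo "socket" ;;
        *) echo "unknown" ;;
    esac
}

# Extract both ftype values from one "invalid origin" log line and name them.
decode_invalid_origin() {
    local line=$1 upper origin
    upper=$(printf '%s\n' "$line" |
        sed -n 's/.*, ftype=\([0-9a-fA-F]*\), origin ftype=.*/\1/p')
    origin=$(printf '%s\n' "$line" |
        sed -n 's/.*origin ftype=\([0-9a-fA-F]*\).*/\1/p')
    echo "upper: $(ftype_name "$upper"), origin: $(ftype_name "$origin")"
}
```

Passing the dmesg line above to decode_invalid_origin prints
"upper: regular file, origin: directory".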


** Attachment added: "repro.sh"
   https://bugs.launchpad.net/ubuntu/bionic/+source/linux-hwe/+bug/1824407/+attachment/5302762/+files/repro.sh



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-04 Thread Colin Ian King
BTW, I can generate the mount move failure with the following cut-down
script (which follows the same mount patterns as the casper script):

#!/bin/bash -x
mkdir -p /cdrom
mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom
sleep 1
mkdir -p /cow
mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow
sleep 1
mkdir -p /cow/upper
mkdir -p /cow/work
modprobe -q -b overlay
sleep 1
modprobe -q -b loop
sleep 1
dev=$(losetup -f)
mkdir -p /filesystem.squashfs
losetup $dev /cdrom/casper/filesystem.squashfs
mount -t squashfs -o ro,noatime $dev /filesystem.squashfs
sleep 1

dev=$(losetup -f)
mkdir -p /installer.squashfs
losetup $dev /cdrom/casper/installer.squashfs
mount -t squashfs -o ro,noatime $dev /installer.squashfs
sleep 1

mount -t overlay \
  -o 'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' \
  /cow /root

mkdir -p /root/rofs
mount -o move /filesystem.squashfs /root/rofs



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-04 Thread Colin Ian King
Hi Dimitri,

while debugging this I found the following in setup_unionfs() in
scripts/casper:

# move the first mount; no head in busybox-initramfs
for d in $(mount -t squashfs | cut -d\  -f 3); do
    mkdir -p "${rootmnt}/rofs"
    if [ "${UNIONFS}" = unionfs-fuse ]; then
        mount -o bind "${d}" "${rootmnt}/rofs"
    else
        mount -o move "${d}" "${rootmnt}/rofs"
    fi
    break
done

and looking at the debug /run/initramfs/initramfs.debug log for the
above stanza I see:

+ cut '-d ' -f 3
+ mount -t squashfs
+ mkdir -p /root/rofs
+ '[' overlay '=' unionfs-fuse ]
+ mount -o move /filesystem.squashfs /root/rofs
+ break

However, I cannot reproduce this mount -o move operation by hand, as I
get the mount error:

mount: /root/rofs: /filesystem.squashfs is not a block device.

It appears to me that scripts/casper silently ignores this failure.
Should the mount be a bind mount instead?
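
One way to stop such a failure from going unnoticed is to check the exit
status of the move and only then decide what to do. The sketch below
illustrates that idea and is not the casper code: move_or_bind and the
MOUNT override are hypothetical names, and whether a bind mount is the
right fallback is exactly the open question above.

```shell
# MOUNT may be overridden for dry runs; it defaults to the real mount(8).
MOUNT=${MOUNT:-mount}

# Try to move a mount; report the failure and fall back to a bind mount.
# Usage: move_or_bind SOURCE_MOUNTPOINT TARGET_MOUNTPOINT
move_or_bind() {
    local src=$1 dst=$2
    if "$MOUNT" -o move "$src" "$dst"; then
        echo "moved $src to $dst"
    else
        echo "move of $src failed, trying bind mount" >&2
        "$MOUNT" -o bind "$src" "$dst" || return 1
        echo "bound $src to $dst"
    fi
}
```

In the stanza quoted above this would correspond to a call such as
`move_or_bind "${d}" "${rootmnt}/rofs"`.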



[Kernel-packages] [Bug 1846486] Re: revert the revert of ext4: make __ext4_get_inode_loc plug

2019-11-04 Thread Colin Ian King
Verified, seeing same ball-park performance improvements.

** Tags removed: verification-needed-eoan
** Tags added: verification-done-eoan

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1846486

Title:
  revert the revert of ext4: make __ext4_get_inode_loc plug

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Eoan:
  In Progress

Bug description:
  == SRU Justification Eoan ==

  Now that 5.4 contains a fix to the bootup regression due to the lack
  of entropy at boot, we should apply this fix and also revert the
  revert of commit "ext4: make __ext4_get_inode_loc plug".

  == Fix ==

  So, to clarify, apply the two upstream 5.4-rc commits:

  commit 50ee7529ec4500c88f8664560770a7a1b65db72b
  Author: Linus Torvalds 
  Date:   Sat Sep 28 16:53:52 2019 -0700

  random: try to actively add entropy rather than passively wait for
  it

  commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
  Author: Linus Torvalds 
  Date:   Sun Sep 29 17:59:23 2019 -0700

  Revert "Revert "ext4: make __ext4_get_inode_loc plug""

  I've benchmarked the Eoan kernel with these two patches and found the
  following speed improvements on an i7-3770 CPU @ 3.40GHz with 8GB and a
  WDC WD10EZEX-21WN4A HDD (7200RPM, 64MB cache).

  git grep of the kernel: 0.14%
  building fwts: 0.40%
  building stress-ng: 0.45%
  tar up kernel source: 7.6%
  boot time of eoan cloud image: 10.5%

  So I think the speed improvements justify the SRU.
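
  The report does not say how the percentages were derived. One common
  convention, assumed here, is the relative reduction in elapsed time,
  (before - after) / before * 100. A sketch with made-up timings follows;
  improvement_pct is a hypothetical helper, and 200 s and 179 s are
  illustrative values only, not measurements from this benchmark.

```shell
# Percentage improvement of "after" relative to "before" (both in seconds).
# LC_ALL=C keeps the decimal point independent of the user's locale.
improvement_pct() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (before - after) / before * 100 }'
}

# Illustrative values only.
improvement_pct 200 179
```

  With these example inputs the function prints 10.50.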

  == Regression potential ==

  This is a minor change to ext4, which has been regression tested, so
  the risk here is small.  The entropy change will alter the random number
  generation, but I believe this does not change the cryptographic
  security of the random numbers being generated, so I think this change
  is not a security risk.

  Originally the ext4 change caused boot-time user space regressions
  because of its effect on entropy, but the random fix addresses this,
  so I believe this risk is now zero.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1846486/+subscriptions



[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object

2019-11-03 Thread Colin Ian King
** Also affects: zfs-linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
  zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu:
  Fix Released
Status in zfs-linux source package in Eoan:
  New

Bug description:
  == SRU Justification, Eoan ==

  Using zfs diff on an encrypted dataset with large objects, one can hit
  an error such as the following:

  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  + /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

  == Fix ==

  Upstream commit d359e99c38f667 ("diff_cb() does not handle large
  dnodes") as addressed in ZFS bug fix:
  https://github.com/zfsonlinux/zfs/pull/9343

  == Testcase ==

  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 \
      -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s

  Without the fix, one hits an error such as:

  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists

  With the fix, we get:
  + /tank/d1/somedata2.bin
  M /tank/d1/
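
  For scripted verification, the pass/fail decision can be reduced to
  looking for the error string in the zfs diff output. A minimal sketch,
  assuming both stdout and stderr are piped in; check_zfs_diff_output is a
  hypothetical helper name, not part of the ZFS tooling.

```shell
# Reads `zfs diff` output on stdin; returns 1 if the error string is present.
check_zfs_diff_output() {
    if grep -q 'Unable to determine path or stats for object'; then
        echo "FAIL: zfs diff reported the path/stats error"
        return 1
    fi
    echo "PASS: zfs diff completed without the path/stats error"
}
```

  After the testcase steps above it could be used as
  `zfs diff tank/d1@s1 tank/d1 2>&1 | check_zfs_diff_output`.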

  == Regression Potential ==

  This is a minor change in module/zfs/dmu_diff.c and it only affects
  the zfs diff component, so this should not affect ZFS in terms of file
  system corruption/data loss.  This has also been upstream regression
  tested and passes the Ubuntu ZFS regression tests too.  So the risk
  is limited.

  
  -

  Eoan 19.10
  zfsutils-linux 0.8.1-1ubuntu14
  kernel 5.3.0-19-generic #20-Ubuntu

  When using zfs diff on an encrypted dataset, I frequently encounter
  this error:

  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  + /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

  I believe this to be upstream bug
  https://github.com/zfsonlinux/zfs/issues/7678, fixed with
  https://github.com/zfsonlinux/zfs/pull/9343

  Here is one way to reproduce it:

  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 \
      -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s
  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists

  There may be a simpler way to test this, but this should be enough to
  start with.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1849665/+subscriptions



[Kernel-packages] [Bug 1824407] Re: remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files

2019-11-01 Thread Colin Ian King
** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: linux-hwe (Ubuntu Bionic)
 Assignee: (unassigned) => Colin Ian King (colin-king)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed

Bug description:
  1) Download focal subiquity pending image, or eoan release image
  2) Boot, press ESC, and edit the boot command line (F6 in BIOS, e in UEFI)
  3) Before --- insert the following options

     break=top debug init=/bin/bash

  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) in the initramfs execute:

  rm /scripts/casper-bottom/25adduser
  exit

  6) You will be dropped into the pivoted root filesystem, before systemd is
  exec'd as PID 1.
  7) /run/initramfs/ will contain a debug log showing how everything was
  mounted, i.e. cdrom mounted, squashfs losetup from there, then multilower
  overlay set up from them, moved to /root, and then pivot-root to /root done
  to finally end up as /. Underlying layers are moved into /cow for your
  convenience.

  8) At this point, modifying zero-byte files that exist in the lowest
  layer but not the middle one, in certain ways, will result in them being
  corrupted after / is remounted.

  9) Corruption examples

  (On both focal & eoan)

  cat /etc/.pwd.lock
  systemd-sysusers
  cat /etc/.pwd.lock
  mount -o remount /
  cat /etc/.pwd.lock
  overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
  cat: /etc/.pwd.lock: Input/output error

  (Only on eoan)

  cat /etc/machine-id
  systemd-machine-id-setup
  cat /etc/machine-id
  mount -o remount /
  cat /etc/machine-id
  overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
  cat: /etc/machine-id: Input/output error

  Lots of things break once machine-id and .pwd.lock are corrupted. I.e.
  unable to dhcp, connect to dbus, add/remove/change users or groups,
  etc.

  We were unable to recreate the issue outside of booting things with
  casper, i.e. statically on a regular host machine without pivot-root.
  But hopefully booting to a quiet state with nothing running is
  sufficient to reproduce this.

  Instead of booting with `bebroken init=/bin/bash` you can boot with
  `bebroken systemd.mask=systemd-remount-fs.service`; this will complete
  the boot with /etc/machine-id & .pwd.lock modified, meaning that a
  remount of / will cause I/O errors on those files.

  Currently, we are shipping two hacks in casper's 25adduser script to
  "rm" the offending files and create them again on the upper rw layer.
  They then survive remount without I/O errors. However, we'd rather not
  ship those hacks, and would prefer to have the kernel overlay fixed to
  work correctly with multi-lower-dir and not corrupt files upon
  remounting /.



[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object

2019-11-01 Thread Colin Ian King
uploaded zfs-linux (0.8.1-1ubuntu14.1) eoan (will land in -proposed sometime soon)
uploaded zfs-linux (0.0.1.1ubuntu16) focal 

Once the packages are uploaded the dkms driver component will be sync'd
into the next kernel and then once this is in -proposed it can be fully
tested.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
  zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu:
  In Progress



[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object

2019-10-24 Thread Colin Ian King
I'll get this uploaded into -proposed once the current SRU backlog is
out.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1849665

Title:
  zfs diff: Unable to determine path or stats for object

Status in zfs-linux package in Ubuntu:
  In Progress

Bug description:
  == SRU Justification, Eoan ==

  When using zfs diff on an encrypted dataset with large objects, one
  can hit an error such as the following:

  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  + /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

  == Fix ==

  Upstream commit d359e99c38f667 ("diff_cb() does not handle large
  dnodes") as addressed in ZFS bug fix:
  https://github.com/zfsonlinux/zfs/pull/9343

  == Testcase ==

  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s

  Without the fix, one hits an error such as:

  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists

  With the fix, we get:
  + /tank/d1/somedata2.bin
  M /tank/d1/

  == Regression Potential ==

  This is a minor change in module/zfs/dmu_diff.c that only affects
  the zfs diff component, so it should not cause file system
  corruption or data loss.  It has also been regression tested upstream
  and passes the Ubuntu ZFS regression tests, so the risk is limited.

  
  -

  Eoan 19.10
  zfsutils-linux 0.8.1-1ubuntu14
  kernel 5.3.0-19-generic #20-Ubuntu

  When using zfs diff on an encrypted dataset, I frequently encounter
  this error:

  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  + /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists

  I believe this to be upstream bug
  https://github.com/zfsonlinux/zfs/issues/7678, fixed with
  https://github.com/zfsonlinux/zfs/pull/9343

  Here is one way to reproduce it:

  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s
  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists

  There may be a simpler way to test this, but this should be enough to
  start with.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1849665/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object

2019-10-24 Thread Colin Ian King
** Description changed:

+ == SRU Justification, Eoan ==
+ 
+ When using zfs diff on an encrypted dataset with large objects, one can
+ hit an error such as the following:
+ 
+ # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
+ + /nsnx/trusty-2a/bin
+ Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists
+ 
+ == Fix ==
+ 
+ Upstream commit d359e99c38f667 ("diff_cb() does not handle large
+ dnodes") as addressed in ZFS bug fix:
+ https://github.com/zfsonlinux/zfs/pull/9343
+ 
+ == Testcase ==
+ 
+ # mkdir /zfs-test
+ # cd /zfs-test
+ # truncate -s 10G file.img
+ # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
+ # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
+ Enter passphrase:
+ Re-enter passphrase:
+ # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
+ 10240+0 records in
+ 10240+0 records out
+ 41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
+ # zfs snapshot tank/d1@s1
+ # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
+ 10240+0 records in
+ 10240+0 records out
+ 41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s
+ 
+ Without the fix, one hits an error such as:
+ 
+ # zfs diff tank/d1@s1 tank/d1
+ Unable to determine path or stats for object 3 in tank/d1@s1: File exists
+ 
+ With the fix, we get:
+ + /tank/d1/somedata2.bin
+ M /tank/d1/
+ 
+ == Regression Potential ==
+ 
+ This is a minor change in module/zfs/dmu_diff.c that only affects the
+ zfs diff component, so it should not cause file system corruption or
+ data loss.  It has also been regression tested upstream and passes the
+ Ubuntu ZFS regression tests, so the risk is limited.
+ 
+ 
+ -
+ 
  Eoan 19.10
  zfsutils-linux 0.8.1-1ubuntu14
  kernel 5.3.0-19-generic #20-Ubuntu
  
  When using zfs diff on an encrypted dataset, I frequently encounter this
  error:
  
  # zfs diff nsnx/trusty-2a@snap1 nsnx/trusty-2a
  + /nsnx/trusty-2a/bin
  Unable to determine path or stats for object 5 in nsnx/trusty-2a@zfs-diff-32359-00010001f165: File exists
  
  I believe this to be upstream bug
  https://github.com/zfsonlinux/zfs/issues/7678, fixed with
  https://github.com/zfsonlinux/zfs/pull/9343
  
  Here is one way to reproduce it:
  
  # mkdir /zfs-test
  # cd /zfs-test
  # truncate -s 10G file.img
  # zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
  # zfs create tank/d1 -o encryption=on -o keyformat=passphrase
  Enter passphrase:
  Re-enter passphrase:
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,304365 s, 138 MB/s
  # zfs snapshot tank/d1@s1
  # dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 0,305324 s, 137 MB/s
  # zfs diff tank/d1@s1 tank/d1
  Unable to determine path or stats for object 3 in tank/d1@s1: File exists
  
  There may be a simpler way to test this, but this should be enough to
  start with.

[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object

2019-10-24 Thread Colin Ian King
With the fix:

root@eoan-amd64-efi:/home/cking# mkdir /zfs-test
root@eoan-amd64-efi:/home/cking# cd /zfs-test
root@eoan-amd64-efi:/zfs-test# truncate -s 10G file.img
root@eoan-amd64-efi:/zfs-test# zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O xattr=sa -O normalization=formD -O dnodesize=auto tank $(pwd)/file.img
root@eoan-amd64-efi:/zfs-test# zfs create tank/d1 -o encryption=on -o keyformat=passphrase
Enter passphrase: 
Re-enter passphrase: 
root@eoan-amd64-efi:/zfs-test# dd if=/dev/urandom bs=4k of=/tank/d1/somedata.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0.238499 s, 176 MB/s
root@eoan-amd64-efi:/zfs-test# zfs snapshot tank/d1@s1
root@eoan-amd64-efi:/zfs-test# dd if=/dev/urandom bs=4k of=/tank/d1/somedata2.bin count=10240
10240+0 records in
10240+0 records out
41943040 bytes (42 MB, 40 MiB) copied, 0.228746 s, 183 MB/s
root@eoan-amd64-efi:/zfs-test# zfs diff tank/d1@s1 tank/d1
+   /tank/d1/somedata2.bin
M   /tank/d1/
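
The pass/fail decision in the transcript above could be scripted roughly
as follows. This is only a sketch: the function name check_zfs_diff and
the PASS/FAIL wording are made up for illustration, and the only thing
it looks for is the error text quoted in this bug.

```shell
#!/bin/sh
# check_zfs_diff (hypothetical helper): read `zfs diff` output on stdin
# and report whether the error message from this bug appears in it.
# Returns 0 when the message is absent, 1 when it is present.
check_zfs_diff() {
    if grep -q 'Unable to determine path or stats for object'; then
        echo "FAIL: zfs diff hit the large-dnode error"
        return 1
    fi
    echo "PASS: zfs diff completed without the error"
    return 0
}

# Example use on the test pool from the transcript (needs root and ZFS):
#   zfs diff tank/d1@s1 tank/d1 2>&1 | check_zfs_diff
```

Merging stderr into stdout with 2>&1 matters here, on the assumption
that zfs diff reports the error on stderr rather than stdout.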


[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object

2019-10-24 Thread Colin Ian King
Confirming that upstream commit
https://github.com/zfsonlinux/zfs/commit/d359e99c38f66732d42278c32d52cfcf1839aa4f
fixes this issue.


[Kernel-packages] [Bug 1849665] Re: zfs diff: Unable to determine path or stats for object

2019-10-24 Thread Colin Ian King
** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => High

** Changed in: zfs-linux (Ubuntu)
 Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: zfs-linux (Ubuntu)
   Status: New => In Progress


[Kernel-packages] [Bug 1847628] Re: When using swap in ZFS, system stops when you start using swap

2019-10-15 Thread Colin Ian King
** Changed in: ubiquity (Ubuntu)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1847628

Title:
  When using swap in ZFS, system stops when you start using swap

Status in ubiquity package in Ubuntu:
  Confirmed
Status in zfs-linux package in Ubuntu:
  Confirmed

Bug description:
  # Problem

  When swap is on ZFS, the system stops as soon as swap starts being used.

  > stress --vm 100

  If you run swapoff first, only an OOM occurs and the system does not stop.

  # Environment

  jehos@MacBuntu:~$ lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu Eoan Ermine (development branch)
  Release:        19.10
  Codename:       eoan

  jehos@MacBuntu:~$ dpkg -l | grep zfs
  ii  libzfs2linux    0.8.1-1ubuntu13  amd64  OpenZFS filesystem library for Linux
  ii  zfs-initramfs   0.8.1-1ubuntu13  amd64  OpenZFS root filesystem capabilities for Linux - initramfs
  ii  zfs-zed         0.8.1-1ubuntu13  amd64  OpenZFS Event Daemon
  ii  zfsutils-linux  0.8.1-1ubuntu13  amd64  command-line tools to manage OpenZFS filesystems

  jehos@MacBuntu:~$ uname -a
  Linux MacBuntu 5.3.0-13-generic #14-Ubuntu SMP Tue Sep 24 02:46:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  jehos@MacBuntu:~$ zpool list
  NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
  bpool  1.88G  66.1M  1.81G        -         -      -     3%  1.00x  ONLINE  -
  rpool   230G   124G   106G        -         -     9%    53%  1.00x  ONLINE  -

  jehos@MacBuntu:~$ zfs get all rpool/swap
  NAME        PROPERTY              VALUE                 SOURCE
  rpool/swap  type                  volume                -
  rpool/swap  creation              목 10월 10 15:56 2019  -
  rpool/swap  used                  2.13G                 -
  rpool/swap  available             98.9G                 -
  rpool/swap  referenced            72K                   -
  rpool/swap  compressratio         1.11x                 -
  rpool/swap  reservation           none                  default
  rpool/swap  volsize               2G                    local
  rpool/swap  volblocksize          4K                    -
  rpool/swap  checksum              on                    default
  rpool/swap  compression           zle                   local
  rpool/swap  readonly              off                   default
  rpool/swap  createtxg             34                    -
  rpool/swap  copies                1                     default
  rpool/swap  refreservation        2.13G                 local
  rpool/swap  guid                  18209330213704683244  -
  rpool/swap  primarycache          metadata              local
  rpool/swap  secondarycache        none                  local
  rpool/swap  usedbysnapshots       0B                    -
  rpool/swap  usedbydataset         72K                   -
  rpool/swap  usedbychildren        0B                    -
  rpool/swap  usedbyrefreservation  2.13G                 -
  rpool/swap  logbias               throughput            local
  rpool/swap  objsetid              393                   -
  rpool/swap  dedup                 off                   default
  rpool/swap  mlslabel              none                  default
  rpool/swap  sync                  always                local
  rpool/swap  refcompressratio      1.11x                 -
  rpool/swap  written               72K                   -
  rpool/swap  logicalused           40K                   -
  rpool/swap  logicalreferenced     40K                   -
  rpool/swap  volmode               default               default
  rpool/swap  snapshot_limit        none                  default
  rpool/swap  snapshot_count        none                  default
  rpool/swap  snapdev               hidden                default
  rpool/swap  context               none                  default
  rpool/swap  fscontext             none                  default
  rpool/swap  defcontext            none                  default
  rpool/swap  rootcontext           none                  default
  rpool/swap  redundant_metadata    all                   default
  rpool/swap  encryption            off                   default
  rpool/swap  keylocation           none                  default
  rpool/swap  keyformat             none                  default
  rpool/swap  pbkdf2iters           0                     default

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1847628/+subscriptions

[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

2019-10-14 Thread Colin Ian King
A fix has landed in lxd, I refer you to the following comment:

https://github.com/lxc/lxd/issues/4656#issuecomment-541266681

Please check if this addresses the issues.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  New
Status in lxc source package in Disco:
  New
Status in linux source package in Eoan:
  Triaged
Status in lxc source package in Eoan:
  Confirmed

Bug description:
  I'm not sure exactly what got me into this state, but I have several
  lxc containers that cannot be deleted.

  $ lxc info
  
  api_status: stable
  api_version: "1.0"
  auth: trusted
  public: false
  auth_methods:
  - tls
  environment:
    addresses: []
    architectures:
    - x86_64
    - i686
    certificate: |
      -----BEGIN CERTIFICATE-----
      
      -----END CERTIFICATE-----
    certificate_fingerprint: 3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
    driver: lxc
    driver_version: 3.0.1
    kernel: Linux
    kernel_architecture: x86_64
    kernel_version: 4.15.0-23-generic
    server: lxd
    server_pid: 15123
    server_version: "3.2"
    storage: zfs
    storage_version: 0.7.5-1ubuntu15
    server_clustered: false
    server_name: milhouse

  $ lxc delete --force b1
  Error: Failed to destroy ZFS filesystem: cannot destroy 'default/containers/b1': dataset is busy

  While talking in #lxc-dev, stgraber and sforeshee provided a diagnosis:

   | short version is that something unshared a mount namespace causing
   | them to get a copy of the mount table at the time that dataset was
   | mounted, which then prevents zfs from being able to destroy it)

  The workaround provided was:

   | you can unstick this particular issue by doing:
   |  grep default/containers/b1 /proc/*/mountinfo
   | then for any of the hits, do:
   |   nsenter -t PID -m -- umount /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
   | then try the delete again
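
  The grep step of the workaround above can be wrapped in a small helper
  that prints only the PIDs. This is a sketch, not part of the original
  advice: the function name pids_holding_mount and its optional second
  argument are invented here, and it scans only numeric /proc entries,
  whereas /proc/*/mountinfo also matches self and thread-self.

```shell
#!/bin/sh
# pids_holding_mount DATASET [PROC_ROOT]   (hypothetical helper)
# Print the PID of every process whose mountinfo still mentions DATASET.
# PROC_ROOT defaults to /proc; it is a parameter only so the function can
# be tried against a fake directory tree.
pids_holding_mount() {
    dataset=$1
    proc_root=${2:-/proc}
    for f in "$proc_root"/[0-9]*/mountinfo; do
        # Processes can exit while we scan, so skip unreadable files.
        [ -r "$f" ] || continue
        if grep -qF -- "$dataset" "$f" 2>/dev/null; then
            pid=${f%/mountinfo}      # strip the trailing /mountinfo
            echo "${pid##*/}"        # keep only the last path component
        fi
    done
}

# Example use, following the workaround quoted above (needs root):
#   for pid in $(pids_holding_mount default/containers/b1); do
#       nsenter -t "$pid" -m -- umount \
#           /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
#   done
```

  After the loop, the lxc delete can be retried as described above.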

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.15.0-23-generic 4.15.0-23.25
  ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
  Uname: Linux 4.15.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.10-0ubuntu3
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  smoser  31412 F pulseaudio
   /dev/snd/controlC2:  smoser  31412 F pulseaudio
   /dev/snd/controlC0:  smoser  31412 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 28 10:42:45 2018
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2015-07-23 (1071 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
  MachineType: 
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
 
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-23-generic N/A
   linux-backports-modules-4.15.0-23-generic  N/A
   linux-firmware 1.174
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 03/09/2015
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
  dmi.board.asset.tag: �
  dmi.board.name: NUC5i5RYB
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H40999-503
  dmi.chassis.asset.tag: �
  dmi.chassis.type: 3
  dmi.chassis.vendor: �
  dmi.chassis.version: �
  dmi.modalias: dmi:bvnIntelCorporation:bvrRYBDWi35.86A.0246.2015.0309.1355:bd03/09/2015:svn:pn:pvr:rvnIntelCorporation:rnNUC5i5RYB:rvrH40999-503:cvn:ct3:cvr:
  dmi.product.family: �
  dmi.product.name: �
  dmi.product.version: �
  dmi.sys.vendor: �

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779156/+subscriptions


[Kernel-packages] [Bug 1847628] Re: When using swap in ZFS, system stops when you start using swap

2019-10-10 Thread Colin Ian King
https://github.com/zfsonlinux/pkg-zfs/wiki/HOWTO-use-a-zvol-as-a-swap-device

There are known issues with swap on ZFS not working well on systems
under heavy memory load.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1847628

Title:
  When using swap in ZFS, system stops when you start using swap

Status in zfs-linux package in Ubuntu:
  New


[Kernel-packages] [Bug 1847628] Re: When using swap in ZFS, system stops when you start using swap

2019-10-10 Thread Colin Ian King
A swapfile on ZFS is a bad idea. Swapped-out pages get pushed through
the VFS into ZFS, and each page of swap is magnified in the number of
free pages required to get that page out to disk.
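
To see whether a machine is in this situation, one can check what backs
each active swap area. A minimal Python sketch, assuming the usual
/proc/swaps layout (a header row, then Filename, Type, Size, Used and
Priority columns) and assuming ZFS volumes appear as /dev/zdN or under
/dev/zvol/. The function names and the sample text are illustrative only:

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into a list of dicts."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        entries.append({
            "filename": fields[0],
            "type": fields[1],
            "size_kib": int(fields[2]),
            "used_kib": int(fields[3]),
            "priority": int(fields[4]),
        })
    return entries


def zfs_backed(entries):
    """Return swap areas that look like ZFS volumes (assumed naming)."""
    return [e for e in entries
            if e["filename"].startswith(("/dev/zd", "/dev/zvol/"))]


# Illustrative /proc/swaps content (not taken from the bug report).
sample = """\
Filename        Type        Size      Used  Priority
/dev/zd0        partition   2097148   0     -2
/swapfile       file        1048572   10    -3
"""
for entry in zfs_backed(parse_swaps(sample)):
    print("swap on ZFS volume:", entry["filename"])
```

On a real system the text would come from reading /proc/swaps instead of
the sample string.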

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1847628

Title:
  When using swap in ZFS, system stops when you start using swap

Status in zfs-linux package in Ubuntu:
  New

Bug description:
  # Problem

  When using swap in ZFS, the system stops as soon as it starts using swap.

  > stress --vm 100

  If you run swapoff first, only an OOM occurs and the system does not stop.

  # Environment

  jehos@MacBuntu:~$ lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu Eoan Ermine (development branch)
  Release:        19.10
  Codename:   eoan

  jehos@MacBuntu:~$ dpkg -l | grep zfs
  ii  libzfs2linux    0.8.1-1ubuntu13  amd64  OpenZFS filesystem library for Linux
  ii  zfs-initramfs   0.8.1-1ubuntu13  amd64  OpenZFS root filesystem capabilities for Linux - initramfs
  ii  zfs-zed         0.8.1-1ubuntu13  amd64  OpenZFS Event Daemon
  ii  zfsutils-linux  0.8.1-1ubuntu13  amd64  command-line tools to manage OpenZFS filesystems

  jehos@MacBuntu:~$ uname -a
  Linux MacBuntu 5.3.0-13-generic #14-Ubuntu SMP Tue Sep 24 02:46:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  jehos@MacBuntu:~$ zpool list
  NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
  bpool  1.88G  66.1M  1.81G        -         -     -   3%  1.00x  ONLINE  -
  rpool   230G   124G   106G        -         -    9%  53%  1.00x  ONLINE  -

  jehos@MacBuntu:~$ zfs get all rpool/swap
  NAME        PROPERTY              VALUE                  SOURCE
  rpool/swap  type                  volume                 -
  rpool/swap  creation              Thu Oct 10 15:56 2019  -
  rpool/swap  used                  2.13G                  -
  rpool/swap  available             98.9G                  -
  rpool/swap  referenced            72K                    -
  rpool/swap  compressratio         1.11x                  -
  rpool/swap  reservation           none                   default
  rpool/swap  volsize               2G                     local
  rpool/swap  volblocksize          4K                     -
  rpool/swap  checksum              on                     default
  rpool/swap  compression           zle                    local
  rpool/swap  readonly              off                    default
  rpool/swap  createtxg             34                     -
  rpool/swap  copies                1                      default
  rpool/swap  refreservation        2.13G                  local
  rpool/swap  guid                  18209330213704683244   -
  rpool/swap  primarycache          metadata               local
  rpool/swap  secondarycache        none                   local
  rpool/swap  usedbysnapshots       0B                     -
  rpool/swap  usedbydataset         72K                    -
  rpool/swap  usedbychildren        0B                     -
  rpool/swap  usedbyrefreservation  2.13G                  -
  rpool/swap  logbias               throughput             local
  rpool/swap  objsetid              393                    -
  rpool/swap  dedup                 off                    default
  rpool/swap  mlslabel              none                   default
  rpool/swap  sync                  always                 local
  rpool/swap  refcompressratio      1.11x                  -
  rpool/swap  written               72K                    -
  rpool/swap  logicalused           40K                    -
  rpool/swap  logicalreferenced     40K                    -
  rpool/swap  volmode               default                default
  rpool/swap  snapshot_limit        none                   default
  rpool/swap  snapshot_count        none                   default
  rpool/swap  snapdev               hidden                 default
  rpool/swap  context               none                   default
  rpool/swap  fscontext             none                   default
  rpool/swap  defcontext            none                   default
  rpool/swap  rootcontext           none                   default
  rpool/swap  redundant_metadata    all                    default
  rpool/swap  encryption            off                    default
  rpool/swap  keylocation           none                   default
  rpool/swap  keyformat             none                   default
  rpool/swap  pbkdf2iters           0                      default

To manage notifications about this bug go to:

[Kernel-packages] [Bug 1846424] Re: 19.10 ZFS Update failed on 2019-10-02

2019-10-07 Thread Colin Ian King
The error: "zfs[9317]: cannot mount '/': directory is not empty" seems
to suggest that this is a root mounted zfs.  Is that so?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1846424

Title:
  19.10 ZFS Update failed on 2019-10-02

Status in zfs-linux package in Ubuntu:
  In Progress

Bug description:
  On all my systems the update from zfs-initramfs_0.8.1-1ubuntu12_amd64.deb
  failed; the same is true for zfs-zed and zfsutils-linux.
  The system still runs on 0.8.1-1ubuntu11_amd64.
  The first error message was about a failing mount, and at the end it
  announced that all 3 modules were not updated.
  I have the error on Xubuntu 19.10, Ubuntu Mate 19.10 on my laptop
  (i5-2520M) and in a VBox VM on a Ryzen 3 2200G with Ubuntu 19.10.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1846424/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1846486] Re: revert the revert of ext4: make __ext4_get_inode_loc plug

2019-10-03 Thread Colin Ian King
** Description changed:

  == SRU Justification Eoan ==
  
  Now that 5.4 contains a fix to the bootup regression due to the lack of
  entropy at bootable we should apply this fix and also revert the revert
  of commit "Revert "ext4: make __ext4_get_inode_loc plug"".
  
  == Fix ==
  
  So, to clarify, apply the two upstream 5.4-rc commits:
  
  commit 50ee7529ec4500c88f8664560770a7a1b65db72b
  Author: Linus Torvalds 
  Date:   Sat Sep 28 16:53:52 2019 -0700
  
  random: try to actively add entropy rather than passively wait for
  it
  
  commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
  Author: Linus Torvalds 
  Date:   Sun Sep 29 17:59:23 2019 -0700
  
  Revert "Revert "ext4: make __ext4_get_inode_loc plug""
  
- I've benchmarked the Eoan kernel with these two patches and found the
- following speed improvements:
+ I've benchmarked the Eoan kernel with these two patches and found theo
+ following speed improvements on an i7-3770 CPU @ 3.40GHz with 8GB and a
+ WDC WD10EZEX-21WN4A HDD (7200RPM, 64MB cache).
  
  git grep of the kernel: 0.14%
  building fwts: 0.40%
  build stress-ng 0.45%
  tar up kernel source: 7.6%
  boot time of eoan cloud image: 10.5%
  
  So I think the speed improvements justify the SRU.
  
  == Regression potential ==
  
  minor change to ext4, which has been regression tested, so risk here is
  small.  The entropy change will alter the random number generation, but
  I believe this does not change the cryptographical security of the
  random numbers being generated, so think this change is not security
  risk.
  
  originally the ext4 change caused boot time user space regressions
  because of the entropy change of this fix, but the random fix addresses
  this, so I believe this risk is now zero.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1846486

Title:
  revert the revert of ext4: make __ext4_get_inode_loc plug

Status in linux package in Ubuntu:
  In Progress

Bug description:
  == SRU Justification Eoan ==

  Now that 5.4 contains a fix to the bootup regression due to the lack
  of entropy at boot, we should apply this fix and also revert the
  revert of commit "Revert "ext4: make __ext4_get_inode_loc plug"".

  == Fix ==

  So, to clarify, apply the two upstream 5.4-rc commits:

  commit 50ee7529ec4500c88f8664560770a7a1b65db72b
  Author: Linus Torvalds 
  Date:   Sat Sep 28 16:53:52 2019 -0700

  random: try to actively add entropy rather than passively wait for
  it

  commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
  Author: Linus Torvalds 
  Date:   Sun Sep 29 17:59:23 2019 -0700

  Revert "Revert "ext4: make __ext4_get_inode_loc plug""

  I've benchmarked the Eoan kernel with these two patches and found the
  following speed improvements on an i7-3770 CPU @ 3.40GHz with 8GB of
  memory and a WDC WD10EZEX-21WN4A HDD (7200RPM, 64MB cache):

  git grep of the kernel: 0.14%
  building fwts: 0.40%
  building stress-ng: 0.45%
  tar up kernel source: 7.6%
  boot time of eoan cloud image: 10.5%

  So I think the speed improvements justify the SRU.

  == Regression potential ==

  Minor change to ext4, which has been regression tested, so the risk
  here is small.  The entropy change will alter the random number
  generation, but I believe this does not change the cryptographic
  security of the random numbers being generated, so I think this change
  is not a security risk.

  Originally the ext4 change caused boot-time user space regressions
  because of the entropy change of this fix, but the random fix
  addresses this, so I believe this risk is now zero.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1846486/+subscriptions



[Kernel-packages] [Bug 1846486] [NEW] revert the revert of ext4: make __ext4_get_inode_loc plug

2019-10-03 Thread Colin Ian King
Public bug reported:

== SRU Justification Eoan ==

Now that 5.4 contains a fix to the bootup regression due to the lack of
entropy at boot, we should apply this fix and also revert the revert
of commit "Revert "ext4: make __ext4_get_inode_loc plug"".

== Fix ==

So, to clarify, apply the two upstream 5.4-rc commits:

commit 50ee7529ec4500c88f8664560770a7a1b65db72b
Author: Linus Torvalds 
Date:   Sat Sep 28 16:53:52 2019 -0700

random: try to actively add entropy rather than passively wait for
it

commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
Author: Linus Torvalds 
Date:   Sun Sep 29 17:59:23 2019 -0700

Revert "Revert "ext4: make __ext4_get_inode_loc plug""

I've benchmarked the Eoan kernel with these two patches and found the
following speed improvements:

git grep of the kernel: 0.14%
building fwts: 0.40%
building stress-ng: 0.45%
tar up kernel source: 7.6%
boot time of eoan cloud image: 10.5%

So I think the speed improvements justify the SRU.
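
The report does not say how the percentages were derived. One common
convention, assumed here, is the relative reduction in wall-clock time,
(t_before - t_after) / t_before * 100. A small sketch with made-up
timings (the numbers are hypothetical, not the measurements behind the
figures above, and the function name is illustrative):

```python
def speedup_percent(t_before, t_after):
    """Relative reduction in run time, as a percentage of the baseline."""
    if t_before <= 0:
        raise ValueError("baseline time must be positive")
    return (t_before - t_after) / t_before * 100.0


# Hypothetical timings in seconds, for illustration only.
print(round(speedup_percent(20.0, 17.9), 1))
```

Under this convention a positive result means the patched kernel was
faster, and zero means no change.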

== Regression potential ==

Minor change to ext4, which has been regression tested, so the risk here
is small.  The entropy change will alter the random number generation,
but I believe this does not change the cryptographic security of the
random numbers being generated, so I think this change is not a security
risk.

Originally the ext4 change caused boot-time user space regressions
because of the entropy change of this fix, but the random fix addresses
this, so I believe this risk is now zero.

** Affects: linux (Ubuntu)
     Importance: High
       Assignee: Colin Ian King (colin-king)
         Status: In Progress

** Changed in: linux (Ubuntu)
       Status: New => In Progress

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Colin Ian King (colin-king)

** Description changed:

- == SRU Justifucation Eoan ==
+ == SRU Justification Eoan ==
  
  Now that 5.4 contains a fix to the bootup regression due to the lack of
  entropy at bootable we should apply this fix and also revert the revert
  of commit "Revert "ext4: make __ext4_get_inode_loc plug"".
  
  == Fix ==
  
  So, to clarify, apply the two upstream 5.4-rc commits:
  
  commit 50ee7529ec4500c88f8664560770a7a1b65db72b
  Author: Linus Torvalds 
  Date:   Sat Sep 28 16:53:52 2019 -0700
  
- random: try to actively add entropy rather than passively wait for
+ random: try to actively add entropy rather than passively wait for
  it
  
  commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
  Author: Linus Torvalds 
  Date:   Sun Sep 29 17:59:23 2019 -0700
  
- Revert "Revert "ext4: make __ext4_get_inode_loc plug""
+ Revert "Revert "ext4: make __ext4_get_inode_loc plug""
  
- 
- I've benchmarked the Eoan kernel with these two patches and found the following speed improvements:
+ I've benchmarked the Eoan kernel with these two patches and found the
+ following speed improvements:
  
  git grep of the kernel: 0.14%
  building fwts: 0.40%
  build stress-ng 0.45%
  tar up kernel source: 7.6%
  boot time of eoan cloud image: 10.5%
  
  So I think this justifies the speed improvements.
  
  == Regression potential ==
  
  minor change to ext4, which has been regression tested, so risk here is
  small.  The entropy change will alter the random number generation, but
  I believe this does not change the cryptographical security of the
  random numbers being generated, so think this change is not security
  risk.
  
  originally the ext4 change caused boot time user space regressions
  because of the entropy change of this fix, but the random fix addresses
  this, so I believe this risk is now zero.

** Description changed:

  == SRU Justification Eoan ==
  
  Now that 5.4 contains a fix to the bootup regression due to the lack of
  entropy at bootable we should apply this fix and also revert the revert
  of commit "Revert "ext4: make __ext4_get_inode_loc plug"".
  
  == Fix ==
  
  So, to clarify, apply the two upstream 5.4-rc commits:
  
  commit 50ee7529ec4500c88f8664560770a7a1b65db72b
  Author: Linus Torvalds 
  Date:   Sat Sep 28 16:53:52 2019 -0700
  
  random: try to actively add entropy rather than passively wait for
  it
  
  commit 02f03c4206c1b2a7451d3b3546f86c9c783eac13
  Author: Linus Torvalds 
  Date:   Sun Sep 29 17:59:23 2019 -0700
  
  Revert "Revert "ext4: make __ext4_get_inode_loc plug""
  
  I've benchmarked the Eoan kernel with these two patches and found the
  following speed improvements:
  
  git grep of the kernel: 0.14%
  building fwts: 0.40%
  build stress-ng 0.45%
  tar up kernel source: 7.6%
  boot time of eoan cloud image: 10.5%
  
- So I think this justifies the speed improvements.
+ So I think the speed improvements justify the SRU.
  
  == Regression potential ==
  
  minor change to ext4, which has been regression tested, so risk here is
  

[Kernel-packages] [Bug 1846424] Re: 19.10 ZFS Update failed on 2019-10-02

2019-10-02 Thread Colin Ian King
When you have error messages about modules not being updated, this
makes me believe that perhaps you have zfs-dkms installed.  This package
is not required if you are using the 19.10 5.2 or 5.3 kernel, as this has
the zfs modules provided already with it.  If you have the official
19.10 5.2 or 5.3 kernels then you can remove zfs-dkms.
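
One way to check for this is to look for zfs-dkms in the `dpkg -l`
listing. A rough Python sketch, assuming the usual `dpkg -l` layout in
which installed packages start with the `ii` status flag followed by the
package name and version; the helper name and sample text are made up
for illustration:

```python
def installed_versions(dpkg_l_output):
    """Map package name -> version for lines flagged 'ii' (installed)."""
    packages = {}
    for line in dpkg_l_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ii":
            # Strip an architecture qualifier such as ':amd64' from the name.
            packages[fields[1].split(":")[0]] = fields[2]
    return packages


# Sample listing in dpkg -l style, for illustration only.
sample = """\
ii  zfs-zed         0.8.1-1ubuntu13  amd64  OpenZFS Event Daemon
ii  zfsutils-linux  0.8.1-1ubuntu13  amd64  command-line tools to manage OpenZFS filesystems
"""
pkgs = installed_versions(sample)
print("zfs-dkms installed:", "zfs-dkms" in pkgs)
```

On a real system the input would be the captured output of `dpkg -l`
rather than the sample string.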

Can you provide more information about the error?

** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => High

** Changed in: zfs-linux (Ubuntu)
     Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: zfs-linux (Ubuntu)
       Status: New => In Progress

** Changed in: zfs-linux (Ubuntu)
       Status: In Progress => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1846424

Title:
  19.10 ZFS Update failed on 2019-10-02

Status in zfs-linux package in Ubuntu:
  Incomplete

Bug description:
  On all my systems the update from zfs-initramfs_0.8.1-1ubuntu12_amd64.deb
  failed; the same is true for zfs-zed and zfsutils-linux.
  The system still runs on 0.8.1-1ubuntu11_amd64.
  The first error message was about a failing mount, and at the end it
  announced that all 3 modules were not updated.
  I have the error on Xubuntu 19.10, Ubuntu Mate 19.10 on my laptop
  (i5-2520M) and in a VBox VM on a Ryzen 3 2200G with Ubuntu 19.10.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1846424/+subscriptions



[Kernel-packages] [Bug 1845948] Re: clone test from ubuntu_stress_smoke_test failed on B-hwe 5.0 i386

2019-10-01 Thread Colin Ian King
Fixes committed to make the clone test more OOM-able and the autotest
less OOM-able:

https://kernel.ubuntu.com/git/cking/stress-ng.git/commit/?id=f37bba3874b1e613f307cb40c040e06f21b1e521
https://kernel.ubuntu.com/git/cking/stress-ng.git/commit/?id=cdd32c1c25b9c7f11be4778cd99b7c45f6f9d51d

and

https://kernel.ubuntu.com/git/ubuntu/autotest-client-tests.git/commit/?id=7447c7b658e3a6cc40496a75033c007e4a91f166

** Changed in: stress-ng
   Importance: Undecided => High

** Changed in: stress-ng
     Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: ubuntu-kernel-tests
   Importance: Undecided => High

** Changed in: ubuntu-kernel-tests
     Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: stress-ng
       Status: New => Fix Committed

** Changed in: ubuntu-kernel-tests
       Status: New => Fix Committed

** No longer affects: linux (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1845948

Title:
  clone test from ubuntu_stress_smoke_test failed on B-hwe 5.0 i386

Status in Stress-ng:
  Fix Committed
Status in ubuntu-kernel-tests:
  Fix Committed

Bug description:
  Reproduce rate: 2/2

  Issue found on an i386 node pepe with B-hwe 5.0.

  It looks like the clone test hasn't passed successfully.
    Setting up swapspace version 1, size = 1024 MiB (1073737728 bytes)
    no label, UUID=25a57478-f726-411c-b79e-05d18c1cc5b2

    Machine Configuration
    Physical Pages:  2046087
    Pages available: 1518620
    Page Size:   4096
    Zswap enabled:   Y

    Free memory:
                  total        used        free      shared  buff/cache   available
    Mem:        8184348      185048     6074228         932     1925072     7348440
    Swap:       5242872           0     5242872

    Number of CPUs: 8
    Number of CPUs Online: 8

    access STARTING
    access RETURNED 0
    access PASSED
    af-alg STARTING
    af-alg RETURNED 0
    af-alg PASSED
    affinity STARTING
    affinity RETURNED 0
    affinity PASSED
    aio STARTING
    aio RETURNED 0
    aio PASSED
    aiol STARTING
    aiol RETURNED 0
    aiol PASSED
    bad-altstack STARTING
    bad-altstack RETURNED 0
    bad-altstack PASSED
    bigheap STARTING
    bigheap RETURNED 0
    bigheap PASSED
    branch STARTING
    branch RETURNED 0
    branch PASSED
    brk STARTING
    brk RETURNED 0
    brk PASSED
    cache STARTING
    cache RETURNED 0
    cache PASSED
    cap STARTING
    cap RETURNED 0
    cap PASSED
    chdir STARTING
    chdir RETURNED 0
    chdir PASSED
    chmod STARTING
    chmod RETURNED 0
    chmod PASSED
    chown STARTING
    chown RETURNED 0
    chown PASSED
    chroot STARTING
    chroot RETURNED 0
    chroot PASSED
    clock STARTING
    clock RETURNED 0
    clock PASSED
    clone STARTING
    clone RETURNED 0
    stderr:
    1024+0 records in
    1024+0 records out
    1073741824 bytes (1.1 GB, 1.0 GiB) copied, 5.43609 s, 198 MB/s
  12:03:16 INFO |   END ERROR   ubuntu_stress_smoke_test.stress-smoke-test  ubuntu_stress_smoke_test.stress-smoke-test  timestamp=1569844996  localtime=Sep 30 12:03:16
  12:03:16 DEBUG| Persistent state client._record_indent now set to 1
  12:03:16 DEBUG| Persistent state client.unexpected_reboot deleted

To manage notifications about this bug go to:
https://bugs.launchpad.net/stress-ng/+bug/1845948/+subscriptions



[Kernel-packages] [Bug 1845948] Re: clone test from ubuntu_stress_smoke_test failed on B-hwe 5.0 i386

2019-10-01 Thread Colin Ian King
I've managed to reproduce this on pepe, seems like the autotest is being
OOM'd in preference to the actual cloning processes.  I'll see if I can
figure out how to stop this.
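
For reference, Linux exposes a per-process knob for this:
/proc/<pid>/oom_score_adj, which ranges from -1000 (never pick this
process) to 1000 (pick it first). The Python sketch below shows the
general idea of making a worker a more attractive OOM victim than its
parent. It is not taken from the eventual stress-ng or autotest fixes,
and the helper name and its `proc_root` parameter are illustrative:

```python
import os


def set_oom_score_adj(value, pid="self", proc_root="/proc"):
    """Write an OOM score adjustment for a process.

    Raising the value is allowed for any process; lowering it below the
    current setting normally needs CAP_SYS_RESOURCE.
    """
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be within -1000..1000")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(str(value))
    return path
```

A freshly forked child that is expected to consume lots of memory could
call set_oom_score_adj(1000) before starting its work, so that the OOM
killer prefers it over the test harness.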

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1845948

Title:
  clone test from ubuntu_stress_smoke_test failed on B-hwe 5.0 i386

Status in Stress-ng:
  New
Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Reproduce rate: 2/2

  Issue found on an i386 node pepe with B-hwe 5.0.

  It looks like the clone test hasn't passed successfully.
    Setting up swapspace version 1, size = 1024 MiB (1073737728 bytes)
    no label, UUID=25a57478-f726-411c-b79e-05d18c1cc5b2

    Machine Configuration
    Physical Pages:  2046087
    Pages available: 1518620
    Page Size:   4096
    Zswap enabled:   Y

    Free memory:
                  total        used        free      shared  buff/cache   available
    Mem:        8184348      185048     6074228         932     1925072     7348440
    Swap:       5242872           0     5242872

    Number of CPUs: 8
    Number of CPUs Online: 8

    access STARTING
    access RETURNED 0
    access PASSED
    af-alg STARTING
    af-alg RETURNED 0
    af-alg PASSED
    affinity STARTING
    affinity RETURNED 0
    affinity PASSED
    aio STARTING
    aio RETURNED 0
    aio PASSED
    aiol STARTING
    aiol RETURNED 0
    aiol PASSED
    bad-altstack STARTING
    bad-altstack RETURNED 0
    bad-altstack PASSED
    bigheap STARTING
    bigheap RETURNED 0
    bigheap PASSED
    branch STARTING
    branch RETURNED 0
    branch PASSED
    brk STARTING
    brk RETURNED 0
    brk PASSED
    cache STARTING
    cache RETURNED 0
    cache PASSED
    cap STARTING
    cap RETURNED 0
    cap PASSED
    chdir STARTING
    chdir RETURNED 0
    chdir PASSED
    chmod STARTING
    chmod RETURNED 0
    chmod PASSED
    chown STARTING
    chown RETURNED 0
    chown PASSED
    chroot STARTING
    chroot RETURNED 0
    chroot PASSED
    clock STARTING
    clock RETURNED 0
    clock PASSED
    clone STARTING
    clone RETURNED 0
    stderr:
    1024+0 records in
    1024+0 records out
    1073741824 bytes (1.1 GB, 1.0 GiB) copied, 5.43609 s, 198 MB/s
  12:03:16 INFO |   END ERROR   ubuntu_stress_smoke_test.stress-smoke-test  ubuntu_stress_smoke_test.stress-smoke-test  timestamp=1569844996  localtime=Sep 30 12:03:16
  12:03:16 DEBUG| Persistent state client._record_indent now set to 1
  12:03:16 DEBUG| Persistent state client.unexpected_reboot deleted

To manage notifications about this bug go to:
https://bugs.launchpad.net/stress-ng/+bug/1845948/+subscriptions



[Kernel-packages] [Bug 1845638] Re: ubuntu_lttng_smoke_test failed with D PowerPC

2019-09-27 Thread Colin Ian King
Fix committed:
https://kernel.ubuntu.com/git/ubuntu/autotest-client-tests.git/commit/?id=8e618fb7b00ecc206fe3ea73084492ebb5835747

** Changed in: ubuntu-kernel-tests
     Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: ubuntu-kernel-tests
       Status: Incomplete => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1845638

Title:
  ubuntu_lttng_smoke_test failed with D PowerPC

Status in ubuntu-kernel-tests:
  Fix Committed
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Disco:
  Incomplete

Bug description:
  Found on node modoc:

  09/23 03:53:18 DEBUG| utils:0153| [stdout] == lttng smoke test trace context switches ==
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Session test-kernel-session created.
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Traces will be written in /tmp/lttng-kernel-trace-4683-session
  09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng create)
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Kernel event sched_switch created in channel channel0
  09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng enable-event)
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Tracing started for session test-kernel-session
  09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng start)
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Waiting for data availability
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Tracing stopped for session test-kernel-session
  09/23 03:53:24 DEBUG| utils:0153| [stdout] PASSED (lttng stop)
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Session test-kernel-session destroyed
  09/23 03:53:24 DEBUG| utils:0153| [stdout] PASSED (lttng destroy)
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Found 10 dd and 19927 context switches
  09/23 03:53:24 DEBUG| utils:0153| [stdout] FAILED (did not trace any dd context switches)
  09/23 03:53:24 DEBUG| utils:0153| [stdout]
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Summary: 7 passed, 1 failed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1845638/+subscriptions



[Kernel-packages] [Bug 1845638] Re: ubuntu_lttng_smoke_test failed with D PowerPC

2019-09-27 Thread Colin Ian King
** Changed in: ubuntu-kernel-tests
       Status: In Progress => Incomplete

** Changed in: linux (Ubuntu)
       Status: In Progress => Incomplete

** Changed in: ubuntu-kernel-tests
   Importance: High => Medium

** Changed in: linux (Ubuntu)
   Importance: High => Medium

** Changed in: ubuntu-kernel-tests
     Assignee: Colin Ian King (colin-king) => (unassigned)

** Changed in: linux (Ubuntu)
     Assignee: Colin Ian King (colin-king) => (unassigned)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1845638

Title:
  ubuntu_lttng_smoke_test failed with D PowerPC

Status in ubuntu-kernel-tests:
  Incomplete
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Disco:
  Incomplete

Bug description:
  Found on node modoc:

  09/23 03:53:18 DEBUG| utils:0153| [stdout] == lttng smoke test trace context switches ==
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Session test-kernel-session created.
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Traces will be written in /tmp/lttng-kernel-trace-4683-session
  09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng create)
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Kernel event sched_switch created in channel channel0
  09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng enable-event)
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Tracing started for session test-kernel-session
  09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng start)
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Waiting for data availability
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Tracing stopped for session test-kernel-session
  09/23 03:53:24 DEBUG| utils:0153| [stdout] PASSED (lttng stop)
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Session test-kernel-session destroyed
  09/23 03:53:24 DEBUG| utils:0153| [stdout] PASSED (lttng destroy)
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Found 10 dd and 19927 context switches
  09/23 03:53:24 DEBUG| utils:0153| [stdout] FAILED (did not trace any dd context switches)
  09/23 03:53:24 DEBUG| utils:0153| [stdout]
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Summary: 7 passed, 1 failed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1845638/+subscriptions



[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-27 Thread Colin Ian King
And for Standard_D2s_v3

IP addr         MAC addr           Kernel            Reboots
104.42.252.54   00:0d:3a:32:df:92  5.0.0-1020-azure  500
104.42.150.26   00:0d:3a:31:1b:50  5.0.0-1020-azure  500
104.42.147.144  00:0d:3a:32:d8:2f  5.0.0-1020-azure  500
40.112.129.232  00:0d:3a:32:d5:7c  5.0.0-1020-azure  500
40.112.134.251  00:0d:3a:32:d9:2d  5.0.0-1020-azure  500
13.64.195.21    00:0d:3a:5a:7b:51  5.0.0-1020-azure  500
40.83.214.204   00:0d:3a:36:47:98  5.0.0-1020-azure  500
13.64.195.27    00:0d:3a:5a:7f:05  5.0.0-1020-azure  500
13.64.195.31    00:0d:3a:5a:78:55  5.0.0-1020-azure  500
13.64.195.69    00:0d:3a:5a:7c:72  5.0.0-1020-azure  500
104.42.51.23    00:0d:3a:37:47:0a  5.0.0-1020-azure  500
13.64.233.120   00:0d:3a:37:46:ab  5.0.0-1020-azure  500
13.64.233.216   00:0d:3a:37:49:fb  5.0.0-1020-azure  500
13.64.239.157   00:0d:3a:37:43:cf  5.0.0-1020-azure  16   [hang]
52.160.87.177   00:0d:3a:35:fc:b1  5.0.0-1020-azure  500

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New

Bug description:
  Description:   In the event a particular Azure cloud instance is
  rebooted it's possible that it may never recover and the instance will
  break indefinitely.

  In my case, it was a kernel panic. See specifics below.

  Series: Disco
  Instance Size: Basic_A3
  Region: (Default) US-WEST-2
  Kernel Version: 4.18.0-1013-azure #13-Ubuntu SMP Thu Feb 28 22:54:16 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  I had a simple script to reboot an instance a given number of times (I
  chose 50), so the machine would power cycle by issuing a "reboot" from
  the terminal prompt just as a user would. Once the machine came up, the
  script captured dmesg and other bits, then rebooted again until it
  reached 50.

  After the 4th attempt, my script timed out, I took a look at the
  instance console log and the following displayed on the console.

  
  [  OK  ] Reached target Reboot.
  /shutdown: error while loading shared libra[   89.498980] Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
  [   89.498980]
  [   89.500042] CPU: 0 PID: 1 Comm: shutdown Not tainted 4.18.0-1013-azure #13-Ubuntu
  [   89.508026] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
  [   89.508026] Call Trace:
  [   89.508026]  dump_stack+0x63/0x8a
  [   89.508026]  panic+0xe7/0x247
  [   89.508026]  do_exit.cold.23+0x26/0x75
  [   89.508026]  do_group_exit+0x43/0xb0
  [   89.508026]  __x64_sys_exit_group+0x18/0x20
  [   89.508026]  do_syscall_64+0x5a/0x110
  [   89.508026]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [   89.508026] RIP: 0033:0x7f7bf0154d86
  [   89.508026] Code: Bad RIP value.
  [   89.508026] RSP: 002b:7ffd6be693b8 EFLAGS: 0206 ORIG_RAX: 
00e7
  [   89.508026] RAX: ffda RBX: 7f7bf015e420 RCX: 
7f7bf0154d86
  [   89.508026] RDX: 007f RSI: 003c RDI: 
007f
  [   89.508026] RBP: 7f7bef9449c0 R08: 00e7 R09: 

  [   89.508026] R10: 7ffd6be6974c R11: 0206 R12: 
0018
  [   89.508026] R13: 7f7bef944ac8 R14: 7f7bef944a00 R15: 

  [   89.508026] Kernel Offset: 0x1600 from 0x8100 (relocation 
range: 0x8000-0xbfff)
  [   89.508026] ---[ end Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x7f00
  [   89.508026]  ]---

  
  this only occurred once in my testing.
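
The exitcode in the panic message is in wait-status form, so it can be
decoded the same way as a value returned by wait(). Assuming the 0x7f00
shown in the log above is the complete value, a quick Python check (the
helper name is illustrative):

```python
import os


def describe_wait_status(status):
    """Decode a POSIX wait status into (kind, value)."""
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        return ("signaled", os.WTERMSIG(status))
    return ("other", status)


print(describe_wait_status(0x7F00))
```

This decodes to a normal exit with status 127, which is the status the
dynamic loader exits with when it cannot load a shared library. That
fits the truncated "error while loading shared libra" message printed
just before the panic.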

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1822118/+subscriptions



[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-27 Thread Colin Ian King
@Joseph, so I can reproduce this hang/crash issue across a variety of
instances. I can't get any info back on a console, so debugging this is
not easy.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New

Bug description:
  Description: When an Azure cloud instance is rebooted, it may never
  recover, leaving the instance broken indefinitely.

  In my case, it was a kernel panic. See specifics below.

  
  Series: Disco
  Instance Size: Basic_A3
  Region: (Default) US-WEST-2
  Kernel Version: 4.18.0-1013-azure #13-Ubuntu SMP Thu Feb 28 22:54:16 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  
  I had a simple script to reboot an instance a set number of times (I
  chose 50). The machine power cycles by issuing a "reboot" from the
  terminal prompt, just as a user would. Once the machine comes back up,
  the script captures dmesg and other bits, then reboots again until
  the count reaches 50.

  After the 4th attempt, my script timed out. I took a look at the
  instance console log and found the following displayed on the console.

  
  [  OK  ] Reached target Reboot.
  /shutdown: error while loading shared libra[   89.498980] Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
  [   89.498980]
  [   89.500042] CPU: 0 PID: 1 Comm: shutdown Not tainted 4.18.0-1013-azure #13-Ubuntu
  [   89.508026] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
  [   89.508026] Call Trace:
  [   89.508026]  dump_stack+0x63/0x8a
  [   89.508026]  panic+0xe7/0x247
  [   89.508026]  do_exit.cold.23+0x26/0x75
  [   89.508026]  do_group_exit+0x43/0xb0
  [   89.508026]  __x64_sys_exit_group+0x18/0x20
  [   89.508026]  do_syscall_64+0x5a/0x110
  [   89.508026]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [   89.508026] RIP: 0033:0x7f7bf0154d86
  [   89.508026] Code: Bad RIP value.
  [   89.508026] RSP: 002b:7ffd6be693b8 EFLAGS: 0206 ORIG_RAX: 00e7
  [   89.508026] RAX: ffda RBX: 7f7bf015e420 RCX: 7f7bf0154d86
  [   89.508026] RDX: 007f RSI: 003c RDI: 007f
  [   89.508026] RBP: 7f7bef9449c0 R08: 00e7 R09:
  [   89.508026] R10: 7ffd6be6974c R11: 0206 R12: 0018
  [   89.508026] R13: 7f7bef944ac8 R14: 7f7bef944a00 R15:
  [   89.508026] Kernel Offset: 0x1600 from 0x8100 (relocation range: 0x8000-0xbfff)
  [   89.508026] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
  [   89.508026]  ]---

  
  this only occurred once in my testing.



[Kernel-packages] [Bug 1845638] Re: ubuntu_lttng_smoke_test failed with D PowerPC

2019-09-27 Thread Colin Ian King
** Changed in: linux (Ubuntu)
   Status: Incomplete => In Progress

** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: ubuntu-kernel-tests
   Importance: Undecided => High

** Changed in: ubuntu-kernel-tests
 Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: ubuntu-kernel-tests
   Status: New => Fix Committed

** Changed in: ubuntu-kernel-tests
   Status: Fix Committed => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1845638

Title:
  ubuntu_lttng_smoke_test failed with D PowerPC

Status in ubuntu-kernel-tests:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Disco:
  Incomplete

Bug description:
  Found on node modoc:

  09/23 03:53:18 DEBUG| utils:0153| [stdout] == lttng smoke test trace context switches ==
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Session test-kernel-session created.
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Traces will be written in /tmp/lttng-kernel-trace-4683-session
  09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng create)
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Kernel event sched_switch created in channel channel0
  09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng enable-event)
  09/23 03:53:18 DEBUG| utils:0153| [stdout] Tracing started for session test-kernel-session
  09/23 03:53:18 DEBUG| utils:0153| [stdout] PASSED (lttng start)
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Waiting for data availability
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Tracing stopped for session test-kernel-session
  09/23 03:53:24 DEBUG| utils:0153| [stdout] PASSED (lttng stop)
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Session test-kernel-session destroyed
  09/23 03:53:24 DEBUG| utils:0153| [stdout] PASSED (lttng destroy)
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Found 10 dd and 19927 context switches
  09/23 03:53:24 DEBUG| utils:0153| [stdout] FAILED (did not trace any dd context switches)
  09/23 03:53:24 DEBUG| utils:0153| [stdout]
  09/23 03:53:24 DEBUG| utils:0153| [stdout] Summary: 7 passed, 1 failed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1845638/+subscriptions



[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-27 Thread Colin Ian King
I get more failures with Standard_B1ms instances:

IP addr         MAC addr            Kernel            Reboots
52.160.101.11   00:0d:3a:5b:a0:7c   5.0.0-1020-azure  10
137.135.51.101  00:0d:3a:31:20:fc   5.0.0-1020-azure  500
137.135.50.133  00:0d:3a:31:27:0f   5.0.0-1020-azure  396 [hang]
137.135.51.198  00:0d:3a:31:28:d7   5.0.0-1020-azure  500
137.135.49.89   00:0d:3a:31:22:c1   5.0.0-1020-azure  500
137.135.48.14   00:0d:3a:33:05:7d   5.0.0-1020-azure  500
104.40.5.23     00:0d:3a:32:e7:27   5.0.0-1020-azure  228 [hang]
13.93.223.213   00:0d:3a:32:e8:59   5.0.0-1020-azure  500
104.40.0.151    00:0d:3a:31:32:09   5.0.0-1020-azure  500
40.118.128.130  00:0d:3a:32:f5:71   5.0.0-1020-azure  500
23.101.200.119  00:0d:3a:36:c5:94   5.0.0-1020-azure  500
104.40.8.52     00:0d:3a:33:07:6e   5.0.0-1020-azure  500
104.40.19.222   00:0d:3a:33:01:0d   5.0.0-1020-azure  500
104.42.135.72   00:0d:3a:3b:e9:15   5.0.0-1020-azure  500
104.40.22.205   00:0d:3a:33:0d:d8   5.0.0-1020-azure  500

104.40.7.22     00:0d:3a:37:85:ff   5.0.0-1020-azure  500
13.88.17.94     00:0d:3a:5a:54:c6   5.0.0-1020-azure  500
104.40.8.196    00:0d:3a:59:56:f3   5.0.0-1020-azure  500
13.88.21.125    00:0d:3a:5a:50:00   5.0.0-1020-azure  500
13.88.23.139    00:0d:3a:5a:55:c3   5.0.0-1020-azure  500
23.99.81.188    00:0d:3a:5a:52:0f   5.0.0-1020-azure  500
13.88.20.132    00:0d:3a:5a:55:f1   5.0.0-1020-azure  500
13.88.20.126    00:0d:3a:5a:58:65   5.0.0-1020-azure  500
13.88.20.237    00:0d:3a:5a:55:42   5.0.0-1020-azure  500
13.88.17.35     00:0d:3a:5a:57:5f   5.0.0-1020-azure  500
13.91.54.225    00:0d:3a:5a:57:5a   5.0.0-1020-azure  500
13.88.21.57     00:0d:3a:5a:52:ce   5.0.0-1020-azure  500
13.88.21.67     00:0d:3a:5a:5b:b7   5.0.0-1020-azure  500
13.88.18.46     00:0d:3a:5a:5d:02   5.0.0-1020-azure  229 [hang]
13.88.16.222    00:0d:3a:37:80:d1   5.0.0-1020-azure  500
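
For anyone tallying results like these, here is a minimal Python sketch of the bookkeeping. It assumes each row is whitespace-separated (IP, MAC, kernel, reboot count, optional tag). The Run type and the function name are illustrative only and are not part of any test harness mentioned in this thread. The sample holds four rows taken from the table above.

```python
from dataclasses import dataclass


@dataclass
class Run:
    ip: str
    reboots: int   # reboots completed before the run ended
    hung: bool     # True if the row carries a hang/crash tag


def parse_rows(text):
    """Parse whitespace-separated result rows, skipping headers and blanks."""
    runs = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has at least IP, MAC, kernel and a numeric reboot count.
        if len(fields) < 4 or not fields[3].isdigit():
            continue
        lowered = line.lower()
        hung = "hang" in lowered or "crash" in lowered
        runs.append(Run(ip=fields[0], reboots=int(fields[3]), hung=hung))
    return runs


# Four rows copied from the Standard_B1ms results (assumed column split).
SAMPLE = """\
IP addr         MAC addr            Kernel            Reboots
137.135.51.101  00:0d:3a:31:20:fc   5.0.0-1020-azure  500
137.135.50.133  00:0d:3a:31:27:0f   5.0.0-1020-azure  396 [hang]
137.135.51.198  00:0d:3a:31:28:d7   5.0.0-1020-azure  500
104.40.5.23     00:0d:3a:32:e7:27   5.0.0-1020-azure  228 [hang]
"""

if __name__ == "__main__":
    runs = parse_rows(SAMPLE)
    hung = sum(r.hung for r in runs)
    total = sum(r.reboots for r in runs)
    print(f"{hung} of {len(runs)} instances hung over {total} reboots")
```

Feeding it the full table instead of the sample gives the per-run hang count without counting rows by hand.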

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New


[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

2019-09-26 Thread Colin Ian King
See: https://github.com/lxc/lxd/issues/4656#issuecomment-535531229

In https://github.com/lxc/lxd/blob/master/lxd/storage_zfs_utils.go#L255
the umount is done by

err := unix.Unmount(mountpoint, unix.MNT_DETACH)

The umount2(2) manpage writes about MNT_DETACH:

Perform a lazy unmount: make the mount point unavailable for new
accesses, immediately disconnect the filesystem and all filesystems
mounted below it from each other and from the mount table, and actually
perform the unmount when the mount point ceases to be busy.

Could this be it? The MNT_DETACH umount looks partially asynchronous.
All the subsequent destroy commands may fail because they keep the mount
point busy. Finally the retry loop ends, the umount happens for real and
the following destroy succeeds.
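
To illustrate the behaviour described above, here is a small Python sketch of a retry loop around a destroy step that fails while the dataset is still busy. This is not LXD's code: DatasetBusy, destroy_with_retry and the attempt/delay defaults are hypothetical names and values chosen for the example, and the destroy step is passed in as a callable.

```python
import time


class DatasetBusy(Exception):
    """Hypothetical stand-in for a 'dataset is busy' failure."""


def destroy_with_retry(destroy, attempts=20, delay=0.5, sleep=time.sleep):
    """Call destroy() until it stops raising DatasetBusy.

    Returns the number of the attempt that succeeded. If the last attempt
    still fails, the DatasetBusy error is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            destroy()
            return attempt
        except DatasetBusy:
            if attempt == attempts:
                raise
            # A lazy unmount may still be pending, so wait and try again.
            sleep(delay)


if __name__ == "__main__":
    state = {"busy_for": 2}  # pretend the dataset stays busy for two tries

    def fake_destroy():
        if state["busy_for"] > 0:
            state["busy_for"] -= 1
            raise DatasetBusy("dataset is busy")

    print("succeeded on attempt", destroy_with_retry(fake_destroy, delay=0))
```

If the theory is right, the early attempts fail only because the lazy unmount has not completed, and the loop succeeds once it has.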


-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  New
Status in lxc source package in Disco:
  New
Status in linux source package in Eoan:
  Triaged
Status in lxc source package in Eoan:
  Confirmed

Bug description:
  I'm not sure exactly what got me into this state, but I have several
  lxc containers that cannot be deleted.

  $ lxc info
  
  api_status: stable
  api_version: "1.0"
  auth: trusted
  public: false
  auth_methods:
  - tls
  environment:
    addresses: []
    architectures:
    - x86_64
    - i686
    certificate: |
      -----BEGIN CERTIFICATE-----

      -----END CERTIFICATE-----
    certificate_fingerprint: 3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
    driver: lxc
    driver_version: 3.0.1
    kernel: Linux
    kernel_architecture: x86_64
    kernel_version: 4.15.0-23-generic
    server: lxd
    server_pid: 15123
    server_version: "3.2"
    storage: zfs
    storage_version: 0.7.5-1ubuntu15
    server_clustered: false
    server_name: milhouse

  $ lxc delete --force b1
  Error: Failed to destroy ZFS filesystem: cannot destroy 'default/containers/b1': dataset is busy

  Talking in #lxc-dev, stgraber and sforeshee provided diagnosis:

   | short version is that something unshared a mount namespace causing
   | them to get a copy of the mount table at the time that dataset was
   | mounted, which then prevents zfs from being able to destroy it)

  The work around provided was

   | you can unstick this particular issue by doing:
   |  grep default/containers/b1 /proc/*/mountinfo
   | then for any of the hits, do:
   |   nsenter -t PID -m -- umount /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
   | then try the delete again
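
  The work around above can be sketched in Python as follows. This is only an illustration: pids_with_mount and unstick_commands are hypothetical helper names, and the script only prints the nsenter commands quoted above rather than running them. Reading other processes' mountinfo files generally needs root.

```python
import os
import shlex


def pids_with_mount(needle, proc_root="/proc"):
    """Return PIDs whose mountinfo mentions needle (like grep over /proc/*/mountinfo)."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        path = os.path.join(proc_root, entry, "mountinfo")
        try:
            with open(path, "r", errors="replace") as fh:
                if any(needle in line for line in fh):
                    pids.append(int(entry))
        except OSError:
            continue  # process exited, or we lack permission to read it
    return sorted(pids)


def unstick_commands(pids, mountpoint):
    """Build (but do not run) one nsenter/umount command per PID."""
    return [
        f"nsenter -t {pid} -m -- umount {shlex.quote(mountpoint)}"
        for pid in pids
    ]


if __name__ == "__main__":
    dataset = "default/containers/b1"
    mountpoint = "/var/snap/lxd/common/lxd/storage-pools/default/containers/b1"
    for command in unstick_commands(pids_with_mount(dataset), mountpoint):
        print(command)
```

  Printing the commands first leaves the decision to unmount in each namespace with the operator.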

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.15.0-23-generic 4.15.0-23.25
  ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
  Uname: Linux 4.15.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.10-0ubuntu3
  Architecture: amd64
  AudioDevicesInUse:
   USER  PID  ACCESS  COMMAND
   /dev/snd/controlC1:  smoser  31412  F  pulseaudio
   /dev/snd/controlC2:  smoser  31412  F  pulseaudio
   /dev/snd/controlC0:  smoser  31412  F  pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 28 10:42:45 2018
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2015-07-23 (1071 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
  MachineType: 
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
 
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-23-generic N/A
   linux-backports-modules-4.15.0-23-generic  N/A
   linux-firmware 1.174
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 03/09/2015
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
  dmi.board.asset.tag: �
  dmi.board.name: NUC5i5RYB
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H40999-503
  dmi.chassis.asset.tag: �
  dmi.chassis.type: 3
  dmi.chassis.vendor: �
  dmi.chassis.version: 

[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-26 Thread Colin Ian King
I kicked off another ~20K reboot tests with Standard_B2S instances and
hit hangs again:

IP addr         MAC addr            Kernel            Reboots
104.42.3.161    00:0d:3a:37:82:ee   5.0.0-1020-azure  100
13.91.5.23      00:0d:3a:5a:74:23   5.0.0-1020-azure  57   [ HANG ]
13.91.5.222     00:0d:3a:5a:75:1a   5.0.0-1020-azure  100
13.64.117.146   00:0d:3a:5a:74:da   5.0.0-1020-azure  100
13.64.117.17    00:0d:3a:37:67:0e   5.0.0-1020-azure  100
13.91.6.207     00:0d:3a:3a:cc:2c   5.0.0-1020-azure  100
40.78.30.129    00:0d:3a:36:6e:eb   5.0.0-1020-azure  100
104.210.36.238  00:0d:3a:5a:73:da   5.0.0-1020-azure  100
13.91.6.143     00:0d:3a:3a:c8:ec   5.0.0-1020-azure  100
40.83.249.58    00:0d:3a:3a:c0:7a   5.0.0-1020-azure  100
104.45.216.53   00:0d:3a:3b:8a:55   5.0.0-1020-azure  100
104.210.42.18   00:0d:3a:5a:73:5c   5.0.0-1020-azure  100
40.78.27.21     00:0d:3a:3a:c9:19   5.0.0-1020-azure  100
40.83.252.110   00:0d:3a:5a:79:93   5.0.0-1020-azure  100
13.64.119.204   00:0d:3a:5a:7e:bc   5.0.0-1020-azure  100

104.210.34.4    00:0d:3a:31:18:ee   5.0.0-1020-azure  250
138.91.197.202  00:0d:3a:31:1d:c1   5.0.0-1020-azure  94   [ HANG ]
138.91.196.241  00:0d:3a:31:15:2b   5.0.0-1020-azure  250
104.210.33.44   00:0d:3a:31:16:f3   5.0.0-1020-azure  250
40.83.248.76    00:0d:3a:32:af:a7   5.0.0-1020-azure  250
40.83.253.204   00:0d:3a:32:ba:09   5.0.0-1020-azure  250
168.62.202.8    00:0d:3a:32:a0:11   5.0.0-1020-azure  250
40.83.249.8     00:0d:3a:32:bd:ce   5.0.0-1020-azure  250
40.83.249.93    00:0d:3a:32:b7:32   5.0.0-1020-azure  250
40.83.253.187   00:0d:3a:32:b9:cd   5.0.0-1020-azure  250
23.99.9.88      00:0d:3a:37:96:c9   5.0.0-1020-azure  250
104.40.29.184   00:0d:3a:36:9f:e0   5.0.0-1020-azure  250
137.135.40.122  00:0d:3a:36:9f:eb   5.0.0-1020-azure  250
137.135.49.43   00:0d:3a:36:92:aa   5.0.0-1020-azure  250
138.91.251.8    00:0d:3a:37:9e:ef   5.0.0-1020-azure  250

13.64.146.175   00:0d:3a:31:de:ee   5.0.0-1020-azure  500
104.42.23.145   00:0d:3a:31:da:d7   5.0.0-1020-azure  500
104.42.29.99    00:0d:3a:31:d4:4f   5.0.0-1020-azure  500
40.78.106.12    00:0d:3a:31:d9:8a   5.0.0-1020-azure  500
138.91.233.210  00:0d:3a:31:df:84   5.0.0-1020-azure  500
104.42.25.30    00:0d:3a:31:c9:a4   5.0.0-1020-azure  500
13.64.150.69    00:0d:3a:31:dd:47   5.0.0-1020-azure  321   [ HANG ]
104.42.25.23    00:0d:3a:31:d3:c9   5.0.0-1020-azure  500
104.42.24.176   00:0d:3a:31:d8:36   5.0.0-1020-azure  500
13.64.79.133    00:0d:3a:31:d5:b4   5.0.0-1020-azure  500
104.42.29.146   00:0d:3a:31:de:73   5.0.0-1020-azure  500
104.42.19.191   00:0d:3a:31:d4:78   5.0.0-1020-azure  500
40.118.249.118  00:0d:3a:31:db:20   5.0.0-1020-azure  500
40.112.219.112  00:0d:3a:31:dc:da   5.0.0-1020-azure  500
104.42.17.115   00:0d:3a:31:d3:21   5.0.0-1020-azure  500
40.83.212.164   00:0d:3a:5a:ab:48   5.0.0-1020-azure  500
52.160.123.4    00:0d:3a:36:0d:6a   5.0.0-1020-azure  500
52.160.83.37    00:0d:3a:5a:ab:79   5.0.0-1020-azure  500
52.160.122.92   00:0d:3a:36:00:4c   5.0.0-1020-azure  500
52.160.122.71   00:0d:3a:36:0f:bd   5.0.0-1020-azure  500
52.160.123.12   00:0d:3a:36:04:39   5.0.0-1020-azure  500
104.210.60.218  00:0d:3a:36:b6:25   5.0.0-1020-azure  500
52.160.123.221  00:0d:3a:5a:a9:a3   5.0.0-1020-azure  500
52.160.123.234  00:0d:3a:5a:a7:1c   5.0.0-1020-azure  500
104.210.61.139  00:0d:3a:37:b7:84   5.0.0-1020-azure  500
104.210.61.43   00:0d:3a:36:b5:96   5.0.0-1020-azure  500
40.83.212.185   00:0d:3a:5a:af:9c   5.0.0-1020-azure  500
52.160.82.111   00:0d:3a:5a:a9:a9   5.0.0-1020-azure  500
52.160.82.167   00:0d:3a:5a:a7:17   5.0.0-1020-azure  500
104.210.61.135  00:0d:3a:36:b3:97   5.0.0-1020-azure  500

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New


[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-26 Thread Colin Ian King
So the best way to reproduce this issue is to run ~500 reboots across
multiple instances rather than 5000-1 reboots on one instance.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New



[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-25 Thread Colin Ian King
See above, I ran several thousand reboot tests on a lot of Basic_A3
instances, ranging from 50, 250 to 500 reboots.  Only one failed.  So
this is *really* hard to reproduce.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New



[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-25 Thread Colin Ian King
IP addr         MAC addr            Kernel            Reboots
13.64.67.186    00:0d:3a:3a:dd:04   5.0.0-1016-azure  50
104.42.152.115  00:0d:3a:35:b1:e6   5.0.0-1016-azure  50
65.52.121.205   00:0d:3a:3b:0f:52   5.0.0-1016-azure  50
13.88.28.42     00:0d:3a:3b:c7:da   5.0.0-1016-azure  50
40.118.165.237  00:0d:3a:3b:c2:4e   5.0.0-1016-azure  50
40.118.190.105  00:0d:3a:36:c6:d7   5.0.0-1016-azure  50
40.78.90.95     00:0d:3a:37:c0:d9   5.0.0-1016-azure  50
13.83.84.150    00:0d:3a:37:c0:15   5.0.0-1016-azure  50
104.42.74.129   00:0d:3a:36:c2:3e   5.0.0-1016-azure  50
40.85.154.162   00:0d:3a:37:cc:dd   5.0.0-1016-azure  50
40.78.43.4      00:0d:3a:37:c5:07   5.0.0-1016-azure  50
13.93.142.147   00:0d:3a:37:c8:5f   5.0.0-1016-azure  50
40.78.44.229    00:0d:3a:3b:e4:80   5.0.0-1016-azure  50
40.118.189.62   00:0d:3a:3b:e8:8e   5.0.0-1016-azure  50
40.78.85.10     00:0d:3a:3b:e6:37   5.0.0-1016-azure  50
40.78.13.203    00:0d:3a:3a:c2:b0   5.0.0-1016-azure  50
104.42.112.81   00:0d:3a:30:71:fb   5.0.0-1016-azure  50
40.80.156.132   00:0d:3a:30:2f:7c   5.0.0-1016-azure  50
13.64.173.138   00:0d:3a:30:73:b2   5.0.0-1016-azure  50
13.64.189.105   00:0d:3a:30:a4:6f   5.0.0-1016-azure  50
13.64.189.127   00:0d:3a:30:a4:1f   5.0.0-1016-azure  50
104.45.237.232  00:0d:3a:32:1e:3b   5.0.0-1016-azure  50
104.42.233.11   00:0d:3a:32:34:68   5.0.0-1016-azure  50
104.42.233.20   00:0d:3a:34:ed:42   5.0.0-1016-azure  50
23.101.202.206  00:0d:3a:32:32:b0   5.0.0-1016-azure  50
104.42.233.18   00:0d:3a:34:ee:ba   5.0.0-1016-azure  50
104.42.233.151  00:0d:3a:34:e9:0d   5.0.0-1016-azure  50
104.40.51.248   00:0d:3a:32:27:c6   5.0.0-1016-azure  50
104.40.69.158   00:0d:3a:34:f1:5d   5.0.0-1016-azure  50
52.160.41.95    00:0d:3a:35:9f:c8   5.0.0-1016-azure  50
104.42.158.74   00:0d:3a:34:c7:91   5.0.0-1016-azure  50

IP addr         MAC addr            Kernel            Reboots
40.83.145.235   00:0d:3a:5a:01:f9   5.0.0-1016-azure  250
104.210.50.91   00:0d:3a:35:b2:48   5.0.0-1016-azure  250
13.88.186.166   00:0d:3a:5a:0b:86   5.0.0-1016-azure  250
40.118.185.194  00:0d:3a:35:b9:59   5.0.0-1016-azure  250
104.42.37.175   00:0d:3a:5a:06:ff   5.0.0-1016-azure  250
13.88.186.188   00:0d:3a:5a:05:da   5.0.0-1016-azure  250
104.210.48.49   00:0d:3a:35:b8:a7   5.0.0-1016-azure  250
104.210.50.215  00:0d:3a:35:ba:13   5.0.0-1016-azure  250
40.78.52.50     00:0d:3a:35:b6:50   5.0.0-1016-azure  250
40.118.186.25   00:0d:3a:35:b5:93   5.0.0-1016-azure  250
13.93.233.26    00:0d:3a:37:06:7e   5.0.0-1016-azure  156 crashed
13.93.136.144   00:0d:3a:37:0e:2f   5.0.0-1016-azure  250
40.118.241.192  00:0d:3a:32:c4:5d   5.0.0-1016-azure  250
40.83.160.52    00:0d:3a:37:47:48   5.0.0-1016-azure  250
104.42.9.61     00:0d:3a:36:d3:a0   5.0.0-1016-azure  250


IP addr         MAC addr            Kernel            Reboots
104.40.1.50     00:0d:3a:30:8d:ae   5.0.0-1016-azure  500
104.40.3.205    00:0d:3a:30:81:d5   5.0.0-1016-azure  500
104.40.9.37     00:0d:3a:30:86:bb   5.0.0-1016-azure  500
104.40.0.242    00:0d:3a:30:88:6b   5.0.0-1016-azure  500
104.40.12.184   00:0d:3a:30:8e:e0   5.0.0-1016-azure  500
137.135.46.72   00:0d:3a:30:ac:df   5.0.0-1016-azure  500
137.135.47.169  00:0d:3a:30:9a:3f   5.0.0-1016-azure  500
104.40.10.226   00:0d:3a:30:2d:fb   5.0.0-1016-azure  500
104.40.10.244   00:0d:3a:30:8e:12   5.0.0-1016-azure  500
104.40.15.160   00:0d:3a:30:87:4d   5.0.0-1016-azure  500
40.112.132.2    00:0d:3a:59:b5:80   5.0.0-1016-azure  500
13.64.97.59     00:0d:3a:59:b1:e7   5.0.0-1016-azure  500
40.78.19.224    00:0d:3a:5a:bd:99   5.0.0-1016-azure  500
13.64.88.51     00:0d:3a:37:30:07   5.0.0-1016-azure  500
40.118.225.110  00:0d:3a:59:b1:8f   5.0.0-1016-azure  500

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New


[Kernel-packages] [Bug 1815178] Re: 18.04: Raid performances on kernel 4.15 and newer are suboptimal when used on NVMe devices

2019-09-24 Thread Colin Ian King
This bug has been dormant for a while with no update. I'm marking it as
Won't Fix. If this is still an issue, please re-open this bug.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1815178

Title:
  18.04: Raid performances on kernel 4.15 and newer are suboptimal when
  used on NVMe devices

Status in linux package in Ubuntu:
  Won't Fix

Bug description:
  Hello,
  We have been running multiple tests using the md driver to build
  various types of RAID devices. Performance on RAID5 is particularly
  disappointing, so we would like to know if there are any known issues
  with the md driver on Bionic kernels. Here are some of our results:

  Test (Threads)   QCT JBOD   QCT RAID5 (8 threads)   QCT RAID10
  read(1)          838        751 (-10%)              572 (-32%)
  read(2)          1226       1172 (-4%)              1057 (-14%)
  read(4)          1129       2006 (+78%)             1760 (+56%)
  write(1)         804        382 (-52%)              398 (-50%)
  write(2)         1047       175 (-83%)              667 (-36%)
  write(4)         883        415 (-53%)              871 (-1%)
  randread(1)      863        749 (-13%)              619 (-28%)
  randread(2)      1346       1067 (-21%)             1020 (-24%)
  randread(4)      1648       1785 (+8%)              1650 (=)
  randwrite(1)     766        287 (-63%)              391 (-50%)
  randwrite(2)     1044       313 (-70%)              664 (-36%)
  randwrite(4)     916        282 (-69%)              885 (-3%)
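
  The percentages in the table appear to be the change relative to the JBOD column. A short Python sketch of that arithmetic follows, assuming the read(1) row reads 838 for JBOD, 751 for RAID5 and 572 for RAID10, and the write(2) row reads 1047, 175 and 667. The function names are illustrative, and the table does not state the unit of the figures.

```python
def pct_change(baseline: float, value: float) -> float:
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# (JBOD, RAID5, RAID10) figures for two rows, as assumed from the table.
ROWS = {
    "read(1)": (838, 751, 572),
    "write(2)": (1047, 175, 667),
}


def describe(rows):
    """Format each row as signed, rounded percentages against JBOD."""
    lines = []
    for test, (jbod, raid5, raid10) in rows.items():
        lines.append(
            f"{test}: RAID5 {pct_change(jbod, raid5):+.0f}%, "
            f"RAID10 {pct_change(jbod, raid10):+.0f}%"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(describe(ROWS)))
```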

  We are preparing tests with one of the latest mainline kernels.

  TIA,

  ...Louis

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1815178/+subscriptions



[Kernel-packages] [Bug 1811730] Re: Thermald does not set max CPU after reseting the voltage using RAPL

2019-09-24 Thread Colin Ian King
** Changed in: thermald (Ubuntu)
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to thermald in Ubuntu.
https://bugs.launchpad.net/bugs/1811730

Title:
  Thermald does not set max CPU after reseting the voltage using RAPL

Status in thermald package in Ubuntu:
  Fix Released
Status in thermald source package in Bionic:
  Fix Released

Bug description:
  Hi,

  I was using the Ubuntu 18.10 thermald package, but I noticed that after
  a few seconds at max CPU usage (max temp), thermald sent the signal to
  RAPL to reduce the voltage of the CPU. It set the freq to minimum
  (800MHz in my case). But when the CPU was idle and the temp had dropped
  (35-40ºC), it did not send the signal to resume normal operation of
  the CPU.

  I compiled the latest version of thermald from git
  (https://github.com/intel/thermal_daemon) and now everything works
  fine.

  I started to think that the problem was a hardware or BIOS problem,
  as I had disabled CPU scaling in the BIOS, but intel_pstate
  continued doing freq scaling (I think for Turbo mode).

  But the real problem was thermald. The latest version from git works
  really well: it automatically disables and enables the Intel Turbo
  state and balances the freqs and fan control fine.

  My hardware is a Lenovo Thinkpad P52 with i7-8850H.

  I tested it using the Ubuntu kernel and kernel 4.20 optimized for the
  i7 processor but with the Ubuntu default config (except processor
  family="Core2/newer Xeon", Preemption Model="Preemptible Kernel
  (Low-Latency Desktop)" and Timer frequency="1000HZ"). I only changed
  those settings from make oldconfig and make deb-pkg.

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: thermald (not installed)
  ProcVersionSignature: Ubuntu 4.18.0-13.14-generic 4.18.17
  Uname: Linux 4.18.0-13-generic x86_64
  NonfreeKernelModules: nvidia_modeset nvidia
  ApportVersion: 2.20.10-0ubuntu13.1
  Architecture: amd64
  CurrentDesktop: ubuntu:GNOME
  Date: Mon Jan 14 23:45:01 2019
  InstallationDate: Installed on 2018-12-11 (34 days ago)
  InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.3)
  SourcePackage: thermald
  UpgradeStatus: No upgrade log present (probably fresh install)

  -

  SRU Justification
  =================

  [Impact]
   * As described by the original bug reporter, CPU usage of the Lenovo P52
     is sub-optimal under heavy load.
   * My observation is that the machine exhibits a sharp drop in power usage
     and CPU frequency and takes time to slowly ramp up again (refer to the
     chart at https://bit.ly/2OJphB8).
   * Fixed by bisecting and backporting fixes from the thermald project.

  [Test Case]
   * One can stress the CPU load of the machine and collect the CPU frequency
     and power usage over time to check for any anomaly. The script at
     https://people.canonical.com/~ypwong/p52_test_cpu.sh can help with this.
   * With the fix, the behaviour should be like this: https://bit.ly/2KA9EXB;
     a consistent power usage can be maintained.
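
A rough sketch of the kind of sampling the test case describes. This is not the contents of the p52_test_cpu.sh script, which is not reproduced here. It assumes the Linux powercap interface exposes a cumulative package energy counter at /sys/class/powercap/intel-rapl:0/energy_uj and that cpufreq exposes /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq in kHz; both paths can differ, or be unreadable without root, on a given machine. The helper names are hypothetical.

```python
import time
from pathlib import Path

# Assumed sysfs locations; adjust for the machine under test.
ENERGY = Path("/sys/class/powercap/intel-rapl:0/energy_uj")
MAX_RANGE = Path("/sys/class/powercap/intel-rapl:0/max_energy_range_uj")
FREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")


def average_watts(e0_uj: int, e1_uj: int, seconds: float, max_range_uj: int) -> float:
    """Average power between two energy counter samples, allowing one wraparound."""
    delta = e1_uj - e0_uj
    if delta < 0:  # the counter wrapped past max_energy_range_uj
        delta += max_range_uj
    return delta / 1e6 / seconds


def sample(interval: float = 1.0, count: int = 10) -> None:
    """Print CPU0 frequency (MHz) and package power (W) once per interval."""
    max_range = int(MAX_RANGE.read_text())
    prev = int(ENERGY.read_text())
    for _ in range(count):
        time.sleep(interval)
        cur = int(ENERGY.read_text())
        mhz = int(FREQ.read_text()) / 1000
        print(f"{mhz:7.0f} MHz  {average_watts(prev, cur, interval, max_range):6.2f} W")
        prev = cur
```

Calling `sample()` while a CPU load generator is running should show whether frequency and power recover after a drop or stay pinned low.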

  [Regression Potential]
   * Medium. The fix consists of 7 commits cherry-picked from upstream; these
     changes will affect any machine using the RAPL cooling device. The impact
     may not be obvious in normal daily usage but will manifest during heavy
     load, so a longer verification period is suggested so that more people
     can discover any adverse effects.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/thermald/+bug/1811730/+subscriptions



[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

2019-09-24 Thread Colin Ian King
Been digging into this a bit further with lxc 3.17 on Eoan.

lxc launch ubuntu:bionic zfs-bug-test
Creating zfs-bug-test
Starting zfs-bug-test
lxc delete zfs-bug-test --force
Error: Failed to destroy ZFS filesystem: Failed to run: zfs destroy -r default/containers/z1: cannot destroy 'default/containers/z1': dataset is busy

However, re-running the delete works fine:
lxd.lxc delete z1 --force

Looking at system calls, it appears that the first, failing delete
--force command attempts to destroy the zfs file system multiple times
and then gives up. In doing so, it umounts the zfs file system.  Hence
the second time the delete is issued it works fine because zfs is now
umounted.  So it appears that the ordering in the delete is not as
expected.

It seems to do:
zfs destroy x 10 (or so, and then gives up because of errno 16, EBUSY)
zfs umount

It should be doing:
zfs umount
zfs destroy

This matches the observed reference counting.  The ref count is only
dropped once the umount is complete. Attempts to destroy it before that
will cause an -EBUSY.
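
To make the ordering argument concrete, here is a toy Python model of the behaviour described above. It is not lxd or ZFS code: the `Dataset` class and its methods are hypothetical stand-ins, and the only behaviour modelled is that a mounted dataset holds a reference which makes destroy fail with EBUSY until the umount has completed.

```python
import errno


class Dataset:
    """Toy stand-in for a ZFS dataset whose mount holds one reference."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.holds = 1  # taken by the mount
        self.destroyed = False

    def umount(self) -> None:
        # The reference is only dropped once the umount is complete.
        self.holds = 0

    def destroy(self) -> None:
        if self.holds != 0:
            raise OSError(errno.EBUSY, "dataset is busy", self.name)
        self.destroyed = True


def delete_observed(ds: Dataset, retries: int = 10) -> bool:
    """Observed order: retry destroy, give up, then umount."""
    for _ in range(retries):
        try:
            ds.destroy()
            return True
        except OSError as exc:
            if exc.errno != errno.EBUSY:
                raise
    ds.umount()
    return False


def delete_expected(ds: Dataset) -> bool:
    """Expected order: umount first, then destroy."""
    ds.umount()
    ds.destroy()
    return True


first = Dataset("default/containers/z1")
print(delete_observed(first))   # first delete gives up, leaving the dataset umounted
print(delete_observed(first))   # re-running succeeds because nothing holds it now
print(delete_expected(Dataset("default/containers/z1")))
```

In this model the first delete fails and the second succeeds purely because of the ordering, which is the pattern seen with lxc delete --force.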

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  New
Status in lxc source package in Disco:
  New
Status in linux source package in Eoan:
  Triaged
Status in lxc source package in Eoan:
  Confirmed

Bug description:
  I'm not sure exactly what got me into this state, but I have several
  lxc containers that cannot be deleted.

  $ lxc info
  
  api_status: stable
  api_version: "1.0"
  auth: trusted
  public: false
  auth_methods:
  - tls
  environment:
    addresses: []
    architectures:
    - x86_64
    - i686
    certificate: |
      -----BEGIN CERTIFICATE-----

      -----END CERTIFICATE-----
    certificate_fingerprint: 3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
    driver: lxc
    driver_version: 3.0.1
    kernel: Linux
    kernel_architecture: x86_64
    kernel_version: 4.15.0-23-generic
    server: lxd
    server_pid: 15123
    server_version: "3.2"
    storage: zfs
    storage_version: 0.7.5-1ubuntu15
    server_clustered: false
    server_name: milhouse

  $ lxc delete --force b1
  Error: Failed to destroy ZFS filesystem: cannot destroy 'default/containers/b1': dataset is busy

  Talking in #lxc-dev, stgraber and sforeshee provided diagnosis:

   | short version is that something unshared a mount namespace causing
   | them to get a copy of the mount table at the time that dataset was
   | mounted, which then prevents zfs from being able to destroy it)

  The work around provided was

   | you can unstick this particular issue by doing:
   |  grep default/containers/b1 /proc/*/mountinfo
   | then for any of the hits, do:
   |   nsenter -t PID -m -- umount /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
   | then try the delete again
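
The grep step of that workaround can also be scripted. The following is a small Python sketch, not something from the bug report: `find_mount_holders` is a hypothetical helper that scans each process's mountinfo under /proc for a path fragment and returns the matching PIDs, which could then be passed to nsenter as shown above. Reading other users' mountinfo files generally needs root.

```python
from pathlib import Path
from typing import List


def find_mount_holders(fragment: str, proc_root: str = "/proc") -> List[int]:
    """Return PIDs whose mount table mentions fragment (like grep over /proc/*/mountinfo)."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self or /proc/sys
        try:
            text = (entry / "mountinfo").read_text(errors="replace")
        except OSError:
            continue  # process exited or access was denied
        if fragment in text:
            pids.append(int(entry.name))
    return sorted(pids)
```

For the container in this report the call would be `find_mount_holders("default/containers/b1")`; an empty list would suggest that no other mount namespace still holds the dataset.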

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.15.0-23-generic 4.15.0-23.25
  ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
  Uname: Linux 4.15.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.10-0ubuntu3
  Architecture: amd64
  AudioDevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  smoser31412 F pulseaudio
   /dev/snd/controlC2:  smoser31412 F pulseaudio
   /dev/snd/controlC0:  smoser31412 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 28 10:42:45 2018
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2015-07-23 (1071 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
  MachineType: 
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
 
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic 
root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-23-generic N/A
   linux-backports-modules-4.15.0-23-generic  N/A
   linux-firmware 1.174
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 03/09/2015
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
  dmi.board.asset.tag: �
  

[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-20 Thread Colin Ian King
@Robert, was there a specific class of virtual machine you were using
when this issue occurred?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New

Bug description:
  Description:   In the event a particular Azure cloud instance is
  rebooted, it's possible that it may never recover and the instance will
  remain broken indefinitely.

  In my case, it was a kernel panic. See specifics below.

  
  Series: Disco
  Instance Size: Basic_A3
  Region: (Default) US-WEST-2
  Kernel Version: 4.18.0-1013-azure #13-Ubuntu SMP Thu Feb 28 22:54:16 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  
  I had a simple script to reboot an instance (X) number of times; I chose 50,
  so the machine would power cycle by issuing a "reboot" from the terminal
  prompt just as a user would. Once the machine came up, it captured dmesg
  and other bits, then rebooted again until it reached 50.

  After the 4th attempt, my script timed out. I took a look at the
  instance console log and the following was displayed on the console.

  
  [  OK  ] Reached target Reboot.
  /shutdown: error while loading shared libra[   89.498980] Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
  [   89.498980]
  [   89.500042] CPU: 0 PID: 1 Comm: shutdown Not tainted 4.18.0-1013-azure #13-Ubuntu
  [   89.508026] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
  [   89.508026] Call Trace:
  [   89.508026]  dump_stack+0x63/0x8a
  [   89.508026]  panic+0xe7/0x247
  [   89.508026]  do_exit.cold.23+0x26/0x75
  [   89.508026]  do_group_exit+0x43/0xb0
  [   89.508026]  __x64_sys_exit_group+0x18/0x20
  [   89.508026]  do_syscall_64+0x5a/0x110
  [   89.508026]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [   89.508026] RIP: 0033:0x7f7bf0154d86
  [   89.508026] Code: Bad RIP value.
  [   89.508026] RSP: 002b:7ffd6be693b8 EFLAGS: 0206 ORIG_RAX: 00e7
  [   89.508026] RAX: ffda RBX: 7f7bf015e420 RCX: 7f7bf0154d86
  [   89.508026] RDX: 007f RSI: 003c RDI: 007f
  [   89.508026] RBP: 7f7bef9449c0 R08: 00e7 R09:
  [   89.508026] R10: 7ffd6be6974c R11: 0206 R12: 0018
  [   89.508026] R13: 7f7bef944ac8 R14: 7f7bef944a00 R15:
  [   89.508026] Kernel Offset: 0x1600 from 0x8100 (relocation range: 0x8000-0xbfff)
  [   89.508026] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
  [   89.508026]  ]---

  
  this only occurred once in my testing.
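
As an aside on the exitcode=0x7f00 value in the panic line: assuming the kernel prints the raw wait-style exit code there, the exit status sits in the high byte, so it decodes to exit status 127. That is the status the glibc dynamic loader conventionally exits with when a shared library cannot be loaded, which would be consistent with the truncated "error while loading shared libra" message from shutdown. A small Python sketch of the decoding; `decode_exitcode` is a hypothetical helper:

```python
from typing import Tuple


def decode_exitcode(code: int) -> Tuple[int, int]:
    """Split a wait-style exit code into (exit status, terminating signal)."""
    status = (code >> 8) & 0xFF  # high byte: value passed to exit()
    signal = code & 0x7F         # low 7 bits: signal number, 0 if exited normally
    return status, signal


status, signal = decode_exitcode(0x7F00)
print(f"exit status {status}, signal {signal}")
```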

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1822118/+subscriptions



[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-18 Thread Colin Ian King
** Changed in: linux-azure (Ubuntu)
   Status: In Progress => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1822118/+subscriptions



[Kernel-packages] [Bug 1822118] Re: Kernel Panic while rebooting cloud instance

2019-09-18 Thread Colin Ian King
@Joseph, any ideas how we can progress on this?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1822118

Title:
  Kernel Panic while rebooting cloud instance

Status in linux-azure package in Ubuntu:
  Incomplete
Status in systemd package in Ubuntu:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1822118/+subscriptions



[Kernel-packages] [Bug 1798574] Re: bionic/linux: hung task triggered by ubuntu_vfat_stress stress-ng testcase

2019-09-18 Thread Colin Ian King
I can reproduce this on 4.15.0-38 but not on a more recent kernel, e.g.
4.15.0-64, so I think this has been fixed.  I'll close this for now as
Fix Released, but feel free to re-open it if we see the same issue
again.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Fix Released

** Changed in: linux (Ubuntu Bionic)
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1798574

Title:
  bionic/linux: hung task triggered by ubuntu_vfat_stress stress-ng
  testcase

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  When running the ubuntu_vfat_stress on a powerpc64el system with
  kernel 4.15.0-36-generic it triggers the following hung task:

  

  10:39:09 DEBUG| [stdout] Mounted tmpfs /mnt/vfat-test-56562
  10:39:09 DEBUG| [stdout] Created loop image /mnt/vfat-test-56562/vfat-loop-data
  10:39:09 DEBUG| [stdout] mkfs.fat 4.1 (2017-01-24)
  10:39:09 DEBUG| [stdout]
  10:39:09 DEBUG| [stdout]

  10:39:09 DEBUG| [stdout] VFAT options:    dmask=777
  10:39:09 DEBUG| [stdout] Stress test:     /home/ubuntu/autotest/client/tmp/ubuntu_vfat_stress/src/stress-ng/stress-ng --verify --times --metrics-brief --syslog --keep-name -t 10s --hdd 2 --hdd-opts sync,wr-rnd,rd-rnd,fadv-willneed,fadv-rnd --lockf 2 --seek 2 --aio 2 --aio-requests 32 --dentry 2 --dir 2 --dentry-order stride --fallocate 2 --fstat 2 --dentries 100 --lease 2 --open 2 --rename 2 --hdd-bytes 4M --fallocate-bytes 4M --chdir 2 --rename 2 --hdd-write-size 512
  10:39:09 DEBUG| [stdout] VFAT_IMAGE path: /mnt/vfat-test-56562
  10:39:09 DEBUG| [stdout] Mount point:     /mnt/vfat-test-56562
  10:39:09 DEBUG| [stdout] Date:            Thu Oct 18 10:39:09 UTC 2018
  10:39:09 DEBUG| [stdout] Host:            baltar
  10:39:09 DEBUG| [stdout] Kernel:          4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:18:48 UTC 2018
  10:39:09 DEBUG| [stdout] Machine:         baltar ppc64le ppc64le
  10:39:09 DEBUG| [stdout] CPUs online:     160
  10:39:09 DEBUG| [stdout] CPUs total:      160
  10:39:09 DEBUG| [stdout] Page size:       65536
  10:39:09 DEBUG| [stdout] Pages avail:     1907765
  10:39:09 DEBUG| [stdout] Pages total:     2089666
  10:39:09 DEBUG| [stdout]

  10:39:09 DEBUG| [stdout]
  10:39:09 DEBUG| [stdout]
  10:39:09 DEBUG| [stdout] stress-ng: info:  [146983] dispatching hogs: 2 hdd, 2 lockf, 2 seek, 2 aio, 2 dentry, 2 dir, 2 fallocate, 2 fstat, 2 lease, 2 open, 2 rename, 2 chdir, 2 rename
  10:39:09 DEBUG| [stdout] stress-ng: info:  [146983] cache allocate: using built-in defaults as unable to determine cache details
  10:39:11 DEBUG| [stdout] stress-ng: fail:  [147027] stress-ng-chdir: mkdir failed, errno=28 (No space left on device)
  10:39:11 DEBUG| [stdout] stress-ng: fail:  [147007] stress-ng-hdd: read failed, errno=28 (No space left on device)
  10:39:11 DEBUG| [stdout] stress-ng: fail:  [146993] stress-ng-hdd: read failed, errno=28 (No space left on device)
  10:39:11 DEBUG| [stdout] stress-ng: fail:  [147005] stress-ng-chdir: mkdir failed, errno=28 (No space left on device)
  10:39:12 DEBUG| [stdout] stress-ng: error: [146983] process 146993 (stress-ng-hdd) terminated with an error, exit status=1 (stress-ng core failure)
  10:39:12 DEBUG| [stdout] stress-ng: error: [146983] process 147007 (stress-ng-hdd) terminated with an error, exit status=1 (stress-ng core failure)
  10:42:05 DEBUG| [stdout] Found kernel warning and/or call trace:
  10:42:05 DEBUG| [stdout]
  10:42:05 DEBUG| [stdout] [ 5869.838856] TESTING: --verify --times --metrics-brief --syslog --keep-name -t 10s --hdd 2 --hdd-opts sync,wr-rnd,rd-rnd,fadv-willneed,fadv-rnd --lockf 2 --seek 2 --aio 2 --aio-requests 32 --dentry 2 --dir 2 --dentry-order stride --fallocate 2 --fstat 2 --dentries 100 --lease 2 --open 2 --rename 2 --hdd-bytes 4M --fallocate-bytes 4M --chdir 2 --rename 2 --hdd-write-size 512
  10:42:05 DEBUG| [stdout] [ 5870.027499] ubuntu_vfat_str (56562): drop_caches: 1
  10:42:05 DEBUG| [stdout] [ 5870.037987] ubuntu_vfat_str (56562): drop_caches: 2
  10:42:05 DEBUG| [stdout] [ 5870.041591] ubuntu_vfat_str (56562): drop_caches: 3
  10:42:05 DEBUG| [stdout] [ 5870.441211] VFAT options:  dmask=777
  10:42:05 DEBUG| [stdout] [ 5870.441271] Stress test:   /home/ubuntu/autotest/client/tmp/ubuntu_vfat_stress/src/stress-ng/stress-ng --verify --times --metrics-brief --syslog --keep-name -t 10s --hdd 2 --hdd-opts sync,wr-rnd,rd-rnd,fadv-willneed,fadv-rnd --lockf 2 --seek 2 --aio 2 --aio-requests 32 --dentry 2 --dir 2 --dentry-order stride --fallocate 2 --fstat 2

[Kernel-packages] [Bug 1798574] Re: bionic/linux: hung task triggered by ubuntu_vfat_stress stress-ng testcase

2019-09-16 Thread Colin Ian King
** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Colin Ian King (colin-king)

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Colin Ian King (colin-king)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1798574

Title:
  bionic/linux: hung task triggered by ubuntu_vfat_stress stress-ng
  testcase

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Confirmed


[Kernel-packages] [Bug 1840934] Re: Change kernel compression method to improve boot speed

2019-09-09 Thread Colin Ian King
From "Comment bridged from LTC Bugzilla" in a bug discussion:

"FWIW, I verified this on z14, and there clearly lz4 is (as expected)
the fastest decompression algorithm.

With vanilla 5.3-rc6 and defconfig I get the following kernel uncompression times:
lzo: 27us
lz4: 24us

An initrd (uncompressed size ~55MB) gets these uncompression times:
lzo: 62us
lz4: 49us

So I'd clearly vote to switch to lz4 on s390 as well."

Also:


"I also instrumented the kernel code to only measure the time to decompress the 
kernel. Whether it's stckf or stcke doesn't matter in this case.
Note that shifting a TOD clock value 12 bits to the right gives you
microseconds. (All numbers I posted were actually milliseconds, not
microseconds, by the way.)

I measured both runs (z13 + z14) when running within z/VM and IPL'ed
from the punch card reader.

Times used for decompressing the initrd were just extracted from dmesg;
no kernel instrumentation required here, since there are two messages
provided before and after initrd decompression.

Find below an extract of the patch to measure decompression time.

diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c
index 7b0d054..cee3d97 100644
--- a/arch/s390/boot/startup.c
+++ b/arch/s390/boot/startup.c
@@ -146,7 +146,10 @@ void startup_kernel(void)
 	}

 	if (!IS_ENABLED(CONFIG_KERNEL_UNCOMPRESSED)) {
+		start = get_tod_clock();
 		img = decompress_kernel();
+		end = get_tod_clock();
+		time = (end - start) >> 12;
 		memmove((void *)vmlinux.default_lma, img, vmlinux.image_size);
 	} else if (__kaslr_offset)
 		memcpy((void *)vmlinux.default_lma, img, vmlinux.image_size);
..."
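
The shift in that patch works because bit 51 of the 64-bit s390 TOD clock value ticks once per microsecond, so dropping the low 12 bits leaves a microsecond count. A minimal Python sketch of the same conversion follows; `tod_delta_to_us` is a hypothetical name, and the sample values are made up for illustration rather than taken from the measurements in this thread.

```python
def tod_delta_to_us(start: int, end: int) -> int:
    """Convert the difference of two 64-bit TOD clock readings to microseconds."""
    delta = (end - start) & 0xFFFFFFFFFFFFFFFF  # emulate unsigned 64-bit arithmetic
    return delta >> 12  # 4096 TOD units per microsecond


# Illustrative readings only: 4096 * 1500 TOD units apart.
start = 0x00D5_0000_0000_0000
end = start + 4096 * 1500
print(tod_delta_to_us(start, end), "us")
```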

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1840934

Title:
  Change kernel compression method to improve boot speed

Status in linux package in Ubuntu:
  Fix Released

Bug description:
  Colin King has done some analysis of kernel boot speed using different
  kernel compression methods. Results for x86 are at:

  
  https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/kernel-compression-method.txt
  https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/boot-speed-compression-5.3-rc4.ods

  Testing of s390 gave the following:

  GZIP   31528972
  LZ4   192348049
  LZO    85990145

  From Colin: "I used the monotonic TOD timer using the stckf opcode to
  fetch a 64 bit time value.  Not sure how this maps to 'real time' in
  seconds."

  Conclusion: We should switch x86 to LZ4 and s390 to LZO. PPC and ARM
  do not support LZO or LZ4, so we will stick with gzip there.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840934/+subscriptions



[Kernel-packages] [Bug 1840934] Re: Change kernel compression method to improve boot speed

2019-09-09 Thread Colin Ian King
Also enable LZ4 for s390x as IBM has provided us with some positive
feedback about using this.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1840934

Title:
  Change kernel compression method to improve boot speed

Status in linux package in Ubuntu:
  Fix Released


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840934/+subscriptions



[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

2019-09-03 Thread Colin Ian King
The ZFS destroy path checks the reference count on the dataset with
zfs_refcount_count(&ds->ds_longholds) != expected_holds and returns
EBUSY from dsl_destroy_head_check_impl().
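
A toy model of that check, as a sketch only: it mirrors the comparison described above and is not the ZFS source. The function name and the plain-integer arguments are hypothetical stand-ins for the kernel structures.

```python
import errno


def destroy_head_check(longholds, expected_holds):
    """Toy model of the long-hold check on the destroy path.

    Returns 0 when the destroy may proceed and EBUSY when the number of
    long holds on the dataset differs from what the caller expects.
    """
    if longholds != expected_holds:
        return errno.EBUSY
    return 0


# A dataset with one unexpected extra hold is reported as busy.
print(destroy_head_check(longholds=1, expected_holds=0) == errno.EBUSY)
```

Any holder that the destroy path does not account for is enough to make the count differ, which is how the "dataset is busy" error surfaces.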

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  New
Status in lxc source package in Disco:
  New
Status in linux source package in Eoan:
  Triaged
Status in lxc source package in Eoan:
  Confirmed

Bug description:
  I'm not sure exactly what got me into this state, but I have several
  lxc containers that cannot be deleted.

  $ lxc info
  
  api_status: stable
  api_version: "1.0"
  auth: trusted
  public: false
  auth_methods:
  - tls
  environment:
    addresses: []
    architectures:
    - x86_64
    - i686
    certificate: |
      -----BEGIN CERTIFICATE-----

      -----END CERTIFICATE-----
    certificate_fingerprint: 3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
    driver: lxc
    driver_version: 3.0.1
    kernel: Linux
    kernel_architecture: x86_64
    kernel_version: 4.15.0-23-generic
    server: lxd
    server_pid: 15123
    server_version: "3.2"
    storage: zfs
    storage_version: 0.7.5-1ubuntu15
    server_clustered: false
    server_name: milhouse

  $ lxc delete --force b1
  Error: Failed to destroy ZFS filesystem: cannot destroy 'default/containers/b1': dataset is busy

  Talking in #lxc-dev, stgraber and sforshee provided a diagnosis:

   | short version is that something unshared a mount namespace causing
   | them to get a copy of the mount table at the time that dataset was
   | mounted, which then prevents zfs from being able to destroy it

  The workaround provided was:

   | you can unstick this particular issue by doing:
   |  grep default/containers/b1 /proc/*/mountinfo
   | then for any of the hits, do:
   |   nsenter -t PID -m -- umount /var/snap/lxd/common/lxd/storage-pools/default/containers/b1
   | then try the delete again
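
The grep step of the workaround can be sketched in Python as follows. This is a minimal sketch: the function names are hypothetical, and the second helper only builds the nsenter command quoted above without running it.

```python
import glob
import os


def find_mount_holders(needle, proc_root="/proc"):
    """Return the PIDs whose mountinfo mentions `needle`.

    Python equivalent of: grep <needle> /proc/*/mountinfo
    (restricted to numeric PID directories).
    """
    holders = []
    for path in glob.glob(os.path.join(proc_root, "[0-9]*", "mountinfo")):
        pid = int(os.path.basename(os.path.dirname(path)))
        try:
            with open(path, "r", errors="replace") as f:
                if any(needle in line for line in f):
                    holders.append(pid)
        except OSError:
            # The process exited or we lack permission to read it; skip.
            continue
    return sorted(holders)


def umount_command(pid, mountpoint):
    """Build (but do not run) the nsenter command from the workaround."""
    return ["nsenter", "-t", str(pid), "-m", "--", "umount", mountpoint]


if __name__ == "__main__":
    for pid in find_mount_holders("default/containers/b1"):
        print(pid)
```

Each PID reported lives in a mount namespace that still holds a copy of the dataset's mount, so the umount has to be issued inside that namespace before the delete is retried.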

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.15.0-23-generic 4.15.0-23.25
  ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
  Uname: Linux 4.15.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.10-0ubuntu3
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  smoser    31412 F pulseaudio
   /dev/snd/controlC2:  smoser    31412 F pulseaudio
   /dev/snd/controlC0:  smoser    31412 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 28 10:42:45 2018
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2015-07-23 (1071 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
  MachineType: 
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
 
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-23-generic N/A
   linux-backports-modules-4.15.0-23-generic  N/A
   linux-firmware 1.174
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 03/09/2015
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
  dmi.board.asset.tag:
  dmi.board.name: NUC5i5RYB
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H40999-503
  dmi.chassis.asset.tag:
  dmi.chassis.type: 3
  dmi.chassis.vendor:
  dmi.chassis.version:
  dmi.modalias: dmi:bvnIntelCorporation:bvrRYBDWi35.86A.0246.2015.0309.1355:bd03/09/2015:svn:pn:pvr:rvnIntelCorporation:rnNUC5i5RYB:rvrH40999-503:cvn:ct3:cvr:
  dmi.product.family:
  dmi.product.name:
  dmi.product.version:
  dmi.sys.vendor:

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779156/+subscriptions


[Kernel-packages] [Bug 1832384] Re: Unable to unmount apparently unused filesystem

2019-09-02 Thread Colin Ian King
latter 1-2 weeks of this cycle

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1832384

Title:
  Unable to unmount apparently unused filesystem

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  We periodically see an issue where unmounting a ZFS filesystem fails
  with EBUSY, even though there appears to be no one using it.

  # cat /proc/self/mounts | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive zfs rw,nosuid,nodev,noexec,relatime,xattr,noacl 0 0

  'lsof' and 'fuser' show no processes using any of the files in the
  problematic filesystem:

  # ls -l /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/
  total 221
  -rw-r----- 1 500 500  52736 May 22 11:01 1_19_1008904362.dbf
  -rw-r----- 1 500 500 541696 May 22 11:03 1_20_1008904362.dbf
  # fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/1_20_1008904362.dbf
  # fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/1_19_1008904362.dbf
  # fuser /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive/
  # lsof | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  #

  The filesystem was shared over NFS, but has since been unshared:

  # showmount -e | grep /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  #

  Since no one appears to be using the filesystem, our expectation is
  that it should be possible to unmount the filesystem. However,
  attempts to unmount the filesystem fail with EBUSY:

  # zfs destroy domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  umount: /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive: target is busy.
  cannot unmount '/domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive': umount failed
  # umount /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  umount: /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive: target is busy.

  
  Using bpftrace, we can see that the unmount is failing in
  'propagate_mount_busy()' in the kernel. Using a live kernel debugger,
  we can look at the 'mount' struct for this particular mount and see
  that the 'mnt_count' refcount summed across all CPUs is 2. For
  filesystems that are eligible for unmounting, the refcount is 1.
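
A toy model of that observation, as a sketch only and not kernel code. It assumes the reference count is kept as one counter per CPU and only the sum is meaningful, so individual counters can be negative. The function names and the sample values are hypothetical; the expected value of 1 is the figure reported above for unmountable filesystems.

```python
def mnt_get_count(per_cpu_counts):
    """Sum the per-CPU counters to get the effective reference count."""
    return sum(per_cpu_counts)


def looks_busy(per_cpu_counts, expected=1):
    """Treat the mount as busy when the summed count exceeds `expected`."""
    return mnt_get_count(per_cpu_counts) > expected


# A reference taken on CPU 0 and dropped on CPU 1 leaves a negative
# per-CPU value, but the sum is still correct.
healthy = [2, -1, 0, 0]   # sums to 1: eligible for unmounting
leaked = [3, -1, 0, 0]    # sums to 2: one reference was never dropped

print(looks_busy(healthy), looks_busy(leaked))
```

Under this model, a single reference that is taken and never released is enough to keep the mount busy until reboot, which matches the symptom described.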

  The only way to work around this issue that we have found is to
  reboot, at which point the filesystem can be unmounted and destroyed.

  
  So far, we have only been able to reproduce this using a workload
  driven by our application. The application manages ZFS filesystems in
  groups, and the lifecycle of each group looks something like this:

  - Create and mount a group of filesystems, 1 parent and 4 children:
  /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370
  /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/datafile
  /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/external
  /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/archive
  /domain0/group-38/oracle_db_container-202/oracle_timeflow-16370/temp
  - Share all 5 filesystems over NFS
  - A client mounts all 5 shares using NFSv3
  - For a few hours, the client does NFS operations on the filesystems
    and the server occasionally takes ZFS snapshots of them
  - Unshare filesystems
  - Unmount filesystems
  - Delete filesystems

  These groups of filesystems are constantly being created and
  destroyed. At any given time, we have ~30k filesystems on the system,
  about 5k of which are shared. On average, one out of ~200-300k
  unmounts fails with this EBUSY error. To create and destroy this many
  filesystems takes us about a week or so.
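
To put those figures in perspective, a small worked calculation, assuming "about a week" means exactly seven days. The helper name is hypothetical.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def unmounts_per_second(unmounts_per_week):
    """Average unmount rate implied by a weekly unmount count."""
    return unmounts_per_week / SECONDS_PER_WEEK


# Range quoted above: one failure per ~200-300k unmounts, reached in a week.
for n in (200_000, 300_000):
    print(f"{n} unmounts/week is about {unmounts_per_second(n):.2f} per second")
```

That is roughly one unmount every two to three seconds sustained for a week to see a single failure, which explains why the issue is hard to reproduce outside the production workload.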

  Note that we are using ZFS built from https://github.com/delphix/zfs,
  which is essentially master ZFS on Linux.

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-50-generic 4.15.0-50.54
  ProcVersionSignature: Ubuntu 4.15.0-50.54-generic 4.15.18
  Uname: Linux 4.15.0-50-generic x86_64
  NonfreeKernelModules: zfs zunicode zcommon znvpair zavl icp
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 May 20 19:10 seq
   crw-rw---- 1 root audio 116, 33 May 20 19:10 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 

[Kernel-packages] [Bug 1832384] Re: Unable to unmount apparently unused filesystem

2019-09-02 Thread Colin Ian King
See also: https://wiki.ubuntu.com/Kernel/StableReleaseCadence
and https://kernel.ubuntu.com/

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1832384

Title:
  Unable to unmount apparently unused filesystem

Status in linux package in Ubuntu:
  Incomplete


[Kernel-packages] [Bug 1832384] Re: Unable to unmount apparently unused filesystem

2019-09-02 Thread Colin Ian King
We generally have a 3 week release cycle on kernels, so if it's in
-proposed it is probably in the later 1-2 weeks of this cycle.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1832384

Title:
  Unable to unmount apparently unused filesystem

Status in linux package in Ubuntu:
  Incomplete


[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

2019-08-30 Thread Colin Ian King
Reproducer is as follows:

lxc launch ubuntu:bionic zfs-bug-test
Creating zfs-bug-test
Starting zfs-bug-test
lxc delete zfs-bug-test --force
Error: Failed to destroy ZFS filesystem:

I can reproduce this on Eoan with the latest 5.2 and 5.3 kernels.
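
The two reproducer steps can be scripted as below. This is a sketch under stated assumptions: that lxc exits non-zero and prints the error on stderr when the delete fails. The function name and the injectable `run` parameter are hypothetical conveniences for testing.

```python
import subprocess

# Commands taken from the reproducer above.
REPRO_STEPS = [
    ["lxc", "launch", "ubuntu:bionic", "zfs-bug-test"],
    ["lxc", "delete", "zfs-bug-test", "--force"],
]


def run_reproducer(run=subprocess.run):
    """Launch and force-delete a container; return True if the bug shows.

    Assumption: a failed delete exits non-zero and reports
    "Failed to destroy ZFS filesystem" on stderr.
    """
    launch, delete = REPRO_STEPS
    run(launch, check=True)
    result = run(delete, capture_output=True, text=True)
    return (result.returncode != 0
            and "Failed to destroy ZFS filesystem" in result.stderr)


if __name__ == "__main__":
    print("reproduced" if run_reproducer() else "not reproduced")
```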

** Also affects: linux (Ubuntu Eoan)
   Importance: Medium
   Assignee: Colin Ian King (colin-king)
   Status: Triaged

** Also affects: lxc (Ubuntu Eoan)
   Importance: Undecided
   Status: Confirmed

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Also affects: lxc (Ubuntu Disco)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  New
Status in lxc source package in Disco:
  New
Status in linux source package in Eoan:
  Triaged
Status in lxc source package in Eoan:
  Confirmed


[Kernel-packages] [Bug 1832384] Re: Unable to unmount apparently unused filesystem

2019-08-30 Thread Colin Ian King
That's really helpful to know John, thanks for the feedback.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1832384

Title:
  Unable to unmount apparently unused filesystem

Status in linux package in Ubuntu:
  Incomplete


[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

2019-08-30 Thread Colin Ian King
Do we have any hunches on how to reproduce this issue?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed

Bug description:
  I'm not sure exactly what got me into this state, but I have several
  lxc containers that cannot be deleted.

  $ lxc info
  
  api_status: stable
  api_version: "1.0"
  auth: trusted
  public: false
  auth_methods:
  - tls
  environment:
addresses: []
architectures:
- x86_64
- i686
certificate: |
  -BEGIN CERTIFICATE-
  
  -END CERTIFICATE-
certificate_fingerprint: 
3af6f8b8233c5d9e898590a9486ded5c0bec045488384f30ea921afce51f75cb
driver: lxc
driver_version: 3.0.1
kernel: Linux
kernel_architecture: x86_64
kernel_version: 4.15.0-23-generic
server: lxd
server_pid: 15123
server_version: "3.2"
storage: zfs
storage_version: 0.7.5-1ubuntu15
server_clustered: false
server_name: milhouse

  $ lxc delete --force b1
  Error: Failed to destroy ZFS filesystem: cannot destroy 
'default/containers/b1': dataset is busy

  Talking in #lxc-dev, stgraber and sforeshee provided diagnosis:

   | short version is that something unshared a mount namespace causing
   | them to get a copy of the mount table at the time that dataset was
   | mounted, which then prevents zfs from being able to destroy it)

  The work around provided was

   | you can unstick this particular issue by doing:
   |  grep default/containers/b1 /proc/*/mountinfo
   | then for any of the hits, do:
   |   nsenter -t PID -m -- umount 
/var/snap/lxd/common/lxd/storage-pools/default/containers/b1
   | then try the delete again

  ProblemType: Bug
  DistroRelease: Ubuntu 18.10
  Package: linux-image-4.15.0-23-generic 4.15.0-23.25
  ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
  Uname: Linux 4.15.0-23-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.10-0ubuntu3
  Architecture: amd64
  AudioDevicesInUse:
   USERPID ACCESS COMMAND
   /dev/snd/controlC1:  smoser31412 F pulseaudio
   /dev/snd/controlC2:  smoser31412 F pulseaudio
   /dev/snd/controlC0:  smoser31412 F pulseaudio
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Jun 28 10:42:45 2018
  EcryptfsInUse: Yes
  InstallationDate: Installed on 2015-07-23 (1071 days ago)
  InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
  MachineType: (unreadable; the reported strings contain only 0xff bytes)
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-23-generic root=UUID=f897b32a-eacf-4191-9717-844918947069 ro quiet splash vt.handoff=1
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-23-generic N/A
   linux-backports-modules-4.15.0-23-generic  N/A
   linux-firmware 1.174
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 03/09/2015
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: RYBDWi35.86A.0246.2015.0309.1355
  dmi.board.asset.tag: �
  dmi.board.name: NUC5i5RYB
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H40999-503
  dmi.chassis.asset.tag: �
  dmi.chassis.type: 3
  dmi.chassis.vendor: �
  dmi.chassis.version: �
  dmi.modalias: dmi:bvnIntelCorporation:bvrRYBDWi35.86A.0246.2015.0309.1355:bd03/09/2015:svn:pn:pvr:rvnIntelCorporation:rnNUC5i5RYB:rvrH40999-503:cvn:ct3:cvr:
  dmi.product.family: �
  dmi.product.name: �
  dmi.product.version: �
  dmi.sys.vendor: �

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779156/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1779156] Re: lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

2019-08-30 Thread Colin Ian King
Cosmic is now end-of-life. Does this still occur on Disco?

** Changed in: linux (Ubuntu)
   Importance: High => Medium

** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Colin Ian King (colin-king)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1779156

Title:
  lxc 'delete' fails to destroy ZFS filesystem 'dataset is busy'

Status in linux package in Ubuntu:
  Triaged
Status in lxc package in Ubuntu:
  Confirmed
Status in linux source package in Cosmic:
  Triaged
Status in lxc source package in Cosmic:
  Confirmed



[Kernel-packages] [Bug 1841747] Re: dev test in ubuntu_stress_smoke_test crashes AWS c5.large with Disco

2019-08-29 Thread Colin Ian King
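The following, run as root, causes the instances to hang and then
reboot (by the watchdog?):

#include <fcntl.h>   /* open(), O_RDONLY, O_NONBLOCK */
#include <unistd.h>  /* close() */

int main(void)
{
	/* Open and close /dev/hpet in a tight loop; this alone triggers the hang. */
	for (;;) {
		int fd;

		fd = open("/dev/hpet", O_RDONLY | O_NONBLOCK);
		if (fd >= 0)
			close(fd);
	}
}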


** Information type changed from Public to Public Security

** Information type changed from Public Security to Private Security

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1841747

Title:
  dev test in ubuntu_stress_smoke_test crashes AWS c5.large with Disco

Status in Stress-ng:
  New
Status in ubuntu-kernel-tests:
  New
Status in linux-aws package in Ubuntu:
  New

Bug description:
  When running the dev test in ubuntu_stress_smoke_test, the instance
  crashes and gets rebooted.

  Spotted on:
  * c3.xlarge
  * c4.large
  * m3.large
  * m4.large
  * r3.large
  * t2.small
  * x1e.xlarge

  Test output:
  09:10:12 DEBUG| [stdout] dentry RETURNED 0
  09:10:12 DEBUG| [stdout] dentry PASSED
  09:10:12 DEBUG| [stdout] dev STARTING
  packet_write_wait: Connection to 18.237.187.217 port 22: Broken pipe

  tailing syslog:
  Aug 28 09:10:07 ip-172-31-3-117 stress-ng: info:  [19659] dispatching hogs: 4 dentry
  Aug 28 09:10:12 ip-172-31-3-117 stress-ng: info:  [19659] successful run completed in 5.13s
  Aug 28 09:10:12 ip-172-31-3-117 stress-ng: invoked with './stress-n' by user 0
  Aug 28 09:10:12 ip-172-31-3-117 stress-ng: system: 'ip-172-31-3-117' Linux 5.0.0-1012-aws #13-Ubuntu SMP Fri Aug 2 12:25:32 UTC 2019 x86_64
  Aug 28 09:10:12 ip-172-31-3-117 stress-ng: memory (MB): total 3754.94, free 3386.23, shared 0.06, buffer 145.52, swap 1024.00, free swap 951.48
  Aug 28 09:10:12 ip-172-31-3-117 stress-ng: info:  [19674] dispatching hogs: 4 dev
  Aug 28 09:10:12 ip-172-31-3-117 kernel: [  481.406432] PM: Marking nosave pages: [mem 0x-0x0fff]
  Aug 28 09:10:12 ip-172-31-3-117 kernel: [  481.406434] PM: Marking nosave pages: [mem 0x0009e000-0x000f]
  Aug 28 09:10:12 ip-172-31-3-117 kernel: [  481.406437] PM: Basic memory bitmaps created
  Aug 28 09:10:12 ip-172-31-3-117 kernel: [  481.406496] PM: Basic memory bitmaps freed
  packet_write_wait: Connection to 18.237.187.217 port 22: Broken pipe

  ProblemType: Bug
  DistroRelease: Ubuntu 19.04
  Package: linux-image-5.0.0-1012-aws 5.0.0-1012.13
  ProcVersionSignature: User Name 5.0.0-1012.13-aws 5.0.15
  Uname: Linux 5.0.0-1012-aws x86_64
  ApportVersion: 2.20.10-0ubuntu27.1
  Architecture: amd64
  Date: Wed Aug 28 09:21:28 2019
  Ec2AMI: ami-0b731fb4a9a36df8c
  Ec2AMIManifest: (unknown)
  Ec2AvailabilityZone: us-west-2c
  Ec2InstanceType: c4.large
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  SourcePackage: linux-aws
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/stress-ng/+bug/1841747/+subscriptions



[Kernel-packages] [Bug 1841747] Re: dev test in ubuntu_stress_smoke_test crashes AWS c5.large with Disco

2019-08-29 Thread Colin Ian King
Seems to occur when exercising /dev/hpet

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1841747

Title:
  dev test in ubuntu_stress_smoke_test crashes AWS c5.large with Disco

Status in Stress-ng:
  New
Status in ubuntu-kernel-tests:
  New
Status in linux-aws package in Ubuntu:
  New



[Kernel-packages] [Bug 1840934] Re: Change kernel compression method to improve boot speed

2019-08-23 Thread Colin Ian King
@Dimitri,

Seems that for initramfs lz4 makes a lot of sense, see:

https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/boot-speed-initramfs-decompression-eoan.ods

The load time for LZ4 is slower than for the previous default; however,
the faster decompression makes up for this unless one is booting off
really slow media (slower than a 5400 RPM HDD), such as slow flash.

So the LZ4 default looks sane to me; let's see how it works out for
Eoan.
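
To make the load-versus-decompress trade-off concrete, here is a rough
sketch of the arithmetic. The image sizes, decompression times, and read
speeds below are hypothetical round numbers chosen for illustration; they
are not taken from the spreadsheet above. The model assumes total cost is
simply read time plus decompression time, with no overlap between them.

```python
def boot_cost(image_bytes, read_bytes_per_s, decompress_s):
    """Seconds spent reading a compressed image plus decompressing it."""
    return image_bytes / read_bytes_per_s + decompress_s


def break_even_read_speed(size_a, decomp_a, size_b, decomp_b):
    """Read speed (bytes/s) at which image A and image B cost the same.

    A is the larger image that decompresses faster; B is the smaller image
    that decompresses slower. On media faster than this, A wins.
    """
    return (size_a - size_b) / (decomp_b - decomp_a)


if __name__ == "__main__":
    # Hypothetical figures: LZ4 image is larger but decompresses faster.
    lz4_size, lz4_decomp = 60_000_000, 0.1
    prev_size, prev_decomp = 50_000_000, 0.6

    speed = break_even_read_speed(lz4_size, lz4_decomp, prev_size, prev_decomp)
    print(f"break-even read speed: {speed / 1e6:.1f} MB/s")

    for read_speed in (10_000_000, 100_000_000):
        lz4 = boot_cost(lz4_size, read_speed, lz4_decomp)
        prev = boot_cost(prev_size, read_speed, prev_decomp)
        print(f"{read_speed / 1e6:.0f} MB/s: lz4 {lz4:.2f}s, previous {prev:.2f}s")
```

With these made-up figures the larger LZ4 image only loses on media that
read slower than the break-even speed, which is the shape of the argument
above; the real crossover depends on the measured numbers.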

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1840934

Title:
  Change kernel compression method to improve boot speed

Status in linux package in Ubuntu:
  Fix Committed

Bug description:
  Colin King has done some analysis of kernel boot speed using different
  kernel compression methods. Results for x86 are at:

  https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/kernel-compression-method.txt
  https://kernel.ubuntu.com/~cking/boot-speed-eoan-5.3/boot-speed-compression-5.3-rc4.ods

  Testing of s390 gave the following:

  GZIP  31528972
  LZ4  192348049
  LZO   85990145

  From Colin: "I used the monotonic TOD timer using the stckf opcode to
  fetch a 64 bit time value.  Not sure how this maps to 'real time' in
  seconds."

  Conclusion: We should switch x86 to LZ4 and s390 to LZO. PPC and ARM
  do not support LZO or LZ4, so we will stick with gzip there.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840934/+subscriptions


