Trying the first kernel without the change event sauce also fails:

[  532.823594] bcache: run_cache_set() invalidating existing data
[  532.828876] bcache: register_cache() registered cache device nvme0n1p2
[  532.869716] bcache: register_bdev() registered backing device vda1
[  532.994355] bcache: bch_cached_dev_attach() Caching vda1 as bcache0 on set 21d89237-231d-4af6-a4c8-4b1b8fa5eef5
[  533.051588] bcache: register_bcache() error /dev/vda1: device already registered
[  533.094717] bcache: register_bcache() error /dev/vda1: device already registered
[  533.120063] bcache: register_bcache() error /dev/vda1: device already registered
[  533.142517] bcache: register_bcache() error /dev/vda1: device already registered
[  533.191069] bcache: register_bcache() error /dev/vda1: device already registered
[  533.249877] bcache: register_bcache() error /dev/vda1: device already registered
[  533.282653] bcache: register_bcache() error /dev/vda1: device already registered
[  533.301225] bcache: register_bcache() error /dev/vda1: device already registered
[  533.310505] bcache: register_bcache() error /dev/vda1: device already registered
[  533.318959] bcache: register_bcache() error /dev/vda1: device already registered
[  533.374121] bcache: register_bcache() error /dev/vda1: device already registered
[  533.536920] bcache: register_bcache() error /dev/vda1: device already registered
[  533.581468] bcache: register_bcache() error /dev/vda1: device already registered
[  533.589270] bcache: register_bcache() error /dev/vda1: device already registered
[  533.595986] bcache: register_bcache() error /dev/vda1: device already registered
[  533.602638] bcache: register_bcache() error /dev/vda1: device already registered
[  533.651848] bcache: register_bcache() error /dev/vda1: device already registered
[  533.677836] bcache: register_bcache() error /dev/vda1: device already registered
[  533.712074] bcache: register_bcache() error /dev/vda1: device already registered
[  533.717682] bcache: register_bcache() error /dev/vda1: device already registered
[  533.723354] bcache: register_bcache() error /dev/vda1: device already registered
[  533.728951] bcache: register_bcache() error /dev/vda1: device already registered
[  533.777602] bcache: register_bcache() error /dev/vda1: device already registered
[  553.784393] md: md0: resync done.
[  725.983387] INFO: task python3:413 blocked for more than 120 seconds.
[  725.985099]       Tainted: P O 4.15.0-56-generic #62~lp1796292+1
[  725.986820] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  725.988649] python3 D 0 413 405 0x00000000
[  725.988652] Call Trace:
[  725.988684]  __schedule+0x291/0x8a0
[  725.988687]  schedule+0x2c/0x80
[  725.988710]  bch_bucket_alloc+0x1fa/0x350 [bcache]
[  725.988722]  ? wait_woken+0x80/0x80
[  725.988726]  __bch_bucket_alloc_set+0xfe/0x150 [bcache]
[  725.988729]  bch_bucket_alloc_set+0x4e/0x70 [bcache]
[  725.988734]  __uuid_write+0x59/0x150 [bcache]
[  725.988738]  ? __write_super+0x137/0x170 [bcache]
[  725.988742]  bch_uuid_write+0x16/0x40 [bcache]
[  725.988746]  __cached_dev_store+0x1d8/0x8a0 [bcache]
[  725.988750]  bch_cached_dev_store+0x39/0xc0 [bcache]
[  725.988758]  sysfs_kf_write+0x3c/0x50
[  725.988759]  kernfs_fop_write+0x125/0x1a0
[  725.988765]  __vfs_write+0x1b/0x40
[  725.988766]  vfs_write+0xb1/0x1a0
[  725.988767]  SyS_write+0x5c/0xe0
[  725.988774]  do_syscall_64+0x73/0x130
[  725.988777]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  725.988779] RIP: 0033:0x7fa71a23b154
[  725.988780] RSP: 002b:00007ffec74f2828 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  725.988783] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007fa71a23b154
[  725.988783] RDX: 0000000000000008 RSI: 0000000002a90030 RDI: 0000000000000003
[  725.988784] RBP: 00007fa71a7366c0 R08: 0000000000000000 R09: 0000000000000000
[  725.988785] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000003
[  725.988785] R13: 0000000000000000 R14: 0000000002a90030 R15: 00000000027c4e60
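
For context on the curtin side of this bug: the patch quoted in the bug description lengthens the `retries` list passed to `wait_for_removal`. A minimal sketch of how such a retry loop behaves follows. It is not curtin's actual code. Only the signature, the ValueError line and the timeout message wording come from the bug report; the `sleep` and `exists` parameters are hypothetical hooks added here so the loop can be exercised without real waits.

```python
import os
import time


def wait_for_removal(path, retries=(1, 3, 5, 7),
                     sleep=time.sleep, exists=os.path.exists):
    """Return once `path` no longer exists, else raise OSError.

    `retries` lists the seconds to sleep between existence checks.
    `sleep` and `exists` are hypothetical injection points (not in the
    quoted diff) that make the loop testable.
    """
    if not path:
        raise ValueError('wait_for_removal: missing path parameter')

    for wait in retries:
        if not exists(path):
            return
        sleep(wait)

    # One last check after the final sleep before giving up.
    if not exists(path):
        return

    raise OSError('Timeout exceeded for removal of %s' % path)
```

With the default list the loop sleeps at most 1 + 3 + 5 + 7 = 16 seconds before giving up; with the patched list `[1, 3, 5, 7, 1200, 1200]` it would sleep up to 2416 seconds.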
-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1796292

Title:
  Tight timeout for bcache removal causes spurious failures

Status in curtin:
  Fix Released
Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  Confirmed
Status in linux source package in Eoan:
  Confirmed

Bug description:
  I've had a number of deployment faults where curtin would report
  Timeout exceeded for removal of /sys/fs/bcache/xxx when doing a
  mass-deployment of 30+ nodes. Upon retrying the node would usually
  deploy fine. Experimentally I've set the timeout ridiculously high,
  and it seems I'm getting no faults with this. I'm wondering if the
  timeout for removal is set too tight, or might need to be made
  configurable.

  --- curtin/util.py~     2018-05-18 18:40:48.000000000 +0000
  +++ curtin/util.py      2018-10-05 09:40:06.807390367 +0000
  @@ -263,7 +263,7 @@
       return _subp(*args, **kwargs)

  -def wait_for_removal(path, retries=[1, 3, 5, 7]):
  +def wait_for_removal(path, retries=[1, 3, 5, 7, 1200, 1200]):
       if not path:
           raise ValueError('wait_for_removal: missing path parameter')

To manage notifications about this bug go to:
https://bugs.launchpad.net/curtin/+bug/1796292/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp