Public bug reported:
I was able to reproduce this 2 or 3 times last 2 days. I have the
following setup:
Containers for Trusty/kilo service machines:
inaddy@workstation:~$ lxc-ls | grep tk
tkcephmon01 RUNNING 0 - 192.168.65.52 -
tkcephmon02 RUNNING 0 - 192.168.65.51 -
tkcephmon03 RUNNING 0 - 192.168.65.48 -
tkcinder RUNNING 0 - 192.168.65.49 -
tkdash RUNNING 0 - 192.168.65.50 -
tkglance RUNNING 0 - 192.168.65.53 -
tkjuju RUNNING 0 - 192.168.65.15 -
tkkeystone RUNNING 0 - 192.168.65.54 -
tkmysql RUNNING 0 - 192.168.65.55 -
tknova RUNNING 0 - 192.168.65.56 -
tkrabbit RUNNING 0 - 192.168.65.57 -
tkswiftproxy RUNNING 0 - 192.168.65.58 -
And compute nodes + neutrongw as kvm guests:
inaddy@workstation:~$ virsh list --all | grep tk
21 tkcompute01 running
22 tkcompute02 running
23 tkcompute03 running
24 tkneutrongw running
All my LXC containers are on top of ZFS:
Linux workstation 4.4.0-32-generic #51-Ubuntu SMP Tue Jul 19 18:09:07
UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
And my KVM guests are on top of ext4 + 1.2 raid0 stripped volume.
I'm getting the lockups bellow (usually for mongod, from tkjuju
container, the juju controller). After the first lockup appears
(schedule timeout coming from zfs sync logic most likely), JuJu
controller starts giving me errors on "update-status". From "juju
status":
glance/0 error idle 1.25.6 10 9292/tcp
tkglance hook failed: "update-status"
keystone/0 error idle 1.25.6 11
tkkeystone hook failed: "update-status"
mysql/0 error idle 1.25.6 12
tkmysql hook failed: "config-changed"
neutron-api/0 error idle 1.25.6 4 9696/tcp
tkneutrongw hook failed: "update-status"
nova-compute/0 error idle 1.25.6 1
tkcompute01 hook failed: "update-status"
nova-compute/1 error idle 1.25.6 2
tkcompute02 hook failed: "update-status"
nova-compute/2 error idle 1.25.6 3
tkcompute03 hook failed: "update-status"
Lockups:
[105601.816578] INFO: task mongod:14480 blocked for more than 120 seconds.
[105601.816583] Tainted: P O 4.4.0-32-generic #51-Ubuntu
[105601.816584] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[105601.816586] mongod D ffff88010ec47ba8 0 14480 16855 0x00000100
[105601.816590] ffff88010ec47ba8 0000000057992eeb ffff880108e5ee00
ffff880108e58dc0
[105601.816592] ffff88010ec48000 ffff88081ecd6d00 7fffffffffffffff
ffffffff8182a600
[105601.816594] ffff88010ec47d08 ffff88010ec47bc0 ffffffff81829e05
0000000000000000
[105601.816596] Call Trace:
[105601.816603] [<ffffffff8182a600>] ? bit_wait+0x60/0x60
[105601.816606] [<ffffffff81829e05>] schedule+0x35/0x80
[105601.816608] [<ffffffff8182cf25>] schedule_timeout+0x1b5/0x270
[105601.816612] [<ffffffff8118d939>] ? find_get_pages_tag+0x109/0x190
[105601.816614] [<ffffffff8182a600>] ? bit_wait+0x60/0x60
[105601.816616] [<ffffffff81829334>] io_schedule_timeout+0xa4/0x110
[105601.816618] [<ffffffff8182a61b>] bit_wait_io+0x1b/0x70
[105601.816620] [<ffffffff8182a1ad>] __wait_on_bit+0x5d/0x90
[105601.816622] [<ffffffff8118d04b>] wait_on_page_bit+0xcb/0xf0
[105601.816625] [<ffffffff810c3ce0>] ? autoremove_wake_function+0x40/0x40
[105601.816628] [<ffffffff8118d163>] __filemap_fdatawait_range+0xf3/0x160
[105601.816630] [<ffffffff8118d1e4>] filemap_fdatawait_range+0x14/0x30
[105601.816632] [<ffffffff8118f0df>] filemap_write_and_wait_range+0x3f/0x70
[105601.816682] [<ffffffffc0902ea8>] zpl_fsync+0x38/0x90 [zfs]
[105601.816685] [<ffffffff8124116b>] vfs_fsync_range+0x4b/0xb0
[105601.816687] [<ffffffff811cabee>] SyS_msync+0x17e/0x1f0
[105601.816689] [<ffffffff8182def2>] entry_SYSCALL_64_fastpath+0x16/0x71
[117121.961545] INFO: task txg_sync:4589 blocked for more than 120 seconds.
[117121.961549] Tainted: P O 4.4.0-32-generic #51-Ubuntu
[117121.961550] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[117121.961551] txg_sync D ffff8807e1fbbaa8 0 4589 2 0x00000000
[117121.961558] ffff8807e1fbbaa8 ffff88081ed96d00 ffff8807faf03700
ffff8807e2dbb700
[117121.961560] ffff8807e1fbc000 ffff88081ed16d00 7fffffffffffffff
ffff88017a8f2208
[117121.961561] 0000000000000001 ffff8807e1fbbac0 ffffffff81829e05
0000000000000000
[117121.961563] Call Trace:
[117121.961569] [<ffffffff81829e05>] schedule+0x35/0x80
[117121.961571] [<ffffffff8182cf25>] schedule_timeout+0x1b5/0x270
[117121.961574] [<ffffffff810ac0c2>] ? default_wake_function+0x12/0x20
[117121.961576] [<ffffffff810c35e2>] ? __wake_up_common+0x52/0x90
[117121.961578] [<ffffffff81829334>] io_schedule_timeout+0xa4/0x110
[117121.961586] [<ffffffffc0759bec>] cv_wait_common+0xbc/0x140 [spl]
[117121.961589] [<ffffffff810c3ca0>] ? wake_atomic_t_function+0x60/0x60
[117121.961593] [<ffffffffc0759cc8>] __cv_wait_io+0x18/0x20 [spl]
[117121.961633] [<ffffffffc08fd2fe>] zio_wait+0x10e/0x1f0 [zfs]
[117121.961653] [<ffffffffc0886c58>] dsl_pool_sync+0xb8/0x430 [zfs]
[117121.961676] [<ffffffffc08a25b6>] spa_sync+0x366/0xb30 [zfs]
[117121.961677] [<ffffffff810ac0c2>] ? default_wake_function+0x12/0x20
[117121.961701] [<ffffffffc08b3a4a>] txg_sync_thread+0x3ba/0x630 [zfs]
[117121.961725] [<ffffffffc08b3690>] ? txg_delay+0x180/0x180 [zfs]
[117121.961729] [<ffffffffc0754e31>] thread_generic_wrapper+0x71/0x80 [spl]
[117121.961732] [<ffffffffc0754dc0>] ? __thread_exit+0x20/0x20 [spl]
[117121.961734] [<ffffffff810a0808>] kthread+0xd8/0xf0
[117121.961736] [<ffffffff810a0730>] ? kthread_create_on_node+0x1e0/0x1e0
[117121.961737] [<ffffffff8182e28f>] ret_from_fork+0x3f/0x70
[117121.961739] [<ffffffff810a0730>] ? kthread_create_on_node+0x1e0/0x1e0
[117121.961831] INFO: task mongod:14480 blocked for more than 120 seconds.
[117121.961832] Tainted: P O 4.4.0-32-generic #51-Ubuntu
[117121.961833] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[117121.961834] mongod D ffff88010ec47ba8 0 14480 16855 0x00000100
[117121.961836] ffff88010ec47ba8 0000000057995bf6 ffff8807faf01b80
ffff880108e58dc0
[117121.961837] ffff88010ec48000 ffff88081ec96d00 7fffffffffffffff
ffffffff8182a600
[117121.961839] ffff88010ec47d08 ffff88010ec47bc0 ffffffff81829e05
0000000000000000
[117121.961841] Call Trace:
[117121.961843] [<ffffffff8182a600>] ? bit_wait+0x60/0x60
[117121.961844] [<ffffffff81829e05>] schedule+0x35/0x80
[117121.961846] [<ffffffff8182cf25>] schedule_timeout+0x1b5/0x270
[117121.961849] [<ffffffff8118d939>] ? find_get_pages_tag+0x109/0x190
[117121.961851] [<ffffffff8182a600>] ? bit_wait+0x60/0x60
[117121.961852] [<ffffffff81829334>] io_schedule_timeout+0xa4/0x110
[117121.961854] [<ffffffff8182a61b>] bit_wait_io+0x1b/0x70
[117121.961856] [<ffffffff8182a1ad>] __wait_on_bit+0x5d/0x90
[117121.961857] [<ffffffff8118d04b>] wait_on_page_bit+0xcb/0xf0
[117121.961859] [<ffffffff810c3ce0>] ? autoremove_wake_function+0x40/0x40
[117121.961861] [<ffffffff8118d163>] __filemap_fdatawait_range+0xf3/0x160
[117121.961863] [<ffffffff8118d1e4>] filemap_fdatawait_range+0x14/0x30
[117121.961864] [<ffffffff8118f0df>] filemap_write_and_wait_range+0x3f/0x70
[117121.961891] [<ffffffffc0902ea8>] zpl_fsync+0x38/0x90 [zfs]
[117121.961893] [<ffffffff8124116b>] vfs_fsync_range+0x4b/0xb0
[117121.961895] [<ffffffff811cabee>] SyS_msync+0x17e/0x1f0
[117121.961897] [<ffffffff8182def2>] entry_SYSCALL_64_fastpath+0x16/0x71
[147242.344176] INFO: task txg_sync:4589 blocked for more than 120 seconds.
[147242.344180] Tainted: P O 4.4.0-32-generic #51-Ubuntu
[147242.344181] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[147242.344182] txg_sync D ffff8807e1fbbaa8 0 4589 2 0x00000000
[147242.344185] ffff8807e1fbbaa8 ffff88081ed56d00 ffff8807faf02940
ffff8807e2dbb700
[147242.344191] ffff8807e1fbc000 ffff88081ecd6d00 7fffffffffffffff
ffff8801789e5228
[147242.344192] 0000000000000001 ffff8807e1fbbac0 ffffffff81829e05
0000000000000000
[147242.344194] Call Trace:
[147242.344200] [<ffffffff81829e05>] schedule+0x35/0x80
[147242.344202] [<ffffffff8182cf25>] schedule_timeout+0x1b5/0x270
[147242.344205] [<ffffffff810ac0c2>] ? default_wake_function+0x12/0x20
[147242.344207] [<ffffffff810c35e2>] ? __wake_up_common+0x52/0x90
[147242.344209] [<ffffffff81829334>] io_schedule_timeout+0xa4/0x110
[147242.344217] [<ffffffffc0759bec>] cv_wait_common+0xbc/0x140 [spl]
[147242.344219] [<ffffffff810c3ca0>] ? wake_atomic_t_function+0x60/0x60
[147242.344224] [<ffffffffc0759cc8>] __cv_wait_io+0x18/0x20 [spl]
[147242.344264] [<ffffffffc08fd2fe>] zio_wait+0x10e/0x1f0 [zfs]
[147242.344285] [<ffffffffc0886c58>] dsl_pool_sync+0xb8/0x430 [zfs]
[147242.344307] [<ffffffffc08a25b6>] spa_sync+0x366/0xb30 [zfs]
[147242.344309] [<ffffffff810ac0c2>] ? default_wake_function+0x12/0x20
[147242.344333] [<ffffffffc08b3a4a>] txg_sync_thread+0x3ba/0x630 [zfs]
[147242.344357] [<ffffffffc08b3690>] ? txg_delay+0x180/0x180 [zfs]
[147242.344360] [<ffffffffc0754e31>] thread_generic_wrapper+0x71/0x80 [spl]
[147242.344364] [<ffffffffc0754dc0>] ? __thread_exit+0x20/0x20 [spl]
[147242.344366] [<ffffffff810a0808>] kthread+0xd8/0xf0
[147242.344367] [<ffffffff810a0730>] ? kthread_create_on_node+0x1e0/0x1e0
[147242.344369] [<ffffffff8182e28f>] ret_from_fork+0x3f/0x70
[147242.344370] [<ffffffff810a0730>] ? kthread_create_on_node+0x1e0/0x1e0
[147242.344461] INFO: task mongod:14480 blocked for more than 120 seconds.
[147242.344462] Tainted: P O 4.4.0-32-generic #51-Ubuntu
[147242.344463] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[147242.344464] mongod D ffff88010ec47ba8 0 14480 16855 0x00000100
[147242.344466] ffff88010ec47ba8 000000005799d18d ffff8807faf03700
ffff880108e58dc0
[147242.344467] ffff88010ec48000 ffff88081ed16d00 7fffffffffffffff
ffffffff8182a600
[147242.344469] ffff88010ec47d08 ffff88010ec47bc0 ffffffff81829e05
0000000000000000
[147242.344470] Call Trace:
[147242.344473] [<ffffffff8182a600>] ? bit_wait+0x60/0x60
[147242.344474] [<ffffffff81829e05>] schedule+0x35/0x80
[147242.344476] [<ffffffff8182cf25>] schedule_timeout+0x1b5/0x270
[147242.344478] [<ffffffff8118d939>] ? find_get_pages_tag+0x109/0x190
[147242.344480] [<ffffffff8182a600>] ? bit_wait+0x60/0x60
[147242.344482] [<ffffffff81829334>] io_schedule_timeout+0xa4/0x110
[147242.344483] [<ffffffff8182a61b>] bit_wait_io+0x1b/0x70
[147242.344485] [<ffffffff8182a1ad>] __wait_on_bit+0x5d/0x90
[147242.344487] [<ffffffff8118d04b>] wait_on_page_bit+0xcb/0xf0
[147242.344488] [<ffffffff810c3ce0>] ? autoremove_wake_function+0x40/0x40
[147242.344490] [<ffffffff8118d163>] __filemap_fdatawait_range+0xf3/0x160
[147242.344492] [<ffffffff8118d1e4>] filemap_fdatawait_range+0x14/0x30
[147242.344493] [<ffffffff8118f0df>] filemap_write_and_wait_range+0x3f/0x70
[147242.344520] [<ffffffffc0902ea8>] zpl_fsync+0x38/0x90 [zfs]
[147242.344522] [<ffffffff8124116b>] vfs_fsync_range+0x4b/0xb0
[147242.344525] [<ffffffff811cabee>] SyS_msync+0x17e/0x1f0
[147242.344526] [<ffffffff8182def2>] entry_SYSCALL_64_fastpath+0x16/0x71
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1607355
Title:
Task (usually mongod) blocked more 120 seconds (lock-ups) in juju on
lxc/lxd + zfs
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1607355/+subscriptions
--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs